# PyParis logo
from IPython.display import Image
Image("PyParis.png")

Dataviz with matplotlib and seaborn - PyParis 2018

Francis Wolinski - Yotta Conseil

Python & Data Science

twitter: https://twitter.com/@fran6wol
web: https://yotta-conseil.fr
github: https://github.com/fran6w/PyParis2018

0. Tutorial objectives and materials

0.1 Objectives

Introduction to matplotlib.pyplot
Advanced graphics with matplotlib
Introduction to seaborn
Mixing seaborn and matplotlib

0.2 Documentation

matplotlib: http://matplotlib.org
seaborn: http://seaborn.pydata.org

0.3 Materials

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

# display options
pd.set_option("display.max_rows", 16)
pd.set_option("display.max_columns", 30)

1. Matplotlib.pyplot

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

In matplotlib.pyplot the 3 main objects are:

Figure: The top level container for all the plot elements.

Axes (or Subplots): The Axes contains most of the figure elements and sets the coordinate system.

Axis: The X or Y axis of a plot, not to be confused with Axes.

Nota bene: in a script or in a notebook cell, all instructions from the creation of a figure to its display accumulate in the same figure.
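
The relationship between these three objects can be checked directly. The following is a minimal sketch (the data points are arbitrary and the variable names are only illustrative): it creates one Figure with one Axes, draws on it twice, and inspects the result.

```python
import matplotlib.pyplot as plt

# One Figure containing a single Axes
fig, ax = plt.subplots(figsize=(5, 3))

# Successive calls accumulate on the same Axes
ax.plot([0, 1, 2], [0, 1, 4])
ax.plot([0, 1, 2], [4, 1, 0])

print(ax.figure is fig)          # the Axes belongs to the Figure
print(len(ax.lines))             # number of lines drawn so far
print(type(ax.xaxis).__name__)   # each Axes owns an X axis...
print(type(ax.yaxis).__name__)   # ...and a Y axis
```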

# style
plt.style.use('seaborn-darkgrid')
plt.subplots(figsize=(5, 5));

# available styles in matplotlib.pyplot.style.available
print(*plt.style.available, sep=' ')

bmh classic dark_background fast fivethirtyeight ggplot grayscale seaborn-bright seaborn-colorblind seaborn-dark-palette seaborn-dark seaborn-darkgrid seaborn-deep seaborn-muted seaborn-notebook seaborn-paper seaborn-pastel seaborn-poster seaborn-talk seaborn-ticks seaborn-white seaborn-whitegrid seaborn Solarize_Light2 _classic_test

# styling with context manager
with plt.style.context('fivethirtyeight'):
    plt.subplots(figsize=(5, 5))
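
Unlike `plt.style.use()`, the context manager only changes the settings inside the `with` block. The sketch below illustrates this with the `ggplot` style and the `axes.facecolor` setting, both chosen arbitrarily for the illustration.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt

before = mpl.rcParams["axes.facecolor"]
with plt.style.context("ggplot"):
    inside = mpl.rcParams["axes.facecolor"]   # value set by the ggplot style
    fig, ax = plt.subplots(figsize=(5, 5))    # drawn with the ggplot style
after = mpl.rcParams["axes.facecolor"]        # previous value restored

print(before, inside, after)
```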

1.1 Introduction

1. Elementary graphics

# pseudo-random walk
#np.random.seed(0)
plt.plot((np.random.random(100) - 0.5).cumsum());
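
To see what is being plotted, the one-liner above can be decomposed as follows. This is only a sketch: the fixed seed and the names `steps` and `walk` are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.RandomState(0)          # fixed seed (arbitrary) for reproducibility
steps = rng.random_sample(100) - 0.5    # 100 steps drawn uniformly in [-0.5, 0.5)
walk = steps.cumsum()                   # position after each step

print(walk[:3])
# consecutive positions differ exactly by the corresponding step
print(np.allclose(np.diff(walk), steps[1:]))
```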

2. Simple graphics

# a figure with a unique subplot
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)  # equivalent to ax = fig.add_subplot(1, 1, 1)
ax.set_title("Figure 1")
ax.plot((np.random.random(100) - 0.5).cumsum())
ax.axhline(y=0, color='k')
ax.legend(["Random walk"]);

Exercise 1

Implement a random walk with 2 curves in the same plot as a function.
Then add lines with the mean of each curve.
Watch out for the legends.

# %load exercises/ex1.py

def plot_random_walk2():
    fig = plt.figure(figsize=(8, 6))
    ax = fig.add_subplot(111)  # equivalent to ax = fig.add_subplot(1, 1, 1)
    ax.set_title("Figure 1")
    a = (np.random.random(100) - 0.5).cumsum()
    b = (np.random.random(100) - 0.5).cumsum()
    ax.plot(a, c='g')
    ax.plot(b, c='r')
    ax.axhline(y=a.mean(), color='g', ls=':')
    ax.axhline(y=b.mean(), color='r', ls=':')
    ax.legend(["Random walk 1", "Random walk 2"]);

plot_random_walk2()

Position of legend

The loc parameter sets the position of the legend; its default value is 'best'.

It is also possible to set a relative position with the option bbox_to_anchor=(x, y):

x: 0.0 = left, 1.0 = right
y: 0.0 = bottom, 1.0 = top
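
As an illustration, the sketch below combines `loc` and `bbox_to_anchor`: `loc` names the point of the legend box that is pinned, and `bbox_to_anchor` gives where to pin it in Axes coordinates. The curves, labels and the `right=0.75` margin are arbitrary choices for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(np.arange(10), label="linear")
ax.plot(np.arange(10) ** 2, label="quadratic")

# pin the center-left point of the legend to the right edge, mid-height
legend = ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))

# leave room on the right so that the legend is not cut off
fig.subplots_adjust(right=0.75)
```

Passing `label=...` to each `plot()` call, rather than a list of names to `legend()`, keeps each label attached to its own curve whatever the drawing order.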

Exercise 2

Modify the function so that all keyword arguments are passed to the `legend()` method.
Try different positions for the legend.
Use the `bbox_to_anchor` option to place the legend outside the axes, vertically centered on the right.

# %load exercises/ex2.py

def plot_random_walk2(**kwargs):
    fig = plt.figure(figsize=(8, 6))
    ax = fig.add_subplot(111)  # equivalent to ax = fig.add_subplot(1, 1, 1)
    ax.set_title("Figure 1")
    a = (np.random.random(100) - 0.5).cumsum()
    b = (np.random.random(100) - 0.5).cumsum()
    ax.plot(a, c='g')
    ax.plot(b, c='r')
    ax.axhline(y=a.mean(), color='g', ls=':')
    ax.axhline(y=b.mean(), color='r', ls=':')
    ax.legend(["Random walk 1", "Random walk 2"], **kwargs);

plot_random_walk2(loc='lower left')
# plot_random_walk2(bbox_to_anchor=(1.3, 0.6))

3. Compound graphics

1) With the add_subplot() method.

# compound graphics
fig = plt.figure(figsize=(8, 6))

ax1 = fig.add_subplot(221)
ax1.set_title("Figure 1")
ax1.plot(np.random.random(10))

ax2 = fig.add_subplot(222)
ax2.set_title("Figure 2")
ax2.plot(np.random.random(10), 'r--')

ax3 = fig.add_subplot(223)
ax3.set_title("Figure 3")
x = np.random.random(10)
ax3.plot(x, 'c:')
ax3.plot(x, '*', color='darkred')

ax4 = fig.add_subplot(224)
ax4.set_title("Figure 4")
x = np.random.random(10)
ax4.plot(x, '-.', color='0.3')
ax4.plot(x, '^', color='#ff0080');

2) With the subplots() function.

# compound graphics
fig, [[ax1, ax2], [ax3, ax4]] = plt.subplots(2, 2, figsize=(8, 6))

ax1.set_title("Figure 1")
ax1.plot(np.random.random(10))

ax2.set_title("Figure 2")
ax2.plot(np.random.random(10), 'r--')

ax3.set_title("Figure 3")
x = np.random.random(10)
ax3.plot(x, 'c:')
ax3.plot(x, '*', color='darkred')

ax4.set_title("Figure 4")
x = np.random.random(10)
ax4.plot(x, '-.', color='0.3')
ax4.plot(x, '^', color='#ff0080');
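
When the subplots share the same logic, the array returned by `subplots()` can also be iterated instead of being unpacked. A minimal sketch, assuming random data and generated titles:

```python
import numpy as np
import matplotlib.pyplot as plt

# axes is a 2 x 2 NumPy array of Axes objects
fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)

# axes.flat walks through the subplots row by row
for i, ax in enumerate(axes.flat, start=1):
    ax.set_title(f"Figure {i}")
    ax.plot(np.random.random(10))

fig.tight_layout()   # avoid overlapping titles and tick labels
```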

In matplotlib there are:

4 line styles: '-' (solid), '--' (dashed), ':' (dotted), '-.' (dash-dotted)
several ways to specify colors:
- 8 basic colors: 'b' (blue), 'g' (green), 'r' (red), 'c' (cyan), 'm' (magenta), 'y' (yellow), 'k' (black), 'w' (white)
- grey levels: a number between '0.0' (black) and '1.0' (white), given as a string
- 148 named colors: see the variable matplotlib.colors.cnames
- more than 16 million RGB colors in hexadecimal: '#rrggbb'
41 markers: see the variable matplotlib.lines.Line2D.markers

Line width can also be set with the lw keyword.
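
All these color notations end up as the same kind of value. The short sketch below uses `matplotlib.colors.to_hex()` to convert one example of each notation to its hexadecimal form; the five colors are arbitrary examples.

```python
import matplotlib.colors as mcolors

# one example per notation: basic letter, grey level, named color, hexadecimal
for spec in ["r", "k", "0.5", "gold", "#ff0080"]:
    print(spec, "->", mcolors.to_hex(spec))

red = mcolors.to_hex("r")
gold = mcolors.to_hex("gold")
```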

print(*mpl.colors.cnames, sep=' ')

aliceblue antiquewhite aqua aquamarine azure beige bisque black blanchedalmond blue blueviolet brown burlywood cadetblue chartreuse chocolate coral cornflowerblue cornsilk crimson cyan darkblue darkcyan darkgoldenrod darkgray darkgreen darkgrey darkkhaki darkmagenta darkolivegreen darkorange darkorchid darkred darksalmon darkseagreen darkslateblue darkslategray darkslategrey darkturquoise darkviolet deeppink deepskyblue dimgray dimgrey dodgerblue firebrick floralwhite forestgreen fuchsia gainsboro ghostwhite gold goldenrod gray green greenyellow grey honeydew hotpink indianred indigo ivory khaki lavender lavenderblush lawngreen lemonchiffon lightblue lightcoral lightcyan lightgoldenrodyellow lightgray lightgreen lightgrey lightpink lightsalmon lightseagreen lightskyblue lightslategray lightslategrey lightsteelblue lightyellow lime limegreen linen magenta maroon mediumaquamarine mediumblue mediumorchid mediumpurple mediumseagreen mediumslateblue mediumspringgreen mediumturquoise mediumvioletred midnightblue mintcream mistyrose moccasin navajowhite navy oldlace olive olivedrab orange orangered orchid palegoldenrod palegreen paleturquoise palevioletred papayawhip peachpuff peru pink plum powderblue purple rebeccapurple red rosybrown royalblue saddlebrown salmon sandybrown seagreen seashell sienna silver skyblue slateblue slategray slategrey snow springgreen steelblue tan teal thistle tomato turquoise violet wheat white whitesmoke yellow yellowgreen

print(*mpl.lines.Line2D.markers, sep=' ')

. , o v ^ < > 1 2 3 4 8 s p * h H + x D d | _ P X 0 1 2 3 4 5 6 7 8 9 10 11 None None

Exercise 3

Modify the graphics above by adding 1 column with 2 figures.
Figure 5: solid gold line of width 2 + black squares.
Figure 6: dashed light grey line + blue circles.

# %load exercises/ex3.py

# compound graphics
fig, [[ax1, ax2, ax3], [ax4, ax5, ax6]] = plt.subplots(2, 3, figsize=(12, 6))

ax1.set_title("Figure 1")
ax1.plot(np.random.random(10))

ax2.set_title("Figure 2")
ax2.plot(np.random.random(10), 'r--')

ax3.set_title("Figure 3")
x = np.random.random(10)
ax3.plot(x, 'c:')
ax3.plot(x, '*', color='darkred')

ax4.set_title("Figure 4")
x = np.random.random(10)
ax4.plot(x, '-.', color='0.3')
ax4.plot(x, '^', color='#ff0080')

ax5.set_title("Figure 5")
x = np.random.random(10)
ax5.plot(x, '-', color='gold', lw=2)
ax5.plot(x, 's', color='k')

ax6.set_title("Figure 6")
x = np.random.random(10)
ax6.plot(x, '--', color='0.7')
ax6.plot(x, 'o', color='b');

4. Graphics with pandas

Bar plot

# load a set of data
df = pd.read_table('Summer Olympic medallists 1896 to 2008 - ALL MEDALISTS.txt')
df.head()

# setting category on medals

medals = ['Bronze', 'Silver', 'Gold']

if pd.__version__ < '0.21.0':
    df['Medal'] = df['Medal'].astype('category', categories=medals, ordered=True)
else:
    from pandas.api.types import CategoricalDtype
    cat_medals = CategoricalDtype(categories=medals, ordered=True)
    df['Medal'] = df['Medal'].astype(cat_medals)
    
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29216 entries, 0 to 29215
Data columns (total 10 columns):
City            29216 non-null object
Edition         29216 non-null int64
Sport           29216 non-null object
Discipline      29216 non-null object
Athlete         29216 non-null object
NOC             29216 non-null object
Gender          29216 non-null object
Event           29216 non-null object
Event_gender    29216 non-null object
Medal           29216 non-null category
dtypes: category(1), int64(1), object(8)
memory usage: 2.0+ MB
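
The point of the ordered category is that medals can then be sorted and compared by rank rather than alphabetically. A small sketch on a toy Series (the four values are made up for the illustration):

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

medal_type = CategoricalDtype(categories=["Bronze", "Silver", "Gold"], ordered=True)
s = pd.Series(["Gold", "Bronze", "Silver", "Gold"]).astype(medal_type)

sorted_values = s.sort_values().tolist()      # sorted by rank, not alphabetically
at_least_silver = (s >= "Silver").tolist()    # comparisons follow the category order

print(sorted_values)
print(at_least_silver)
```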

# cross table Edition x Medal
table = pd.crosstab(df['Edition'], df['Medal'])
table

# plot all medals in same graphics
ax = table.plot(kind='bar',
                figsize=(12, 4),
                title="Medals by edition and metal")
ax.set_xticks(range(len(table)))
ax.set_xlabel("Editions")
ax.set_xticklabels(table.index);
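
The mechanics of a cross table, and of turning its counts into row-wise ratios, are easier to follow on a tiny example. The sketch below uses a made-up five-row DataFrame named `toy`; only the column names are borrowed from the real data.

```python
import pandas as pd

toy = pd.DataFrame({
    "Edition": [1896, 1896, 1900, 1900, 1900],
    "Medal": ["Gold", "Silver", "Gold", "Gold", "Bronze"],
})

# counts of each (Edition, Medal) pair
counts = pd.crosstab(toy["Edition"], toy["Medal"])

# divide each row by its total to get ratios that sum to 1 per edition
ratios = counts.div(counts.sum(axis=1), axis=0)

print(counts)
print(ratios)
```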

Exercise 4

Compute a cross table Edition by Gender and plot the gender ratio for each edition.
Compute a cross table Sport by Medal and plot it.

# %load exercises/ex4.py

# for each edition the ratio of medals by gender
table1 = pd.crosstab(df['Edition'], df['Gender'])
table1 = table1.div(table1.sum(axis=1), axis=0)
ax = table1.plot(kind='bar',
                 stacked=True,
                 figsize=(12, 4),
                 title="Medals by edition and gender")
ax.set_xticks(range(len(table1)))
ax.set_xlabel("Editions")
ax.set_xticklabels(table1.index);

# for each sport the number of medals by metal
table2 = pd.crosstab(df['Sport'], df['Medal'])
ax = table2.plot(kind='bar',
                 figsize=(12, 4),
                 title="Medals by sport and metal")
ax.set_xticks(range(len(table2)))
ax.set_xlabel("Sports")
ax.set_xticklabels(table2.index);

Exercise 5

Modify the graphics above so that:

each medal is drawn in its own subplot
the X tick labels are rotated by 60°
the colors for the medals are darkorange, silver and gold
no legend is displayed in the subplots
the space between the subplots is increased (use the `subplots_adjust()` function with the `hspace=...` argument)

Modify the graphics above so that the editions are replaced by the cities.

you will need to right-align the xtick labels (use the `pyplot.xticks()` function with the `ha=...` argument)

# %load exercises/ex5.py

# graphics with subplots
table1 = pd.crosstab(df['Edition'], df['Medal'])
axes = table1.plot(figsize=(9, 6),
               title="Medals by metal and edition",
               kind='bar',
               subplots=True,
               #sharey=True,
               color=['darkorange', 'silver', 'gold'],
               rot=60)
plt.subplots_adjust(hspace=0.4)
axes[-1].set_xticks(range(len(table1)))
axes[-1].set_xlabel("Editions")
axes[-1].set_xticklabels(table1.index)
for ax in axes:
    ax.legend().set_visible(False);
    
# graphics with subplots
table2 = pd.crosstab(df['Sport'], df['Medal'])
axes = table2.plot(figsize=(9, 6),
               title="Medals by metal and sport",
               kind='bar',
               subplots=True,
               #sharey=True,
               color=['darkorange', 'silver', 'gold'],
               rot=60)
plt.subplots_adjust(hspace=0.4)
plt.xticks(ha='right')
axes[-1].set_xticks(range(len(table2)))
axes[-1].set_xlabel("Sports")
axes[-1].set_xticklabels(table2.index)
for ax in axes:
    ax.legend().set_visible(False);

Fine-tuning of the default matplotlib parameters can be achieved by modifying:

either the matplotlib.rcParams variable
or the matplotlibrc file, see matplotlib.matplotlib_fname().

Both change the defaults globally, so modify them with care.
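
One way to limit the risk is to change a parameter only temporarily. The sketch below uses `matplotlib.rc_context()` for that purpose; the parameter `lines.linewidth` and the value 3.0 are arbitrary choices for the illustration.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt

default_lw = mpl.rcParams["lines.linewidth"]

# the override only applies inside the with block
with mpl.rc_context({"lines.linewidth": 3.0}):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1], [0, 1])

print(line.get_linewidth())                            # width used for this line
print(mpl.rcParams["lines.linewidth"] == default_lw)   # default restored afterwards
```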

# path of matplotlibrc file
print(mpl.matplotlib_fname())

C:\Users\Francis\Anaconda3\lib\site-packages\matplotlib\mpl-data\matplotlibrc

Scatter plot

# load a set of data
geo = pd.read_csv('correspondance-code-insee-code-postal.csv', sep=';',
                  usecols=range(10),
                  index_col='Code INSEE')
geo[['Latitude', 'Longitude']] = geo['geo_point_2d'].str.extract(r'(.+), (.+)', expand=True).astype(float)
geo.head()
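
The extraction of the coordinates relies on a regular expression with two capture groups, one per output column. A minimal sketch on two made-up "latitude, longitude" strings:

```python
import pandas as pd

points = pd.Series(["43.29, 0.65", "48.85, 2.35"])

# each capture group becomes one column; values are converted from text to float
coords = points.str.extract(r"(.+), (.+)", expand=True).astype(float)
coords.columns = ["Latitude", "Longitude"]

print(coords)
```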

# scatter plot provides naive maps
plt.scatter(geo['Longitude'],
            geo['Latitude'],
            s=3);

Exercise 6

Limit the geo DataFrame to Metropolitan France and scatter plot.
Provide a scatter plot where the color depends on the 'Altitude Moyenne' with the colormap 'coolwarm'.

# %load exercises/ex6.py

# Metropolitan France
metro = geo.loc[geo['Latitude'] > 40]
plt.scatter(metro['Longitude'],
            metro['Latitude'],
            s=3);

metro = metro.sort_values('Altitude Moyenne')
plt.scatter(metro['Longitude'],
            metro['Latitude'],
            c=metro['Altitude Moyenne'],
            s=3,
            cmap=plt.cm.coolwarm)
plt.colorbar();

Colormaps

The matplotlib and seaborn modules also provide colormaps, i.e. palettes of colors associated with discrete or continuous data:

matplotlib, see: http://matplotlib.org/users/colormaps.html
seaborn, see: http://seaborn.pydata.org/tutorial/color_palettes.html

These modules also support other palettes:

ColorBrewer, see: http://colorbrewer2.org
xkcd, see: https://xkcd.com/color/rgb/, see also: http://www.luminoso.com/colors/
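
A colormap is essentially a function from a number in [0, 1] to an RGBA color. The sketch below shows this with the 'coolwarm' colormap used earlier; the 0 to 3000 range passed to `Normalize` is an assumed example range, not a value taken from the data.

```python
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

cmap = plt.get_cmap("coolwarm")
norm = mcolors.Normalize(vmin=0, vmax=3000)   # rescales data values to [0, 1]

low = cmap(norm(0))       # RGBA tuple at the bottom of the range (bluish)
high = cmap(norm(3000))   # RGBA tuple at the top of the range (reddish)

print(low)
print(high)
```

This is what `scatter()` does internally when it receives `c=...` and `cmap=...`: values are normalized first, then looked up in the colormap.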

print(*plt.cm.datad.keys(), sep=' ')

afmhot autumn bone binary bwr brg CMRmap cool copper cubehelix flag gnuplot gnuplot2 gray hot hsv jet ocean pink prism rainbow seismic spring summer terrain winter nipy_spectral spectral Blues BrBG BuGn BuPu GnBu Greens Greys Oranges OrRd PiYG PRGn PuBu PuBuGn PuOr PuRd Purples RdBu RdGy RdPu RdYlBu RdYlGn Reds Spectral YlGn YlGnBu YlOrBr YlOrRd gist_earth gist_gray gist_heat gist_ncar gist_rainbow gist_stern gist_yarg coolwarm Wistia Accent Dark2 Paired Pastel1 Pastel2 Set1 Set2 Set3 tab10 tab20 tab20b tab20c Vega10 Vega20 Vega20b Vega20c afmhot_r autumn_r bone_r binary_r bwr_r brg_r CMRmap_r cool_r copper_r cubehelix_r flag_r gnuplot_r gnuplot2_r gray_r hot_r hsv_r jet_r ocean_r pink_r prism_r rainbow_r seismic_r spring_r summer_r terrain_r winter_r nipy_spectral_r spectral_r Blues_r BrBG_r BuGn_r BuPu_r GnBu_r Greens_r Greys_r Oranges_r OrRd_r PiYG_r PRGn_r PuBu_r PuBuGn_r PuOr_r PuRd_r Purples_r RdBu_r RdGy_r RdPu_r RdYlBu_r RdYlGn_r Reds_r Spectral_r YlGn_r YlGnBu_r YlOrBr_r YlOrRd_r gist_earth_r gist_gray_r gist_heat_r gist_ncar_r gist_rainbow_r gist_stern_r gist_yarg_r coolwarm_r Wistia_r Accent_r Dark2_r Paired_r Pastel1_r Pastel2_r Set1_r Set2_r Set3_r tab10_r tab20_r tab20b_r tab20c_r Vega10_r Vega20_r Vega20b_r Vega20c_r

Exercise 7

Add the population density 'Densité' to geo and convert the 'Statut' column to an ordered category.
Provide a scatter plot where:

All cities whose 'Statut' is below 'Préfecture' are plotted with a small point
All cities whose 'Statut' is 'Préfecture' or above are plotted with a circle whose size depends on 'Population' and whose color depends on 'Densité' with the colormap Reds
All cities whose 'Statut' is 'Préfecture de région' or above (except those with Arrondissements: Paris, Lyon and Marseille) are plotted along with their names

# %load exercises/ex7.py

geo['Densité'] = geo['Population'] / geo['Superficie']

status = list(geo['Statut'].value_counts().index)
if pd.__version__ < '0.21.0':
    geo['Statut'] = geo['Statut'].astype('category', categories=status, ordered=True)
else:
    from pandas.api.types import CategoricalDtype
    cat_status = CategoricalDtype(categories=status, ordered=True)
    geo['Statut'] = geo['Statut'].astype(cat_status)

metro = geo.loc[geo['Latitude'] > 40]

# split cities: prefectures and above (A) vs. other communes (B)
plt.figure(figsize=(7, 5))
metro_A = metro.loc[metro["Statut"] >= "Préfecture"]
metro_A = metro_A.sort_values("Population", ascending=False)
metro_B = metro.loc[metro["Statut"] < "Préfecture"]

# communes
plt.scatter(metro_B["Longitude"],
            metro_B["Latitude"],
            c='y',
            s=3,
            edgecolors='none')

# prefectures
ax = plt.scatter(metro_A["Longitude"],
                metro_A["Latitude"],
                c=metro_A["Densité"],
                s=metro_A["Population"],
                cmap=plt.cm.Reds,
                edgecolors='none')

# names of the regional prefectures, excluding Paris, Lyon and Marseille (PLM)
metro_C = metro.loc[(metro["Statut"] >= "Préfecture de région") & ~metro["Commune"].str.contains("ARRONDISSEMENT")]
for i, row in metro_C.iterrows():
    plt.text(row["Longitude"],
                 row["Latitude"],
                 row["Commune"].title(),
                 fontsize=8)
    
plt.colorbar(ax);

2. seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

barplot

Show point estimates and confidence intervals as rectangular bars.

# number of athletes for each sport, country, gender and medal
table = df.pivot_table(index=['Sport', 'NOC', 'Gender', 'Medal'], values='Athlete', aggfunc='count') #pd.Series.nunique)
table.reset_index(inplace=True)
sports = table['Sport'].value_counts().index[:10]
table = table.loc[table['Sport'].isin(sports)]
table

# barplot number of athletes by sport and medal
fig, ax = plt.subplots(figsize=(12, 8))
sns.barplot(y='Sport',
            x='Athlete',
            data=table,
            hue='Medal',
            palette=['darkorange', 'silver', 'gold'],
            #ci=0,
            ax=ax);
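
A "point estimate" here means that each bar summarizes several rows; by default seaborn uses the mean. The sketch below checks this on a made-up four-row table named `toy` (the sports "A" and "B" and their values are invented for the illustration).

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

toy = pd.DataFrame({"Sport": ["A", "A", "B", "B"],
                    "Athlete": [1, 3, 10, 20]})

fig, ax = plt.subplots(figsize=(6, 3))
sns.barplot(y="Sport", x="Athlete", data=toy, ax=ax)

# length of each horizontal bar, to compare with the group means
widths = [patch.get_width() for patch in ax.patches]
means = toy.groupby("Sport")["Athlete"].mean()

print(widths)
print(means)
```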

stripplot

Draw a scatterplot where one variable is categorical.

# number of athletes by sport and gender
fig, ax = plt.subplots(figsize=(12, 8))
sns.stripplot(y='Sport',
            x='Athlete',
            data=table,
            hue='Gender',
            palette=['blue', 'red'],
            jitter=True,
            ax=ax);

countplot

Show the counts of observations in each categorical bin using bars.

# limit to main countries
countries = df['NOC'].value_counts().index[:20]
table = df.loc[df['NOC'].isin(countries)]
table

# countplot of medals by country
fig, ax = plt.subplots(figsize=(12, 5))
sns.countplot(x='NOC', data=table, ax=ax)
ax.set_xlabel('Country');

distplot + kdeplot

Flexibly plot a univariate distribution of observations.

# distplot (histogram + kernel density estimate) of Altitude Moyenne
sns.distplot(geo['Altitude Moyenne']);

distplot + rugplot

Flexibly plot a univariate distribution of observations.

# distplot (kernel density estimate + rug plot) of Superficie in Hauts-de-Seine
hds = geo[geo['Département'] == 'HAUTS-DE-SEINE']
sns.distplot(hds['Superficie'], hist=False, rug=True);

regplot

Plot data and a linear regression model fit.

# regplot = scatter plot + linear regression
x = np.random.random(100)
y = x * (1 + np.random.random(100)) / 2
g = sns.regplot(x=x, y=y);  # keyword arguments work with both old and recent seaborn versions
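
The straight line drawn by regplot is an ordinary least-squares fit. As a rough cross-check outside seaborn, the same kind of line can be obtained with `numpy.polyfit`; the synthetic data below (slope 2, intercept 1, small Gaussian noise) are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.01, size=100)   # noisy straight line

# degree-1 polynomial fit: returns the slope first, then the intercept
slope, intercept = np.polyfit(x, y, deg=1)

print(slope, intercept)
```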

heatmap

Plot rectangular data as a color-encoded matrix.

# crosstab Gender by Edition
table1 = pd.crosstab(df['Gender'], df['Edition'])
sns.heatmap(table1, cmap='Blues');

Exercise 8

Modify the heatmap so that the figure is wider and all labels are rotated to be easier to read.
Draw a heatmap of Sports by Genders, sorted by the men's number of medals and annotated with the number of medals.
Draw a boolean heatmap of Sports by Editions for women only.

# %load exercises/ex8.py

fig, ax = plt.subplots(figsize=(12, 2))
sns.heatmap(table1, cmap='Blues', ax=ax);
ax.set_xticklabels(table1.columns, rotation=60)
ax.set_yticklabels(table1.index, rotation=0)
ax.set_ylabel('Gender', rotation=0);

table2 = pd.crosstab(df['Sport'], df['Gender'])
table2.sort_values('Men', inplace=True)
fig, ax = plt.subplots(figsize=(5, 12))
sns.heatmap(table2, cmap='Blues', annot=True, fmt='d', ax=ax);

dfw = df.loc[df['Gender'] == 'Women']
table3 = pd.crosstab(dfw['Sport'], dfw['Edition']).apply(lambda x: x > 0).astype(int)
fig, ax = plt.subplots(figsize=(12, 8))
sns.heatmap(table3, cmap='Reds', cbar=False, ax=ax);

Output of `df.head()`:

	City	Edition	Sport	Discipline	Athlete	NOC	Gender	Event	Event_gender	Medal
0	Athens	1896	Aquatics	Swimming	HAJOS, Alfred	HUN	Men	100m freestyle	M	Gold
1	Athens	1896	Aquatics	Swimming	HERSCHMANN, Otto	AUT	Men	100m freestyle	M	Silver
2	Athens	1896	Aquatics	Swimming	DRIVAS, Dimitrios	GRE	Men	100m freestyle for sailors	M	Bronze
3	Athens	1896	Aquatics	Swimming	MALOKINIS, Ioannis	GRE	Men	100m freestyle for sailors	M	Gold
4	Athens	1896	Aquatics	Swimming	CHASAPIS, Spiridon	GRE	Men	100m freestyle for sailors	M	Silver

Cross table Edition x Medal:

Medal	Bronze	Silver	Gold
Edition
1896	40	47	64
1900	142	192	178
1904	123	159	188
1908	211	282	311
1912	284	300	301
1920	355	446	497
1924	285	298	301
1928	242	239	229
...	...	...	...
1980	472	455	460
1984	500	476	483
1988	535	505	506
1992	596	551	558
1996	634	610	615
2000	685	667	663
2004	679	660	659
2008	710	663	669

Output of `geo.head()`:

	Code Postal	Commune	Département	Région	Statut	Altitude Moyenne	Superficie	Population	geo_point_2d	Latitude	Longitude
Code INSEE
31080	31350	BOULOGNE-SUR-GESSE	HAUTE-GARONNE	MIDI-PYRENEES	Chef-lieu canton	301.0	2470.0	1.6	43.2904403081, 0.650641474176	43.290440	0.650641
11143	11510	FEUILLA	AUDE	LANGUEDOC-ROUSSILLON	Commune simple	314.0	2426.0	0.1	42.9291375888, 2.90138923544	42.929138	2.901389
43028	43200	BESSAMOREL	HAUTE-LOIRE	AUVERGNE	Commune simple	888.0	743.0	0.4	45.1306448726, 4.07952494849	45.130645	4.079525
78506	78660	PRUNAY-EN-YVELINES	YVELINES	ILE-DE-FRANCE	Commune simple	155.0	2717.0	0.8	48.5267627187, 1.80513972814	48.526763	1.805140
84081	84310	MORIERES-LES-AVIGNON	VAUCLUSE	PROVENCE-ALPES-COTE D'AZUR	Commune simple	49.0	1042.0	7.6	43.9337788848, 4.90875878315	43.933779	4.908759

Medallists limited to the main countries:

	City	Edition	Sport	Discipline	Athlete	NOC	Gender	Event	Event_gender	Medal
0	Athens	1896	Aquatics	Swimming	HAJOS, Alfred	HUN	Men	100m freestyle	M	Gold
6	Athens	1896	Aquatics	Swimming	HAJOS, Alfred	HUN	Men	1200m freestyle	M	Gold
11	Athens	1896	Athletics	Athletics	LANE, Francis	USA	Men	100m	M	Bronze
12	Athens	1896	Athletics	Athletics	SZOKOLYI, Alajos	HUN	Men	100m	M	Bronze
13	Athens	1896	Athletics	Athletics	BURKE, Thomas	USA	Men	100m	M	Gold
14	Athens	1896	Athletics	Athletics	HOFMANN, Fritz	GER	Men	100m	M	Silver
15	Athens	1896	Athletics	Athletics	CURTIS, Thomas	USA	Men	110m hurdles	M	Gold
16	Athens	1896	Athletics	Athletics	GOULDING, Grantley	GBR	Men	110m hurdles	M	Silver
...	...	...	...	...	...	...	...	...	...	...
29201	Beijing	2008	Wrestling	Wrestling Gre-R	GUENOT, Christophe	FRA	Men	66 - 74kg	M	Bronze
29204	Beijing	2008	Wrestling	Wrestling Gre-R	CHANG, Yongxiang	CHN	Men	66 - 74kg	M	Silver
29206	Beijing	2008	Wrestling	Wrestling Gre-R	MINGUZZI, Andrea	ITA	Men	74 - 84kg	M	Gold
29207	Beijing	2008	Wrestling	Wrestling Gre-R	FODOR, Zoltan	HUN	Men	74 - 84kg	M	Silver
29209	Beijing	2008	Wrestling	Wrestling Gre-R	WHEELER, Adam	USA	Men	84 - 96kg	M	Bronze
29210	Beijing	2008	Wrestling	Wrestling Gre-R	KHUSHTOV, Aslanbek	RUS	Men	84 - 96kg	M	Gold
29211	Beijing	2008	Wrestling	Wrestling Gre-R	ENGLICH, Mirko	GER	Men	84 - 96kg	M	Silver
29215	Beijing	2008	Wrestling	Wrestling Gre-R	BAROEV, Khasan	RUS	Men	96 - 120kg	M	Silver