Visualizing the Geospatiotemporal Spread of the Pandemic

Displaying COVID-19 deaths per country per day on a rotating 3D Earth.

The animation shows, on a 3D Earth, the total number of deaths for each day. The totals are drawn in log scale as cones positioned at the capital of each country. During the animation, the 3D Earth rotates eastward around its own axis so that the viewer can follow the spread of the pandemic. The rotation speed is unrelated to the Earth's actual rotation speed.

The resulting figure is meant to resemble the coronavirus itself: a sphere with spikes.

Data sources:

Limitations of the data visualization:

  • For large countries such as China or the USA, data per region would be more meaningful than a single country-wide value positioned at the capital, provided such data are available.
  • The displayed 3D Earth uses a purely spherical model, which is not the actual shape of the Earth.
  • The displayed 3D Earth is transparent; making it slightly opaque might ease the interpretation of the graphic.
  • The animation widget displays the number of days since the pandemic started; it could be switched to the actual date of the data.
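
The last limitation could be addressed by mapping the animation's frame index back to a calendar date. The following is a minimal sketch, assuming the date columns run daily without gaps from 2020-01-01. The helper `frame_to_date` is hypothetical, and wiring its result into the ipyvolume animation widget is not shown here.

```python
from datetime import date, timedelta

# Assumption: the first animation frame corresponds to 2020-01-01
# and there is exactly one frame per calendar day.
START = date(2020, 1, 1)

def frame_to_date(frame_index, start=START):
    """Return the ISO date string for a given animation frame index."""
    return (start + timedelta(days=frame_index)).isoformat()

print(frame_to_date(0))
print(frame_to_date(150))
```

Since the date columns of the table are already ISO date strings, indexing the list of column names with the frame number would give the same label without any date arithmetic.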

Author: Francis Wolinski

0. Import libraries

The visualization relies on a few Python libraries.

In [1]:
# imports
import numpy as np
import pandas as pd
import ipyvolume as ipv
import shapefile
import pythreejs
In [2]:
# versions of the modules used
for m in [np, pd, ipv, shapefile, pythreejs]:
    try:
        print(m.__name__, m.__version__)
    except AttributeError:
        print(m.__name__, m._version.__version__)
numpy 1.16.4
pandas 0.25.1
ipyvolume 0.6.0-alpha.4
shapefile 2.1.0
pythreejs 2.2.0

1. Data loading

Four datasets are loaded: COVID data, country codes, capital coordinates, and coastlines.

1.1 COVID data per country and per day

In [3]:
# load COVID data for 2020
df_covid = pd.read_csv('data/owid-covid-data.csv')
df_covid = df_covid.loc[df_covid['date'].str.startswith('2020')]
df_covid.head(3)
Out[3]:
iso_code location date total_cases new_cases total_deaths new_deaths total_cases_per_million new_cases_per_million total_deaths_per_million ... aged_65_older aged_70_older gdp_per_capita extreme_poverty cvd_death_rate diabetes_prevalence female_smokers male_smokers handwashing_facilities hospital_beds_per_100k
0 ABW Aruba 2020-03-13 2 2 0 0 18.733 18.733 0.0 ... 13.085 7.452 35973.781 NaN NaN 11.62 NaN NaN NaN NaN
1 ABW Aruba 2020-03-20 4 2 0 0 37.465 18.733 0.0 ... 13.085 7.452 35973.781 NaN NaN 11.62 NaN NaN NaN NaN
2 ABW Aruba 2020-03-24 12 8 0 0 112.395 74.930 0.0 ... 13.085 7.452 35973.781 NaN NaN 11.62 NaN NaN NaN NaN

3 rows × 32 columns

1.2 Country data with ISO 2 and 3 codes

In [4]:
# load country data with ISO 2 and 3 codes
var = pd.read_html('data/geonames.htm')
df_countries = var[1]
df_countries = df_countries[['ISO-3166alpha2', 'ISO-3166alpha3', 'Country']]
df_countries.head(3)
Out[4]:
ISO-3166alpha2 ISO-3166alpha3 Country
0 AD AND Andorra
1 AE ARE United Arab Emirates
2 AF AFG Afghanistan

1.3 Capital data with longitude and latitude

In [5]:
# load city data with longitude and latitude of capitals (type == 'PPLC')
df_capitals = pd.read_csv('data/cities15000.txt',
                        sep='\t',
                        header=None,
                        usecols=[4, 5, 7, 8],
                        names=['lat', 'long', 'type', 'ISO-3166alpha2'])
df_capitals = df_capitals.loc[df_capitals['type'] == 'PPLC']
df_capitals.head(3)
Out[5]:
lat long type ISO-3166alpha2
1 42.50779 1.52109 PPLC AD
15 24.45118 54.39696 PPLC AE
48 34.52813 69.17233 PPLC AF

1.4 Shape file with coastlines

In [6]:
# load shape file with coastlines
sf = shapefile.Reader('ne_110m_coastline/ne_110m_coastline.shp')

2. Data integration and transformation

In [7]:
# This function adds 3D point coordinates (columns x, y, z) to a DataFrame, in place.
# They are computed from the longitude and latitude given in degrees.
# It relies on a spherical model, which simplifies the actual shape of the Earth.
# It also performs a permutation between axes (x -> -x, y -> z, z -> y).
# It is used for the capitals as well as for the coastlines.

degres_to_radians = np.pi / 180.0

def compute_xyz(df):
    long_in_radians = df['long'] * degres_to_radians
    lat_in_radians = df['lat'] * degres_to_radians
    df['x'] = - np.cos(long_in_radians) * np.cos(lat_in_radians)
    df['y'] = np.sin(lat_in_radians)
    df['z'] = np.sin(long_in_radians) * np.cos(lat_in_radians)
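
As a sanity check on the conversion, the same formulas can be applied to a single point with the standard library only. This is a sketch, not part of the original notebook: `lat_long_to_xyz` is a hypothetical helper, and the coordinates are those listed for Afghanistan's capital in the capitals table.

```python
import math

def lat_long_to_xyz(lat_deg, long_deg):
    """Apply the same formulas as compute_xyz() to one point on a unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(long_deg)
    x = -math.cos(lon) * math.cos(lat)
    y = math.sin(lat)
    z = math.sin(lon) * math.cos(lat)
    return x, y, z

# Latitude and longitude of the capital listed for AF in the capitals table
x, y, z = lat_long_to_xyz(34.52813, 69.17233)
print(x, y, z)

# Every converted point should lie on the unit sphere
print(x * x + y * y + z * z)
```

The printed coordinates should agree with the x, y, z values shown for AFG in Out[10], and the squared norm should be 1 up to floating-point error, which is why the markers are later scaled by 1.05 to sit just above the surface.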

2.1 Merge countries and capitals

In [8]:
# merge countries and capitals using ISO 2 codes
df_geo = pd.merge(df_countries,
                  df_capitals,
                  on='ISO-3166alpha2',
                  how='inner')
df_geo.head(3)
Out[8]:
ISO-3166alpha2 ISO-3166alpha3 Country lat long type
0 AD AND Andorra 42.50779 1.52109 PPLC
1 AE ARE United Arab Emirates 24.45118 54.39696 PPLC
2 AF AFG Afghanistan 34.52813 69.17233 PPLC

2.2 Compute a table with total deaths (log scale) by country and by day

In [9]:
# table with total deaths by country and day
col = 'total_deaths'
df_pandemic = df_covid.pivot_table(index='iso_code',
                                   columns='date',
                                   values=col)
# missing values count as 0 deaths
df_pandemic = df_pandemic.fillna(0).astype(int)
# clip to a minimum of 1 so that the log is always defined (log(1) == 0)
df_pandemic = df_pandemic.clip(lower=1).apply(np.log)
df_pandemic.head(3)
Out[9]:
date 2020-01-01 2020-01-02 2020-01-03 2020-01-04 2020-01-05 2020-01-06 2020-01-07 2020-01-08 2020-01-09 2020-01-10 ... 2020-05-21 2020-05-22 2020-05-23 2020-05-24 2020-05-25 2020-05-26 2020-05-27 2020-05-28 2020-05-29 2020-05-30
iso_code
ABW 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.098612 1.098612 1.098612 1.098612 1.098612 1.098612 1.098612 1.098612 1.098612 1.098612
AFG 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 5.231109 5.262690 5.323010 5.375278 5.384495 5.389072 5.393628 5.424950 5.459586 5.505332
AGO 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.098612 1.098612 1.098612 1.098612 1.386294 1.386294 1.386294 1.386294 1.386294 1.386294

3 rows × 151 columns
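
To make the log transform concrete, here is a plain-Python sketch of the clip-then-log step applied to a few death counts. The helper `log_height` is hypothetical and only mirrors the transformation above on single values.

```python
import math

def log_height(total_deaths):
    """Clip the count to a minimum of 1, then take the natural log."""
    return math.log(max(total_deaths, 1))

for deaths in [0, 1, 3, 4]:
    print(deaths, round(log_height(deaths), 6))
```

One consequence of clipping at 1 is that a country with exactly one death gets the same height (0) as a country with none. The values for 3 and 4 deaths correspond to the 1.098612 and 1.386294 entries visible in the table above.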

2.3 Merge the total deaths table and the geo data

The final DataFrame contains one column per day (from 2020-01-01 to 2020-05-30) holding the total deaths (log scale) per country, together with the 3D position of each country's capital.

In [10]:
# merge total deaths and geo data using ISO 3 codes
df_pandemic = pd.merge(df_pandemic.reset_index(),
                       df_geo,
                       left_on='iso_code',
                       right_on='ISO-3166alpha3',
                       how='inner')
compute_xyz(df_pandemic)
df_pandemic.head(3)
Out[10]:
iso_code 2020-01-01 2020-01-02 2020-01-03 2020-01-04 2020-01-05 2020-01-06 2020-01-07 2020-01-08 2020-01-09 ... 2020-05-30 ISO-3166alpha2 ISO-3166alpha3 Country lat long type x y z
0 ABW 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.098612 AW ABW Aruba 12.52398 -70.02703 PPLC -0.333449 0.216848 -0.917490
1 AFG 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 5.505332 AF AFG Afghanistan 34.52813 69.17233 PPLC -0.292926 0.566811 0.770013
2 AGO 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.386294 AO AGO Angola -8.83682 13.23432 PPLC -0.961887 -0.153621 0.226217

3 rows × 161 columns
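
Because the merge uses how='inner', any row whose ISO 3 code has no match in the geo table is silently dropped. The sketch below shows one way to list such rows with the `indicator` option of `pd.merge`. The two small tables are toy data invented for illustration, and 'XXX' is a hypothetical code with no match.

```python
import pandas as pd

# Toy stand-ins for the deaths table and the geo table (illustrative values only)
deaths = pd.DataFrame({'iso_code': ['AFG', 'AGO', 'XXX'],
                       'value': [5.5, 1.4, 2.0]})
geo = pd.DataFrame({'ISO-3166alpha3': ['AFG', 'AGO'],
                    'Country': ['Afghanistan', 'Angola']})

# A left merge with indicator=True adds a '_merge' column
# telling where each row was found
check = pd.merge(deaths, geo,
                 left_on='iso_code',
                 right_on='ISO-3166alpha3',
                 how='left',
                 indicator=True)

# Rows found only in the left table would be lost by an inner merge
dropped = check.loc[check['_merge'] == 'left_only', 'iso_code'].tolist()
print(dropped)
```

Running the same check on the real tables would show which codes, if any, are missing from the visualization.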

3. The final figure

The final figure is plotted in a few steps:

  • Plot the total deaths per day as dark red arrows whose heights represent the total deaths in log scale. The size argument is the one the animation varies.
  • Plot the coastlines by converting the longitude and latitude of each point into 3D point coordinates with the compute_xyz() function.
  • Set up the animation.
  • Set up the automatic rotation of the 3D Earth.
In [11]:
# plot the figure
fig = ipv.figure()

# total deaths data are in columns whose name starts with '2020'
cols = [col for col in df_pandemic.columns if col.startswith('2020')]

# plot the total deaths (log scale) as red arrows around the 3D Earth
# all data are fixed except the size which will change at each step
xs = df_pandemic['x'].values
ys = df_pandemic['y'].values
zs = df_pandemic['z'].values
s = ipv.scatter(xs * 1.05, ys * 1.05, zs * 1.05,  # factor to move the arrow out of the 3D Earth
                vx=xs, vy=ys, vz=zs,
                size=df_pandemic[cols].T,
                color='darkred', marker="arrow")

# plot the coastlines
for shape in sf.shapes():
    df = pd.DataFrame(shape.points, columns=['long', 'lat'])
    compute_xyz(df)
    xs = df['x'].values
    ys = df['y'].values
    zs = df['z'].values
    ipv.pylab.plot(xs, ys, zs, color='darkgrey')
    
# display parameters
ipv.xyzlim(1)
ipv.style.use('minimal')
animation_control = ipv.animation_control(s, interval=200, sequence_length=len(cols))
ipv.show()

# control parameters
control = pythreejs.OrbitControls(controlling=fig.camera)
fig.controls = control
control.rotateSpeed = 0.07
control.autoRotate = True
fig.render_continuous = True

The animated GIF file was recorded with the LICEcap software; see https://www.cockos.com/licecap/