Working with real-world data

As an example of how pymnet can be used to analyze real-world data, we take a look at a dataset that is both interesting and consequential. The Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) is an international agreement governing the legal trade in endangered species. Data on trade in endangered species is crucial for monitoring progress toward the United Nations’ Sustainable Development Goal 15, and the CITES database makes such data available at a very high granularity, accompanied by detailed documentation.

To keep matters simple, we analyze an excerpt from the database, which is shipped with this tutorial for convenience and licensed under the UN Data License (see here for documentation of the preprocessing steps). More precisely, we look at the trade in endangered species for two different years (2010 and 2020), focusing on two different trade terms (live specimens and meat), and restricting ourselves to specimens taken from the wild and traded for commercial purposes.

Our multilayer-network model will be the following:

  • Nodes: importing and exporting countries or territories (represented by their ISO-2 codes)

  • Aspects:

    • Year (two elementary layers: 2010 and 2020)

    • (Trade) Term (two elementary layers: “live” and “meat”)

  • Edges: trade volume between countries, measured in number of specimens (“live”) or kg (“meat”), where we use the maximum amount reported by any of the trade parties as the edge weight
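The edge-weight rule above can be sketched in a few lines of plain Python. This is only an illustration: the function name and the use of None for a missing report are assumptions, and the excerpt shipped with the tutorial already contains the resulting value in its Max_reported_quantity column.

```python
def max_reported_quantity(importer_reported, exporter_reported):
    """Return the larger of the two reported quantities.

    A missing report is passed as None (an assumption for this sketch);
    if both reports are missing, None is returned.
    """
    reported = [q for q in (importer_reported, exporter_reported) if q is not None]
    return max(reported) if reported else None


# Both parties reported, but with different amounts: keep the larger one.
both = max_reported_quantity(20.0, 26.0)
# Only the exporter reported: use that amount.
exporter_only = max_reported_quantity(None, 74.0)
```

Taking the maximum rather than, say, the mean keeps a trade flow in the network even when only one of the two parties reported it.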

Example data

First, we take a look at our data using pandas.
Note that we need to be careful with the NA-handling, since in the ISO-2 country codes, “NA” stands for “Namibia”:

import pandas as pd

df = pd.read_csv('cites-trades_commercial_live-meat_2010-2020.csv', na_values='null', keep_default_na=False)
df.head()
   Year  Term  Importer  Exporter  Max_reported_quantity  Unit
0  2010  live  AD        ES                         26.0  Number of specimens
1  2010  live  AE        AT                         74.0  Number of specimens
2  2010  live  AE        BE                        609.0  Number of specimens
3  2010  live  AE        BH                       2258.0  Number of specimens
4  2010  live  AE        BJ                       1912.0  Number of specimens
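To see why the NA-handling matters, the following sketch reads a tiny in-memory CSV twice, once with the pandas defaults and once with the settings used above. The three-column sample is made up for illustration and is not taken from the CITES excerpt.

```python
import io

import pandas as pd

# Made-up sample: Namibia ("NA") appears as importer and as exporter,
# and one quantity is marked as missing with the literal string "null".
csv_text = "Importer,Exporter,Quantity\nNA,ZA,5\nDE,NA,null\n"

# Default settings: the string "NA" is silently parsed as a missing value.
default = pd.read_csv(io.StringIO(csv_text))
n_missing_default = int(default["Importer"].isna().sum())

# Settings from the tutorial: only "null" counts as missing, "NA" stays a string.
careful = pd.read_csv(io.StringIO(csv_text), na_values="null", keep_default_na=False)
importers = careful["Importer"].tolist()
n_missing_quantity = int(careful["Quantity"].isna().sum())
```

With the defaults, Namibia would disappear from the node set; with keep_default_na=False it is kept, while "null" is still treated as missing.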

Network modeling

Now, let us transform this data into a MultilayerNetwork using pymnet (programmatic I/O from and to CSV files will become available soon):

from pymnet import *
aspect_columns_1 = ["Year", "Term"]
aspect_columns_2 = ["Year", "Term"]
node_columns = ["Exporter", "Importer"]
weight_column = "Max_reported_quantity"
directed = True
fully_interconnected = False

assert len(aspect_columns_1) == len(aspect_columns_2)
net = MultilayerNetwork(aspects=len(aspect_columns_1), directed=directed, fullyInterconnected=fully_interconnected)
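for _, row in df.iterrows():
    # Build the node-layer tuples explicitly; this avoids star-unpacking inside
    # the subscript, which is a syntax error before Python 3.11.
    source = (row[node_columns[0]], *[row[aspect] for aspect in reversed(aspect_columns_1)])
    target = (row[node_columns[1]], *[row[aspect] for aspect in reversed(aspect_columns_2)])
    net[source][target] = row[weight_column]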
assert len(net.edges) == len(df)
len(net.edges)
3378
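The mapping from table rows to edges between node-layer tuples can be illustrated with a plain dictionary. This is a stand-in for illustration only, not pymnet's internal representation. The first two rows are taken from the table above, the third is made up, and the helper name node_layer is hypothetical.

```python
rows = [
    {"Year": 2010, "Term": "live", "Importer": "AD", "Exporter": "ES", "Max_reported_quantity": 26.0},
    {"Year": 2010, "Term": "live", "Importer": "AE", "Exporter": "AT", "Max_reported_quantity": 74.0},
    # Made-up row: same country pair as above, but a different year and term.
    {"Year": 2020, "Term": "meat", "Importer": "AE", "Exporter": "AT", "Max_reported_quantity": 12.5},
]


def node_layer(row, node_column, aspect_columns=("Year", "Term")):
    """Return (node, term, year), mirroring the reversed aspect order of the loop above."""
    return (row[node_column], *[row[aspect] for aspect in reversed(aspect_columns)])


edges = {}
for row in rows:
    key = (node_layer(row, "Exporter"), node_layer(row, "Importer"))
    # A repeated key would overwrite an earlier weight, which is why the
    # tutorial checks that the number of edges equals the number of rows.
    assert key not in edges
    edges[key] = row["Max_reported_quantity"]
```

The same country pair yields distinct edges as long as the year or the term differs, because the layers are part of the key.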

Network visualization

Plotting in pymnet currently supports only undirected networks with a single aspect. Therefore, we first aggregate the bidirectional trade flows between trade partners and create a separate network for each year (2010 and 2020; first aspect).

df["IE"] = [tuple(sorted([x,y])) for x,y in zip(df.Importer,df.Exporter)]
df_undir = df.groupby(["IE","Year"]).agg(dict(Importer="min", Exporter="max",Term="min",Unit="min",Max_reported_quantity="sum")).reset_index()

net2010 = MultiplexNetwork(directed=False, fullyInterconnected=False, couplings="categorical")
net2020 = MultiplexNetwork(directed=False, fullyInterconnected=False, couplings="categorical")

# Keep only large flows between distinct partners; the threshold depends on the term.
large_flows = df_undir.query("Exporter != Importer and ((Max_reported_quantity >= 1000000 and Term == 'live') or "
               "(Max_reported_quantity >= 1000 and Term == 'meat'))")
for _, row in large_flows.iterrows():
    # Rescale "live" counts to thousands of specimens; "meat" stays in kg.
    weight = row[weight_column] / (1000 if row["Term"] == "live" else 1)
    year_net = net2010 if row[aspect_columns_1[0]] == 2010 else net2020
    # Explicit tuples avoid star-unpacking inside the subscript (a syntax error before Python 3.11).
    source = (row[node_columns[0]], *[row[aspect] for aspect in reversed(aspect_columns_1[1:])])
    target = (row[node_columns[1]], *[row[aspect] for aspect in reversed(aspect_columns_2[1:])])
    year_net[source][target] = weight
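The aggregation step relies on a sorted country pair as a direction-independent key. A minimal sketch of the same idea with the standard library, using made-up flows, looks as follows.

```python
from collections import defaultdict

# Made-up directed flows: (exporter, importer, year, quantity).
flows = [
    ("AE", "BE", 2010, 100.0),
    ("BE", "AE", 2010, 50.0),
    ("AE", "BE", 2020, 7.0),
]

totals = defaultdict(float)
for exporter, importer, year, quantity in flows:
    # Sorting the pair gives the same key for both trade directions.
    pair = tuple(sorted([exporter, importer]))
    totals[(pair, year)] += quantity

undirected = dict(totals)
```

Both 2010 flows between AE and BE end up under the key (("AE", "BE"), 2010), so their quantities are summed, just as the groupby on the IE and Year columns does above.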

Now, for each year, we can visualize each trade term (“live” and “meat”; second aspect) as a separate layer of a single-aspect network. We also ensure that the nodes are aligned across all networks and layers, highlight OECD countries, and scale nodes by degree as well as edge widths and edge colors by trade flow:

import matplotlib.pyplot as plt
import networkx as nx
from itertools import product
def get_width(e):
    weight = e[-1]
    return max(1,min(weight / 1000, 5))
G = nx.Graph()
nodes = sorted(set(df_undir.Importer) | set(df_undir.Exporter))
G.add_nodes_from(nodes)
pos = nx.circular_layout(G)
oecd = ['AU', 'AT', 'BE', 'CA', 'CL', 'CO', 'CR', 'CZ', 'DK', 'EE', 'FI', 'FR', 'DE', 'GR', 'HU', 'IS', 'IE', 'IL', 'IT', 'JP', 'LV',
        'LT', 'LU', 'MX', 'NL', 'NZ', 'NO', 'PL', 'PT', 'SK', 'SI', 'KR', 'ES', 'SE', 'CH', 'TR', 'GB', 'US']
config = dict(
    layout="circular",
    nodeSizeRule={"rule": "degree", "propscale": 0.05},
    nodeColorDict={(x, y): "crimson" for x, y in product(oecd, ["live", "meat"])},
    edgeColorRule={"rule": "edgeweight", "colormap": "viridis", "scaleby": 5e-5},
    edgeWidthDict={((e[0], e[2]), (e[1], e[3])): get_width(e) for e in list(net2010.edges) + list(net2020.edges)},
    defaultLayerColor="#DDDDDD",
    defaultLayerLabelLoc=(0.95, 0.95),
    defaultNodeLabelColor="black",
    layergap=1.5,
    nodeCoords=pos,
    defaultLayerAlpha=0.5,
)
fig, ax = plt.subplots(1,1,figsize=(9,9),subplot_kw=dict(projection='3d'), constrained_layout=True)
draw(net2010, **config, ax=ax)
plt.savefig("cites-trades_commercial_live-meat_2010.png", bbox_inches='tight', transparent=True)
[Figure: plot of the 2010 trade network, with “live” and “meat” as layers]
fig, ax = plt.subplots(1,1,figsize=(9,9),subplot_kw=dict(projection='3d'), constrained_layout=True)
draw(net2020, **config, ax=ax)
plt.savefig("cites-trades_commercial_live-meat_2020.png", bbox_inches='tight', transparent=True)
[Figure: plot of the 2020 trade network, with “live” and “meat” as layers]
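To make the styling dictionaries in the configuration above more concrete, the sketch below evaluates get_width and the two dictionary comprehensions on a few made-up edges. The edge layout (node, node, layer, layer, weight) is inferred from how the comprehension above indexes each edge and is an assumption of this sketch.

```python
from itertools import product


def get_width(e):
    # Clamp weight / 1000 to the interval [1, 5].
    weight = e[-1]
    return max(1, min(weight / 1000, 5))


# Made-up edges in the assumed (node, node, layer, layer, weight) layout.
sample_edges = [
    ("AE", "BE", "live", "live", 500.0),
    ("AE", "AT", "meat", "meat", 3000.0),
    ("DE", "US", "meat", "meat", 10000.0),
]

# Keys are pairs of (node, layer) tuples, values are line widths.
widths = {((e[0], e[2]), (e[1], e[3])): get_width(e) for e in sample_edges}

# One (node, layer) key per highlighted country and layer.
colors = {(country, layer): "crimson" for country, layer in product(["DE", "US"], ["live", "meat"])}
```

Small flows are drawn at the minimum width of 1 and very large flows are capped at 5, so a few dominant trade relationships cannot visually swamp the rest of the plot.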