Section 4 The Mathematics of Economic Complexity

This section is adapted from [7] but it differs at some points. In particular [11] and [12] provide useful technical details.

4.1 Revealed Comparative Advantage (RCA)

When associating countries to products it is important to take into account the size of the export volume of countries and that of the world trade of products. This is because, even for the same product, I expect the volume of exports of a large country like China, to be larger than the volume of exports of a small country like Uruguay. By the same, I expect the export volume of products that represent a large fraction of world trade, such as cars or footwear, to represent a larger share of a country’s exports than products that account for a small fraction of world trade, like cotton seed oil or potato flour.

To make countries and products comparable I use Balassa’s definition of Revealed Comparative Advantage (RCA). Balassa’s definition says that a country has revealed Comparative advantage in a product if it exports more than its “fair” share, that is, a share that is equal to the share of total world trade that the product represents. For example, in 2008, with exports of $42 billion, soybeans represented 0.35% of world trade. Of this total, Brazil exported nearly $11 billion, and since Brazil’s total exports for that year were $140 billion, soybeans accounted for 7.8% of Brazil’s exports. This represents around 21 times Brazil’s “fair share” of soybean exports (7.8% divided by 0.35%), so I can say that Brazil has revealed comparative advantage in soybeans.

Let \(\hat{x}_{c,p}\) represent the exports of country \(c\) in product \(p\), just as it was defined in 1, I can express the Revealed Comparative Advantage that country \(c\) has in product \(p\) as:

\[\begin{equation} \renewcommand{\vec}[1]{\boldsymbol{#1}} \newcommand{\R}{\mathbb{R}} \tag{4.1} RCA_{c,p} = \frac{\hat{x}_{c,p}}{\sum_c \hat{x}_{c,p}} / \frac{\sum_p \hat{x}_{c,p}}{\sum_{c}\sum_{p} \hat{x}_{c,p}} \end{equation}\]

4.2 Smooth Revealed Comparative Advantage (SRCA)

I smooth changes in export volumes induced by the price fluctuation of commodities by using a modification of (4.1) in which \(x_{c,p}\) is averaged over the previous three years by using weights:

\[ SRCA_{c,p}^{(t)} = \frac{\tilde{x}_{c,p}^{(t)}}{\sum_c \tilde{x}_{c,p}^{(t)}} / \frac{\sum_p \tilde{x}_{c,p}^{(t)}}{\sum_{c}\sum_{p} \tilde{x}_{c,p}^{(t)}} \]

Where

\[ \tilde{x}_{c,p}^{(t)} = \frac{2\hat{x}_{c,p}^{(t)} + \hat{x}_{c,p}^{(t-1)} + \hat{x}_{c,p}^{(t-2)}}{4} \]

Consider that for some years this needs to be altered. As an example, for HS2007 classification there is no data before 2007, hence for the year 2007 the SRCA is the same as RCA and for the year 2008 the \(x_{c,p}^{(t-2)}\) part is omitted and the denominator is changed to 3.

I use this measure to construct a matrix that connects each country to the products that it makes. The entries in the matrix are 1 if country \(c\) exports product \(p\) with Revealed Comparative Advantage larger than 1, and 0 otherwise. I define this as the matrix \(S \in \mathbb{R}^{C\times P}\) with entries:

\[\begin{equation} \tag{4.2} s_{c,p} = \begin{cases}1 & \text{ if } SRCA_{c,p}^{(t)} \geq 1\cr 0 & \text{ otherwise} \end{cases} \end{equation}\]

\(S\) is the matrix summarizing which country makes what, and is used to construct the product space and our measures of economic complexity for countries and products.

In order to compute some of the equations exposed here I had to reduce \(S\) by removing cols and rows where each entry is zero. For some years the number of countries \(C\) can be less than 128 as it was exposed in 3. The number of products \(P\) for a given year can also experience a small decrease.

It is also important that beyond computability some products were intensively exported in past decades but then their were replaced by other products. Think of floppy disks exports or saltpeter exports.

4.3 Diversity and Ubiquity

With \(S\) defined as in the previous sections, I can measure Diversity and Ubiquity simply by summing over the rows or columns of that matrix.

I define Diversity as:

\[k_{c}^{(0)} = \sum_p s_{c,p}\]

And Ubiquity as: \[k_{p}^{(0)} = \sum_c s_{c,p}\]

4.4 Reflections Method

To generate a more accurate measure of the number of capabilities available in a country, or required by a product, I need to correct the information that diversity and ubiquity carry by using each one to correct the other. For countries, this is to calculate the average ubiquity of the products that it exports, the average diversity of the countries that make those products and so forth. For products, this is to calculate the average diversity of the countries that make them and the average ubiquity of the other products that these countries make. This can be expressed by the recursion:

\[\begin{equation} \tag{4.3} k_{c}^{(n)} = \frac{1}{k_{c}^{(0)}} \sum_p s_{c,p} k_{p}^{(n-1)} \end{equation}\] \[\begin{equation} \tag{4.4} k_{p}^{(n)} = \frac{1}{k_{p}^{(0)}} \sum_c s_{c,p} k_{c}^{(n-1)} \end{equation}\]

Then I insert (4.4) into (4.3) to obtain:

\[\begin{equation} \tag{4.5} k_{c}^{(n)} = \sum_c \left[\frac{1}{k_{c}^{(0)}} \sum_p s_{c,p} \frac{1}{k_{p}^{(0)}} s_{c,p} \right] k_{c}^{(n-2)} \end{equation}\]

The equation above can be conveniently written as a matrix equation (this formulation takes some ideas from [11] and [12]):

\[\begin{equation} \tag{4.6} \vec{k}^{(n)} = \hat{S}\vec{k}^{(n-2)} \end{equation}\]

Where

\[\begin{equation} \tag{4.7} \hat{s}_{c,c'} = \frac{1}{k_{c}^{(0)}} \sum_p s_{c,p} \frac{1}{k_{p}^{(0)}} s_{c,p} \end{equation}\]

In a similar way I can define \(\tilde{S}\) with entries:

\[\begin{equation} \tag{4.8} \tilde{s}_{p,p'} = \frac{1}{k_{p}^{(0)}} \sum_c s_{c,p} \frac{1}{k_{c}^{(0)}} s_{c,p} \end{equation}\]

I note (4.6) is satisfied when \(k_{c}^{(n)} = k_{c}^{(n-2)} = 1\). This is the eigenvector of \(\tilde{S}\) which is associated with its largest eigenvalue. Since this eigenvector is a vector of ones, it is not informative. I look, instead, for the eigenvector associated with the second largest eigenvalue. This is the eigenvector that captures the largest amount of variance in the system and is our measure of Economic Complexity.

In particular, the interpretation of the scores changes when considering odd or even iteration order \(n\), high-order iterations are difficult to interpret, and the process asymptotically converges to a trivial fixed point.

For the analysis I used \(n=19\) to compute \(k_c\) and \(n=20\) to compute \(k_p\).

4.5 Economic Complexity Index (ECI)

From the Reflections Method, I define the Economic Complexity Index (ECI) as:

\[\begin{equation} \tag{4.9} ECI_c = \frac{v_c - \mu_{v}}{\sigma_{v}} \end{equation}\]

Where

  • \(\vec{v}\) is a vector whose coordinates are given by \(k_{c}^{(19)}\) where \(p \in 1,\ldots,C\)
  • \(\mu_v = \sum_c v_c / C\) (mean of \(\vec{v}\))
  • \(\sigma_v = \sqrt{\sum_c (v_c - \mu_v)^2 / (C - 1)}\) (standard deviation of \(\vec{v}\))

4.6 Product Complexity Index (PCI)

Similar to the Economic Complexity Index (ECI), I define a Product Complexity Index (PCI). Because of the symmetry of the problem, this can be done simply by exchanging the index of countries \(c\) with that for products \(p\) in the definitions above. I define PCI as:

\[\begin{equation} \tag{4.10} PCI_p = \frac{w_p - \mu_{w}}{\sigma_{w}} \end{equation}\]

Where

  • \(\vec{w}\) is a vector whose coordinates are given by \(k_{p}^{(20)}\) where \(p \in 1,\ldots,P\)
  • \(\mu_w = \sum_p w_p / P\) (mean of \(\vec{w}\))
  • \(\sigma_w = \sqrt{\sum_p (w_p - \mu_w)^2 / (P - 1)}\) (standard deviation of \(\vec{w}\))

4.7 Product Proximity

To make products you need chunks of embedded knowledge which I call capabilities. The capabilities needed to produce one good may or may not be useful in the production of other goods. Since I do not observe capabilities directly, I create a measure that infers the similarity between the capabilities required by a pair of goods by looking at the probability that they are coexported. To quantify this similarity I assume that if two goods share most of the requisite capabilities, the countries that export one will also export the other.

Our measure is based on the conditional probability that a country that exports product \(p\) will also export product \(p'\) (see figure). Since conditional probabilities are not symmetric I take the minimum of the probability of exporting product \(p\), given \(p'\) and the reverse, to make the measure symmetric and more stringent. As an example, assume that in the year 2017, 20 countries exported wine, 25 exported grapes and 15 exported both, all with \(RCA > 1\). Then, the product proximity between wines and grapes is 15/25.

For a pair of goods \(p\) and \(p'\) I define Product Proximity \(\Phi \in \mathbb{R}^{P\times P}\) as:

\[ \Phi = (S^t S) \odot U \]

Where \(\odot\) denotes element-wise multiplication and \[u_{p,p'} = 1 / \max(k_{p}^{(0)}, k_{p'}^{(0)})\]

In other terms, each entry of \(\Phi\) corresponds to: \[ \phi_{p,p'} = \frac{\sum_c s_{c,p} s_{c,p'}}{\max(k_{p}^{(0)}, k_{p'}^{(0)})} \]

An illustrative example for the product proximity measure

Figure 3.1: An illustrative example for the product proximity measure

4.8 Country Proximity

Similar to 4.8, I define Country Proximity \(\Lambda \in \mathbb{R}^{C\times C}\) as:

\[ \Lambda = (SS^t) \odot D \]

Where \[d_{c,c'} = 1 / \max(k_{c}^{(0)}, k_{c'}^{(0)})\]

In other terms, each entry of \(\Lambda\) corresponds to: \[ \lambda_{c,c'} = \frac{\sum_p s_{c,p} s_{c,p'}}{\max(k_{c}^{(0)}, k_{c'}^{(0)})} \]

4.9 Product Density

This concept appears in [13], however the matricial formulation is not a part of the original article.

Product Density is the weighted mean proximity of a new potential product \(p\) to a country’s current productive capability. In formal terms this is the matrix \(\Psi \in \mathbb{R}^{C\times P}\) defined by:

\[ \Psi = (S\Phi) \oslash (\tilde{S}\Phi) \]

Where \(\hat{m}_{c,p} = 1\) and \(\oslash\) denotes element-wise division.

In other terms, each entry of \(\Psi\) corresponds to: \[ \psi_{c,p} = \frac{\sum_{p} s_{c,p} \phi_{p,p'}}{\sum_{p} \phi_{p,p'}} \]

4.10 Country Density

Similar to 4.9, I define Country Density as the matrix \(\Omega \in \mathbb{R}^{C\times P}\):

\[ \Omega = (\Lambda S) \oslash (\Lambda\tilde{S}) \]

In other terms, each entry of \(\Omega\) corresponds to: \[ \omega_{c,p} = \frac{\sum_{c} s_{c,p} \phi_{c,c'}}{\sum_{c} \phi_{c,c'}} \]

References

[7] C. Hidalgo, R. Hausmann, S. Bustos, M. Coscia, A. Simoes, and M. Yildirim, The atlas of economic complexity: Mapping paths to prosperity. Mit Press, 2014.

[11] M. Mariani, A. Vidmer, M. Medo, and Y.-C. Zhang, “Measuring economic complexity of countries and products: Which metric to use?” The European Physical Journal B, vol. 88, no. 11, p. 293, 2015.

[12] E. Kemp-Benedict, “An interpretation and critique of the method of reflections,” University Library of Munich, 2014.

[13] C. Hidalgo, B. Klinger, A.-L. Barabási, and R. Hausmann, “The product space conditions the development of nations,” Science, vol. 317, no. 5837, pp. 482–487, 2007.