I discuss the scope of the analysis of the EU industry production dataset and point to possible extensions with additional datasets.
In the previous study I found that geography is an important factor for characterizing the industry production growth of EU countries. However, this leaves more questions open than it answers. Such questions, e.g. “What is the production in absolute numbers?”, can be used to find natural extensions of the study and point to specific datasets to analyze, thus providing guidance for choosing the next steps. In this example, a candidate dataset is the “annual detailed enterprise statistics for industry”, also available at Eurostat via the EU Open Data Portal.
As described in the project summary report, I imported a dataset containing the EU countries’ industry production index from the EU Open Data Portal, extracted the information on the manufacturing branch, tidied, analyzed and visualized it, and distilled a growth parameter using a model. The result: Eastern European countries grew much more dynamically than the EU average, in stark contrast to Southern Europe.
So much data…
However, I only extracted a small dataset, limiting the conclusions that can be drawn. The manufacturing branch is just one subset of the industry sector (others are, e.g., mining or electricity), and the production index is only one indicator among others for economic growth.
A more comprehensive analysis would combine the production indices for the entire industry sector with other datasets available in the EU Open Data Portal (e.g., producer prices or labor input), as well as material from other sources, e.g. the World Bank Open Data Catalog. This study demonstrated that data from other categories like geography or demographics can be beneficial: the power lies in the combination.
Asking the right question(s)
In this sea of data, it is certainly possible to discover hidden trends and correlations, but this can be tedious and time-consuming without narrowing down the task.
Things get easier and more economical if there is a clear question to be answered.
In the context of this study, such a question might be:
Now that I know the relative growth of the production in the EU countries, what is the production in absolute numbers?
To answer this question, recent numbers for the absolute production values are needed. Those are provided, e.g., by the “annual detailed enterprise statistics for industry” dataset. The table includes, among other statistics, the production value in million Euros; the most recent data is for 2016.
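As a minimal sketch of how the two pieces of information could be combined, the snippet below multiplies a production value by a relative growth rate to obtain an absolute yearly increase. It assumes that the growth parameter from the previous study can be expressed as a fractional growth rate per year, which is my simplification here. The country labels, the numbers, and the names `production_value_meur`, `annual_growth_rate`, and `absolute_growth` are hypothetical placeholders, not Eurostat figures.

```python
# Illustrative placeholder values only, NOT taken from Eurostat.
# Production value in million EUR per (hypothetical) country.
production_value_meur = {"Country A": 1000.0, "Country B": 200.0}

# Assumed fractional growth per year (e.g. 0.04 means 4 % per year).
annual_growth_rate = {"Country A": 0.01, "Country B": 0.04}


def absolute_growth(value_meur, rate):
    """Return the absolute production increase per year in million EUR."""
    return value_meur * rate


# Combine both tables country by country.
absolute_growth_meur = {
    country: absolute_growth(value, annual_growth_rate[country])
    for country, value in production_value_meur.items()
}

for country, growth in sorted(absolute_growth_meur.items()):
    print(f"{country}: {growth:.1f} million EUR per year")
```

With real data, the two dictionaries would be replaced by the production values from the enterprise statistics table and the growth parameters from the earlier analysis, matched on the country code.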
Combining such questions can lead to a more comprehensive understanding of the subject.
Note also the official article on the Eurostat website, which provides both a broader and a deeper analysis of the EU’s manufacturing branch. This gives a glimpse of what is possible when combining more of the available Eurostat datasets.
The information gained from the analysis of the EU industry production index can lead to new questions that may serve as a starting point for subsequent projects. One example is the comparison of the absolute industry production values to find out whether the weaker countries are catching up or the stronger ones are pulling further ahead.
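The catch-up question can be sketched in a few lines. The snippet below projects two production values forward under constant compound growth (an assumption of this sketch, not a result of the study) and checks whether the absolute gap between them shrinks or widens. The function names `project` and `gap_trend` and all numbers are hypothetical placeholders.

```python
def project(value_meur, rate, years):
    """Project a production value forward, assuming constant compound growth."""
    return value_meur * (1.0 + rate) ** years


def gap_trend(weak, strong, years):
    """Compare the absolute gap between two countries now and after `years`.

    `weak` and `strong` are (value_meur, rate) tuples.
    Returns "catching up", "falling behind", or "unchanged".
    """
    gap_now = strong[0] - weak[0]
    gap_later = project(*strong, years) - project(*weak, years)
    if gap_later < gap_now:
        return "catching up"
    if gap_later > gap_now:
        return "falling behind"
    return "unchanged"


# Illustrative placeholder values only, NOT taken from Eurostat.
strong = (1000.0, 0.01)      # large producer, slow relative growth
fast_weak = (200.0, 0.05)    # small producer, fast relative growth
slow_weak = (200.0, 0.02)    # small producer, moderately faster growth

print(gap_trend(fast_weak, strong, years=10))
print(gap_trend(slow_weak, strong, years=10))
```

The second case is the reason the absolute numbers matter: a smaller producer can have the higher relative growth rate and still see the absolute gap widen, because the larger producer's increments start from a much bigger base.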
Such questions thus provide a tool for navigating the endless sea of data.
I am a data scientist with a background in solar physics and long experience of turning complex data into valuable insights. Originally coming from MATLAB, I now use the Python stack to solve problems.