Spotting trends in the manufacturing growth dynamics: Which region grew the fastest?

TL;DR: I visualize the EU countries’ manufacturing growth rates in a pandas/matplotlib bar chart to show that performance depends mostly on geographical position: the East beats the South.

Long Description: In the previous project, I plotted the growth dynamics of the manufacturing branch of the EU countries’ industry production, measured…


Different countries’ growth dynamics at a glance with bar charts

TL;DR: I use pandas’ interface to the matplotlib library to create bar charts that visualize the manufacturing growth dynamics of European countries.

Long Description: I read in the dataframe that contains the slope and intercept parameter values from the linear regression with scikit-learn that I performed in the last project. Using the…
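As a rough illustration of the bar-chart idea, here is a minimal sketch. The country codes and slope values are made up for the example, and the dataframe name `params` is an assumption, not the one used in the post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical slope values per country (illustrative numbers only).
params = pd.DataFrame(
    {"slope": [0.012, -0.004, 0.007]},
    index=["AA", "BB", "CC"],
)

# Sort so the fastest-growing country comes first, then draw one bar per country
# through pandas' plotting interface to matplotlib.
ordered = params["slope"].sort_values(ascending=False)
ax = ordered.plot.bar()
ax.set_xlabel("country")
ax.set_ylabel("slope")
ax.figure.tight_layout()
```

Sorting before plotting is a design choice that makes the ranking readable at a glance; the returned `ax` is a regular matplotlib Axes and can be customized further.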


Reducing complexity – from a time series to a single number: coding

TL;DR: Using the linear model from Python’s scikit-learn package, I obtain the slopes in the EU industry production time series for each country.

Long Description: I prepare the normalized EU industry production index dataset for the fit routine of the scikit-learn linear model by forcing the time stamps into a 2D numpy…
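The fitting step might look roughly like the sketch below. The data are synthetic stand-ins, the column names `AA` and `BB` are hypothetical, and measuring time as "months since start" is an assumption; the post's own preprocessing is not shown in this excerpt.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data: two made-up normalized index series.
dates = pd.date_range("2000-01-01", periods=24, freq="MS")
t = np.arange(len(dates), dtype=float)  # months since the first time stamp
df = pd.DataFrame(
    {"AA": 1.0 + 0.01 * t, "BB": 1.0 - 0.005 * t},
    index=dates,
)

# scikit-learn expects X with shape (n_samples, n_features),
# so the 1D time axis is reshaped into a 2D column.
X = t.reshape(-1, 1)

# Fit one linear model per country and collect slope and intercept.
rows = {}
for country in df.columns:
    model = LinearRegression().fit(X, df[country].to_numpy())
    rows[country] = {"slope": model.coef_[0], "intercept": model.intercept_}

params = pd.DataFrame.from_dict(rows, orient="index")
print(params)
```

Each country's time series is thereby reduced to two numbers, which is what makes a later country-by-country comparison possible.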


Reducing complexity – from a time series to a single number: modeling

TL;DR: I select a linear model with slope and intercept parameters to describe the growth dynamics of the EU industry production index of each country.

Long Description: Inspired by line plots of the EU industry production index time series that were previously normalized by the EU average time series, I choose to…


Removing common trends from a set of time series to highlight their differences

TL;DR: I divide the EU industry production index time series for each country by the smoothed EU average time series to bring out the countries’ individual development for further modeling.

Long Description: Using a chain of pandas methods to obtain a rolling mean, I smooth the EU average time series of the…
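A minimal sketch of the normalization, under stated assumptions: the numbers are made up, the EU average is computed here as the plain mean across the example columns (the original dataset may carry its own EU aggregate), and the window length of 3 is arbitrary.

```python
import pandas as pd

# Hypothetical index values for two made-up countries.
dates = pd.date_range("2000-01-01", periods=6, freq="MS")
df = pd.DataFrame(
    {
        "AA": [100.0, 102.0, 104.0, 106.0, 108.0, 110.0],
        "BB": [100.0, 100.0, 100.0, 100.0, 100.0, 100.0],
    },
    index=dates,
)

# Assumption: the average over the columns stands in for the EU average series.
eu_avg = df.mean(axis=1)

# Smooth the average with a centered rolling mean; min_periods=1 keeps the edges.
eu_smooth = eu_avg.rolling(window=3, center=True, min_periods=1).mean()

# Divide every country's series by the smoothed average, row by row.
normalized = df.div(eu_smooth, axis=0)
print(normalized)
```

Values above 1 then mark periods in which a country ran ahead of the common trend, and values below 1 mark periods in which it lagged behind.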


Exploring the industry production history with EDA

TL;DR: I use statistical and graphical tools to perform exploratory data analysis (EDA) on the EU industry production dataset as a starting point for modeling the time series.

Long Description: With the help of the pandas describe method and the matplotlib package, I explore the statistics of the EU industry production dataset,…


Making the numbers shine: Cleaning EU industry production index values

TL;DR: I make EU industry production index values, which I previously put in a tidy form, ready for analysis by splitting numbers and flag values with pandas methods.

Long Description: Now that the EU industry production dataset has a tidy dataframe structure, I clean up the production index values. The values are…
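One way to split numbers from flags is sketched below. The raw format is an assumption: a number optionally followed by a space and a letter flag, with ":" marking a missing value. The sample strings are invented for illustration.

```python
import pandas as pd

# Hypothetical raw values: plain number, number with flag, missing marker.
raw = pd.Series(["101.3", "98.7 p", ": ", "105.0 e"], name="raw")

# Split on whitespace at most once: first part is the number, second the flag.
parts = raw.str.split(n=1, expand=True)

clean = pd.DataFrame(
    {
        # Anything that is not a number (such as ":") becomes NaN.
        "value": pd.to_numeric(parts[0], errors="coerce"),
        # Rows without a flag are left as missing.
        "flag": parts[1],
    }
)
print(clean)
```

Keeping the flag in its own column preserves the information for later filtering, while the value column becomes a proper float column.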


Bringing an EU industry production dataframe into good shape

TL;DR: I use pandas dataframe methods to bring EU industry production data into a tidy format to facilitate further analysis.

Long Description: Here I use the Python packages SQLAlchemy and pandas to read in a subset of the EU industry production dataset from a local PostgreSQL database into a dataframe. I apply…


Using SQL queries to extract data from a PostgreSQL database

TL;DR: Making use of SQLAlchemy and SQL queries, I extract EU industry production data for further analysis from the PostgreSQL database where I previously stored it.

Long Description: Building on the previous projects, I use SQLAlchemy to connect to a local PostgreSQL database that contains as a table an EU industry production…


Storing a pandas dataframe in a PostgreSQL database

TL;DR: I store EU industry production data in a PostgreSQL database using the SQLAlchemy package.

Long Description: Building on the previous project, I download an EU industry production dataset from the EU Open Data Portal, put it in a pandas dataframe, and store it in a PostgreSQL database. Using such a…
