Pragmatic Data Science | 3, meta analysis

Before you start implementing your application, it is important to do some preliminary meta-analysis of how the data is constructed. The mindset to have is that thinking is fast but implementing is slow, so we want to build the right kind of application: one that solves a problem that actually exists.

We also need to do some formatting of the structure of the data. We start by reading the data from the CSV file into memory. This is fine for small data sets such as ours, but data sets that do not fit in memory call for other approaches. We also need to convert data types, such as dates, into something that is easier to work with when using arithmetic operators (plus, minus, multiplication, division).
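As a rough illustration of the two steps above, here is a minimal sketch using only the standard library. The column names (`date`, `temperature`), the date format and the file name `data.csv` are assumptions made for the example, not details of the actual data set.

```python
import csv
from datetime import datetime


def parse_rows(lines, date_column="date", date_format="%Y-%m-%d"):
    """Read CSV lines into a list of dicts and parse the date column.

    `lines` can be an open file or any iterable of strings.
    The column name and date format are hypothetical defaults.
    """
    rows = []
    for row in csv.DictReader(lines):
        # Turn the date string into a datetime so we can do arithmetic on it.
        row[date_column] = datetime.strptime(row[date_column], date_format)
        rows.append(row)
    return rows


# Typical use with a file on disk (hypothetical file name):
#
#     with open("data.csv", newline="") as handle:
#         rows = parse_rows(handle)
```

Once the dates are `datetime` objects, subtracting two of them gives a `timedelta`, so something like `(rows[1]["date"] - rows[0]["date"]).days` works directly.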

We will be using Python 3 for this implementation, so make sure that you are not using Python 2 if you intend to follow along with the series. We will also be using pyplot from matplotlib. If you don’t have it installed, you can install it with pip3.

pip3 install matplotlib

We start by plotting the temperature, in Fahrenheit, which gives the following graph.
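A plot of this kind could be produced with a sketch along the following lines. It assumes the data is already in memory as a list of dicts with a parsed `date` and a `temperature` field; both field names and both function names are hypothetical.

```python
from datetime import datetime


def to_series(rows, date_key="date", temp_key="temperature"):
    """Split rows into parallel lists of dates and float temperatures."""
    dates = [row[date_key] for row in rows]
    temps = [float(row[temp_key]) for row in rows]
    return dates, temps


def plot_temperature(rows):
    """Draw temperature (Fahrenheit) against time with pyplot."""
    # Imported here so the helper above works without matplotlib installed.
    import matplotlib.pyplot as plt

    dates, temps = to_series(rows)
    fig, ax = plt.subplots()
    ax.plot(dates, temps)
    ax.set_xlabel("Date")
    ax.set_ylabel("Temperature (°F)")
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    plt.show()
```

Calling `plot_temperature(rows)` opens a window with the graph. Keeping the data preparation in its own small function makes it easy to check the numbers before anything is drawn.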

We can see a recurring trend in the temperature over time, which is of course expected by anyone who has ever experienced seasons. In the coming tutorials we are going to exploit this temperature trend to see how many people Gunnar needs to call in to help him out in the shop.

If you want to convert the temperature to Celsius you can use the following formula.
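The standard conversion is C = (F - 32) × 5 / 9. As a small Python helper (the function name is just an illustrative choice), it might look like this:

```python
def fahrenheit_to_celsius(fahrenheit):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9
```

For example, the freezing point of water, 32 °F, maps to 0 °C, and the boiling point, 212 °F, maps to 100 °C.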