Pragmatic Data Science | 1, Introduction

Data science toolbox

We are going to be using Python 3 for this course. You need some basic knowledge of how to run Python scripts, how to use pip3 to install Python packages, and basic programming concepts such as data structures, loops and control statements. If you know JavaScript it will help, but it is not mandatory. The same goes for reading academic papers: if you already know where to find them, how to read them and how to understand them it will help, but all algorithms will be handed out so you don’t need to find them yourself. If you know basic arithmetic (addition, subtraction, division and multiplication) you are good to go. The linear algebra needed for the course will be presented in a way that requires no prior knowledge. However, you will be expected to spend a little time on your own making sure you understand it.

Why should you care about data science?

Data science and machine learning (ML) are becoming bigger and more hyped. Python as a language is growing and has a lot of nice libraries for data science and ML.

Because of the hype, you might be expected at your current workplace to have some kind of “in the ballpark” knowledge of these topics. You might want this knowledge to boost your career or to land a prestigious new job. It is no secret that data science jobs are very well paid, if salary is an interest of yours. You might also be interested in learning about a more theoretical area than you normally spend your time in, e.g. linear algebra and its applications. I would also say that the biggest benefit of this tutorial series is that we will actually see practical implementations and use cases, so the videos will be more pragmatic than what you normally see at universities or in online courses.

Content

The content is dynamic and I will take requests for topics that you are interested in. Just leave a request in the comment section on YouTube and I will see what I can do.

One of the topics that will be covered is linear algebra, because many algorithms are based on manipulating matrices and vectors, for example through matrix and vector multiplication. Other topics include data normalization, which makes the data behave nicely so that certain algorithms become practically feasible. We will also touch on similarity measurements, so that we can make sense of our data when comparing high-dimensional data. This is not as trivial as it might seem at first glance.
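
To give a rough feel for these three topics, here is a minimal sketch in plain Python 3 with no extra packages. The function names (mat_vec, min_max_normalize, cosine_similarity) are made up for this illustration. Min-max scaling and cosine similarity are just one common choice each for normalization and similarity, picked here as assumptions. They are not necessarily the methods the course will use.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def min_max_normalize(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Note: this divides by zero if either vector is all zeros.
    return dot / (norm_u * norm_v)


product = mat_vec([[1, 2], [3, 4]], [5, 6])      # [17, 39]
scaled = min_max_normalize([10, 20, 30])         # [0.0, 0.5, 1.0]
similarity = cosine_similarity([1, 0], [1, 1])   # roughly 0.707

print(product, scaled, similarity)
```

In practice a library such as NumPy would typically handle this kind of work, but the plain version shows that nothing beyond basic arithmetic and loops is involved.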