As expected, the first lesson is about why one should study data science, a high-level overview of what is known as The Data Science Process, and some CS109-specific details (like when the labs are held, homework assignments etc.) that are not applicable to me.
Why data science?
Basically for crazy good job prospects! But jokes aside, you should have other motivations for wanting to learn data science and not just do it for the money. It is going to be a long and arduous journey and, honestly, money can be had more easily elsewhere! I, too, have a few reasons why I chose to learn data science.
The Data Science Process
The 5 steps in the image above give us a rough idea of what data science is all about: asking interesting questions, getting the data that you need, exploring it, modelling it to see if your hypothesis makes sense, and communicating the results. It should be stressed that you do not need to follow the order shown above strictly. For instance, sometimes you might be given some data and asked to find interesting patterns.
|Ask an interesting question||What do you want to achieve from analysing these data?|
|Get the Data||If you were provided the data, you want to ask questions like |
|Explore the Data||Exploring the data is all about plotting it out, looking for patterns, anomalies, outliers etc.|
|Model the Data||Build, fit and validate the model (whatever that means).|
|Communicate / Visualise the data||Share your findings with others effectively through a visual medium such as graphs, diagrams or plots.|
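To make the five steps a bit more concrete, here is a minimal Python sketch that walks through them on a toy problem. Everything in it is my own assumption for illustration and not taken from the CS109 material: the synthetic data (a noisy straight line), the function names (`get_data`, `explore`, `fit_line`, `validate`, `run_process`), the choice of a least-squares line as the model and the 80/20 train/test split. Summary statistics stand in for the plots you would normally draw while exploring.

```python
import random
import statistics


def get_data(n=200, seed=0):
    """Step 2: get the data. Here it is synthetic: y = 3x + 2 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def explore(xs, ys):
    """Step 3: explore the data. Summary statistics stand in for plots."""
    return {
        "n": len(xs),
        "x_mean": statistics.mean(xs),
        "y_mean": statistics.mean(ys),
        "y_stdev": statistics.stdev(ys),
    }


def fit_line(xs, ys):
    """Step 4a: build and fit the model (ordinary least-squares line)."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def validate(model, xs, ys):
    """Step 4b: validate the model on held-out data (mean absolute error)."""
    slope, intercept = model
    return statistics.mean(
        abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)
    )


def run_process():
    # Step 1: ask an interesting question.
    question = "Does y grow roughly linearly with x?"

    xs, ys = get_data()
    summary = explore(xs, ys)

    # Hold out the last 20% of the points for validation.
    split = int(0.8 * len(xs))
    model = fit_line(xs[:split], ys[:split])
    error = validate(model, xs[split:], ys[split:])

    # Step 5: communicate the result (a sentence here, a chart in practice).
    report = (
        f"{question} Fitted y = {model[0]:.2f}x + {model[1]:.2f} "
        f"on {split} points; held-out mean absolute error {error:.2f}."
    )
    return summary, model, error, report


if __name__ == "__main__":
    print(run_process()[3])
```

In a real project each function would be much bigger, and you would likely loop back from a later step to an earlier one (say, from validation back to getting more data), which is the same point as not having to follow the order strictly.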
Aaaand that is about it for the first lesson. If you notice any mistakes, please let me know! I am also still trying to figure out a better way of communicating my thoughts and what I learn, so if you have any tips, do send them my way!
Last modified on 11 October 2020.