How to break into machine learning

4 minute read

Published: October 24, 2022

I have a master’s degree in civil and structural engineering, where I dedicated the last year of my MSc study, including my MSc thesis, to vibration-based damage detection using machine learning (ML). At that time, I was working together with the Monitoring of Structures research group at Aarhus University, who wanted to apply their structural health monitoring technologies within a probabilistic framework to perform damage detection. I had a solid background in applied mathematics and a semi-solid background in classical statistics, but machine learning was a completely new field to me. As most people would do, I started out by finding book recommendations on the topic, collected the most highly rated books, i.e., Bishop (2006) and Murphy (2012), and dived in. These books are somewhat of a mouthful when you are not familiar with the (Bayesian) ML jargon and workflow. Then I found Andrew Ng’s original online Stanford/Coursera.org course, which became a game changer for me. I now have a PhD in probabilistic modeling and analysis (ML) and work as a postdoc and consultant in this field.

So, to help others break into ML as well, I have put together a progression outline that I would follow if I had to do it all over again in 2022. One thing to note here is that you need a solid foundation in the underlying basis, i.e., mathematics and programming, to fully understand and confidently apply ML technologies in practice. Therefore, this outline will focus not only on building your knowledge of the core ML topics but also on that basis.

Mathematics

Mathematics for ML is covered in detail in the excellent textbook by Deisenroth, Faisal, and Ong (2020), as well as in the online specialization of the same name offered by Imperial College London through Coursera (link). I would recommend taking the specialization while reading up on the subjects in the textbook.

Programming

The most popular programming language for applying ML is Python, so why not focus on this for now? If you have no prior experience in programming, I would start out by following an introductory course. Depending on your interests, it could be a short course (e.g., link) that covers only the basic syntax, or it could be a more elaborate course (e.g., link) that touches upon some of the more advanced functionalities. In this regard, I recommend that you focus your attention on learning Python 3, the newest version, so you should pick a course that is based on Python 3.

Machine learning

For me there is only one way to venture into ML and that is to take Andrew Ng’s ML course, which is now available in a 2022 version (3-course specialization; link) that uses Python for the lab exercises. Andrew’s ability to explain complex topics in a straightforward way is unique to this program, and I have simply not seen a better introduction to the field.

After this general introduction to the ML field, you are ready to dive into more specialized branches that align with your work or research, as well as to put your newly acquired skills into production. The former relies on an individual choice of what to learn next, e.g., you may want to explore deep learning in detail (see e.g., link), whereas the latter relies on learning the data science methodology and tools to support it, i.e., data collection, data preparation/wrangling and splitting, data modeling, model validation and testing, and model deployment (see e.g., link).
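To make the methodology above a bit more concrete, here is a minimal sketch of the splitting, modeling, validation, and testing steps in plain Python. Everything in it is an assumption made for illustration: the data are synthetic, the helper names (`split_data`, `fit_mean`, `fit_line`, `mse`) are hypothetical, and the 60/20/20 split is just one common choice. In practice you would typically reach for a library, but the logic is the same.

```python
import random


def split_data(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the rows and split them into train, validation, and test sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_train = int(train_frac * len(shuffled))
    n_val = int(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def fit_mean(train):
    """Baseline model: always predict the mean target of the training set."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Least-squares fit of y = a + b * x on the training set."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mse(model, rows):
    """Mean squared error of a model on a set of (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


# Data collection (here: synthetic data, y = 2x + 1 plus Gaussian noise).
noise = random.Random(42)
data = [(x, 2.0 * x + 1.0 + noise.gauss(0.0, 0.5))
        for x in (i / 10 for i in range(100))]

# Data splitting.
train, val, test = split_data(data)

# Data modeling: fit each candidate on the training set only.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}

# Model validation: choose the candidate with the lowest validation error.
val_errors = {name: mse(model, val) for name, model in candidates.items()}
best_name = min(val_errors, key=val_errors.get)

# Model testing: report the error of the chosen model on untouched test data.
test_error = mse(candidates[best_name], test)
print(f"selected model: {best_name}, test MSE: {test_error:.3f}")
```

The point of the sketch is the separation of roles: the training set is used to fit the models, the validation set to choose between them, and the test set only once at the end to estimate how the chosen model performs on new data.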

Projects

The best way to gain experience in implementing ML algorithms, as well as the full data science methodology, is by doing so on real data problems. These problems could be your own work-related or research problems, but they could also be problems posted in ML competitions, like those hosted on Kaggle. Which of the two you choose does not matter; what matters is that you practice implementing ML systems for real data problems – the more the merrier.

Moreover, an extremely powerful – and mandatory – skill for ML professionals is version control. In this regard, I would highly recommend taking an introductory course on Git and starting to practice version control on your own ML projects. An introductory Git course that I would recommend is freely available on Udacity (link).

I hope this post gets you well on your way towards entering the exciting field of ML and applying it in your own domain of interest :)

Deisenroth, M.P., Faisal, A.A. and Ong, C.S., Mathematics for machine learning. Cambridge University Press, 2020. (link)

Murphy, K.P., Machine learning: a probabilistic perspective. MIT Press, 2012.

Bishop, C.M., Pattern recognition and machine learning. Springer, 2006.

SebastianGlavind
stglavind
