by Karlijn Willems

How Machines Learn: A Practical Guide


You may have heard about machine learning from interesting applications like spam filtering, optical character recognition, and computer vision.

Getting started with machine learning is a long process that involves going through several resources. There are books for newbies, academic papers, guided exercises, and standalone projects. It’s easy to lose track of what you need to learn among all these options.

So in today’s post, I’ll list seven steps (and 50+ resources) that can help you get started in this exciting field of Computer Science, and ramp up toward becoming a machine learning hero.

Note that this list of resources is not exhaustive and is meant to get you started. There are many more resources around.

1. Get the necessary background knowledge

You might remember from DataCamp’s Learn Data Science infographic that mathematics and statistics are key to starting machine learning (ML). The foundations might seem quite easy because it’s just three topics. But don’t forget that these are in fact three broad topics.

There are two things that are very important to keep in mind here:

  • First, you’ll definitely want some further guidance on what exactly you need to cover to get started.
  • Second, these are the foundations of your further learning. Don’t be scared to take your time. Get the knowledge on which you’ll build everything.

The first point is simple: it’s a good idea to cover linear algebra and statistics. These two are the bare minimum that one should understand. But while you’re at it, you should also try to cover topics such as optimization and advanced calculus. They will come in handy when you’re getting deeper into ML.

Here are some pointers on where to get started if you are starting from zero:

Statistics is one of the keys to learning ML

If you’re more into books, consider the following:

However, in most cases, you’ll start off already knowing some things about statistics and mathematics. Or maybe you have already gone through all the theory resources listed above.

In these cases, it’s a good idea to recap and assess your knowledge honestly. Are there any areas that you need to revise or are you good for now?

If you’re all set, it’s time to go ahead and apply all that knowledge with R or Python. As a general guideline, it’s a good idea to pick one and get started with that language. Later, you can still add the other programming language to your skill set.

Why is all this programming knowledge necessary?

Well, you’ll see that the courses listed above (or those you have taken in school or university) will provide you with a more theoretical (and not applied) introduction to mathematics and statistics topics. However, ML is very applied and you’ll need to be able to apply all the topics you have learned. So it’s a good idea to go over the materials again, but this time in an applied way.
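As a small illustration of what going over statistics "in an applied way" can look like, here is a minimal sketch that uses only Python's standard library. The data (hours studied and exam scores) is made up for this example, and `pearson_correlation` is a hypothetical helper written out from the textbook definition rather than taken from any course mentioned here.

```python
import statistics

# Hypothetical sample: hours studied and exam scores for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 80.0]

mean_scores = statistics.mean(scores)
std_scores = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


r = pearson_correlation(hours, scores)
print(f"mean = {mean_scores:.1f}, std = {std_scores:.2f}, r = {r:.3f}")
```

Writing the formula out by hand once, and then comparing it with a library routine, is one way to check that the theory has really sunk in.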

If you want to master the basics of R and Python, consider the following courses:

When you have nailed down the basics, check out DataCamp’s blog on the 40+ Python Statistics For Data Science Resources. This post offers 40+ resources on the statistics topics you need to know to get started with data science (and by extension also ML).

Also make sure you check out this SciPy tutorial on vectors and arrays and this workshop on Scientific Computing with Python.

To get hands-on with Python and calculus, you can check out the SymPy package.
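To give a flavor of what SymPy offers, here is a minimal sketch that differentiates a function, finds its critical points, and computes a definite integral. The function `f` is an arbitrary polynomial chosen for illustration; it does not come from any of the resources above.

```python
import sympy as sp

x = sp.symbols("x")
f = x**3 - 3 * x  # arbitrary example function

# Symbolic derivative of f with respect to x.
derivative = sp.diff(f, x)

# Critical points: values of x where the derivative is zero.
critical_points = sp.solve(sp.Eq(derivative, 0), x)

# Definite integral of f over the interval [0, 2].
integral = sp.integrate(f, (x, 0, 2))

print(derivative, critical_points, integral)
```

Finding where a derivative vanishes is the same idea that sits behind the optimization routines used to train many ML models, which is why calculus practice pays off later.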

2. Don’t be scared to invest in the “theory” of ML

A lot of people don’t make the effort to go through some more theoretical material because it’s “dry” or “boring.” But going through the theory and really investing your time in it is essential and invaluable in the long run. You’ll better understand new advancements in machine learning, and you’ll be able to link back to your background knowledge. This will help you stay motivated.

Additionally, the theory doesn’t need to be boring. As you read in the introduction, there are so many materials that will make it easier for you to get into it.

Books are one of the best ways to absorb theoretical knowledge. They force you to stop and think once in a while. Of course, reading books is a very static thing to do and it might not agree with your learning style. Nonetheless, try out the following books and see if they work for you:

  • Machine Learning, the textbook by Tom Mitchell, might be old, but it’s gold. This book goes over the most important topics in machine learning in a well-explained and step-by-step way.
  • Machine Learning: The Art and Science of Algorithms that Make Sense of Data (you can see the slides of the book here): this book is great for beginners. There are many real-life applications discussed, which you might find lacking in Tom Mitchell’s book.
  • Machine Learning Yearning: this book by Andrew Ng is not yet complete, but it’s bound to be an excellent reference for those who are learning ML.
  • Algorithms and Data Structures by Jurg Nievergelt and Klaus Hinrichs
  • Data Mining for the Masses by Matthew North: this book guides you through some of the most difficult topics.
  • Introduction to Machine Learning by Alex Smola and S.V.N. Vishwanathan.
Take your time to read books and to study the material covered in them

Videos / MOOCs are awesome for those who learn by watching and listening. There are a lot of MOOCs and videos out there, but it can also be hard to find your way through all those materials. Below is a list of the most notable ones:

At this point, it’s important for you to go over the separate techniques and grasp the whole picture. This starts with understanding key concepts: the distinction between supervised and unsupervised learning, classification and regression, and so on. Manual (written) exercises can come in handy. They can help you understand how algorithms work and how you should go about them. You’ll most often find these written exercises in courses from universities. Check out this ML course by Portland State University.
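The key distinctions mentioned above can be made concrete with a toy sketch in plain Python. Everything here is an assumption made for illustration: the six data points, the labels, the targets, and the helper names (`classify`, `regress`, `two_means`). The supervised part uses a 1-nearest-neighbor rule and the unsupervised part uses a stripped-down k-means with two clusters; neither is presented as the way any particular course teaches it.

```python
# Toy 1-D data: hypothetical measurements.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]

# --- Supervised learning: the answers (labels or targets) come with the data. ---
labels = ["small", "small", "small", "large", "large", "large"]  # classification
targets = [10.0, 12.0, 8.0, 50.0, 53.0, 49.0]                    # regression


def nearest_index(x, data):
    """Index of the training point closest to x."""
    return min(range(len(data)), key=lambda i: abs(data[i] - x))


def classify(x):
    """Classification: predict a discrete category (1-nearest neighbor)."""
    return labels[nearest_index(x, points)]


def regress(x):
    """Regression: predict a continuous value (1-nearest neighbor)."""
    return targets[nearest_index(x, points)]


# --- Unsupervised learning: no labels, the algorithm looks for structure. ---
def two_means(data, iterations=10):
    """Minimal k-means with k=2 on 1-D data; returns the two cluster centers."""
    c1, c2 = min(data), max(data)  # simple initialization at the extremes
    for _ in range(iterations):
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return c1, c2


print(classify(1.3), regress(4.8), two_means(points))
```

Note that `classify` and `regress` both need the known answers for the training points, while `two_means` only ever sees `points`. That difference is the whole supervised versus unsupervised distinction in miniature.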

3. Get hands-on

Knowing the theory and understanding the algorithms by reading and watching is all good. But you also need to move past this stage and get started with some exercises. You’ll learn to implement these algorithms and apply the theory that you’ve learned.

First, you have tutorials which introduce you to the basics of machine learning in Python and R. The best way is, of course, to go for interactive tutorials:

Also check out the following tutorials, which are static and will require you to work in an IDE:

Besides the tutorials, there are also courses. Taking courses will help you apply the concepts that you’ve learned in a focused way. Experienced instructors will help you. Here are some interactive courses for Python and ML:

  • Supervised Learning with scikit-learn: you’ll learn how to build predictive models, tune their parameters, and predict how well they will perform on unseen data, all while using real-world datasets and scikit-learn.
  • Unsupervised Learning in Python: shows you how to cluster, transform, visualize, and extract insights from unlabeled datasets. At the end of the course, you’ll build a recommender system.
  • Deep Learning in Python: you’ll gain hands-on, practical knowledge of how to use deep learning with Keras 2.0, the latest version of a cutting-edge library for deep learning in Python.
  • Applied Machine Learning in Python: introduces the learner to applied ML and focuses more on the techniques and methods than on the statistics behind these methods.
After the theory, take your time to apply the knowledge you have gained.
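To make the supervised-learning workflow from the Python course list more tangible (build a model, tune its parameters, estimate performance on unseen data), here is a minimal sketch with scikit-learn. The choice of the iris dataset, a k-nearest-neighbors classifier, and the candidate values for `n_neighbors` are assumptions made for this example, not the contents of any listed course.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small, real-world dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the data to estimate performance on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Tune the number of neighbors with 5-fold cross-validation on the training set.
search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
    cv=5,
)
search.fit(X_train, y_train)

# Accuracy of the best model on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

The test set is touched only once, at the very end. Tuning against it would give an overly optimistic picture of how the model behaves on new data.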

For those who are learning ML with R, there are also these interactive courses:

  • Introduction to Machine Learning gives you a broad overview of the discipline’s most common techniques and applications. You’ll gain more insight into the assessment and training of different ML models. The rest of the course focuses on an introduction to three of the most basic ML tasks: classification, regression, and clustering.
  • R: Unsupervised Learning provides a basic introduction to clustering and dimensionality reduction in R from an ML perspective. This allows you to get from data to insights as quickly as possible.
  • Practical Machine Learning covers the basic components of building and applying prediction functions with an emphasis on practical applications.

Lastly, there are also books that go over ML topics in a very applied way. If you’re looking to learn with the help of text and an IDE, check out these books:

4. Practice

Practice is even more important than getting hands-on and revising the material with Python. This step was probably the hardest one for me. Check out how other people have implemented ML algorithms when you have done some exercises. Then, get started on your own projects that illustrate your understanding of ML algorithms and theories.

One of the most straightforward ways to practice is to scale up the exercises a little. Pick a bigger exercise that requires you to do more data cleaning and feature engineering.
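For readers who have not met the terms yet, here is a minimal sketch of data cleaning and feature engineering in plain Python. The records, column names, and the specific choices (median imputation, one-hot encoding, min-max scaling) are assumptions made for illustration; real projects usually rely on libraries such as pandas for these steps.

```python
from statistics import median

# Hypothetical raw records with a missing value and a categorical column.
rows = [
    {"age": 34, "city": "Ghent", "income": 42000},
    {"age": None, "city": "Leuven", "income": 39000},
    {"age": 51, "city": "Ghent", "income": 58000},
    {"age": 29, "city": "Antwerp", "income": 31000},
]

# Data cleaning: fill missing ages with the median of the observed ages.
observed_ages = [r["age"] for r in rows if r["age"] is not None]
age_median = median(observed_ages)
for r in rows:
    if r["age"] is None:
        r["age"] = age_median

# Feature engineering: one-hot encode the city and rescale income to [0, 1].
cities = sorted({r["city"] for r in rows})
lo = min(r["income"] for r in rows)
hi = max(r["income"] for r in rows)

features = [
    [r["age"]]
    + [1 if r["city"] == c else 0 for c in cities]
    + [(r["income"] - lo) / (hi - lo)]
    for r in rows
]

for row in features:
    print(row)
```

Each record ends up as a list of numbers (age, one column per city, scaled income), which is the shape most ML algorithms expect as input.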

Tip: don’t forget that there are handy resources to help you out when you’re practicing. Check out these data science cheat sheets.

Practice makes perfect.

5. Projects

Doing small exercises is good. But in the end, you’ll want to make a project in which you can demonstrate your understanding of the ML algorithms with which you’ve been working.

The best exercise is to implement your own ML algorithm. You can read more about why you should do this exercise and what you can learn from it in the following pages:

Next, you can check out the following posts and repositories. They’ll give you some inspiration from others and will show how they have implemented ML algorithms.

Projects can be hard at the start, but they’ll increase your understanding even more.
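As one illustration of what implementing your own ML algorithm can look like, here is a minimal sketch of linear regression fitted by batch gradient descent, written in plain Python. The function name, the learning rate, the number of epochs, and the toy data are all assumptions chosen for this example; it is one possible starting point, not a reference implementation.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ≈ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example with the current w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w ≈ 2 and b ≈ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_regression(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

A useful follow-up exercise is to change the learning rate and watch what happens: too small and the fit converges slowly, too large and the updates diverge.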

6. Don’t stop

Learning ML is something that should never stop. As many will confirm, there are always new things to learn — even when you’ve been working in this area for a decade.

There are, for example, ML trends such as deep learning which are very popular right now. You might also focus on other topics that aren’t central at this point but which might be in the future. Check out this interesting question and the answers if you want to know more.

Papers may not be the first thing that springs to mind when you’re worried about mastering the basics. But they are your way to get up to date with the latest research. Papers are not for those who are just starting out. They are definitely a good fit for those who are more advanced.

Other technologies are also something to consider. But don’t worry about them when you’re just starting out. You can, for example, focus on adding Python or R (depending on which one you already know) to your skill set. You can look through this post to find interesting resources.

If you also want to move towards big data, you could consider looking into Spark. Here are some interesting resources:

Other programming languages, such as Java, JavaScript, C, and C++, are gaining importance in ML. In the long run, you can consider adding one of these languages to your to-do list. You can use these blog posts to guide your choice:

You’re never done learning.

7. Make use of all the material that is out there

Machine learning is a difficult topic which can make you lose your motivation at some point. Or maybe you feel you need a change. In such cases, remember that there’s a lot of material on which you can fall back. Check out the following resources:

Podcasts are a great resource for continuing your journey into ML and staying up to date with the latest developments in the field:

There are, of course, many more podcasts, but this list is just to get you started!

Documentation and package source code are two ways to get deeper into the implementation of the ML algorithms. Check out some of these repositories:

  • scikit-learn: Well-known Python ML package
  • Keras: Deep learning package for Python
  • caret: very popular R package for Classification and Regression Training

Visualizations are one of the newest and trendiest ways to get into the theory of ML. They’re fantastic for beginners, but also very interesting for more advanced learners. The following visualizations will intrigue you and will help you gain more understanding into the workings of ML:

Some variety in your learning can and will motivate you even more.

You Can Get Started Now

Now it’s up to you. Learning ML is a continuous process, so the sooner you get started, the better. You have all of the tools in your hands now to get started. Good luck and make sure to let us know how you’re progressing.

This post is based on an answer I gave to the Quora question How Does A Total Beginner Start To Learn Machine Learning.