How do I learn machine learning? by Xavier Amatriain
Answer by Xavier Amatriain:
I didn't do a PhD in machine learning (I was mostly focused on signal processing and software engineering), so I get this question a lot. The typical person who asks me this question is a software engineer with a computer science background, so I will address it from that perspective. If you are a math major, for example, my answer might be less useful.
Take an online course
The first thing I tell someone who wants to get into machine learning is to take Andrew Ng's online course. I think Ng's course is very much to the point and very well organized, so it is a great introduction for someone wanting to get into ML. I am surprised when people tell me the course is "too basic" or "too superficial". If they tell me that, I ask them to explain the difference between logistic regression and linear-kernel SVMs, or between PCA and matrix factorization, or to explain regularization or gradient descent. I have interviewed candidates who claimed years of ML experience but did not know the answers to these questions. They are all clearly explained in Ng's course. There are many other online courses you can take after this one, but at this point you are mostly ready to go to the next step.
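As a starting hint for the first of those questions (a rough sketch, not a complete answer): logistic regression and a linear SVM both score a point with a linear function, and the main difference is the loss they minimize. Here the margin is assumed to be y * (w·x + b) with labels y in {-1, +1}; the function names are mine, chosen for illustration.

```python
import math


def logistic_loss(margin):
    """Log loss used by logistic regression: log(1 + exp(-margin)), computed stably."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def hinge_loss(margin):
    """Hinge loss used by a soft-margin linear SVM: max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)


# Compare the two losses for a few margins (negative = misclassified point).
for m in (-2.0, 0.0, 1.0, 3.0):
    print(f"margin={m:+.1f}  logistic={logistic_loss(m):.4f}  hinge={hinge_loss(m):.4f}")
```

The hinge loss is exactly zero once the margin reaches 1, so points far on the correct side stop influencing the SVM solution, while the logistic loss stays positive for every point and yields probability estimates.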
Implement an algorithm
My recommended next step is the following. Get a good ML book (my list is below), read the first intro chapters, and then jump to whatever chapter includes an algorithm you are interested in. Once you have found that algorithm, dive into it, understand all the details, and, especially, implement it. In the online course you will already have implemented some algorithms in Octave, but here I am talking about implementing an algorithm from scratch in a "real" programming language. You can still start with an easy one such as L2-regularized logistic regression or k-means, but you should also push yourself to implement more interesting ones such as LDA (Latent Dirichlet Allocation) or SVMs. You can use a reference implementation in one of the many existing libraries to make sure you are getting comparable results, but ideally you should not look at its code. Force yourself to implement the algorithm directly from the mathematical formulation in the book.
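To make the "easy one" concrete, here is a minimal sketch of L2-regularized logistic regression trained with batch gradient descent, using only the Python standard library. The objective (mean log loss plus (lam/2)·||w||², bias not regularized), the hyperparameter values, the function names, and the toy data are all my assumptions for illustration; a textbook may scale the objective differently.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_l2(X, y, lam=0.1, lr=0.1, epochs=300):
    """Batch gradient descent on mean log loss + (lam / 2) * ||w||^2.

    X is a list of feature lists, y a list of 0/1 labels.
    The bias b is left unregularized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: p(y=1 | x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Data gradient (averaged) plus the L2 penalty gradient lam * w
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predict class 1 when the estimated probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


# Toy data: two Gaussian blobs centered at (-2, -2) and (2, 2).
rng = random.Random(0)
X = ([[rng.gauss(-2, 1), rng.gauss(-2, 1)] for _ in range(50)]
     + [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(50)])
y = [0] * 50 + [1] * 50

w, b = fit_logistic_l2(X, y)
accuracy = sum(p == t for p, t in zip(predict(X, w, b), y)) / len(y)
print("weights:", w, "bias:", b, "training accuracy:", accuracy)
```

To check it against a library, fit the same data with a reference implementation and compare the learned weights, keeping in mind that libraries often parameterize the penalty differently. In scikit-learn's `LogisticRegression`, for example, `C` is an inverse regularization strength applied to the summed loss, so under the objective assumed here it would roughly correspond to `C = 1 / (n * lam)`.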
Some book recommendations
So, what are some good books to do this? Many have been mentioned before. Some of my favorites:
- Kevin Murphy's
- Hastie, Tibshirani, and Friedman's
- David Barber's
- Larry Wasserman's (more details on this book in my edit below)
You can also go directly to a research paper that introduces an algorithm or approach you are interested in and dive into it.
My main point is that machine learning is about both breadth and depth. You are expected to know the basics of the most important algorithms. On the other hand, you are also expected to understand the low-level, complicated details of algorithms and their implementation. I think the approach I am describing addresses both of these dimensions, and I have seen it work.
Ready for a career in Machine Learning?
The next logical question some people ask is whether they should now be ready to start a career in machine learning. That is, of course, a different question, and it is covered elsewhere.
In response to some comments and questions, I feel that I should add another book recommendation. If you feel like you lack some background in Statistics, I would totally recommend:
- Larry Wasserman's
This book covers all the topics in statistics that you need for understanding ML concepts. As a matter of fact, the book itself has a pretty good introduction to many of the typical ML approaches such as regression and classification.