Wednesday, July 27, 2011

Learning Karate by Waxing Cars

I want to share an article by Peter Norvig that I recently read, which discusses Artificial Intelligence, statistical models and machine learning. Before posting the link, I will set up the discussion with an example:

When I was a child I learned how to write proper Spanish by following what I will refer to in this post as the karate-kid or Mr. Miyagi approach. Consider two ways to learn and improve your basic writing skills:
1. Take a class on grammatical and orthographic rules.
2. Read lots of grammatically and orthographically correct text, not bothering about rules.

With the first approach you can get a sense of how to construct correct sentences early on, but the results depend heavily on having the rules hard-coded in your brain. Practicing those rules on lots of examples can certainly help reinforce them.

The second approach does not involve learning rules at all, just reading lots of text, let's say books. After reading many correctly structured sentences and words you develop a sense of what a correctly structured sentence or word feels like, without being conscious of the rules. In other words, in some cases you will be applying the rules without consciously thinking about them, thanks to the amazing ability of our brains to find patterns. This second approach is the data-driven approach or, as I prefer to call it, the karate-kid approach, because I suspect this is the path Mr. Miyagi would have chosen if he had to teach a pupil how to write properly.
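
For the programmers reading this, here is a toy sketch of the two approaches applied to spelling. It is only an illustration of mine and does not come from Norvig's article. The function names, the tiny English corpus and the choice of the "i before e, except after c" rule are all assumptions made for the example.

```python
from collections import Counter

# Approach 1: a hand-written rule ("i before e, except after c").
# Hypothetical helper, written only for this illustration.
def rule_based_ok(word):
    w = word.lower()
    if "cie" in w:  # after "c" the rule demands "ei", not "ie"
        return False
    for i in range(len(w) - 1):
        # "ei" is only accepted when it comes right after a "c"
        if w[i:i + 2] == "ei" and (i == 0 or w[i - 1] != "c"):
            return False
    return True

# Approach 2: no rules at all, just count the words seen in correct text.
def learn_vocabulary(corpus):
    return Counter(corpus.lower().split())

def data_driven_ok(word, vocabulary):
    # A word "feels right" if it has been seen at least once before.
    return vocabulary[word.lower()] > 0

# Made-up stand-in for "lots of correctly written text".
corpus = "I believe my friend will receive a weird piece of science"
vocabulary = learn_vocabulary(corpus)

for word in ["believe", "receive", "recieve", "weird", "science"]:
    print(word, rule_based_ok(word), data_driven_ok(word, vocabulary))
```

Both checkers reject the misspelling "recieve", but the rule also rejects "weird" and "science", which are correct words that happen to be exceptions. The data-driven checker accepts them simply because it has seen them, although it knows nothing about words that are missing from its reading material.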

Mr. Miyagi asking his pupil to wax cars over and over again.
The field of Artificial Intelligence used to follow the first approach: if you want a smart computer, hard-code rules into it so that it behaves as desired. Hard-coding rules doesn't scale very well, so you might want to learn the rules from data or adapt them over time. Ultimately, though, people realized that you might not need to care about rules at all, as long as all you care about is the system behaving as desired. This is the topic of the article by Peter Norvig, written in response to remarks by Noam Chomsky in which he apparently derided machine learning researchers. You can read it at the following link; I highly recommend it: