Dr. Rutu Mulkar-Mehta

Natural Language Processing vs. Machine Learning vs. Deep Learning

data-science NLProc machine-learning deep-learning recurrent-neural-networks

What is Natural Language Processing?

Natural Language Processing (or NLP) is an area that is a confluence of Artificial Intelligence and linguistics. It involves intelligent analysis of written language. If you have a lot of data written in plain text and you want to automatically get some insights from it, you need to use NLP techniques. These insights could be - sentiment analysis, information extraction, information retrieval, search etc. to name a few.

Machine Learning

Machine Learning (or ML) is an area of Artificial Intelligence (AI) that is a set of statistical techniques for problem solving. These techniques can be applied to a wide variety of problems which are not limited to - vision based research, fraud detection, price prediction, and even NLP. In order to apply ML techniques to NLP problems, we need to usually convert the unstructured text into a structured format.

Deep Learning

Deep Learning (which includes Recurrent Neural Networks, Convolution neural Networks and others) is a type of Machine Learning approach. It is an extension of Neural Networks. Deep Learning is used quite extensively for vision based classification (e.g. distinguishing images of airplanes from images of dogs). Deep Learning can be used for NLP tasks as well. However it is important to note that Deep Learning algorithms do not exclusively deal with text.

Relationship between NLP, ML and Deep Learning

The image below shows graphically how NLP is related ML and Deep Learning. Deep Learning is one of the techniques in the area of Machine Learning - there are several other techniques such as Regression, K-Means, and so on.

Relationship between NLP, ML and Deep Learning

ML and NLP have some overlap, as Machine Learning is often used for NLP tasks. LDA (Latent Dirichlet Allocation which is a Topic Modeling Algorithm) is one such example of unsupervised machine learning.

However, NLP has a strong linguistics component (not represented in the image), that requres an understanding of how we use language. The art of understanding language involves understanding humor, sarcasm, subconscious bias in text, etc. Once we can understand that is means to to be sarcastic (yeah right!) we can encode it into a machine learning algorithm to automatically discover similar patterns for us statistically.

To summarize, in order to do any NLP, you need to understand language. Language is different for different genres (research papers, blogs, twitter have different writing styles), so there is a strong component of looking at your data manually to get a fell of what it is trying to say to you, and how you- as a human would analyze it. Once you figure out what you are doing as a human reasoning system (ignoring hash tags, using smiley faces to imply sentiment), you can use a relevant ML approach to automate that process and scale it.