Natural Language Processing vs. Machine Learning vs. Deep Learning
NLP, Machine Learning and Deep Learning are all parts of Artificial Intelligence, which is itself a part of the greater field of Computer Science. The following image illustrates CS, AI and some of the components of AI:
Robotics (AI for motion)
Vision (AI for visual space - videos, images)
NLP (AI for text)
There are other aspects ...
Online Word2Vec for Gensim
Word2Vec [1] is a technique for creating vector representations of words that capture their syntax and semantics. The vectors used to represent the words have several interesting features. Here are a few:
Addition and subtraction of vectors show how word semantics are captured:
e.g. \(king - man + woman = queen\)
This example capt...
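The analogy above can be sketched with a few hand-made vectors. This is only a toy illustration: the three-dimensional values below are hypothetical and chosen by hand, not learned by Word2Vec, and the helper names `cosine` and `analogy` are made up for this example.

```python
import numpy as np

# Hypothetical toy vectors, NOT trained embeddings.
# The dimensions loosely stand for: royalty, maleness, femaleness.
vectors = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "apple": np.array([0.1, 0.1, 0.1]),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy("king", "man", "woman", vectors)
print(result)
```

With real embeddings the result vector rarely matches a word exactly, which is why the nearest neighbour by cosine similarity is used instead of an equality check.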
Understanding your Data - Basic Statistics
Have you ever had to deal with a lot of data and not known where to start? If so, this post is for you. I will try to guide you through some basic approaches and operations you can use to analyze your data, make some basic sense of it, and decide on your approach for deeper analysis. I will use Python and a small s...
An Introduction to Probability
This post is an introduction to probability theory. Probability theory is the backbone of AI, and this post attempts to cover its fundamentals and bring us to Naive Bayes, a simple generative algorithm for text classification.
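As a rough preview of where the fundamentals lead, the standard Naive Bayes decision rule can be written as follows. The symbols are assumptions for this sketch and may differ from the notation used in the post: $c$ is a class, $d$ is a document, and $w_1, \dots, w_n$ are the words of $d$.

$$
P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)} \;\propto\; P(c) \prod_{i=1}^{n} P(w_i \mid c)
$$

The first equality is Bayes' theorem. The product form relies on the "naive" assumption that words are conditionally independent given the class, and the classifier picks the class $c$ that maximizes the right-hand side.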
Random Variables
In this world things keep happening around us. Each event occurring is...
Build your own search Engine
In this post, I will take you through the steps for calculating the $tf \times idf$ values for all the words in a given document. To implement this, we use a small dataset (or corpus, as NLPers like to call it) from the Project Gutenberg Catalog. This is just a simple toy example on a very small dataset. In real life we use much larger corpora, ...
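A minimal sketch of the $tf \times idf$ calculation, under some assumptions: the three-sentence corpus below is made up for illustration (it is not the Project Gutenberg data), $tf$ is taken as a term's count divided by the document length, and $idf$ as $\log(N / df)$. Other variants of both terms are common, and the post may use a different one.

```python
import math
from collections import Counter

# Hypothetical toy corpus, used only for illustration.
corpus = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog sat on the log",
    "doc3": "cats and dogs",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf(corpus):
    """Return {doc_name: {term: tf * idf}} for every term in every document."""
    tokenized = {name: tokenize(text) for name, text in corpus.items()}
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        scores[name] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


scores = tf_idf(corpus)
for term, score in sorted(scores["doc1"].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.4f}")
```

In this toy corpus, "cat" outranks "the" within doc1 even though "the" occurs twice, because "the" also appears in another document and so has a lower $idf$.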
The Math behind Lucene
Lucene is an open-source search engine that you can run on top of custom data to create your own search engine, like your own personal Google. In this post, we will go over the basic math behind Lucene and how it ranks documents against the input search query.
THE BASICS - TF*IDF
The analysis of language often brings us into situations where we a...