Over the last two months I attended a “Natural Language Processing” course taught by the linguist developers at our company.
During the course we learned the basics of Natural Language Processing. Here is a short list of the topics that were covered:
- Sentiment Analysis
- Part-of-speech Tagging
- Text canonicalization (lemmatization, stemming)
- Syntactic structure analysis
- Measuring string similarity
- Basic tasks of Machine Learning
- Support Vector Machines (SVM)
- Basics of Artificial Neural Networks
The final assignment of the course was sentiment analysis of tweets. Given an existing corpus of tweets with known polarity, we were asked to classify the tweets in a test corpus as positive, negative, or neutral. In my solution I used an SVC (support vector classifier) model with an extended feature set containing the semantic orientation of the words in each tweet, a bag of words, hashtags, emoticons, etc. As a result, I reached an F1 score of about 0.645, which according to our teachers is a pretty good result.
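To give a rough idea of what such a feature set can look like, here is a minimal sketch of the feature extraction step in plain Python. It is not my actual course code: the emoticon lists, the `extract_features` name, and the feature naming scheme are all made up for illustration, and the semantic orientation feature is left out because it needs a sentiment lexicon.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny emoticon lists (illustration only).
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}

# Match hashtags first, then simple emoticons, then plain words.
TOKEN_RE = re.compile(r"#\w+|[:;][-']?[()D]|\w+")


def extract_features(tweet):
    """Turn one tweet into a dict of feature name -> count."""
    features = Counter()
    for token in TOKEN_RE.findall(tweet):
        if token in POSITIVE_EMOTICONS:
            features["emoticon_positive"] += 1
        elif token in NEGATIVE_EMOTICONS:
            features["emoticon_negative"] += 1
        elif token.startswith("#"):
            # Hashtags get their own feature namespace.
            features["hashtag=" + token[1:].lower()] += 1
        else:
            # Bag of words: lowercase word counts.
            features["word=" + token.lower()] += 1
    return dict(features)


print(extract_features("I love #NLP :) :)"))
```

Dictionaries like these could then be turned into numeric vectors (for example with scikit-learn's `DictVectorizer`, assuming that library is used) and passed to the classifier for training.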
Here is our group:
And here is the certificate of completion that I received: