
Natural Language Processing in Python: Part 1



This is Part 1 of our series "Natural Language Processing". In this blog we will cover the main topics and libraries used to work with NLP in Python.


Before we start, let's go over some basic information about NLP.


What is NLP?


NLP is the branch of data science concerned with systematic processes for analyzing, understanding, and deriving information from text data in a smart and efficient manner.


First, install the libraries used in this series:


nltk, numpy, matplotlib (imported as matplotlib.pyplot), tweepy, TwitterSearch, unidecode, langdetect, langid, gensim


Then import all of them:


import nltk # https://www.nltk.org/install.html

import numpy # https://www.scipy.org/install.html

import matplotlib.pyplot # https://matplotlib.org/downloads.html

import tweepy # https://github.com/tweepy/tweepy

import TwitterSearch # https://github.com/ckoepp/TwitterSearch

import unidecode # https://pypi.python.org/pypi/Unidecode

import langdetect # https://pypi.python.org/pypi/langdetect

import langid # https://github.com/saffsd/langid.py

import gensim



List of topics we will cover in this series:


  • Text-analysis using NLTK library

  • N-Grams

  • Detecting text language

  • Language identifier

  • Stemming and Lemmatization using Bigrams

  • Finding unusual words

  • Part of speech and meaning

  • Name-Gender identifier

  • Classify document into categories

  • Sentiment Analysis

  • Sentiment Analysis with NLTK

  • Work with Twitter streaming and Cleaning

  • Language detection


Now let's start with the first topic:


Text-analysis using NLTK library



Run the examples in this section in a Jupyter notebook.
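
The code for this step is not reproduced in the text, so the following is only a minimal sketch, assuming the first step is to split a string into tokens. The sample sentence is made up for illustration. wordpunct_tokenize is used because it needs no extra data download; nltk.word_tokenize should also work once NLTK's tokenizer data is installed.

```python
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "NLTK makes text analysis simple. Text analysis starts with tokens."

# Split the string into word and punctuation tokens.
tokens = wordpunct_tokenize(sample)
print(tokens)
print(len(tokens))
```

Punctuation marks come out as separate tokens, so the token count is higher than the word count.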


Next, convert the tokens into NLTK's text format (an nltk.Text object):
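
A minimal sketch of this conversion, assuming "text format" means an nltk.Text object (the original code is not shown, so this is a guess based on the t.similar() call used later). The sample sentence is hypothetical.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "NLTK makes text analysis simple. Text analysis starts with tokens."
tokens = wordpunct_tokenize(sample)

# Wrap the token list so that analysis methods such as
# concordance(), similar() and count() become available.
t = nltk.Text(tokens)
print(len(t))
print(t.tokens[:4])
```

The variable name t matches the one used in the similar() call further down.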


Find a word in the text:


Use the concordance() method to find every occurrence of a word, shown together with its surrounding context.
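
As a rough sketch of how this might look, assuming t is an nltk.Text built from a made-up sample sentence: concordance() prints the matches, while concordance_list() returns them so they can be inspected in code.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "NLTK makes text analysis simple. Text analysis starts with tokens."
t = nltk.Text(wordpunct_tokenize(sample))

# Print every occurrence of the word with some context on each side.
t.concordance("analysis", width=60, lines=5)

# concordance_list() returns the same matches as objects instead of printing them.
hits = t.concordance_list("analysis")
print(len(hits))
```

The search ignores case, so "Analysis" and "analysis" would both be found.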



Reading a text file:


# 'rU' (universal newlines) was removed in Python 3.11; plain 'r' already handles all newline styles
with open('string.txt', 'r') as f:
    raw = f.read()


You can then analyze this file using nltk.
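
A small sketch of that workflow, under the assumption that the file holds plain text. To keep the example self-contained it first writes a made-up stand-in for string.txt into a temporary folder; with a real file, only the reading and tokenizing steps are needed.

```python
import os
import tempfile

import nltk
from nltk.tokenize import wordpunct_tokenize

# Create a small stand-in for string.txt so the example runs on its own.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "string.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("The cat sat on the mat.\nThe dog sat on the rug.\n")

    # Read the whole file back as one string.
    with open(path, "r", encoding="utf-8") as f:
        raw = f.read()

# Tokenize the contents and wrap them for analysis.
t = nltk.Text(wordpunct_tokenize(raw))
print(len(t))
print(t.count("sat"))
```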


Count a word in the text:


Find similar words:
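
One possible way to count words, sketched on a made-up sample sentence: Text.count() counts a single word, and nltk.FreqDist counts all tokens at once. The original code for this step is not shown, so the choice of these two calls is an assumption.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "NLTK makes text analysis simple. Text analysis starts with tokens."
tokens = wordpunct_tokenize(sample)
t = nltk.Text(tokens)

# count() is case-sensitive: "text" and "Text" are counted separately.
print(t.count("analysis"))
print(t.count("text"))

# FreqDist counts every token at once; lowercasing first merges "text" and "Text".
freq = nltk.FreqDist(token.lower() for token in tokens)
print(freq["text"])
```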


Use similar() for distributional similarity: it finds other words that appear in the same contexts as the specified word and lists the most similar words first.


t.similar('put string word here')  # t is an nltk.Text object
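
A tiny illustration of the idea, using a made-up text in which "cat" and "dog" both appear between "the" and "sat". On a text this small the result is trivial; on a real corpus the list is longer and more informative.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "the cat sat on the mat . the dog sat on the rug ."
t = nltk.Text(wordpunct_tokenize(sample))

# "cat" and "dog" share the context (the, sat), so similar() reports "dog".
t.similar("cat")
```

similar() prints its result rather than returning it, which suits notebook use.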


Plot where words occur in the text:


The dispersion_plot() method plots the positions at which each given word occurs, measured in tokens from the beginning of the text.
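
The sketch below shows what the plot is based on, using a made-up text. It first computes the token positions by hand, then wraps the actual dispersion_plot() call in a hypothetical helper, show_plot(), because drawing needs matplotlib and a notebook or display.

```python
import nltk
from nltk.tokenize import wordpunct_tokenize

# Made-up sample text, used only for illustration.
sample = "the cat sat on the mat . the dog sat on the rug ."
t = nltk.Text(wordpunct_tokenize(sample))
targets = ["cat", "dog"]

# Token positions of each target word: the x-values a dispersion plot marks.
offsets = {w: [i for i, tok in enumerate(t.tokens) if tok == w] for w in targets}
print(offsets)


def show_plot():
    """Draw the dispersion plot (needs matplotlib and a display or notebook)."""
    t.dispersion_plot(targets)
```

In a Jupyter notebook, calling show_plot() should render the chart inline.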




What are "corpora" in NLP?


Corpora (singular: corpus) are large collections of text. NLTK gives access to many of them through its nltk.corpus module.


Thanks for reading this blog. In the next part we will cover N-Grams.
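
NLTK's built-in corpora (for example nltk.corpus.gutenberg) have to be downloaded first with nltk.download(), so the sketch below instead builds a tiny corpus from two made-up local files using PlaintextCorpusReader. The file names and contents are hypothetical.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

with tempfile.TemporaryDirectory() as root:
    # Two tiny made-up documents stand in for a real corpus.
    docs = {"a.txt": "The cat sat on the mat.", "b.txt": "The dog sat on the rug."}
    for name, text in docs.items():
        with open(os.path.join(root, name), "w", encoding="utf-8") as f:
            f.write(text)

    # Treat every .txt file in the folder as one document of the corpus.
    corpus = PlaintextCorpusReader(root, r".*\.txt")
    file_ids = corpus.fileids()
    words = list(corpus.words("a.txt"))

print(file_ids)
print(words)
```

The built-in corpora expose the same fileids() and words() methods once their data is installed.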


