Text-Classification-Python

This repository covers a range of text classification problems using different machine learning algorithms.

1. Installation & Requirements:

A general installation guide for running the projects is provided here. If an error occurs because of a missing library, read the error message and install the library it names.

Clone the repository if you have git installed.

Git Installation Guide:

Windows - https://git-scm.com/download/win
Linux - https://git-scm.com/download/linux

Then run these commands in the Command Prompt or Terminal.

git clone https://github.com/Yunus0or1/Text-Classification-Python.git
cd Text-Classification-Python

OR

Download from the link: https://github.com/Yunus0or1/Text-Classification-Python/archive/master.zip

Then run these commands in the Command Prompt or Terminal.

cd Text-Classification-Python

pip install -r requirements.txt

There are known issues with installing TensorFlow. To check version compatibility and other details, please follow this link to the TensorFlow installation guide.

Run

These are all Python files.

Install Python 3 or Python 2.7.
Open the Command Prompt or Terminal.
Go to the project directory and run the command below:
python3 <filename.py>

2. Usage:

i. Run the Python file from its project directory.

Or type this command in the Terminal or Command Prompt:

python _filename_

Source code explanations

An Embedding layer is essential when using a neural network for text classification. To understand why, see this Medium article.

A Convolutional Neural Network for text classification.
Layers: Embedding → Conv1D → MaxPooling1D → Conv1D → MaxPooling1D → LSTM → Dense
Softmax activation is used to produce a normalized probability distribution over multiple classes.
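
To make the softmax step concrete, here is a minimal pure-Python sketch of the function. It is only an illustration of the math; the projects themselves presumably rely on the framework's built-in activation, and the function name here is hypothetical.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# The largest score receives the largest probability.
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

Because every output is positive and the outputs sum to 1, the result can be read as a probability for each class.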

NER-Python

An NER system that uses prepositions to extract locations from social media posts.
Uses the NLTK library to get part-of-speech (POS) tags and identifies place names in three steps.
All the POS tags, along with a video tutorial, can be found at this link.
The program analyses the given string and looks up different prepositions to find a valid location name.
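
The preposition idea can be sketched without NLTK as follows. This is a simplified stand-in, not the repository's implementation: it treats capitalised words as a rough substitute for proper-noun POS tags, and the preposition list, function name, and sample post are all assumptions made for illustration.

```python
import re

# Hypothetical preposition list; the project's actual list may differ.
LOCATION_PREPOSITIONS = {"at", "in", "near", "from", "to", "towards"}

def extract_locations(text):
    """Collect runs of capitalised words that directly follow a preposition."""
    tokens = re.findall(r"[A-Za-z0-9'-]+", text)
    locations = []
    i = 0
    while i < len(tokens):
        if tokens[i].lower() in LOCATION_PREPOSITIONS:
            j = i + 1
            name = []
            # Capitalisation stands in for the proper-noun tag NLTK would give.
            while j < len(tokens) and tokens[j][0].isupper():
                name.append(tokens[j])
                j += 1
            if name:
                locations.append(" ".join(name))
            i = j
        else:
            i += 1
    return locations

post = "Huge traffic jam in Farmgate, moving slowly towards Kawran Bazar"
print(extract_locations(post))  # ['Farmgate', 'Kawran Bazar']
```

A real POS tagger is more robust than the capitalisation check, since social media posts are often written in lowercase.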

Neural_Network_Classification

neural_network_conv.py contains source code for a Convolutional Neural Network for text classification. Layers: Embedding → Conv1D → MaxPooling1D → Conv1D → MaxPooling1D → LSTM → Dense.
neural_network_dense.py contains source code for a very simple neural network for text classification. Layers: Dense (256 neurons) → Dense (10 neurons). Very fast at text classification. No batching.
neural_network_lstm.py contains source code for an LSTM neural network for text classification. Layers: Embedding → Dense → LSTM → Dense.
Softmax activation is used in the last layer to produce a normalized probability distribution over multiple classes.
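
As a rough illustration of the first layer stack listed above, here is a minimal Keras sketch. It assumes TensorFlow 2.x is installed. The vocabulary size, sequence length, filter counts, unit counts, and number of classes are placeholder assumptions, not values taken from neural_network_conv.py, and the function name is hypothetical.

```python
from tensorflow.keras import layers, models

def build_conv_lstm_classifier(vocab_size=5000, embed_dim=64,
                               seq_len=100, num_classes=5):
    """Embedding -> Conv1D -> MaxPooling1D -> Conv1D -> MaxPooling1D -> LSTM -> Dense."""
    model = models.Sequential([
        layers.Input(shape=(seq_len,)),           # padded sequences of word indices
        layers.Embedding(vocab_size, embed_dim),  # word index -> dense vector
        layers.Conv1D(64, 5, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(64, 5, activation="relu"),
        layers.MaxPooling1D(2),
        layers.LSTM(64),                          # summarises the pooled feature sequence
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_conv_lstm_classifier()
model.summary()
```

The sparse categorical cross-entropy loss is chosen here on the assumption that labels are stored as integers; one-hot labels would call for `categorical_crossentropy` instead.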

Data Labels

1- Traffic Jam
2- No Traffic Jam
3- Road Condition
6- Accident
7- Fire
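
A prediction can be mapped back to one of these labels roughly as shown below. The sketch assumes the index of each output neuron equals the label ID, which would be consistent with the 10-neuron output layer mentioned above but is not confirmed by this README. The function name and the sample probabilities are hypothetical.

```python
# Label IDs and names as listed in this README.
LABELS = {
    1: "Traffic Jam",
    2: "No Traffic Jam",
    3: "Road Condition",
    6: "Accident",
    7: "Fire",
}

def decode_prediction(probabilities):
    """Return the label name for the most probable class index."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    # Indices without a listed label fall back to "Unknown".
    return LABELS.get(best_index, "Unknown")

# Hypothetical softmax output over 10 classes, peaking at index 6.
sample = [0.01, 0.05, 0.02, 0.10, 0.01, 0.01, 0.70, 0.08, 0.01, 0.01]
print(decode_prediction(sample))  # Accident
```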

Neural_Translator

A neural network to translate phonetic Bangla to Bangla.
For a simple neural translator the layers are: GRU → TimeDistributed → Dropout → TimeDistributed.
For a complex neural translator the layers are: Bidirectional → TimeDistributed → Dropout → TimeDistributed.
No one-hot encoding.
However, performance is very poor because of the lack of translation data: only 400 samples are available.

Road-Condition-Analysis

This is a research-based project. The research paper has been submitted to IEEE ICCIT 2020.
The research analyses road conditions in Dhaka city from social media posts.
machine_classification.py contains source code for road condition analysis using different machine learning algorithms such as MultinomialNB, LogisticRegression, and KNeighborsClassifier.
nueraul_classification.py contains source code for road condition analysis using a neural network. The procedure is similar to the one used in Neural_Network_Classification.

Wrong_Word_Correction

This is a research-based project that has been published. See this journal for details on the project.
wg.py contains source code that generates about 80 wrong words from a single correct word.
ml.py contains source code that classifies wrong words using different machine learning algorithms such as MultinomialNB, LogisticRegression, KNeighborsClassifier, RandomForestClassifier, etc.
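
To give a feel for how wrong words can be generated from one correct word, here is a small sketch. The three edit operations used (deleting, doubling, and swapping characters) are assumptions chosen for illustration. This README does not describe the rules wg.py actually applies, and this sketch produces far fewer than 80 variants for a short word.

```python
def generate_wrong_words(word):
    """Produce misspelled variants of a word using simple character edits."""
    variants = set()
    for i in range(len(word)):
        variants.add(word[:i] + word[i + 1:])       # delete one character
        variants.add(word[:i] + word[i] + word[i:])  # double one character
    for i in range(len(word) - 1):
        # Swap two adjacent characters.
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    variants.discard(word)  # a swap of identical letters recreates the word
    variants.discard("")    # deleting from a one-letter word leaves nothing
    return sorted(variants)

print(generate_wrong_words("road"))
```

Each generated variant can then be paired with the correct word as a training example for a classifier.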

Note that ml.py prompts for choices when it runs. Their meanings are:

WBT = Word Based Tokenization
CBT = Character Based Tokenization
ACB = Advanced Character Based Tokenization
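
The difference between the first two options can be sketched as follows. This is a generic illustration with hypothetical function names, not the code in ml.py. ACB is left out because this README does not define how it differs from CBT.

```python
def word_tokens(text):
    """Word Based Tokenization: split the text on whitespace."""
    return text.lower().split()

def char_tokens(text):
    """Character Based Tokenization: one token per non-space character."""
    return [ch for ch in text.lower() if not ch.isspace()]

print(word_tokens("Traffic Jam"))  # ['traffic', 'jam']
print(char_tokens("Jam now"))      # ['j', 'a', 'm', 'n', 'o', 'w']
```

Character-level tokens tend to suit misspelling tasks, since a wrong word usually differs from the correct one by only a few characters.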

NON Saved Model processing = Training from scratch, evaluating, and predicting a new wrong word
Saved Model processing = Loading pre-trained model weights and predicting a new wrong word
