NLP_DisasterResponse

Installation

The main libraries needed to run the code are the following:

Python 3.6.7 (Anaconda distribution)
Scikit-Learn for machine learning algorithms
NumPy for vectorized numerical calculations
Pandas for data manipulation
NLTK for Natural Language Processing
SQLAlchemy for interacting with databases

Project Motivation

For this project, I designed an ETL pipeline and a machine learning pipeline that uses Natural Language Processing (NLP) to analyze disaster data from FigureEight and build a model for an API that classifies disaster messages.

The project includes a web app where an emergency worker can input a new message and get classification results in several categories. The web app will also display visualizations of the data.

Project Components

There are three main components for this project:

1. ETL Pipeline

The Python script, process_data.py, contains a data cleaning pipeline that:

Loads the messages and categories datasets
Merges the two datasets
Cleans the data
Stores it in a SQLite database
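
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of process_data.py: the column names (id, message, categories), the packed "name-0/1" category format, the table name, and the helper functions are all assumptions made for the example. It also uses the standard-library sqlite3 connection with an in-memory database in place of SQLAlchemy, purely to keep the sketch self-contained.

```python
import sqlite3

import pandas as pd


def load_and_merge(messages, categories):
    """Merge the two datasets on a shared id column (assumed name)."""
    return messages.merge(categories, on="id")


def clean(df):
    """Expand the packed category string into 0/1 columns and drop duplicates."""
    # Assumed format: "related-1;water-0" becomes two integer columns.
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str.split("-").str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    return df.drop_duplicates()


def save(df, conn, table="messages"):
    """Write the cleaned frame to a SQLite table (hypothetical table name)."""
    df.to_sql(table, conn, index=False, if_exists="replace")


# Tiny made-up inputs standing in for the two CSV files.
messages = pd.DataFrame(
    {"id": [1, 2], "message": ["We need water", "Road is blocked"]}
)
categories = pd.DataFrame(
    {
        "id": [1, 2, 2],  # id 2 is duplicated on purpose
        "categories": ["related-1;water-1", "related-1;water-0", "related-1;water-0"],
    }
)

conn = sqlite3.connect(":memory:")
save(clean(load_and_merge(messages, categories)), conn)
stored = pd.read_sql("SELECT * FROM messages", conn)
print(stored)
```

In the real script the inputs would come from the two CSV paths and the output would go to the database file passed on the command line.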

2. ML Pipeline

In the Python script, train_classifier.py, there is a machine learning pipeline that:

Loads data from the SQLite database
Splits the dataset into training and test sets
Builds a text processing and machine learning pipeline
Trains and tunes a model using GridSearchCV
Outputs results on the test set
Exports the final model as a pickle file
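
A rough sketch of those steps with Scikit-Learn is shown below. It is not the code in train_classifier.py: the choice of TfidfVectorizer and a random forest wrapped in MultiOutputClassifier, the parameter grid, and the toy messages and labels are assumptions for illustration, and NLTK tokenization is left out so the example runs without downloading corpora.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Made-up messages; each row of Y holds two hypothetical categories: [water, shelter].
texts = [
    "we need water", "no drinking water here", "water is running out",
    "please send water", "we need shelter", "our house collapsed",
    "looking for shelter tonight", "tents needed for families",
    "water and shelter needed", "thank you for the help",
    "the weather is fine", "roads are open again",
]
Y = np.array([
    [1, 0], [1, 0], [1, 0], [1, 0], [0, 1], [0, 1],
    [0, 1], [0, 1], [1, 1], [0, 0], [0, 0], [0, 0],
])

# Split into training and test sets.
X_train, X_test, Y_train, Y_test = train_test_split(
    texts, Y, test_size=0.25, random_state=0
)

# Text processing followed by one classifier per output category.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
])

# Tune a single hyperparameter with a small grid.
param_grid = {"clf__estimator__n_estimators": [10, 20]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(X_train, Y_train)

# Report per-category accuracy on the held-out test set.
Y_pred = search.predict(X_test)
print("per-category accuracy:", (Y_pred == Y_test).mean(axis=0))

# Export the tuned model; here to bytes, in practice to a .pkl file.
model_bytes = pickle.dumps(search.best_estimator_)
restored = pickle.loads(model_bytes)
```

The double-underscore names in the grid (clf__estimator__n_estimators) reach through the pipeline step and the multi-output wrapper to the underlying estimator.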

3. Flask Web App

A Flask web app that shows data visualizations and lets a user classify disaster messages with the trained model.
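
To illustrate how a message might travel from the browser to the model, here is a minimal Flask sketch. The /go route, the query parameter, the category names, and the KeywordModel stand-in are all hypothetical; the real app would load the pickled classifier instead and render an HTML page rather than return JSON.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical category names; the real ones come from the database columns.
CATEGORY_NAMES = ["related", "water"]


class KeywordModel:
    """Stand-in for the trained classifier, exposing the same predict() shape."""

    def predict(self, messages):
        return [[1, int("water" in m.lower())] for m in messages]


model = KeywordModel()


@app.route("/go")
def go():
    """Classify the message passed as ?query=... and return one label per category."""
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    return jsonify(dict(zip(CATEGORY_NAMES, labels)))


# Exercise the route without starting a server.
client = app.test_client()
response = client.get("/go", query_string={"query": "We need water"})
print(response.get_json())
```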

Instructions

Run the following commands in the project's root directory to set up the database and model.

To run the ETL pipeline, which cleans the data and stores it in the database:

python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db

To run the ML pipeline, which trains the classifier and saves it:

python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
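
Both commands pass file paths as positional arguments. One way a script could read the three ETL arguments is sketched below with argparse; the argument names and the parse_etl_args helper are assumptions for illustration, and the actual scripts may read their arguments differently.

```python
import argparse


def parse_etl_args(argv):
    """Parse the three positional paths given to the ETL command."""
    parser = argparse.ArgumentParser(description="Clean data and store it in SQLite.")
    parser.add_argument("messages_filepath")    # hypothetical argument name
    parser.add_argument("categories_filepath")  # hypothetical argument name
    parser.add_argument("database_filepath")    # hypothetical argument name
    return parser.parse_args(argv)


# Same paths as in the ETL command above.
args = parse_etl_args([
    "data/disaster_messages.csv",
    "data/disaster_categories.csv",
    "data/DisasterResponse.db",
])
print(args.database_filepath)
```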

Run the following command in the app's directory to start the web app:

python run.py

Then go to http://0.0.0.0:3001/ (or to your own web app link).

Licensing, Authors, Acknowledgements

Credit goes to FigureEight for the data and the project idea. Author: Gustavo Cedeno, following the recommendations and requirements of Udacity's Data Science Nanodegree Program.
