
engelsjo/DocumentClassifier


DocumentClassifier (CIS 678 - Project 2)

Authors

Michael Baldwin, Joshua Engelsma, Adam Terwilliger

Objective

Document classification using the Naïve Bayes algorithm

Specification

The goal is to write a program that, given training data consisting of category-labeled documents, “learns” to classify new documents into the correct category using a Naïve Bayes classifier.

Background

The Naïve Bayes algorithm uses probabilities to perform classification. The probabilities are estimated from training data for which the classification is known (i.e., it is another form of supervised learning). The algorithm is called “naïve” because it makes the simplifying assumption that attribute values are conditionally independent of one another, given the classification.
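
To make the idea concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier. It is not taken from this project's source. The function names (`train`, `classify`), the whitespace tokenizer, the add-one (Laplace) smoothing, and the toy training data are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Estimate log priors and smoothed log likelihoods from (category, text) pairs."""
    doc_counts = Counter()               # number of documents per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocab = set()
    for category, text in labeled_docs:
        words = text.lower().split()     # assumption: naive whitespace tokenizer
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)

    total_docs = sum(doc_counts.values())
    model = {}
    for category, n_docs in doc_counts.items():
        total_words = sum(word_counts[category].values())
        denom = total_words + len(vocab)  # add-one (Laplace) smoothing
        model[category] = {
            "log_prior": math.log(n_docs / total_docs),
            "log_likelihood": {
                w: math.log((word_counts[category][w] + 1) / denom) for w in vocab
            },
        }
    return model, vocab


def classify(model, vocab, text):
    """Return the category with the highest log posterior score."""
    # Words never seen in training are skipped (an assumption of this sketch).
    words = [w for w in text.lower().split() if w in vocab]

    def score(category):
        params = model[category]
        # Independence assumption: per-word log likelihoods are simply summed.
        return params["log_prior"] + sum(params["log_likelihood"][w] for w in words)

    return max(model, key=score)


# Hypothetical toy training set, for illustration only.
training_data = [
    ("sports", "the team won the game"),
    ("sports", "a great game and a great score"),
    ("politics", "the senate passed the bill"),
    ("politics", "the election vote was close"),
]
model, vocab = train(training_data)
print(classify(model, vocab, "the team played a close game"))  # prints "sports"
```

Summing log probabilities instead of multiplying raw probabilities avoids numeric underflow on long documents, and the smoothing keeps a word that never appeared in a category from forcing that category's score to zero.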

TO-DO

Required

  • Design Doc
  • Visualization of validation results

Project Info

About

A tool that accepts a document as input and outputs the category that document belongs to.
