A web app that reads documents of any kind, guesses what their contents are about, and builds an index of the documents. It can then predict the category of other documents.
Run this project:
on your local machine
- clone the repository to your computer
- open "pom.xml" as a project in an IDE (e.g. IntelliJ)
- import dependencies (right click on "pom.xml")
- run src\main\java\org\utpe\freeopenuniversity\intelligentdocumentclassifier\IntelligentdocumentclassifierApplication.java
- visit "http://localhost:8180/"
on a server
- java -jar intelligentdocumentclassifier-0.0.1-SNAPSHOT.jar &
- visit http://66.76.242.195:8180/
The application reads files, guesses what their contents are about, and creates an index of documents.
(1) Send text content to the server, and the classifier classifies the document automatically.
(2) The classifier can continue to be updated.
- Stanford CoreNLP https://nlp.stanford.edu/wiki/Software/Classifier
- Java
- Java Spring Boot
⚫ The core part is based on the ColumnDataClassifier API of Stanford NLP. It can train on a data set and predict the category of new files.
https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/classify/ColumnDataClassifier.html
⚫ Data folder
- intelligentdocumentclassifier\data: the folder that stores data.
- intelligentdocumentclassifier\data\train: the folder that stores training data. The app scans this folder and generates the target file for the classifier. This folder may contain sub folders, each named after a category.
- intelligentdocumentclassifier\data\permanent: the folder in which the target file for the classifier is generated.
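
The scan of the training folder could look roughly like the sketch below. It is not the project's actual code. It assumes that each sub folder of data\train is a category, that every file inside it is one training document, and that the target file holds one tab-separated line per document (category in the first column, text in the second). The class name TrainingFileBuilder, its method names, and the output file name train.tsv are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: turns data/train/<category>/<file> into a column-format training file.
public class TrainingFileBuilder {

    // Builds one line: category in column 0, document text in column 1.
    // Tabs and line breaks inside the text are collapsed so each document stays on one line.
    public static String toColumnLine(String category, String text) {
        String flat = text.replaceAll("\\s+", " ").trim();
        return category + "\t" + flat;
    }

    // Scans every category sub folder of trainDir and returns one line per document.
    public static List<String> buildLines(Path trainDir) {
        List<String> lines = new ArrayList<>();
        try (Stream<Path> categories = Files.list(trainDir)) {
            List<Path> categoryDirs =
                    categories.filter(Files::isDirectory).sorted().collect(Collectors.toList());
            for (Path categoryDir : categoryDirs) {
                String category = categoryDir.getFileName().toString();
                try (Stream<Path> docs = Files.list(categoryDir)) {
                    List<Path> files =
                            docs.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
                    for (Path doc : files) {
                        String text = new String(Files.readAllBytes(doc), StandardCharsets.UTF_8);
                        lines.add(toColumnLine(category, text));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the program is started from the project root.
        Path trainDir = Paths.get("data", "train");
        if (!Files.isDirectory(trainDir)) {
            System.out.println("No training folder found at " + trainDir.toAbsolutePath());
            return;
        }
        // Hypothetical target file name inside data/permanent.
        Path target = Paths.get("data", "permanent", "train.tsv");
        Files.createDirectories(target.getParent());
        Files.write(target, buildLines(trainDir), StandardCharsets.UTF_8);
        System.out.println("Wrote " + target.toAbsolutePath());
    }
}
```

With a layout such as data\train\sports\a.txt and data\train\politics\b.txt, this sketch would write two lines, one starting with "politics" and one starting with "sports", each followed by a tab and the flattened text of the file.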