This is a simple TF-IDF algorithm that uses the open-source Python package "jieba" to cut Chinese character strings into individual words, then uses TfidfTransformer from sklearn to calculate the TF-IDF value for each word in each sentence. please use
chelseayang / a-simple-tf-idf-algorithm-handle-chinese-text