machine-learning-course's Introduction

Homework for Introduction to Machine Learning

Homework 1: Naive Bayes

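Code structure

The code consists of:

  • ./main.py: the main program, which runs the experiment
  • ./data.py: loads and preprocesses the data
    • method chinese_email_data_set(path): loads the data set used in this assignment
    • class Dictionary: collects the words of the corpus and maps each word string to an index
    • method shuffle_data(full_data, split): randomly shuffles the data set and splits it into training/test sets according to the given ratio
  • ./model.py: the model implementation
    • class NaiveBayes: the model for this assignment. The constructor takes the training set and the dictionary and estimates the parameters immediately; query(data) returns the prediction (the probability that the input is spam)
  • ./evaluation.py: evaluates the results
    • method binary_classifier(model, test): for a binary classification problem, returns accuracy, precision, recall, and F1 score. model must provide a query(data) method. test is the test set: test[i][0] is the label of the i-th example and must be 0 or 1, and test[i][1] is the feature of the i-th example, in the same format as the data passed to the model's query(data) method.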
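
The model and evaluation contract described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes a multinomial Naive Bayes with add-one (Laplace) smoothing, takes a hypothetical vocab_size argument in place of the Dictionary object, and assumes a decision threshold of 0.5 on the spam probability. Only the names NaiveBayes, query, and binary_classifier and the test[i][0] / test[i][1] layout come from the description above.

```python
import math
from collections import Counter


class NaiveBayes:
    """Sketch of a multinomial Naive Bayes spam model (assumed design)."""

    def __init__(self, train, vocab_size):
        # train[i] = (label, word_indices); label 1 = spam, 0 = non-spam.
        # vocab_size is a hypothetical stand-in for the Dictionary object.
        self.vocab_size = vocab_size
        self.doc_counts = Counter(label for label, _ in train)
        self.word_counts = {0: Counter(), 1: Counter()}
        for label, words in train:
            self.word_counts[label].update(words)
        self.totals = {c: sum(self.word_counts[c].values()) for c in (0, 1)}

    def _log_joint(self, words, c):
        # log P(c) + sum over words of log P(w | c), with add-one smoothing.
        n_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[c] + 1) / (n_docs + 2))
        denom = self.totals[c] + self.vocab_size
        for w in words:
            score += math.log((self.word_counts[c][w] + 1) / denom)
        return score

    def query(self, words):
        # Return P(spam | words); subtract the max log score for stability.
        ham = self._log_joint(words, 0)
        spam = self._log_joint(words, 1)
        m = max(ham, spam)
        e_ham, e_spam = math.exp(ham - m), math.exp(spam - m)
        return e_spam / (e_ham + e_spam)


def binary_classifier(model, test, threshold=0.5):
    # test[i][0] is the 0/1 label, test[i][1] is the feature for query().
    tp = fp = tn = fn = 0
    for label, feature in test:
        pred = 1 if model.query(feature) >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(test) if test else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1
```

Because binary_classifier only calls model.query(feature), any model exposing that method can be evaluated the same way.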

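Running

Place the files from data/data_cut/ in the folder ./trec06c-utf8/

numpy must be installed.

Run python main.py to perform the experiment: it splits the data 8:2, trains on the entire training set, repeats the run 5 times, and reports the evaluation metrics.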
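
The split-and-repeat procedure can be sketched as below. This is a sketch under stated assumptions rather than the contents of main.py: it assumes split is the fraction of examples that goes to the training set (0.8 for an 8:2 split), and the seed parameter, the run_experiments helper, and the evaluate callback are hypothetical names introduced for illustration.

```python
import random


def shuffle_data(full_data, split, seed=None):
    # Shuffle a copy of the data, then cut it into (train, test).
    # Assumption: split is the training fraction, e.g. 0.8 for 8:2.
    rng = random.Random(seed)
    data = list(full_data)
    rng.shuffle(data)
    cut = int(len(data) * split)
    return data[:cut], data[cut:]


def run_experiments(full_data, evaluate, runs=5, split=0.8):
    # Hypothetical driver: reshuffle, train, and evaluate once per run.
    # evaluate(train, test) is expected to return that run's metrics.
    results = []
    for run in range(runs):
        train, test = shuffle_data(full_data, split, seed=run)
        results.append(evaluate(train, test))
    return results
```

Using a different seed per run gives a different shuffle each time while keeping the whole experiment reproducible.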
