This is a git repo dedicated to tracking my assignment submissions for the Spring 2017 course Machine Learning and Data Mining. A total of seven assignments will be completed this semester, each meant as an introduction to various machine learning algorithms. Below are brief descriptions of the programs, taken from the instruction sheets. For detailed instructions, please navigate to the repository of a given assignment.
In this problem set you will implement the Perceptron algorithm and apply it to the problem of e-mail spam classification.
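As a rough illustration of the Perceptron update rule (not the submitted solution), here is a minimal sketch. It assumes each e-mail has already been turned into a numeric feature vector and that labels are +1 (spam) and -1 (not spam). The function names `train_perceptron` and `predict` are hypothetical.

```python
def train_perceptron(examples, labels, max_epochs=100):
    """Train a perceptron. Labels must be +1 or -1 (assumed encoding)."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(examples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            # A non-positive y * activation means the example is misclassified.
            if y * activation <= 0:
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            # A full pass with no mistakes: the data has been separated.
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 for a single feature vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1
```

The loop stops early once an epoch produces no mistakes, which only happens when the training data is linearly separable; otherwise it runs for `max_epochs` passes.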
Suppose you are selling your house and you want to know what a good market price would be. One way to do this is to first collect information on recent houses sold and make a model of housing prices. The file housing.txt contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house. (...)
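One possible way to model the prices is linear regression fitted by batch gradient descent; the excerpt above does not name the method, so treat this as an assumption. The sketch takes rows of (size, bedrooms, price) that have already been read from housing.txt, and the function name `fit_linear_regression` and its parameters are hypothetical.

```python
def fit_linear_regression(rows, learning_rate=0.1, iterations=5000):
    """rows: list of (size_sqft, bedrooms, price). Returns a predict function."""
    n = len(rows)
    features = [(r[0], r[1]) for r in rows]
    prices = [r[2] for r in rows]

    # Standardize each feature so gradient descent converges at a sane rate.
    means = [sum(f[j] for f in features) / n for j in range(2)]
    stds = [(sum((f[j] - means[j]) ** 2 for f in features) / n) ** 0.5
            for j in range(2)]
    X = [[1.0] + [(f[j] - means[j]) / stds[j] for j in range(2)]
         for f in features]

    theta = [0.0, 0.0, 0.0]  # intercept, size weight, bedrooms weight
    for _ in range(iterations):
        errors = [sum(t * xi for t, xi in zip(theta, x)) - y
                  for x, y in zip(X, prices)]
        # Gradient of the mean squared error (up to a constant factor).
        grads = [sum(e * x[j] for e, x in zip(errors, X)) / n for j in range(3)]
        theta = [t - learning_rate * g for t, g in zip(theta, grads)]

    def predict(size_sqft, bedrooms):
        x = [1.0,
             (size_sqft - means[0]) / stds[0],
             (bedrooms - means[1]) / stds[1]]
        return sum(t * xi for t, xi in zip(theta, x))

    return predict
```

The returned `predict` closure reuses the training means and standard deviations, so new houses are scaled the same way as the training set.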
The MNIST dataset is a database of handwritten digits. This problem will apply SVMs to automatically classify digits; the US postal service uses a similar optical character recognition (OCR) of zip codes to automatically route letters to their destination. The original dataset can be downloaded at http://yann.lecun.com/exdb/mnist/. For this problem, we randomly chose a subset of the original dataset. We have provided you with two data files, mnist_train.txt, mnist_test.txt. (...)
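To give a feel for what training an SVM involves, here is a minimal sketch of a binary linear SVM trained by subgradient descent on the regularized hinge loss. This is an assumption for illustration: the excerpt does not say which solver or kernel the assignment uses, and classifying ten digits would need an extension such as one classifier per digit. The name `train_linear_svm` and its parameters are hypothetical.

```python
def train_linear_svm(examples, labels, reg=0.01, learning_rate=0.1, epochs=200):
    """Binary linear SVM via full-batch subgradient descent. Labels are +1 / -1."""
    n = len(examples)
    w = [0.0] * len(examples[0])
    b = 0.0
    for _ in range(epochs):
        # Start from the gradient of the L2 regularizer (reg / 2) * ||w||^2.
        grad_w = [reg * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(examples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Only examples inside the margin contribute to the hinge loss.
            if margin < 1:
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - learning_rate * g for wi, g in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def svm_predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

For the MNIST files, each row would first have to be split into a label and a pixel vector; that parsing is left out because the file layout is not described here.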
You are given a data set with 5000 handwritten digits and their corresponding labels. Each training example is a 20 pixel by 20 pixel grayscale image of the digit. Each pixel is represented by a number indicating the grayscale intensity at that location. Thus, your neural network will have 400 inputs. Your network will have 3 layers: an input layer with 400 inputs, an output layer with 10 outputs (corresponding to the ten digits), and a hidden layer with 25 units. You will add bias units at the first and second layers. Thus between the first and second layers there are 401*25 = 10,025 weights. Between the second and third layers there are 26*10 = 260 weights. The total number of weights is therefore 10,285. You are also provided with the set of weights to use for this assignment. (...)
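The weight arithmetic above can be checked with a few lines of code, and the same layer layout gives a simple forward pass. This is a sketch under assumptions: the sigmoid activation is not stated in the excerpt, and the names `count_weights` and `forward` are hypothetical. Each weight matrix is stored as a list of rows whose first entry is the bias weight.

```python
import math


def count_weights(layer_sizes):
    """Count weights between consecutive layers, with one bias unit per layer."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, weight_matrices):
    """Propagate one example through the network and return the output layer."""
    activations = list(inputs)
    for matrix in weight_matrices:
        with_bias = [1.0] + activations  # prepend the bias unit
        activations = [sigmoid(sum(w * a for w, a in zip(row, with_bias)))
                       for row in matrix]
    return activations


print(count_weights([400, 25, 10]))  # 401*25 + 26*10 = 10285
```

With the provided weights loaded into two matrices of shapes 25 x 401 and 10 x 26, the predicted digit would be the index of the largest value returned by `forward`; how that index maps to a digit label depends on the assignment's labeling convention.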