chenpengyao / 19fall_msbd5001_individualpro
# 19Fall_MSBD5001_IndividualPro

The code and description of my project for the 19Fall MSBD5001 Individual Project in the HKUST BDT Program.

There are two different models for this problem. They differ in whether a classifier is used to predict if the user will play a game in the testing dataset.

Both methods process the raw dataset into a training set and a testing set in the same way. I combine all of the attribute words in the variables 'genres', 'categories' and 'tags' and convert them into dummy variables. Missing values are handled by deleting the 4 samples that contain NA. The time variables 'release date' and 'purchase date' are converted by taking the difference between the two, which creates a new variable 'date_diff': 'purchase date' - 'release date'.

Method 2 uses only 5 variables to predict the play_time, without the dummy variables, and the model is a simple linear model.

Method 1 first combines those 5 variables with the dummy variables, then uses GBDT as a classifier to predict whether the user will play a game. That is, a new label 'play_or_not' is added to the samples in the training set: it is 0 if 'playtime_forever'==0 and 1 otherwise. A simple linear regression model is then trained on the subset of the training set with 'play_or_not'==1 and used to predict 'playtime_forever' for the testing samples that the classifier predicts will be played.

The code has only one parameter, 'test_size', which sets the size of the self-testing set split from the training set. The programming language is Python 3.7, and the packages used are pandas and sklearn.

Finally, if you want to run this code, change the data file_path in the code to your own and choose a 'test_size'.
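The preprocessing and Method 1 described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`add_features`, `fit_two_stage`, `predict_two_stage`), the underscore column names `purchase_date` and `release_date`, the comma separator for attribute words, the `attr_` prefix, and the choice to output 0 for games predicted as not played are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LinearRegression


def add_features(df, attr_cols=("genres", "categories", "tags"), sep=","):
    """Add 'date_diff' and one 0/1 dummy column per attribute word.

    Column names and the separator are assumptions about the data layout.
    """
    out = df.copy()
    # date_diff = purchase date - release date, expressed in days.
    out["date_diff"] = (
        pd.to_datetime(out["purchase_date"]) - pd.to_datetime(out["release_date"])
    ).dt.days
    # Join the words of all attribute columns into one string per row,
    # then expand that string into dummy variables.
    combined = out[list(attr_cols)].fillna("").agg(sep.join, axis=1)
    dummies = combined.str.get_dummies(sep=sep).add_prefix("attr_")
    return pd.concat([out.drop(columns=list(attr_cols)), dummies], axis=1)


def fit_two_stage(X, y, random_state=0):
    """Stage 1: GBDT classifier for play_or_not. Stage 2: linear regression
    fitted only on the samples that were actually played."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    play_or_not = (y != 0).astype(int)  # 0 if playtime_forever == 0, else 1
    clf = GradientBoostingClassifier(random_state=random_state).fit(X, play_or_not)
    reg = LinearRegression().fit(X[play_or_not == 1], y[play_or_not == 1])
    return clf, reg


def predict_two_stage(clf, reg, X):
    """Predict playtime only where the classifier says the game is played.

    Rows predicted as not played get 0 (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    will_play = clf.predict(X) == 1
    pred = np.zeros(len(X))
    if will_play.any():
        pred[will_play] = reg.predict(X[will_play])
    return pred
```

A self-testing split with a chosen 'test_size' could be produced with `sklearn.model_selection.train_test_split(X, y, test_size=...)` before calling `fit_two_stage` on the training part. Method 2 corresponds to fitting `LinearRegression` alone on the 5 non-dummy variables.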