IEOR 4579 Machine Learning in Practice – Final Project

Airbnb Price Prediction Using Machine Learning and Sentiment Analysis

Authors: Ziad El Assal, Youssef Lahlou, Lorenzo Rega

Forked from Pouya Rezazadeh Kalehbasti's repository

We reproduce the results of Kalehbasti et al., extend them with demographic and socio-economic features, and conduct a robustness analysis of the models used. Although we slightly modified data_cleanup.py and data_preprocessing.py, most of our code is in the following notebooks:

  • data_cleanup.ipynb joins the new dataset to the original one,
  • run_models.ipynb runs the models on the enhanced dataset and assesses the statistical significance of the metrics,
  • robustness.ipynb contains the robustness analysis.

The new dataset can be found at http://faculty.baruch.cuny.edu/geoportal/resources/nyc_geog/nyc_zcta_census_data.xlsx
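
As a rough illustration of the kind of join data_cleanup.ipynb performs, the sketch below attaches census columns to listings with a left merge on ZIP code. It is not the notebook's code: the join key, the column names (zipcode, zcta, median_income) and the toy rows are assumptions made for the example. Only the idea of joining the census data to the original dataset comes from this README.

```python
import pandas as pd


def join_census(listings: pd.DataFrame, census: pd.DataFrame) -> pd.DataFrame:
    """Left-join census features onto listings by 5-digit ZIP code.

    Column names are hypothetical; adapt them to the real files.
    """
    listings = listings.copy()
    census = census.copy()
    # Normalize both keys to zero-padded 5-character strings so that
    # an integer 10001 and the string "10001" match.
    listings["zipcode"] = listings["zipcode"].astype(str).str.strip().str.zfill(5)
    census["zcta"] = census["zcta"].astype(str).str.strip().str.zfill(5)
    merged = listings.merge(
        census,
        how="left",               # keep every listing, even without census data
        left_on="zipcode",
        right_on="zcta",
        validate="many_to_one",   # fail loudly if a ZCTA appears twice in census
    )
    return merged.drop(columns="zcta")


# Toy data standing in for the real listings and census tables.
listings = pd.DataFrame(
    {"id": [1, 2, 3], "zipcode": ["10001", "11201", "99999"], "price": [150.0, 200.0, 80.0]}
)
census = pd.DataFrame({"zcta": [10001, 11201], "median_income": [90000, 110000]})

enhanced = join_census(listings, census)
print(enhanced)
```

Listings whose ZIP code has no census row keep their original columns and get missing values for the census features, so they still need to be imputed or dropped before training.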

###########################################

Airbnb Price Prediction Using Machine Learning and Sentiment Analysis

Authors:

Pouya Rezazadeh Kalehbasti

Liubov Nikolenko

Hoormazd Rezaei

Link to source paper for citation: https://arxiv.org/abs/1907.12665

###########################################

To run the code, make sure you pre-install all the dependencies, such as TextBlob and sklearn.
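
One way to check the environment before running anything is a small script like the sketch below. It is not part of the repository. The module list is an assumption based on the packages named above plus numpy (the scripts call np.load); the full dependency list may be longer.

```python
import importlib.util

# Import name -> pip package name. This list is an assumption, not the
# repository's official requirements.
REQUIRED = {
    "textblob": "textblob",
    "sklearn": "scikit-learn",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of modules that cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All listed dependencies are importable.")
```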

  • DOWNLOAD THE DATASET:

Create a directory called "Data", and download the datasets from this link into the directory: https://drive.google.com/drive/folders/1xk5RyR-UgF6M-ddhn11SXHEWJeB0fQo5?usp=sharing

  • INITIAL DATA PREPROCESSING:
  1. Generate a file with review sentiment: python sentiment_analysis.py
  2. Clean the data: python data_cleanup.py
  3. Normalize and split the data: python data_preprocessing_reviews.py
  • GENERATE THE FEATURE SELECTION .NPY FILE:
  1. For P-value feature selection: python feature_selection.py
  2. For Lasso CV: python cv.py
  • TRAIN AND RUN THE MODELS: python run_models.py
    Note: by commenting or uncommenting certain lines of code, you can run different configurations of the models.
  1. To run the models with Lasso CV feature selection, comment out line 240 coeffs = np.load('../Data/selected_coefs_pvals.npy') and uncomment line 241 coeffs = np.load('../Data/selected_coefs.npy').
  2. To run the models with p-value feature selection, uncomment line 240 coeffs = np.load('../Data/selected_coefs_pvals.npy') and comment out line 241 coeffs = np.load('../Data/selected_coefs.npy').
  3. To run the baseline, uncomment lines 277 and 278
    print("--------------------Linear Regression--------------------")
    LinearModel(X_concat, y_concat, X_test, y_test)

and comment out everything below these lines. Also comment out lines 268, 269 and 270:

   X_train = X_train[list(col_set)]
   X_val = X_val[list(col_set)]
   X_test = X_test[list(col_set)]
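
Switching configurations by editing line numbers is easy to get wrong, especially if the file has changed since this README was written. A possible alternative, not implemented in run_models.py, is to pick the feature-selection file from a named option. In the sketch below, the helper coef_path and the option names "lasso" and "pvalue" are hypothetical; only the two .npy file names and the ../Data location come from the instructions above.

```python
from pathlib import Path

# Location and file names as used in run_models.py per the steps above.
DATA_DIR = Path("../Data")
COEF_FILES = {
    "lasso": "selected_coefs.npy",         # produced by cv.py
    "pvalue": "selected_coefs_pvals.npy",  # produced by feature_selection.py
}


def coef_path(method: str) -> Path:
    """Return the path of the selected-features file for a given method."""
    if method not in COEF_FILES:
        raise ValueError(
            f"unknown feature selection method {method!r}; "
            f"expected one of {sorted(COEF_FILES)}"
        )
    return DATA_DIR / COEF_FILES[method]


# Inside run_models.py this would replace the two alternative np.load lines:
#     coeffs = np.load(coef_path("lasso"))
print(coef_path("lasso"))
print(coef_path("pvalue"))
```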

Warning: certain models take a while to train and run!
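
The steps above can also be chained in a small driver script. The following is a minimal sketch, not something shipped with the repository: the function names and the dry_run flag are made up for illustration, and it assumes the scripts are run from the directory that contains them (they load files via ../Data). It does not change which lines are commented in run_models.py, so that still has to match the chosen feature-selection method.

```python
import subprocess
import sys

# Script names and order taken from the steps in this README.
PREPROCESSING = [
    "sentiment_analysis.py",
    "data_cleanup.py",
    "data_preprocessing_reviews.py",
]
FEATURE_SELECTION = {"pvalue": "feature_selection.py", "lasso": "cv.py"}


def build_pipeline(selection: str = "lasso") -> list:
    """Return the scripts to run, in order, for one feature-selection method."""
    if selection not in FEATURE_SELECTION:
        raise ValueError(f"selection must be one of {sorted(FEATURE_SELECTION)}")
    return PREPROCESSING + [FEATURE_SELECTION[selection], "run_models.py"]


def run_pipeline(selection: str = "lasso", dry_run: bool = True) -> None:
    """Print each command; execute it too when dry_run is False."""
    for script in build_pipeline(selection):
        cmd = [sys.executable, script]
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands that would be executed.
run_pipeline("lasso", dry_run=True)
```

Calling run_pipeline("pvalue", dry_run=False) would execute the five scripts for the p-value configuration. Given the warning above, expect the last step to take a while.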
