MoscowRent

Analysis of rental property pricing in Moscow

This project builds a model that predicts the rental price per square meter for 1- to 5-room flats in Moscow. The training data were scraped from advertisements on avito.ru, a popular classifieds site. Scraping ran daily from August 1 to August 14 and yielded 14 thousand observations. A random forest is used for prediction and achieves a mean absolute percentage error of 16%.
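
For reference, the mean absolute percentage error (MAPE) quoted above can be computed as in the following minimal sketch. It is not taken from the project's code; the function name and the sample prices are made up for illustration.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return MAPE in percent: the mean of |actual - predicted| / |actual|.

    Illustrative helper, not part of the repository. Actual values must be
    non-zero, which holds for rental prices.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    total = 0.0
    for a, p in zip(actual, predicted):
        total += abs(a - p) / abs(a)
    return 100.0 * total / len(actual)


# Hypothetical prices per square meter (not real data from the project).
actual_prices = [100.0, 200.0]
predicted_prices = [110.0, 180.0]
mape = mean_absolute_percentage_error(actual_prices, predicted_prices)
print(f"MAPE: {mape:.1f}%")
```

A MAPE of 16% means that, on average, the predicted price per square meter differs from the advertised one by 16% of the advertised price.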

Description of the files (in the order of execution):

scraping.py - scraping data from Avito and saving them to pickles
stations.py - downloading information about Moscow underground stations and computing their distances to the city center
make_dataset.py - constructing a dataframe for exploratory data analysis from pickle files
eda.ipynb - exploratory analysis of the prices, publications flow and commissions
modelling.ipynb - hyperparameter optimization with cross-validation
build_features.py - constructing a dataframe for training from pickle files
train_model.py - training random forest with hyperparameters found in modelling.ipynb
predict_model.py - making predictions with the trained model
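
The README does not say how stations.py measures the distance from a station to the city center. One common approach is the haversine (great-circle) formula, sketched below. The function name, the choice of city-center coordinates (roughly the Kremlin) and the sample station coordinates are all assumptions made for illustration, not values from the project.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumption: the city center is taken to be approximately the Kremlin.
CITY_CENTER_LAT = 55.7520
CITY_CENTER_LON = 37.6175


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_center_km(lat, lon):
    """Distance from a point to the assumed city center, in kilometers."""
    return haversine_km(lat, lon, CITY_CENTER_LAT, CITY_CENTER_LON)


# Hypothetical station coordinates, used only to show the call.
station_lat, station_lon = 55.80, 37.50
print(f"{distance_to_center_km(station_lat, station_lon):.2f} km from the center")
```

Over city-scale distances the haversine formula is accurate enough for a feature like this; a geodesic library would only matter if sub-meter precision were needed.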

Only eda.ipynb and modelling.ipynb contain comments.

Project Organization

├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models
│
├── notebooks          <- Jupyter notebooks.
│   ├── eda.ipynb      <- Exploratory data analysis
│   └── modelling.ipynb <- Hyperparameter optimization
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment
└── src                <- Source code for use in this project.
    ├── __init__.py    <- Makes src a Python module
    │
    ├── data           <- Scripts to download or generate data
    │   ├── make_dataset.py
    │   ├── scraping.py
    │   └── stations.py
    │
    ├── features       <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    └── models         <- Scripts to train models and then use trained models to make
        │                 predictions
        ├── predict_model.py
        └── train_model.py

Project based on the cookiecutter data science project template. #cookiecutterdatascience
