delivery-pickup-analysis's Introduction

Delivery Pickup Analysis with Food Delivery Data

This project has three parts: building a graph that links each restaurant to grouped delivery locations, with the median delivery time stored on each link; training an ML model on delivery features (distance, traffic, weather conditions, etc.) to predict the delivery time of incoming orders; and using OSRM to find the duration and distance of the optimal route.

Set Up OSRM with Docker (following the steps at https://hub.docker.com/r/osrm/osrm-backend/)

The food delivery dataset covers India, so first download the India OpenStreetMap extract that OSRM will use:

wget http://download.geofabrik.de/asia/india-latest.osm.pbf

Place the file in <project_dir>/data, where project_dir is also the current directory. Then extract the routing data from the downloaded osm.pbf file:

docker run -t -v "${PWD}/data:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/india-latest.osm.pbf

After that, run:

docker run -t -v "${PWD}/data:/data" osrm/osrm-backend osrm-partition /data/india-latest.osrm
docker run -t -v "${PWD}/data:/data" osrm/osrm-backend osrm-customize /data/india-latest.osrm

Note: The machine must have enough memory to run the Docker commands; otherwise Docker exits with code 137 (out of memory).

  • System Information: The Docker processes use approximately 18-24 GB of memory.

    System Info

Start the OSRM backend in detached mode:

docker run -t -d -i -p 5000:5000 -v "${PWD}/data:/data" osrm/osrm-backend osrm-routed --algorithm mld /data/india-latest.osrm
  • .osm stands for OpenStreetMap. A .pbf file (Protocolbuffer Binary Format) is a more compact alternative to the XML format and is used for transferring GIS data.
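
Once the backend is up, a route can be requested over HTTP. The following is a minimal sketch, assuming the container is reachable at localhost:5000 as in the command above and that the standard OSRM route service is used. The function names and the (lat, lon) tuple convention are illustrative and not taken from the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: matches the -p 5000:5000 port mapping used when starting the container.
OSRM_BASE_URL = "http://localhost:5000"


def build_route_url(src, dst, base_url=OSRM_BASE_URL):
    """Build a route request URL; src and dst are (lat, lon) tuples."""
    # OSRM expects coordinates as lon,lat pairs separated by a semicolon.
    coords = f"{src[1]},{src[0]};{dst[1]},{dst[0]}"
    query = urlencode({"overview": "false", "alternatives": "true"})
    return f"{base_url}/route/v1/driving/{coords}?{query}"


def best_route(response):
    """Return (duration_seconds, distance_meters) of the fastest returned route."""
    if response.get("code") != "Ok":
        raise ValueError(f"OSRM error: {response.get('code')}")
    fastest = min(response["routes"], key=lambda route: route["duration"])
    return fastest["duration"], fastest["distance"]


def query_osrm(src, dst):
    """Call the running backend and return the fastest route's duration and distance."""
    with urlopen(build_route_url(src, dst), timeout=10) as resp:
        return best_route(json.load(resp))
```

OSRM reports durations in seconds and distances in meters, so the values may need converting before they are combined with dataset columns that use other units.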

Redis Graph for Delivery Grouping, Connections and Visualization

The dataset contains the latitudes and longitudes of restaurants and delivery addresses in their original form. These coordinates are converted to hexagon ids (hex ids) using the H3 library, which groups delivery addresses that fall in the same hexagon and reduces dimensionality. In addition, to support future predictions for unknown locations, the distance between the source and destination locations is calculated and added to the dataset as a column. This improves the performance of the CatBoost Regressor model.
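
As a rough illustration of these two features, the sketch below computes a great-circle distance and a hex id for a coordinate pair. The haversine formula and the H3 resolution of 8 are assumptions; the project does not state which distance measure or resolution it uses. The h3 import is optional so that the distance function can be used on its own.

```python
import math

try:
    import h3  # optional dependency: pip install h3
except ImportError:
    h3 = None

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def to_hex_id(lat, lon, resolution=8):
    """Convert a coordinate to an H3 cell id (resolution 8 is an assumed value)."""
    if h3 is None:
        raise RuntimeError("the h3 package is not installed")
    if hasattr(h3, "latlng_to_cell"):  # h3 4.x API
        return h3.latlng_to_cell(lat, lon, resolution)
    return h3.geo_to_h3(lat, lon, resolution)  # h3 3.x API
```

A lower resolution gives larger hexagons and therefore coarser groups; the right value depends on how densely the delivery addresses are spread.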

  • Grouping: There may be many orders between one restaurant and one delivery location in hexagon form, so orders are grouped to avoid multiple nodes representing the same delivery hex id. The median of the delivery times is stored as an edge attribute between the source and destination nodes in the Redis Graph. (If the data were sufficient and suitable for time-based analysis, this median could be updated hourly.)

    Example group result:

    Group Result with Polars

  • Connections: The graph is first constructed from the initial data (in the dataset) after grouping. When new delivery data arrives, the source-destination connection in the graph is examined:

    • If a connection already exists, the data is sent to the OSRM backend to improve the accuracy of the duration. OSRM returns the duration of the optimal route after comparing the durations of all routes (including alternatives) using their time and distance parameters. The duration attribute of the edge is then updated to the mean of the current edge duration and the OSRM duration.
    • If there is no connection yet (between existing nodes; other cases will be handled later), no delivery has been made between those nodes. The delivery duration is predicted by the CatBoost Regressor model from the features of previous deliveries. The OSRM result is also calculated for the from-to locations, and an edge is created between the nodes with the mean of the model prediction and the OSRM result.

    Example new edge creation for RedisGraph: (yellow->source, orange->destination)

    New Edge Request Schema

    New Edge Schema

    Example edge update for RedisGraph:

    Before Update Schema

    Edge Update Request Schema

    Edge Update Schema
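
The grouping and connection rules above can be sketched in plain Python as follows. This is an illustration only: the project appears to use Polars for grouping and RedisGraph for storage, whereas here a dictionary stands in for the graph, and the names group_median, apply_delivery and predict are hypothetical. All durations are assumed to be in the same unit.

```python
from statistics import median


def group_median(orders):
    """Group orders by (source_hex, dest_hex) and take the median delivery time.

    orders: iterable of (source_hex, dest_hex, delivery_time) tuples.
    Returns a dict mapping (source_hex, dest_hex) to the median delivery time.
    """
    grouped = {}
    for src, dst, delivery_time in orders:
        grouped.setdefault((src, dst), []).append(delivery_time)
    return {pair: median(times) for pair, times in grouped.items()}


def apply_delivery(edges, src, dst, osrm_duration, predict=None):
    """Update or create the (src, dst) edge and return its new duration.

    edges: dict standing in for the graph's edge durations.
    predict: hypothetical callable (src, dst) -> predicted duration, standing in
             for the CatBoost model; only needed when the edge does not exist.
    """
    key = (src, dst)
    if key in edges:
        # Existing connection: mean of the current edge duration and the OSRM duration.
        edges[key] = (edges[key] + osrm_duration) / 2
    else:
        # No connection yet: mean of the model prediction and the OSRM duration.
        edges[key] = (predict(src, dst) + osrm_duration) / 2
    return edges[key]
```

Because each update averages the stored value with the newest OSRM result, recent requests weigh more heavily than older ones; that follows directly from the rule described above.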

A new Redis Graph could be created for each hour, but there is not enough data for time series analysis. This data could also be used to detect novelties with different techniques (ML, statistics, etc.).
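
For reference, writing an edge duration to the graph could be expressed as a parameterized Cypher query. The sketch below only builds the query and its parameters. The node labels (Restaurant, DeliveryLocation), the relationship type (DELIVERS_TO) and the property names (hex_id, duration) are hypothetical, since the project's actual graph schema is not shown here.

```python
def edge_upsert_query(src_hex, dst_hex, duration):
    """Build a Cypher query that creates or updates the edge between two existing nodes.

    Labels, relationship type, and property names are illustrative assumptions.
    """
    query = (
        "MATCH (s:Restaurant {hex_id: $src}), (d:DeliveryLocation {hex_id: $dst}) "
        "MERGE (s)-[e:DELIVERS_TO]->(d) "
        "SET e.duration = $duration"
    )
    params = {"src": src_hex, "dst": dst_hex, "duration": duration}
    return query, params


# Possible usage with redis-py (requires a running RedisGraph instance;
# the graph name "deliveries" is an assumption):
#   import redis
#   query, params = edge_upsert_query("hex_a", "hex_b", 27.5)
#   redis.Redis().graph("deliveries").query(query, params)
```

MERGE creates the relationship only when it is missing, so the same query serves both the new-edge and the edge-update case.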

LightGBM - CatBoost Comparison with MLflow

Two well-known ML regression models were compared on the delivery dataset: LightGBM, which was used by the winning solution of the M5 (Makridakis) competition, and CatBoost, which encodes categorical features automatically. The comparison results for different parameters are shown below.

MLflow Schema

To obtain results, run:

python3 mlflow_comparison.py

and go to MLFlow UI runs on localhost:5000 via:

mlflow ui
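
A comparison of this kind comes down to scoring each model's predictions on the same held-out data. The sketch below shows one way to do that in plain Python, assuming RMSE and MAE as the metrics; the metrics that mlflow_comparison.py actually records are not stated here, and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def compare(y_true, predictions):
    """Score each model and return (name, rmse, mae) rows sorted by RMSE, best first.

    predictions: dict mapping a model name to its list of predicted values.
    """
    rows = [(name, rmse(y_true, pred), mae(y_true, pred)) for name, pred in predictions.items()]
    return sorted(rows, key=lambda row: row[1])
```

With MLflow, each row could then be recorded inside a run using mlflow.log_params and mlflow.log_metric so that the runs appear side by side in the UI.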

Run

The system will be ready to accept requests through FastAPI at localhost:3000 after running:

docker-compose up -d --build

Contributors

slcnyagmurnew
