Backend API FULFIL.IO Assignment

Backend API FULFIL.IO Assignment is a Django Backend API that reads data from a csv file, delivers its computation to a celery worker, and stores in PostgreSQL. Thereby, you can interact with those data : retrieve, add, delete, and update

Table of contents

Technologies used and Why ?

To resolve this problem, we have used django, djangorestframework, celery & redis , django-signals , socketIO , PostgreSQL , and pandas.

django: among the best python web frameworks.
djangorestframework: we are supposed to build a small REST API. Therefore, Django Rest Framework is suitable for the solution.
celery & redis: performs asynchronous tasks with the redis broker
django-signals: handles webhooks configurations.
socketIO: sends socket messages to the client. It's an alternative of SSE, not correctly working with Django
PostgreSQL: Database used to store our data.
pandas: reads the input Csv file and deduplicate the data.

Installation

To run my Backend solution, you must have python, pip, redis-server, and PostGreSQL installed in your system and configure the redis server and postgresql with django

Download the project from GitHub

To clone my code, run the command below in the CLI

git clone "https://github.com/adrienTchounkeu/backend_assignment_fulfil.git"

You can also download the project by clicking the link Backend_assignment_fulfil

Install Dependencies

After downloading the code, open the CLI in the root directory and execute the command :

pip install -r requirements.txt

NB: "requirements.txt is a file which contains all the project dependencies"

After all the project dependencies are installed, run the command

python manage.py runserver # on Windows

python3 manage.py runserver # on Linux

To run the Celery worker, run the command

celery -A backend_assignment worker -l info --pool=solo # to launch celery

NB: The server generally starts on the port 8000

Heroku Deploy and Frontend app

The Backend API is available through the link https://backend-assignment-fulfil.herokuapp.com

Assumptions & Issues

To deploy my application, two add-ons were needed : postgresql and redis. I, therefore, connected my visa card account to heroku because unable to add more than one add-on otherwise.
Due to some dynos(processes on Heroku) limitations, my backend is not working properly. Some endpoints are neither returning the good response nor performing the request. Though, it is working perfectly in the local environment

NB : You will see in the commit history, many useless commits when is was tyring to figure out heroku deployment errors

Frontend App

The Backend communicate with the Frontend app, written in VueJs. You can access through the link Frontend_assignment_fulfil

Analyzing The Solution

Before starting coding, we have to understand the problem and think of the solution. We have structured our project as follow :

Choose a great tool to read large csv files : Pandas for instance
Create custom signals to dispatch when there's a manual create/update action.
After loosing a lot of time on trying to integrate SSE with Django, I finally choose SocketIO to send live streams events to the Client
To avoid high cost performance in our app, we use a worker to handle asynchronous tasks and a redis server to work along with Celery, and channels our socket events.
A high in performance SQL Database : PostGreSQL for instance.

Solving Backend API FULFIL.IO Assignment -------------------------------------------

Assumptions

To solve the problem, we did some hypothesis:

The file is stored in other for the worker to efficiently process it.

Solution

To solve the problem, we use DataFrames and pandas as pd functions, workers, brokers, sockets and signals

read large CSV files with pd.read_csv in chunks(100000)
drop duplicates on sku in DataFrames with pd.drop_duplicates
bulk_create django orm functions to store all the data at once
celery workers to perform asynchronous tasks, along with brokers
sockets to send data status event messages to the client
signals to handle webhooks configurations

Tests

No tests were done to test the endpoints and functions

Further perspectives

Limitations & Optimizations

Even though my code is solving the problem, I have some performance and resources used issues. To optimize my solution, I think

implement parallelization : optimization reading CSV files
use SSE to establish a unidirectional connection with the client, for speed and security issues
after lots of research, Flask along with SQLAlchemy best fits the solution because it functions smoothly with SSE
Regarding deployment, we should implement the solution on a well-designed server (Linux for instance) rather than using an easy deploy service(huge limitation)

Real-life Adaptation

Assuming that we have files coming from more multiple sources, we will encounter the following problems:

performance issues while reading files
storing huge amounts of data
requesting on huge amount of data
computing huge amounts of data

To solve this problem, we need, to begin, create indexes on our columns in our database to optimize queries, use a server with great memory and processor, and finally use efficient tools to read and deduplicate, dask must be tested because of his apparently proven performance.

adrientchounkeu / backend_assignment_fulfil Goto Github PK

backend_assignment_fulfil's Introduction

Backend API FULFIL.IO Assignment

Technologies used and Why ?

Installation

Download the project from GitHub

Install Dependencies

Heroku Deploy and Frontend app

Assumptions & Issues

Frontend App

Analyzing The Solution

Assumptions

Solution

Tests

Further perspectives

Limitations & Optimizations

Real-life Adaptation

backend_assignment_fulfil's People

Contributors

Stargazers

Watchers

Forkers

Recommend Projects

Recommend Topics

Recommend Org