
codeawstest's Introduction

Structural flow

[Figure: AWS web app structural flow]

Running Demo: Scraped Content Interaction

[Figure: running demo of scraped content interaction]

Running Demo: Scraping Feature

[Figure: running demo of the scraping feature]

Workflow Overview

  1. Client Authentication:

    • Action: The user logs in through the React application.
    • Process:
      • The React app sends user credentials to the AuthService via Kong.
      • AuthService verifies the credentials and returns an authentication token, which Kong then checks to authenticate all subsequent traffic.
      • The token is stored in the browser’s local storage for subsequent requests.
  2. Client Interaction:

    • Action: The user sets search filters (e.g., price range, mileage) for car listings.
    • Process:
      • The React app sends the search criteria and the stored authentication token to the Scraper Task Producer via the Kong API Gateway.
  3. Scraper Task Producer:

    • Action: The Scraper Task Producer generates a Facebook Marketplace search link based on the user’s filters.
    • Process:
      • The Scraper Task Producer scrapes the listing page, collecting all relevant car listing links.
      • It uses concurrent.futures to manage background tasks that run Selenium in parallel for scraping.
      • Once all links are collected, the task producer sends them to Kong API Gateway for further processing.
  4. Kong (API Gateway & Load Balancer):

    • Action: Kong handles routing and load balancing.
    • Process:
      • Kong verifies the authentication token provided by the Scraper Task Producer.
      • Kong distributes the scraping tasks to various instances in the NLP Services EC2 cluster based on load and availability.
      • It routes the requests to the appropriate NLP services.
  5. NLP Services:

    • Action: The NLP services process car listing links.
    • Process:
      • Each instance in the NLP Services cluster processes the provided car listing links, extracting detailed attributes and filtering out undesired listings (e.g., scams, down payments).
      • NLP Services use concurrent.futures to manage background tasks that run Selenium for scraping and spaCy for NLP processing.
      • The processed data is then sent to the Scraped Content Service via Kong.
  6. Scraped Content Service:

    • Action: The Scraped Content Service stores the processed car listings.
    • Process:
      • Kong verifies the authentication token provided by NLP Services.
      • The Scraped Content Service stores detailed car listings in its database using Django ORM.
      • It ensures data persistence and makes it accessible for future queries.
  7. SSE Service:

    • Action: The SSE Service handles real-time updates for new car listings.
    • Process:
      • Event Publication: The Scraped Content Service publishes events to Redis when new car listings are available.
      • Event Subscription: The SSE Service, using FastAPI, subscribes to Redis channels to receive these events.
      • Real-Time Broadcasting: The SSE Service broadcasts the events to clients via Server-Sent Events (SSE).
      • UI Update: The React app subscribes to SSE events and updates the UI with new car listings in real time.
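
The broadcasting step in item 7 comes down to writing each event in the Server-Sent Events wire format. A minimal sketch using only the standard library; the `format_sse` helper, the `new_listing` event name, and the payload fields are hypothetical and not taken from the project:

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one "data:" field suffices.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


# Hypothetical payload for a newly stored listing.
message = format_sse({"id": 1, "title": "2015 Honda Civic"}, event="new_listing")
```

In the browser, an `EventSource` listener registered for the same event name would receive the JSON string as the event's data.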

Tech Stack and Deployment

1. Client Application (React)

  • Tech Stack:
    • Frontend: React
    • UI Components: Material-UI (MUI)
  • Deployment:
    • Hosting: AWS S3
    • Details: The React application is built as static files and deployed on S3 for static website hosting.

2. AuthService

  • Tech Stack:
    • Backend Framework: Django
    • Authentication: Django REST Framework
  • Deployment:
    • Hosting: AWS EC2
    • Details: The Django application is deployed on EC2 instances. AuthService is registered in Kong API Gateway for handling authentication requests.
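
Once AuthService has issued a token, every later request carries it through Kong. The sketch below shows how a caller might attach it, using only the standard library. The gateway URL, the route, and the `Token` header scheme are assumptions for illustration, since this README does not specify them:

```python
import json
import urllib.request

# Hypothetical gateway address and route; substitute the real Kong endpoint.
KONG_URL = "http://kong.example.com/scraper/tasks"


def build_search_request(token: str, filters: dict) -> urllib.request.Request:
    """Build (but do not send) a POST carrying the token and search filters."""
    body = json.dumps(filters).encode("utf-8")
    return urllib.request.Request(
        KONG_URL,
        data=body,
        method="POST",
        headers={
            # Assumed scheme; the actual service may expect a different prefix.
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request("abc123", {"max_price": 8000, "max_mileage": 120000})
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is a placeholder.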

3. Scraper Task Producer

  • Tech Stack:
    • Backend Framework: FastAPI
    • Background Task Execution: Python concurrent.futures (for managing background tasks)
    • Web Scraping: Selenium with Chrome
  • Deployment:
    • Hosting: AWS EC2
    • Details: The FastAPI application is deployed on EC2. concurrent.futures manages background tasks for scraping using Selenium. The service is registered in Kong API Gateway with routes for scraping tasks.
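
The parallel scraping described here can be sketched with a thread pool. This is an illustration under assumptions: `scrape_listing` is a hypothetical stand-in for the Selenium work, the links are placeholders, and the project may use a different executor type or worker count:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List


def scrape_listing(url: str) -> Dict[str, str]:
    """Stand-in for a Selenium scrape; a real worker would drive Chrome here."""
    return {"url": url, "status": "scraped"}


def scrape_all(urls: List[str], max_workers: int = 4) -> List[Dict[str, str]]:
    """Run scrape_listing over urls in parallel, collecting results as they finish."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_listing, url): url for url in urls}
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except Exception as exc:
                # One failed page should not abort the whole batch.
                results.append({"url": futures[future], "status": f"failed: {exc}"})
    return results


# Hypothetical listing links.
links = [f"https://example.com/item/{i}" for i in range(3)]
scraped = scrape_all(links)
```

Results arrive in completion order, not submission order, so callers that need the original order should sort or key by URL.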

4. NLP Services

  • Tech Stack:
    • Backend Framework: FastAPI
    • NLP Library: spaCy
    • Background Task Execution: Python concurrent.futures (for managing background tasks)
    • Web Scraping: Selenium with Chrome
  • Deployment:
    • Hosting: AWS EC2 (Cluster of instances)
    • Details: FastAPI applications run on multiple EC2 instances. concurrent.futures manages background tasks for Selenium and spaCy. Each service instance is registered in Kong API Gateway with routes for NLP processing.
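
The filtering of undesired listings can be pictured with a plain keyword check. This is a deliberately simplified stand-in, not the spaCy pipeline the service uses; the phrase list and the `is_undesired` helper are hypothetical:

```python
import re

# Hypothetical red-flag phrases; the real service's rules are not documented here.
RED_FLAGS = ("down payment", "financing available")


def is_undesired(description: str) -> bool:
    """Return True when the description contains any red-flag phrase."""
    # Lowercase and collapse whitespace so spacing and case do not hide a match.
    text = re.sub(r"\s+", " ", description.lower())
    return any(phrase in text for phrase in RED_FLAGS)


listings = [
    {"title": "2014 Toyota Corolla", "description": "Clean title, one owner."},
    {"title": "2018 Ford Focus", "description": "Only $500 DOWN  PAYMENT, drive today!"},
]
kept = [item for item in listings if not is_undesired(item["description"])]
```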

5. Scraped Content Service

  • Tech Stack:
    • Backend Framework: Django
    • ORM: Django ORM
  • Deployment:
    • Hosting: AWS EC2
    • Details: The Django application is deployed on EC2. It uses Django ORM to store data in a database. The service is registered in Kong API Gateway for content management.

6. SSE Service

  • Tech Stack:
    • Backend Framework: FastAPI
    • Pub/Sub Broker: Redis
  • Deployment:
    • Hosting: AWS EC2
    • Details: The FastAPI application is deployed on EC2. Redis handles pub/sub messaging for real-time updates. The service is registered in Kong API Gateway for SSE endpoints.
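
The publish/subscribe relay can be sketched without Redis. The in-memory `Broadcaster` below is a hypothetical stand-in for a Redis channel, meant only to show how one published event reaches every connected SSE client:

```python
import asyncio
from typing import List, Set


class Broadcaster:
    """In-memory stand-in for a pub/sub channel: one queue per subscriber."""

    def __init__(self) -> None:
        self._subscribers: Set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    async def publish(self, event: str) -> None:
        # Every subscriber gets its own copy of the event.
        for queue in self._subscribers:
            await queue.put(event)


async def demo() -> List[str]:
    channel = Broadcaster()
    client_a = channel.subscribe()
    client_b = channel.subscribe()
    await channel.publish("new_listing")
    return [await client_a.get(), await client_b.get()]


received = asyncio.run(demo())
```

In the deployed service each SSE connection would presumably read from its own subscription and stream events to the browser, calling something like `unsubscribe` when the client disconnects.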

7. Kong (API Gateway & Load Balancer)

  • Tech Stack:
    • API Gateway: Kong
    • Load Balancing: Kong
  • Deployment:
    • Hosting: AWS EC2
    • Details: Kong is deployed on EC2 and configured for API management and load balancing. It routes requests to microservices and manages load distribution for NLP services.
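
Kong's load distribution is configured rather than coded, but the idea of spreading requests over healthy NLP instances can be illustrated with a small round-robin picker. The instance addresses and the `RoundRobin` class are hypothetical and do not reflect Kong's internals:

```python
from itertools import cycle
from typing import Iterable, Set


class RoundRobin:
    """Cycle through targets, skipping any marked unhealthy."""

    def __init__(self, targets: Iterable[str]) -> None:
        self._targets = list(targets)
        self._cycle = cycle(self._targets)
        self.unhealthy: Set[str] = set()

    def next_target(self) -> str:
        # Try each target at most once per call.
        for _ in range(len(self._targets)):
            target = next(self._cycle)
            if target not in self.unhealthy:
                return target
        raise RuntimeError("no healthy targets available")


balancer = RoundRobin(["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"])
balancer.unhealthy.add("10.0.0.2:8000")
picks = [balancer.next_target() for _ in range(4)]
```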

codeawstest's People

Contributors

albayona
