Code Monkey home page Code Monkey logo

countries-scheduled-job's Introduction

countries-scheduled-job

Overview

A Solution to Replicate data from System A(Snowflake) to System B(Postgres)

Assumptions

  • Ordering on the replicated data does not need to be maintained
  • The Case Table structure is identical in both databases
  • Table only contains new/fresh data. Table will not contain data from the previous scheduled job

Metadata

Table: case

Column Name Type
country_id int
country varchar

Error Queue Message:

{
  "country_id": Number,
  "country" : String
}

DLQ Message:

{
  "country_id": Number,
  "country" : String
}

Design Decisons

  • Country column uses a 3-letter code to represent the country
  • Our Case Table has two columns/fields (CountryId, Country)
  • CountryId is the primary key of the Case Table
  • Utilize Mulesoft and Anypoint MQ
  • Use regulare queues (non-FIFO) and Dead Letter Queues (DLQ)
  • Retry on failed records
  • Scale applications independently

Conceptual Design

Diagram

Our system will run on a scheduled job. Schedule and frequency will be set based on business needs. We will have a process application (proc-countries-scheduler-app) which will pull data from the database in System A and insert that data in batches into our database in System B. This application can be scaled horizontally and vertically if needed.

If any errors occur during the process of replicating data over to System B, proc-countries-scheduler-app will publish messages of failed records onto our countries-schedule-error-queue in Anypoint MQ. This allows us to process any failed records asyncronously. This also allows us to have a more sophisticated fallback strategy as we can set a number of retries on those records/messages.

To implement the error stategy using an error queue we will need to have another process application (proc-countries-error-app). This application will subsrcibe to the queue and attempt to insert the failed records into the database in System B. If records fail again after a certain amount of retries, they will be dropped off into our countries-schedule-dlq DLQ.

Our proc-countries-dlq-app will take any messages off of the DLQ and append those failed records on to a failure report file as our lowest state of fallback.

Dead Letter Queues

Anypoint will allow us to assign a DLQ to any queue (countries-schedule-error-queue). After our set amount of redeliveries/retries, the message will be dropped into the assign DLQ. DLQs are meant to be subscribed in the same fashion as other queues. The benefit of subscribing to the queue is the application can handle messages in real-time.

Limitations

  • Anypoint MQ supports up to 120,000 in-flight messages per standard queue.
  • The amount of resources allocated to our application are limited to Mulesoft.

Considerations

  • Limiting the amount of records pulled in per scheduled job
  • Only remove processed records from System A database
  • Could move all the logic in proc-countries-error-app into proc-countries-scheduler-app. Should allocate more workers to handle the extra load put onto the app.
  • If we want real time data, consider having another application which will publish the new records found onto a seperate queue and then having an application subscribe/listen on that queue. Diagram
  • Have a system to process the failure report and determine a strategy to recover or compensate. If customers are impacted, send notifications.

countries-scheduled-job's People

Contributors

aljayf avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.