Signatures POC

This repository generates dummy data and a sample dashboard that mirror a company managing digital document signatures in real time. It is particularly useful for testing and development.

screenshot of Signatures Dashboard

Setup

Requirements

  • Node.js >= v. 18
  • Python >= v. 3.8

  1. Set up your Tinybird account

Click this button to deploy the data project to Tinybird 👇

Deploy to Tinybird

Follow the guided process. When it finishes, your Tinybird workspace is ready to start receiving events.

  2. Set up this repository locally

git clone https://github.com/tinybirdco/signatures-POC.git
cd signatures-POC

  3. Install dependencies

npm install

  4. Set up the Tinybird CLI

The install script above will automatically install and configure the tinybird-cli for this project.

Choose your region: 1 for us-east, 2 for eu. A new .tinyb file will be created.

Go to https://ui.tinybird.co/tokens and copy the token with admin rights.

⚠️ Warning! The token you copied while following this guide is your admin token. Don't share it or publish it in your application. You can manage your tokens via the API or in the Auth Tokens section of the UI. For more detail, see Auth Tokens management.

This script will also push the data project to your Tinybird workspace.

  5. Start generating data!

In the terminal, run the following command:

npm run seed

Go to your Tinybird workspace and check that data is flowing.
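
If you would rather confirm this from a script than from the UI, a minimal sketch along these lines may help. It assumes Tinybird's Query API (the /v0/sql endpoint) with bearer-token authentication. The helper names, the default host, and whatever data source name you pass in are placeholders for illustration, not names taken from this project, and workspaces in other regions use a different API host.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default API host; workspaces in other regions use a different host.
DEFAULT_HOST = "https://api.tinybird.co"


def build_count_request(datasource, token, host=DEFAULT_HOST):
    """Build (but do not send) a Query API request that counts rows in a data source."""
    sql = f"SELECT count() AS events FROM {datasource} FORMAT JSON"
    url = f"{host}/v0/sql?" + urllib.parse.urlencode({"q": sql})
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def count_events(datasource, token, host=DEFAULT_HOST):
    """Send the request and return the row count reported by Tinybird."""
    request = build_count_request(datasource, token, host)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: FORMAT JSON responses carry the result rows under the "data" key.
    return int(payload["data"][0]["events"])
```

Calling count_events with one of your data source names and your token twice, a few seconds apart, should show the count growing while the seed script is running.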

  6. Set up the organizations materialized view

Go to the all_unique_organizations pipe in your Tinybird workspace, click the dropdown caret next to "Create API Endpoint", and select "Create Materialized View". This creates the organizations materialized view data source in your workspace.

  7. Copy the environment variables to .env

In your local copy, paste the admin token you copied during the Tinybird CLI setup step into the .env file.

  8. Run the dashboard locally

npm run dev

  9. Open http://localhost:3000 in your browser to see the result.

License

This project is licensed under the MIT License.

Need help?

  • Community Slack
  • Tinybird Docs

Authors

Contributors

  • joekarlsson
  • ygnuss

signatures-poc's Issues

Unhandled Exception in Parallelized Data Processing Routine

While running the parallelized data processing routine (the process_data_parallel function in the data_processing.py script), an unhandled exception occurs and halts the entire operation. The existing error handling does not appear to catch it.

Steps to Reproduce

  1. Import process_data_parallel from data_processing.py.
  2. Run process_data_parallel(input_data, num_threads=4), where input_data is a data frame with 1 million rows.

Expected Behavior

The function should process the data on all available threads without errors and return a processed data frame.

Actual Behavior

The function throws an unhandled IndexError and halts the process.

Environment

  • Python version: 3.8
  • Library versions: Pandas 1.3.3, NumPy 1.21.2
  • OS: Linux Ubuntu 20.04

Possible Solutions

Conventional: Use try/except blocks within each thread to catch and log exceptions for later debugging. This is the traditional approach, but logging alone does not let the failed tasks complete.
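
As a rough illustration of the conventional option, the sketch below wraps each thread's work in try/except and records failures instead of letting them escape. The real data_processing.py is not shown in this issue, so process_chunk is a hypothetical stand-in for the per-chunk work, and plain lists stand in for data frame partitions.

```python
import logging
import threading

logger = logging.getLogger("data_processing")


def process_chunk(chunk):
    # Hypothetical per-chunk work: indexing an empty chunk raises IndexError.
    first = chunk[0]
    return [value - first for value in chunk]


def worker(index, chunk, results, errors):
    # Catch and log inside the thread so one bad chunk does not stop the others.
    try:
        results[index] = process_chunk(chunk)
    except Exception as exc:
        logger.exception("chunk %d failed", index)
        errors[index] = exc


def process_with_logging(chunks):
    """Run one thread per chunk; return (results, errors) keyed by chunk index."""
    results, errors = {}, {}
    threads = [
        threading.Thread(target=worker, args=(i, chunk, results, errors))
        for i, chunk in enumerate(chunks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, errors
```

The failed chunks end up in errors with their exceptions, so they can be inspected later, but nothing in this version reprocesses them.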

Contrarian/Proactive: Implement a fallback mechanism that reroutes the failed tasks to a dedicated single thread, which could execute a more robust, although slower, data processing function.
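
One possible shape for this fallback, as a sketch under assumptions: fast_process and robust_process are hypothetical names for the quick path and the slower, more defensive path, and plain lists stand in for data frame partitions. Chunks that fail in the parallel pass are rerouted to a single-worker pool.

```python
from concurrent.futures import ThreadPoolExecutor


def fast_process(chunk):
    # Hypothetical fast path: assumes the chunk is non-empty.
    first = chunk[0]
    return [value - first for value in chunk]


def robust_process(chunk):
    # Hypothetical slower path: tolerates the empty chunks the fast path rejects.
    if not chunk:
        return []
    first = chunk[0]
    return [value - first for value in chunk]


def try_fast(chunk):
    # Report failure as a flag instead of raising, so the pool keeps going.
    try:
        return True, fast_process(chunk)
    except Exception:
        return False, None


def process_with_fallback(chunks, num_threads=4):
    """Return (results in input order, indices that were rerouted to the fallback)."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        outcomes = list(pool.map(try_fast, chunks))

    results = [value for _, value in outcomes]
    rerouted = [index for index, (ok, _) in enumerate(outcomes) if not ok]

    # Dedicated single thread for the chunks the fast path could not handle.
    with ThreadPoolExecutor(max_workers=1) as fallback_pool:
        for index in rerouted:
            results[index] = fallback_pool.submit(robust_process, chunks[index]).result()
    return results, rerouted
```

Returning the rerouted indices alongside the results keeps the failures visible even though the operation as a whole completes.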

New Technology: Utilize Python’s concurrent.futures with a custom exception handler wrapped around each future.
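
A minimal sketch of this option, assuming a ThreadPoolExecutor and a caller-supplied handler: each future is checked with future.exception() as it completes, and failures are passed to the handler instead of being re-raised. The names process_chunk, run_with_handler, and on_error are hypothetical, and plain lists stand in for data frame partitions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_chunk(chunk):
    # Hypothetical per-chunk work: indexing an empty chunk raises IndexError.
    first = chunk[0]
    return [value - first for value in chunk]


def run_with_handler(chunks, on_error, num_threads=4):
    """Process chunks in parallel; on_error(index, chunk, exc) supplies a value for failures."""
    results = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        future_to_index = {
            pool.submit(process_chunk, chunk): index
            for index, chunk in enumerate(chunks)
        }
        for future in as_completed(future_to_index):
            index = future_to_index[future]
            exc = future.exception()  # None when the task succeeded
            if exc is not None:
                results[index] = on_error(index, chunks[index], exc)
            else:
                results[index] = future.result()
    # Restore input order, since as_completed yields in completion order.
    return [results[index] for index in range(len(chunks))]
```

The handler decides what a failure turns into: it could log and return None, substitute a default, or call a slower fallback function.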

Quality Product: For mission-critical data pipelines, consider moving to a more robust data processing framework like Apache Flink, which has mature fault tolerance.

Note:

Implementing the contrarian solution would let the function handle similar errors in the future without halting the operation, improving its robustness.
