
tokio_logdna-rust's Introduction

Requirements

  • Receive a CSV file as a POST body (but not as a multipart upload through a form).
    • (Initially) it's OK to accept only a very simple CSV: Header field names are case sensitive, no special handling of quotes, no escaping.
  • Return as an array of JSON objects.
  • Create tests.
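The "very simple CSV" requirement can be sketched with the standard library alone. This is a minimal illustration, not the project's actual handler: the function names `csv_to_json` and `json_escape` are hypothetical, every value is emitted as a JSON string, and a real service would more likely use a CSV crate and a JSON crate.

```rust
/// Escape a string so it can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Convert a very simple CSV (no quoting, no escaping) into a JSON array
/// of objects. The first non-empty line is the header; header names are
/// used as-is (case sensitive). Hypothetical helper, for illustration only.
fn csv_to_json(input: &str) -> Result<String, String> {
    let mut lines = input.lines().filter(|l| !l.is_empty());
    let header: Vec<&str> = lines.next().ok_or("empty input")?.split(',').collect();
    let mut objects = Vec::new();
    for (i, line) in lines.enumerate() {
        let values: Vec<&str> = line.split(',').collect();
        // Reject rows whose field count differs from the header's.
        if values.len() != header.len() {
            return Err(format!(
                "row {}: expected {} fields, found {}",
                i + 1,
                header.len(),
                values.len()
            ));
        }
        let fields: Vec<String> = header
            .iter()
            .zip(&values)
            .map(|(name, value)| format!("\"{}\":\"{}\"", json_escape(name), json_escape(value)))
            .collect();
        objects.push(format!("{{{}}}", fields.join(",")));
    }
    Ok(format!("[{}]", objects.join(",")))
}
```

Calling `csv_to_json` on the request body would yield the JSON array to return; a row with the wrong number of fields yields an error that could be mapped to an HTTP 4xx response.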

Optional requirements

Position-independent CSV headers. A file is accepted regardless of the order of its columns, as long as the column names match.
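One way to get position independence is to map each header name to its position in the upload and then read every row in a fixed, expected order. The sketch below assumes the same simple CSV as above; `reorder_columns` and the `expected` column list are hypothetical names, not part of the project.

```rust
use std::collections::HashMap;

/// Reorder each row of a simple CSV so that its columns follow `expected`,
/// whatever order the header line uses. Names are matched case-sensitively.
/// Hypothetical helper, for illustration only.
fn reorder_columns(input: &str, expected: &[&str]) -> Result<Vec<Vec<String>>, String> {
    let mut lines = input.lines().filter(|l| !l.is_empty());
    let header: Vec<&str> = lines.next().ok_or("empty input")?.split(',').collect();
    if header.len() != expected.len() {
        return Err(format!(
            "expected {} columns, found {}",
            expected.len(),
            header.len()
        ));
    }
    // Map header name -> position in the uploaded file.
    let positions: HashMap<&str, usize> = header
        .iter()
        .enumerate()
        .map(|(i, name)| (*name, i))
        .collect();
    // For each expected column, look up where it sits in the upload.
    let mut order = Vec::with_capacity(expected.len());
    for name in expected {
        let pos = positions
            .get(name)
            .ok_or_else(|| format!("missing column: {name}"))?;
        order.push(*pos);
    }
    let mut rows: Vec<Vec<String>> = Vec::new();
    for (i, line) in lines.enumerate() {
        let values: Vec<&str> = line.split(',').collect();
        if values.len() != header.len() {
            return Err(format!("row {}: wrong number of fields", i + 1));
        }
        rows.push(order.iter().map(|&p| values[p].to_string()).collect());
    }
    Ok(rows)
}
```

With `expected = ["name", "city"]`, an upload whose header is `city,name` produces rows in `name, city` order, while a header containing `Name` is rejected because matching is case sensitive.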

Store in Postgres. Discuss schema designs.

Preferred technologies

Tokio, Axum

Usage

Start the server

Submit a request

  • wget --post-file=tests/assets/addresses.csv http://127.0.0.1:8080/addresses or
  • curl -H "Content-Type: text/csv" --data-binary @tests/assets/addresses.csv 127.0.0.1:8080/addresses
    • Use --data-binary instead of --data; otherwise curl strips newlines, which are part of the CSV format.

Debug

curl -w "%{http_code}" -H "Content-Type: text/json" --data-binary @tests/assets/addresses.csv 127.0.0.1:8080/addresses

Tests

  • export API_KEY=....
  • cargo test

Roadmap

Tradeoffs and Decisions

  • Postgres with https://docs.rs/tokio-postgres/latest/tokio_postgres - chosen because it is built on Tokio and widely used. (It is maintained in the rust-postgres repository rather than by the Tokio project itself.)
  • OpenAPI generation. It seems useless for CSV. But if we had parameters in an HTTP query/form:
  • Write to Postgres
    • Do we have a defined schema for each client/company/data source, shared across their uploads? If yes:
        1. The schema can change over time, so each CSV column's info would need an applicability period (two timestamps, and/or a client/company-specific version number/string). Or
        2. Every time a client/company changes their schema, we create a new endpoint (and a new DB table).
      3. Alternatively, we have a dynamic schema generated from the CSV, independent per upload.
    • Out of scope: merging with existing data (which would involve flagging conflicting/unmergeable entries and human intervention). Hence:
    • Each upload creates a new subset of entries, all associated with the same "upload" entry, and we return a new ID of that upload.
    • This MVP has only one endpoint, hence #3 from above. All values are treated as text. Four tables (multi-dimensional data, flattened). Descriptions are Postgres-agnostic:
      • uploads: id (generated primary), uploaded (timestamp)
      • schema_field: id (generated primary), upload_id (foreign), field_name (text), field_max_length (numeric integer)
      • upload_row: id (generated primary), upload_id (foreign)
      • upload_field: id (generated primary), upload_row_id (foreign), schema_field_id (foreign), value (text). Optionally add (redundant) upload_id to simplify queries (if need be).
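To make the four-table layout concrete, the sketch below flattens one parsed upload into in-memory stand-ins for schema_field, upload_row and upload_field rows. It is an illustration under assumptions: the struct and function names are hypothetical, IDs are assigned sequentially here (the database would generate them), the `upload_id` is assumed to come from a row already inserted into uploads, and every row is assumed to have been validated to match the header length. `field_max_length` is counted in characters.

```rust
/// Stand-in for a schema_field row.
#[derive(Debug, PartialEq)]
struct SchemaField {
    id: usize,
    upload_id: usize,
    field_name: String,
    field_max_length: usize,
}

/// Stand-in for an upload_row row.
#[derive(Debug, PartialEq)]
struct UploadRow {
    id: usize,
    upload_id: usize,
}

/// Stand-in for an upload_field row. All values are kept as text.
#[derive(Debug, PartialEq)]
struct UploadField {
    id: usize,
    upload_row_id: usize,
    schema_field_id: usize,
    value: String,
}

struct Flattened {
    schema_fields: Vec<SchemaField>,
    upload_rows: Vec<UploadRow>,
    upload_fields: Vec<UploadField>,
}

/// Flatten one upload into per-table rows (hypothetical helper).
/// Assumes every row has exactly `header.len()` values.
fn flatten_upload(upload_id: usize, header: &[&str], rows: &[Vec<&str>]) -> Flattened {
    // One schema_field per CSV column, specific to this upload.
    let mut schema_fields: Vec<SchemaField> = header
        .iter()
        .enumerate()
        .map(|(i, name)| SchemaField {
            id: i + 1,
            upload_id,
            field_name: name.to_string(),
            field_max_length: 0,
        })
        .collect();
    let mut upload_rows = Vec::new();
    let mut upload_fields = Vec::new();
    for (r, row) in rows.iter().enumerate() {
        let row_id = r + 1;
        upload_rows.push(UploadRow { id: row_id, upload_id });
        for (c, value) in row.iter().enumerate() {
            let field = &mut schema_fields[c];
            // Track the longest value seen in this column.
            field.field_max_length = field.field_max_length.max(value.chars().count());
            let id = upload_fields.len() + 1;
            upload_fields.push(UploadField {
                id,
                upload_row_id: row_id,
                schema_field_id: field.id,
                value: value.to_string(),
            });
        }
    }
    Flattened { schema_fields, upload_rows, upload_fields }
}
```

Each CSV cell becomes one upload_field entry, so an upload with R rows and C columns yields C schema_field, R upload_row and R×C upload_field entries, all reachable from the single upload ID that is returned to the client.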

Contributors

peter-kehl
