
Begin Data


Begin Data is an easy-to-use, fast, and durable key/value and document store built on top of DynamoDB. Originally built for Begin serverless apps, Begin Data's core API has three simple methods: get, set, and destroy.

Concepts

Begin Data organizes itself into tables. A table contains documents, which are plain Objects. Documents stored in Begin Data always have the properties table and key.

Optionally a document can also have a ttl property with a UNIX epoch value representing the expiry time for the document.
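For example, a ttl one day in the future can be computed like this (a sketch; ttlFromNow is a hypothetical helper, not part of Begin Data):

```javascript
// Hypothetical helper: compute a UNIX epoch (in seconds) `seconds` from now,
// suitable for a document's ttl property
function ttlFromNow (seconds) {
  return Math.floor(Date.now() / 1000) + seconds
}

let ttl = ttlFromNow(60 * 60 * 24) // expires roughly one day from now
// e.g. await data.set({ table: 'sessions', key: 'abc123', ttl })
```

Note that DynamoDB TTLs are expressed in seconds, not milliseconds, which is why Date.now() is divided by 1000.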

Usage

Begin Data operates on a single DynamoDB table named data, with a partition key of scopeID, a sort key of dataID, and, optionally, a ttl attribute for expiring documents.

Example app.arc:

@app
myapp

@tables
data
  scopeID *String
  dataID **String
  ttl TTL

Or equivalent CloudFormation YAML:

AWSTemplateFormatVersion: "2010-09-09"
Resources:
    BeginData:
        Type: "AWS::DynamoDB::Table"
        Properties:
            TableName: "data"
            BillingMode: "PAY_PER_REQUEST"
            KeySchema:
              -
                AttributeName: "scopeID"
                KeyType: "HASH"
              -
                AttributeName: "dataID"
                KeyType: "RANGE"
            SSESpecification:
                Enabled: false
            TimeToLiveSpecification:
                AttributeName: "ttl"
                Enabled: true

Note: projects not based on Architect will need a BEGIN_DATA_TABLE_NAME environment variable. You can also use this env var to override the default and name the table anything you want, which allows multiple apps to share a single table.
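For example (the table name here is just a placeholder):

```shell
# Point Begin Data at a custom (possibly shared) DynamoDB table
export BEGIN_DATA_TABLE_NAME=my-shared-table
```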

API

let data = require('@begin/data')

The core API is three methods:

  • data.get(params[, callback])[Promise] for retrieving data
  • data.set(params[, callback])[Promise] for writing data
  • data.destroy(params[, callback])[Promise] for removing data

Additional helper methods are also made available:

  • data.incr(params[, callback])[Promise] increment an attribute on a document
  • data.decr(params[, callback])[Promise] decrement an attribute on a document
  • data.count(params[, callback])[Promise] get the number of documents for a given table

All methods accept a params object and, optionally, a Node-style errback. If no errback is supplied, a Promise is returned. All methods support async/await.
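The dual errback-or-Promise convention can be illustrated with a stand-in function (a sketch only; fetchValue is hypothetical and not part of Begin Data):

```javascript
// Illustrates the convention described above: if no callback is supplied,
// the caller gets a Promise back; otherwise the errback is invoked
function fetchValue (params, callback) {
  let work = Promise.resolve({ table: params.table, key: params.key })
  if (!callback) return work                       // Promise style
  work.then(doc => callback(null, doc), callback)  // errback style
}

// Promise / async-await style:
// let doc = await fetchValue({ table: 'tacos', key: 'baja' })

// Errback style:
// fetchValue({ table: 'tacos', key: 'baja' }, (err, doc) => { /* ... */ })
```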

Writes

Save a document in a table by key. Remember: table is required; key is optional.

let taco = await data.set({
  table: 'tacos',
  key: 'al-pastor'
})

All documents have a key. If no key is given, set will generate a unique key.

let token = await data.set({
  table: 'tokens',
})
// {table:'tokens', key:'LCJkYX9jYWwidW50RhSU'}

Batch save multiple documents at once by passing an Array of Objects.

let collection = await data.set([
  {table: 'ppl', name:'brian', email:'[email protected]'},
  {table: 'ppl', name:'sutr0', email:'[email protected]'},
  {table: 'tacos', key:'pollo'},
  {table: 'tacos', key:'carnitas'},
])

Reads

Read a document by key:

let yum = await data.get({
  table: 'tacos',
  key: 'baja'
})

Batch read by passing an Array of Objects. With these building blocks you can construct secondary indexes and joins, like one-to-many and many-to-many.

await data.get([
  {table:'tacos', key:'carnitas'},
  {table:'tacos', key:'al-pastor'},
])
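One way to sketch a one-to-many join on top of get and set is to embed the parent key in each child key (childKey and the table names here are hypothetical, not a Begin Data API):

```javascript
// Hypothetical key scheme: children of a parent share a key prefix
function childKey (parentKey, childId) {
  return `${parentKey}#${childId}`
}

// e.g. one order with two line items:
// await data.set([
//   { table: 'orders', key: 'order-1', items: [ '1', '2' ] },
//   { table: 'order-items', key: childKey('order-1', '1'), name: 'al-pastor' },
//   { table: 'order-items', key: childKey('order-1', '2'), name: 'carnitas' },
// ])
//
// ...then batch read the children listed on the parent:
// let items = await data.get(
//   order.items.map(id => ({ table: 'order-items', key: childKey(order.key, id) }))
// )
```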

Destroy

Delete a document by key.

await data.destroy({
  table: 'tacos',
  key: 'pollo'
})

Batch delete documents by passing an Array of Objects.

await data.destroy([
  {table:'tacos', key:'carnitas'},
  {table:'tacos', key:'al-pastor'},
])

Pagination

Large sets of data cannot be retrieved in a single call because the underlying get API paginates results. In this case, use the for await syntax with a limit set to iterate over pages of results.

let pages = data.page({ table:'ppl', limit:25 })
let count = 0  
for await (let page of pages) {
  console.log(page)
  count++
}

Additional Superpowers

  • Documents can be expired by setting ttl to a UNIX epoch in the future.
  • Atomic counters: data.incr and data.decr

See the tests for more examples!

Patterns

Coming soon! Detailed guides for various data persistence tasks:

  • Denormalizing
  • Pagination
  • Counters
  • Secondary indexes
  • One to many
  • Many to many

More


Issues

Add support for sort keys

I'd like to be able to specify a sort key so I can return (paginated) results of a table in order without running a scan and processing results. Not huge news, I know! Reference.

Improve error message if `data` table is not present

Currently, if the data table required by Begin Data is not present, you may get an unhelpful error such as:

/path/to/project/node_modules/@aws-lite/client/src/error.js:13
  let err = error instanceof Error ? error : Error()
                                             ^
ParameterNotFound: @aws-lite/client: SSM.GetParameter: unknown error
    at errorHandler (/path/to/project/node_modules/@aws-lite/client/src/error.js:13:46)
    at GetParameter (/path/to/project/node_modules/@aws-lite/client/src/client-factory.js:211:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  statusCode: 400,
  headers: {
    'content-type': 'application/json',
    date: 'Sat, 03 Feb 2024 20:51:57 GMT',
    connection: 'close',
    'content-length': '45'
  },
  __type: 'ParameterNotFound',
  code: 'ParameterNotFound',
  service: 'ssm',
  property: 'SSM',
  awsDoc: 'https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_GetParameter.html',
  readme: 'https://aws-lite.org/services/ssm#getparameter',
  time: '2024-02-03T20:51:57.603Z'
}

We should improve that output as it is not helpful.

See: https://github.com/beginner-corp/begin-data/blob/main/src/helpers/_get-table-name.js#L45

Deno version

Maybe this isn't even in this repo, but we need to track the issue anyhow. We're adding support for Deno to Begin, and we'll also want to support begin/data for our customers in that runtime. I was pleased to discover the ecosystem is a step ahead of us, though; there is a Dynamo client already! https://github.com/chiefbiiko/dynamodb/tree/master/

@architect/sandbox v4.xx breaks with @begin/2.0.1, and v3 is not released

I looked at the changelog for @begin/data, and it seems @architect/sandbox v4.xx requires @begin/data v3, but v3.0 is not released to npm yet, or tagged in git.

I am using @architect/sandbox v3.7.4 in the meantime, but it took me quite a while to figure out why my app was breaking after upgrading to @architect/sandbox v4.

Help with writing more than 25 documents to Begin Data

Begin Data allows you to either set a single document, or an array of documents. But the array is limited to 25 documents.

If you want to write, say, 532 documents, you will have to batch them into groups of 25. If instead you issue 532 individual set calls (a naive approach), the Lambda will likely time out. So batching for list sizes > X is effectively mandatory.

So, it would be lovely if Begin Data either:

  • Auto-batched calls to set where the array is > 25 documents, or
  • Provided a one-liner helper method to break a large array into an array of smaller arrays, each 25 documents or fewer
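A minimal version of the second option might look like this (chunk is a sketch, not part of Begin Data; 25 is DynamoDB's batch-write limit):

```javascript
// Split a large array into groups of at most `size` documents
function chunk (items, size = 25) {
  let groups = []
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size))
  }
  return groups
}

// e.g. write 532 documents in sequential batches of 25:
// for (let group of chunk(docs)) {
//   await data.set(group)
// }
```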

Feat req: UI for Data

First of all, very cool project! 🎉

I created the Todo CRUD example app and added some todos to it. Then I clicked the Data link in the sidebar, expecting to see a submenu listing the tables, where clicking a table would show a table of the data, but instead I got this:

[screenshot of the Data page]

Any plans to add a web UI for the Data section?

support for `begin` param on bulk get

Is there a way to support the begin param on a bulk get call?

This works for a single get (as per the integration tests):

let sept = await data.get({ table: 'hockey', begin: '2019-09' })

However, for a bulk get, a key is required. The following fails with the error Invalid params: all items must have table and key:

let sept = await data.get([
  { table: 'hockey', begin: '2019-09' },
  { table: 'curling', begin: '2019-09' }
])

Add support for secondary indexes

I'd like to be able to specify additional indexes on a single Begin Data table. My use case isn't super surprising, but looks something like this:

  • itemID (main index, good for a get of that item)
  • timestamp (see: sort key support issue)
  • eventType (ideal secondary index, query to get all items of an event type)

Raw scopeID and dataID are returned from incr/decr

For the following code:

await data.incr({ table: 't', key: 'k', prop: 'value' })

Actual return value:

{
  scopeID: 'begin-app',
  dataID: 'staging#t#k',
  value: 42
}

Expected return value:

{
-  scopeID: 'begin-app',
-  dataID: 'staging#t#k',
+  table: 't',
+  key: 'k',
   value: 42
}

keys with `-` hanging set

steps to repro

await data.set({table: 'mytable', key: 'my-key'})

expected: should save the key with dashes, or fail very loudly and clearly if that's not possible for some reason

Q: How would you model one-to-many using @begin/data and DynamoDB?

How would you model the following entities?
User --> has many Posts
Post --> each Post belongs to a single User
Image --> each Image belongs to either a User or a Post or an Organisation
Organisation --> each Organisation has many Users, each User can belong to many Organisations

Would you create each entity in a separate table, with a field pointing to the owner entity (à la foreign keys)?
Or would you store all of them in one big Organisation table containing nested JSON fields, with denormalised data duplicated across different documents? If so, how would you go about updating a User field that belongs to several Organisations?
