
flowage's Introduction

Flowage



Motivation

This package simplifies transformation and filtering of Node.js object streams. Think of it as Underscore.js for streams.

The basic use case I faced many times was transforming a large number of JSON objects that are finally stored in some database. The transformation is the quick part, but then you have to chunk the data into sizes your database allows, to limit the number of queries, and control the flow of the whole stream based on how fast you are able to save the transformed data.

Basic usage

const { Readable } = require('stream');
const Flowage = require('flowage');

// Let's have some stream that will output a series of objects { n: 0 }, { n: 1 }, { n: 2 }, { n: 3 }, ...
const readable = new Readable({ objectMode: true, read() {} });
let n = 0;
setInterval(() => readable.push({ n: n++ }), 1000);

// Pipe it through Flowage() to get a stream extended with helper methods.
const flowage = readable.pipe(new Flowage());

// Split the stream into a stream of odd objects and a stream of even objects, and extend them with a field is='odd' or is='even'.
const oddStream = flowage
    .filter(obj => obj.n % 2)
    .map(obj => Object.assign({}, obj, { is: 'odd' }));

const evenStream = flowage
    .filter(obj => obj.n % 2 === 0)
    .map(obj => Object.assign({}, obj, { is: 'even' }));

// Then merge them back.
const mergedStream = oddStream.merge(evenStream);

// Chunk them by 100 records.
const chunkedStream = mergedStream.chunk(100);

// Save them to MongoDB in batches of 100 items with concurrency 2.
// This also corks the stream whenever the maximum concurrency is reached.
chunkedStream.onSeries(async (arrayOf100Items) => {
    await database.collection('test').insert(arrayOf100Items);
}, { concurrency: 2 });
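
For quick local experiments without a database, the same kind of pipeline can be run against a finite source. A minimal sketch, assuming the methods behave as documented in the reference below (the finite readable and the console.log sink are just illustrative):

const { Readable } = require('stream');
const Flowage = require('flowage');

// A finite readable that emits { n: 0 } ... { n: 9 } and then ends.
const source = new Readable({ objectMode: true, read() {} });
for (let i = 0; i < 10; i++) source.push({ n: i });
source.push(null);

// Keep the even numbers, double them, group them into pairs and collect the result.
source
    .pipe(new Flowage())
    .filter(obj => obj.n % 2 === 0)
    .map(obj => ({ n: obj.n * 2 }))
    .chunk(2)
    .collect()
    .then(chunks => console.log(chunks));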

Reference

merge stream1.merge(stream2)

Returns a stream containing values merged from the two given streams. The merged stream ends when both source streams have ended.

const mergedStream = stream1.merge(stream2);

collect stream.collect()

Returns a Promise that resolves to an array of all the values once the stream ends.

const data = await stream.collect();

filter stream.filter(function)

Returns a stream containing only the items for which the given function returns a truthy value.

// Keep only the even-indexed items from the stream.
const filteredStream = stream.filter(val => val.index % 2 === 0);

chunk stream.chunk(length)

Returns a stream where each item is an array of the given number of items from the original stream.

// Chunk values into arrays of 10 items.
const chunkedStream = stream.chunk(10);
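
Because collect() resolves to an array of all the items, collecting a chunked stream should give an array of arrays. A small sketch of the expected shape (the finite source is illustrative; run the await inside an async function):

const { Readable } = require('stream');
const Flowage = require('flowage');

const source = new Readable({ objectMode: true, read() {} });
[1, 2, 3, 4, 5, 6].forEach(n => source.push({ n }));
source.push(null);

// Expected shape: [[{ n: 1 }, { n: 2 }], [{ n: 3 }, { n: 4 }], [{ n: 5 }, { n: 6 }]]
const chunks = await source.pipe(new Flowage()).chunk(2).collect();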

map stream.map(function)

Returns a stream where the original items are transformed using the given function.

// Extend each object in the stream with `.foo = 'bar'` field.
const mappedStream = stream.map(val => Object.assign({}, val, { foo: 'bar' }));

omit stream.omit(field1, field2, ...)

Returns a stream where the given fields are omitted from each item.

// Omit field1 and field2 from stream objects.
const resultingStream = stream.omit('field1', 'field2');

pick stream.pick(field1, field2, ...)

Returns a stream where each item contains only the given fields.

// Pick only field1 and field2 from stream objects.
const resultingStream = stream.pick('field1', 'field2');

pluck stream.pluck(field)

Returns a stream with the given field plucked from each item.

// Pluck field1 from each stream object.
const resultingStream = stream.pluck('field1');

uniq stream.uniq(field)

Returns a stream containing only unique items, based on the given field. You need enough memory to keep a set of all the unique field values, hashed using SHA-256.

// Filter unique items based on id field.
const uniquesStream = stream.uniq('id');
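
A short sketch of the expected behavior (the finite source is illustrative, and it assumes the first occurrence of each duplicate is the one that passes through; run the await inside an async function):

const { Readable } = require('stream');
const Flowage = require('flowage');

const source = new Readable({ objectMode: true, read() {} });
[{ id: 'a', v: 1 }, { id: 'a', v: 2 }, { id: 'b', v: 3 }].forEach(item => source.push(item));
source.push(null);

// Expected: [{ id: 'a', v: 1 }, { id: 'b', v: 3 }]
const unique = await source.pipe(new Flowage()).uniq('id').collect();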

weakSort stream.weakSort(sortFunction, [bufferMinSize=75], [bufferMaxSize=100])

Returns a stream containing values sorted using the given function and a floating buffer of the given size.

This method is helpful when only a few neighboring items may be in the wrong order. This may happen, for example, when a client pushes data into storage via an API with concurrency higher than 1 and the requests reach the server in the wrong order, or when the API has multiple redundant instances that process incoming requests at different speeds.

This method uses a buffer for the streamed items. Every time the buffer reaches bufferMaxSize items, it gets sorted and bufferMaxSize - bufferMinSize items are output to the stream.

const sortFunction = (a, b) => a.index < b.index ? -1 : 1;
const sortedStream = stream.weakSort(sortFunction, 75, 100);
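
A small end-to-end sketch with a slightly out-of-order source. The buffer sizes are deliberately tiny for illustration, and it assumes that whatever remains in the buffer is sorted and flushed when the stream ends (run the await inside an async function):

const { Readable } = require('stream');
const Flowage = require('flowage');

// Items arrive slightly out of order, e.g. from concurrent API requests.
const source = new Readable({ objectMode: true, read() {} });
[{ index: 1 }, { index: 0 }, { index: 2 }, { index: 4 }, { index: 3 }].forEach(item => source.push(item));
source.push(null);

// Expected: [{ index: 0 }, { index: 1 }, { index: 2 }, { index: 3 }, { index: 4 }]
const sorted = await source
    .pipe(new Flowage())
    .weakSort((a, b) => a.index < b.index ? -1 : 1, 2, 4)
    .collect();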

onSeries stream.onSeries(async function, [concurrency=1])

Returns a promise that resolves when the given function has finished for the last item of the stream.

Every time the given concurrency is reached, the stream gets paused.

// Store items in MongoDB with concurrency 10.
await stream.onSeries(async (item) => {
    await database.collection('items').insert(item);
}, 10);
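
A self-contained sketch without a database, using a hypothetical slowSave() helper to stand in for the asynchronous write (run inside an async function):

const { Readable } = require('stream');
const Flowage = require('flowage');

// Hypothetical stand-in for a slow asynchronous write such as a database insert.
const slowSave = item => new Promise(resolve => setTimeout(() => {
    console.log('saved', item);
    resolve();
}, 100));

const source = new Readable({ objectMode: true, read() {} });
for (let i = 0; i < 20; i++) source.push({ n: i });
source.push(null);

// At most 10 slowSave() calls run at a time; the stream is paused while the limit is reached.
await source.pipe(new Flowage()).onSeries(item => slowSave(item), 10);
console.log('All items processed.');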

flowage's People

Contributors

mtrunkat


Forkers

showroom101

flowage's Issues

Add Unit Examples?

I'm trying to compare this package to mississippi.

Do you think it makes sense to add some unit examples to your code base showing what you think a normal usage pattern is, so that I can decide whether this package makes sense for my use case or whether to keep using the ol' mississippi?

Ref
1: npm.im/mississippi
