
iter's Introduction

Package iter - Streaming Iterators for Go

Overview

Package iter provides simple interfaces and facilities for streaming data through iterators in a functional style. This is useful for an application that reads data from a database, processes it and immediately streams the results to a client, without waiting for all data to be processed first.

This package makes it easy to instantiate streaming iterators. It takes care of the brittle details of asynchronous, concurrent streaming; you only need to provide a mapper function that is applied to the stream.

Features

  • provides a simple Iterator interface
  • takes care of all streaming details behind the scenes
  • you just need to provide a Mapper function for processing the streamed data
  • configurable number of worker goroutines
  • configurable channel buffer size
  • supports continuing on error
  • supports streaming from the 3 most common sources directly:
    • Generators, Iterators and Channels
  • Iterators can be chained
  • easy to extend to specific types

Problem Statement

Implementing the iterator pattern in Go is straightforward. Thanks to channels and goroutines, it is also very easy to stream values asynchronously: just spin up a sender and a receiver goroutine and let them communicate over a channel. If you also want to report errors, you add another channel. If you want to cancel the sender when the receiver is done, things start to become complex. Once you have multiple senders and receivers running concurrently, maybe using buffered channels, you will inevitably run into issues and surprising effects, like

  • race conditions / deadlocks
  • flaky tests / Heisenbugs that are horrible to debug
  • unpredictable ordering
  • valid results on subsequent calls after an error
  • getting an EOF before all goroutines are done
  • ...

I hit these issues many times. If you want to stream values through several layers of an application, you end up re-implementing all this functionality (and the mistakes) again and again.

Usage

Package iter is built on three concepts:

  • a Mapper func is applied to an input item and creates an output item
  • a Stream is a function that is applied to an input source, sets up goroutines that apply the Mapper func to the input items, and returns an Iterator for iterating over the results.
  • an Iterator can be used to iterate over the result items of a stream and to cancel the background processes when done.

There are three types of directly supported input sources to stream from: Generator funcs, Iterators and Channels.

Stream from a Generator Func

stream := iter.NewGeneratorStream(context.Background(), mapperFunc)
iterator := stream(generatorFunc)
defer iterator.Close()
for {
    item, err := iterator.Next()
    ...
}
 

Stream from an Iterator

stream := iter.NewStream(context.Background(), mapperFunc)
iterator := stream(inputIter)
defer iterator.Close()
for {
    item, err := iterator.Next()
    ...
}
 

Stream from Channels

stream := iter.NewChannelStream(context.Background(), mapperFunc)
iterator := stream(inputChan, errChan)
defer iterator.Close()
for {
    item, err := iterator.Next()
    ...
}
 

Stream Options

// supported options with their defaults:
bufSize := iter.BufSizeOpt(0)
workers := iter.WorkersOpt(1)
contOnErr := iter.ContOnErrOpt(false)

stream := iter.NewStream(context.Background(), mapperFunc, bufSize, workers, contOnErr)
...
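Options passed as trailing arguments like this are commonly implemented with the functional options pattern. The sketch below shows that pattern in isolation. The config struct, the Option type and the constructor names are hypothetical and only mirror the defaults listed above; how package iter implements its options internally is not shown in this document.

```go
package main

import "fmt"

// config holds the hypothetical settings of a stream.
type config struct {
	bufSize   int
	workers   int
	contOnErr bool
}

// Option mutates a config (functional options pattern).
type Option func(*config)

func BufSize(n int) Option    { return func(c *config) { c.bufSize = n } }
func Workers(n int) Option    { return func(c *config) { c.workers = n } }
func ContOnErr(b bool) Option { return func(c *config) { c.contOnErr = b } }

// newConfig starts from the defaults (no buffer, one worker, stop on error)
// and applies the given options in order.
func newConfig(opts ...Option) config {
	c := config{bufSize: 0, workers: 1, contOnErr: false}
	for _, o := range opts {
		o(&c)
	}
	if c.workers < 1 {
		panic("at least one worker is required")
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", newConfig())                        // {bufSize:0 workers:1 contOnErr:false}
	fmt.Printf("%+v\n", newConfig(Workers(4), BufSize(16))) // {bufSize:16 workers:4 contOnErr:false}
}
```

The pattern keeps the constructor signature stable: callers that are happy with the defaults pass nothing, and new options can be added without breaking existing calls.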

Important Properties

  • setting fewer than 1 worker will cause a panic
  • after iter.Close(), iter.Next() may still return results with more than 1 worker or when using buffers
  • if an iter.Next() call returns an error, subsequent calls may still return valid results from other workers or a buffered channel
  • choosing more than 1 worker will make the order of results unpredictable
  • by default, a Stream will eventually stop streaming after a Mapper returned an error
    • use ContOnErrOpt(true) to change this behavior
  • make sure that generators and mappers are thread-safe if you want to use more than one worker
    • the iterator returned by New is thread-safe
  • adding many buffers in a chain of streams will lead to pre-fetching of many items, which may be discarded when a downstream consumer cancels the stream

iter's People

Contributors

hphilipps


iter's Issues

Support filtering

The current Stream implementation maps input items one by one to output items through a generic Mapper func. We should extend Streams to also support filtering (returning only matching items).
