Exporters project documentation

Exporters provide a flexible way to export data from multiple sources to multiple destinations, allowing filtering and transforming the data.

This GitHub repository serves as the project's central repository.

Full documentation is available at http://exporters.readthedocs.io/en/latest/

Getting Started

Install exporters

First, we recommend creating a virtualenv:

virtualenv exporters
source exporters/bin/activate

Then install the package:

pip install exporters

Creating a configuration

Next, create a configuration object and store it in a file called config.json.
This configuration reads from an S3 bucket and writes to the local filesystem, exporting only the records whose country field contains United States:
{
     "reader": {
         "name": "exporters.readers.s3_reader.S3Reader",
         "options": {
             "bucket": "YOUR_BUCKET",
             "aws_access_key_id": "YOUR_ACCESS_KEY",
             "aws_secret_access_key": "YOUR_SECRET_KEY",
             "prefix": "exporters-tutorial/sample-dataset"
         }
     },
     "filter": {
         "name": "exporters.filters.key_value_regex_filter.KeyValueRegexFilter",
         "options": {
             "keys": [
                 {"name": "country", "value": "United States"}
             ]
         }
     },
     "writer":{
         "name": "exporters.writers.fs_writer.FSWriter",
         "options": {
             "filebase": "/tmp/output/"
         }
     }
}
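
To make the filter section of the configuration more concrete, here is a minimal sketch of what a key/value regex filter conceptually does. It is not the exporters implementation: the function name matches_keys is hypothetical, and the choices to use re.search and to require every configured key to match are assumptions made for illustration.

```python
import re


def matches_keys(record, keys):
    """Return True if every configured key is present and matches its regex.

    Hypothetical helper: assumes all keys must match (logical AND) and that
    the pattern may match anywhere in the value (re.search).
    """
    for key in keys:
        value = record.get(key["name"])
        if value is None or not re.search(key["value"], str(value)):
            return False
    return True


# Same "keys" entry as in the filter options of config.json.
keys = [{"name": "country", "value": "United States"}]

# Made-up sample records, for illustration only.
records = [
    {"name": "Alice", "country": "United States"},
    {"name": "Bob", "country": "Canada"},
    {"name": "Carol"},  # no country field at all
]

kept = [record for record in records if matches_keys(record, keys)]
print(kept)
```

Under these assumptions only the first record is kept; the second fails the pattern and the third has no country field.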

Export with script

We can use the provided script to run this export:

python bin/export.py --config config.json

Use it as a library

The export can be run using exporters as a library:

from exporters import BasicExporter

exporter = BasicExporter.from_file_configuration('config.json')
exporter.export()
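
If you prefer not to keep credentials in a hand-edited file, the configuration can also be generated from Python before running the export. The sketch below is one possible approach, not part of exporters: build_config, write_config, and run_export are hypothetical helpers, and reading the credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables is an assumption. Only BasicExporter.from_file_configuration and export() come from the example above.

```python
import json
import os
import tempfile


def build_config(bucket, prefix, output_dir):
    """Build the same reader/filter/writer configuration shown in config.json."""
    return {
        "reader": {
            "name": "exporters.readers.s3_reader.S3Reader",
            "options": {
                "bucket": bucket,
                # Assumption: credentials are taken from the environment,
                # falling back to the placeholders used in this README.
                "aws_access_key_id": os.environ.get(
                    "AWS_ACCESS_KEY_ID", "YOUR_ACCESS_KEY"),
                "aws_secret_access_key": os.environ.get(
                    "AWS_SECRET_ACCESS_KEY", "YOUR_SECRET_KEY"),
                "prefix": prefix,
            },
        },
        "filter": {
            "name": "exporters.filters.key_value_regex_filter.KeyValueRegexFilter",
            "options": {
                "keys": [{"name": "country", "value": "United States"}],
            },
        },
        "writer": {
            "name": "exporters.writers.fs_writer.FSWriter",
            "options": {"filebase": output_dir},
        },
    }


def write_config(config, path):
    """Serialize the configuration dictionary to a JSON file and return its path."""
    with open(path, "w") as handle:
        json.dump(config, handle, indent=4)
    return path


def run_export(config_path):
    """Run the export using the library calls shown above."""
    # Imported lazily so the helpers above work without exporters installed.
    from exporters import BasicExporter

    exporter = BasicExporter.from_file_configuration(config_path)
    exporter.export()


config = build_config(
    "YOUR_BUCKET", "exporters-tutorial/sample-dataset", "/tmp/output/")
config_path = write_config(
    config, os.path.join(tempfile.mkdtemp(), "config.json"))
# run_export(config_path)  # uncomment once exporters is installed
```

The generated file has the same structure as the config.json above, so it can also be passed to the export script with --config.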

Resuming an export job

Suppose a previous export job failed and left a pickle file behind. To resume that job, run the export script with the --resume option:

python bin/export.py --resume pickle://pickle-file.pickle
