
dataflows's Introduction

DataFlows


DataFlows is a simple and intuitive way of building data processing flows.

  • It's built for small-to-medium data processing - data that fits on your hard drive, but is too big to load in Excel or as-is into Python, and not big enough to require spinning up a Hadoop cluster...
  • It's built upon the foundation of the Frictionless Data project - which means that all data produced by these flows is easily reusable by others.
  • It's a pattern, not a heavyweight framework: if you already have a bunch of download and extract scripts, this will be a natural fit.

Read more in the Features section below.

QuickStart

Install dataflows via pip install.

(If you are using a minimal UNIX OS, first run sudo apt install build-essential.)

Then use the command-line interface to bootstrap a basic processing script for any remote data file:

# Install from PyPi
$ pip install dataflows

# Inspect a remote CSV file
$ dataflows init https://raw.githubusercontent.com/datahq/dataflows/master/data/academy.csv
Writing processing code into academy_csv.py
Running academy_csv.py
academy:
#     Year           Ceremony  Award                                 Winner  Name                            Film
      (string)      (integer)  (string)                            (string)  (string)                        (string)
----  ----------  -----------  --------------------------------  ----------  ------------------------------  -------------------
1     1927/1928             1  Actor                                         Richard Barthelmess             The Noose
2     1927/1928             1  Actor                                      1  Emil Jannings                   The Last Command
3     1927/1928             1  Actress                                       Louise Dresser                  A Ship Comes In
4     1927/1928             1  Actress                                    1  Janet Gaynor                    7th Heaven
5     1927/1928             1  Actress                                       Gloria Swanson                  Sadie Thompson
6     1927/1928             1  Art Direction                                 Rochus Gliese                   Sunrise
7     1927/1928             1  Art Direction                              1  William Cameron Menzies         The Dove; Tempest
...

# dataflows creates a local package of the data and a reusable processing script which you can tinker with
$ tree
.
├── academy_csv
│   ├── academy.csv
│   └── datapackage.json
└── academy_csv.py

1 directory, 3 files

# Resulting 'Data Package' is super easy to use in Python
[adam] ~/code/budgetkey-apps/budgetkey-app-main-page/tmp (master=) $ python
Python 3.6.1 (default, Mar 27 2017, 00:25:54)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from datapackage import Package
>>> pkg = Package('academy_csv/datapackage.json')
>>> it = pkg.resources[0].iter(keyed=True)
>>> next(it)
{'Year': '1927/1928', 'Ceremony': 1, 'Award': 'Actor', 'Winner': None, 'Name': 'Richard Barthelmess', 'Film': 'The Noose'}
>>> next(it)
{'Year': '1927/1928', 'Ceremony': 1, 'Award': 'Actor', 'Winner': '1', 'Name': 'Emil Jannings', 'Film': 'The Last Command'}

# You can now run `academy_csv.py` to repeat the process
# and, of course, modify it to add data processing steps

Features

  • Trivial to get started and easy to scale up
  • Set up and run from command line in seconds ...
    • dataflows init => flow.py
    • python flow.py
  • Validate input (and especially the source) quickly (non-zero length, right structure, etc.)
  • Supports caching data from the source, and even between steps
    • so that we can run and test quickly (retrieving is slow)
  • Runs an immediate test so you can look at the output ...
    • Log, debug, rerun
  • Degrades to simple Python
  • Convention over configuration
  • Log exceptions and/or terminate
  • The input to each stage is a Data Package or Data Resource (not a previous task)
    • Data package based and compatible
  • Processors can be a function (or a class) processing row-by-row, resource-by-resource or a full package
  • A decent pre-existing contrib library of Readers (Collectors), Processors and Writers

Learn more

Dive into the Tutorial for a deeper look at everything that dataflows can do. Also review the list of Built-in Processors, which includes an API reference for each one.

dataflows's People

Contributors

akariv, anuveyatsu, cabral, colinmaudry, cschloer, gperonato, jornh, orihoch, pwalsh, roll, rufuspollock, sglavoie, shevron, starsinmypockets


dataflows's Issues

Split into sub packages for modularity

We need at least:

  • dataflows-cli
  • dataflows-core
  • dataflows-stdlib

Installing dataflows would install all of the sub-packages (and would import everything from all of them, for compatibility and usability). A possible layout is sketched below.
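
A minimal sketch of what the meta-package's setup.py could look like, assuming the sub-packages above end up on PyPI under those names (all names here are hypothetical):

from setuptools import setup

# setup.py for the hypothetical "dataflows" meta-package
setup(
    name='dataflows',
    version='0.1.0',
    description='Meta-package pulling in all dataflows sub-packages',
    install_requires=[
        'dataflows-core',    # Flow, DataStreamProcessor, base classes
        'dataflows-stdlib',  # built-in processors (load, join, dump_to_path, ...)
        'dataflows-cli',     # the `dataflows` command-line tool
    ],
)

The top-level dataflows package would then just re-export everything from the sub-packages (e.g. from dataflows_core import *) so existing imports keep working.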

Adding foreign keys

It could be just like the set_primary_key processor; however, adding foreign keys is probably less common. At the moment, the only option I can see is to use the update_resource processor and provide the entire schema of a resource. Is there a way to get the generated schema so that I could just add a new key to it (e.g. foreignKeys)? A possible workaround is sketched below.
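
One possible workaround (a sketch only, not a confirmed API): a package-level processor that reads the schema dataflows has already generated and appends a foreignKeys entry to it. The field and resource names below are hypothetical, and depending on the dataflows/datapackage version a commit() call may also be needed:

from dataflows import Flow, load, dump_to_path

def add_foreign_keys(package):
    # Append a foreignKeys entry to the generated schema of the first resource
    schema = package.pkg.descriptor['resources'][0]['schema']
    schema['foreignKeys'] = [{
        'fields': 'country_id',
        'reference': {'resource': 'countries', 'fields': 'id'},
    }]
    yield package.pkg   # pass on the modified package descriptor
    yield from package  # pass the resources through unchanged

Flow(
    load('data/datapackage.json'),  # hypothetical input
    add_foreign_keys,
    dump_to_path('out'),
).process()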

allow generators to pass a schema on the first row

I think this will be useful

from dataflows import Flow, printer

def generator():
    # The first yielded item carries the resource descriptor instead of data
    yield {"__dataflows_schema": True,
           "name": "my-resource",
           "path": "my-resource.csv",
           "schema": {"fields": [{"name": "i", "type": "string"}]}}
    for i in [1, 'two', 'three']:
        yield {"i": i}

Flow(generator(), printer()).process()

add stream and unstream processors

For more accurate streaming of data: dump_to_path removes some information (e.g. date timezones) and modifies the schema (e.g. the file format). This would also prepare for a possible integration with dpp processors.

child of #59

Unpivoting with regex

How would I unpivot the following table using a regex:

2000,2001,2002
a1,b1,c1,d1
a2,b2,c2,d2

I'd call unpivot like below:

unpivoting_fields = [
    {'name': r'\d{4}', 'keys': {'year': r'\d{4}'}}
]
extra_keys = [
    {'name': 'year', 'type': 'year'}
]
extra_value = {'name': 'value', 'type': 'string'}

unpivot(unpivoting_fields, extra_keys, extra_value)

but this results in:

year,value
\\d{4},a1
\\d{4},a2
\\d{4},b1
...

Am I missing something?

Once we figure it out, I will update the docs as it would be great to have an example for this one 😄

Can't name/rename resources

By default, resources are named res_1, res_2, etc., and the paths to the resources follow the same pattern: res_1.csv, res_2.csv, ...

As a dataflows user, I want to give resources a name of my choice, so that I can reuse them and find them by name, or simply have them look nice.

Acceptance Criteria

  • I can name a resource however I want
  • Paths to the files are appropriate

Analysis

I tried to create a custom processor that changes the name of the resource, but it does not really work.

Option one: modify resource object descriptor:

from dataflows import Flow, dump_to_path

def name_resource(package):
    package.pkg.resources[0].descriptor['name'] = 'countries'
    package.pkg.resources[0].descriptor['path'] = 'countries.csv'
    package.pkg.resources[0].commit()
    yield package.pkg
    yield from package

f = Flow(
      [{'hello': 'world'}],
      name_resource,
      dump_to_path('data'),
)

This kind of works, as the output file is named countries.csv, but nothing is changed inside datapackage.json:

$ cat data/nato_countries_official/datapackage.json 
{
  "name": "m-package",
  "resources": [
    {
      "name": "res_1",
      "path": "res_1.csv",
      "profile": "tabular-data-resource",
      "schema": {
        "fields": [
          {
            "format": "default",
            "name": "country_name",
            "type": "string"
          }
        ]
      }
    }
  ]
}

Option 2: modify pkg object descriptor:

def name_resource(package):
    package.pkg.descriptor['resources'][0]['name'] = 'countries'
    package.pkg.descriptor['resources'][0]['path'] = 'countries.csv'
    package.pkg.commit()
    yield package.pkg
    yield from package

f = Flow(
      [{'hello': 'world'}],
      name_resource,
      dump_to_path('data'),
)

This results in an error, as if the resource is gone entirely:

Traceback (most recent call last):
  File "flows/run_all.py", line 4, in <module>
    nato_countries_official.process()
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/flow.py", line 15, in process
    return self._chain().process()
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 66, in process
    for res in ds.res_iter:
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 57, in <genexpr>
    res_iter = (it if isinstance(it, ResourceWrapper) else ResourceWrapper(res, it)
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/processors/dumpers/dumper_base.py", line 80, in process_resources
    for resource in resources:
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 54, in <genexpr>
    res_iter = (ResourceWrapper(get_res(rw.res.name), rw.it)
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 57, in <genexpr>
    res_iter = (it if isinstance(it, ResourceWrapper) else ResourceWrapper(res, it)
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 31, in process_resources
    for res in resources:
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 55, in <genexpr>
    for rw in res_iter)
  File "/home/.virtualenvs/fedex/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 51, in get_res
    assert ret is not None
AssertionError
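
A possible workaround, assuming the built-in update_resource processor (mentioned in other issues) accepts resource properties such as name and path, is a sketch along these lines:

from dataflows import Flow, update_resource, dump_to_path

Flow(
    [{'hello': 'world'}],
    # Rename the auto-generated first resource; 'countries' is a hypothetical name
    update_resource('res_1', name='countries', path='countries.csv'),
    dump_to_path('data'),
).process()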

Suggested feature: Dataflows REPL

see #48 and #49

Example use-case of dataflows REPL for Kubernetes management

$ pip install dataflows-kubernetes
$ pip install dataflows-prometheus

$ dataflows --checkpoints

> kubernetes.get:deployments --namespace=demo5 
Retrieving deployments for namespace demo5
Saved checkpoint 1

> prometheus.get:full_deployment_stats --namespace=demo5 --since=2018-10-01
Loading checkpoint 1
Retrieving full pod stats since 2018-10-01 for pod AAA namespace demo5
Retrieving full pod stats since 2018-10-01 for pod BBB namespace demo5
Saved checkpoint 2

> --inline-row-step 'row["interesting"] = row["avg_cpu_load"] > 90 or row["uptime_percent"] < 90'
Loading checkpoint 2
Saved checkpoint 3

> filter_rows --equals={"interesting":true}
Loading checkpoint 3
Saved checkpoint 4

> printer
deployment_name | datetime | avg_cpu_load | uptime_percent
------------------------------------------------------------------------------------
foobar | 2018-10-15 03:44 | 98 | 5

[proposal] package-set level operations

Right now, dataflows and datapackage-pipelines can only perform operations within one package at a time, but I think adding another layer to the API for handling a stream of multiple packages would make sense. This would be useful for:

  • splitting complex packages for multiple consumers
  • streaming multiple, self contained packages (say, bundles of user-wise data)

Save data as json not working

Traceback (most recent call last):
  File "exercise.py", line 24, in <module>
    dump_to_path(out_path='data', format='json')
  ...
  File "/home/zelima/anaconda3/envs/dataflows/lib/python3.7/site-packages/dataflows/processors/dumpers/file_formats.py", line 30, in __init__
    self.headers = [f.name for f in schema.fields]
AttributeError: 'NoneType' object has no attribute 'fields'

  • Add tests for this scenario
  • Make sure they pass

Excel output processor

Do we have one already? Where do I check this sort of stuff?

As a Developer I want to output the Data Package as an Excel file, with each resource as a separate tab, so that I can share the Excel file with people who use Excel (see the sketch after the list below).

  • Targeting modern Excel (xlsx)
  • Bonus: output metadata in a separate sheet e.g. each resource with its metadata followed by 2 blank lines
  • Bonus: metadata per resource as extra rows at the top of a sheet (?)
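
A rough sketch of what such a processor could look like, assuming openpyxl is available (this is not an existing dataflows processor, just one way it could be written):

from openpyxl import Workbook

def dump_to_excel(path):
    def func(package):
        yield package.pkg
        wb = Workbook()
        wb.remove(wb.active)  # drop the default empty sheet

        def write_sheet(resource):
            # One sheet per resource, named after the resource (31-char Excel limit)
            ws = wb.create_sheet(title=resource.res.name[:31])
            header_written = False
            for row in resource:
                if not header_written:
                    ws.append(list(row.keys()))
                    header_written = True
                ws.append([row[k] for k in row.keys()])
                yield row  # pass rows through unchanged
            wb.save(path)  # (re)save after each resource finishes streaming

        for resource in package:
            yield write_sheet(resource)

    return func

A metadata sheet (the bonus item above) could be added the same way, by writing descriptor rows to an extra sheet before saving.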

0.0.51 breaks passing in named resources to Flow

I need some time to figure out what's up here and to provide you with better documentation (it's possible this is on my end; if so I will close this issue), but there seems to be an issue with passing a name for a resource to load since 0.0.51. Instead of only using the passed-in name, it creates two resources:

  1. One using the passed-in name, containing the correct headers, the correct types, and no rows.
  2. Another using the default empty name (aka the file name), containing the correct headers, the correct rows and no types.

I'll come back to this ASAP to give you more details; for now I'm using v0.0.50.

dataflow vs dataflow*s*

I don't think it's a problem if we have to use the plural for PyPI, but I think we should keep the singular for the command line (and possibly the package import), because it is simpler and makes more sense (you are building a flow, not flows).

head / tail processors

from collections import deque

def head(num_rows=10):

    def step(rows):
        # Yield only the first num_rows rows
        for rownum, row in enumerate(rows):
            if rownum >= num_rows:
                break
            yield row

    return step


def tail(num_rows=10):

    def step(rows):
        # A deque with maxlen keeps only the last num_rows rows
        for row in deque(rows, maxlen=num_rows):
            yield row

    return step
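
Usage might look something like this (a sketch; the input file is hypothetical):

from dataflows import Flow, load, printer

Flow(
    load('data/academy.csv'),  # hypothetical input
    head(5),                   # keep only the first 5 rows
    printer(),
).process()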

concurrency?

Can we please have concurrency? dataflows works great for small datasets, but it's rather time-consuming when it comes to slightly larger datasets.

ImportError: No module named dataflows

Fresh install on macOS 10.13.6 High Sierra. I'm new to Python, following the tutorial at http://okfnlabs.org/blog/2018/08/29/data-factory-data-flows-introduction.html

09:08:25 jonathan:~/Documents/projects/datafactory (master) $ dataflows --help

Usage: dataflows [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  init  Bootstrap a processing pipeline script.

09:08:29 jonathan:~/Documents/projects/datafactory (master) $ dataflows init https://rawgit.com/datahq/demo/_/first.csv

Writing processing code into first_csv.py
Running first_csv.py
Processing failed, here's the error:
python: VERSIONER_PYTHON_VERSION environment variable error (ignored)
Traceback (most recent call last):
  File "first_csv.py", line 1, in <module>
    from dataflows import Flow, load, dump_to_path, dump_to_zip, printer, add_metadata
ImportError: No module named dataflows

09:16:13 jonathan:~/Documents/CatalystIT/projects/datafactory (master) $ dataflows init

Hi There!
    DataFlows will now bootstrap a data processing flow based on your needs.

    Press any key to start...
    
[?] What is the source of your data?: File
 ❯ File
   Remote URL
   SQL Database
   Other

At first I thought maybe the rawgit.com URL is bad (it doesn't serve data now), but
dataflows init https://raw.githubusercontent.com/datahq/demo/master/first.csv
also generates the ImportError: No module named dataflows message.

Why not combine the data resources and the datapackage.json into one JSON file?

I've been using dataflows to process data and dump it to S3, using a custom dumper I wrote based on the datapackage-pipelines-aws package. Everything works pretty well; however, when it comes to version control I've encountered issues. Because the data file (usually a CSV) and the datapackage.json are dumped separately, it is difficult to compare existing versions (e.g. using an md5 checksum): I might end up creating a new version of the datapackage.json but not of the CSV. With the current structure, it's hard to tell whether creating a new datapackage.json should also cache a new CSV.

I was wondering if it would be beneficial to dump the data resources together with datapackage.json in one big JSON file?
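
One way to approximate this today (a sketch, not an existing processor): a custom package-level step that collects the rows and writes them, together with the package descriptor, into a single JSON file. Everything here is illustrative:

import json

def dump_to_single_json(path):
    def func(package):
        yield package.pkg
        descriptor = package.pkg.descriptor
        collected = {}

        def collect(resource):
            rows = collected.setdefault(resource.res.name, [])
            for row in resource:
                rows.append(row)
                yield row
            # Rewrite the file after each resource finishes streaming
            with open(path, 'w') as f:
                json.dump({'datapackage': descriptor, 'data': collected},
                          f, default=str)  # default=str handles dates etc.

        for resource in package:
            yield collect(resource)

    return func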

How would I add minimum/maximum constraints?

E.g., I have a table with time series and I wish to have constraints.minimum and constraints.maximum in the schema so that I have aggregated information about the table.

Is there a standard approach using existing processors or should I go for a custom processor?
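
In the absence of a dedicated processor, a custom package-level step could compute the minimum and maximum and write them into the schema. A rough sketch (the field name is a parameter you supply; whether the late schema change propagates to downstream dumpers may depend on evaluation order):

def add_min_max_constraints(field_name):
    def func(package):
        # Find the field's schema entry so constraints can be attached later
        fields = package.pkg.descriptor['resources'][0]['schema']['fields']
        target = next(f for f in fields if f['name'] == field_name)
        yield package.pkg

        def scan(resource):
            lo, hi = None, None
            for row in resource:
                value = row[field_name]
                if value is not None:
                    lo = value if lo is None else min(lo, value)
                    hi = value if hi is None else max(hi, value)
                yield row
            target['constraints'] = {'minimum': lo, 'maximum': hi}

        for resource in package:
            yield scan(resource)

    return func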

Write up a concrete use case or two

Give an example of a project you did where you used (or now would use) DataFlows and what it replaced.

This makes the idea tangible and makes the benefit clear.

Support arguments to python flow.py

The following commands should work - i.e. you can pass command line options to flow.py

python flow.py help
python flow.py --debug
python flow.py --start-at=...

This implies we need a special cli runner by default in templates that parses cli arguments ...

if __name__ == '__main__':
    dataflows.runcli(...)
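
Until such a runner exists, here is a minimal sketch of what the generated template could contain, using plain argparse (dataflows.runcli above is hypothetical, and so is the source file here):

import argparse
from dataflows import Flow, load, printer

def build_flow(args):
    # Assemble the flow; in a real template this would be the generated pipeline
    return Flow(load('data/academy.csv'), printer())

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Run this data flow')
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('--start-at', default=None)
    args = parser.parse_args()
    build_flow(args).process()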

suggested features: dataflows CLI + auto-numbered checkpoints

Example shell session using suggested features for dataflows CLI and checkpoints:

$ dataflows load "/foo/bar/datapackage.json" | dataflows ./my-flow.py:my_step "arg_a" "arg_b" | dataflows printer
FOO | BAR
-------|-------
aaa  | ccc
^^^^^^^^^^^

$ dataflows ./my-flow.py:my_other_step | dataflows checkpoint
Saved checkpoint 1

$ dataflows checkpoint 1 | dataflows join --source_name=foo --source_key='["my_id"]' --source_delete=false --target_name=bar --target_key='["my_id"] --fields='{"baz": {}}' | dataflows checkpoint
Loading from checkpoint 1
Saving to checkpoint 2

$ dataflows checkpoint last | dataflows printer
Loading from checkpoint 2
FOO | BAR
-------|-------
aaa  | ccc
^^^^^^^^^^^

Could be used to support integration with singer (#16)

$ dataflows singer exchangerates --coin=BTC | dataflows printer
Date | Coin | Rate
------------------------
2017 | BTC | 5000$
2018 | BTC | 20000$
2019 | BTC | 5$

Join takes too long time (or hangs) to process the data

I'm trying to solve this exercise https://github.com/ViderumGlobal/programming-exercise but join takes so long to process the data that I thought it had just hung and could not finish the task. I don't see any while loops in join.py, so I doubt I'm stuck in an infinite loop, which makes me think it's just slow.

I simplified the code:

from dataflows import Flow, load, join, printer, filter_rows

def filter_over_10(rows):
    for row in rows:
        if row.get('order') is not None and row.get('order') > 10:
            continue
        yield row

res = Flow(
        load('data/movies/datapackage.json'),
        load('data/credits/datapackage.json'),
        filter_over_10,
        filter_rows(not_equals=[{'revenue': 0}], resources=['tmdb_5000_movies']),
        filter_rows(not_equals=[{'gender': 0}], resources=['tmdb_5000_credits']),
        join('tmdb_5000_movies', ['id'], 'tmdb_5000_credits', ['id'], fields={'revenue':{}}, full=False),
        printer(),
).results()

  • movies is ~4,000 rows
  • credits is ~40,000 rows after the filter

Comparison with Meltano, Mara, Airflow and other ETL tools

As a potential user of dataflows, I want to understand how it compares to other tools, so that I understand what use cases it was designed for and why (or why not) I should use it (and also deepen my respect for its creators, because I'd know they know their stuff).

As an example of this done very well see VuePress https://vuepress.vuejs.org/guide/#why-not (short) and VueJS https://vuejs.org/v2/guide/comparison.html (long)

Tasks

Add filter rows with callable

Filter rows using a callable, via Python's built-in filter.
This is very useful and can be reused. It might make sense to merge with dataflows.filter_rows.
h/t @shevron

def filter_rows_callable(cb):
    def f(rows):
        yield from filter(cb, rows)
    return f
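
Usage could look like this (a sketch; the input file, lambda and field name are just for illustration):

from dataflows import Flow, load, printer

Flow(
    load('data/academy.csv'),  # hypothetical input
    filter_rows_callable(lambda row: row['Winner'] is not None),
    printer(),
).process()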

[cli] Misc holding page for ideas from Rufus

Options for dataflow init

  • --interactive - default = false
  • --package - produce a full data package layout. default=false

Options for run

  • run looks for flow.py in current directory or in script (flow)/flow.py
  • --step

Normalize resource

Add ability to normalize a resource - i.e.

  • extract a few columns out of a resource into a deduplicated 'lookup table' resource
  • add proper cross indexes and pointers (i.e. replace the extracted columns with the index of the corresponding row in the lookup table)
  • create foreign key relationships between the two resources

when using dump_to_path, it gives blank lines between each row

Another problem occurred when using dump_to_path: it produces blank lines between each row.
Possible solution: https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row

The blank lines between the rows were fixed with the following change:

file: https://github.com/datahq/dataflows/blob/master/dataflows/processors/dumpers/file_dumper.py
line: 92
add newline='' as argument in tempfile.NamedTemporaryFile
from this: temp_file = tempfile.NamedTemporaryFile(mode="w+", delete=False)
to this: temp_file = tempfile.NamedTemporaryFile(mode="w+", delete=False, newline='')
This fix should be checked on Linux and Mac to see how it behaves.

Originally posted by @svetozarstojkovic in #57 (comment)

Add computed field with callable

Add a computed field whose value is computed using a Python callable.
This is very useful and can be reused. It might make sense to merge with dataflows.add_computed_field.
h/t @shevron

def add_computed_field_callable(name, type, callback, **options):
    def func(package):
        # Alter the schema to add a field
        for resource in package.pkg.descriptor['resources']:
            resource['schema']['fields'].append(dict(name=name, type=type, **options))
        yield package.pkg
        
        def value_setter(rows):
            for row in rows:
                row[name] = callback(row)
                yield row

        for resource in package:
            yield value_setter(resource)

    return func
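
Usage could look like this (a sketch; the input file, field name and callback are illustrative):

from dataflows import Flow, load, printer

Flow(
    load('data/academy.csv'),  # hypothetical input
    add_computed_field_callable('is_winner', 'boolean',
                                lambda row: row['Winner'] is not None),
    printer(),
).process()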

Runtime execution contract documentation

From what I can tell, when a flow is processed, each row goes through the entire pipeline before the next is processed (for the most part). Also as discussed in gitter, resources are only available after the package.pkg has been yielded (in a package level processor).

Suggested feature: Dataflows DSL

$ dataflows -c '
load "/foo/bar/datapackage.json"
./my-flow.py:my_step "arg_a" "arg_b"
printer
'
FOO | BAR
-------|-------
aaa  | ccc
^^^^^^^^^^^

Create file: my-flow.dataflow

#!/usr/bin/env dataflows

my_module.steps:my_step "${1}" "${2}" '{"baz":"bax"}'
checkpoint

Run it:

$ chmod +x my-flow.dataflow
$ ./my-flow.dataflow "PARAM_1" "PARAM_2"

Saving checkpoint 1

$ dataflows -c '
checkpoint 1
join --source_name=foo --source_key=["my_id"] \
       --source_delete=false --target_name=bar --target_key=["my_id"] \
       --fields={"baz": {}}
checkpoint
'
Loading from checkpoint 1
Saving to checkpoint 2

Related: #48

Demos (in a demos directory)

  • Load a local csv
  • Load a local xls
  • Load google analytics
  • Load a remote CSV
  • Load a wikipedia page and scrape a table

Using inline data as source raises AttributeError

Trying to load inline data as described in the tabulator docs - https://github.com/frictionlessdata/tabulator-py#inline-read-only

But it raises the following error. It looks like it's expecting a string, not a list, as the source. I thought there would be a way to indicate that the source is inline data, and tried specifying format='inline', but it didn't help:

Traceback (most recent call last):
  File "date.py", line 32, in <module>
    Calendar_Date_Dimension()
  File "date.py", line 28, in Calendar_Date_Dimension
    flow.process()
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/base/flow.py", line 15, in process
    return self._chain().process()
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 83, in process
    ds = self._process()
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 72, in _process
    datastream = self.source._process()
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 72, in _process
    datastream = self.source._process()
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/base/datastream_processor.py", line 75, in _process
    self.datapackage = self.process_datapackage(self.datapackage)
  File "/Users/anuarustayev/Desktop/repos/sandbox-cubes/cubes/lib/python3.6/site-packages/dataflows/processors/load.py", line 88, in process_datapackage
    if self.load_source.startswith('env://'):
AttributeError: 'list' object has no attribute 'startswith'
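
As a workaround, inline data can be passed to Flow directly as a list of dicts instead of going through load (the same pattern as the [{'hello': 'world'}] example elsewhere on this page; the data here is illustrative):

from dataflows import Flow, printer

data = [
    {'date': '2018-01-01', 'value': 1},
    {'date': '2018-01-02', 'value': 2},
]

Flow(
    data,       # a list of dicts works as a source step
    printer(),
).process()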
