data-processing
Type: Organization
A hot bloop for your productivity
Cassovary is a simple big graph processing library for the JVM
Cloudera Development Kit
Open Source Web Crawler for Java
Crunch is an Apache TLP now, and lives at http://crunch.apache.org/
Task scheduling and blocked algorithms for parallel processing
DataPipeline for humans.
Disque is a distributed message broker
Python clone of Spark, a MapReduce-like framework in Python
Real²time Exploratory Analytics on Large Datasets
This repository holds the Amazon Elastic MapReduce sample bootstrap actions
Exelixi is a distributed framework based on Apache Mesos, mostly implemented in Python using gevent for high-performance concurrency. It is intended to run cluster computing jobs (partitioned batch jobs, which include some messaging) in pure Python. By default, it runs genetic algorithms at scale.
Python Stream Processing
Data analysis and reporting tool for quick access to custom charts and tables in Jupyter Notebooks and in the shell.
Mirror of Apache Flink
Web crawling framework based on asyncio for everyone.
A Python module to scrape several search engines (such as Google, Yandex, Bing, DuckDuckGo, Baidu and others) using proxies (SOCKS4/5, HTTP proxy) and many different IPs, including asynchronous networking support (very fast).
The Java gRPC implementation. HTTP/2 based RPC
HiBench is a Hadoop benchmark suite.
Hydra is a framework for elegantly configuring complex applications
Source examples to support the "Cascading for the Impatient" blog post series
Mirror of Apache Samza
Runs embedded, in-memory Apache Kafka instances. Helpful for integration testing.
A tool for managing Apache Kafka.
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+, while using Apache Avro as the data serialization format.
Hadoop utilities for Kafka