
Optimized Analytics Package for Spark Platform (OAP) Projects

arrow

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

arrow-data-source

Spark DataSource plugin for reading files in various formats, such as Parquet, into Arrow-compatible columnar vectors.

cloudtik

Cloud Scale Platform for Distributed Analytics and AI

gazelle_plugin

Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

oap-mllib

Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.

oap-tools

Tools for building and packaging OAP, and for integrating it with public cloud platforms such as AWS EMR, Google Dataproc, and K8s.

pmem-common

Common library for accessing PMem native library functions, including memkind and vmemcache.

pmem-shuffle

Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for reads) to provide an extremely high-performance, low-latency shuffle solution for Spark*.

pmem-spill

Spark plug-in package for accelerating Spark runtime spill functions using PMem, such as the RDD cache PMem extension.

protobuf

An Intel-customized version of Protocol Buffers, Google's data interchange format.

raydp

RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.

remote-shuffle

Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.

sql-ds-cache

Spark* plug-in for accelerating Spark* SQL performance by using caching and indexing at the SQL data source layer.

velox

A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
