
Awesome Dataframes

An awesome list of dataframe (and dataframe-like) libraries. This list focuses on libraries and tools intended for local (on your personal computer) manipulation of tabular data.

Libraries

Python

  • pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.
  • Modin - Speed up your Pandas workflows by changing a single line of code.
  • Lemuras - A small pure Python library to deal with big tables.
  • Ibis - A pandas-like deferred expression system, with first-class SQL support.
  • agate - agate is a Python data analysis library that is optimized for humans instead of machines. It is an alternative to numpy and pandas that solves real-world problems with readable code.
  • Prosto - A Python data processing toolkit to programmatically author and execute complex data processing workflows. Conceptually, it is an alternative to purely set-oriented approaches to data processing like map-reduce, relational algebra, SQL or data-frame-based tools like pandas.
  • siuba - Python library for using dplyr-like syntax with pandas and SQL.
  • Vaex - A high performance Python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets.
  • dfply - dplyr-style piping operations for pandas dataframes.
  • kadro - A friendly pandas wrapper with a more composable grammar support.
  • dexplo - Data exploration library with a pandas-like API.
  • pandas_cub - A detailed project that teaches you how to build your own Python data analysis library, pandas_cub, from scratch.
  • fletcher - Pandas ExtensionDtype/ExtensionArray implementations backed by Apache Arrow.

R

  • dplyr - A grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges.
  • data.table - Provides a high-performance version of base R's data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.
  • dance - Dancing 💃 with the stats, aka tibble() dancing 🕺. dance is a sort of reinvention of the classic dplyr verbs with a more modern stack underneath, i.e. it leverages a lot from vctrs and rlang.

JavaScript

  • Arquero - A JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames.
  • dataflow-api - JavaScript API for dataflow processing using the vega-dataflow reactive engine. Perform common database operations (sorting, filtering, aggregation, window calculations) over JavaScript objects.
  • datalib - A JavaScript data utility library. It provides facilities for data loading, type inference, common statistics, and string templates.
  • Tidy.js - Tidy up your data with JavaScript, inspired by dplyr and the tidyverse.
  • Data-Forge - The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
  • zebras - A data manipulation and analysis library written in JavaScript offering the convenience of pandas or R.
  • dataframe-js - A JavaScript library providing a new data structure for data scientists and developers.

Julia

  • DataFrames.jl - Tools for working with tabular data in Julia.
  • DataKnots.jl - A Julia library for querying data with an extensible, practical and coherent algebra of query combinators.
  • Volcanito.jl - Backend-agnostic tabular data operations in Julia.
  • Query.jl - A package for querying Julia data sources. It can filter, project, join and group data from any iterable data source, including all the sources supported in IterableTables.jl.

Clojure

  • tech.ml.dataset - A Clojure high performance data processing system.
  • tablecloth - Dataset manipulation library built on top of tech.ml.dataset.

C++

  • DataFrame - A C++ statistical library that provides an interface similar to the pandas package in Python.

Elm

  • tidy - Leaning heavily on the principles of the tidyverse, and especially tidy data, this package makes it easy to reshape and tidy tabular data for easier data analysis and visualization.

Java

  • Tablesaw - Java dataframe and visualization library.

Kotlin

  • krangl - A {K}otlin library for data w{rangl}ing.

Lua

  • Assistant - A data science library providing flexible dataframes for Lua 5.1+

Ruby

  • rover - Simple, powerful data frames for Ruby.
  • daru - daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data in Ruby.

Rust

  • polars - A blazingly fast DataFrames library implemented in Rust.

Database

  • SQLite - A C-language library that implements a small, fast, self-contained, high-reliability, full-featured SQL database engine.
  • DuckDB - An embeddable SQL OLAP Database Management System.
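
As a rough illustration of why an embedded database fits a list about local tabular data, here is a minimal sketch that uses Python's built-in `sqlite3` module to run a dataframe-style group-by entirely in memory. The `sales` table, its columns, and the `total_by_city` helper are made up for this example and do not come from any project listed here.

```python
import sqlite3


def total_by_city(rows):
    """Load (city, amount) rows into an in-memory SQLite table and sum per city.

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        conn.execute("CREATE TABLE sales (city TEXT, amount REAL)")
        # Parameterized bulk insert of the input rows.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cur = conn.execute(
            "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
        )
        return cur.fetchall()
    finally:
        conn.close()


rows = [("Oslo", 10.0), ("Lima", 5.0), ("Oslo", 2.5)]
result = total_by_city(rows)
print(result)
```

The same query shape (filter, group, aggregate) is what most of the dataframe libraries above expose through method chains instead of SQL.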

CLI

  • VisiData - A terminal spreadsheet multitool for discovering and arranging data.

GUI

  • Power Query - A core capability of Power Query is to filter and combine, that is, to mash-up data from one or more of a rich collection of supported data sources.

Other

Papers

Other Lists
