fold

A unified data model for multidimensional arrays.

Building on a foundation of generalized shapes, the fold project aims to factor the evolving multidimensional array repertoire - including advanced indexing, indexed assignments, strided storage, ragged arrays, views, sparsity and other structured storage techniques, scatter/gather, etc. - into a small collection of orthogonal concepts:

  • extraction vs injection along an index: do we want to

    • extract elements of array a at the locations in index i to form a new array: b = a[i]
    • inject the elements of array b into array a at i's locations: a[i] = b
  • materialization vs functionalization: given an operation that produces an observable result, do we want to

    • perform the operation immediately and materialize the result?
    • defer the operation and provide a functional interface through which the result can be observed piecemeal?
  • explicit vs implicit indexing: given a set of locations we wish to express, should we

    • explicitly provide the locations as values? or
    • give the locations implicitly, by placing some associated values at those locations?
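The three axes above can be illustrated with a minimal sketch over plain Python lists. This is not the fold API: the function names (`extract`, `inject`, `extract_deferred`, `implicit_to_explicit`) are hypothetical and chosen only for this illustration, and a boolean mask is assumed as one possible form of implicit indexing.

```python
from typing import Callable, List, Sequence


def extract(a: Sequence[float], i: Sequence[int]) -> List[float]:
    """Extraction, materialized: build a new list b = a[i]."""
    return [a[k] for k in i]


def inject(a: List[float], i: Sequence[int], b: Sequence[float]) -> None:
    """Injection, materialized: perform a[i] = b in place."""
    if len(i) != len(b):
        raise ValueError("index and values must have the same length")
    for k, v in zip(i, b):
        a[k] = v


def extract_deferred(a: Sequence[float], i: Sequence[int]) -> Callable[[int], float]:
    """Extraction, functionalized: element n of a[i] is read only when asked for."""
    return lambda n: a[i[n]]


def implicit_to_explicit(mask: Sequence[bool]) -> List[int]:
    """Turn implicit locations (True placed at each location) into explicit index values."""
    return [k for k, flag in enumerate(mask) if flag]


a = [10.0, 20.0, 30.0, 40.0]
i = [3, 0, 3]                       # explicit index: locations given as values

b = extract(a, i)                   # materialized now: [40.0, 10.0, 40.0]
view = extract_deferred(a, i)       # nothing is read yet

mask = [True, False, True, False]   # implicit index: locations marked in place
inject(a, implicit_to_explicit(mask), [-1.0, -2.0])
# a is now [-1.0, 20.0, -2.0, 40.0]; b still holds the values it copied earlier,
# while view(1) reads a[0] at call time and therefore sees -1.0.
```

The materialized result `b` is a snapshot, whereas the deferred `view` observes the later injection; that difference is the practical consequence of choosing materialization or functionalization.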

To make this factoring concrete, we need the following foundations:

  • a data model for generalized shapes. Much of the power of multidimensional arrays comes from the reification of shape into metadata, decoupling abstract shape from physical layout and letting us define many important operations as simple metadata transforms. But the conventional model - a tuple of positive integers, one per dimension - is limited to expressing (hyper)rectangles. We need to generalize this model to include representations of the nonrectangular shapes we'll need, while preserving the current representations of rectangular shapes as a subset.

  • a data model for generalized strides that is expressive enough to represent any traversal of underlying storage to visit the elements of an array of any generalized shape. This universality lets us exploit the derivative/integral relationship between strides and positions (analogous to the relationship between extents and offsets in shapes).

  • a model for encoded sequences. Capturing regularity in shape and stride patterns not only ensures that the quantity of metadata needed is correlated with entropy rather than array size; interpreting per-dimension metadata as an encoded sequence is also what lets us express the shapes and strides of rectangular arrays in the usual way within our generalized abstraction.
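As a rough illustration of these foundations, the sketch below describes a ragged two-dimensional shape by per-row extents, derives offsets as running sums and recovers extents as adjacent differences, and run-length encodes the per-row metadata. It assumes run-length encoding as one simple sequence encoding; the actual representations used by fold may differ, and all names here are hypothetical.

```python
from itertools import accumulate, groupby
from typing import List, Sequence, Tuple


def offsets_from_extents(extents: Sequence[int]) -> List[int]:
    """Running sum ("integral") of extents: where each row starts in flat storage."""
    return [0] + list(accumulate(extents))


def extents_from_offsets(offsets: Sequence[int]) -> List[int]:
    """Adjacent differences ("derivative") of offsets recover the extents."""
    return [hi - lo for lo, hi in zip(offsets, offsets[1:])]


def rle_encode(seq: Sequence[int]) -> List[Tuple[int, int]]:
    """Encode a sequence as (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(seq)]


def rle_decode(runs: Sequence[Tuple[int, int]]) -> List[int]:
    """Expand (value, run length) pairs back into the full sequence."""
    return [value for value, count in runs for _ in range(count)]


# A ragged shape: three rows with 3, 1 and 2 elements over one flat store.
ragged_extents = [3, 1, 2]
ragged_offsets = offsets_from_extents(ragged_extents)   # [0, 3, 4, 6]
storage = ["a", "b", "c", "d", "e", "f"]
rows = [storage[lo:hi] for lo, hi in zip(ragged_offsets, ragged_offsets[1:])]

# A rectangular 3 x 4 shape has the same extent in every row, so its
# per-row metadata collapses to a single run: the familiar "4" repeated 3 times.
rect_runs = rle_encode([4, 4, 4])              # [(4, 3)]
ragged_runs = rle_encode(ragged_extents)       # [(3, 1), (1, 1), (2, 1)]
```

In this sketch the rectangular case needs one run regardless of how many rows there are, while the ragged case needs metadata in proportion to its irregularity, which is the sense in which metadata size tracks entropy rather than array size.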

This repo

This repo contains a prototype implementation of the fold array model, implemented in Python and using PyTorch Tensors as array backing stores (although full integration into PyTorch isn't attempted).

For a working introduction, check out these notebooks, starting with the one on generalized shapes.

fold's People

Contributors

bhosmer
