core.matrix's Introduction

core.matrix

The core.matrix library provides array programming as a language extension for Clojure/Clojurescript, with a focus on numerical computing.

core.matrix will become an official part of Clojure Contrib now that the API is mature and proven, at which point this repo will move to org.clojure/core.matrix.

The central objective of core.matrix is to make matrix and array programming idiomatic, productive, elegant and fast in the Clojure environment.

(+ [[1 2]
    [3 4]]
   (* (identity-matrix 2) 3.0))

=> [[4.0 2.0]
    [3.0 7.0]]

(shape [[2 3 4] [5 6 7]]) ; => [2 3]

(mmul
  (array [[2 2] [3 3]])
  (array [[4 4] [5 5]])) ; => [[18 18] [27 27]]

;; Note: nested Clojure vectors can be used as matrices
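The snippets above omit their namespace setup. A minimal sketch of how they might be run at a REPL, assuming core.matrix is on the classpath (it is published on Clojars as net.mikera/core.matrix); the aliases `m` and `mo` and the var names are illustrative only:

```clojure
;; Assumes the core.matrix library is already a project dependency.
(require '[clojure.core.matrix :as m]
         '[clojure.core.matrix.operators :as mo])

;; The array-aware + and * live in the operators namespace,
;; so they are aliased here rather than shadowing clojure.core.
(def summed
  (mo/+ [[1 2]
         [3 4]]
        (mo/* (m/identity-matrix 2) 3.0)))

;; Shape of a 2x3 nested vector.
(def dims (m/shape [[2 3 4] [5 6 7]]))

;; Matrix multiplication of two 2x2 arrays.
(def product
  (m/mmul (m/array [[2 2] [3 3]])
          (m/array [[4 4] [5 5]])))
```

Aliasing the operators namespace keeps the standard clojure.core arithmetic available for ordinary scalar code in the same namespace.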

Key goals of core.matrix:

  • Provide a clear, standard API / abstraction for matrix and array programming in Clojure
  • Enable pluggable support for different underlying matrix library implementations
  • Provide general purpose n-dimensional array implementations
  • Provide a foundation for other projects in the ecosystem (e.g. Incanter)
  • Enable high performance numerical computing
  • Allow idiomatic Clojure coding for numerical code

Documentation

For general core.matrix documentation and examples see the Wiki:

API documentation is available here

For a general introduction, the slides and video from the 2013 Clojure Conj talk are available here:

Clojurescript

To develop for Clojurescript you will need to use the cljs-dev profile like this:

lein with-profile +cljs-dev repl

or using figwheel:

lein with-profile +cljs-dev figwheel

To build the Clojurescript unit tests you can run:

lein with-profile +cljs cljsbuild once

and then load resources/public/test.html in a browser to run the tests.

Docker

A Docker setup is available for a quick, reproducible environment.

Sample Commands

To build the dev environment image (currently based on the clojure:openjdk-11-lein-buster image):

`docker build -f ./docker/Dockerfile . -t core.matrix-dev`

To run the build after the dev image is ready:

`docker run --rm core.matrix-dev`

To run an interactive bash shell for ad hoc build commands:

`docker run -it --entrypoint bash core.matrix-dev`

Status

core.matrix is fully functional and usable in production applications. As well as supporting the standard Clojure data structures, multiple back end implementations exist that provide optimised matrix implementations. The most mature implementations are currently:

  • vectorz-clj : a fast pure-JVM matrix library for Clojure, supporting full ND arrays
  • Clatrix : native code matrix library using BLAS
  • NDArray : a general purpose pure Clojure N-dimensional array implementation, included as part of core.matrix itself
  • nd4clj : a wrapper around the Nd4j API, a native-code and GPU-accelerated matrix library

For Clojurescript:

  • aljabr : a cljc matrix library supporting Clojurescript

The API is relatively mature but still subject to some changes at present (at least up until release 1.0.0), so users should be prepared to deal with potential breaking changes when updating to future releases.

Contributing

All contributions / ideas welcome!

There are a number of proposed enhancements listed as GitHub issues: these are a good place to start if you wish to contribute to core.matrix:

If you wish to contribute code, please ensure you have a Clojure Contributors Agreement signed and on file. For more information see:

Discussions related to core.matrix generally take place on the "Numerical Clojure" group:

If you are interested in writing a core.matrix implementation, see:

You can also find a protocol implementation summary here:

Thanks

YourKit is kindly supporting this open source project with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.

core.matrix's People

Contributors

avgerin0s, bmabey, danielcompton, drone29a, ds923y, edwastone, ejackson, fyquah, gerrrr, grtlr, heffalump, hokkaido, irfn, japonophile, keithschulze, mikera, mschuene, munk, nblumoe, prasant94, quantisan, rosejn, shark8me, si14, siscia, tim-brooks, tommy-mor, tschady, ulsa, unwarysage

core.matrix's Issues

Eliminate PersistentVector's implementations usage from default.clj

This is a long-term goal, but I think generally it's better to do this.

For now we reuse PersistentVector's implementations in many places in default.clj, coercing a value to persistent vector, performing an operation and coercing it back. This causes slowness (like in 5714695) and bugs (like in #51). We can avoid this in 2 ways:

  1. use NDArray instead of vectors, as it is presumably faster;
  2. implement defaults using mandatory protocols on arguments.

These options do not contradict each other, so we can use both.

Move protocols into separate namespace

Maybe protocols should be moved into a separate namespace:

Pros:

  • Hide protocols from API users
  • Allow protocols to be used independently of core.matrix
  • Might solve circular dependency issues with default implementations (core.matrix requires default implementation requires protocol)

Matrix construction functions

We need a way of constructing matrices that:

  • Is easy to use
  • can be hooked into by different matrix implementations

The idea is that something like:

(def M (matrix [[1 0] [0 1]]))

Will construct a 2x2 identity matrix using the current matrix implementation we are using.
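One way such a hookable constructor could be shaped is a multimethod keyed on the current implementation. This is only a sketch in plain Clojure: the names `*matrix-implementation*`, `construct-matrix`, `make-matrix` and the `:double-rows` back end are hypothetical and are not part of core.matrix.

```clojure
;; Hypothetical: keyword naming the implementation currently in use.
(def ^:dynamic *matrix-implementation* :persistent-vector)

;; Each matrix implementation hooks in by adding a method for its own key.
(defmulti construct-matrix
  (fn [impl data] impl))

;; Default back end: plain nested persistent vectors.
(defmethod construct-matrix :persistent-vector [_ data]
  (mapv vec data))

;; Illustrative second back end: each row stored as a primitive double array.
(defmethod construct-matrix :double-rows [_ data]
  (mapv double-array data))

(defn make-matrix
  "Constructs a matrix from nested data using the current implementation."
  [data]
  (construct-matrix *matrix-implementation* data))

;; Usage: the same call site produces different representations.
(def M (make-matrix [[1 0] [0 1]]))
(def M-doubles
  (binding [*matrix-implementation* :double-rows]
    (make-matrix [[1 0] [0 1]])))
```

The user-facing function stays the same while implementations register themselves, which covers both requirements listed above.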

N-dimensional pretty printing

Follow on from #15

Pretty-printing of different (non-2D) array sizes currently fails with errors like:

java.lang.UnsupportedOperationException: nth not supported on this type: ScalarWrapper
    at clojure.lang.RT.nthFrom(RT.java:846)
    at clojure.lang.RT.nth(RT.java:796)
    at clojure.core.matrix$pm.invoke(matrix.clj:999)

pm needs extending to handle these cases

Can't register an implementation without PConversion

Here is a file/revision in question: https://github.com/si14/matrix-api/blob/70b376f58ec3846df6622b971001c3ade32d0725/src/main/clojure/clojure/core/matrix/impl/ndarray.clj#L283

When trying to run

(imp/register-implementation (empty-ndarray [1]))
java.lang.NullPointerException: null
 at clojure.lang.RT.alength (RT.java:2120)
    clojure.core.matrix.impl.wrappers.NDWrapper.dimensionality (wrappers.clj:248)
    clojure.core.matrix.protocols$persistent_vector_coerce.invoke (protocols.clj:458)
    clojure.core.matrix.impl.wrappers.NDWrapper.toString (wrappers.clj:285)
    clojure.core$str.invoke (core.clj:497)
    clojure.core/fn (core_print.clj:94)
    clojure.lang.MultiFn.invoke (MultiFn.java:167)
    clojure.core$pr_on.invoke (core.clj:3266)
    clojure.core$print_map$fn__5292.invoke (core_print.clj:197)
    clojure.core$print_sequential.invoke (core_print.clj:58)
    clojure.core$print_map.invoke (core_print.clj:200)
    clojure.core/fn (core_print.clj:204)
    clojure.lang.MultiFn.invoke (MultiFn.java:167)
    clojure.core$pr_on.invoke (core.clj:3266)
    clojure.core$pr.invoke (core.clj:3278)
    clojure.tools.nrepl.middleware.pr_values$pr_values$fn$reify__531$fn__533.invoke (pr_values.clj:23)
    clojure.tools.nrepl.middleware.pr_values$pr_values$fn$reify__531.send (pr_values.clj:23)
    clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__547$fn__558.invoke (interruptible_eval.clj:67)
    clojure.main$repl$read_eval_print__6405.invoke (main.clj:246)
    clojure.main$repl$fn__6410.invoke (main.clj:266)
    clojure.main$repl.doInvoke (main.clj:266)
    clojure.lang.RestFn.invoke (RestFn.java:1096)
    clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__547.invoke (interruptible_eval.clj:56)
    clojure.lang.AFn.applyToHelper (AFn.java:159)
    clojure.lang.AFn.applyTo (AFn.java:151)
    clojure.core$apply.invoke (core.clj:601)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1771)
    clojure.lang.RestFn.invoke (RestFn.java:425)
    clojure.tools.nrepl.middleware.interruptible_eval$evaluate.invoke (interruptible_eval.clj:41)
    clojure.tools.nrepl.middleware.interruptible_eval$interruptible_eval$fn__588$fn__590.invoke (interruptible_eval.clj:171)
    clojure.core$comp$fn__4034.invoke (core.clj:2278)
    clojure.tools.nrepl.middleware.interruptible_eval$run_next$fn__581.invoke (interruptible_eval.clj:138)
    clojure.lang.AFn.run (AFn.java:24)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1110)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:603)
    java.lang.Thread.run (Thread.java:722)

The error doesn't arise when PConversion is present. PConversion isn't marked as mandatory in protocols.clj, so I'm wondering if this is a bug.

Parallel Colt

I tried to write a first implementation of the matrix API for Parallel Colt (PC), but it is a huge mess. I guess I am missing something in the multimethod stuff.

It is not complete: https://gist.github.com/4525672

What I noticed is that PC does not have a real, clear abstraction of a matrix.

It may just be me, but I guess that implementing PC is going to be a big mess.

Cheers

Simone

Mechanism to load matrix implementations

We could give users an easy way to load a specific matrix implementation.

I'm thinking something like:

(use 'core.matrix)

(use-matrix-implementation :jblas)

Mechanism would need to do a few things:

  • look up the implementation in a list of known implementations
  • require the implementation to ensure it is loaded. In case of failure, do something sane (e.g. give some instructions on how to get the dependency on the classpath)
  • possibly "hook" into some core.matrix functions, e.g. functions to construct new matrices should produce JBLAS matrices by default
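The three steps above could be sketched roughly as follows in plain Clojure. Everything here is an assumption for illustration: the registry contents, the `example.matrix.jblas` namespace (which does not exist) and the `current-implementation` atom are hypothetical, and `clojure.set` is used only as a stand-in namespace that is always loadable.

```clojure
;; Hypothetical registry of known implementations: keyword -> namespace to load.
(def known-implementations
  {:jblas       'example.matrix.jblas   ; hypothetical namespace, not a real artifact
   :clojure-set 'clojure.set})          ; stand-in namespace that ships with Clojure

;; Constructors could consult this atom to decide which matrix type to produce.
(def current-implementation (atom :persistent-vector))

(defn use-matrix-implementation
  "Looks up k, requires its namespace, and records it as the current implementation."
  [k]
  (let [ns-sym (get known-implementations k)]
    (when-not ns-sym
      (throw (ex-info (str "Unknown matrix implementation " k)
                      {:known (set (keys known-implementations))})))
    (try
      (require ns-sym)
      (catch Exception e
        ;; Fail with instructions instead of a bare classpath error.
        (throw (ex-info (str "Could not load " ns-sym
                             "; add the library providing it to your project dependencies.")
                        {:implementation k}
                        e))))
    (reset! current-implementation k)))
```

Keeping the original exception as the cause preserves the underlying classpath message for debugging.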

NDArray loading time

Currently, loading NDArray as part of core.matrix is extremely slow, sometimes taking as much as 20 seconds.

This isn't acceptable for general purpose use of core.matrix, so for the moment I've made NDArray optional so that it loads lazily, see this commit:

9f1cfa4

With this commit, the NDArray initialisation time is only incurred the first time an NDArray is used, e.g.

(time (array :ndarray 1))
"Elapsed time: 17640.514591 msecs"

(time (array :ndarray 1))
"Elapsed time: 0.118685 msecs"

What does 'pre-scale' mean as opposed to 'scale'?

Bit confused about this.

(defprotocol PMatrixScaling
  "Protocol to support matrix scaling by scalar values"
  (scale [m a])
  (pre-scale [m a]))

Surely pre- and post-multiplication by a scalar are the same thing (just element-wise scaling)? Or did you have something else in mind?

Missing cross product for vectors?

Is there a function for the cross product between vectors? I couldn't find it anywhere in the API. Maybe you could point me in the right direction.
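For reference, the operation being asked about is small enough to write by hand for 3-element vectors. The following is a plain Clojure sketch on ordinary vectors; the name `cross-product` is illustrative and is not claimed to be the library's function.

```clojure
(defn cross-product
  "Cross product of two 3-element vectors, returned as a vector."
  [[a1 a2 a3] [b1 b2 b3]]
  [(- (* a2 b3) (* a3 b2))
   (- (* a3 b1) (* a1 b3))
   (- (* a1 b2) (* a2 b1))])

;; Usage: x cross y gives z for the standard basis vectors.
(def z-axis (cross-product [1 0 0] [0 1 0]))
```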

Add API functions for standard matrix decompositions

We should have API functions for most standard matrix decompositions.

In particular we should have:

  • LU (lower-upper triangular decomposition)
  • SVD (singular value decomposition)
  • QR
  • Cholesky decomposition
  • probably a few others.....

Naming is TBD, we have various proposals:

  • lu
  • decompose-lu
  • lu-decompose
  • decomposition-lu
  • lu-decomposition

Need to decide on a consistent and logical naming scheme before we commit to a public API.

Also need to define return values. Current thinking is a map (or defrecord?) that contains clearly labelled values e.g.

(decompose-lu matrix)
=> {:lower [matrix 1]
    :upper [matrix 2]
    :determinant [double value]}
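To make the map-shaped return value concrete, here is a minimal sketch in plain Clojure on nested vectors. It uses Doolittle elimination without pivoting, so it is only meaningful when no zero pivot is encountered. The function name follows the decompose-lu proposal and the keys follow the suggestion above, but none of this is the core.matrix implementation.

```clojure
(defn- identity-rows
  "n x n identity matrix as nested vectors of doubles."
  [n]
  (mapv (fn [i] (mapv (fn [j] (if (= i j) 1.0 0.0)) (range n)))
        (range n)))

(defn decompose-lu
  "LU decomposition of a square nested-vector matrix, without pivoting.
   Returns a map with :lower, :upper and :determinant."
  [a]
  (let [n (count a)]
    (loop [k 0
           l (identity-rows n)
           u (mapv #(mapv double %) a)]
      (if (= k n)
        {:lower l
         :upper u
         ;; L has a unit diagonal, so det(A) is the product of U's diagonal.
         :determinant (reduce * (map #(get-in u [% %]) (range n)))}
        (let [pivot  (get-in u [k k])
              ;; Multiplier that eliminates column k from row i.
              factor (fn [i] (/ (get-in u [i k]) pivot))
              next-l (reduce (fn [acc i] (assoc-in acc [i k] (factor i)))
                             l
                             (range (inc k) n))
              next-u (reduce (fn [acc i]
                               (let [f (factor i)]
                                 (assoc acc i (mapv #(- %1 (* f %2)) (acc i) (acc k)))))
                             u
                             (range (inc k) n))]
          (recur (inc k) next-l next-u))))))
```

A map keeps each result clearly labelled and leaves room for extra keys (for example a permutation, if pivoting were added) without breaking callers.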

See:

Terminology: use 'array' for general case, 'matrix' only for 2D arrays?

Hi all

So I'm a bit confused at present looking at PImplementation and PDimensionInfo. They're addressing the general NDArray case (great!), but are using the term "matrix" not just for a (2D) matrix, but also for vectors and for general ND arrays.

I suggest we follow tradition in referring to the general case as an 'array' (equally happy with 'ndarray' as in numpy, 'tensor' as in matlab, or similar), and use 'matrix' only for the 2D case. I've just never heard anyone refer to a non-2-dimensional array as a matrix.

Also the "matrix" terminology adds extra unnecessary confusion around the distinction between a 1D array, and a 2D matrix whose shape is 1xN (row matrix?) or Nx1 (column matrix?).

I'd like to suggest that referring to a 'row vector' or 'column vector' is a bad idea because of the potential for confusion about what it means (a 1D vector, or a 2D matrix whose shape is 1xN or Nx1 ?). For reasons discussed on the mailing list, in a general NDArray framework it's preferable not to have special cases for 1D arrays identifying them with 2D matrices, just use a consistent broadcasting rule for all array shapes. A 1D array doesn't really have any natural orientation as a row or a column in a matrix context, at least not unless the chosen broadcasting rule makes it so.

Suggest using just 'vector' for the first case, and 'column matrix' or 'row matrix' for the latter, to make this more explicit.

Happy to put a patch together if people agree.

Lazy instantiation of specialised NDArray types

Currently NDArray is pre-defined with a specific set of NDArray specialised types (currently Object and double)

Ideally, it should be possible to create (at runtime if necessary) a new specialised NDArray for any arbitrary type.

This would give several advantages:

  • It makes the NDArray useful for custom specialised types, e.g. Complex
  • Users can get better performance for their specific types
  • It makes it possible to initialise NDArrays for primitive types like byte if needed
  • It would also be possible to lazy-load Object and double, to reduce start up time issues / AOT compilation requirement

Reflection warnings in NDArray

The current NDArray implementation has some reflection warnings whenever mvn test is run:

Reflection warning, clojure/core/matrix/impl/ndarray.clj:123 - reference to field data can't be resolved.
Reflection warning, clojure/core/matrix/impl/ndarray.clj:124 - reference to field ndims can't be resolved.
Reflection warning, clojure/core/matrix/impl/ndarray.clj:125 - reference to field shape can't be resolved.
Reflection warning, clojure/core/matrix/impl/ndarray.clj:126 - reference to field strides can't be resolved.
Reflection warning, clojure/core/matrix/impl/ndarray.clj:127 - reference to field offset can't be resolved.
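Warnings of this kind usually go away once the compiler knows the type of the object whose field is being read. The sketch below shows the general technique with type hints; `SimpleArray` is a hypothetical stand-in whose field names mirror the warnings above, not the real NDArray type.

```clojure
;; Hypothetical stand-in type with the same field names as in the warnings.
(deftype SimpleArray [^doubles data ^long ndims shape strides ^long offset])

;; With (set! *warn-on-reflection* true) at the REPL, an unhinted argument makes
;; (.data a) reflective and produces "reference to field data can't be resolved".
;; The ^SimpleArray hint lets the compiler emit a direct field access instead.
(defn first-element
  "Returns the element stored at the array's offset."
  ^double [^SimpleArray a]
  (aget ^doubles (.data a) (.offset a)))

;; Usage: offset 1 selects the second slot of the backing array.
(def example-element
  (first-element (SimpleArray. (double-array [1.0 2.0 3.0]) 1 [2] [1] 1)))
```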

Finalise matrix operation function naming

Need to decide on a consistent function name set for matrix operations.

There seem to be a lot of options, e.g.

  • mul (vecmath, jblas, vectorz)
  • mult (clatrix, colt)
  • multiply (english!)

Tempted to go with mul for consistency with majority of underlying implementations - any better ideas?

Others are probably easier - use the obvious three letter abbreviations if we go with mul

  • add
  • sub
  • div

Maths functions can mimic java.lang.Math like clatrix?

  • exp
  • log10
  • signum
  • etc.

Other operations should generally use full English words, especially where they match mathematical terms:

  • negate
  • determinant
  • trace
  • etc.

AOT compilation fails in 0.9.0

I noticed that AOT compilation fails with core.matrix 0.9.0

I created a fresh leiningen project, added core.matrix 0.9.0 as dependency and
added the existing core namespace to the project.clj :aot key. When I then require core.matrix
in the core namespace and do a lein compile, I get the following Exception:

Exception in thread "main" java.lang.IllegalArgumentException: No implementation of method: :implementation-key of protocol:
#'clojure.core.matrix.protocols/PImplementation found for class: clojure.core.matrix.impl.ndarray.NDArray, compiling:(ndarray.clj:351:1)

When I revert back to core.matrix 0.8.0, it compiles just fine.

Matrix division fails on 0.10

I encountered this exception on matrix division while trying out 0.10. I do not see this problem on 0.9. The repl session in the screenshot below defines a 3x3 matrix and then divides it by itself to demonstrate the issue:

[Screenshot: REPL session dividing a 3x3 matrix by itself]

The exception I get is the following:

java.lang.ClassCastException: clojure.core.matrix.impl.ndarray.NDArray cannot be cast to java.lang.Number

John

Normalise benchmarks to per-operation timing

New benchmarks look good Dmitry!

I think it would be an improvement if the benchmarks normalised the results by dividing by the number of operations performed, so that the result is a per-operation timing. This would let us know precisely how many nanoseconds each operation is taking.
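The normalisation being suggested amounts to dividing the elapsed time by the iteration count. A minimal sketch in plain Clojure, where `ns-per-op` is a hypothetical helper and not part of the existing benchmark code:

```clojure
(defn ns-per-op
  "Runs the zero-argument function f n times and returns the mean
   wall-clock time per call, in nanoseconds, as a double."
  [f n]
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (/ (double (- (System/nanoTime) start)) n)))

;; Usage: time a small vector addition over 10000 iterations.
(def add-timing (ns-per-op #(mapv + [1 2 3] [4 5 6]) 10000))
```

This measures raw wall-clock time only; it does no JIT warm-up or statistical analysis, so a benchmarking library would still be needed for trustworthy numbers.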

length not working

It seems to me that the length implementation is not found correctly.

(use 'core.matrix)
(length [1 1])

Returns an IllegalArgumentException:

java.lang.IllegalArgumentException: No implementation of method: :length of protocol: #'clojure.core.matrix.protocols/PVectorOps found for class: clojure.lang.PersistentVector

I don't fully understand how the protocols of the matrix-api work yet, maybe you can point me in the right direction?

Thank you!

Make NDArray into default implementation

Currently the core.matrix default implementation is :persistent-vector

We should switch this to :ndarray once we are comfortable with the robustness of the NDArray implementation.

Dmitry - important milestone for you during GSoC, I think!

Protocols revision

We need to revise distribution of methods between protocols at some point. Here is an example why this is needed:

(defprotocol PZeroDimensionAccess 
  (get-0d [m])
  (set-0d! [m value]))

(defprotocol PZeroDimensionSet
  (set-0d [m value]))

There is no semantic reason why these two protocols should be different.

Pretty-printing for matrices

Should have a pretty-printer for matrix output, allowing at least the following:

  • Alignment of columns/ decimal places
  • Truncation of double values to a fixed number of decimal places
  • Max size truncation for very large matrices (maybe display 20x20 max by default?)

Printed output should still be readable clojure data.

Output might be something like:

[[-1.000  0.000  0.000]
 [ 0.000  3.141  0.000]
 [ 0.000  0.000 -2.718]]
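A rough sketch of the column-alignment part in plain Clojure, for non-empty 2D nested vectors only. The names `fmt-num` and `pretty-matrix` are hypothetical, and note one difference from the sample above: this version rounds to the given number of places rather than truncating, and it does no max-size truncation.

```clojure
(require '[clojure.string :as str])

(defn fmt-num
  "Formats x with a fixed number of decimal places, independent of locale."
  [x places]
  (String/format java.util.Locale/ROOT
                 (str "%." places "f")
                 (object-array [(double x)])))

(defn pretty-matrix
  "Returns a string for a non-empty 2D nested vector with right-aligned columns.
   The result is still readable Clojure data."
  [rows places]
  (let [cells (mapv (fn [row] (mapv #(fmt-num % places) row)) rows)
        width (apply max (map count (apply concat cells)))
        pad   (fn [s] (str (apply str (repeat (- width (count s)) \space)) s))
        lines (map (fn [row] (str "[" (str/join " " (map pad row)) "]")) cells)]
    (str "[" (str/join "\n " lines) "]")))

;; Usage: every cell is padded to the width of the widest formatted value.
(def printed (pretty-matrix [[1 -2.5] [10 0]] 1))
```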

Broken emap for NDArray?

emap appears to incorrectly iterate too deep when an NDArray contains Clojure vectors, e.g.

(def m (new-array :ndarray [3 3])) ;; create a 3x3 NDarray
(def my-vectors (for [i (range 9)] [i (inc i)])) ;; a sequence of 9 vectors
(assign-array! m (object-array my-vectors)) ;; assign all the elements using an array

(emap count m) ;; should apply count to each of the 9 vectors
UnsupportedOperationException count not supported on this type: Long  clojure.lang.RT.countFrom (RT.java:545)

This could be a problem of NDArray depending on the persistent vector implementation, for which nested vectors are a known limitation?

.toString broken for NDArray?

Currently .toString appears to be broken for NDArray (latest develop branch)

(def m (new-array :ndarray [3 3]))
(.toString m)
RuntimeException Can't coerce to vector: class clojure.core.matrix.impl.ndarray.NDArray  clojure.core.matrix.protocols/persistent-vector-coerce (protocols.clj:489)

unable to connect to project via lein

I'm unable to connect to the matrix-api project via lein. I've updated my local repository, as well as lein.

When I do lein check I get the following error:

Exception in thread "main" java.io.FileNotFoundException: Could not locate clojure/core/matrix/compliance_tester__init.class or clojure/core/matrix/compliance_tester.clj on classpath: 

When I connect to the project in the repl with lein repl (in the project root) and try to use clojure.core.matrix this is what happens:

user=> (use 'clojure.core.matrix)
FileNotFoundException Could not locate clojure/core/matrix__init.class or clojure/core/matrix.clj on classpath:   clojure.lang.RT.load (RT.java:432)

Maybe the problem is that the files are located in src/main/clojure/clojure/core/matrix rather than src/main/clojure/core/matrix?

Sparse matrix support?

Do any of the supported core.matrix implementations cover sparse matrices? If not, what would you recommend for inclusion?

PImplementation

Hi,

I have some questions about the new PImplementation: what should come out of new-vector, new-matrix and new-matrix-nd?

Should they return another vector and other matrices?

Is there really a point in doing so?

I mean, it leads to something like this:

(def matrix (matrix [[1 2] [2 1]]))
(new-vector matrix [1 2 3 4])

To get a vector, why do we need to pass the first matrix in the call to new-vector?
To get a vector of the same implementation?

Am I missing something?

I would instead define something like new-vector-parallel-colt and new-matrix-parallel-colt, and then a big case statement:

(defn new-vector [data & {:keys [type] :or {type :parallel-colt}}]
  (case type
    :parallel-colt (new-vector-parallel-colt data)
    :other (new-vector-other data)))

Sorry if I am missing the obvious...

Submatrix view construction

It should be possible with the NDArrayView to create a "submatrix" of an existing matrix, i.e. a rectangular view over a subset of any other matrix.

It should work as a "view", i.e. modifying the view matrix will modify the underlying data (assuming the source matrix is mutable)

Needs some functions / protocols in the main API to create these. Probably something like

(sub-matrix m [[0 2] [2 4]])
=> 2x2 matrix starting at [0,2] in the original matrix m

Question: Should we use [start, end] or [start,length] to specify index ranges?
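The two conventions can be compared side by side with a small sketch in plain Clojure on nested vectors. Both function names are hypothetical, and because persistent vectors are immutable this illustrates only the indexing, not the mutable-view behaviour described above.

```clojure
(defn sub-matrix-by-end
  "Ranges are [start end) per dimension: [[row-start row-end] [col-start col-end]]."
  [m [[r0 r1] [c0 c1]]]
  (mapv #(subvec % c0 c1) (subvec m r0 r1)))

(defn sub-matrix-by-length
  "Ranges are [start length] per dimension: [[row-start rows] [col-start cols]]."
  [m [[r0 rn] [c0 cn]]]
  (sub-matrix-by-end m [[r0 (+ r0 rn)] [c0 (+ c0 cn)]]))

(def m4 [[0  1  2  3]
         [4  5  6  7]
         [8  9 10 11]
         [12 13 14 15]])

;; The same 2x2 block starting at row 0, column 2, written in both conventions.
(def by-end    (sub-matrix-by-end    m4 [[0 2] [2 4]]))
(def by-length (sub-matrix-by-length m4 [[0 2] [2 2]]))
```

Under the [start, end] reading, the `[[0 2] [2 4]]` example above does yield a 2x2 block; under [start, length] the same argument would describe a 2x4 block.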

equals not working for different-sized NDArrays

(equals (array :ndarray 1) (array :ndarray [1 1]))
=> true

I believe these should be non-equal : different shaped arrays should not be equal to each other (even if the broadcasted version would be)
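The stricter behaviour argued for here could be expressed as "same shape first, then element-wise numeric equality". A sketch in plain Clojure on scalars and nested vectors; `shape-of`, `elements` and `strict-equals` are hypothetical helpers, not the library's implementation.

```clojure
(defn shape-of
  "Shape of a scalar ([]) or a regular nested vector, e.g. [2 3]."
  [a]
  (if (vector? a)
    (into [(count a)] (when (seq a) (shape-of (first a))))
    []))

(defn elements
  "All scalar elements of a in row-major order."
  [a]
  (if (vector? a) (mapcat elements a) [a]))

(defn strict-equals
  "True only when a and b have the same shape and numerically equal elements.
   No broadcasting is applied."
  [a b]
  (and (= (shape-of a) (shape-of b))
       (every? true? (map == (elements a) (elements b)))))

;; A scalar and a 2-element vector differ in shape, so they are not equal.
(def scalar-vs-vector (strict-equals 1 [1 1]))
```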
