
marlin's People

Contributors

myasuka, ronggu, wangzk, xingkungao


marlin's Issues

Matrix 5000 x 5000 inverse

Hello,

I am trying to invert a 5000 x 5000 matrix on Google Dataproc (code below). The same code already works for a 1000 x 1000 matrix on my local PC.

However, something seems to go wrong when the inverse method is called: the job fails, and this is all I get in the log.
Any ideas?

LOG:

fourth
fifth
17/09/14 14:32:15 INFO org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
sixth
septh

[Stage 1:> (0 + 2) / 2]
[Stage 1:=============================> (1 + 1) / 2]

17/09/14 14:32:28 INFO com.github.fommil.jni.JniLoader: successfully loaded /tmp/jniloader3386225062470282445netlib-native_system-linux-x86_64.so
17/09/14 14:32:29 INFO com.github.fommil.jni.JniLoader: already loaded netlib-native_system-linux-x86_64.so

CODE
def main(args: Array[String]): Unit = {
  System.out.println("first")
  val conf = new SparkConf()
  System.out.println("second")
  conf.set("spark.default.parallelism", "8")
  System.out.println("third")
  val sc = new SparkContext(conf)
  System.out.println("fourth")
  val SIZE = 5000
  System.out.println("fifth")
  // Read one CSV row per line and pair each row vector with its row index.
  val ma = sc.textFile("gs://sparkfilesjsaray/matr_5000.csv")
    .map(line => line.split(",").map(_.toDouble))
    .zipWithIndex()
    .map(line => (line._2, BDV(line._1)))
  System.out.println("sixth")
  val matrix = new DenseVecMatrix(ma, SIZE, SIZE)
  System.out.println("septh")
  val inverse = matrix.inverse()
  System.out.println("eight")
  inverse.saveToFileSystem("gs://sparkfilesjsaray/output5000.csv")
  System.out.println("nine")
  System.out.println("Done")
  System.out.println("first")
}
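
The log above stops after "septh" without showing an exception, so it may help to make each step report its own duration and failure instead of relying on bare println markers. Below is a minimal sketch in plain Scala; `Trace.step` is a hypothetical helper name, not part of Marlin or Spark. Keep in mind that Spark transformations such as map are lazy, so a marker printed after declaring an RDD does not mean that work has already run.

```scala
object Trace {
  /** Runs `body`, printing when the step starts, how long it took, and any failure. */
  def step[T](label: String)(body: => T): T = {
    println(s"[start] $label")
    val t0 = System.nanoTime()
    try {
      val result = body
      println(f"[done]  $label in ${(System.nanoTime() - t0) / 1e9}%.2f s")
      result
    } catch {
      case e: Throwable =>
        // Print the failure on the driver before rethrowing, so it is not lost in the log.
        println(s"[fail]  $label: $e")
        throw e
    }
  }
}
```

In the code above, the call would be wrapped as Trace.step("inverse") { matrix.inverse() }, and likewise for saveToFileSystem. If the driver prints nothing even then, the executor logs are the next place to look.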

error when compiling from master branch

Saw an error when compiling from the master branch:

MatrixSuite.scala:306: type mismatch;
found : Int (2)
required: (Int, Int, Int)
[ERROR] var result = ma.multiply(denVecMat, 2)
^

one error found

I think it should be something like:
val result = ma.multiply(denVecMat, (2, 2, 2))

canal

Merge with Spark 2.0 or 1.6.2

Hello PasaLab,
Thanks for your amazing work.
Could you please update the code so that it works with Spark 2.0, or at least 1.6.2?
Regards

How to use MKL with saury and spark without root account?

After some trial and error, I finally made spark and saury work with MKL on my working cluster without su or sudo (I don't have the root password). Here is the procedure:

Example environment: MKL, spark-1.0.2, saury

Packages needed and how to download them:
blas: > wget http://www.netlib.org/blas/blas.tgz
cblas: > wget http://www.netlib.org/blas/blast-forum/cblas.tgz
netlib-java: > git clone https://github.com/fommil/netlib-java.git

0. prepare ~/lib and ~/include directories in the home directory

mkdir ~/lib
cd ~/lib
ln -s /opt/intel/mkl/lib/intel64/libmkl_rt.so libblas.so.3
ln -s /opt/intel/mkl/lib/intel64/libmkl_rt.so liblapack.so.3
(symbolic link libblas.so.3 and liblapack.so.3 to libmkl_rt.so)
mkdir ~/include
export LD_LIBRARY_PATH=/home/***/lib

1. build netlib BLAS

tar zxvf blas.tgz
cd BLAS/
make all
cp ./blas_LINUX.a ~/lib/blas.a

2. build netlib CBLAS

tar zxvf cblas.tgz
cd CBLAS/
ln -s Makefile.LINUX Makefile.in (this step is required by CBLAS/README, but failed in my installation)
modify Makefile.in:
modify BLLIB, CBLIB, CBDIR (see CBLAS/README for detail)
make all
cp CBLAS/lib/cblas.a ~/lib/
cd CBLAS/include/
cp * ~/include/ (copied cblas_f77.h cblas.h to ~/include/)

3. build netlib-java to get netlib-native_system-linux-x86_64-natives-1.1.jar, jniloader.jar and native_system-java.jar

cd netlib-java/
sed -i "s/1.2-SNAPSHOT/1.1/g" $(grep -rl 1.2-SNAPSHOT .)
mvn package (build may fail, ignore it)
cd native_system/
mvn package (build may fail, ignore it)
cd xbuilds/
mvn package (build may fail, ignore it)
cd linux-x86_64/
mvn package (build may fail, ignore it)
vi target/netlib-native/com_github_fommil_netlib_NativeSystemBLAS.c

line 36
-- #include <cblas.h>
++ #include "/home/***/include/cblas.h"

cd ../../../netlib/JNI/
vi netlib-jni.c

line 2
-- #include <cblas.h>
++ #include "/home/***/include/cblas.h"

cd - (return to linux-x86_64/)
vi pom.xml

line 78, line 79
-- -lblas
-- -llapack
++ -lmkl_rt
line 54 to line 68
delete 15 lines:

com.github.fommil.netlib
generator


blas


lapack


arpack


mvn package (this build should succeed)
cd target/
ls
you should see netlib-native_system-linux-x86_64-natives.jar
cd lib/
ls
you should see jniloader.jar and native_system-java.jar

4. build spark-1.0.2 with the jars we just built
reference: http://apache-spark-user-list.1001560.n3.nabble.com/Native-library-can-not-be-loaded-when-using-Mllib-PCA-td7042.html

(1). build the spark assembly once

./make-distribution.sh -Pnetlib-lgpl

(2). copy jniloader.jar, native_system-java-1.1.jar and netlib-native_system-linux-x86_64-1.1-natives.jar to $SPARK_HOME/lib_managed/natives .

(3). copy netlib-native_system-linux-x86_64-1.1-natives.jar to ~/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-x86_64/jars to replace the existing one. make sure the name is consistent with the original one.

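(4). modify $SPARK_HOME/assembly/pom.xml: add a plugin under build/plugins

<plugin>
  <groupId>com.googlecode.addjars-maven-plugin</groupId>
  <artifactId>addjars-maven-plugin</artifactId>
  <version>1.0.5</version>
  <executions>
    <execution>
      <goals>
        <goal>add-jars</goal>
      </goals>
      <configuration>
        <resources>
          <resource>
            <directory>${basedir}/../lib_managed/natives</directory>
          </resource>
        </resources>
      </configuration>
    </execution>
  </executions>
</plugin>

(5). rebuild spark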

Now you should have spark-1.0.2 calling MKL as its BLAS.
Enjoy it!
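
To confirm that the rebuilt jars are picked up, one option is to ask netlib-java which BLAS backend it selected. The sketch below does this through reflection so that it also runs when netlib-java is not on the classpath; `BlasCheck` and `loadedBlas` are hypothetical names. It relies only on the `com.github.fommil.netlib.BLAS.getInstance()` entry point. A class name ending in NativeSystemBLAS means the system library (here the libblas.so.3 link to libmkl_rt.so) is in use, while F2jBLAS means netlib-java fell back to its pure-Java implementation.

```scala
object BlasCheck {
  /** Class name of the BLAS backend netlib-java selected, or None if netlib-java is not on the classpath. */
  def loadedBlas(): Option[String] =
    try {
      val blasClass = Class.forName("com.github.fommil.netlib.BLAS")
      // getInstance() is static, so the receiver passed to invoke is null.
      val instance = blasClass.getMethod("getInstance").invoke(null)
      Some(instance.getClass.getName)
    } catch {
      case _: ReflectiveOperationException => None
    }

  def main(args: Array[String]): Unit =
    println(loadedBlas().getOrElse("netlib-java not found on classpath"))
}
```

Run it with the same classpath and LD_LIBRARY_PATH that the spark job uses, otherwise the result says little about what the executors will load.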

Performance Test

For matrix multiplication, what is the largest matrix size you can support with your current configuration (one 32G master plus sixteen 24G workers)? Looking forward to your reply!

is it possible to support Complex Double data type ?

Hi,
I am looking into the code to check whether it is feasible to support the Complex Double data type for matrix inverse and multiplication.

I see that you use a couple of external packages:
BLAS: you use dspr, so I presume I can replace it with zspr?
ARPACK: you use dsaupd and dseupd; I cannot find an equivalent method.
Breeze: it supports the Complex data type, so it should be fine, I guess?

What is your assessment or advice on supporting the Complex Double data type?

many thanks
canal
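
One way to experiment before touching the BLAS and ARPACK calls is the standard real block embedding: an n x n complex matrix B + iC behaves, under multiplication and inversion, like the 2n x 2n real matrix [[B, -C], [C, B]]. This is a general linear-algebra technique, not something Marlin provides, and the names below (`ComplexAsReal`, `embed`, `multiply`) are hypothetical. A minimal sketch in plain Scala:

```scala
object ComplexAsReal {
  type Matrix = Array[Array[Double]]

  /** Embeds the n x n complex matrix re + i*im as the 2n x 2n real matrix [[re, -im], [im, re]]. */
  def embed(re: Matrix, im: Matrix): Matrix = {
    val n = re.length
    Array.tabulate(2 * n, 2 * n) { (i, j) =>
      if (i < n && j < n) re(i)(j)          // top-left block: re
      else if (i < n) -im(i)(j - n)         // top-right block: -im
      else if (j < n) im(i - n)(j)          // bottom-left block: im
      else re(i - n)(j - n)                 // bottom-right block: re
    }
  }

  /** Plain real matrix product, standing in for whatever real multiply is available. */
  def multiply(a: Matrix, b: Matrix): Matrix =
    Array.tabulate(a.length, b(0).length) { (i, j) =>
      a(i).indices.map(k => a(i)(k) * b(k)(j)).sum
    }
}
```

For the 1 x 1 case, multiplying the embeddings of 1 + 2i and 3 + 4i gives a matrix whose top-left entry is -5 and whose bottom-left entry is 10, matching (1 + 2i)(3 + 4i) = -5 + 10i. The price is a matrix with four times as many entries, so this is only a way to reuse real-valued routines, not a substitute for native complex support.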

roadmap

Do you have any plans to update the library to build on Spark 1.6?

Inverse of a matrix

Hello,
Looking at the source code, there is a comment in DenseVecMatrix for the inverse method (line 570: "get the inverse of the triangular matrix"). But since LU decomposition is supported, I think we can invert a non-triangular square matrix, right?

Also, are the matrix inverse and multiplication 'out-of-core', meaning the calculation is not limited by the available physical memory? I have a fairly large matrix (1 million x 1 million, double precision).

thank you for sharing the code,
canal
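
For a sense of scale, here is a rough sizing sketch. It assumes dense storage at 8 bytes per double and counts no overhead for indices, JVM objects, or the intermediate copies an LU-based inverse needs; `DenseSizing` is a hypothetical name. It does not answer whether the library spills to disk.

```scala
object DenseSizing {
  /** Bytes needed to hold a dense rows x cols matrix of 8-byte doubles (no overhead counted). */
  def denseBytes(rows: Long, cols: Long): Long = rows * cols * 8L

  /** The same figure in decimal terabytes. */
  def denseTerabytes(rows: Long, cols: Long): Double = denseBytes(rows, cols) / 1e12
}
```

DenseSizing.denseTerabytes(1000000L, 1000000L) comes to 8 TB for a single copy of the 1 million x 1 million matrix, so both the input and its inverse would have to live mostly on disk or across a very large cluster.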
