
spark-ts-examples's Introduction

spark-ts-examples

Description

Examples showing how to use the spark-ts time series library for Apache Spark.

Minimum Requirements

  • Java 1.8
  • Maven 3.0
  • Apache Spark 1.6.0

Using this Repo

Building

We use Maven to build the Java and Scala examples. To compile the code and build the example jar, navigate to the jvm directory and run:

mvn package

Running

To submit one of the Java or Scala examples to a local Spark cluster, run the following command from the jvm directory:

spark-submit --class com.cloudera.tsexamples.Stocks target/spark-ts-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar

You can substitute any of the Scala or Java example classes as the value for the --class parameter.

To submit a Python example, run the following command from the python directory:

spark-submit --driver-class-path PATH/TO/sparkts-0.3.0-jar-with-dependencies.jar Stocks.py

The --driver-class-path parameter value must point to the Spark-TS JAR file, which can be downloaded from the spark-timeseries GitHub repo.
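For scripted or repeated submissions, the two invocations above can be assembled programmatically. Below is a minimal sketch in Python; the jar names and class name are copied from the commands above, while the helper functions themselves are illustrative and not part of this repo:

```python
import shlex

# Jar produced by `mvn package` in the jvm directory (name from the command above).
EXAMPLES_JAR = "target/spark-ts-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar"

def jvm_submit_cmd(example_class, jar=EXAMPLES_JAR):
    """Build the spark-submit argv for a Java/Scala example class."""
    return ["spark-submit", "--class", example_class, jar]

def python_submit_cmd(script, sparkts_jar):
    """Build the spark-submit argv for a Python example; sparkts_jar must
    point at the downloaded Spark-TS jar-with-dependencies."""
    return ["spark-submit", "--driver-class-path", sparkts_jar, script]

if __name__ == "__main__":
    # Print the commands instead of running them (spark-submit may not be on PATH).
    print(" ".join(shlex.quote(a) for a in
                   jvm_submit_cmd("com.cloudera.tsexamples.Stocks")))
    print(" ".join(shlex.quote(a) for a in
                   python_submit_cmd("Stocks.py",
                                     "sparkts-0.3.0-jar-with-dependencies.jar")))
```

Any of the other example classes can be passed to jvm_submit_cmd in place of Stocks.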

spark-ts-examples's People

Contributors

pegli, sryza


spark-ts-examples's Issues

JavaStocks class won't give the right result

spark-submit --class com.cloudera.tsexamples.JavaStocks target/spark-ts-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar

The output for min and max is:
(AAL,NaN)
(AAL,NaN)

The issue may be related to the call to timeSeriesRDDFromObservations. Meanwhile, the Scala version (Stocks.scala) prints the correct result.

Compile error on JavaStocks.java

Hi @sryza, the project does not compile :)

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project spark-ts-examples: Compilation failure: Compilation failure:
[ERROR] /Users/jorge/Downloads/spark-ts-examples/jvm/src/main/java/com/cloudera/tsexamples/JavaStocks.java:[67,61] cannot find symbol
[ERROR] symbol: method javaTimeSeriesRDDFromObservations(com.cloudera.sparkts.DateTimeIndex,org.apache.spark.sql.DataFrame,java.lang.String,java.lang.String,java.lang.String)
[ERROR] location: class com.cloudera.sparkts.api.java.JavaTimeSeriesRDDFactory
[ERROR] /Users/jorge/Downloads/spark-ts-examples/jvm/src/main/java/com/cloudera/tsexamples/JavaStocks.java:[89,42] cannot find symbol
[ERROR] symbol: variable swap
[ERROR] location: variable s of type scala.Tuple2<java.lang.String,java.lang.Double>
[ERROR] /Users/jorge/Downloads/spark-ts-examples/jvm/src/main/java/com/cloudera/tsexamples/JavaStocks.java:[89,48] cannot find symbol
[ERROR] symbol: variable min
[ERROR] location: class org.apache.spark.api.java.JavaRDD<java.lang.Object>
[ERROR] /Users/jorge/Downloads/spark-ts-examples/jvm/src/main/java/com/cloudera/tsexamples/JavaStocks.java:[90,42] cannot find symbol
[ERROR] symbol: variable swap
[ERROR] location: variable s of type scala.Tuple2<java.lang.String,java.lang.Double>
[ERROR] /Users/jorge/Downloads/spark-ts-examples/jvm/src/main/java/com/cloudera/tsexamples/JavaStocks.java:[90,48] cannot find symbol
[ERROR] symbol: variable max
[ERROR] location: class org.apache.spark.api.java.JavaRDD<java.lang.Object>

Some semicolons are missing, and I had to delete some code in my for loop.

java.lang.NoSuchMethodError trying to run example

Hi,

I cloned the repository and ran mvn package. I was able to build the jar, but when I try to run the example from the jvm folder with spark-submit --class com.cloudera.tsexamples.Stocks target/spark-ts-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar, I get the exception below:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.SQLContext.createDataFrame(Lorg/apache/spark/rdd/RDD;Lorg/apache/spark/sql/types/StructType;)Lorg/apache/spark/sql/DataFrame;
	at com.cloudera.tsexamples.Stocks$.loadObservations(Stocks.scala:37)
	at com.cloudera.tsexamples.Stocks$.main(Stocks.scala:46)
	at com.cloudera.tsexamples.Stocks.main(Stocks.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

BusinessDayFrequency error (Python)

Getting this error on BusinessDayFrequency(1, 1, sc):
__init__() takes exactly 3 arguments (4 given)

When I change it to BusinessDayFrequency(1, sc), I get this error:

--> self._jfreq = sc._jvm.com.cloudera.sparkts.BusinessDayFrequency(bdays)
TypeError: 'JavaPackage' object is not callable

Issues with python

Hi,

I am having a couple of issues similar to those posted here before. I am not sure whether this is a spark-timeseries issue or a problem with the examples.
I am running Spark 2.0.0 and a snapshot of spark-timeseries (renamed to 0.4.0 during compile).

The nosetests fail most of the time, only completing without failure every now and then.
nosetest.txt

Also attached is a log (spark-ts.txt) of running the Python example, which returns:

print(dwStats.min())
([nan], u'AAL')
print(dwStats.max())
([nan], u'AAL')

The only other error I can see is:
tickerTsrdd.cache()
MapPartitionsRDD[13] at map at NativeMethodAccessorImpl.java:-2

Any tips for fixing these problems would be appreciated.

Rene

org.threeten which is not available error!

The following error occurs while generating a DateTimeIndex:

uncaught exception during compilation: scala.reflect.internal.Types$TypeError
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in DateTimeIndex.class refers to term extra
in value org.threeten which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling DateTimeIndex.class.

Spark2

Has anybody upgraded to Spark 2 yet?

sc = SparkContext(appName="Stocks") - Spark context creation Error

Hi,
I am using the following to execute the Stocks.py example:
spark-submit --driver-class-path sparkts-0.4.1-jar-with-dependencies.jar Stocks.py
I see the error below; my Spark context is failing. Can you please help?
Traceback (most recent call last):
File "/home/doma.rd/Stocks.py", line 28, in
sc = SparkContext(appName="Stocks")
File "/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in init
File "/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init
File "/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
File "/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in call
File "/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoSuchMethodError: scala.runtime.VolatileByteRef.create(B)Lscala/runtime/VolatileByteRef;
