sparkts's Introduction

sparkts

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: MIT.

The goal of sparkts is to provide a test bed of sparklyr extensions for the spark-ts framework, which was itself modified from the spark-timeseries framework.

Installation

You can install sparkts from GitHub with:

# install.packages("devtools")
devtools::install_github("nathaneastwood/sparkts")

For details on how to set up for further developing the package, please see the development vignette.

Example

This is a basic example which shows you how to calculate the standard error for some time series data:

library(sparkts)
library(sparklyr)  # provides spark_read_json(), spark_dataframe(), spark_disconnect() and %>%

# Set up a spark connection
sc <- sparklyr::spark_connect(
  master = "local",
  version = "2.2.0",
  config = list(sparklyr.gateway.address = "127.0.0.1")
)

# Extract some data
std_data <- spark_read_json(
  sc,
  "std_data",
  path = system.file(
    "data_raw/StandardErrorDataIn.json",
    package = "sparkts"
  )
) %>%
  spark_dataframe()

# Call the method
p <- sdf_standard_error(
  sc = sc, data = std_data,
  x_col = "xColumn", y_col = "yColumn", z_col = "zColumn",
  new_column_name = "StandardError"
)

p %>% dplyr::collect()
# # A tibble: 8 x 5
#   ref       xColumn yColumn zColumn StandardError
#   <chr>     <chr>     <dbl>   <dbl>         <dbl>
# 1 000000000 200        120.     10.          10.6
# 2 111111111 300        220.     20.          14.1
# 3 222222222 400        320.     30.          16.8
# 4 333333333 500        420.     40.          19.1
# 5 444444444 600        520.     53.          22.4
# 6 555555555 700        620.     60.          22.9
# 7 666666666 800        720.     70.          24.6
# 8 777777777 900        820.     80.          26.2

# Disconnect from the spark connection
spark_disconnect(sc = sc)

sparkts's People

Contributors

nathaneastwood, vidhyamanisankar

sparkts's Issues

sdf_melt doesn't currently work

Calling sdf_melt as follows fails:

melt_data <- sparklyr::spark_read_json(
  sc,
  "melt_data",
  path = system.file(
    "data_raw/Melt.json",
    package = "sparkts"
  )
) %>%
  sparklyr::spark_dataframe()
p <- sdf_melt$new(sc = sc, data = melt_data)
p$melt_1(
  id_variables = list("identifier", "date"),
  value_variables = list("two", "one", "three", "four"),
  variable_name = "variable",
  value_name = "turnover"
)

This returns the following error:

Error: com.ons.sml.businessProcesses.ONSRuntimeException: Missing Columns Detected
	at com.ons.sml.businessMethods.impl.BaseImpl$BaseMethodsImpl.checkColNames(BaseImpl.scala:19)
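The stack trace points at a column-name check, so one way to narrow this down is to compare the names passed in id_variables and value_variables with the columns the data actually has. The following is a minimal sketch in base R, not a fix: `missing_columns` is a hypothetical helper that is not part of sparkts, and the `available` names below are made up for illustration rather than read from Melt.json.

```r
# Hypothetical helper (not part of sparkts): report which requested
# column names are absent from the columns that are actually available.
missing_columns <- function(available, requested) {
  # Accept lists as well as character vectors, since the call above uses list()
  available <- as.character(unlist(available))
  requested <- as.character(unlist(requested))
  setdiff(requested, available)
}

# Illustrative names only (assumption); the real ones come from the data.
available <- c("identifier", "date", "one", "two", "three")
requested <- list("identifier", "date", "two", "one", "three", "four")

missing_columns(available, requested)
# [1] "four"
```

With a live connection, the available names could probably be fetched with sparklyr::invoke(melt_data, "columns"), which calls Spark's Dataset.columns on the underlying data frame. An empty result from the helper would suggest the problem lies in how the arguments reach the Scala side rather than in the names themselves.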

sdf_lag doesn't currently work

Running sdf_lag returns the error

Error: java.lang.Exception: No matched method found for class 

This message is misleading: the underlying cause is that the arguments are passed with the wrong type. Passing an R list to Scala via sparklyr gives a Scala ArrayType, not the Scala List that the method needs. We need to fix this on the Scala side. See this documentation for more details.
