scala_on_databricks's Introduction

Welcome to request_with_scala

This is an educational project that explores how to use Scala in Databricks to consume an API and process the response with Spark and native Scala solutions.

link to the solution in Databricks --> here <--

The project uses a Reddit API that returns the top 50 stocks discussed in the Wallstreetbets subreddit over the last 15 minutes, including a sentiment analysis of the discussions. Documentation is available here.

For this project, DBR 13.3 was used*, and the following environment variable was set on the cluster so that it runs JDK 11:

JNAME=zulu11-ca-amd64

This JVM is needed to make java.net.http.HttpRequest** available in Databricks, which the request libraries described below require:

| Scala package | Description | Maven coordinates | Reference |
|---|---|---|---|
| sttp.client3 | Scala library that provides HTTP request and response handlers. | com.softwaremill.sttp.model:core_2.12:1.7.10 | Documentation |
| sttp.model | Provides HTTP models such as headers, URIs, methods, etc. Required for sttp.client. | com.softwaremill.sttp.tapir:tapir-sttp-client_2.12:1.10.6 | Documentation |

*DBR 11.3 through 14.3 were tested, and no incompatibility is expected.

**Error found: BootstrapMethodError: java.lang.NoClassDefFoundError: java/net/http/HttpRequest. The solution can be found here.
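
Before installing the libraries, it can help to confirm that the cluster really runs JDK 11. The check below is a minimal sketch meant for a Scala notebook cell; the value names `javaVersion` and `httpRequestAvailable` are illustrative and not part of the project.

```scala
import scala.util.Try

// Version string reported by the running JVM (starts with "11" on JDK 11).
val javaVersion: String = System.getProperty("java.version")

// java.net.http.HttpRequest was added in Java 11, so trying to load it
// shows directly whether the NoClassDefFoundError above would occur.
val httpRequestAvailable: Boolean =
  Try(Class.forName("java.net.http.HttpRequest")).isSuccess

println(s"Java version: $javaVersion, java.net.http available: $httpRequestAvailable")
```

If `httpRequestAvailable` is false, the JNAME variable has most likely not been applied yet, for example because the cluster was not restarted after the change.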

A class that handles the sttp.client Response, with the attributes:

  • client: A SimpleHttpClient instance from sttp.client, used to execute the request.
  • requestEndpoint: The endpoint provided.
  • successStatusCode: The 200 status code.

And the methods:

  • getResponse: Returns the GET response from the endpoint provided.
  • checkRequestStatusCode: Raises an exception if the response status code is different from 200.
  • transformResponseToDataframe: Returns a Spark DataFrame if the request was successful.
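
The attributes and methods listed above could fit together roughly as follows. This is only a sketch under stated assumptions, not the project's actual code: the class name `RequestHandler`, the constructor parameters, the method signatures, and the choice to parse the body with `spark.read.json` are all hypothetical, and it assumes the response body is JSON.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import sttp.client3.{Response, SimpleHttpClient, basicRequest}
import sttp.model.Uri

// Hypothetical class name and constructor; only the member names come from the list above.
class RequestHandler(val requestEndpoint: String, spark: SparkSession) {

  // Synchronous sttp client used to execute the request.
  val client: SimpleHttpClient = SimpleHttpClient()

  // The only status code treated as success.
  val successStatusCode: Int = 200

  // Sends a GET request to the endpoint; the body is Left(error) or Right(content).
  def getResponse: Response[Either[String, String]] =
    client.send(basicRequest.get(Uri.unsafeParse(requestEndpoint)))

  // Throws if the response status code is anything other than 200.
  def checkRequestStatusCode(response: Response[Either[String, String]]): Unit =
    if (response.code.code != successStatusCode)
      throw new RuntimeException(
        s"Request to $requestEndpoint failed with status ${response.code.code}")

  // Validates the status code, then lets Spark infer a schema from the JSON body.
  def transformResponseToDataframe(response: Response[Either[String, String]]): DataFrame = {
    checkRequestStatusCode(response)
    import spark.implicits._
    val body = response.body.fold(error => throw new RuntimeException(error), identity)
    spark.read.json(Seq(body).toDS())
  }
}
```

In a Databricks notebook, where `spark` is already defined, usage might look like `val handler = new RequestHandler(endpoint, spark)` followed by `handler.transformResponseToDataframe(handler.getResponse)`, with `endpoint` being the API URL taken from the documentation.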

  1. Upload request_with_scala.dbc or request_with_scala.scala to your Databricks workspace;
  2. Install the packages listed in Cluster Configs;
  3. Open a PR with your improvements!

  • Use the tables for a logistic regression model.
  • Make a star schema with the current layers.

scala_on_databricks's People

Contributors

calilisantos