databricks / benchmarks
A place in which we publish scripts for reproducible benchmarks.
License: Other
Please publish configs for Presto as you did for Spark.
A question about the pandas vs. PySpark benchmark that you ran in standalone mode.
I got curious: can this setup be used in production?
How did you set the number of workers (slaves)?
Executor cores?
Executor memory?
Number of executors?
And if I run Spark on Docker, will performance be better, or is it best with Spark on bare metal?
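For readers with the same question, here is a minimal sketch of how these settings are commonly passed to `spark-submit` on a standalone cluster. It is not the configuration used in the benchmark, which the repository does not state. The master URL, the application file name `benchmark.py`, and all numeric values are placeholders. The property names `spark.executor.cores`, `spark.executor.memory`, and `spark.cores.max` are standard Spark configuration keys. In standalone mode the executor count is not set directly; it roughly follows from the total core cap divided by the cores per executor, assuming the workers have enough free cores and memory.

```python
from typing import List


def build_spark_submit_args(
    master_url: str,
    executor_cores: int,
    executor_memory: str,
    total_executor_cores: int,
    app: str,
) -> List[str]:
    """Build a spark-submit argument list for a standalone cluster.

    All values are supplied by the caller; nothing here reflects the
    settings actually used in the published benchmark.
    """
    return [
        "spark-submit",
        "--master", master_url,
        # Cores assigned to each executor.
        "--conf", f"spark.executor.cores={executor_cores}",
        # Heap memory per executor, e.g. "8g".
        "--conf", f"spark.executor.memory={executor_memory}",
        # Upper bound on the total cores the application may use.
        "--conf", f"spark.cores.max={total_executor_cores}",
        app,
    ]


def approx_executor_count(total_executor_cores: int, executor_cores: int) -> int:
    """Rough executor count in standalone mode (assumes enough worker resources)."""
    return total_executor_cores // executor_cores


if __name__ == "__main__":
    # Placeholder values for illustration only.
    args = build_spark_submit_args(
        "spark://master-host:7077", 4, "8g", 16, "benchmark.py"
    )
    print(" ".join(args))
    print("approximate executors:", approx_executor_count(16, 4))
```

Whether Docker or bare metal performs better depends on the environment, so this sketch does not address it.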
Would Databricks provide a public S3 bucket containing pre-generated TPC-DS data?
Thanks.
More specifically:
All sounds fair, at least for Flink and Spark. We will note that the fact that Spark and Flink use in-memory data generation and KStreams goes through Kafka is itself a bias against KStreams.
We executed the notebook on a Databricks cluster with version 6.2 (includes Apache Spark 2.4.4, Scala 2.11), and ran into the following problem on command 10:
Wrote 162 bytes.
executing command - kafka/bin/kafka-topics.sh --delete --topic output --zookeeper 1212-215524-fry930-10-172-248-143:2181 on host: 1212-215524-fry930-10-172-248-143
bash: kafka/bin/kafka-topics.sh: No such file or directory
FAILED: command - kafka/bin/kafka-topics.sh --delete --topic output --zookeeper 1212-215524-fry930-10-172-248-143:2181 on host: 1212-215524-fry930-10-172-248-143
executing command - kafka/bin/kafka-topics.sh --create --topic output --partitions 1 --replication-factor 1 --zookeeper 1212-215524-fry930-10-172-248-143:2181 on host: 1212-215524-fry930-10-172-248-143
bash: kafka/bin/kafka-topics.sh: No such file or directory
FAILED: command - kafka/bin/kafka-topics.sh --create --topic output --partitions 1 --replication-factor 1 --zookeeper 1212-215524-fry930-10-172-248-143:2181 on host: 1212-215524-fry930-10-172-248-143
java.lang.RuntimeException: Command failed
It seems that the system cannot find the file kafka/bin/kafka-topics.sh for some reason.
Is there anything we can do to fix this? Thanks in advance.