sindbach / mongodb-spark-docker Goto Github PK
View Code? Open in Web Editor NEWAn example of docker compose to set up a single Spark node connecting to MongoDB via Spark Connector
An example of docker compose to set up a single Spark node connecting to MongoDB via Spark Connector
This project has helped me a lot, but when I use "apt install", I have no permission, because I am not “root” user. So I want to know the password for the container. Do you remember the password you set several years ago?
Hi, I am using mongdb and spark now.
Is it very inefficient that using mapreduce in a mongodb cluster to do data analysis, even mongodb is using Google V8 engine? Should I use spark and mongo-spark-connector to analyze data if I have the resources to maintain a Spark cluster?
Thank you very much.
Hi @sindbach,
I'm trying to run the scala files placed in the repository and also the spark container. Whenever I run the following command, it doesn't work as expected:
spark-shell --conf "spark.mongodb.input.uri=mongodb://mongodb:27017/spark.times" --conf "spark.mongodb.output.uri=mongodb://mongodb/spark.output" --packages org.mongodb.spark:mongo-spark-connector_${SCALA_VERSION}:${MONGO_SPARK_VERSION} -i ./initDocuments.scala
and I face with this error:
Ivy Default Cache set to: /home/ubuntu/.ivy2/cache
The jars for the packages stored in: /home/ubuntu/.ivy2/jars
:: loading settings :: url = jar:file:/home/ubuntu/spark-2.4.3-bin-hadoop2.6/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.mongodb.spark#mongo-spark-connector_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-d0f95242-e9b9-4d49-8dde-42afc7c55e9a;1.0
confs: [default]
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
:: resolution report :: resolve 40879ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
Host repo1.maven.org not found. url=https://repo1.maven.org/maven2/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.pom
Host repo1.maven.org not found. url=https://repo1.maven.org/maven2/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.jar
Host dl.bintray.com not found. url=https://dl.bintray.com/spark-packages/maven/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.pom
Host dl.bintray.com not found. url=https://dl.bintray.com/spark-packages/maven/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.jar
module not found: org.mongodb.spark#mongo-spark-connector_2.11;2.2.0
==== local-m2-cache: tried
file:/home/ubuntu/.m2/repository/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.pom
-- artifact org.mongodb.spark#mongo-spark-connector_2.11;2.2.0!mongo-spark-connector_2.11.jar:
file:/home/ubuntu/.m2/repository/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.jar
==== local-ivy-cache: tried
/home/ubuntu/.ivy2/local/org.mongodb.spark/mongo-spark-connector_2.11/2.2.0/ivys/ivy.xml
-- artifact org.mongodb.spark#mongo-spark-connector_2.11;2.2.0!mongo-spark-connector_2.11.jar:
/home/ubuntu/.ivy2/local/org.mongodb.spark/mongo-spark-connector_2.11/2.2.0/jars/mongo-spark-connector_2.11.jar
==== central: tried
https://repo1.maven.org/maven2/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.pom
-- artifact org.mongodb.spark#mongo-spark-connector_2.11;2.2.0!mongo-spark-connector_2.11.jar:
https://repo1.maven.org/maven2/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.jar
==== spark-packages: tried
https://dl.bintray.com/spark-packages/maven/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.pom
-- artifact org.mongodb.spark#mongo-spark-connector_2.11;2.2.0!mongo-spark-connector_2.11.jar:
https://dl.bintray.com/spark-packages/maven/org/mongodb/spark/mongo-spark-connector_2.11/2.2.0/mongo-spark-connector_2.11-2.2.0.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: org.mongodb.spark#mongo-spark-connector_2.11;2.2.0: not found
::::::::::::::::::::::::::::::::::::::::::::::
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.mongodb.spark#mongo-spark-connector_2.11;2.2.0: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1306)
at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:315)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
How can I solve it to work properly?
Thanks,
Mostafa
mongo container has start
I use command like : sudo docker start mongodb-spark-docker_spark_1
mongodb-spark-docker_spark_1 just can not start or stop after start soon
why?
thanks
Follow readme help me out, thanks
Cannot Build the Spark Dockerfile
The URL to Download gives 404
mongodb-spark-docker/spark/Dockerfile
Line 32 in 21f4f95
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.