Comments (32)
Thanks guys. I recently created a tutorial about installing Sedona on Fabric. It will be published on the Sedona website soon:
https://github.com/apache/sedona/blob/master/docs/setup/fabric.md
Found the solution. I placed the Jar files for Sedona in an Azure Blob Storage container, and then set the %%configure magic before running the code above. Here's the doc: https://learn.microsoft.com/en-us/fabric/data-engineering/author-execute-notebook#spark-session-configuration-magic-command
And here's close to what was added to the first cell in a Notebook:
%%configure -f
{
"jars": ["https://xxxxxx.blob.core.windows.net/jars/sedona-spark-shaded-3.0_2.12-1.5.0.jar", "https://xxxxxx.blob.core.windows.net/jars/geotools-wrapper-1.5.0-28.2.jar"]
}
I was then able to set the SedonaContext. Note that adding the libraries under the Workspace libraries did not appear to work. https://learn.microsoft.com/en-us/fabric/data-engineering/environment-manage-library
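The body of the `%%configure` cell above is plain JSON, so it can be generated instead of edited by hand. A minimal sketch, assuming a hypothetical helper named `configure_cell` (not part of Fabric or Sedona); it only builds the text to paste into the first notebook cell, and the `-f` flag is copied from the example above:

```python
import json


def configure_cell(jar_urls):
    """Return the text of a %%configure cell that loads the given jar URLs."""
    jar_urls = list(jar_urls)
    if not jar_urls:
        raise ValueError("at least one jar URL is required")
    # The magic expects a JSON object; "jars" is the key used in the thread.
    body = json.dumps({"jars": jar_urls}, indent=4)
    return "%%configure -f\n" + body


# Example (placeholder storage account, as in the comment above):
# print(configure_cell([
#     "https://xxxxxx.blob.core.windows.net/jars/sedona-spark-shaded-3.0_2.12-1.5.0.jar",
#     "https://xxxxxx.blob.core.windows.net/jars/geotools-wrapper-1.5.0-28.2.jar",
# ]))
```

The output still has to be run as the first cell of the notebook; the helper does not configure the session itself.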
@rovin-ms You can compile the Sedona doc locally (https://sedona.apache.org/1.5.1/setup/compile/#compile-the-documentation):
Install libs
pip install mkdocs
pip install mkdocs-material
pip install mkdocs-macros-plugin
pip install mkdocs-git-revision-date-localized-plugin
pip install mike
Run:
mkdocs serve
You can add a page here: https://github.com/apache/sedona/tree/master/docs/setup and then add it to mkdocs.yml. It will then show up on the website like the tutorials for Databricks and EMR: https://sedona.apache.org/1.5.1/setup/emr/
I can confirm that this is an issue in Fabric. It is quite annoying that this has to be set at the session level every time, which requires a restart of the Spark pool.
To those who were able to resolve the issue of using Apache Sedona in Microsoft Fabric: could you please lay out the steps in sequential order so that others can follow them properly?
Thanks in advance,
Regards
Adil
I think rovin-ms described how he did it. I followed the same steps and it works. The downside is that it adds about 2-4 minutes to computations.
I was not clear on where the jar files need to be placed.
I have them in blob storage, and I am accessing them over HTTPS just like rovin-ms. You could just as well host them on GitHub or anywhere else, because that would work too.
Thanks for the response. I will host those files on Azure Blob Storage with the help of my company's Azure specialist, and I will get back to you if I run into any issues.
https://jar-download.com/download-handling.php
Is this the right site to download the jar files from?
I think you should use one of the Maven repositories and match your Scala and Spark versions: https://mvnrepository.com/search?q=sedona
Sorry, I am a novice in this area and am not able to find the respective jar files.
Also, is there a specific folder name, along with access rights, that needs to be set up?
Hi @adild2k , below are the Sedona jars you will need:
Spark 3.0 to 3.3 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/1.5.1/
Spark 3.4 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.4_2.12/1.5.1/
Spark 3.5 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.5_2.12/1.5.1/
In addition, you will need geotools-wrapper: https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.1-28.2/
For example, sedona-spark-shaded-3.5_2.12-1.5.1.jar
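The directory listing above follows the regular Maven Central layout, so the full jar URLs can be derived from the Spark, Scala, and Sedona versions. A rough sketch, assuming hypothetical helper names (`sedona_spark_suffix`, `sedona_jar_urls`) and covering only the Spark 3.0 to 3.5 combinations listed above:

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def sedona_spark_suffix(spark_version):
    """Map a Spark version such as '3.3.1' to the Spark tag in the artifact name."""
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    if major != 3 or not 0 <= minor <= 5:
        raise ValueError(f"Spark {spark_version} is not covered by this sketch")
    # Spark 3.0 to 3.3 share the "3.0" artifact; 3.4 and 3.5 have their own.
    return "3.0" if minor <= 3 else f"3.{minor}"


def sedona_jar_urls(spark_version, sedona_version="1.5.1",
                    scala_version="2.12", geotools_version="1.5.1-28.2"):
    """Return [sedona-spark-shaded URL, geotools-wrapper URL] on Maven Central."""
    artifact = f"sedona-spark-shaded-{sedona_spark_suffix(spark_version)}_{scala_version}"
    sedona_url = (f"{MAVEN_CENTRAL}/org/apache/sedona/{artifact}/"
                  f"{sedona_version}/{artifact}-{sedona_version}.jar")
    geotools_url = (f"{MAVEN_CENTRAL}/org/datasyslab/geotools-wrapper/"
                    f"{geotools_version}/geotools-wrapper-{geotools_version}.jar")
    return [sedona_url, geotools_url]
```

The default versions are the ones mentioned in this comment; other releases may use different artifact names, so check the repository listing before relying on the generated URL.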
Thanks again for the helpful response.
Also, is there a specific folder name in Azure Blob Storage where these files need to be placed, and which access rights need to be granted?
The jars do not have to be stored in Azure Blob Storage, but if you store them there, make sure that the container and folder you are pointing to are reachable without any kind of authentication. The %%configure magic command only works in that form if the URLs are reachable without authentication.
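One way to check the "reachable without authentication" requirement before starting a session is to send a plain, credential-free request to each jar URL from any machine with Python. A minimal sketch using only the standard library; the function names are hypothetical, and treating any non-2xx answer as "not public" is an assumption of this sketch:

```python
import urllib.error
import urllib.request


def is_success(status_code):
    """True only for a 2xx status; anything else is treated as not public."""
    return 200 <= status_code < 300


def jar_is_public(url, timeout=10):
    """Send an unauthenticated HEAD request and report whether it succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_success(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 403 or 404).
        return is_success(err.code)
    except OSError:
        # DNS failure, refused connection, timeout, and similar network errors.
        return False
```

If `jar_is_public(...)` returns False for a URL, the same URL will most likely fail inside the `jars` list as well.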
Thanks for the response. I have saved the files in the respective folder on Azure; see the snapshot below. After that, I need to follow the step mentioned above. Is my understanding right?
%%configure -f
{
"jars": ["https://xxxxxx.blob.core.windows.net/jars/sedona-spark-shaded-3.0_2.12-1.5.0.jar", "https://xxxxxx.blob.core.windows.net/jars/geotools-wrapper-1.5.0-28.2.jar"]
}
Run it inline in a notebook cell. It will take some time to fire up the new Spark cluster.
Please check the snapshot below. What should be the next step?
Follow the tutorial on the official website for creating the Sedona context.
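For reference, creating the context in a notebook usually comes down to one call on the existing Spark session. A minimal sketch, assuming the `apache-sedona` Python package is installed, the jars were loaded by `%%configure`, and the notebook already provides a session named `spark`; the wrapper name `create_sedona` is hypothetical:

```python
def create_sedona(spark):
    """Wrap an existing SparkSession in a SedonaContext."""
    # Imported lazily so that a missing apache-sedona package shows up here
    # as a clear ImportError rather than when the module is loaded.
    from sedona.spark import SedonaContext

    return SedonaContext.create(spark)


# In a notebook cell, after the %%configure cell has run:
# sedona = create_sedona(spark)
# sedona.sql("SELECT ST_Point(1.0, 2.0) AS geom").show()
```

If the import succeeds but the SQL call fails, the Python package is present and the problem is more likely on the jar side.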
Sorry, I am a novice in this area. Please check the snapshot. I know there is something missing, but I am not able to figure out what.
You have to add the Python libraries as well: apache-sedona, geopandas (0.11, I think), keplergl, and pydeck.
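A quick way to see which of these Python libraries are actually importable in the current session is to ask the import system directly. A small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and note that the `apache-sedona` package is imported as `sedona`:

```python
import importlib.util

# Import names of the packages listed above (apache-sedona installs "sedona").
REQUIRED_MODULES = ["sedona", "geopandas", "keplergl", "pydeck"]


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# In a notebook cell:
# print(missing_modules())
```

An empty list only shows that the Python side is in place; the jars still have to be loaded separately.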
I had already installed keplergl and pydeck earlier, along with geopandas. It is still giving the same error.
What about the apache-sedona Python module? Have you installed that as well?
Yes, of course. That was the first one I installed.
Still the same error. Any suggestions or pointers?
By the way, I forwarded a ticket to Microsoft regarding this. It is an environment problem that they should fix. We shouldn't have to use any workarounds.
@jiayuasu Isn't it better to refer to the jar files directly from the Maven repo? I did something like this, so I don't need to host the jar files on remote storage.
%%configure -f
{
"jars": ["https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.1-28.2/geotools-wrapper-1.5.1-28.2.jar", "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/1.5.1/sedona-spark-shaded-3.0_2.12-1.5.1.jar"]
}
@robertnagy1 Good point. But does Microsoft Fabric always have internet access? Could some users intentionally disable it for security purposes?
I guess having the jars on a remote abfss location requires internet as well, as does accessing Fabric, and so do the Spark clusters. The Lakehouse (as far as I know, but I might be wrong) is an abstraction layer above Microsoft OneLake, which is separate from the Spark instances. I think it is safe to assume that Fabric requires internet to work.
Makes sense. Will update the doc accordingly.