
Comments (6)

carrodher commented on August 16, 2024

The issue may not be directly related to the Bitnami container image or Helm chart, but rather to how the application is being utilized or configured in your specific environment.

Having said that, if you think that's not the case and are interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

If you have any questions about the application itself, customizing its content, or questions about technology and infrastructure usage, we highly recommend that you refer to the forums and user guides provided by the project responsible for the application or technology.

With that said, we'll keep this ticket open until the stale bot automatically closes it, in case someone from the community contributes valuable insights.

from charts.

kayvansol commented on August 16, 2024

I also tested the code above with Docker Compose and the Bitnami image, and the result was the same failure when creating the *.parquet file:

csv read success:

readsuccess

parquet file creation failure:

parquetErr

docker-compose.yml :

version: '3.6'

services:

  spark:
    container_name: spark
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark   
    ports:
      - 127.0.0.1:8081:8080
    

  spark-worker:
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=2G
      - SPARK_WORKER_CORES=2
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=spark

docker run :

docker-compose up --scale spark-worker=2

ctp.py :

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WritingParquet").getOrCreate()

df = spark.read.option("header", True).csv("csv/file.csv")

df.show()

df.write.mode('overwrite').parquet("a.parquet")

spark submit :

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://35368355157f:7077 csv/ctp.py

Please help me. 👍
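
A relative output path such as "a.parquet" leaves it implicit which directory, and on which container's filesystem, the files end up in. The sketch below is only an illustration and not a confirmed fix for this issue: `to_file_uri` is a hypothetical helper, and `/shared/out` is an assumed directory that would have to be mounted at the same path in the master and in every worker container.

```python
from pathlib import PurePosixPath


def to_file_uri(path: str, base_dir: str) -> str:
    """Return an absolute file:// URI for `path`, resolved against `base_dir`.

    `base_dir` must be an absolute POSIX path (the containers are Linux).
    """
    base = PurePosixPath(base_dir)
    if not base.is_absolute():
        raise ValueError("base_dir must be an absolute path")
    # Joining with an already-absolute `path` keeps `path` unchanged.
    return (base / path).as_uri()


# Hypothetical usage inside ctp.py ("/shared/out" is an assumption):
#   df.write.mode("overwrite").parquet(to_file_uri("a.parquet", "/shared/out"))
```

With the URI spelled out, it is at least unambiguous which directory to inspect in each container after the job finishes.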


kayvansol commented on August 16, 2024

I also created a discussion in the Kubernetes community (link).

kayvansol commented on August 16, 2024

I tested Python code that saves a DataFrame in JSON format, but the result was the same problem I mentioned before:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WritingJson").getOrCreate()

df2 = spark.createDataFrame([(1, "Alice", 10),
                            (2, "Bob", 20),
                            (3, "Charlie", 30)], 
                            ["id", "name", "age"])


df2.show()

df2.write.mode('overwrite').json('file_name.json')

jsonErr

Please say something helpful.

kayvansol commented on August 16, 2024

With the Scala shell (spark-shell), everything works:

val df = spark.read.csv("csv/file.csv")

df.write.mode("overwrite").format("json").save("file_name.json")

jsonScala

jsonScalaFile

But with PySpark and a Python script run through spark-submit, the output file is not found!
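
When an output "file" seems to be missing, it can help to look at what is actually on disk: by default Spark writes a directory with that name, containing `part-*` files and a `_SUCCESS` marker. The helper below is a minimal sketch using only the standard library; `describe_spark_output` is a hypothetical name, and it only inspects the filesystem of the machine or container it runs in, so it would have to be run in the driver and in each worker to compare them.

```python
from pathlib import Path


def describe_spark_output(output_dir: str) -> dict:
    """Summarise a Spark output directory on the local filesystem."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing was written here, at least not on this machine.
        return {"exists": False, "success_marker": False, "part_files": []}
    # Data files written by Spark tasks are named part-... by default.
    part_files = sorted(p.name for p in root.iterdir() if p.name.startswith("part-"))
    return {
        "exists": True,
        "success_marker": (root / "_SUCCESS").is_file(),
        "part_files": part_files,
    }


# Example: print(describe_spark_output("file_name.json"))
```

A directory that exists with a `_SUCCESS` marker but no `part-*` files on one container, and `part-*` files on another, would suggest the data was written somewhere other than where it is being looked for.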

kayvansol commented on August 16, 2024

I tested Java code that saves a DataFrame in JSON format, but the result was the same problem I mentioned before:

Javacsvread

JavacsvreadSchema

JavacsvreadNoFile

package arka;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ctjson {

	public static void main(String[] args) {

		SparkSession SPARK_SESSION = SparkSession.builder().appName("Mahla ctjson")
				.master("spark://6fe9e36ddaa9:7077")
				.getOrCreate();

		Dataset<Row> df = SPARK_SESSION.read().option("inferSchema", "true")
				.option("header", "true")
				.csv("csv/file.csv");

		df.show();

		df.printSchema();
		
		df.write().mode("overwrite").format("json").save("file_name.json");
		
	}
}

pom.xml :

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.mahla</groupId>
	<artifactId>arka</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<name>csvtojson</name>

	<dependencies>

		<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-core_2.12</artifactId>
			<version>3.5.1</version>
		</dependency>

		<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-sql_2.12</artifactId>
			<version>3.5.1</version>
			<scope>provided</scope>
		</dependency>
		
	</dependencies>

</project>

jar file :
ctj.zip

submit command :

./bin/spark-submit --class arka.ctjson --master spark://6fe9e36ddaa9:7077 csv/ctj.jar

Could you please check the issue?
