This is a simple project to set up PySpark with Docker.

To run this project you need Docker and Docker Compose installed on your computer. You can download and install Docker by following this guide: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-20-04-es
The Dockerfile and docker-compose.yml are already provided, so the only thing left to do is build and start the container:

```
docker-compose up --build spark_dev_v
```

If everything went well, you should see a success message in the console.
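For reference, a minimal docker-compose.yml for this kind of setup might look like the sketch below. The service name `spark_dev_v` comes from the build command above; the image, port, and volume path are assumptions, not the actual contents of this repository's compose file:

```yaml
# Hypothetical sketch — adapt to the repository's real Dockerfile and paths.
version: "3.8"
services:
  spark_dev_v:
    build: .                      # builds from the Dockerfile in the repo root
    ports:
      - "8888:8888"               # Jupyter notebook UI (assumed port)
    volumes:
      - ./notebooks:/home/jovyan/work   # mount the notebooks into the container
```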
The project includes three notebooks covering basic operations on DataFrames and RDDs.
If this project was useful to you, you can support it:
- Tell others about this project 📢
- Buy me a beer 🍺 or a coffee ☕
- Say thank you publicly 🤓
- etc.