👋 Hello, welcome to my profile.

I'm a passionate Big Data Engineer with 4+ years of hands-on experience in designing and implementing robust data solutions. My goal is to leverage the power of data to drive meaningful insights and solve complex business challenges.

💡 Expertise:
In my journey as a Big Data Engineer, I have honed my skills in:

🔹 Big Data Technologies: I have a strong command of Hadoop, Spark, and their ecosystems. I specialize in building scalable data pipelines, processing large datasets, and tuning jobs for performance.

🔹 Programming Languages: I am proficient in Python, SQL and Spark, using them to develop data-centric applications, perform data analysis, and build machine learning models.

🔹 Data Warehousing: I have hands-on experience with data warehousing principles, including data modeling, ETL (Extract, Transform, Load) processes, and dimensional modeling. I am well-versed in designing and implementing data warehouses for improved data accessibility and reporting.

🔹 Database Management: I have a strong grasp of SQL and have worked extensively with both relational databases (MySQL, PostgreSQL) and NoSQL databases (MongoDB, Cassandra). I excel at writing complex queries, optimizing database performance, and ensuring data integrity.

🔹 Cloud Platforms: I am adept at working with cloud-based environments, particularly on AWS and Azure.

🔹 Data Visualization: I possess a keen eye for visualizing data insights and effectively communicating complex findings to stakeholders. I am skilled in using tools like Tableau and Power BI to create intuitive dashboards and reports.

⚒️ Skills:

🧑‍💻 Programming Languages:
Python | SQL | Spark

⛓️ Distributed Frameworks:
Spark | Hadoop | Hive | Kafka | Sqoop

💾 Databases:
MySQL | MongoDB | Cassandra | HBase

🧬 Version Control:
Git | DVC

⏰ Workflow Management:
Airflow | Mage

☁️ AWS Services:
S3 | EC2 | EMR | RDS | Redshift | Glue | CloudWatch | ECS

☁️ Azure Services:
Data Factory | Databricks | Functions | Blob | Synapse | Delta Lake

🚀 MLOps:
Docker | Docker Compose | GitHub Actions | MLflow

🪄 ML Frameworks:
Pandas | NumPy | scikit-learn | PySpark | PyTorch | Matplotlib | Seaborn | TFX

🌐 Socials:

LinkedIn | Topmate | Resume | Portfolio



Vishal Singh's Projects

opencv

Open Source Computer Vision Library

openrefine

OpenRefine is a free, open source power tool for working with messy data and improving it

optimum

🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools

opyrator

🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.

overwatch

Capture deep metrics on one or all assets within a Databricks workspace

paddleocr

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

pedalboard

🎛 🔊 A Python library for working with audio.

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

pezzo

Pezzo is an open-source AI development toolkit designed to streamline prompt design, version management, publishing, collaboration, troubleshooting, observability and more.

phoenix

ML Observability in a Notebook - Uncover Insights, Surface Problems, Monitor, and Fine Tune your Generative LLM, CV and Tabular Models

pipenv

Python Development Workflow for Humans.

ploomber

The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

poetry

Python packaging and dependency management made easy

polyaxon

MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle

poniard

Streamline scikit-learn model comparison.

postgresml

The GPU-powered AI application database. Get your app to market faster using the simplicity of SQL and the latest NLP, ML + LLM models.

pqdm

Comfortable parallel TQDM using concurrent.futures

pragyan

A simple and fast multiuser content management system to organize collaborative web-content. This CMS allows very fine user&group permissions, generating pages like articles, forms, quizzes, forums, etc, search powered by sphider.
