
llm-rag's Introduction

This image, generated with DALL-E, depicts a wide Moroccan landscape where ancient ruins and modern AI structures blend, symbolizing the harmony between the past and the future.

😄 About Me:


  • 🌱 Hello, I'm Saad, a 23-year-old based in France, with a deep passion for creating projects in the realms of Data and Artificial Intelligence.
  • 🎓 I hold a Data Engineering degree from INPT.
  • 💼 Currently working as a Machine Learning Engineering Apprentice at AXA - Direct Assurance.
  • 📚 I'm also preparing for a Master's degree in Machine Learning and Data Science at Paris Cité University.

🏅 Certifications: (5x Azure Certified)

  • Azure Data Engineer
  • Azure Data Scientist
  • Azure Data Fundamentals
  • Azure AI Fundamentals
  • Azure Fundamentals

📚 Contributions:

Contributed to repackaging and updating the GIT Clustering algorithm 🔄 based on insights from an arXiv paper, with the implementation available in the GitHub repository 📂 and distribution through the TestPyPI Package 📦.

💼 Work Experience:

  • Machine Learning Engineer / Data Scientist Apprenticeship at AXA - Direct Assurance, Paris, France (Ongoing) More details
  • Data Engineer / Data Scientist Internship at Chefclub, Paris, France (6 months) More details
  • Data Engineer Intern at Capgemini Engineering, Casablanca, Morocco (2 months)
  • Data Scientist Intern at AIOX Labs, Rabat, Morocco (2 months)
  • Web/Backend Developer Intern at DXC Technologies, Rabat, Morocco (2 months)

🌟 Top 4 Repositories

1. LLM RAG - Streamlit RAG Language Model App 🤖

Description: A Streamlit application that uses Retrieval-Augmented Generation (RAG) with a Large Language Model (LLM) 🤖 and FAISS indexing 🗃️ to answer questions from uploaded Markdown files. Users can upload documents 📝, enter queries, and receive contextually relevant answers retrieved through Similarity Search 🔍, showcasing a practical application of NLP technologies 🤖. The project also includes a CI/CD pipeline 🔄 that runs code-quality checks and tests and simplifies deployment, and it supports containerization with Docker 🐳 for easy distribution.

  • Technologies/Tools: Streamlit, OpenAI API Models (LLMs, Embedding Models), FAISS, Python, Docker, CI/CD (GitHub Actions), Makefile, venv.
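
The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the toy word-count `embed` function stands in for an OpenAI embedding model, the brute-force ranking in `top_k` stands in for a FAISS index, and all function names and sample chunks are hypothetical.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumed stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (brute force, no real index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, as if split from uploaded Markdown files.
chunks = [
    "Kedro pipelines are built from nodes and datasets.",
    "FAISS stores vectors and supports nearest-neighbour search.",
    "Docker images package an application with its dependencies.",
]
question = "How does FAISS search vectors?"
prompt = build_prompt(question, top_k(question, chunks, k=1))
print(prompt)
```

In a real RAG setup the prompt built here would be sent to the LLM, and the embedding and search steps would be handled by the embedding model and the vector index rather than by word counts.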

2. Kedro Energy Forecasting Machine Learning Pipeline 🏯

Description: A showcase of MLOps best practices using Kedro 🛠️, this repository follows Machine Learning models from development to deployment 🚀 with Docker 🐳. It features straightforward training, evaluation, and deployment of models such as XGBoost Regressor, LightGBM 💡 and Random Forest Regressor 🌳, and integrates built-in visualization 📊 and logging 📝 for effective monitoring. Dive into the world of modular and scalable data pipelines with Kedro 📚 Kedro Documentation. An automated CI pipeline 🔄 with GitHub Actions ensures code quality ✅ and reliability 🔒.

  • Technologies/Tools: Docker, Kedro, MLOps, CI/CD (GitHub Actions), Machine Learning (XGBoost, Random Forest, LightGBM), Jupyter Notebook, Makefile, venv, Python.

3. Repackaged GIT Clustering Algorithm 🧩

Description: An upgraded version of the GIT Clustering algorithm 🔄, informed by insights from an arXiv paper 📄, with easy installation via TestPyPI 📦. Includes benchmarking notebooks 📊 comparing it to state-of-the-art clustering algorithms 🔍.

  • Technologies/Tools: Benchmarking, Poetry Packaging, PyPI Distribution, Machine Learning (K-means, DBSCAN, AgglomerativeClustering, Gaussian Mixture, ...), Jupyter Notebook, Makefile, venv, Python.

4. Monthly & Daily Energy Forecasting Docker API ⚡

Description: This repository 📦 houses an Energy Forecasting API ⚡ that uses Machine Learning to predict daily 📅 and monthly 🗓 energy consumption from historical data 📊. It's designed as a practical demonstration of an ML Engineering/Data Science workflow, from initial analysis to a deployable API packaged with Docker 🐳.

  • Technologies/Tools: MLOps, Docker, API design, Machine Learning (XGBoost, Random Forest), Jupyter Notebook, Makefile, venv, Python.

🙌 Connect with Me:

LinkedIn Kaggle

Let's make something innovative together! Feel free to reach out for collaborations or discussions in Data & Artificial Intelligence!

🔄 Last Updated:

  • README last updated on 17/04/2024. Regularly updated to reflect current work and interests.

llm-rag's People

Contributors

github-actions[bot], labrijisaad


llm-rag's Issues

Dockerize the application!

Things to be done:

  • Organize the code more effectively (refactor the code, improve variable naming, remove unnecessary code and files).
  • Dockerize the application.
  • Allow mounting secrets and Markdown documents through Docker volumes.
  • Add more Make commands to the Makefile.
  • Update the README file.

Add a README.md file!

Add a README file to explain the main idea, and also provide explanations for the prototyping notebooks 🙌

Update the CI/CD workflows.

  • Add a CI workflow to test the code for FAISS Index Creation.
  • Add a tests/ folder and create tests with pytest for the workflow.
  • Add the OpenAI key as a repository secret and set up README mock data for testing.
  • Add other types of tests...

Deploy the App 🤖

  • Update the README file.
  • Organize the code more effectively (refactor the code, improve variable naming, remove unnecessary code and files).
  • Push the Docker image to a container registry.
  • Deploy the app to the cloud (GCP, Azure, ...).
