WhatsApp-Llama: Fine-tune Llama 7b to Mimic Your WhatsApp Style

This repository is a fork of facebookresearch/llama-recipes, adapted to fine-tune a Llama-2 7B chat model to replicate your personal WhatsApp texting style. By simply inputting your WhatsApp conversations, you can train the LLM to respond just like you do! Llama-2 7B chat is fine-tuned using parameter-efficient fine-tuning (QLoRA) with int4 quantization on a single GPU (a P100 with 16 GB of GPU memory).

My Results

  1. Quick Learning: The fine-tuned Llama-2 model picked up on my texting nuances rapidly.

    • On average, the fine-tuned model generates 300% more words per reply than vanilla Llama-2; I usually type longer replies, so this checks out
    • The model accurately replicated phrases I commonly use and my emoji usage
  2. Turing Test with Friends: As an experiment, I had friends ask me 3 questions each on WhatsApp, then sent back 2 candidate responses per question (one written by me, one by the LLM). My friends then had to guess which response was mine and which was Llama's.

The result? The model fooled 10% (2/20) of my friends. Some of the model's responses were eerily similar to my own. Here are some examples (Candidate A is Llama-2 7B):

  • Example 1:

    (screenshot)

  • Example 2:

    (screenshot)

I believe that with access to more compute, this number could easily be pushed to ~40%, approaching the 50% expected from random guessing.

Getting Started

Here's a step-by-step guide on setting up this repository and creating your own customized dataset:

1. Exporting WhatsApp Chats

Details on how to export your WhatsApp chats can be found here. I exported 10 WhatsApp chats with friends I speak to often. Be sure to exclude media while exporting. Each chat was saved as <friend_name>Chat.txt.

2. Preprocessing the Dataset

Complete the steps below to convert the exported chat into a format suitable for training:

Convert text files to JSON:

python preprocessing.py <your_name> <your_contact_name> <friend_name> <friend_contact_name> <folder_path>
  1. your_name refers to your name (Llama will learn this name)
  2. your_contact_name refers to how you've saved your own number on your phone
  3. friend_name refers to the name of your friend (Llama will learn this name)
  4. friend_contact_name refers to the name you've used to save your friend's contact
  5. folder_path refers to the folder in which you've stored your exported WhatsApp chats

You'll need to run this command once for every friend's chat you've exported.
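
For intuition, here's a minimal sketch of the kind of parsing this step performs. It is not the repository's actual preprocessing.py: it assumes the Android export format ("M/D/YY, H:MM PM - Name: message"; iOS and some locales differ), and the output schema is illustrative.

```python
import json
import re
import sys

# Matches lines like "12/31/23, 9:15 PM - Sam: hey" (Android export format).
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}\s?[AP]M) - ([^:]+): (.*)$"
)

def parse_chat(path):
    messages = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            m = LINE_RE.match(line.rstrip("\n"))
            if m:
                _, _, sender, text = m.groups()
                messages.append({"sender": sender, "text": text})
            elif messages:
                # Lines without a timestamp continue the previous message
                messages[-1]["text"] += "\n" + line.rstrip("\n")
    return messages

if __name__ == "__main__":
    print(json.dumps(parse_chat(sys.argv[1])[:5], indent=2))
```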

Convert JSON files to CSV

Once you're done converting all chats to JSON, you can run the command below to create the dataset (a rough sketch of what this step produces follows the argument list):

python prepare_dataset.py <dataset_folder> <your_name> <save_file>
  1. dataset_folder refers to the folder containing your JSON files
  2. your_name refers to your name (Llama will learn this name)
  3. save_file refers to the file path of the final CSV
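
As a rough sketch of what this step produces (not the repository's actual prepare_dataset.py; the JSON message schema and the column names are assumptions based on the table in step 3):

```python
import glob
import json
import sys

import pandas as pd

def build_pairs(folder, your_name):
    rows = []
    for path in glob.glob(f"{folder}/*.json"):
        with open(path, encoding="utf-8") as f:
            msgs = json.load(f)  # assumed list of {"sender", "text"} dicts
        context = []
        for m in msgs:
            # Every message you sent becomes a Reply, with the preceding
            # messages as its Context.
            if m["sender"] == your_name and context:
                rows.append({"Context": "\n".join(context), "Reply": m["text"]})
            speaker = "You" if m["sender"] == your_name else "Friend"
            context.append(f"{speaker}: {m['text']}")
            # A real implementation would cap the context length here.
    return pd.DataFrame(rows)

if __name__ == "__main__":
    df = build_pairs(sys.argv[1], sys.argv[2])
    df.index += 1  # match the 1-based ID column shown in step 3
    df.to_csv(sys.argv[3], index_label="ID")
```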

3. Validating the Dataset

Here's the expected format for the preprocessed dataset:

| ID | Context               | Reply      |
| -- | --------------------- | ---------- |
| 1  | You: Hi<br>Friend: Hi | What's up? |

Ensure your dataset looks like the above to verify you've done it correctly.
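
An optional sanity check along these lines can catch mistakes early (the filename is whatever you passed as <save_file> in step 2):

```python
import pandas as pd

# Load the dataset produced by prepare_dataset.py and verify its shape.
df = pd.read_csv("whatsapp_dataset.csv")
assert {"Context", "Reply"}.issubset(df.columns), "unexpected columns"
print(df.head())
print(f"{len(df)} context/reply pairs")
```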

4. Model Configuration

Once you're done with the above steps, run WhatsApp_Finetune.ipynb.

  • If you're using a P100 GPU, load the model in 4-bit

  • If you're using an A100 GPU, you can load the model in 8-bit instead
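
Here's a minimal sketch of quantized loading with Hugging Face transformers and bitsandbytes; the notebook's actual code may differ, and the model ID and config values are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint

# 4-bit, QLoRA-style NF4 quantization (fits a 16GB P100)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
# 8-bit alternative for an A100:
# bnb_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```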

PEFT adds around 4.6M trainable parameters, or roughly 0.07% of total model weights.
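
Continuing from the loading sketch above, a LoRA adapter along these lines gives a trainable-parameter count in that ballpark (the exact hyperparameters used by the notebook are assumptions):

```python
from peft import LoraConfig, get_peft_model

# Typical QLoRA adapter settings; r and target_modules are assumptions.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints trainable vs. total params
```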

Additionally, you'll need to make the following 2 changes to ft_datasets/whatsapp_dataset.py (sketched below):

  1. Update the prompt to one of your choosing (line 8)
  2. Update the file path of your dataset in the load_dataset() call (line 5)
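
Purely as an illustration of what those two edits might look like (the module is assumed to `import datasets`; the prompt text and CSV path are made up):

```python
# line 5 (illustrative): point load_dataset at the CSV from step 2
dataset = datasets.load_dataset("csv", data_files="whatsapp_dataset.csv")

# line 8 (illustrative): the persona prompt used to frame each example
PROMPT = "You are <your_name>, texting a friend on WhatsApp. Reply as you usually do."
```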

5. Training Time

For reference, a 10MB dataset will complete 1 epoch in approximately 7 hours on a P100 GPU. My results shared above were achieved after training for just 1 epoch.

Conclusion

This adaptation of the Llama model offers a fun way to see how well an LLM can mimic your personal texting style. Remember to use AI responsibly and inform your friends if you're using the model to chat with them!

