lolguy91 / perfect-llm-imho

The idea to create the perfect LLM currently possible came to me while I was watching a YouTube video on GaLore, the "sequel" to LoRA, and I realized how fucking groundbreaking that tech is. I was daydreaming about pretraining my own model; this (probably impossible to implement) concept is a refined version of that model.

License: Creative Commons Zero v1.0 Universal

ai concept llm perfect generative-ai language-model machine-learning lightweight llms better

perfect-llm-imho's Introduction

The perfect LLM (Concept)

Star the repo if you agree

A rough sketch

It would have the following features:

  • 1 or 2 billion parameters, lightweight for running on CPU
  • Mixture of Experts architecture
  • Support for 8 modalities/formats: text, image, GIF, binary blob, PDF, EPUB, audio, MIDI
  • Trained on heavily filtered, yet unbiased and high-quality data (only filtering out garbage like text from web apps)
  • Uncensored (!!!)
  • Quiet-STaR support
  • Beeg context window (at least 512k)

Parameter count

Seeing that QWEN 1.5b runs perfectly on my PC, I can imagine that another model of comparable parameter count (maybe with a hint of quantization) would run well on a shitty laptop like mine. We could get the performance of a larger model with Mixture of Experts, which lets a "larger" model use only a small part of its weights at a time, and Quiet-STaR, which gives the model a native inner monologue.
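To put rough numbers on the "hint of quantization" idea, here is a minimal sketch of the weight-storage arithmetic. It only counts the weights themselves (parameters times bits per weight) and ignores activations, the KV cache, and runtime overhead, so real memory use will be higher. The function name `weights_gb` and the 1.5 billion figure are just illustrative assumptions.

```python
def weights_gb(num_params, bits_per_weight):
    """Approximate storage for the weights alone, in decimal gigabytes."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9

# Hypothetical 1.5-billion-parameter model at a few common precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weights_gb(1.5e9, bits):.2f} GB")
# 16-bit: 3.00 GB
# 8-bit: 1.50 GB
# 4-bit: 0.75 GB
```

So going from 16-bit to 4-bit weights cuts the storage for the same parameter count to a quarter, which is the difference that matters on a low-end laptop.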

Mixture of experts

The basic idea is to have multiple sets of weights (the "experts") and switch between them according to the recommendation of a switcher model, usually called a router or gating network. This allows only a fraction of the model's parameters to be used at any one time, which means less computation for the poor CPU that has to run the AI.
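The switching idea can be sketched in plain Python. This is a toy, not a real implementation: the experts are random linear layers, the "switcher" is a single random matrix, and the names `TinyMoE`, `top_k`, and all the sizes are made up for illustration. What it does show is top-k routing: score every expert, run only the best few, and mix their outputs.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class TinyMoE:
    """Toy mixture-of-experts layer with a top-k switcher (hypothetical)."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Switcher: one score per expert, computed from the input.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Experts: independent dim x dim weight matrices.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def forward(self, x):
        scores = matvec(self.router, x)
        # Keep only the top_k highest-scoring experts.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        # Only the chosen experts are evaluated; the rest cost nothing.
        out = [0.0] * len(x)
        for weight, index in zip(weights, chosen):
            expert_out = matvec(self.experts[index], x)
            out = [o + weight * v for o, v in zip(out, expert_out)]
        return out, chosen

moe = TinyMoE(dim=4, num_experts=8, top_k=2)
output, used = moe.forward([1.0, 0.5, -0.5, 2.0])
print(f"ran {len(used)} of {len(moe.experts)} experts: {used}")
```

With 8 experts and `top_k=2`, only a quarter of the expert weights are touched per input, even though all 8 experts still have to be stored.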

Quiet-STaR

IDK anything about this tech, but I think it adds an inner monologue.

Training

The idea is to put a ton of datasets together, deduplicate them, and filter out the bullshit. For images, we may use the trick of training an image-summarising AI to create better captions for the image dataset. We also want to use self-made or self-collected data, because that tends to be higher quality.
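A minimal sketch of the "deduplicate and filter" step for text might look like this. It does exact deduplication on normalized text (lowercased, whitespace collapsed) plus a crude garbage check. The helper names and both thresholds (`min_words`, `min_alpha_ratio`) are hypothetical placeholders, and a real pipeline would likely need near-duplicate detection and much better quality filters.

```python
import hashlib

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return " ".join(text.lower().split())

def looks_like_garbage(text, min_words=5, min_alpha_ratio=0.6):
    """Crude heuristic: too short, or mostly symbols and digits."""
    words = text.split()
    if len(words) < min_words:
        return True
    non_space = [ch for ch in text if not ch.isspace()]
    alpha = sum(ch.isalpha() for ch in non_space)
    return alpha / len(non_space) < min_alpha_ratio

def clean_corpus(documents):
    """Drop garbage and exact (post-normalization) duplicates, keeping order."""
    seen = set()
    kept = []
    for doc in documents:
        key = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if key in seen or looks_like_garbage(doc):
            continue
        seen.add(key)
        kept.append(doc)
    return kept

docs = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick  brown fox jumps over the lazy dog.",   # duplicate after normalizing
    "{{ user.id }} | 404 | >>> 0x00 ;; == [] ()",      # web-app residue
    "ok",                                              # too short
]
print(clean_corpus(docs))
```

Only the first sentence survives here: the second is a duplicate once normalized, the third is mostly symbols, and the fourth is too short.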

No censorship or bias

The process is simple: add in some obedience dataset and some banned books, and it will be uncensored. As for bias, that's probably impossible to eliminate, but maybe we can control/limit it somehow.

Beeg context window

A big context window is worthless if the model is biased towards some point in the conversation.
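One way to check for that kind of position bias is a "needle in a haystack" style test: hide a fact at different depths in a long prompt and see whether the model can still recall it. The sketch below assumes a hypothetical `ask_model` callable that takes a prompt string and returns the model's reply; the function names and the fake model used in the demo are illustrative only.

```python
def build_prompt(filler_sentences, needle, depth):
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end)."""
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)

def recall_by_depth(ask_model, filler_sentences, needle, question, answer, depths):
    """Return {depth: True/False} for whether the reply contains the answer."""
    results = {}
    for depth in depths:
        prompt = build_prompt(filler_sentences, needle, depth) + "\n\n" + question
        reply = ask_model(prompt)
        results[depth] = answer.lower() in reply.lower()
    return results

# Fake "model" that only sees the last 300 characters of its prompt,
# standing in for a model that is biased towards recent context.
def forgetful_model(prompt):
    return prompt[-300:]

filler = ["The sky was grey that day."] * 200
report = recall_by_depth(
    forgetful_model,
    filler,
    needle="The secret code is 7421.",
    question="What is the secret code?",
    answer="7421",
    depths=[0.0, 0.5, 1.0],
)
print(report)  # {0.0: False, 0.5: False, 1.0: True}
```

A model that uses its whole context well should score `True` at every depth; gaps in the report show which parts of the conversation it tends to ignore.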
