
femtogpt's Introduction

Hi there 👋

This is Keyvan, a hardcore computer geek who adores reinventing the wheel! (HEY! If you are also a fan of building everything from scratch, read my book: The Super Programmer 📖)

I have written:

Curious about my current occupation?

  • 🔭 I'm currently the director of Nobitex Labs, a blockchain research team backed by Iran's biggest cryptocurrency exchange, Nobitex.
  • 🎸 I'm currently learning to play guitar, which has made me more passionate and curious about music, so I'm also learning about computer synthesizers and sound wave generation 🎵
  • 🇷🇺 I'm learning Russian! 👅
  • 📫 Reach me on Telegram: @keyvankambakhsh

Previous occupations:

  • 🪙 Ziesha Network - I am the founder of a new layer-1 cryptocurrency which uses ZK-proofs under the hood to provide privacy/scalability. (Rust)
  • 🧱 Divar - A giant Iranian web company where I worked as a Back-end Software Engineer. (Python + Golang + Kubernetes)
  • 🗃️ Filecoin Project (Protocol Labs) - Making Proof-of-Replication and Proof-of-SpaceTime faster by accelerating zkSNARK libraries with GPUs. (Rust + OpenCL)
  • ⚡ Finality Labs - Researching and developing layer-2 scalability solutions for the Ethereum blockchain. (Solidity)
  • 🤑 Stellar Asset Management - Building an infrastructure for trading bots to operate in, using Clojure. (Clojure + ClojureScript)
  • 🧑‍🏫 CodeLearnr - A startup of my own, an interactive platform for teaching programming to Persian-speaking users. (Python)

femtogpt's People

Contributors

cutoken, eltociear, keyvank


femtogpt's Issues

Random discussions

Hi,
An assert fails in tensor/mod.rs when the number of heads equals the number of layers. I'm not sure whether that is a sensible configuration, but nanoGPT supports equal values; see its CPU section (it's apples to oranges, so please ignore this if the comparison doesn't hold):

https://github.com/karpathy/nanoGPT

Make femtoGPT the easiest GPT library ever made

It would be good if people could store their entire model, description, and training data in a single file so that others could easily run inference on it or fine-tune it.

Let's make femtoGPT a library!

loss jittering

Is it normal for the loss to be jittering?

I've been training my model for a few hours (about 6?). At the start the loss was mostly just decreasing, but at around 3.00 it started jittering, and the jittering has only become more intense. Currently my loss ranges from ~2.67 to ~2.92.

I'm training on my own dataset (10k lines, ~150KB) with 78 unique characters and 312k parameters (not sure if that matters).
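
A per-step loss computed on small batches can fluctuate from one step to the next, so one way to judge whether training is still improving is to look at a smoothed curve instead of the raw values. The sketch below is only an illustration, not part of femtoGPT: `smooth_losses` and `alpha` are hypothetical names, and it applies a plain exponential moving average to a list of logged losses.

```rust
/// Exponential moving average over a series of logged loss values.
/// `alpha` lies in (0, 1]; smaller values smooth more aggressively.
/// Hypothetical helper for illustration, not a femtoGPT function.
fn smooth_losses(losses: &[f32], alpha: f32) -> Vec<f32> {
    let mut smoothed = Vec::with_capacity(losses.len());
    let mut ema: Option<f32> = None;
    for &loss in losses {
        let next = match ema {
            // The first value seeds the average.
            None => loss,
            // Later values are blended with the running average.
            Some(prev) => alpha * loss + (1.0 - alpha) * prev,
        };
        smoothed.push(next);
        ema = Some(next);
    }
    smoothed
}
```

If the smoothed values keep drifting downward, the jitter is noise around a still-improving trend; if they stay flat, the run may have plateaued.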

AMD GPU training not working

Hello, I'm trying to run femtoGPT on my RX 6600 under Ubuntu Linux. I've installed the required ROCm OpenCL drivers, but when I run the program using

cargo run --release --features gpu

I get an index-out-of-bounds panic:

thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', src/graph/gpu/mod.rs:119:22
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Edit

This is the log with RUST_BACKTRACE=1

0: rust_begin_unwind
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
1: core::panicking::panic_fmt
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
2: core::panicking::panic_bounds_check
at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:162:5
3: femto_gpt::graph::gpu::GpuGraph::new
4: femto_gpt::main
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
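
The panic message says that a collection of length 0 was indexed at position 0 inside GpuGraph::new. Which collection that is cannot be told from the log; if it is a list of detected devices or platforms (an assumption), a guard like the one below would turn the panic into a readable error. `pick_device` is a hypothetical helper, not femtoGPT code.

```rust
/// Hypothetical helper: select an entry by index and report a descriptive
/// error instead of panicking when the list is empty or too short.
/// Assumes the failing index in the report is a device/platform lookup.
fn pick_device<T>(devices: &[T], index: usize) -> Result<&T, String> {
    devices.get(index).ok_or_else(|| {
        format!(
            "no OpenCL device at index {} ({} device(s) detected)",
            index,
            devices.len()
        )
    })
}
```

With an empty list, `pick_device(&devices, 0)` returns an `Err` that states how many devices were found, which points more directly at a driver or detection problem.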

How to add a new decoder after the GPT is created with the ::new call?

Hi @keyvank,
Let's say I want to add a new decoder layer (the kind constructed in the 0..num_layers loop) at run time, after the gpt::new() call. How do I go about it? As I understand it, you push the computations one by one, each with an incremented tensor id, so adding a layer later would also require incrementing the ids of the layers that follow. (For example, would adding one more decoder layer, along with all its sub-layers such as attention, mean incrementing the ids of the vocab output and the other variables outside the for loop?)

Also, why keep the computations in a BTree when in practice it is used more like a Vec? We are not even using the id under which each computation is stored (please correct me if I missed something :) ).

Error when generating text

thread 'main' panicked at src/tokenizer/simple.rs:43:47:
called Option::unwrap() on a None value
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
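
The report does not say what triggered the `unwrap()` on `None`. One plausible cause, stated here purely as an assumption, is a lookup of a character or token id that is not in the tokenizer's vocabulary. The sketch below shows a hypothetical character-level tokenizer (`CharTokenizer` is not femtoGPT's actual type) that returns an error for unknown input instead of panicking.

```rust
use std::collections::HashMap;

/// Hypothetical character-level tokenizer for illustration only;
/// it is not the tokenizer in src/tokenizer/simple.rs.
struct CharTokenizer {
    char_to_id: HashMap<char, usize>,
    id_to_char: Vec<char>,
}

impl CharTokenizer {
    /// Build the vocabulary from the unique characters of the dataset.
    fn new(dataset: &str) -> Self {
        let mut id_to_char: Vec<char> = dataset.chars().collect();
        id_to_char.sort_unstable();
        id_to_char.dedup();
        let char_to_id = id_to_char
            .iter()
            .enumerate()
            .map(|(id, &c)| (c, id))
            .collect();
        Self { char_to_id, id_to_char }
    }

    /// Map text to token ids, failing on characters outside the vocabulary.
    fn tokenize(&self, text: &str) -> Result<Vec<usize>, String> {
        text.chars()
            .map(|c| {
                self.char_to_id
                    .get(&c)
                    .copied()
                    .ok_or_else(|| format!("character {:?} is not in the vocabulary", c))
            })
            .collect()
    }

    /// Map token ids back to text, failing on ids outside the vocabulary.
    fn untokenize(&self, ids: &[usize]) -> Result<String, String> {
        ids.iter()
            .map(|&id| {
                self.id_to_char
                    .get(id)
                    .copied()
                    .ok_or_else(|| format!("token id {} is out of range", id))
            })
            .collect()
    }
}
```

Under this assumption, a prompt containing a character that never appeared in the training dataset would produce a clear error message naming the offending character.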

How to increase the number of parameters

Hi,
It's mentioned that this has 300k parameters. How can the number of parameters be increased? Is it by increasing the number of layers/heads? (Also, how is the total parameter count calculated?)

Thank you for attempting a pure Rust implementation of GPT. It helps a lot to understand things without having to understand Python voodoo.

Edit: Also, if possible, some comments around the important parts of the code would help a lot from a learning perspective. It would be great if you wrote a blog post on it :)
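
For a rough sense of the arithmetic behind the parameter question, here is a sketch that counts parameters for a generic GPT-2-style decoder. Everything in it is an assumption: `GptConfig` and `count_parameters` are hypothetical names, the layout (biases everywhere, two layer norms per block, a 4x MLP expansion, an untied output head) is the common textbook one, and femtoGPT's actual layers may differ, so the numbers are illustrative only.

```rust
/// Hypothetical configuration; field names are not taken from femtoGPT.
struct GptConfig {
    vocab_size: usize,
    context_len: usize,
    embedding_dim: usize,
    num_layers: usize,
}

/// Parameter count for an assumed GPT-2-style decoder stack.
fn count_parameters(cfg: &GptConfig) -> usize {
    let d = cfg.embedding_dim;
    // Token embedding table plus positional embedding table.
    let embeddings = cfg.vocab_size * d + cfg.context_len * d;
    // Two layer norms per block, each with a scale and a bias vector.
    let layer_norms = 2 * (2 * d);
    // Query, key, value and output projections, each d x d plus a bias.
    let attention = 4 * (d * d + d);
    // Feed-forward: d -> 4d (with bias), then 4d -> d (with bias).
    let mlp = (d * 4 * d + 4 * d) + (4 * d * d + d);
    let per_layer = layer_norms + attention + mlp;
    // Final layer norm and the projection back to vocabulary logits.
    let final_norm = 2 * d;
    let output_head = d * cfg.vocab_size + cfg.vocab_size;
    embeddings + cfg.num_layers * per_layer + final_norm + output_head
}
```

In this assumed layout the count grows linearly with the number of layers and roughly quadratically with the embedding dimension, while the number of heads does not change it, because the heads split the same d x d projections among themselves.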

Cleaner exception handling

Currently, I call .unwrap() all over the code, which is very bad. It would be better to have distinct error types and return them instead of panicking.

E.g., TensorError when a faulty tensor operation happens, GraphError for faulty computation graphs, or GptError for core LLM-related errors.
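
A minimal sketch of how such a layered error hierarchy could look, using only the standard library. The three type names come from the proposal above; the variants, the `check_same_shape` and `add_node` helpers, and the messages are assumptions made for illustration.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum TensorError {
    // Hypothetical variant: two shapes that should match do not.
    ShapeMismatch { expected: Vec<usize>, found: Vec<usize> },
}

#[derive(Debug, PartialEq)]
enum GraphError {
    // Hypothetical variant: a tensor id is missing from the graph.
    TensorNotFound(usize),
    Tensor(TensorError),
}

#[derive(Debug, PartialEq)]
enum GptError {
    Graph(GraphError),
}

impl fmt::Display for TensorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TensorError::ShapeMismatch { expected, found } => {
                write!(f, "shape mismatch: expected {:?}, found {:?}", expected, found)
            }
        }
    }
}

impl fmt::Display for GraphError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GraphError::TensorNotFound(id) => write!(f, "tensor {} not found", id),
            GraphError::Tensor(e) => write!(f, "tensor error: {}", e),
        }
    }
}

impl fmt::Display for GptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GptError::Graph(e) => write!(f, "graph error: {}", e),
        }
    }
}

impl std::error::Error for TensorError {}
impl std::error::Error for GraphError {}
impl std::error::Error for GptError {}

// The From impls let `?` lift a lower-level error into the next layer.
impl From<TensorError> for GraphError {
    fn from(e: TensorError) -> Self {
        GraphError::Tensor(e)
    }
}

impl From<GraphError> for GptError {
    fn from(e: GraphError) -> Self {
        GptError::Graph(e)
    }
}

/// Tensor-level check that returns an error instead of asserting.
fn check_same_shape(expected: &[usize], found: &[usize]) -> Result<(), TensorError> {
    if expected == found {
        Ok(())
    } else {
        Err(TensorError::ShapeMismatch {
            expected: expected.to_vec(),
            found: found.to_vec(),
        })
    }
}

/// Graph-level operation; `?` converts TensorError into GraphError.
fn add_node(lhs: &[usize], rhs: &[usize]) -> Result<(), GraphError> {
    check_same_shape(lhs, rhs)?;
    Ok(())
}
```

Callers can then match on the specific variant or print the error, rather than the whole program aborting on an `unwrap()`.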

Add more documentation/comments

Right now, the code is largely undocumented. As femtoGPT is also an educational library, it would be good to have comprehensive guides, documentation, and comments.

How to let the model fill text

Hi @keyvank,
Instead of letting the model generate the whole text in infer, how do we make it complete a user-provided text?

I'm guessing it should be done by filling the context vector in the infer method.
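
The guess above can be sketched in general terms: seed the token buffer with the tokenized prompt, then repeatedly ask the model for the next token given the most recent window. This is not femtoGPT's `infer` signature; `complete`, `context_len`, and the `next_token` closure (a stand-in for one forward pass plus sampling) are all hypothetical.

```rust
/// Autoregressive completion of a user-provided prompt.
/// `next_token` stands in for the model: it receives the current context
/// window and returns the id of the next token. Hypothetical sketch only.
fn complete<F>(
    prompt: &[usize],
    context_len: usize,
    new_tokens: usize,
    mut next_token: F,
) -> Vec<usize>
where
    F: FnMut(&[usize]) -> usize,
{
    // Start from the prompt instead of an empty context.
    let mut tokens = prompt.to_vec();
    for _ in 0..new_tokens {
        // The model only sees the most recent `context_len` tokens.
        let start = tokens.len().saturating_sub(context_len);
        let next = next_token(&tokens[start..]);
        tokens.push(next);
    }
    tokens
}
```

The returned vector contains the prompt followed by the generated tokens, so detokenizing it yields the user's text with the model's continuation appended.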

Fine-tune meta-parameters

Find the best num-layers/embedding-degree/learning-rate/etc. in terms of accuracy and training time.
