
tinygrad's Introduction

tiny corp logo

tinygrad: For something between PyTorch and karpathy/micrograd. Maintained by tiny corp.



This may not be the best deep learning framework, but it is a deep learning framework.

Due to its extreme simplicity, it aims to be the easiest framework to add new accelerators to, with support for both inference and training. If XLA is CISC, tinygrad is RISC.

tinygrad is still alpha software, but we raised some money to make it good. Someday, we will tape out chips.

Features

LLaMA and Stable Diffusion

tinygrad can run LLaMA and Stable Diffusion!

Laziness

Try a matmul. See how, despite the style, it is fused into one kernel with the power of laziness.

DEBUG=3 python3 -c "from tinygrad import Tensor;
N = 1024; a, b = Tensor.rand(N, N), Tensor.rand(N, N);
c = (a.reshape(N, 1, N) * b.T.reshape(1, N, N)).sum(axis=2);
print((c.numpy() - (a.numpy() @ b.numpy())).mean())"

And we can change DEBUG to 4 to see the generated code.
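
For comparison, the reshape/multiply/sum above computes the same result as the built-in matmul; a minimal sketch:

from tinygrad import Tensor

N = 1024
a, b = Tensor.rand(N, N), Tensor.rand(N, N)
c = a.matmul(b)  # same result; run under DEBUG=3 to compare the generated kernels
print(c.shape)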

Neural networks

As it turns out, 90% of what you need for neural networks is a decent autograd/tensor library. Throw in an optimizer, a data loader, and some compute, and you have all you need.

from tinygrad import Tensor, nn

class LinearNet:
  def __init__(self):
    self.l1 = Tensor.kaiming_uniform(784, 128)
    self.l2 = Tensor.kaiming_uniform(128, 10)
  def __call__(self, x:Tensor) -> Tensor:
    return x.flatten(1).dot(self.l1).relu().dot(self.l2)

model = LinearNet()
optim = nn.optim.Adam([model.l1, model.l2], lr=0.001)

x, y = Tensor.rand(4, 1, 28, 28), Tensor([2,4,3,7])  # replace with real mnist dataloader

for i in range(10):
  optim.zero_grad()
  loss = model(x).sparse_categorical_crossentropy(y).backward()
  optim.step()
  print(i, loss.item())

See examples/beautiful_mnist.py for the full version, which gets 98% accuracy in ~5 seconds.
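
Depending on the tinygrad version, optim.step() may assert that training mode is enabled; a hedged variant of the loop for such versions:

with Tensor.train():  # some versions require training mode for optimizer steps
  for i in range(10):
    optim.zero_grad()
    loss = model(x).sparse_categorical_crossentropy(y).backward()
    optim.step()
    print(i, loss.item())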

Accelerators

tinygrad already supports numerous accelerators, including GPU (OpenCL), CLANG (C code), LLVM, METAL, and CUDA.

And it is easy to add more! Your accelerator of choice only needs to support a total of ~25 low level ops.
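
A quick way to check which backend tinygrad selected (assuming a version that exposes Device; backends can also be forced with environment variables such as CPU=1 or CUDA=1):

from tinygrad import Device
print(Device.DEFAULT)  # e.g. METAL, CUDA, or CPU depending on your machine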

Installation

The current recommended way to install tinygrad is from source.

From source

git clone https://github.com/tinygrad/tinygrad.git
cd tinygrad
python3 -m pip install -e .

Direct (master)

python3 -m pip install git+https://github.com/tinygrad/tinygrad.git

Documentation

Documentation along with a quick start guide can be found in the docs/ directory.

Quick example comparing to PyTorch

from tinygrad import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy

The same thing but in PyTorch:

import torch

x = torch.eye(3, requires_grad=True)
y = torch.tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy
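
Both versions should print the same values. Since z = sum_ij(y_i * x_ij), we have dz/dx_ij = y_i and dz/dy_i = sum_j(x_ij), so with x the identity:

dz/dx = [[ 2.  2.  2.]
         [ 0.  0.  0.]
         [-2. -2. -2.]]
dz/dy = [[1. 1. 1.]]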

Contributing

There has been a lot of interest in tinygrad lately. Following these guidelines will help your PR get accepted.

We'll start with what will get your PR closed with a pointer to this section:

  • No code golf! While low line count is a guiding light of this project, anything that remotely looks like code golf will be closed. The true goal is reducing complexity and increasing readability, and deleting \ns does nothing to help with that.
  • All docs and whitespace changes will be closed unless you are a well-known contributor. The people writing the docs should be those who know the codebase the absolute best. People who have not demonstrated that shouldn't be messing with docs. Whitespace changes are both useless and carry a risk of introducing bugs.
  • Anything you claim is a "speedup" must be benchmarked. In general, the goal is simplicity, so even if your PR makes things marginally faster, you have to consider the tradeoff with maintainability and readability.
  • In general, the code outside the core tinygrad/ folder is not well tested, so unless the current code there is broken, you shouldn't be changing it.
  • If your PR looks "complex", is a big diff, or adds lots of lines, it won't be reviewed or merged. Consider breaking it up into smaller PRs that are individually clear wins. A common pattern I see is prerequisite refactors before adding new functionality. If you can (cleanly) refactor to the point that the feature is a 3 line change, this is great, and something easy for us to review.

Now, what we want:

  • Bug fixes (with a regression test) are great! This library isn't 1.0 yet, so if you stumble upon a bug, fix it, write a test, and submit a PR, this is valuable work.
  • Solving bounties! tinygrad offers cash bounties for certain improvements to the library. All new code should be high quality and well tested.
  • Features. However, if you are adding a feature, consider the line tradeoff: a 3 line feature has a much lower usefulness bar to clear than something that's 30 or 300 lines. All features must have regression tests. In general, with no other constraints, your feature's API should match torch or numpy.
  • Refactors that are clear wins. In general, if your refactor isn't a clear win it will be closed. But some refactors are amazing! Think about readability in a deep core sense. A whitespace change or moving a few functions around is useless, but if you realize that two 100 line functions can actually use the same 110 line function with arguments while also improving readability, this is a big win.
  • Tests/fuzzers. If you can add tests that are non-brittle, they are welcome. We have some fuzzers in here too, and there's a plethora of bugs that can be found with them and by improving them. Finding bugs, even writing broken tests (that should pass) with @unittest.expectedFailure, is great; a minimal sketch of this pattern appears after this list. This is how we make progress.
  • Dead code removal from core tinygrad/ folder. We don't care about the code in extra, but removing dead code from the core library is great. Less for new people to read and be confused by.
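
A minimal sketch of the expectedFailure pattern mentioned above. The second test uses a deliberately wrong assertion as a stand-in for a real, not-yet-fixed repro:

import unittest
from tinygrad import Tensor

class TestRegressions(unittest.TestCase):
  def test_fixed_bug(self):
    # a normal regression test: pins down behavior that was once broken
    self.assertEqual(Tensor([1.0, 2.0, 3.0]).sum().item(), 6.0)

  @unittest.expectedFailure  # remove once the underlying bug is fixed
  def test_open_bug(self):
    # encodes the correct answer for an open bug; the suite stays green now
    # and reports "unexpected success" when the bug gets fixed
    self.assertEqual(Tensor([2.0]).item(), 3.0)  # stand-in for a real repro

if __name__ == "__main__":
  unittest.main()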

Running tests

You should install the pre-commit hooks with pre-commit install. This will run the linter, mypy, and a subset of the tests on every commit.

For more examples on how to run the full test suite please refer to the CI workflow.

Some examples of running tests locally:

python3 -m pip install -e '.[testing]'  # install extra deps for testing
python3 test/test_ops.py                # just the ops tests
python3 -m pytest test/                 # whole test suite
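
A couple more invocations that can be handy (standard pytest flags, nothing tinygrad-specific):

python3 -m pytest test/test_ops.py -k "add"   # only the ops tests matching a keyword
python3 -m pytest test/ -x -q                 # stop at the first failure, quieter output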

tinygrad's People

Contributors

adamritter, adriangb, chaosagent, chenyuxyz, cloud11665, dc-dc-dc, dosier, eichenroth, eliulm, flammit, g1y5x3, geohot, geohotstan, jla524, kartik4949, liamdoult, marcelbischoff, mmmkkaaayy, nimlgen, patosai, python273, qazalin, roelofvandijk, ryanneph, stevenandersonz, szymonozog, uuuvn, wozeparrot, wpmed92, zenginu


tinygrad's Issues

Backward Error Running on Windows Anaconda Environment

torch forward pass: 20.993 ms
torch backward pass: 210.071 ms


                                  Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls

                          aten::addmm_        19.93%      51.973ms        19.93%      51.973ms      30.217us          1720
              aten::threshold_backward        11.94%      31.134ms        12.04%      31.387ms       3.139ms            10
                       aten::threshold        11.57%      30.184ms        11.65%      30.384ms       3.038ms            10
            aten::thnn_conv2d_backward         9.75%      25.432ms        40.07%     104.499ms      10.450ms            10
                           aten::fill_         9.45%      24.652ms         9.45%      24.652ms      50.310us           490
         aten::max_pool2d_with_indices         8.99%      23.435ms         9.10%      23.720ms       2.372ms            10
                          aten::select         7.79%      20.309ms         9.17%      23.919ms       6.165us          3880
             aten::thnn_conv2d_forward         5.39%      14.059ms        14.23%      37.100ms       3.710ms            10
aten::max_pool2d_with_indices_backward         2.61%       6.814ms         6.66%      17.368ms       1.737ms            10
                              aten::mm         2.47%       6.444ms         2.51%       6.540ms     435.980us            15

Self CPU time total: 260.772ms

E

ERROR: test_mnist (__main__.TestConvSpeed)

Traceback (most recent call last):
File "c:\Users\Nehad Hirmiz\Documents\Programming\Python\Tutorials\tinygrad\test_speedynet.py", line 83, in test_mnist
out.backward()
File "c:\ProgramData\Anaconda3\envs\deeptorch\lib\site-packages\tinygrad\tensor.py", line 68, in backward
t.backward(False)
File "c:\ProgramData\Anaconda3\envs\deeptorch\lib\site-packages\tinygrad\tensor.py", line 68, in backward
t.backward(False)
File "c:\ProgramData\Anaconda3\envs\deeptorch\lib\site-packages\tinygrad\tensor.py", line 68, in backward
t.backward(False)
[Previous line repeated 1 more time]
File "c:\ProgramData\Anaconda3\envs\deeptorch\lib\site-packages\tinygrad\tensor.py", line 63, in backward
if g.shape != t.data.shape:
AttributeError: 'tuple' object has no attribute 'shape'

Error while trying to run examples/efficientnet.py

Traceback (most recent call last):
File "examples/efficientnet.py", line 14, in
from extra.efficientnet import EfficientNet
ModuleNotFoundError: No module named 'extra.efficientnet'

How do I get module 'extra.efficientnet'??

tinygrad is growing! should be tiny! add CI test for < 1000 lines

I deleted all the fastconv crap because it wasn't tiny. Output from sloccount:

SLOC    Directory       SLOC-by-Language (Sorted)
325     tinygrad        python=325
260     test            python=260


Totals grouped by language (dominant language first):
python:         585 (100.00%)

Can someone add sloccount to CI and have it fail if it ever gets above 1000?

Also, no code golf, but refactors that reduce complexity are very welcome.
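
One way to sketch such a check (a hypothetical CI script, not the repo's actual workflow; it counts non-blank lines rather than sloccount's exact SLOC metric):

# check_linecount.py -- hypothetical guard: fail CI if core exceeds 1000 lines
import pathlib, sys
n = sum(len([l for l in p.read_text().splitlines() if l.strip()])
        for p in pathlib.Path("tinygrad").rglob("*.py"))
print(f"tinygrad/ has {n} non-blank lines")
sys.exit(0 if n <= 1000 else 1)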

cannot access tinygrad from examples/

In examples/, objects from the files under tinygrad/ are imported. However, running python3 examples/efficientnet.py is not able to access tinygrad, and an ImportError is thrown.

Can't import fetch from utils

ImportError: cannot import name 'fetch' from 'tinygrad.utils' (/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/tinygrad/utils.py)

This occurs when I'm running the example. (I'm not a Python developer, though.)

Add ensemble wrapper, i.e. TTA and other ensembling techniques

I'm thinking of adding a wrapper for ensembling models, e.g.:

from tinygrad import ensemble
# example 1
ensembled_model_object = ensemble(models=[efficientnet_model_object], type='tta', aug=['original', 'fliplr'])
out = ensembled_model_object.forward(image)
# TTA improves results by 2-3 mAP units.


# example 2
ensembled_model_object = ensemble(models=[efficientnet_model_object, other_model_obj], type='parallel_ensemble', aug=None)
# the two models would run in parallel processes (maybe using ray) to save time,
# but this might have memory constraints; if type='ensemble' is used instead,
# a sequential graph is built internally
out = ensembled_model_object.forward(image)

@geohot thoughts?

EfficientNet runs slower on GPU than CPU

EfficientNet in examples/efficientnet.py runs slower on the GPU than on the CPU for some reason. Benchmarks:

PYTHONPATH=. GPU=1 python3.8 examples/efficientnet.py https://image.shutterstock.com/image-illustration/compact-white-car-3d-render-260nw-405716083.jpg
Output (GPU):

656 7.561172 minivan
did inference in 1.13 s

PYTHONPATH=. python3.8 examples/efficientnet.py https://image.shutterstock.com/image-illustration/compact-white-car-3d-render-260nw-405716083.jpg
Output (CPU):

656 7.5611706 minivan
did inference in 0.71 s

What could be causing this? I'm running this with Python 3.8 on a 2018 MacBook Pro with Intel Iris Plus Graphics 1536 MB, running macOS Catalina.
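
One plausible factor (an assumption, not a confirmed diagnosis): GPU backends typically pay one-time kernel compilation and data transfer costs, so the first run is not representative. A sketch that times a warmed-up second run (model and img stand in for the objects set up in examples/efficientnet.py):

import time
out = model.forward(img)   # warmup: triggers compilation and uploads
st = time.time()
out = model.forward(img)   # steady-state timing
print(f"did inference in {time.time() - st:.2f} s")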

TinyGrad core for respecting 1K loc limit

How about separating the core logic (tensor, ops, opsgpu, nn, utils, etc.) into a tinygrad-core project and creating another repo for extensions (models, examples, notebooks, etc.)? IMO the core code should be high quality (lol) but complete.

GPU EfficientNet is weirdly slow

did inference in 0.28 s
                 Mul : 163       29.18 ms
                 Add : 140       25.53 ms
                 Pow :  98       18.43 ms
               Pad2D :  17       16.97 ms
              Conv2D :  81       14.49 ms
             Sigmoid :  65       10.23 ms
             Reshape : 230        9.94 ms
                 Sub :  49        9.75 ms
           AvgPool2D :  17        5.93 ms
                 Dot :   1        1.06 ms

Run with DEBUG=1 for profiling. Conv2D isn't even close to the top in time users.

No module named 'tinygrad' when running test from terminal.

Running the command:

$ python3.8 test/test_mnist.py TestMNIST.test_sgd_gpu

Outputs following error:

Traceback (most recent call last):
  File "test/test_mnist.py", line 5, in <module>
    from tinygrad.tensor import Tensor, GPU
ModuleNotFoundError: No module named 'tinygrad'

This is the result of importing in the following manner:

from tinygrad.tensor import Tensor, GPU                                         
from tinygrad.utils import layer_init_uniform, fetch

Shouldn't we use relative path imports here instead? Shouldn't the code be able to run without installing with:

pip3 install git+https://github.com/geohot/tinygrad.git --upgrade
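
Two standard ways around this, consistent with the install instructions above and the PYTHONPATH usage seen in other reports here:

PYTHONPATH=. python3.8 test/test_mnist.py TestMNIST.test_sgd_gpu   # run from the repo root
python3 -m pip install -e .                                        # or install the checkout in editable mode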

Python version support?

@geohot What versions of python do you want to support?

If you can specify this, I will add it to automation and documentation.

[Not a bug] Warning for Python 3.9.0

As mentioned in setup.py, tinygrad requires Python >= 3.8.
PyTorch installation fails on Python 3.9.0, so the tinygrad examples will not work on Python 3.9.0.

See: pytorch/pytorch#47354 for more details.

I am using Python 3.8.6 and it works fine, but I was not able to install requirements.txt on Python 3.9.0 due to torch and torchvision, most likely because PyPI has no wheels (ready-to-install binaries) for Python 3.9 yet, as it's still quite new.

backward pass in pow seems to have issues...

and should not be computed if requires_grad = False.

tinygrad/ops_cpu.py:58: RuntimeWarning: invalid value encountered in log
  unbroadcast((x**y) * np.log(x) * grad_output, y.shape)
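
A minimal sketch of one common guard (an assumption about a possible fix, not tinygrad's actual code): only evaluate log(x) where x > 0, so the backward pass neither warns nor propagates NaNs for non-positive bases:

import numpy as np

def pow_backward_y(x, y, grad_output):
  # inner where keeps log() away from non-positive values; outer where zeroes those grads
  safe_log = np.where(x > 0, np.log(np.where(x > 0, x, 1.0)), 0.0)
  return (x ** y) * safe_log * grad_output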

lr tensor shape mismatch when GPU is on

# %%
from tinygrad.tensor import Tensor
from tinygrad.utils import layer_init_uniform
from tinygrad.optim import SGD

from pprint import pprint


class MLP:
    def __init__(
        self,
        input_size,
        network_size,
        gpu=True,
        learning_rate=0.001,
    ):
        self.gpu = gpu
        self.input_size = input_size
        self.network_size = network_size

        layer_sizes = zip([input_size, *network_size[:-1]], network_size)

        self.layers = [
            Tensor(layer_init_uniform(in_size, out_size), gpu=self.gpu)
            for (in_size, out_size) in layer_sizes
        ]
        self.optimizer = SGD(self.layers, lr=learning_rate)

    def __call__(self, x):
        output = Tensor(x, gpu=self.gpu)
        for i, layer in enumerate(self.layers):
            output = output.dot(layer)
            if i != len(self.layers) - 1:
                output = output.relu()
        return output

    def learn(self, x, y):

        _y = Tensor(y, gpu=self.gpu)
        two = Tensor([[2]], gpu=self.gpu)

        output = self.__call__(x)
        loss = (output - _y).pow(two).mean()

        loss.backward()
        # self.optimizer.step()


mlp = MLP(3, [2, 1], gpu=True)

x = [[1.0, 1.0, 1.0]]
y = [[1.0]]


mlp.learn(x, y)
# pprint([layer for layer in mlp.layers])

for layer in mlp.optimizer.params:
    print(layer.grad)
    print((layer.grad * Tensor([[0.01]], gpu=True)).cpu())
    print((layer.grad * mlp.optimizer.lr).cpu())
print((layer.grad * Tensor([[0.01]], gpu=True)).cpu())

prints out the grad correctly for me:

Tensor array([[ 0.        , -0.00866722],
       [ 0.        , -0.00866722],
       [ 0.        , -0.00866722]], dtype=float32) with grad None

while

print((layer.grad * mlp.optimizer.lr).cpu())

reports

shape mismatch in binop a*b: (3, 2) (1,)

If GPU is off, this issue doesn't occur.

Not sure if there's something I've missed here.
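
A workaround sketch grounded in the report above: the (1, 1)-shaped literal broadcasts fine on GPU, so wrapping the scalar lr the same way sidesteps the (1,)-shaped binop:

lr_t = Tensor([[0.001]], gpu=True)   # same value as mlp.optimizer.lr, but shaped (1, 1)
print((layer.grad * lr_t).cpu())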
