Comments (13)

ttbrunner avatar ttbrunner commented on August 24, 2024 3

Are you running sample.py or jukebox/interacting_with_jukebox.ipynb? The notebook seems to be new and needs less memory.

sample.py needs more system RAM because it loads all three priors at once. The code in the jupyter notebook loads only the lvl2 prior, draws samples, deallocates memory, and only afterwards loads the lvl1 and lvl0 upsamplers.
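That load → sample → free pattern can be sketched in plain PyTorch. This is a hedged toy illustration: load_prior and sample_with below are hypothetical stand-ins, not Jukebox's real API, and the tiny Linear layer substitutes for the multi-GB transformer priors that cause the OOM in the first place.

```python
import torch

def load_prior(level):
    # Hypothetical stand-in for loading a Jukebox prior; the real models are
    # multi-GB transformers, which is why holding all three at once runs OOM.
    return torch.nn.Linear(16, 16)

def sample_with(model, z):
    with torch.no_grad():
        return model(z)

z = torch.zeros(1, 16)

# Notebook-style flow: only one model lives in memory at any time.
top_prior = load_prior(2)
z = sample_with(top_prior, z)
del top_prior                     # drop the only reference to the model...
if torch.cuda.is_available():
    torch.cuda.empty_cache()      # ...and hand cached VRAM back to the driver

for level in (1, 0):              # then load the upsamplers one by one
    upsampler = load_prior(level)
    z = sample_with(upsampler, z)
    del upsampler
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

The key point is that del alone only releases Python's reference; torch.cuda.empty_cache() is what returns PyTorch's cached blocks to the GPU so the next model can fit.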

Also, the code in the notebook has different hyperparams than sample.py, so you might want to play around with sample length, total length, etc. I had to figure this out yesterday.

from jukebox.

PiotrMi avatar PiotrMi commented on August 24, 2024 2

Do you have an idea which parameters I can tweak to reduce the load on the GPU? I tried lower samples and lower sample lengths

BlueProphet avatar BlueProphet commented on August 24, 2024 2

I get this; it seems to be the same issue though. This is with two 1080 Tis on Ubuntu:

0: Loading prior in eval mode
Conditioning on 1 above level(s)
Checkpointing convs
Checkpointing convs
Loading artist IDs from /home/taj/jukebox/jukebox/data/ids/v2_artist_ids.txt
Loading artist IDs from /home/taj/jukebox/jukebox/data/ids/v2_genre_ids.txt
Level:1, Cond downsample:4, Raw to tokens:32, Sample length:262144
Downloading from gce
Restored from /home/taj/.cache/jukebox-assets/models/5b/prior_level_1.pth.tar
0: Loading prior in eval mode
Killed

maddy023 avatar maddy023 commented on August 24, 2024 1

Colab terminates the code when GPU usage exceeds its limits.

maddy023 avatar maddy023 commented on August 24, 2024 1

Do you have an idea which parameters I can tweak to reduce the load on the GPU? I tried lower samples and lower sample lengths

Reduce n_samples to 2-4 and use model: 1b_lyrics; it works fine in Colab.

ttbrunner avatar ttbrunner commented on August 24, 2024 1

I get this; it seems to be the same issue though. This is with two 1080 Tis on Ubuntu

How much RAM do you have in your system (not GPU)? I had this problem, but managed to run it after I added more swap to my system memory.

My low-end specs:
Ubuntu 18, 16GB RAM + 16GB Swap, 1x1070 (8GB). I can run 1b_lyrics with n_samples=4 (barely).

PiotrMi avatar PiotrMi commented on August 24, 2024

Do you have an idea which parameters I can tweak to reduce the load on the GPU? I tried lower samples and lower sample lengths

Reduce n_samples to 2-4 and use model: 1b_lyrics; it works fine in Colab.

Tried this out today. Unfortunately it still stopped the execution. I'm not sure how many people were using Colab at the time. I think I had 16 GB RAM available.

made-by-chris avatar made-by-chris commented on August 24, 2024

@ttbrunner could you please share your Colab parameters? Are you successfully running locally or hosted? I managed to get a local setup working (I changed "5b_lyrics" to "1b_lyrics").
I have an RTX 2060 and 16GB RAM https://ghostbin.co/paste/no6d6

but at this step:

zs = [t.zeros(hps.n_samples,0,dtype=t.long, device='cuda') for _ in range(len(priors))]
zs = _sample(zs, labels, sampling_kwargs, [None, None, top_prior], [2], hps)

I get the error:

RuntimeError: CUDA out of memory. Tried to allocate 12.00 MiB (GPU 0; 5.76 GiB total capacity; 4.50 GiB already allocated; 12.19 MiB free; 4.82 GiB reserved in total by PyTorch)

ttbrunner avatar ttbrunner commented on August 24, 2024

@basiclaser Hey, your card has 6GB video RAM, mine has 8GB. That might be the problem. I guess you can try n_samples=1, but if it doesn't work then I guess you can't run it :(

made-by-chris avatar made-by-chris commented on August 24, 2024

@ttbrunner yeh i gave it a shot.. lots of crashing and then memory allocation errors :) Thanks for your advice though. Are most people running this thing entirely in the cloud? I'm curious about the costs of using it in a hosted way.

stevedipaola avatar stevedipaola commented on August 24, 2024

Has anyone solved this for standalone Linux PCs with a GPU? We are running out of memory and crashing right off the bat. I have a 1080 Ti and can't get it working at any n_samples.

ttbrunner avatar ttbrunner commented on August 24, 2024

Yes, I can run it on Ubuntu 18 with a 1070. To sum up,

  • Allocate heaps of swap space if you have <32GB system RAM
  • Use 1b_lyrics (not 5b)
  • Try n_samples=1
  • Set a short sample_length, but not too short (anything under about 20 seconds may raise an error)
  • Don't use sample.py; instead, create your own script based on the code in the jupyter notebook. The crucial difference is that the top prior (which generates the initial sample) is loaded separately from the two upsamplers that follow. sample.py loads all of them at once, which causes the OOM.
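As a concrete starting point, the checklist above maps to a handful of hyperparameters. This is a hedged sketch: the names n_samples and sample_length and the 1b_lyrics model come from this thread, but the plain dict and the exact values are illustrative, not the notebook's real hyperparameter object.

```python
# Illustrative low-memory settings; names follow the thread, values are
# starting points to tune for your own card.
sr = 44100  # Jukebox's released models operate on 44.1 kHz audio

hps = dict(
    model="1b_lyrics",       # the 5b models need far more RAM/VRAM
    n_samples=1,             # one sample at a time on small cards
    sample_length=20 * sr,   # ~20 s of audio; much shorter may error out
)
```

sample_length is measured in audio samples, so 20 seconds at 44.1 kHz is 882,000 samples; raising n_samples multiplies activation memory roughly linearly.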

It's possible to save intermediate results to disk with torch.save(zs, PATH) and continue later. On a low-end machine, I would create two separate scripts: one for drawing from the top prior and one for upsampling. Sampling the top prior takes only a couple of minutes; you can listen to the result right away and decide whether to upsample it.
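The two-script split can be sketched as a plain torch.save/torch.load round trip. Only torch.save(zs, PATH) comes from the comment above; the three-element zs list of empty token tensors mirrors the notebook snippet quoted earlier in this thread, and the file path is illustrative.

```python
import os
import tempfile

import torch

# zs is the list of token tensors the notebook carries between stages;
# stubbed here with empty placeholders (the top prior would fill zs[2]).
zs = [torch.zeros(1, 0, dtype=torch.long) for _ in range(3)]

# --- Script one ends here, right after top-prior sampling: ---
path = os.path.join(tempfile.gettempdir(), "zs_top_level.pt")
torch.save(zs, path)

# --- Script two (run later, even after a reboot) starts by restoring: ---
zs = torch.load(path)
```

Because the checkpoint holds only token tensors rather than model weights, the second script can load the upsamplers into a completely fresh process with no top prior in memory.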

terrafying avatar terrafying commented on August 24, 2024

Mine crashes as soon as I run rank, local_rank, device = setup_dist_from_mpi()
