
Comments (3)

kcrosley-leisurelabs commented on August 24, 2024

Kind of an aside/mildly off-the-original non-issue... Some thoughts on jukebox's output:

I've had this same thought myself, but in a rather simpler form: I don't quite understand how it is that vocal renditions get superimposed on the backing track (but that must... kinda-sorta... be how it works). I'd not expect that individual instruments could be extracted (what you're hearing is an amalgamation of "what comes next" in the waveform), but since lyrics seem to be fit to the track somehow, it may be possible they could be rendered separately with (possibly, at least) minimal instrumental artifacts around them.

Point being: if any of this is possible at all (with jukebox as it is), it is that the vocal/lyrical part could (maybe) be rendered separately from the "backing track". Though note that this may be entirely impossible. I note that, while the artifacts from tokenization/downsampling and subsequent upsampling are part of the problem with the audio quality of the final result, another issue is that what's being rendered here is an approximation of a fully-mixed track, based on a huge training set that encompasses many common production techniques.

To wit: jukebox has a terrible time emulating/synthesizing time-domain effects (e.g., reverb and delay). Much of the "swampiness" in final rendered "Level 0" tracks actually sounds like approximations of reverb tails to me. And the sometimes-repeated vocals are not always the lyrics jumping around, but would seem to be emulations of delay effects. Basically, lots of output sounds like awesome songs mixed and mastered by tone-deaf engineers.

You can fix this rather a lot:

First, nearly everything that comes out of jukebox benefits from an EQ cut in the lower mids, where there is often a lot of muddiness. Vocal lines are (for whatever reason) often much too low in the mix, so a commensurate EQ boost in the mid-highs helps. There's not much we can do about completely missing high-frequency content, but we can use something like an exciter-type plugin to at least synthesize a stand-in for it (we can't EQ boost what isn't there to boost, right?).
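For anyone who'd rather script the two EQ moves described above than reach for a plugin, here is a minimal sketch using the standard peaking-EQ biquad from the Audio EQ Cookbook. Everything specific in it is an assumption for illustration, not something measured from jukebox output: the 300 Hz / 3 kHz center frequencies, the -4 dB / +3 dB gains, the Q of 1.0, the 44.1 kHz sample rate, and all the function names. It works on a plain list of mono float samples and does not attempt the exciter part.

```python
import cmath
import math


def peaking_eq_coeffs(center_hz, gain_db, q, sample_rate):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / a_lin
    b = [(1.0 + alpha * a_lin) / a0, -2.0 * math.cos(w0) / a0, (1.0 - alpha * a_lin) / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha / a_lin) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Run mono float samples through one biquad section (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def gain_db_at(freq_hz, b, a, sample_rate):
    """Magnitude response of the biquad at freq_hz, in dB (handy for sanity checks)."""
    z = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    h = (b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z)
    return 20.0 * math.log10(abs(h))


# Assumed starting points only; tune by ear for each track.
SAMPLE_RATE = 44100
MUD_CUT = peaking_eq_coeffs(300.0, -4.0, 1.0, SAMPLE_RATE)         # lower-mid cut
PRESENCE_BOOST = peaking_eq_coeffs(3000.0, 3.0, 1.0, SAMPLE_RATE)  # mid-high boost


def remaster(samples):
    """Apply the lower-mid cut, then the mid-high boost."""
    return apply_biquad(apply_biquad(samples, *MUD_CUT), *PRESENCE_BOOST)
```

Calling `remaster(samples)` on one channel at a time is enough to hear the difference; `gain_db_at` lets you confirm that a filter is doing what you asked before you run a whole track through it.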

Just those things will help. If you have access to more advanced plugins, I've found that Zynaptiq's Unveil works miracles on jukebox tracks. (It's a de-reverberation plugin, essentially, and in removing 'verb it also removes general muddiness in pretty much just-the-right-way and can also synthesize missing high-frequency content.) Similarly, drum balance can be totally whack in rendered tracks and Zynaptiq's UNMIX:Drums plugin can work a lot of magic. I suspect Zynaptiq's "Unchirp" plugin (designed to fix codec artifact-related issues) might do a lot of the same stuff as Unveil, but I don't own that, can't trial it again, and can't be bothered by their awful iLok copy protection to venture to buy it.

Certain mastering plugs like multiband limiters are very useful for the above (or in conjunction with the above) as well. There's a lot of interesting "production forensics" to be performed on jukebox tracks... and they are fun and challenging to remaster!

Anyway, my main point is that jukebox does not "write music". jukebox synthesizes a (surprisingly good) rendition of an entire recorded musical track. It does not "know" what individual instruments are playing and what notes they are hitting. (But gosh golly, doesn't it sound like it?)

All the above being said, it'd be interesting to run some fixed-up jukebox output through deezer/spleeter (which I've not played with) to see what comes out. (And I guess if one is too lazy to transcribe one might be able to extract MIDI out of the stems, if one is lucky.)

from jukebox.

johndpope commented on August 24, 2024

Hi Keith, thanks for the comments. I suspect that in 6-12 months' time things will be at a point where you can take some raw/noisy sounds plus cleaned, EQ-processed datasets and have the neural nets learn how to apply this EQ using a GAN, like a style transfer for audio. The quality of the sounds isn't such a concern, but the shaping and orchestrating is an exciting area. I need to read the paper and do some digging.


johndpope commented on August 24, 2024

Just reflecting on your comments that this is a synthesiser: I guess a good research step would be to narrow the audio down to one instrument (say drums, using the spleeter-isolated tracks), train it on the 6000 tracks, and then be able to guide it somehow. If it was piano, you might have a chord progression or beats per minute or something to cue it.

