Comments (4)

arash-vahdat commented on August 18, 2024 2

@kaushik333 Your answer is very complete and correct. Thanks!

from nvae.

kaushik333 commented on August 18, 2024 1

Hi @Lukelluke

  1. I think they are just dummy labels assigned to the dataset, so that the dataloader follows the more generic convention of returning an (image, label) pair. If you look at Line 146, only the data is used and not the label. And I don't see the label being used in evaluate.py or the test() function either.

  2. To use 1-channel data, these are the changes I made.
    a. Add a separate elif case for your data class in https://github.com/NVlabs/NVAE/blob/master/datasets.py
    Add a "greyscale=True" parameter to the LMDBDataset class.
    Change this to

    if not self.greyscale:
        img = img.convert('RGB')
    else:
        img = img.convert('L')
    

    b. Change this to

    Cin = 1 if self.dataset in {'mnist', 'yourDatasetName'} else 3
    

    c. Change this to

    C_out = 1 if self.dataset in {'mnist', 'yourDatasetName'} else 10 * self.num_mix_output
    

    d. Since you're using grayscale images, change this to

    if self.dataset in {'mnist', 'yourDatasetName'}:
    

    Or you can skip the Bernoulli distribution and use the mixture distribution instead.
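
Steps b and c above both come down to choosing channel counts from the dataset name. A minimal sketch of that logic follows. The helper name `channel_config` and the set `GRAYSCALE_DATASETS` are hypothetical and not part of the NVAE repository, and `'yourDatasetName'` is the same placeholder used in the steps.

```python
# Hypothetical helper for illustration only; not NVAE's actual code.
# 'yourDatasetName' is a placeholder for your own 1-channel dataset.
GRAYSCALE_DATASETS = {"mnist", "yourDatasetName"}


def channel_config(dataset, num_mix_output):
    """Return (Cin, C_out) using the same rules as steps b and c."""
    is_gray = dataset in GRAYSCALE_DATASETS
    # Step b: 1 input channel for grayscale data, otherwise 3 (RGB).
    c_in = 1 if is_gray else 3
    # Step c: 1 output channel for grayscale data,
    # otherwise 10 * num_mix_output as in the expression above.
    c_out = 1 if is_gray else 10 * num_mix_output
    return c_in, c_out


if __name__ == "__main__":
    print(channel_config("yourDatasetName", 10))
    print(channel_config("cifar10", 10))
```

Keeping the dataset names in one set means a new 1-channel dataset only needs to be registered in one place, rather than in each conditional separately.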

@arash-vahdat please feel free to add anything else you feel is important.

Lukelluke commented on August 18, 2024

Thank you soooooooooooooo much, dear @kaushik333 and Dr. @arash-vahdat!

Thank you for your quick help!

I will follow your instructions and try it right now.

P.S. Actually, I'm trying my own dataset of mono .wav files, so channel == 1. I got a lot of inspiration from your timely help. As for datasets.py, I made some changes there in a different way to fit it.

As for decoder_output, I need to take some more time to figure out how Bernoulli and DiscMixLogistic work.

All in all, thank you very much for your generous help! I hope I can become as good at coding as you :)

If it succeeds, I will share the good news as soon as possible and release the related implementation.

All the best,

Luke Huang


Lukelluke commented on August 18, 2024

Hi, dear @kaushik333

I did as you suggested. Thank you again, sincerely! It also helped me understand NVAE better!

In the meantime, there is still a big question hanging over my head:

  • As we know, in the image field we usually crop images to a shape of [h, w], and we usually make height == width for convenience, just as NVAE does.
  • I wonder how to apply data of shape [H, W], where H != W, to the NVAE model?
  • For example: input (= [batch, channel, H, W]) -> NVAE(input) -> output (= [batch, channel, H, W]), with H != W

P.S. This question comes from the audio field, where we usually turn an audio spectrum into [batch, channel=1 (mono), FRAME, dimension of spectrum]. We usually set dimension == 80; however, the number of frames (which denotes the length of one .wav file) is almost always != dimension.
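
The thread does not answer whether NVAE accepts non-square input. One common approach with convolutional encoder/decoder models in general is to pad each spatial dimension up to a multiple of the model's total downsampling factor, so that every downsampling step divides evenly. The sketch below only computes the padded shape. The helper names and the factor of 16 are assumptions for illustration, not values taken from NVAE.

```python
# Hypothetical helpers for illustration; not part of NVAE.
# ASSUMPTION: the model downsamples by a total factor of 16.


def pad_to_multiple(size, factor):
    """Return (padded_size, pad_amount) with padded_size divisible by factor."""
    padded = -(-size // factor) * factor  # integer ceiling division
    return padded, padded - size


def padded_shape(shape, factor):
    """Pad the last two dims of a [batch, channel, H, W] shape independently."""
    batch, channel, h, w = shape
    new_h, _ = pad_to_multiple(h, factor)
    new_w, _ = pad_to_multiple(w, factor)
    return (batch, channel, new_h, new_w)


if __name__ == "__main__":
    # 250 frames of an 80-dimensional spectrum, mono, batch of 8.
    print(padded_shape((8, 1, 250, 80), 16))
```

Here the frame axis would be padded from 250 to 256 while the 80-dimensional axis is already divisible by 16, so H and W stay different and are handled separately.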

I hope to get some inspiration from the image field, just like the 'channel problem' you explained above.

Please feel free to share anything, important or not; all of it is welcome!

Again, my most sincere thanks to you!

All the best,

Luke Huang
