Comments (9)

dyelax commented on September 24, 2024

I have never tested this with outputting more than one frame at a time. The original paper added some extra tricks to make that work well. But the easiest way to tweak this to do what you want is to set HIST_LEN = 10 in constants.py and, in SCALE_FMS_G, change the last element of each array (currently all 3s) to 9 (3 channels * 3 output frames). You will have to write something to parse the output, since it will be all 3 images stacked on top of one another.
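A minimal sketch of that parsing step, assuming the 3 predicted frames are concatenated along the channel axis in time order (R0, G0, B0, R1, G1, B1, ...). That layout is an assumption, not something confirmed in this thread, and split_stacked_frames is a hypothetical helper that is not part of the repository. It works on the channel values of a single pixel to stay dependency-free:

```python
def split_stacked_frames(pixel_channels, num_frames, channels_per_frame=3):
    """Split one pixel's stacked channel values into per-frame groups.

    Assumption: frames are concatenated along the channel axis in time
    order, e.g. [R0, G0, B0, R1, G1, B1, ...].
    """
    expected = num_frames * channels_per_frame
    if len(pixel_channels) != expected:
        raise ValueError(
            f"expected {expected} channels, got {len(pixel_channels)}"
        )
    # Frame i occupies channels [i * c, (i + 1) * c).
    return [
        pixel_channels[i * channels_per_frame:(i + 1) * channels_per_frame]
        for i in range(num_frames)
    ]


# A 9-channel output pixel (3 channels * 3 output frames).
stacked_pixel = list(range(9))
frames = split_stacked_frames(stacked_pixel, num_frames=3)
print(frames)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

For a whole NumPy image of shape (height, width, 9), the same idea would be slicing the last axis, e.g. pred[..., 3 * i:3 * (i + 1)] for frame i.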

This may also break some of the training loop that has to do with visualizing the images, since those functions are expecting 3-channel inputs.

Let me know how this works for you! And be sure to use TensorFlow 0.12 (the code doesn't work on the latest versions yet).

from adversarial_video_generation.

bitxsw93 commented on September 24, 2024

Thank you for your kind reply.

The original paper gives models for 8 input frames and 8 output frames. Can I just modify SCALE_CONV_FMS, SCALE_KERNEL_SIZES and SCALE_FC_LAYER_SIZES and set HIST_LEN = 8 to get the 8 predicted frames?

Best wishes!

dyelax commented on September 24, 2024

Yes, after re-reading that section of the paper, if you tweak those hyperparameters it should work.

Please let me know how it turns out!

bitxsw93 commented on September 24, 2024

I don't understand why you added a column to SCALE_CONV_FMS_D, SCALE_FMS_G and SCALE_FC_LAYER_SIZES_D instead of just using the 4-input, 1-output models given in the original paper. When I tweak those hyperparameters for 8 inputs and 8 outputs, do I need to add it too?

dyelax commented on September 24, 2024

I don't understand your question. Could you point to the piece of code you are confused about, and give an example of what you think it should be?

bitxsw93 commented on September 24, 2024

In your code,

    SCALE_FMS_G = [[3 * HIST_LEN, 128, 256, 128, 3],
                   [3 * (HIST_LEN + 1), 128, 256, 128, 3],
                   [3 * (HIST_LEN + 1), 128, 256, 512, 256, 128, 3],
                   [3 * (HIST_LEN + 1), 128, 256, 512, 256, 128, 3]]

The first and last columns do not appear in the original paper. Why did you add them? Could you explain? Thank you.

dyelax commented on September 24, 2024

The first column is the depth of the input (3 channels * the number of input frames), and the last column is the depth of the output (3 channels * 1 output frame). I set it up this way so it would be easy to change the number of input or output frames.
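To make the arithmetic concrete, here is a rough sketch that rebuilds the quoted SCALE_FMS_G from the frame counts. build_scale_fms_g is a hypothetical helper, not part of the repository, and the hidden-layer sizes are copied from the snippet quoted above. One assumption to flag: it reads the "+ 1" in the first column of the later scales as the single predicted frame passed up from the previous scale, so it is replaced by the number of output frames. That reading is not confirmed in this thread.

```python
def build_scale_fms_g(hist_len, out_frames=1, channels=3):
    """Rebuild the generator feature-map lists from the frame counts."""
    # First column of scale 0: depth of the stacked input frames.
    in_depth = channels * hist_len
    # Assumption: later scales also receive the previous scale's
    # prediction, which is what the "+ 1" represents for 1 output frame.
    in_depth_with_prev = channels * (hist_len + out_frames)
    # Last column: depth of the stacked output frames.
    out_depth = channels * out_frames
    return [
        [in_depth, 128, 256, 128, out_depth],
        [in_depth_with_prev, 128, 256, 128, out_depth],
        [in_depth_with_prev, 128, 256, 512, 256, 128, out_depth],
        [in_depth_with_prev, 128, 256, 512, 256, 128, out_depth],
    ]


# 4 input frames and 1 output frame, as in the paper's base model.
scale_fms_g = build_scale_fms_g(hist_len=4, out_frames=1)
for fms in scale_fms_g:
    print(fms)
```

With hist_len=4 and out_frames=1 the first column comes out as 12, 15, 15, 15 and the last column as 3, which matches 3 * HIST_LEN and 3 * (HIST_LEN + 1) in the quoted code.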

bitxsw93 commented on September 24, 2024

My data is 8x8x3 rather than 32x32x3 like your training data, so I changed only TRAIN_HEIGHT and TRAIN_WIDTH to 8. But at the line preds = tf.nn.conv2d(last_input, conv_ws[i], [1, 1, 1, 1], padding=c.PADDING_D) I get this error: ValueError: Negative dimension size caused by subtracting 3 from 1 for 'discriminator/scale_net_0/calculation/convolutions/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1,3], [3,3,3,64].
What else should I change? Thank you.

dyelax commented on September 24, 2024

I'm guessing this is because there are four scale networks that each downsample the image by 2x, so if your original images are 8 pixels wide, the image input to the smallest scale network will be 1 pixel wide, which is too small to convolve over with 3x3 or 5x5 kernels.
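A quick way to check this before training, sketched under two assumptions: each of the four scales is half the size of the next one, and the discriminator convolutions use no padding (which the "Negative dimension size" error suggests). scale_sizes and valid_conv_output are hypothetical helpers, not functions from the repository:

```python
def scale_sizes(full_size, num_scales=4):
    """Input width at each scale, smallest first, halving per scale."""
    return [full_size // 2 ** (num_scales - 1 - i) for i in range(num_scales)]


def valid_conv_output(size, kernel):
    """Output width of a stride-1 convolution with no padding."""
    return size - kernel + 1


for full_size in (8, 32):
    sizes = scale_sizes(full_size)
    smallest = sizes[0]
    out = valid_conv_output(smallest, kernel=3)
    status = "ok" if out > 0 else "too small for a 3x3 kernel"
    print(f"full={full_size} scales={sizes} smallest_out={out} ({status})")
```

For 8-pixel images the smallest scale is 1 pixel wide and a 3x3 kernel gives 1 - 3 + 1 = -1, matching the [?,1,1,3] input shape in the error. For 32-pixel images the smallest scale is 4 pixels wide, which leaves a valid output of width 2.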
