D-Soft: Research and Implement an Image-to-Video Application
-
Variational Autoencoder (VAE): Encodes images into a compressed latent representation, then decodes them back to the original size, while learning the distribution of the data.
-
Generative Adversarial Network (GAN): Consists of two networks, the Generator and the Discriminator, that are trained against each other so that both improve. The Generator learns to produce data that looks real, and the Discriminator learns to tell real data from generated data.
-
Flow-based Generative Model: Generates new data similar to the training data through an invertible transformation, which also makes it possible to compute exactly how likely a given output is.
-
Auto-Regressive Model: Models the conditional probability of each pixel given the previous pixels, then samples from that probability distribution to generate new data.
-
Diffusion Model: Systematically and slowly destroys structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data.
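
The forward diffusion process from the last item can be sketched in a few lines. This is a minimal illustration, assuming the DDPM-style closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with a linear noise schedule. The function names (`linear_betas`, `cumulative_alphas`, `q_sample`) and the schedule values are assumptions made for this example, not something this project defines.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in one step; returns (x_t, noise)."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    return xt, noise


betas = linear_betas(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [0.5, -0.25, 1.0, 0.0]  # a toy "image" of four pixel values
x_early, _ = q_sample(x0, 10, alpha_bars, rng)    # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bars, rng)    # almost pure noise
```

Because `alpha_bars` shrinks toward zero as `t` grows, the signal term fades and the noise term dominates, which is the "slow destruction of structure" described above.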
-
Encoder: Extract features from the input image, reduce the spatial information, and compress the image into a smaller size.
-
Decoder: Upsample the features to the original size, and restore the spatial information.
-
Skip Connections: Connect the encoder and decoder layers to preserve the spatial information.
-
Output Layer: Produce the final segmentation map with the same spatial dimensions as the input image.
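
The four components above can be traced on a toy 1D signal to see how the spatial size shrinks, grows back, and is preserved through skip connections. This is only a shape-level sketch: average pooling and value repetition stand in for the learned convolutions of a real UNet, and all function names here are hypothetical. It assumes the input length is divisible by 4.

```python
def downsample(channels):
    """Encoder step: halve the length of every channel by averaging pairs."""
    return [[(c[i] + c[i + 1]) / 2.0 for i in range(0, len(c), 2)]
            for c in channels]


def upsample(channels):
    """Decoder step: double the length by repeating each value."""
    return [[v for v in c for _ in range(2)] for c in channels]


def skip_concat(decoder_features, encoder_features):
    """Skip connection: stack encoder channels next to decoder channels."""
    assert len(decoder_features[0]) == len(encoder_features[0])
    return decoder_features + encoder_features


def tiny_unet(signal, depth=2):
    """Run encoder, decoder, skips and output layer on a 1D signal."""
    x = [list(signal)]  # start with a single channel
    skips = []
    for _ in range(depth):            # encoder: save features, then shrink
        skips.append(x)
        x = downsample(x)
    for skip in reversed(skips):      # decoder: grow, then merge the skip
        x = upsample(x)
        x = skip_concat(x, skip)
    length = len(x[0])
    # Output layer: collapse all channels into one by averaging.
    return [sum(c[i] for c in x) / len(x) for i in range(length)]


output = tiny_unet([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
print(len(output))  # same length as the input signal
```

Without `skip_concat`, the decoder would only see the coarse bottleneck values; the skip connections are what carry the fine spatial detail back to the output.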
The UNet can take additional information in the form of embeddings:
-
Time embedding: Encodes the current timestep, and therefore the noise level.
-
Context embedding: Controls the content of the generated image.
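
One common way to build a time embedding is the sinusoidal encoding used in many diffusion implementations. The notes above do not say which encoding this project uses, so the sketch below is an assumption; `timestep_embedding`, `dim`, and `max_period` are illustrative names.

```python
import math


def timestep_embedding(t, dim, max_period=10000.0):
    """Map a timestep t to a vector of length dim using sines and cosines."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Frequencies fall geometrically from 1 down toward 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return ([math.sin(t * f) for f in freqs] +
            [math.cos(t * f) for f in freqs])


emb_start = timestep_embedding(0, 8)    # sine half is 0.0, cosine half is 1.0
emb_later = timestep_embedding(250, 8)  # a different vector for a later step
```

In a typical setup the resulting vector is passed through a small MLP and added to the UNet feature maps, so every layer knows how noisy its input is.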
-
Download Pretrained Model:
-
Load Pretrained Model
-
Fine Tuning
-
Sampling Function
-
Generation
We can follow this tutorial to fine-tune the model: Colab
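
As a rough picture of what the Sampling Function and Generation steps involve, the sketch below implements DDPM-style ancestral sampling in plain Python. It is an assumption that the project samples this way. `make_schedule`, `sample`, and `toy_predict_noise` are hypothetical names, and the toy predictor stands in for the fine-tuned UNet.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (assumed) plus alphas and cumulative alphas."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def sample(predict_noise, size, num_steps=1000, seed=0):
    """Start from Gaussian noise and denoise step by step down to t = 0."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t)  # the model's estimate of the added noise
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # variance choice sigma_t^2 = beta_t
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x


# Hypothetical stand-in for the fine-tuned UNet: the exact noise predictor
# for toy data that is always zero, where x_t = sqrt(1 - alpha_bar_t) * eps.
_, _, _alpha_bars = make_schedule(1000)


def toy_predict_noise(x, t):
    return [xi / math.sqrt(1.0 - _alpha_bars[t]) for xi in x]


generated = sample(toy_predict_noise, size=4, num_steps=1000)
```

With the toy predictor the samples land very close to zero, the only point in the toy data. In the real pipeline `predict_noise` would call the fine-tuned model with the time and context embeddings, and `x` would be an image or video latent rather than a short list.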
- Authentication: Controls access to API endpoint services and resources
U-Net: Convolutional Networks for Biomedical Image Segmentation [Paper]
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [Paper]
Stable Video Diffusion [Paper]
Latent Flow Diffusion Models [Paper]
Denoising Diffusion Probabilistic Models [Paper]