Code Monkey home page Code Monkey logo

longercrafter's Introduction

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling

๐Ÿ”ฅ๐Ÿ”ฅ๐Ÿ”ฅ The LongerCrafter for longer high-quality video generation are now released!

โœ… totally no tuning ย ย ย ย  โœ… less than 20% extra time ย ย ย ย  โœ… support 512 frames ย ย ย ย 

ย ย ย ย ย  ย ย ย ย ย  Replicate

Haonan Qiu, Menghan Xia*, Yong Zhang, Yingqing He,
Xintao Wang, Ying Shan, and Ziwei Liu*


(* corresponding author)

From Tencent AI Lab and Nanyang Technological University.

Input: "A chihuahua in astronaut suit floating in space, cinematic lighting, glow effect";
Resolution: 1024 x 576; Frames: 64.

Input: "Campfire at night in a snowy forest with starry sky in the background";
Resolution: 1024 x 576; Frames: 64.

๐Ÿ”† Introduction

๐Ÿค—๐Ÿค—๐Ÿค— LongerCrafter (FreeNoise) is a tuning-free and time-efficient paradigm for longer video generation based on pretrained video diffusion models.

1. Longer Single-Prompt Text-to-video Generation

Longer single-prompt results. Resolution: 256 x 256; Frames: 512. (Compressed)

2. Longer Multi-Prompt Text-to-video Generation

Longer multi-prompt results. Resolution: 256 x 256; Frames: 256. (Compressed)

๐Ÿ“ Changelog

  • [2023.10.24]: ๐Ÿ”ฅ๐Ÿ”ฅ Release the LongerCrafter (FreeNoise), longer video generation!
  • [2023.10.25]: ๐Ÿ”ฅ๐Ÿ”ฅ Release the 256x256 model and support multi-prompt generation!

๐Ÿงฐ Models

Model Resolution Checkpoint Description
VideoCrafter (Text2Video) 320x512 Temporarily Unavailable Support 128 frames on NVIDIA A100 (40GB)
VideoCrafter (Text2Video) 576x1024 Hugging Face Support 64 frames on NVIDIA A100 (40GB)
VideoCrafter (Text2Video) 256x256 Hugging Face Support 512 frames on NVIDIA A100 (40GB)

(Reduce the number of frames when you have smaller GPUs, e.g. 256x256 resolutions with 64 frames.)

โš™๏ธ Setup

Install Environment via Anaconda (Recommended)

conda create -n freenoise python=3.8.5
conda activate freenoise
pip install -r requirements.txt

๐Ÿ’ซ Inference

1. Longer Text-to-Video

  1. Download pretrained T2V models via Hugging Face, and put the model.ckpt in checkpoints/base_1024_v1/model.ckpt.
  2. Input the following commands in terminal.
  sh scripts/run_text2video_freenoise_1024.sh

2. Longer Multi-Prompt Text-to-Video

  1. Download pretrained T2V models via Hugging Face, and put the model.pth in checkpoints/base_256_v1/model.pth.
  2. Input the following commands in terminal.
  sh scripts/run_text2video_freenoise_mp_256.sh

๐Ÿงฒ Support For Other Models

FreeNoise is supposed to work on other similar frameworks. An easy way to test compatibility is by shuffling the noise to see whether a new similar video can be generated (set eta to 0). If your have any questions about applying FreeNoise to other frameworks, feel free to contact Haonan Qiu.

๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ฆ Crafter Family

VideoCrafter: Framework for high-quality video generation.

ScaleCrafter: Tuning-free method for high-resolution image/video generation.

TaleCrafter: An interactive story visualization tool that supports multiple characters.

๐Ÿ˜‰ Citation

@misc{qiu2023freenoise,
      title={FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling}, 
      author={Haonan Qiu and Menghan Xia and Yong Zhang and Yingqing He and Xintao Wang and Ying Shan and Ziwei Liu},
      year={2023},
      eprint={2310.15169},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

๐Ÿ“ข Disclaimer

We develop this repository for RESEARCH purposes, so it can only be used for personal/research/non-commercial purposes.


longercrafter's People

Contributors

arthur-qiu avatar chenxwh avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.