mPLUG-Owl-Inference

Batch video inference notebook and script for mPLUG-Owl. It is very barebones, but it makes it quick to start using the model. Video inference with the default settings takes 15-16 GB of VRAM. Lowering num_frames to 16 or 12 (which may also perform better) may make it easier to run on 16 GB graphics cards.
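To illustrate why a lower num_frames reduces memory use, here is a minimal sketch of uniform frame sampling. It assumes that the frames are picked at evenly spaced positions across the clip; the helper name `sample_frame_indices` is hypothetical and is not taken from the script or from mPLUG-Owl itself.

```python
def sample_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick up to num_frames evenly spaced frame indices from a clip.

    Illustrative only: the real script may sample frames differently.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames >= total_frames:
        # The clip is shorter than the requested frame count: use every frame.
        return list(range(total_frames))
    # Split the clip into num_frames equal segments and take each midpoint.
    step = total_frames / num_frames
    return [int(step * i + step / 2) for i in range(num_frames)]
```

Under this assumption, every sampled frame is one more image for the model to encode, so going from a larger frame count down to 16 or 12 shrinks the visual input roughly in proportion.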

Installation

Use the install-mpeg-owl.ipynb notebook to install mPLUG-Owl. If there are issues, refer to the main repo.

The script replaces the model weights because the HF model is out of date and has a NaN bug.

If you place the video files you would like processed in the "videos" folder, you can run the mplug-owl-inference.py script to process all the videos in that folder. Otherwise, you can pass arguments to the script to process a different folder.

usage: mplug-owl-inference

options:
  -h, --help            show this help message and exit
  -f FOLDER, --folder FOLDER
                        Folder containing videos to caption
  -j JSON, --json JSON  JSON file to save captions to
  -m MAX_LENGTH, --max-length MAX_LENGTH
                        Max length of captions
  -k TOP_K, --top-k TOP_K
                        Top k for sampling
  -n NUM_FRAMES, --num-frames NUM_FRAMES
                        Number of frames to process
  -s SKIP_EXISTING, --skip-existing SKIP_EXISTING
                        Skip videos that already exist in the JSON
  -D PROMPT, --prompt PROMPT
                        The main descriptive prompt for the video
  -C PROMPT_CLASSIFICATION, --prompt-classification
                        The main classifier prompt for the video
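As a rough sketch, the options above can be assembled into a command line from Python. The flag names come from the help text; the helper `build_command`, the folder name "clips", and the file name "captions.json" are hypothetical and only for illustration. Flags that are left unset are omitted so that the script falls back to its own defaults.

```python
import sys


def build_command(folder=None, json_path=None, max_length=None,
                  top_k=None, num_frames=None, prompt=None):
    """Build an argument list for mplug-owl-inference.py (illustrative)."""
    cmd = [sys.executable, "mplug-owl-inference.py"]
    # Map each documented long flag to the value supplied by the caller.
    options = {
        "--folder": folder,
        "--json": json_path,
        "--max-length": max_length,
        "--top-k": top_k,
        "--num-frames": num_frames,
        "--prompt": prompt,
    }
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Example: caption the videos in "clips" using 12 frames per video.
command = build_command(folder="clips", json_path="captions.json", num_frames=12)
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains the script. Passing a list rather than a single string avoids shell quoting problems when a prompt contains spaces.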
