the-database / mpv-upscale-2x_animejanai Goto Github PK

View Code? Open in Web Editor NEW

313.0 13.0 6.0 138.44 MB

Real-time anime upscaling to 4k in mpv with Real-ESRGAN compact models

License: Other

Lua 1.92% Python 75.12% Batchfile 0.09% C# 22.87%

4k anime anime-upscaling esrgan mpv real-esrgan super-resolution vapoursynth video nvidia tensorrt

mpv-upscale-2x_animejanai's Introduction

Upscaling Anime in mpv with 2x_AnimeJaNai V3

^{(click image to enlarge)}

Overview

This project provides a collection of Real-ESRGAN Compact ONNX upscaling models, along with a custom build of mpv video player. The video player (currently Windows only), enables real-time upscaling of 1080p content to 4K by running these models using TensorRT (NVIDIA only) or DirectML (for AMD or Intel Arc). While the default configuration upscales using the 2x_AnimeJaNai models, it can be easily customized to utilize any Real-ESRGAN Compact ONNX models.

Join the JaNai Discord server to get the latest news, download pre-release and experimental models, get support and ask questions, share your screenshots (use the s key in mpv), or share your feedback. 日本語も大丈夫です。

Usage Instructions

Ensure your NVIDIA graphics drivers are up to date. Download and extract the latest release archive of mpv-upscale-2x_animejanai. Open the video player at mpvnet.exe.

When playing a video for the first time, a TensorRT engine file will be created for the selected ONNX model. Playback will be paused and a command prompt box will open. Please make sure to wait while the engine is created. Engine creation only needs to happen once per model. Playback will resume on its own when finished.

To confirm upscaling status, press ctrl+J to view upscaling stats. This shows the current profile, and the currently running upscaling models if any.

The player is preconfigured to upscale with 2x_AnimeJaNai models, and makes 3 upscaling profiles available by default. The available profiles are described in more detail below. Any of these profiles can be selected on the fly using the keybinding listed below.

Profile	Description	Keybinding	Minimum recommended GPU for upscaling 1080p to 4k
Quality	Highest quality model	`Shift+1`	RTX 4090
Balanced	High quality model which trades slight quality for major performance gains	`Shift+2`	RTX 3080
Performance	Fastest performance model which sacrifices a bit more quality	`Shift+3`	RTX 3060

The default upscaling profile is the Balanced profile which is recommended for users running an NVIDIA RTX 3080 or higher.

Customizing Profiles and Other Settings

Upscaling can be further customized using the AnimeJaNaiConfEditor which can be launched by pressing ctrl+E from mpvnet. The editor allows the setup of up to 9 custom slots and also the use of custom chains, conditional settings based on video resolution and framerate, downscaling to improve performance, and more. The default upscaling profile can also be set using the conf editor.

All other mpv settings can be configured by editing mpv-upscale-2x_animejanai/portable_config/mpv.conf (see the mpv manual for all options) for mpv options or mpv-upscale-2x_animejanai/portable_config/input.conf for mpv keybindings.

By default, screenshots can be taken with the s key and are stored in mpv-upscale-2x_animejanai/portable_config/screenshots.

Setup for AMD or Intel Arc users.

mpv-upscale-2x_animejanai is configured to use TensorRT by default for optimal performance, but TensorRT requires an NVIDIA GPU. Users with AMD or Intel Arc GPUs can use DirectML instead. See the wiki page for detailed instructions.

2x_AnimeJaNai Models

The 2x_AnimeJaNai models are a collection of real-time 2x Real-ESRGAN Compact, UltraCompact, and SuperUltraCompact models designed specifically for doubling the resolution of HD and SD models.

2x_AnimeJaNai HD V3 Models

Most HD anime are not produced in native 1080p resolution but rather have a production resolution between 720p and 1080p. When the anime is distributed to consumers via TV broadcast, web streaming, or home video, the video is scaled up to 1080p, leading to scaling artifacts and a loss of image clarity in the source video. The aim of these models is to address these scaling and blur-related issues while upscaling to deliver a result that appears as if the anime was originally mastered in 4K resolution.

The development of the V3 models spanned over seven months, during which over 100 release candidate models were trained and meticulously refined. The V3 models introduce several notable improvements compared to their V2 counterparts, including:

More faithful appearance to original source
Improved handling of oversharpening artifacts, ringing, aliasing
Better at preserving intentional blur in scenes using depth of field
More accurate line colors, darkness, and thickness
Better preservation of soft shadow edges

Overall, the V3 models yield significantly more natural and faithful results compared to the V2 models.

2x_AnimeJaNai SD V1 Models

2x_AnimeJaNai SD V1 models are in developmnent. The latest release of mpv-upscale-2x_animejanai includes an early beta model for 2x_AnimeJaNai SD V1. While the 2xAnimeJaNai HD models can also work well for some SD sources, those models were specifically trained to upscale HD anime and don't always work well for SD sources. The SD models are designed to upscale SD anime to appear as if the anime was mastered in HD resolution. With sufficient hardware, these models can be stacked with the HD models to upscale SD anime to 4k resolution.

Benchmarks

Benchmarks for various hardware configurations tested against various upscaling configurations are available on the wiki.

Support for Other Media Players

Any media player which supports external DirectShow filters should be able to run these models, by using avisynth_filter to get VapourSynth running in the video player.

Prerendering Videos using Other Graphics Cards

The 2x_AnimeJaNai_V2 ONNX models can be used on a PC with any graphics card to render upscaled videos, even when using graphics cards not fast enough for realtime playback. Please see the AnimeJaNaiConverterGui project to create upscaled video files using a Windows GUI. Other options include chaiNNer or VSGAN-tensorrt-docker, which are multiplatform options for Windows and non-Windows users.

Related Projects

MangaJaNai: Upscale manga with ESRGAN models
VideoJaNai: Windows GUI for upscaling videos with extremely fast performance
traiNNer-redux: Software for training upscaling models

Acknowledgements

Upscale Wiki and associated Discord server
422415 for significant assistance in dataset preparation and continuous feedback during development of V2 models
Community feedback on V1 models
MPV_lazy and vs-mlrt
traiNNer-redux
Dataset Destroyer
Real-ESRGAN
OpenModelDB
getnative and anibin

mpv-upscale-2x_animejanai's People

Contributors

Stargazers

Watchers

Forkers

dvize zineos rest-ia nakedlittlezombie cnwxzhaoyang mzb98

mpv-upscale-2x_animejanai's Issues

Do you plan to release the missing SuperUltraCompact models?

I use 2x_AnimeJaNai_ Strong_V1_ SuperUltraCompact_net_g_100000 on my 1060 GTX card to upsacle sd material to fullhd.

It would be nice to have the normal and soft variant of the model. Do you plan to train and release them?

Tip regarding nvenc encoding in the bat-files.

Instead of
set video_settings=hevc_nvenc -preset slow -profile:v main10 -b:v 50M
use
set video_settings=hevc_nvenc -preset p7 -profile:v main10 -b:v 50M
instead.

Also works for h264_nvenc.

You can read more about the more modern nvenc presets here:
https://developer.nvidia.com/blog/introducing-video-codec-sdk-10-presets/

Slow seek performance

Works great but can anything be done to improve the seek performance.
Running a 4090 + i7 12700k.
Are the buffers to be filled the cause of pause on seek? If so, can they be reduced.

Encoding script hangs and leaks RAM

animejanai_v2_encode.vpy doesn't work, needs investigation

I just made the setting to hold the frame drop

This time, I've made optimization after a lot of trial and error.

The amendments are as follows

1. Add "scale_to_540, 675, 810"

Upscale_twice or 60 frame videos above 1080p almost unconditionally get frame drops.
To find a solution to this, I tried to manipulate the "animejanai_v2.conf" data and find a figure that prevents frame drops while preserving the most quality.

In conclusion, I think the frame drop was the best when it was "resize_height_before_first_2x=540", and I started to make a code based on this

scale_to_810: 1440p60 upscaling
scale_to_675 : 1080p60 Upscaling
scale_to_540 : 809p30~540p30 upscale_twice
scale_to_1080: 1079p to 810p upscaling

2. Different engine settings for different situations

Compact / UltraCompact/ SuperUltraCompact (strong.v1)

I used all three of these to make the best combination

30 frames
2159p - 810p >>> resize1080 + UltraCompact >>> 2160p
809p - 540p >>> resize540 + SuperUltraCompact + SpuerUltraCompact >>> 2160p
539p - 1p >>> Compact + UltraCompact >>> 2156p - 4p

60 frames
2159p - 1081p >>> resize810 + SuperUltraCompact >>> 1620p
1080p - 810p >>> resize675 + UltraCompact >>> 1350p
809p - 540p >>> UltraCompact >>> 1618p - 1082p
539p - 1p >>> Compact >>> 1078p - 2p

I am very pleased with this result

rife_cuda.py also I've tried this too. but I think it's going to be hard.

Below is the code I used

I'll mark ###new### for the part where I fixed the code

animejanai_v2.py

import vapoursynth as vs
import os
import subprocess
import logging
import configparser
import sys
from logging.handlers import RotatingFileHandler

sys.path.append(os.path.dirname(os.path.abspath(__file__)))

import rife_cuda
import animejanai_v2_config
# import gmfss_cuda

# trtexec num_streams
TOTAL_NUM_STREAMS = 4

core = vs.core
core.num_threads = 4  # can influence ram usage

plugin_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                           r"..\..\vapoursynth64\plugins\vsmlrt-cuda")
model_path = os.path.join(plugin_path, r"..\models\animejanai")

formatter = logging.Formatter(fmt='%(asctime)s %(levelname)-8s %(message)s',
                              datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger('animejanai_v2')

config = {}



def init_logger():
    global logger
    logger.setLevel(logging.DEBUG)
    rfh = RotatingFileHandler(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'animejanai_v2.log'),
                              mode='a', maxBytes=1 * 1024 * 1024, backupCount=2, encoding=None, delay=0)
    rfh.setFormatter(formatter)
    rfh.setLevel(logging.DEBUG)
    logger.addHandler(rfh)


# model_type: HD or SD
# binding: 1 through 9
def find_model(model_type, binding):
    section_key = f'slot_{binding}'
    key = f'{model_type.lower()}_model'

    if section_key in config:
        if key in config[section_key]:
            return config[section_key][key]
    return None


def create_engine(onnx_name):
    onnx_path = os.path.join(model_path, f"{onnx_name}.onnx")
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(onnx_path)

    engine_path = os.path.join(model_path, f"{onnx_name}.engine")

    subprocess.run([os.path.join(plugin_path, "trtexec"), "--fp16", f"--onnx={onnx_path}",
                    "--minShapes=input:1x3x8x8", "--optShapes=input:1x3x1080x1920", "--maxShapes=input:1x3x1080x1920",
                    f"--saveEngine={engine_path}", "--tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT"],
                   cwd=plugin_path)


def scale_to_1080(clip, w=1920, h=1080):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

###new###
def scale_to_810(clip, w=1440, h=810):
   if clip.width / clip.height > 16 / 9:
      prescalewidth = w
      prescaleheight = w * clip.height / clip.width
   else:
      prescalewidth = h * clip.width / clip.height
      prescaleheight = h
   return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

###new###
def scale_to_675(clip, w=1200, h=675):
   if clip.width / clip.height > 16 / 9:
      prescalewidth = w
      prescaleheight = w * clip.height / clip.width
   else:
      prescalewidth = h * clip.width / clip.height
      prescaleheight = h
   return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

###new###
def scale_to_540(clip, w=960, h=540):
   if clip.width / clip.height > 16 / 9:
      prescalewidth = w
      prescaleheight = w * clip.height / clip.width
   else:
      prescalewidth = h * clip.width / clip.height
      prescaleheight = h
   return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

###new###
def upscale2x(clip, sd_engine_name, hd_engine_name, shd_engine_name, num_streams):
    if clip.height == 675 or clip.width == 1200:
        engine_name = hd_engine_name
    else:
        if (clip.height < 540 or clip.width < 960):
            engine_name = sd_engine_name
        if (clip.height == 1080 or clip.width == 1920) or ((clip.height > 540 and clip.width > 960) and (clip.height < 1080 or clip.width < 1920)):
            engine_name = hd_engine_name
        if (clip.height == 540 or clip.width == 960) or (clip.height == 810 or clip.width == 1440):
            engine_name = shd_engine_name
    if engine_name is None:
        return clip
    engine_path = os.path.join(model_path, f"{engine_name}.engine")

    message = f"upscale2x: scaling 2x from {clip.width}x{clip.height} with engine={engine_name}; num_streams={num_streams}"
    logger.debug(message)
    print(message)

    if not os.path.isfile(engine_path):
        create_engine(engine_name)

    return core.trt.Model(
        clip,
        engine_path=engine_path,
        num_streams=num_streams,
    )
###new###
def upscale22x(clip, hd_engine_name, shd_engine_name, num_streams):
    if clip.height == 1080 or clip.width == 1920:
        engine_name = shd_engine_name
    else:
        if clip.height < 1080 or clip.width < 1920:
            engine_name = hd_engine_name
    if engine_name is None:
        return clip
    engine_path = os.path.join(model_path, f"{engine_name}.engine")

    message = f"upscale22x: scaling 2x from {clip.width}x{clip.height} with engine={engine_name}; num_streams={num_streams}"
    logger.debug(message)
    print(message)

    if not os.path.isfile(engine_path):
        create_engine(engine_name)

    return core.trt.Model(
        clip,
        engine_path=engine_path,
        num_streams=num_streams,
    )



def run_animejanai(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                   resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                   resize_to_1080_before_second_2x, upscale_twice, use_rife):
    if do_upscale:
        colorspace = "709"
        colorlv = clip.get_frame(0).props._ColorRange
        fmt_in = clip.format.id

        if clip.height < 720 or clip.width < 1280:
            colorspace = "170m"

        if resize_height_before_first_2x != 0:
            resize_factor_before_first_2x = 1

        try:
            # try half precision first
            clip = vs.core.resize.Bicubic(clip, format=vs.RGBH, matrix_in_s=colorspace,
                                          width=clip.width/resize_factor_before_first_2x,
                                          height=clip.height/resize_factor_before_first_2x)

            clip = run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                                          resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                                          resize_to_1080_before_second_2x, upscale_twice, use_rife, colorspace, colorlv,
                                          fmt_in)
        except:
            clip = vs.core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s=colorspace,
                                          width=clip.width/resize_factor_before_first_2x,
                                          height=clip.height/resize_factor_before_first_2x)
            clip = run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                                          resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                                          resize_to_1080_before_second_2x, upscale_twice, use_rife, colorspace, colorlv,
                                          fmt_in)
            ###new###
    if use_rife:
        clip = rife_cuda.rife(clip, clip.width, clip.height, container_fps)

    clip.set_output()

    ###new###
def run_animejanai_upscale(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                          resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                          resize_to_1080_before_second_2x, upscale_twice, use_rife, colorspace, colorlv, fmt_in):
    ###new###
    if resize_height_before_first_2x == 540:
        if (clip.height >= 540 or clip.width >= 960) and container_fps >= 45:
            if (clip.height <= 1080 or clip.width <= 1920):
                clip = scale_to_675(clip)
            else:
                clip = scale_to_810(clip)
        else:
            if (clip.height >= 540 or clip.width >= 960) and clip.height < 810 and clip.width < 1440 and container_fps < 45:
                clip = scale_to_540(clip)
            if (clip.height >= 810 or clip.width >= 1440) and clip.height < 2160 and clip.width < 3840:
                clip = scale_to_1080(clip)



    # if not 540, error occurred at upscale2x, upscale22x
    if resize_height_before_first_2x != 540 and resize_height_before_first_2x != 0 :
        clip = scale_to_1080(clip, resize_height_before_first_2x * 16 / 9, resize_height_before_first_2x)

    # pre-scale 720p or higher to 1080 > NO
    if resize_720_to_1080_before_first_2x:
        if (clip.height >= 810 or clip.width >= 1440) and clip.height < 1080 and clip.width < 1920:
            clip = scale_to_1080(clip)

    num_streams = TOTAL_NUM_STREAMS
    if upscale_twice and ( clip.height <= 540 or clip.width <= 960 ) and container_fps < 45:
        num_streams /= 2

    # upscale 2x
    clip = upscale2x(clip, sd_engine_name, hd_engine_name, shd_engine_name, num_streams)

    # upscale 2x again if necessary
    if upscale_twice and ( clip.height <= 1080 or clip.width <= 1920 ) and container_fps < 45:
        # downscale down to 1080 if first 2x went over 1080,
        # or scale up to 1080 if enabled >> NO
        if resize_to_1080_before_second_2x and ( clip.height > 720 or clip.width > 1280):
            clip = scale_to_1080(clip)

        # upscale 2x again
        clip = upscale22x(clip, hd_engine_name, shd_engine_name, num_streams)


    fmt_out = fmt_in
    if fmt_in not in [vs.YUV410P8, vs.YUV411P8, vs.YUV420P8, vs.YUV422P8, vs.YUV444P8, vs.YUV420P10, vs.YUV422P10,
                      vs.YUV444P10]:
        fmt_out = vs.YUV420P10

    return vs.core.resize.Bicubic(clip, format=fmt_out, matrix_s=colorspace, range=1 if colorlv == 0 else None)

# keybinding: 1-9
def run_animejanai_with_keybinding(clip, container_fps, keybinding):
    sd_engine_name = find_model("SD", keybinding)
    hd_engine_name = find_model("HD", keybinding)
    shd_engine_name = find_model("SHD", keybinding)
    section_key = f'slot_{keybinding}'
    do_upscale = config[section_key].get(f'upscale_2x', True)
    upscale_twice = config[section_key].get(f'upscale_4x', True)
    use_rife = config[section_key].get(f'rife', True)
    resize_720_to_1080_before_first_2x = config[section_key].get(f'resize_720_to_1080_before_first_2x', True)
    resize_factor_before_first_2x = config[section_key].get(f'resize_factor_before_first_2x', 1)
    resize_height_before_first_2x = config[section_key].get(f'resize_height_before_first_2x', 0)
    resize_to_1080_before_second_2x = config[section_key].get(f'resize_to_1080_before_second_2x', True)

    if do_upscale:
        if sd_engine_name is None and hd_engine_name is None and shd_engine_name is None:
            raise FileNotFoundError(
                f"2x upscaling is enabled but no SD model and HD model defined for slot {keybinding}. Expected at least one of SD or HD model to be specified with sd_model or hd_model in animejanai.conf.")
        ###new###
    if (clip.height < 2160 or clip.width < 3840) and container_fps < 100:
        run_animejanai(clip, sd_engine_name, hd_engine_name, shd_engine_name, container_fps, resize_factor_before_first_2x,
                   resize_height_before_first_2x, resize_720_to_1080_before_first_2x, do_upscale,
                   resize_to_1080_before_second_2x, upscale_twice, use_rife)


def init():
    global config
    config = animejanai_v2_config.read_config()
    if config['global']['logging']:
        init_logger()


init()

animejanai_v2_1.vpy

import sys, os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

import animejanai_v2

animejanai_v2.run_animejanai_with_keybinding(video_in, container_fps, 1)

animejanai_v2.conf

[slot_1]

SD_model=2x_AnimeJaNai_Strong_V1_Compact_net_g_120000

HD_model=2x_AnimeJaNai_Strong_V1_UltraCompact_net_g_100000

SHD_model=2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000

resize_factor_before_first_2x=1

resize_height_before_first_2x=540

resize_720_to_1080_before_first_2x=no

upscale_2x=yes

upscale_4x=yes

resize_to_1080_before_second_2x=no

rife=no

Model for handling MPEG2 artifacts

This suggestion is from Lycoris2013. Low bitrate MPEG2 sources are very common among anime captured from Japanese broadcasts. The main 2x_AnimeJaNai models are intended to preserve details as much as possible so they cannot have heavy artifact correction which tends to also remove details. So a separate model could be trained specifically for these MPEG2 sources such as the image below.

Cannot create engine unless I downgrade TensorRT version

When attempting to build an engine, TensorRT throws errors and fails (trtexec log below). Nothing fixed it until I tried to downgrade TRT. From mpv-janai v2.0.2, I copied the entire vsmlrt-cuda folder into v3.0's vapoursynth64/plugins folder, overwriting the old one.

Then v3 finally generated the engine and I tested it as working, but trtexec still complained that because of INT64, the result might be less accurate. Is that true? Should I be worried about that?

Error log before I downgraded TRT (regular janai v3)

&&&& RUNNING TensorRT.trtexec [TensorRT v9200] # C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\..\vapoursynth64\plugins\vsmlrt-cuda\trtexec --fp16 --onnx=C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --skipInference --infStreams=4 --builderOptimizationLevel=4 --saveEngine=C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.engine --tacticSources=-CUDNN,-CUBLAS,-CUBLAS_LT
[05/19/2024-14:15:37] [I] === Model Options ===
[05/19/2024-14:15:37] [I] Format: ONNX
[05/19/2024-14:15:37] [I] Model: C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx
[05/19/2024-14:15:37] [I] Output:
[05/19/2024-14:15:37] [I] === Build Options ===
[05/19/2024-14:15:37] [I] Max batch: explicit batch
[05/19/2024-14:15:37] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[05/19/2024-14:15:37] [I] minTiming: 1
[05/19/2024-14:15:37] [I] avgTiming: 8
[05/19/2024-14:15:37] [I] Precision: FP32+FP16
[05/19/2024-14:15:37] [I] LayerPrecisions:
[05/19/2024-14:15:37] [I] Layer Device Types:
[05/19/2024-14:15:37] [I] Calibration:
[05/19/2024-14:15:37] [I] Refit: Disabled
[05/19/2024-14:15:37] [I] Weightless: Disabled
[05/19/2024-14:15:37] [I] Version Compatible: Disabled
[05/19/2024-14:15:37] [I] ONNX Native InstanceNorm: Disabled
[05/19/2024-14:15:37] [I] TensorRT runtime: full
[05/19/2024-14:15:37] [I] Lean DLL Path:
[05/19/2024-14:15:37] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[05/19/2024-14:15:37] [I] Exclude Lean Runtime: Disabled
[05/19/2024-14:15:37] [I] Sparsity: Disabled
[05/19/2024-14:15:37] [I] Safe mode: Disabled
[05/19/2024-14:15:37] [I] Build DLA standalone loadable: Disabled
[05/19/2024-14:15:37] [I] Allow GPU fallback for DLA: Disabled
[05/19/2024-14:15:37] [I] DirectIO mode: Disabled
[05/19/2024-14:15:37] [I] Restricted mode: Disabled
[05/19/2024-14:15:37] [I] Skip inference: Enabled
[05/19/2024-14:15:37] [I] Save engine: C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.engine
[05/19/2024-14:15:37] [I] Load engine:
[05/19/2024-14:15:37] [I] Profiling verbosity: 0
[05/19/2024-14:15:37] [I] Tactic sources: cublas [OFF], cublasLt [OFF], cudnn [OFF],
[05/19/2024-14:15:37] [I] timingCacheMode: local
[05/19/2024-14:15:37] [I] timingCacheFile:
[05/19/2024-14:15:37] [I] Enable Compilation Cache: Enabled
[05/19/2024-14:15:37] [I] errorOnTimingCacheMiss: Disabled
[05/19/2024-14:15:37] [I] Heuristic: Disabled
[05/19/2024-14:15:37] [I] Preview Features: Use default preview flags.
[05/19/2024-14:15:37] [I] MaxAuxStreams: -1
[05/19/2024-14:15:37] [I] BuilderOptimizationLevel: 4
[05/19/2024-14:15:37] [I] Calibration Profile Index: 0
[05/19/2024-14:15:37] [I] Input(s)s format: fp32:CHW
[05/19/2024-14:15:37] [I] Output(s)s format: fp32:CHW
[05/19/2024-14:15:37] [I] Input build shape (profile 0): input=1x3x8x8+1x3x1080x1920+1x3x1080x1920
[05/19/2024-14:15:37] [I] Input calibration shapes: model
[05/19/2024-14:15:37] [I] === System Options ===
[05/19/2024-14:15:37] [I] Device: 0
[05/19/2024-14:15:37] [I] DLACore:
[05/19/2024-14:15:37] [I] Plugins:
[05/19/2024-14:15:37] [I] setPluginsToSerialize:
[05/19/2024-14:15:37] [I] dynamicPlugins:
[05/19/2024-14:15:37] [I] ignoreParsedPluginLibs: 0
[05/19/2024-14:15:37] [I]
[05/19/2024-14:15:37] [I] === Inference Options ===
[05/19/2024-14:15:37] [I] Batch: Explicit
[05/19/2024-14:15:37] [I] Input inference shape : input=1x3x1080x1920
[05/19/2024-14:15:37] [I] Iterations: 10
[05/19/2024-14:15:37] [I] Duration: 3s (+ 200ms warm up)
[05/19/2024-14:15:37] [I] Sleep time: 0ms
[05/19/2024-14:15:42] [I] Idle time: 0ms
[05/19/2024-14:15:42] [I] Inference Streams: 4
[05/19/2024-14:15:42] [I] ExposeDMA: Disabled
[05/19/2024-14:15:42] [I] Data transfers: Enabled
[05/19/2024-14:15:42] [I] Spin-wait: Disabled
[05/19/2024-14:15:42] [I] Multithreading: Disabled
[05/19/2024-14:15:42] [I] CUDA Graph: Disabled
[05/19/2024-14:15:42] [I] Separate profiling: Disabled
[05/19/2024-14:15:42] [I] Time Deserialize: Disabled
[05/19/2024-14:15:42] [I] Time Refit: Disabled
[05/19/2024-14:15:42] [I] NVTX verbosity: 0
[05/19/2024-14:15:42] [I] Persistent Cache Ratio: 0
[05/19/2024-14:15:42] [I] Optimization Profile Index: 0
[05/19/2024-14:15:42] [I] Inputs:
[05/19/2024-14:15:42] [I] === Reporting Options ===
[05/19/2024-14:15:42] [I] Verbose: Disabled
[05/19/2024-14:15:42] [I] Averages: 10 inferences
[05/19/2024-14:15:42] [I] Percentiles: 90,95,99
[05/19/2024-14:15:42] [I] Dump refittable layers:Disabled
[05/19/2024-14:15:42] [I] Dump output: Disabled
[05/19/2024-14:15:42] [I] Profile: Disabled
[05/19/2024-14:15:42] [I] Export timing to JSON file:
[05/19/2024-14:15:42] [I] Export output to JSON file:
[05/19/2024-14:15:42] [I] Export profile to JSON file:
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] === Device Information ===
[05/19/2024-14:15:42] [I] Available Devices:
[05/19/2024-14:15:42] [I]   Device 0: "NVIDIA GeForce RTX 3080" UUID: GPU-44b0b0ec-4a4a-2291-c949-ad5f2d47ac82
[05/19/2024-14:15:42] [I] Selected Device: NVIDIA GeForce RTX 3080
[05/19/2024-14:15:42] [I] Selected Device ID: 0
[05/19/2024-14:15:42] [I] Selected Device UUID: GPU-44b0b0ec-4a4a-2291-c949-ad5f2d47ac82
[05/19/2024-14:15:42] [I] Compute Capability: 8.6
[05/19/2024-14:15:42] [I] SMs: 68
[05/19/2024-14:15:42] [I] Device Global Memory: 10239 MiB
[05/19/2024-14:15:42] [I] Shared Memory per SM: 100 KiB
[05/19/2024-14:15:42] [I] Memory Bus Width: 320 bits (ECC disabled)
[05/19/2024-14:15:42] [I] Application Compute Clock Rate: 1.8 GHz
[05/19/2024-14:15:42] [I] Application Memory Clock Rate: 9.501 GHz
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[05/19/2024-14:15:42] [I]
[05/19/2024-14:15:42] [I] TensorRT version: 9.2.0
[05/19/2024-14:15:42] [I] Loading standard plugins
[05/19/2024-14:15:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 8080, GPU 1166 (MiB)
[05/19/2024-14:15:50] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +2726, GPU +312, now: CPU 11097, GPU 1478 (MiB)
[05/19/2024-14:15:50] [I] Start parsing network model.
[05/19/2024-14:15:50] [I] [TRT] ----------------------------------------------------------------
[05/19/2024-14:15:50] [I] [TRT] Input filename:   C:\Users\heath\Documents\mpv-upscale-2x_animejanai-v3\animejanai\core\..\onnx\2x_AnimeJaNai_HD_V3_UltraCompact.onnx
[05/19/2024-14:15:50] [I] [TRT] ONNX IR version:  0.0.7
[05/19/2024-14:15:50] [I] [TRT] Opset version:    14
[05/19/2024-14:15:50] [I] [TRT] Producer name:    pytorch
[05/19/2024-14:15:50] [I] [TRT] Producer version: 2.1.2
[05/19/2024-14:15:50] [I] [TRT] Domain:
[05/19/2024-14:15:50] [I] [TRT] Model version:    0
[05/19/2024-14:15:50] [I] [TRT] Doc string:
[05/19/2024-14:15:50] [I] [TRT] ----------------------------------------------------------------
[05/19/2024-14:15:50] [I] Finished parsing network model. Parse time: 0.0284306
[05/19/2024-14:15:50] [I] Set shape of input tensor input for optimization profile 0 to: MIN=1x3x8x8 OPT=1x3x1080x1920 MAX=1x3x1080x1920
[05/19/2024-14:15:50] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[05/19/2024-14:15:53] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_lessZero = lt(/Conv_output_0', /PRelu_zero), name=/PRelu_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:15:53] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_lessZero = lt(/Conv_output_0', /PRelu_zero), name=/PRelu_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_1_lessZero = lt(/Conv_1_output_0', /PRelu_1_zero), name=/PRelu_1_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_1_lessZero = lt(/Conv_1_output_0', /PRelu_1_zero), name=/PRelu_1_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_2_lessZero = lt(/Conv_2_output_0', /PRelu_2_zero), name=/PRelu_2_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_2_lessZero = lt(/Conv_2_output_0', /PRelu_2_zero), name=/PRelu_2_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_3_lessZero = lt(/Conv_3_output_0', /PRelu_3_zero), name=/PRelu_3_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_3_lessZero = lt(/Conv_3_output_0', /PRelu_3_zero), name=/PRelu_3_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_4_lessZero = lt(/Conv_4_output_0', /PRelu_4_zero), name=/PRelu_4_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_4_lessZero = lt(/Conv_4_output_0', /PRelu_4_zero), name=/PRelu_4_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_5_lessZero = lt(/Conv_5_output_0', /PRelu_5_zero), name=/PRelu_5_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_5_lessZero = lt(/Conv_5_output_0', /PRelu_5_zero), name=/PRelu_5_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_6_lessZero = lt(/Conv_6_output_0', /PRelu_6_zero), name=/PRelu_6_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_6_lessZero = lt(/Conv_6_output_0', /PRelu_6_zero), name=/PRelu_6_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_7_lessZero = lt(/Conv_7_output_0', /PRelu_7_zero), name=/PRelu_7_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_7_lessZero = lt(/Conv_7_output_0', /PRelu_7_zero), name=/PRelu_7_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_8_lessZero = lt(/Conv_8_output_0', /PRelu_8_zero), name=/PRelu_8_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:01] [E] Error[9]: Skipping tactic 0x0000000000000000 due to exception [::0]
Error during shape inference of
/PRelu_8_lessZero = lt(/Conv_8_output_0', /PRelu_8_zero), name=/PRelu_8_less
Error is:
Input 0's element type (half) differs from input 1's element type (float).
[05/19/2024-14:16:10] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[05/19/2024-14:16:11] [I] [TRT] Total Host Persistent Memory: 56032
[05/19/2024-14:16:11] [I] [TRT] Total Device Persistent Memory: 0
[05/19/2024-14:16:11] [I] [TRT] Total Scratch Memory: 4608
[05/19/2024-14:16:11] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 40 steps to complete.
[05/19/2024-14:16:11] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 0.6522ms to assign 4 blocks to 40 nodes requiring 1111454208 bytes.
[05/19/2024-14:16:11] [I] [TRT] Total Activation Memory: 1111454208
[05/19/2024-14:16:11] [I] [TRT] Total Weights Memory: 670720
[05/19/2024-14:16:11] [I] [TRT] Engine generation completed in 21.705 seconds.
[05/19/2024-14:16:11] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.

Request for Benchmarks

I'm collecting benchmarks for different hardware configurations on the wiki here: https://github.com/the-database/mpv-upscale-2x_animejanai/wiki/Benchmarks

If you would like to contribute your benchmarks, please try downloading the prerelease archive 1.0.1 here: https://github.com/the-database/mpv-upscale-2x_animejanai/releases/tag/1.0.1

To run the benchmark, extract the release archive and then run the bat file at: mpv-upscale-2x_animejanai\portable_config\shaders\animejanai_v2_benchmark_all.bat. If running the bat just opens and closes the console Window immediately, an error has occurred. If that happens, please open a command window, navigate to the mpv-upscale-2x_animejanai\portable_config\shaders directory, and then run this command: .\animejanai_v2_benchmark_all.bat. Any errors that occurred should be printed to the console, please share them here.

Plex for Windows support

Since Plex for Windows is mpv-based, it should be possible to upscale with AnimeJaNai in Plex. The steps just need to be documented. Most likely setup will be similar to https://www.svp-team.com/wiki/SVP:Plex_Media_Player

Hope to add a non-super-resolution version, similar to Anime4K_Restore.

Whether it’s super-resolution after frame interpolation or frame interpolation after super-resolution, the performance requirements are too high.
I hope to use 1x_animejanai before frame interpolation and then apply other upscaling methods after frame Interpolation, with minimal impact on performance.

GLSL Shader lite version

Hi there,

Setting up animejanai v3 on a system with AMD Radeon graphics can be challenging, especially for noobie.
I'd like to request a lite version of animejanai v3 implemented as a GLSL shader for mpv player.

This would allow users easily use animejanai.

Example,
https://github.com/igv/FSRCNN-TensorFlow/releases/tag/1.1

Thank you.

I will share with the developer how I caught the video lagging phenomenon

First of all, thank you so much for making such a great program.
This time, I had difficulty introducing 2x_animejanai, including mpv. However, I found two points, corrected them, and I think I overcame them.

Computer used: 13600k & Nvida RTX 4070ti
My coding knowledge: convergence to zero
English proficiency: Poor English skill, but Google Translate handles it

I tried using Standard_V1_Ultra compact for the first time, but there was a huge roar and a drop frame phenomenon.

After several trials and errors, I overcame it by modifying the 2x_SharpLines.vpy value.

core.num_theads = 14

I saw in the task manager that my computer had a CPU 14 core and changed the data value.
I think there will be a lot of improvements if many people change it according to their computer performance.

I noticed that there was a huge overload in processing upscaling twice.

To summarize, I added "SHD_ENGINE" as shown in the example below.
If I put the COMPACT engine twice, it seemed quite burdensome for the computer, so I lowered the engine one step at a time and applied it.

The result was very satisfying and I would be very happy if it would help the developer.

Result
cpu 99% > 20 ~ 40 %
gpu 99% > 60 ~ 80 %

import vapoursynth as vs
import os

SD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_Compact_net_g_120000"
HD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_UltraCompact_net_g_100000"
SHD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000"

core = vs.core
core.num_threads = 14 # can influence ram usage
colorspace="709"

def scaleTo1080(clip, w=1920, h=1080):
if clip.width / clip.height > 16 / 9:
prescalewidth = w
prescaleheight = w * clip.height / clip.width
else:
prescalewidth = h * clip.width / clip.height
prescaleheight = h
return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

def upscale2x(clip):
engine_name = SD_ENGINE_NAME if clip.height < 720 else HD_ENGINE_NAME
return core.trt.Model(
clip,
engine_path=f"C:\Program Files (x86)\mpv-lazy-20230404-vsCuda\mpv-lazy\vapoursynth64\plugins\vsmlrt-cuda\{engine_name}.engine",
num_streams=4,
)

def upscale4x(clip):
engine_name = HD_ENGINE_NAME if clip.height < 720 else SHD_ENGINE_NAME
return core.trt.Model(
clip,
engine_path=f"C:\Program Files (x86)\mpv-lazy-20230404-vsCuda\mpv-lazy\vapoursynth64\plugins\vsmlrt-cuda\{engine_name}.engine",
num_streams=4,
)

clip = video_in

if clip.height < 720:
colorspace = "170m"

clip = vs.core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s=colorspace,

# width=clip.width/2.25,height=clip.height/2.25 # pre-downscale

)

# pre-scale 720p or higher to 1080

if clip.height >= 720 or clip.width >= 1280:
clip = scaleTo1080(clip)

# upscale 2x

clip = upscale2x(clip)

# upscale 2x again if necessary

if clip.height < 2160 and clip.width < 3840:

# downscale down to 1080 if first 2x went over 1080

if clip.height > 1080 or clip.width > 1920:
clip = scaleTo1080(clip)

# upscale 2x again << CHANGE IT>>

clip = upscale4x(clip)

clip = vs.core.resize.Bicubic(clip, format=vs.YUV420P8, matrix_s=colorspace)

clip.set_output()

Is there a way to simultaneously use AnimeJaNai with RIFE AI frame interpolation?

I'm using rife-cuda in mpv_lazy and wanted to see if I could get AnimeJaNai working alongside it. I'm totally new to using these machine learning models. Is the only way to do it to merge the two desired ONNX models? It's easy to apply anime4k shaders with RIFE so I'm hoping there is a solution to get this working too.

V2 models

V2 models are being developed to address some feedback provided for the V1 models:

V1 models are oversharpening, even the models which are not Strong. Some ringing artifacts are introduced in the line art and oversharpening artifacts including "dot" artifacts are introduced in the backgrounds, easily seen in backgrounds with trees for example.
V1 models can introduce unwanted line darkening.
In some scenarios, V1 models treat grain or detail as unwanted noise and denoises them. All grain and detail should be preserved and upscaled.

The primary goal of the V2 models is to produce results that appear as if the source was originally produced in 4K while faithfully retaining the original look as much as possible. This will be tracked by downscaling the upscaled results to the native resolution of the anime. The results should be difficult to distinguish from the original anime source. Performing this test on the V1 models more easily reveals the above oversharpening, line darkening, and loss of detail.

It's expected that V2 will not require Soft/Standard/Strong models, there will simply 3 models consisting of V2 Compact, V2 UltraCompact, and V2 SuperUltraCompact.

As with V1, V2 will not be intended for use on low quality sources with heavy artifacting, as those artifacts will be preserved and upscaled. But with less oversharpening, V2 should perform better than V1 on low quality sources.

V2 will undoubtedly lose some sharpness when directly compared to V1. V1 models will remain available for those that prefer the extra sharp look.

Screenshots of progress on the V2 models are included in the following comments. Please note those screenshots are not final and the released V2 model may produce different results.

Replace Powershell script with prepackaged release archives

The Powershell script should be deprecated in favor of releasing prepackaged archives for simpler installation. Two archives should be supplied - one as a full standalone mpv archive like mpv_lazy, and another which can be dropped into any existing mpv installation. TensorRT engines should be generated on the fly.

Can you provide a simple function for VapourSynth/avisynth_filter?

save-position-on-quit not working

When I press close the player and then open again it starts from the begining instead of where It was before. The option save-position-on-quit is set to yes on the config but it's not working properly

[Question] Amd Graphics Card

Will it be possible to do this (in realtime) with an amd card in the near future?
I sold my 1070ti and I'm going to get a 6800xt + 13600k. Was also considering the rtx 4070 but 12gb vram and 192 bit made me look at amd.

[vapoursynth] Value Error: could not convert string to float

I was able to upscale for some time and pretty quickly it stopped working after playing with profiles.
I have rife disabled in profiles.

Issue seems to be related to rife fixes.

[Request] Add Intel and AMD support (documentation)

Please add:

documentation for usage on Intel and AMD with DirectML (change to DirectML in global settings).
fp32 models as only those are working with DirectML.
V2 models as those are preferable for some video sources
fp32 SD model

It works fine on my Intel Arc A750.
Thank You for Your hard work :)

Can you give me links for ONNX model ?

Dolby Vision Outputs Wrong Colors

MPV is able to properly decode dolby vision files with the following configuration lines but when using animejanai, it displays the wrong colors.

profile=gpu-hq
vo=gpu-next
gpu-api=d3d11
gpu-context=d3d11

With Animejanai

~1m 20s black screen after loading Video

After starting mpv or mpvnet and loading a Video over URL, it has a Black screen for more than a minute.

On normal MPV, it does not happen, starting the video instantly.

Is it, because it has to load the Model? Any way to speed up the process?

(RTX 3080, i7 8700k@5GHz, SSD, Win11)

Upscaler mute all the other sounds

Hello, when I play a video with the upscale feature, all other system sounds are muted. I want to talk on Discord while using the program, but when I start the video, I stop hearing it. Is there any way to hear the other programs besides the audio track of the video I'm playing?