duanzhiihao / qres-vae

Authors' PyTorch implementation of "Lossy Image Compression with Quantized Hierarchical VAEs"

License: Other

Python 4.76% Jupyter Notebook 95.24%

deep-learning image-compression lossy-image-compression pytorch

qres-vae's Issues

Question about Num Bits

I have seen the bits-per-pixel metric many times, and I find your code very easy to read, which motivates me to ask this question.

On line 24 of the evaluation script is the line:

num_bits = tmp_bit_path.stat().st_size

Would you mind elaborating on how this evaluation works, or pointing me to a reference? Perhaps the model's compression produces a variable-length sequence, and the line above merely reports the output file size?

Thank you in advance :D
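
For readers with the same question, here is a minimal sketch of how a bits-per-pixel figure can be derived from a file size. It is not the repository's evaluation script: the function name bits_per_pixel and the file name image.bits are hypothetical. Note that Path.stat().st_size is measured in bytes, so this sketch multiplies by 8; the single line quoted above does not show whether the script applies that factor elsewhere.

```python
import os
import tempfile
from pathlib import Path


def bits_per_pixel(bitstream_path, height, width):
    """Return the bits-per-pixel rate implied by the size of a compressed file."""
    num_bytes = Path(bitstream_path).stat().st_size  # st_size is in bytes
    num_bits = num_bytes * 8
    return num_bits / (height * width)


# Hypothetical example: a 1024-byte file standing in for a real bitstream.
with tempfile.TemporaryDirectory() as tmp_dir:
    fake_bitstream = Path(tmp_dir) / "image.bits"
    fake_bitstream.write_bytes(os.urandom(1024))
    # 1024 bytes = 8192 bits, spread over 64 * 128 = 8192 pixels
    print(bits_per_pixel(fake_bitstream, height=64, width=128))
```

Measuring the file on disk counts everything needed to decode the image, including any header bytes, which is why it is a common way to report the rate of a variable-length code.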

qres30m mean and std computed on imagenet

Hi,

I'm currently doing a PhD at Université Paris 13 in Paris, and I'm looking for deep learning algorithms for image compression.

Yours is particularly interesting, and I'm trying to train it on my own dataset (Waymo), but I'm running into some trouble.

I figured out that my problem comes from qres30m in the library.py file.

Could you explain how you got those shift and scale values, please?

Any help would be very much appreciated.

Thank you in advance.
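
The issue title suggests the shift and scale values are a per-channel mean and standard deviation computed on ImageNet; that reading comes from the title only and is not confirmed by the authors here. Under that assumption, equivalent statistics for another dataset could be computed roughly as follows. The function name channel_mean_std and the toy data are hypothetical, and pixel values are assumed to be RGB floats in [0, 1].

```python
import math


def channel_mean_std(images):
    """Per-channel mean and standard deviation over an iterable of images.

    Each image is an iterable of (r, g, b) pixel tuples with values in [0, 1].
    """
    count = 0
    total = [0.0, 0.0, 0.0]
    total_sq = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            count += 1
            for c, value in enumerate(pixel):
                total[c] += value
                total_sq[c] += value * value
    mean = [s / count for s in total]
    # Population variance E[x^2] - E[x]^2, clamped at 0 against rounding error.
    std = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(total_sq, mean)
    ]
    return mean, std


# Toy data: two "images" of two pixels each (hypothetical values).
toy_images = [
    [(0.0, 0.5, 1.0), (1.0, 0.5, 1.0)],
    [(0.0, 0.5, 1.0), (1.0, 0.5, 1.0)],
]
mean, std = channel_mean_std(toy_images)
print(mean, std)
```

Accumulating sums and squared sums keeps memory use constant, so the same loop works for a dataset that does not fit in memory.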

Compress inference issue

Hi!
I'm encountering another problem.
I'm trying to compress an image with the following code (which is based on your demo.ipynb code):

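import pickle
import time
from pathlib import Path

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

from models.library import qres34m

device = "cuda"  # the original snippet moves the input to "cuda"

# Load the trained model and switch it to compression mode.
# trained_weights_path is the checkpoint path (not shown in the post).
model = qres34m()
model.load_state_dict(torch.load(trained_weights_path)['model'])
model.eval()
model.compress_mode()
model.to(device)

# Read the image as a (1, C, H, W) tensor.
x = Image.open("original.png")
x = transforms.functional.to_tensor(x).to(device)
x = x.unsqueeze(0)

# Pad height and width up to the next multiple of 64.
h, w = x.size(2), x.size(3)
p = 64  # maximum 6 strides of 2
new_h = (h + p - 1) // p * p
new_w = (w + p - 1) // p * p
padding_left = (new_w - w) // 2
padding_right = new_w - w - padding_left
padding_top = (new_h - h) // 2
padding_bottom = new_h - h - padding_top
x_padded = F.pad(
    x,
    (padding_left, padding_right, padding_top, padding_bottom),
    mode="constant",
    value=0,
)

# Compress, then write the result to disk with pickle.
start = time.time()
out_enc = model.compress(x_padded)
enc_time = time.time() - start
with open("img.bits", 'wb') as f:
    pickle.dump(out_enc, f)

# Bits per pixel: file size in bits over the original (unpadded) pixel count.
bpp = (Path("img.bits").stat().st_size * 8.0) / (x.size(0) * x.size(2) * x.size(3))
print(bpp)

with open("img.bits", 'rb') as f:
    out_enc = pickle.load(f)

# Decompress and display the reconstruction.
start = time.time()
out_dec = model.decompress(out_enc)
dec_time = time.time() - start

temp = transforms.functional.to_pil_image(out_dec[0])
temp.show()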

But I get very strange results:
bpp = 73.605 (which is way too big!)
and the reconstructed image looks like this:

Do you know what I'm doing wrong?
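
As a point of comparison for the 73.605 bpp figure: an uncompressed 8-bit RGB image costs 24 bits per pixel, so the reported rate means the file written to disk is about three times larger than the raw pixels. The short sketch below only does that arithmetic; it does not diagnose the cause, and the helper name raw_bits_per_pixel is hypothetical.

```python
def raw_bits_per_pixel(channels=3, bit_depth=8):
    """Bits per pixel of an uncompressed image (default: 8-bit RGB)."""
    return channels * bit_depth


reported_bpp = 73.605  # value quoted in the issue
raw_bpp = raw_bits_per_pixel()
print(raw_bpp)                 # bits per pixel with no compression at all
print(reported_bpp / raw_bpp)  # how many times larger than raw the file is
```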

lambda values for MS-SSIM training

Hi!
Sorry to bother you again.
I couldn't find in your paper or code the list of lambda values that you used when training your network with the MS-SSIM loss.
Could you provide them, please?
Thanks!

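Hello, there is a problem reproducing the code

The paper mentions that the input image dimensions must be multiples of 64, but you also said that this is handled, so there should be no problem. However, I still ran into a problem when running your code. Is the uploaded code not the final version?
assert (im.shape[2] % self.max_stride == 0) and (im.shape[3] % self.max_stride == 0)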
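
One way to satisfy an assertion like the one quoted above is to pad the image before passing it to the model and crop the output afterwards. The sketch below mirrors the padding arithmetic from the "Compress inference issue" post and assumes max_stride is 64, as the issue text implies. The function names pad_amounts and crop_box are hypothetical and are not part of the repository.

```python
def pad_amounts(height, width, multiple=64):
    """Return (left, right, top, bottom) padding so both sides become
    multiples of `multiple`, splitting the padding evenly."""
    new_h = (height + multiple - 1) // multiple * multiple
    new_w = (width + multiple - 1) // multiple * multiple
    left = (new_w - width) // 2
    right = new_w - width - left
    top = (new_h - height) // 2
    bottom = new_h - height - top
    return left, right, top, bottom


def crop_box(height, width, multiple=64):
    """Return (top, bottom_end, left, right_end) slice bounds that recover
    the original region from an image padded with pad_amounts."""
    left, _, top, _ = pad_amounts(height, width, multiple)
    return top, top + height, left, left + width


# Hypothetical 500 x 750 image: padded to 512 x 768.
print(pad_amounts(500, 750))
print(crop_box(500, 750))
```

The tuple order (left, right, top, bottom) matches what torch.nn.functional.pad expects for the last two dimensions of a tensor, and the crop bounds can be applied to a reconstruction as out[..., top:bottom_end, left:right_end].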
