duanzhiihao / qres-vae
Authors' PyTorch implementation of "Lossy Image Compression with Quantized Hierarchical VAEs"
License: Other
I have seen the bits-per-pixel metric many times, and I find your code very easy to read, which motivates me to ask this question.
On line 24 of the evaluation script is the line:
num_bits = tmp_bit_path.stat().st_size
Would you mind elaborating on how this evaluation works, or pointing me to a reference? Perhaps the model's compression produces a variable-length sequence, and the line above merely reports the size of the output file?
Thank you in advance :D
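For readers with the same question, here is a minimal sketch of how a bits-per-pixel figure can be derived from a file size. It assumes the file holds nothing but the compressed bitstream. Note that `st_size` is measured in bytes, so it is multiplied by 8 here; whether the evaluation script does that conversion on another line is not visible in the excerpt above. The helper name `bits_per_pixel` and the 1024-byte stand-in file are illustrative only, not part of the repository.

```python
import tempfile
from pathlib import Path


def bits_per_pixel(bitstream_path: Path, height: int, width: int) -> float:
    """Compressed size in bits divided by the number of pixels."""
    num_bits = bitstream_path.stat().st_size * 8  # st_size is in bytes
    return num_bits / (height * width)


# Stand-in for a real compressed file: 1024 arbitrary bytes.
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_bit_path = Path(tmp_dir) / "image.bits"
    tmp_bit_path.write_bytes(bytes(1024))
    bpp = bits_per_pixel(tmp_bit_path, height=64, width=128)

print(bpp)  # 1024 * 8 bits over 64 * 128 pixels -> 1.0
```

Because an entropy coder emits a different number of bytes for each image, the size of the file on disk is the actual rate, which is why measuring the written file is a common way to report bpp.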
Hi,
I'm currently doing a PhD at Université Paris 13 in Paris and I'm looking into deep learning algorithms for image compression.
Yours is particularly interesting and I'm trying to train it on my own dataset (Waymo), but I'm running into some trouble...
I figured out that my problem comes from qres30m in the library.py file:
Could you explain how you obtained those shift and scale values, please?
Any help would be very much appreciated.
Thank you in advance.
Hi !
I'm encountering another problem:
I'm trying to compress an image with the following code (which is based on your demo.ipynb code):
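import pickle
import time
from pathlib import Path

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

from models.library import qres34m

device = torch.device("cuda")
# trained_weights_path must point to the trained checkpoint file

model = qres34m()
model.load_state_dict(torch.load(trained_weights_path)['model'])
model.eval()
model.compress_mode()
model.to(device)

# Load the image as a 1 x C x H x W tensor on the same device as the model
x = Image.open("original.png")
x = transforms.functional.to_tensor(x).to(device)
x = x.unsqueeze(0)

# Pad height and width up to the next multiple of 64, centering the image
h, w = x.size(2), x.size(3)
p = 64  # maximum 6 strides of 2
new_h = (h + p - 1) // p * p
new_w = (w + p - 1) // p * p
padding_left = (new_w - w) // 2
padding_right = new_w - w - padding_left
padding_top = (new_h - h) // 2
padding_bottom = new_h - h - padding_top
x_padded = F.pad(
    x,
    (padding_left, padding_right, padding_top, padding_bottom),
    mode="constant",
    value=0,
)

# Compress and write the result to disk
start = time.time()
out_enc = model.compress(x_padded)
enc_time = time.time() - start
with open("img.bits", 'wb') as f:
    pickle.dump(out_enc, file=f)

# Bits per pixel: file size in bits over the number of pixels in the unpadded image
bpp = (Path("img.bits").stat().st_size * 8.0) / (x.size(0) * x.size(2) * x.size(3))
print(bpp)

# Read the file back and decompress
with open("img.bits", 'rb') as f:
    out_enc = pickle.load(file=f)
start = time.time()
out_dec = model.decompress(out_enc)
dec_time = time.time() - start

temp = transforms.functional.to_pil_image(out_dec[0])
temp.show()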
But I get very strange results:
bpp = 73.605 (which is way too big!)
and the reconstructed image looks like this:
Do you know what I'm doing wrong?
Hi !
Sorry to bother you again...
But I couldn't find in your paper or code the list of lambda values that you used when training your net on the MS-SSIM loss.
Could you provide them please ?
Thanks !
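The paper mentions that the input image's height and width must be multiples of 64, but you also said that this case is handled, so it should not cause any problem. However, I still hit the following error when running your code. Is the uploaded code not the final version?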
assert (im.shape[2] % self.max_stride == 0) and (im.shape[3] % self.max_stride == 0)
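One possible workaround, sketched here under stated assumptions, is to pad the image to a multiple of 64 before it reaches the model and crop the reconstruction afterwards. The arithmetic is the same centered padding used in the compression snippet earlier in this thread. The helper names `pad_to_multiple` and `crop_box` are hypothetical and not part of the repository; the returned tuple follows the (left, right, top, bottom) order that `torch.nn.functional.pad` expects for the last two dimensions.

```python
def pad_to_multiple(height: int, width: int, multiple: int = 64):
    """Return (left, right, top, bottom) padding so both sides divide `multiple`."""
    new_h = (height + multiple - 1) // multiple * multiple  # round up
    new_w = (width + multiple - 1) // multiple * multiple
    left = (new_w - width) // 2
    right = new_w - width - left
    top = (new_h - height) // 2
    bottom = new_h - height - top
    return left, right, top, bottom


def crop_box(height: int, width: int, left: int, top: int):
    """Return (row_start, row_end, col_start, col_end) of the original image
    inside the padded one, for cropping the reconstruction back."""
    return top, top + height, left, left + width


# Example: a 500 x 750 image becomes 512 x 768 after padding.
left, right, top, bottom = pad_to_multiple(height=500, width=750)
padded_h = 500 + top + bottom
padded_w = 750 + left + right
rows_cols = crop_box(500, 750, left, top)
```

With a tensor `x` of shape (N, C, H, W), the padding would be applied as `F.pad(x, (left, right, top, bottom))`, and the decoded output cropped with `out[:, :, r0:r1, c0:c1]` using the values from `crop_box`.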