phoboslab / qoi
The “Quite OK Image Format” for fast, lossless image compression
License: MIT License
So I did a quick test: I tried to replace DIFF_8, DIFF_16 and DIFF_24 with GDIFF without adding new opcodes.
Here are my results. I think that 222 diff_8 + 454 gdiff_16 + 474 gdiff_24 is the best combination.
original
kodak 771
misc 398
screenshots 2582
textures 184
wallpaper 10674
222 gdiff_8 + 373 gdiff_16 + 474 gdiff_24
kodak 693
misc 412
screenshots 2401
textures 191
wallpaper 10501
222 gdiff_8 + 454 gdiff_16 + 555 gdiff_24
kodak 719
misc 401
screenshots 2501
textures 178
wallpaper 10412
222 gdiff_8 + 454 diff_16 + 555 diff_24
kodak 772
misc 401
screenshots 2587
textures 185
wallpaper 10773
222 diff_8 + 454 gdiff_16 + 555 diff_24
kodak 722
misc 399
screenshots 2508
textures 178
wallpaper 10307
222 diff_8 + 454 diff_16 + 555 gdiff_24
kodak 768
misc 398
screenshots 2579
textures 183
wallpaper 10616
222 diff_8 + 454 gdiff_16 + 555 gdiff_24
kodak 721
misc 399
screenshots 2504
textures 178
wallpaper 10296
222 diff_8 + 454 gdiff_16 + 474 gdiff_24
kodak 694
misc 404
screenshots 2425
textures 178
wallpaper 10184
222 diff_8 + 373 gdiff_16 + 555 gdiff_24
kodak 699
misc 404
screenshots 2405
textures 188
wallpaper 10352
First let me say that this project is very cool and I'm looking forward to seeing where it's going. I saw the announcement of this project and noticed there was no description of where this library is safe to use and no fuzz tests. I wrote a super basic libfuzzer harness that can trigger ASAN violations pretty quickly with no corpus:
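#define QOI_IMPLEMENTATION
#include "qoi.h"
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    int w, h;
    int channels;
    if (size < 4) {
        return 0;
    }
    /* The first 4 bytes select the channel count; memcpy avoids an unaligned int read. */
    memcpy(&channels, data, sizeof(channels));
    /* Free the decoded pixels so the leak checker only reports real findings. */
    free(qoi_decode((void*)(data + 4), (int)(size - 4), &w, &h, channels));
    return 0;
}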
$ clang -fsanitize=address,fuzzer fuzz.c && ./a.out
IMO since this project is getting a fair amount of attention rather quickly it may be wise to note that this reference implementation is purely experimental at this time and should not be used on untrusted inputs.
I want to apologize.
I may have been too quick with announcing the file format to be finished. I'm frankly overwhelmed with the attention this is getting. With all the implementations already out there, I thought it was a good idea to finalize the specification ASAP. I'm no longer sure if that was the right decision.
QOI is probably good enough the way it is now, but I'm wondering if there are things that could be done better — without sacrificing the simplicity or performance of this format.
One of these things is the fact that QOI_RUN_16 was determined to be pretty useless, and QOI could become even simpler by just removing it. Maybe there are more easy wins with a different hash function or distributing some bits differently? I don't know.
At the risk of annoying everyone: how do you all feel about giving QOI a bit more time to mature?
To be clear, the things I'd be willing to discuss here are fairly limited:
What I'm looking for specifically is:
Should we set a deadline in 2-3 weeks to produce the really-final (pinky promise) specification? Or should we just leave it as it is?
Again, I'm very sorry for the confusing messaging!
Edit: Thanks for your feedback. Let's produce the final spec by 2021.12.20.
Compiler: Apple clang version 13.0.0 (clang-1300.0.29.3)
./qoi.h:382:14: warning: multiple unsequenced modifications to 'p' [-Wunsequenced]
int magic = QOI_READ_32(bytes, p);
^~~~~~~~~~~~~~~~~~~~~
./qoi.h:245:28: note: expanded from macro 'QOI_READ_32'
#define QOI_READ_32(B, P) (QOI_READ_16(B, P) << 16 | QOI_READ_16(B, P))
^~~~~~~~~~~~~~~~~
./qoi.h:244:33: note: expanded from macro 'QOI_READ_16'
#define QOI_READ_16(B, P) (((B[P++] & 0xff) << 8) | (B[P++] & 0xff))
The two P++ side effects inside a single expression are unsequenced, so the order in which P is incremented is non-deterministic / undefined.
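One way to avoid the warning is to do each read in its own statement. This is only a sketch: read_u16_be and read_u32_be are hypothetical helper names, not something qoi.h defines.

```c
#include <stdint.h>

/* Hypothetical replacement for QOI_READ_16: each increment of *p happens in
   its own statement, so the byte order of the reads is well defined. */
static uint32_t read_u16_be(const uint8_t *bytes, int *p) {
    uint32_t hi = bytes[(*p)++];
    uint32_t lo = bytes[(*p)++];
    return (hi << 8) | lo;
}

/* Hypothetical replacement for QOI_READ_32: the high half is read first,
   then the low half, again in separate statements. */
static uint32_t read_u32_be(const uint8_t *bytes, int *p) {
    uint32_t hi = read_u16_be(bytes, p);
    uint32_t lo = read_u16_be(bytes, p);
    return (hi << 16) | lo;
}
```

Reading the magic bytes "qoif" this way gives 0x716f6966 and leaves the cursor at 4.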
I was thinking this might be cool in OpenEXR ~ at first glance, it looks like a 16 bit variant would be straightforward?
Hi, my business partner, who is an ex-Unity/Facebook engineer (and a woman), noticed you have cropped pornography in your test corpus:
https://phoboslab.org/files/qoibench/
The cropped pornography image in question is "misc/lenna.png". It should be deleted. This is very unprofessional.
...to enforce consistency when contributing
Spun out of #28 (comment) where I said:
I would instead advise to make everything little endian. Not just the header fields, but also the bytecodes. Almost all CPUs used today are little-endian, and support unaligned loads.
I just uploaded a proof of concept which showed an approx 1.05x improvement in decode speed. Copy/pasting the commit message:
Change wire format to little-endian
This is a backwards incompatible change to the wire format. For example,
the QOI_DIFF_24 bit-packing encoding changes from:
| MSB 6 5 4 3 2 1 LSB |
+---------------------------------+
| 1 1 1 0 r4 r3 r2 r1 | addr+0
| r0 g4 g3 g2 g1 g0 b4 b3 | addr+1
| b2 b1 b0 a4 a3 a2 a1 a0 | addr+2
to:
| MSB 6 5 4 3 2 1 LSB |
+---------------------------------+
| r3 r2 r1 r0 0 1 1 1 | addr+0
| b1 b0 g4 g3 g2 g1 g0 r4 | addr+1
| a4 a3 a2 a1 a0 b4 b3 b2 | addr+2
----
Decode speed-up on an Intel NUC (Comet Lake, BXNUC10i5FNKPA):
1.07x images/kodak
1.04x images/misc
1.01x images/screenshots
1.06x images/textures
1.05x images/wallpaper
In detail:
images/kodak
decode ms encode ms decode mpps encode mpps size kb
libpng: 8.0 144.2 49.02 2.73 717
stbi: 8.8 84.2 44.43 4.67 979
qoi: 3.6 4.2 109.80 93.41 771
qoile: 3.3 4.2 117.70 93.41 771
images/misc
decode ms encode ms decode mpps encode mpps size kb
libpng: 9.3 89.1 82.18 8.57 335
stbi: 8.2 77.5 93.17 9.85 497
qoi: 2.8 3.1 275.03 245.73 451
qoile: 2.7 3.1 285.92 245.95 451
images/screenshots
decode ms encode ms decode mpps encode mpps size kb
libpng: 45.3 519.5 181.85 15.84 2219
stbi: 35.2 622.5 233.73 13.22 2821
qoi: 24.3 23.9 339.08 344.26 2582
qoile: 23.9 23.9 343.80 344.20 2582
images/textures
decode ms encode ms decode mpps encode mpps size kb
libpng: 2.6 33.1 50.84 3.92 163
stbi: 2.5 19.8 52.26 6.56 232
qoi: 0.9 1.0 149.74 130.18 184
qoile: 0.8 1.0 158.48 130.22 184
images/wallpaper
decode ms encode ms decode mpps encode mpps size kb
libpng: 154.4 2289.3 60.71 4.09 9224
stbi: 190.0 1455.1 49.32 6.44 13299
qoi: 71.2 76.6 131.57 122.35 10647
qoile: 67.9 76.6 138.11 122.37 10647
The key difference between qoi.h and qoile.h is:
391c183
< void *qoi_decode(const void *data, int size, int *out_w, int *out_h, int channels) {
---
> void *qoile_decode(const void *data, int size, int *out_w, int *out_h, int channels) {
427c219
< int b1 = bytes[p++];
---
> uint32_t b = peek_u32le(bytes + p);
429,462c221,257
< if ((b1 & QOIBE_MASK_2) == QOIBE_INDEX) {
< px = index[b1 ^ QOIBE_INDEX];
< }
< else if ((b1 & QOIBE_MASK_3) == QOIBE_RUN_8) {
< run = (b1 & 0x1f);
< }
< else if ((b1 & QOIBE_MASK_3) == QOIBE_RUN_16) {
< int b2 = bytes[p++];
< run = (((b1 & 0x1f) << 8) | (b2)) + 32;
< }
< else if ((b1 & QOIBE_MASK_2) == QOIBE_DIFF_8) {
< px.rgba.r += ((b1 >> 4) & 0x03) - 1;
< px.rgba.g += ((b1 >> 2) & 0x03) - 1;
< px.rgba.b += ( b1 & 0x03) - 1;
< }
< else if ((b1 & QOIBE_MASK_3) == QOIBE_DIFF_16) {
< int b2 = bytes[p++];
< px.rgba.r += (b1 & 0x1f) - 15;
< px.rgba.g += (b2 >> 4) - 7;
< px.rgba.b += (b2 & 0x0f) - 7;
< }
< else if ((b1 & QOIBE_MASK_4) == QOIBE_DIFF_24) {
< int b2 = bytes[p++];
< int b3 = bytes[p++];
< px.rgba.r += (((b1 & 0x0f) << 1) | (b2 >> 7)) - 15;
< px.rgba.g += ((b2 & 0x7c) >> 2) - 15;
< px.rgba.b += (((b2 & 0x03) << 3) | ((b3 & 0xe0) >> 5)) - 15;
< px.rgba.a += (b3 & 0x1f) - 15;
< }
< else if ((b1 & QOIBE_MASK_4) == QOIBE_COLOR) {
< if (b1 & 8) { px.rgba.r = bytes[p++]; }
< if (b1 & 4) { px.rgba.g = bytes[p++]; }
< if (b1 & 2) { px.rgba.b = bytes[p++]; }
< if (b1 & 1) { px.rgba.a = bytes[p++]; }
---
> if ((b & QOILE_MASK_2) == QOILE_INDEX) {
> px = index[(b >> 2) & 63];
> p += 1;
> }
> else if ((b & QOILE_MASK_3) == QOILE_RUN_8) {
> run = ((b >> 3) & 0x1F);
> p += 1;
> }
> else if ((b & QOILE_MASK_3) == QOILE_RUN_16) {
> run = ((b >> 3) & 0x1FFF);
> p += 2;
> }
> else if ((b & QOILE_MASK_2) == QOILE_DIFF_8) {
> px.rgba.r += ((b >> 2) & 0x03) - 1;
> px.rgba.g += ((b >> 4) & 0x03) - 1;
> px.rgba.b += ((b >> 6) & 0x03) - 1;
> p += 1;
> }
> else if ((b & QOILE_MASK_3) == QOILE_DIFF_16) {
> px.rgba.r += ((b >> 3) & 0x1F) - 15;
> px.rgba.g += ((b >> 8) & 0x0F) - 7;
> px.rgba.b += ((b >> 12) & 0x0F) - 7;
> p += 2;
> }
> else if ((b & QOILE_MASK_4) == QOILE_DIFF_24) {
> px.rgba.r += ((b >> 4) & 0x1F) - 15;
> px.rgba.g += ((b >> 9) & 0x1F) - 15;
> px.rgba.b += ((b >> 14) & 0x1F) - 15;
> px.rgba.a += ((b >> 19) & 0x1F) - 15;
> p += 3;
> }
> else if ((b & QOILE_MASK_4) == QOILE_COLOR) {
> p += 1;
> if (b & 0x10) { px.rgba.r = bytes[p++]; }
> if (b & 0x20) { px.rgba.g = bytes[p++]; }
> if (b & 0x40) { px.rgba.b = bytes[p++]; }
> if (b & 0x80) { px.rgba.a = bytes[p++]; }
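The diff calls peek_u32le() without showing it. A portable version could look roughly like the sketch below; this definition and the decode_diff24_le helper are my assumptions, only the shifts and the -15 bias are taken from the QOILE_DIFF_24 branch above. It reads 4 bytes at once, so the caller has to guarantee that at least 4 bytes are readable at that position.

```c
#include <stdint.h>

/* Assumed definition: assemble a little-endian 32-bit value byte by byte.
   On little-endian CPUs compilers typically turn this into a single load. */
static uint32_t peek_u32le(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: extract the four 5-bit deltas of a little-endian
   QOI_DIFF_24 word, using the same shifts and bias as the diff above. */
static void decode_diff24_le(uint32_t b, int *dr, int *dg, int *db, int *da) {
    *dr = (int)((b >> 4)  & 0x1F) - 15;
    *dg = (int)((b >> 9)  & 0x1F) - 15;
    *db = (int)((b >> 14) & 0x1F) - 15;
    *da = (int)((b >> 19) & 0x1F) - 15;
}
```

With this layout all four fields come out of one word with a shift and a mask each, instead of being stitched together across byte boundaries as in the big-endian branch.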
Hi, just noticed your cool project on hackaday.
Wanted to test this awesome tool on my macbook.
Anyway here's a Makefile for macOS big sur (sorry no time to create a pull request...):
# Makefile that works on macos big sur with brew install libpng and then
# just running make
CC = gcc
CFLAGS = -Wall -O2 -arch x86_64 -I.
HEADERS = qoi.h
CONVERTER = qoiconv
BENCHMARK = qoibench
INSTALL_DIR = /usr/local/bin
all: $(CONVERTER) $(BENCHMARK)

$(CONVERTER): qoiconv.c $(HEADERS)
	$(CC) $(CFLAGS) qoiconv.c -o $(CONVERTER)
	strip $(CONVERTER)

$(BENCHMARK): qoibench.c $(HEADERS)
	$(CC) $(CFLAGS) qoibench.c -lpng -o $(BENCHMARK)
	strip $(BENCHMARK)

clean:
	@rm -rf $(CONVERTER) $(BENCHMARK)
Usage: just run make to create the 2 binaries:
make
gcc -Wall -O2 -arch x86_64 -I. qoiconv.c -o qoiconv
strip qoiconv
gcc -Wall -O2 -arch x86_64 -I. qoibench.c -lpng -o qoibench
... some warnings related to libpng stuff ...
6 warnings generated.
strip qoibench
And then run qoiconv or qoibench:
$ ./qoiconv
Usage: qoiconv <infile> <outfile>
Examples:
qoiconv input.png output.qoi
qoiconv input.qoi output.png
Keep up the great work.
Kind regards,
Walter
See https://github.com/xfmoulet/qoi : a pure Go implementation of the qoi file format.
Performance (not tuned) is around half the C version when compiled with gcc -O3
Hi!
First of all, very nice work!
I've noticed a potential issue with qoi_header_t. The entire qoi file format is generally big-endian, but the width, height and size integers in the header are written to the output file as either little- or big-endian, depending on the machine architecture. This unfortunately makes qoi files non-portable across machines with different byte orders.
If this was not intentional, could you consider placing htonl/ntohl and htons/ntohs calls before/after packing/unpacking the mentioned integers in qoi_header_t? I can submit a PR if you don't have the time.
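An alternative to htonl/ntohl that needs no platform headers is to write and read the bytes explicitly. A minimal sketch, with put_u32_be and get_u32_be as made-up helper names (they are not in qoi.h):

```c
#include <stdint.h>

/* Hypothetical helper: store v most-significant byte first, regardless of
   the byte order of the machine running the encoder. */
static void put_u32_be(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Hypothetical helper: the matching read for the decoder side. */
static uint32_t get_u32_be(const uint8_t *in) {
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         |  (uint32_t)in[3];
}
```

Because only shifts are used, the bytes in the file are the same on little- and big-endian hosts.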
This may be a dumb and obvious question but how would you implement this in WebAssembly so you can use it with things like Node Serverless Functions and Edge Workers?
first, thanks for this amazingly simple and efficient idea!
I hope you don't mind if I use something similar in my personal projects - namely as a preprocessing step before deflating directional lightmaps in my engine
I've noticed though that it has a problem with images with lots of variation in the alpha channel
I made a couple of changes in my from-scratch implementation and get ~8% size improvement over QOI on the kodim set
a side note: sometimes the full block with mask (color block in qoi) might encode better than the 24-bit block, but gives only marginal gains
new proposed encoding (also delta ranges are 2's complement, say -16..15)
the encoding differs in two modes of operation: the default one is color mode, and the other is alpha mode
bit prefix:
00 use 6-bit index (same for both modes)
01 use rgb delta -2..1 (same for both modes)
10 - color mode: use 4 bits for blue and 5 bits for red and green delta
- alpha mode: use 4-bit for rgb and 2-bit delta for alpha
110 rle mode + 5-bit rep
the rle mode is special, however. if another rle command follows this one, rep count is merged like this: (rep << 5) + lo 5 bits of cmd. the encoder has to make sure the chained rle commands go in big endian order to decode properly. ultimately rep-1 is encoded
1110 (24-bit code)
- color mode: 6 bits blue, 7 bits red and green delta
- alpha mode: 5 bits for r,g,b and a
1111 full with mask, same for alpha and color mode
exactly the same as qoi: 4 bit channel mask, then individual channel bytes if present
except that if mask is 0, the decoder should flip between modes (color<->alpha) and continue processing (no pixel is output in this case)
currently I simply output the mode switch byte before encoding a pixel which has alpha > 0 and < 255, this helps with images with complicated alpha. a more advanced encoder might use this byte to switch back to color mode at some point, but I don't do this
I'm also using a more sophisticated hashing (xor/shift + some mults on 32-bit pixel value), but the benefit of this is dubious
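To make the chained rle rule above a bit more concrete, here is how I read it, as a sketch. The function name, the 0xE0/0xC0 mask values and the exact byte layout are my assumptions derived from the "110 + 5-bit rep" description, not code from the proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the chained rle decode: consecutive bytes with the 110 prefix
   are merged as rep = (rep << 5) + low 5 bits, most significant chunk first.
   Returns the run length (stored value + 1, since rep-1 is encoded) and
   writes the number of bytes consumed to *consumed. */
static uint32_t decode_chained_rle(const uint8_t *bytes, size_t size, size_t *consumed) {
    uint32_t rep = 0;
    size_t n = 0;
    while (n < size && (bytes[n] & 0xE0) == 0xC0) {
        rep = (rep << 5) + (bytes[n] & 0x1F);
        n++;
    }
    *consumed = n;
    return n ? rep + 1 : 0;
}
```

Two chained commands carrying 1 and 2 therefore decode to (1 << 5) + 2 + 1 = 35 repetitions.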
First of all, this is awesome - great work!
Since encoding is 20x-50x faster, would it make sense to run a 2nd pass, with a different direction, and then keep the smallest of the 2 results?
So, basically run a "horizontal" pass, then a "vertical" pass, and keep the smallest in size.
Most likely should not be the default, but if I was using QOI, I would like to have this option.
(I haven't read the code yet)
C# implementation:
Performance will improve over time.
Benchmarks will be added soon
NuGet package uploaded
Hi!
I really like QOI and decided to use it in my projects.
So to help myself and others I decided to make a Paint.NET file type plugin to be able to load, view, create, convert and save QOI images.
I hope you and/or others will find it useful.
There's no way to tell if the file contains pixels/texels in the sRGB colorspace, or not. It's a single bit, and in some applications this is very valuable.
Spotted while reading through "qoi.h" after finding this on reddit.
int max_size = w * h * (channels + 1) + sizeof(qoi_header_t) + 4;
=>
int max_size = w * h * (channels + 1) + sizeof(qoi_header_t) + QOI_PADDING;
QOI format has 4-Byte padding at the end.
But, in current implementation (81b438c), it seems that we can embed data in the padding area.
For example, consider this:
#!/bin/sh
# | "qoif" | wid | hei | size | QOI_COLOR |
echo '71 6f 69 66 00 01 00 01 00 00 00 05 ff a0 b0 c0 d0' | xxd -revert -plain > foo.qoi
./qoiconv foo.qoi foo.png
This successfully creates a 1x1 image (pixel value is (0xA0, 0xB0, 0xC0, 0xD0)). Is it valid?
Hi - I can add QOI to the Basis Universal repo:
https://github.com/BinomialLLC/basis_universal/
QOI is valuable because it's so fast, and I read/write A LOT of PNG's during development. So many that I optimized lodepng to be faster.
However, the library must be fuzzed with at least zzuf or we can't legally use it. That's required. I can do this as our time permits, but hopefully others will do this before us. Fuzzing is extremely important and required or we cannot use it.
Also, I need a Windows viewer app. It needs the ability to display the alpha channel as grayscale. Know of anything?
I tested a 4096 * 4096 * 32-bit BMP (platform: win10x64):
Libpng encoding takes about 30 ms (BMP2PNG), decoding takes about 30 ms (PNG2BMP).
LibQOI encoding takes about 234 ms (BMP2QOI), decoding takes about 187 ms (QOI2BMP).
Why is this so different from your test? Is my understanding wrong?
test.zip
test source code: https://github.com/dbyoung720/TestQOI.git
The public domain license makes it easier for us to include your header in our open source project. Otherwise we have to get permission from our corporate customers to use it. Making the license/public domain declaration compatible with stb_image.h will make it easier for commercial users to use your library.
Here is a pure Java 8 implementation of QOI:
BufferedImage converter)
Performance is reasonable, but the library is not heavily optimized yet.
When benching files of different size, compression ratio is a better statistic than size. Similarly, it probably makes sense to get rid of encode and decode time, since the rates are more reliable and useful info.
A set of simple .qoi files and corresponding .png files would be helpful for testing independent implementations.
With all that we learned through the analysis and ideas of a lot of people here, I refined QOI quite a bit. More than I thought I would.
The current state is in the experimental branch.
First of all, benchmark results for the new test suite using
qoibench 1 images/ --nopng --onlytotals
## Total for images/textures_photo/
decode ms encode ms decode mpps encode mpps size kb rate
master: 8.2 11.6 127.43 90.52 2522 61.6%
experi: 5.9 8.2 178.14 127.56 1981 48.4%
## Total for images/textures_pk01/
master: 0.7 1.0 186.23 126.11 184 36.4%
experi: 0.6 0.9 214.67 145.87 178 35.2%
## Total for images/screenshot_game/
master: 2.7 3.9 231.42 162.40 534 21.6%
experi: 2.6 3.4 245.06 187.25 519 21.0%
## Total for images/textures_pk/
master: 0.3 0.5 138.31 93.64 83 48.1%
experi: 0.3 0.4 159.63 110.87 75 43.5%
## Total for images/textures_pk02/
master: 2.0 2.8 155.27 110.31 504 42.5%
experi: 1.7 2.3 182.73 133.00 479 40.4%
## Total for images/icon_64/
master: 0.0 0.0 251.06 163.38 4 28.3%
experi: 0.0 0.0 343.60 266.60 5 31.3%
## Total for images/icon_512/
master: 0.6 0.9 474.50 308.36 80 7.8%
experi: 0.6 0.7 474.62 378.76 102 10.1%
## Total for images/photo_kodak/
master: 2.9 4.2 137.76 92.77 771 50.2%
experi: 2.4 3.5 166.17 111.66 671 43.7%
## Total for images/textures_plants/
master: 3.8 6.2 281.64 170.37 951 22.9%
experi: 3.3 5.0 324.00 211.07 922 22.2%
## Total for images/screenshot_web/
master: 18.1 28.2 449.27 287.79 2775 8.7%
experi: 17.5 23.2 464.81 350.15 2649 8.3%
## Total for images/pngimg/
master: 6.5 10.0 279.91 180.57 1415 20.0%
experi: 5.9 8.6 307.44 210.93 1445 20.5%
## Total for images/photo_tecnick/
master: 10.1 15.2 142.74 95.00 2710 48.2%
experi: 8.8 13.6 163.36 105.69 2527 44.9%
## Total for images/photo_wikipedia/
master: 7.8 11.7 138.75 92.50 2260 53.4%
experi: 6.7 10.4 161.91 104.37 2102 49.6%
# Grand total for images/
master: 2.1 3.1 220.85 148.50 485 26.8%
experi: 1.9 2.7 245.67 173.24 465 25.7%
As you can see throughput improved a lot, as did the compression ratio for all files without an alpha channel (icon_*/ and pngimg/ suffered a bit, but the overall compression ratio for these files is already quite high. textures_plants/ still saw improvements). For photos or photo-like images QOI now often beats libpng!
What changed? After I switched the tags for QOI_RUN (previously 2-bit tag) and QOI_GDIFF_16 (previously 4-bit tag) I noticed that QOI_GDIFF covered almost all(!) cases that were previously encoded by QOI_DIFF_16/24. So... why not remove them?
#define QOI_OP_INDEX 0x00 // 00xxxxxx
#define QOI_OP_DIFF 0x40 // 01xxxxxx (aka QOI_DIFF_8)
#define QOI_OP_LUMA 0x80 // 10xxxxxx (aka QOI_GDIFF_16)
#define QOI_OP_RUN 0xc0 // 11xxxxxx
#define QOI_OP_RGB 0xfe // 11111110 (aka QOI_COLOR with RGB)
#define QOI_OP_RGBA 0xff // 11111111 (aka QOI_COLOR with RGBA)
(see the experimental file format documentation for the details)
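A sketch of how a decoder could dispatch on these tags, based only on the bit patterns in the comments above. The name qoi_classify and the mask QOI_MASK_2 are my own; the payload layout behind each tag is not covered here.

```c
#include <stdint.h>

#define QOI_OP_INDEX 0x00 /* 00xxxxxx */
#define QOI_OP_DIFF  0x40 /* 01xxxxxx */
#define QOI_OP_LUMA  0x80 /* 10xxxxxx */
#define QOI_OP_RUN   0xc0 /* 11xxxxxx */
#define QOI_OP_RGB   0xfe /* 11111110 */
#define QOI_OP_RGBA  0xff /* 11111111 */
#define QOI_MASK_2   0xc0 /* assumed name: selects the top two bits */

/* Returns the tag constant for the first byte of a chunk. The two 8-bit
   tags must be tested first because they share the 11 prefix with RUN. */
static int qoi_classify(uint8_t b) {
    if (b == QOI_OP_RGB)  { return QOI_OP_RGB; }
    if (b == QOI_OP_RGBA) { return QOI_OP_RGBA; }
    return b & QOI_MASK_2;
}
```

The order of the checks is the only subtle part: 0xfe and 0xff would otherwise be read as run-length chunks.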
That is, most tags are now 2-bit, while the run-length is limited to 62 and thus leaves some room for the two 8-bit QOI_OP_RGB and QOI_OP_RGBA tags. So QOI would be even simpler than before and (probably?) gain a lot more possibilities for performance improvements:
Yes, it means that a change in the alpha channel will always be encoded as a 5-byte QOI_OP_RGBA, but using the current test suite of images, this seems to be totally fine. The alpha channel is mostly either 255 or 0. The famous dice.png and FLIF's fish.png seem to be awfully "artificial" uses of PNG. (For comparison, in the experimental branch with the original tag-layout and QOI_DIFF_16/24 still present, the overall compression ratio was at 24.6% - but the win in simplicity and performance is imho worth this 1%).
The hash function changed to the following:
#define QOI_COLOR_HASH(C) (C.rgba.r * 3 + C.rgba.g * 5 + C.rgba.b * 7)
This is seriously the best performing hash function I could find and I tried quite a few. This also ignores the alpha channel, making it even more of a second-class citizen.
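For illustration, this is how the hash could be turned into a slot of the 64-entry index. The rgba_t struct and color_index are stand-ins I made up; the "% 64" reduction is my assumption about how the macro result gets used.

```c
#include <stdint.h>

/* Stand-in pixel type for this sketch (qoi.h uses its own union). */
typedef struct { uint8_t r, g, b, a; } rgba_t;

/* Same weights as QOI_COLOR_HASH above; alpha is deliberately ignored.
   The result is reduced to a slot in a 64-entry index (assumed). */
static int color_index(rgba_t c) {
    return (c.r * 3 + c.g * 5 + c.b * 7) % 64;
}
```

Two pixels that differ only in alpha land in the same slot, which is the "second-class citizen" effect mentioned above.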
You may not like it (and I'm truly sorry for all the work that would need to be done in existent implementations), but I strongly believe that this is The Right Thing To Do™.
Thoughts?
Benchmark results on this page must be updated. I think they were calculated before this commit 30f8a39
I find the compression scheme itself kinda interesting, regardless of the header format. Separating it from the file format might encourage people to take this simple but quite OK scheme (pun intended) and use it on its own for their specific use cases.
It's not like anything will change, but I believe being precise with the definition (separating two different things: the compression scheme vs. the file format) is the right idea.
I am always curious about data compression.
I tried your corpus and compared it with zopflipng. What surprised me a lot is that QOI + XZ is smaller than zopflipng. XZ is run with -9 and zopflipng with --prefix -m
tests/images/kodak on master [?] ❯ du -c kodim*.png | tail -1
15072 total
tests/images/kodak on master [?] ❯ du -c zopfli_kodim*.png | tail -1
14424 total
tests/images/kodak on master [?] ❯ du -c kodim*.qoi | tail -1
18568 total
tests/images/kodak on master [?] ❯ du -c kodim*.qoi.xz | tail -1
13760 total
tests/images/kodak on master [?] ❯ du -c zopfli_kodim*.png.xz | tail -1
14424 total
tests/images/screenshots on master [?] ❯ find -name '*.png' ! -name 'zopfli_*' -print0 | xargs -0 du -c | tail -1
33216 total
tests/images/screenshots on master [?] ❯ du -c zopfli_*.png | tail -1
21912 total
tests/images/screenshots on master [?] ❯ du -c * | tail -1
128364 total
tests/images/screenshots on master [?] ❯ du -c *.qoi | tail -1
33600 total
tests/images/screenshots on master [?] ❯ du -c *.qoi.xz | tail -1
18132 total
tests/images/screenshots on master [?] ❯ du -c zopfli*.xz | tail -1
21504 total
Is there any explanation on why it can beat out PNG like this?
I noticed that you use the old way of defining integer variables (unsigned short, int, etc). This makes the code and the file format dependent on the architecture of the machine where it was compiled/generated.
Instead, use the fixed-width integer types that C99 provides, and QOI would work even on 8, 16 and 32 bit machines (a fast & simple image format is useful for retro computers) without any portability issues, especially once the endianness problem is fixed (see #10).
I could make a PR with this kind of change if you like.
After a discussion in #28, the QOI data format changes to accommodate some of the concerns. This will serve as the basis for final specification for QOI.
These changes are not yet reflected in the code of this repository. I'm working on it! The code in qoi.h now implements all these changes.
- QOI_DIFF will shift -1, to be consistent with the range of a two's complement int
- QOI_DIFF will explicitly allow to wrap around. Whether the encoder makes use of this is outside of the spec. The decoder must account for this wrapping.
- size field in the header will be removed
- width and height in the header will be widened to 32bit
- channels field will be added to the header. This is purely informative and will not change the behavior of the en-/decoder
- colorspace bitmap will be added to the header. This is purely informative and will not change the behavior of the en-/decoder.
The header then looks like this:
struct qoi_header_t {
char magic[4]; // magic bytes "qoif"
u32 width; // image width in pixels (BE)
u32 height; // image height in pixels (BE)
u8 channels; // must be 3 (RGB) or 4 (RGBA)
u8 colorspace; // a bitmap 0000rgba where
// - a zero bit indicates sRGBA,
// - a one bit indicates linear (user interpreted)
// colorspace for each channel
};
The ranges for QOI_DIFF change to:
-2..1 instead of the original range -1..2
-8..7 instead of the original range -7..8
-16..15 instead of the original range -15..16
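As a sketch of what the shifted range and the wrap-around mean for a decoder, here is the 2-bit case. I'm assuming the stored value is the difference plus 2; apply_diff2 is a made-up name.

```c
#include <stdint.h>

/* Apply a 2-bit stored difference to the previous channel value.
   Assumption: stored = diff + 2, so 0..3 maps to -2..1.
   The cast back to uint8_t wraps modulo 256, e.g. 0 - 2 becomes 254. */
static uint8_t apply_diff2(uint8_t prev, unsigned stored) {
    int diff = (int)(stored & 0x03) - 2;
    return (uint8_t)(prev + diff);
}
```

An encoder that uses the wrapping could then store a step from 255 to 0 as a difference of +1 instead of falling back to a full color.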
The channels field in the header serves only as a hint to the user on how to handle this image. It is valid for a QOI image to still encode alpha changes in a file with a header that denotes 3 channels. It is not the responsibility of the decoder to mask off alpha values. The color hash will always be computed as r^g^b^a, regardless of the number of channels denoted in the header.
Just a thought, for consistency's sake. IIUC:
- r ^ g ^ b ^ a.
- r ^ g ^ b ^ 255 (solely because of px = px_prev line).
If px = px_prev was replaced with zero initialization of px, color hash would also be a simple xor of all 3 or 4 components (e.g. might make it a tiny bit easier to implement in a generic way in other languages).
It's probably worth mentioning in the spec (or, if there's not a spec, in qoi.h) whether RGBA means premultiplied (associated) alpha or non-premultiplied (straight, unassociated) alpha.
https://en.wikipedia.org/wiki/Alpha_compositing#Straight_versus_premultiplied
If it's "whatever PNG does" then it's non-premultiplied alpha.
First, this project looks super awesome. Very impressed.
Minor suggestion would be to call fopen in qoi_write() prior to invoking encode, in alignment with 'fail fast' philosophy.
Logic: if I wanted to write a batch processor that warns on failure, it would be much faster for cases where files already exist/are unwritable.
The function qoi_read(const char *filename, int *out_w, int *out_h, int channels) needs the number of channels as parameter. This should not be needed.
The QOI file should know with which number of channels it was encoded.
To solve this the header could be improved to contain also the number of channels.
BTW.: The header should also contain some file format version. This allows future improvements.
Saying that I'm surprised by the amount of attention this is getting would be an understatement. There's lots of discussion going on about how the data format and compression could be improved and what features could be added.
I want to give my views here and discuss how to go forward.
First and foremost, I want QOI to be simple. Please keep this in mind. I consider the general compression scheme to be done. There's lots of interesting ideas on how to improve compression. I want to tinker with these ideas - but not for QOI.
QOI will not be versioned. There will only be one version of QOI's data format. I'm hoping we will be able to strictly define what exactly that is in the coming days.
QOI will only support 24bit RGB and 32bit RGBA data. I acknowledge there's some need for fewer or more channels and also for higher bit depths or paletted color - QOI will not serve these needs.
So, with all that said, there's some breaking changes that are probably worthwhile. I want to discuss if and how to implement those.
1) width, height and size in the header should be stored as big endian for consistency with the rest of the format (this change already happened in c03edb2)
2) Color differences (QOI_DIFF_*) should have the same range as two's complement. That means:
-2..1 instead of the current range -1..2
-8..7 instead of the current range -7..8
-16..15 instead of the current range -15..16
So, 1) is already implemented; 2) seems like the right thing to do (any objections?); 3) is imho worth discussing.
3a) Storing the number of channels (3 or 4) in the header would allow a user of this library to omit whether they want RGB or RGBA, and files would be more descriptive of their contents. You would still be able to enforce 3 or 4 channels when loading. This is consistent with what stbi_load does:
int x,y,n;
unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
// ... process data if not NULL ...
// ... x = width, y = height, n = # 8-bit components per pixel ...
// ... replace '0' with '1'..'4' to force that many components per pixel
// ... but 'n' will always be the number that it would have been if you said 0
It is my opinion that the channels header value should be purely informative. Meaning, en-/decoder will do exactly the same, regardless of the number of channels. The extra 5bit for alpha in QOI_DIFF_24 will still be wasted for RGB files.
3b) I don't understand enough about the colorspace issue to gauge the significance. If we implement this however, I would suggest to give this a full byte in the header, where 0 = sRGB and any non-zero value is another, user-defined(?) colorspace.
3c) I'm against an option for premultiplied alpha, because it puts more burden on any QOI implementation to decode in the right pixel format. We should just specify that QOI images have un-premultiplied alpha.
3d) For simplicity's sake I'd like to put 3a) and 3b) as one byte each into the header. I'm uncertain if we then should "pad" the u32 size in the header with two more bytes. This would make the size 4byte aligned again, but there's probably no need for it!? A u16 unused could also cause more confusion when other QOI libraries suddenly specify any of these bits to mean something.
With all this, the header would then be the following 16 bytes:
struct qoi_header_t {
char magic[4]; // magic bytes "qoif"
u16 width; // image width in pixels (BE)
u16 height; // image height in pixels (BE)
u8 channels; // must be 3 (RGB) or 4 (RGBA)
u8 colorspace; // 0 = sRGB (other values currently undefined)
u16 unused; // free for own use
u32 size; // number of data bytes following this header (BE)
};
The one issue I have with this is how to give these extra header values to the user of this library. qoi_read("file.qoi", &w, &h, &channels_in_file, &colorspace, want_channels) looks like an ugly API. So maybe that would rather be implemented as qoi_read_ex() and qoi_read() stays as it is. I'm still not sure if I want that extended header...
What's the opinion of the other library authors?
Since QOI is a C library, it could be a Python extension and used as a plugin for Pillow.
Hey, thanks for the work you're putting in to this.
I've written a Rust implementation of your image format (https://github.com/steven-joruk/qoi) and added some optimizations you might want to take as well.
The biggest gain is by factoring out writing the QOI_RUN command which lets you get rid of a bunch of redundant comparisons and a couple of branches: steven-joruk/qoi@3f3ee0a
You can reduce some more branches when writing QOI_COLOUR: https://github.com/steven-joruk/qoi/blob/3f3ee0ae7ecbb62a4b293f932d28580099989159/src/encode.rs#L158
And I'm unsure if this has a real effect, but you only need to store the previous colour when it's changed (move the assignment into the px_prev != px block).
When I hacked those in to my local qoi.h I saw improvements of around 16% for dice.png, I haven't measured other files. The rust benchmark encodes dice.qoi (from raw) in around 2.3ms compared to qoibench's 3.7ms (3.4ms with the above changes), I haven't compared the assembly or profiles to see what else may be going on.
This issue was factored out of PR #6 (comment)
Generally, when passing around a pointer-length pair for a block of memory, the length should be size_t.
The qoi_decode function still takes (const void* data, int size, etc).
Note that ftell and fread return long and size_t, not int.
The qoi_read function can also still overflow here.
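To illustrate the kind of check meant here, below is a sketch of an overflow-aware buffer size computation in size_t, following the w * h * (channels + 1) + header + padding formula used by the encoder. qoi_max_size is a hypothetical helper, and returning 0 on overflow is just one possible convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: worst-case encoded size in bytes, or 0 if the
   computation would overflow size_t. */
static size_t qoi_max_size(size_t w, size_t h, size_t channels,
                           size_t header_size, size_t padding) {
    size_t px_bytes = channels + 1;
    size_t extra = header_size + padding;
    size_t pixels;

    /* w * h must not overflow. */
    if (w != 0 && h > SIZE_MAX / w) {
        return 0;
    }
    pixels = w * h;

    /* pixels * px_bytes + extra must not overflow either. */
    if (px_bytes != 0 && pixels > (SIZE_MAX - extra) / px_bytes) {
        return 0;
    }
    return pixels * px_bytes + extra;
}
```

A caller would reject the image when this returns 0 instead of passing a wrapped value to malloc.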
Speaking of Windows (#24)... IIUC the default Windows color order is BGRA (not RGBA) and likewise for Linux (X11) and I think macOS / iOS too. Can't remember what Android is.
Anyway, if we're talking of finalizing the file format (#28), consider QOI producing BGRA, not RGBA. Especially as QOI is about being fast to decode, this would avoid what libpng calls the PNG_TRANSFORM_BGR step.
Transforming BGRA <-> RGBA is cheap, especially with SIMD, but it's not free.
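For reference, the scalar version of that transform is just a swap of the first and third byte of every pixel. A plain sketch (swap_red_blue is a made-up name; a SIMD version would use a byte shuffle instead):

```c
#include <stddef.h>
#include <stdint.h>

/* Convert RGBA <-> BGRA in place by swapping bytes 0 and 2 of each
   4-byte pixel. The same function works in both directions. */
static void swap_red_blue(uint8_t *pixels, size_t pixel_count) {
    size_t i;
    for (i = 0; i < pixel_count; i++) {
        uint8_t *px = pixels + i * 4;
        uint8_t tmp = px[0];
        px[0] = px[2];
        px[2] = tmp;
    }
}
```

This is one extra pass over the whole image after decoding, which is the cost that storing BGRA directly would avoid.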
I tried playing with qoiconv on a few images I had around (since I had a hard time believing something so simple could get so close to png). So far, it fails to decode some png inputs at all, so it can't convert them to qoi, while others it can read and convert, but when you convert back to png the file size is almost double what it originally was. Running ImageMagick convert on the png generated by qoiconv results in about a 40% reduction in file size.
I think the benchmark needs to switch to libpng or something else if it is to be a serious comparison against png compression, although the compression ratio is surprisingly good for something so simple.
For example:
https://ae27ff.meme.tips/res/klmmlyby.png original is 212153 bytes
qoi is 289133 bytes
qoiconv back to png is 354424 bytes
output from ImageMagick convert on the file generated by qoiconv is 220817 bytes
So whatever stb_image does is not good at all, and that makes me question if the encode/decode times are even valid either versus libpng which seems to be just about the standard for png.
The header is currently 12 bytes:
struct qoi_header_t {
    char magic[4];         // magic bytes "qoif"
    unsigned short width;  // image width in pixels
    unsigned short height; // image height in pixels
    unsigned int size;     // number of data bytes following this header
};
Endianness is already discussed in #10.
If the file format isn't set in stone yet, consider:
You might find some inspiration in NIE's 16 byte header:
https://github.com/google/wuffs/blob/main/doc/spec/nie-spec.md
There's also, as mentioned in the Hacker News discussion, the idea of re-using the IFF / RIFF container format.
Can people use this for anything?
The idea is to add two more fields to the header: chunk width and chunk height (u8?), which define blocks of data that can be processed independently of the others (each has its own "64 last known pixels"). The number of chunks can easily be determined by dividing the image's dimensions.
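As a sketch of that division, assuming the two new fields give the chunk size in pixels and that partial chunks at the right and bottom edges still count as chunks (both are assumptions, the proposal does not spell this out). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Chunks along one axis, rounded up so that a partial chunk at the
 * edge of the image is counted as well. */
static uint32_t chunks_along(uint16_t image_dim, uint8_t chunk_dim) {
    assert(chunk_dim > 0);
    return ((uint32_t)image_dim + chunk_dim - 1u) / chunk_dim;
}

/* Total number of independently decodable chunks in the image. */
static uint32_t chunk_count(uint16_t width, uint16_t height,
                            uint8_t chunk_w, uint8_t chunk_h) {
    return chunks_along(width, chunk_w) * chunks_along(height, chunk_h);
}
```

With 16-bit image dimensions, as in the current header, the product always fits in 32 bits.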
There are several advantages to this:
The drawback is the added complexity, but I think the sacrifice would be worth it. I'm waiting for your feedback before I start implementing and benchmarking this.
Also, each chunk could be stored contiguously for a better data (and not pixel) locality. Maybe better, maybe too complex for this, dunno...
What do you think?
I have now assembled a pretty comprehensive suite of test images. These all come with the proper license information (CC or public domain). It includes:
Here's the full set:
https://phoboslab.org/files/qoibench/qoi_benchmark_suite.tar (1.1 GB) — very proudly excluding lenna.jpg
All images in this test suite are PNGs. I will add QOI images once the specification has been finalized (related #20).
To make it a bit easier to test tweaks for qoi, qoibench.c (in the experimental branch) can now descend into subdirectories, prints a grand total and has gained various options:
Usage: qoibench <iterations> <directory> [options]
Options:
--nowarmup ... don't perform a warmup run
--nopng ...... don't run png encode/decode
--noverify ... don't verify qoi roundtrip
--noencode ... don't run encoders
--nodecode ... don't run decoders
--norecurse .. don't descend into directories
Examples
qoibench 10 images/textures/
qoibench 1 images/textures/ --nopng --nowarmup
E.g. if you just want to check the overall compression ratio for qoi as fast as possible:
./qoibench 1 images/ --nowarmup --nopng --noverify --nodecode
Hi!
It would be great to additionally support stream reading and writing functions like in libpng (png_set_read_fn) or libtiff (TIFFClientOpen).
Ideally, this feature would have just a single user-supplied reading (or writing) function like png_set_read_fn().
This feature is needed for image reading libraries like FreeImage or SAIL to simplify QOI integration.
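One possible shape for such an interface, purely as an illustration: qoi.h does not provide any of this, and the names qoi_read_fn, mem_reader, mem_read and stream_has_qoi_magic are hypothetical. The sketch only reads the 4 magic bytes through the callback to show how a caller-supplied source would plug in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: copy up to `size` bytes into `dst` and
 * return the number of bytes actually read. */
typedef size_t (*qoi_read_fn)(void *user, void *dst, size_t size);

/* Example source: a memory buffer with a read cursor. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} mem_reader;

static size_t mem_read(void *user, void *dst, size_t size) {
    mem_reader *r = user;
    size_t left = r->size - r->pos;
    if (size > left) size = left; /* short read at end of data */
    memcpy(dst, r->data + r->pos, size);
    r->pos += size;
    return size;
}

/* Reads 4 bytes through the callback; returns 1 if they are "qoif". */
static int stream_has_qoi_magic(qoi_read_fn read, void *user) {
    unsigned char magic[4];
    assert(read != NULL);
    if (read(user, magic, sizeof magic) != sizeof magic) return 0;
    return memcmp(magic, "qoif", 4) == 0;
}
```

A library such as FreeImage or SAIL would pass its own I/O handle as `user` and wrap its read call in a function with this signature.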
probably close this as soon as you see it, as it isn't a real issue
quoting source:
// -----------------------------------------------------------------------------
// libpng encode/decode wrappers
// Seriously, who thought this was a good abstraction for an API to read/write
// images?
I haven't laughed this hard reading someone else's code since I learned that the candela is a poor unit (no direct links for line numbers, so please search for "I think the candela is a scam.")
Your code is fun to read.
I wrote a Zig implementation of QOI and made it a clean-room implementation, thus testing the specification in qoi.h.
Some problems I noticed in both the implementation and the description: while the QOI format assumes big-endian byte order for a lot of things, the implementation is only suitable to run on little-endian machines.
Also, the bit order is unclear for cross-byte fields: dr, for example, crosses the first byte boundary, and it is unspecified how the bits are ordered here. To me, it was unclear whether the 128-bit or the 1-bit of the second byte provides the additional bit. It's also unclear which bit that is in the final u5 value.
A good alternative that makes this unmistakably clear would be something like this:
|                                 QOI_DIFF_24                                 |
|        Byte + 0         |        Byte + 1         |        Byte + 2         |
|  7  6  5  4  3  2  1  0 |  7  6  5  4  3  2  1  0 |  7  6  5  4  3  2  1  0 |
|-------------------------|-------------------------|-------------------------|
|  1  1  1  0 r4 r3 r2 r1 | r0 g4 g3 g2 g1 g0 b4 b3 | b2 b1 b0 a4 a3 a2 a1 a0 |
With:
r4...r0 forming the red channel difference between -15..16
g4...g0 forming the green channel difference between -15..16
b4...b0 forming the blue channel difference between -15..16
a4...a0 forming the alpha channel difference between -15..16
I'm happy to create a PR for this change
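To check that the proposed layout is unambiguous, here is a sketch that packs and unpacks three bytes exactly as the table lays them out. The function names are hypothetical, and storing each difference as (difference + 15) is an assumption made to map -15..16 onto the 5-bit range 0..31.

```c
#include <assert.h>
#include <stdint.h>

#define DIFF_24_TAG 0xE0u /* 1 1 1 0 in bits 7..4 of byte 0, per the table */
#define DIFF_BIAS   15    /* assumption: stored value = difference + 15 */

static void pack_diff_24(uint8_t out[3], int dr, int dg, int db, int da) {
    assert(dr >= -15 && dr <= 16 && dg >= -15 && dg <= 16);
    assert(db >= -15 && db <= 16 && da >= -15 && da <= 16);
    uint32_t r = (uint32_t)(dr + DIFF_BIAS);
    uint32_t g = (uint32_t)(dg + DIFF_BIAS);
    uint32_t b = (uint32_t)(db + DIFF_BIAS);
    uint32_t a = (uint32_t)(da + DIFF_BIAS);
    out[0] = (uint8_t)(DIFF_24_TAG | (r >> 1));               /* tag, r4..r1 */
    out[1] = (uint8_t)(((r & 1u) << 7) | (g << 2) | (b >> 3)); /* r0, g4..g0, b4..b3 */
    out[2] = (uint8_t)(((b & 7u) << 5) | a);                  /* b2..b0, a4..a0 */
}

static void unpack_diff_24(const uint8_t in[3],
                           int *dr, int *dg, int *db, int *da) {
    *dr = (int)(((in[0] & 0x0Fu) << 1) | (in[1] >> 7)) - DIFF_BIAS;
    *dg = (int)((in[1] >> 2) & 0x1Fu) - DIFF_BIAS;
    *db = (int)(((in[1] & 0x03u) << 3) | (in[2] >> 5)) - DIFF_BIAS;
    *da = (int)(in[2] & 0x1Fu) - DIFF_BIAS;
}
```

With this reading, r0 lands in the 128-bit (bit 7) of the second byte and is the least significant bit of the 5-bit red value.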