gametechdev / ispctexturecompressor

ISPC Texture Compressor

License: MIT License

C++ 65.39% C 3.20% HLSL 3.25% Batchfile 0.05% Assembly 28.11%

tool ispc

ispctexturecompressor's Introduction

Fast ISPC Texture Compressor

This repository contains a texture compression library for the following formats:

BC6H (FP16 HDR input)
BC7
ASTC (LDR, block sizes up to 8x8)
ETC1
BC1, BC3 (aka DXT1, DXT5) and BC4, BC5 (aka ATI1N, ATI2N)

The library uses the ISPC compiler to generate CPU SIMD-optimized compression algorithms. For more information, see the Fast ISPC Texture Compressor article on Intel Developer Zone.

License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributing

Please see CONTRIBUTING for information on how to request features, report issues, or contribute code changes.

Build Instructions

Binaries for ISPC v1.9.2 need to be obtained separately (e.g., from the ISPC repo or the SourceForge mirror). Download the appropriate compiler for your target, and place the binary in the matching directory:

ISPC/linux/
ISPC/osx/
ISPC/win/

Source for the ISPC Texture Compressor library is under ispc_texcomp/.

Source for a sample that demonstrates the tradeoffs between the supported compression variants is under ISPC Texture Compressor/.

Windows:

The build projects use Visual Studio 2017, Windows Tools 1.4.1, and the Windows 10 April 2018 Update SDK (17134).
Use ispc_texcomp\ispc_texcomp.vcxproj to build the ISPC Texture Compressor library
Use ISPC Texture Compressor\ISPC Texture Compressor.sln to build and run the sample

Mac OS X:

The build has been tested with Xcode 7.3 with minimum OS X deployment version set to 10.9
Use ispc_texcomp.xcodeproj to build the ISPC Texture Compressor library
dylib install name is set to @executable_path/../Frameworks/$(EXECUTABLE_PATH)
The sample application is not available on OS X.

Linux:

Use make -f Makefile.linux to build the ISPC Texture Compressor library
The sample application is not available on Linux.

ispctexturecompressor's Issues

"Error: Unsupported value for --arch" when building on Fedora

On Fedora 38, attempting to build the project results in the following error:

ispc -O2 --arch= --target= --opt=fast-math --pic -o ispc_texcomp/kernel_astc_ispc.o -h ispc_texcomp/kernel_astc_ispc.h ispc_texcomp/kernel_astc.ispc
Error: Unsupported value for --arch, supported values are: x86, x86-64, arm, aarch64 
make: *** [Makefile.linux:53: ispc_texcomp/kernel_astc_ispc.o] Error 255

This seems to be caused by the invocation of uname -p, which outputs "unknown" on my x64 machine, leaving --arch and --target empty. Running uname -m outputs "x86_64" as expected, and modifying the makefile to use this flag instead allows the build to succeed. Both flags seem to work on Ubuntu 22.04 LTS. According to the man pages, the -p flag is "non-portable."
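The mapping from a machine name to an --arch value can be sketched in C++ as well. This is a minimal illustration, not the project's build logic: the helper names ispc_arch_from_machine and detect_ispc_arch are hypothetical, and the accepted --arch values are taken only from the error message quoted above. It relies on the POSIX uname() call, whose machine field carries the same string that uname -m prints.

```cpp
#include <cassert>
#include <string>
#include <sys/utsname.h>  // POSIX uname(); utsname::machine matches `uname -m`

// Hypothetical helper: map a `uname -m` string to one of the --arch values
// listed in the ISPC error message (x86, x86-64, arm, aarch64).
// Returns an empty string for machine names this sketch does not recognize.
std::string ispc_arch_from_machine(const std::string& machine) {
    if (machine == "x86_64" || machine == "amd64") return "x86-64";
    if (machine == "aarch64" || machine == "arm64") return "aarch64";
    // i386, i486, i586, i686
    if (machine.size() == 4 && machine[0] == 'i' && machine.compare(2, 2, "86") == 0)
        return "x86";
    if (machine.compare(0, 3, "arm") == 0) return "arm";  // e.g. armv7l
    return "";
}

// Hypothetical helper: ask the running kernel for its machine name.
std::string detect_ispc_arch() {
    struct utsname info{};
    const int rc = uname(&info);
    assert(rc == 0);  // uname() only fails on an invalid buffer
    (void)rc;
    return ispc_arch_from_machine(info.machine);
}
```

An unrecognized name such as "unknown" yields an empty string, so a caller can fail early with a clear message instead of passing an empty --arch to ispc.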

ispc file support

Can you provide the ispc binary files for different system architectures or related links? For example, the ispc binary file for aarch64.

ETC2 support?

The README shows that ISPCTextureCompressor only supports ETC1. How about ETC2? Is it supported?

Possible bug with the bc6h encoder

I'm having some blue/purple pixels with bc6h (mode 9 I think). Here is a picture

https://postimg.org/image/xpcpy3hcz/

replace #include "memory.h" with #include <memory.h>

Can you replace:

https://github.com/GameTechDev/ISPCTextureCompressor/blob/master/ISPC%20Texture%20Compressor/ispc_texcomp/ispc_texcomp.cpp#L9
#include "memory.h"
with
#include <memory.h>

because that's the proper way to include system headers (with <> instead of "")?

I have my own "memory.h" header file in my project (one I created myself, containing my own memory management functions). When I try to compile the project on Mac, the #include directive picks up my header file instead of the system one, which breaks the compilation.
After I change it to #include <memory.h>, the problem goes away.

Wrong bytes/block values in ispc_texcomp.h comment

The notes in ispc_texcomp.h list the number of bytes per block necessary for BC1 as 4 and for BC3 as 8. It looks like these should actually be 8 and 16.

https://github.com/GameTechDev/ISPCTextureCompressor/blob/master/ISPC%20Texture%20Compressor/ispc_texcomp/ispc_texcomp.h#L103

https://docs.microsoft.com/en-us/windows/desktop/direct3d10/d3d10-graphics-programming-guide-resources-block-compression
https://docs.microsoft.com/en-us/windows/desktop/direct3d11/texture-block-compression-in-direct3d-11
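For reference, the block sizes can be turned into an output-buffer size calculation. The sketch below is not part of ispc_texcomp.h: the BlockFormat enum and both function names are hypothetical. The byte counts follow the format definitions (8 bytes per 4x4 block for BC1, BC4 and ETC1; 16 bytes per block for BC3, BC5, BC6H, BC7 and ASTC at any block footprint). Rounding partial blocks up is a choice made here; the library itself may require dimensions that are multiples of the block size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum covering the formats listed in the README.
enum class BlockFormat { BC1, BC3, BC4, BC5, BC6H, BC7, ETC1, ASTC };

// Bytes occupied by one compressed block.
inline std::size_t bytes_per_block(BlockFormat f) {
    switch (f) {
        case BlockFormat::BC1:
        case BlockFormat::BC4:
        case BlockFormat::ETC1:
            return 8;
        default:  // BC3, BC5, BC6H, BC7, ASTC
            return 16;
    }
}

// Size in bytes of the compressed image. block_w/block_h only differ
// from 4x4 for ASTC (for example 8x8).
inline std::size_t compressed_size(BlockFormat f, int width, int height,
                                   int block_w = 4, int block_h = 4) {
    assert(width > 0 && height > 0 && block_w > 0 && block_h > 0);
    const std::size_t blocks_x = (static_cast<std::size_t>(width) + block_w - 1) / block_w;
    const std::size_t blocks_y = (static_cast<std::size_t>(height) + block_h - 1) / block_h;
    return blocks_x * blocks_y * bytes_per_block(f);
}
```

With these numbers a 1024x1024 BC1 image needs 512 KiB and a BC3 image 1 MiB, twice what the 4 and 8 bytes/block values in the comment would suggest.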

Nondeterministic results in ASTC w/ varying partitions

Running CompressBlocksASTC for a single 1024x1024 image under ASTC8x8 produces different results than combining 8 iterations of CompressBlocksASTC for 1024x128 portions of the same image:

Attempt 1:
result = CompressBlocksASTC for 1024x1024 image (ASTC8x8)

Attempt 2:
result0 = CompressBlocksASTC for 1024x128 image (offset = 0) (ASTC8x8)
result1 = CompressBlocksASTC for 1024x128 image (offset = 128) (ASTC8x8)
result2 = CompressBlocksASTC for 1024x128 image (offset = 256) (ASTC8x8)
result3 = CompressBlocksASTC for 1024x128 image (offset = 384) (ASTC8x8)
result4 = CompressBlocksASTC for 1024x128 image (offset = 512) (ASTC8x8)
result5 = CompressBlocksASTC for 1024x128 image (offset = 640) (ASTC8x8)
result6 = CompressBlocksASTC for 1024x128 image (offset = 768) (ASTC8x8)
result7 = CompressBlocksASTC for 1024x128 image (offset = 896) (ASTC8x8)
result = { result0, result1, result2, .... }
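To make the second attempt concrete, here is a sketch of how the eight strips and their output offsets could be set up. It does not call the library. Surface is a hypothetical stand-in for rgba_surface (pointer, size in pixels, row stride in bytes), the helper names are invented for illustration, and the output is assumed to be stored in row-major block order at 16 bytes per ASTC block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's rgba_surface; check
// ispc_texcomp.h for the real definition before relying on this layout.
struct Surface {
    std::uint8_t* ptr;
    std::int32_t width;   // pixels
    std::int32_t height;  // pixels
    std::int32_t stride;  // bytes per row
};

// View of rows [y_offset, y_offset + strip_height) that shares the pixels of `full`.
inline Surface strip_view(const Surface& full, std::int32_t y_offset,
                          std::int32_t strip_height) {
    assert(y_offset >= 0 && strip_height > 0 &&
           y_offset + strip_height <= full.height);
    return Surface{full.ptr + static_cast<std::size_t>(y_offset) * full.stride,
                   full.width, strip_height, full.stride};
}

// Byte offset in the ASTC output where the strip starting at y_offset
// begins (assumption: row-major block order, 16 bytes per block).
inline std::size_t astc_output_offset(std::int32_t width, std::int32_t y_offset,
                                      std::int32_t block_w, std::int32_t block_h) {
    assert(width % block_w == 0 && y_offset % block_h == 0);
    return static_cast<std::size_t>(y_offset / block_h) * (width / block_w) * 16;
}
```

For a 1024x128 strip under ASTC8x8 each result occupies 32768 bytes, so result1 would start at byte 32768 and result7 at byte 229376 of the combined output.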

When float16 0x8400 is fed to BC6H compressor, glitches occur

When we feed a float16 DDS file to the tool, and the DDS file contains values 0x8400, these decode as hypersaturated blue and magenta, rather than black.

I suspect that the compressor is not clamping its inputs to >=0, or if it does clamp, it's unable to deal with this particular value.
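As a possible caller-side workaround (not a fix inside the compressor), the input can be clamped before compression. The value 0x8400 has the sign bit set with exponent 1 and a zero mantissa, i.e. -2^-14, a tiny negative number. The sketch below works directly on IEEE 754 binary16 bit patterns; the function names are hypothetical, and mapping NaN to zero is a choice made here rather than documented library behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Clamp one half-float bit pattern to the non-negative range.
inline std::uint16_t clamp_half_nonnegative(std::uint16_t h) {
    const bool negative = (h & 0x8000u) != 0;  // sign bit
    const bool is_nan = (h & 0x7C00u) == 0x7C00u && (h & 0x03FFu) != 0;
    if (is_nan) return 0;     // sketch choice: treat NaN as black
    return negative ? 0 : h;  // negative values, including -0 and -inf, become +0
}

// Clamp a whole buffer of half-float components in place.
inline void clamp_half_buffer(std::uint16_t* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        data[i] = clamp_half_nonnegative(data[i]);
}
```

Running clamp_half_buffer over the FP16 pixel data before handing it to the BC6H compressor would turn 0x8400 into 0x0000, which decodes as black.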

How to specify a fixed mode for BC7 encoding?

Hi, I'm sorry to post this question here. It is a question rather than an issue.

I'm trying to fix the output mode of BC7 encoding. However, the documentation about the BC7_enc_setting is not sufficient for me to find the answer.

Let's say, if I want BC7 encoder to always output mode1 compressed block, is there a way to do that?

Thanks a lot!

Looks like BC7 doesn't encode alpha properly.

Using GetProfile_alpha_basic with CompressBlocksBC7, the alpha in the destination is corrupted.

Mac Static Library

The provided xcode project is configured to generate a dylib file, however I'm interested in a static library.

I've tried changing the Mach-O type to "Static Library", but that didn't help: a dylib file is still generated.
Since the file size differed from before, I thought that maybe only the extension was incorrect, so I tried renaming it to a *.a static library. However, after linking to it, I get undefined references.

How to compile for "Static Library"?

Edit:

I see that the "Other Linker Flags" setting is set to "build/*.o", whereas in my case the files are generated in "Build/Intermediates/*.o". Changing Other Linker Flags to the correct directory didn't help: there were still undefined references, because those files weren't being linked.
In the end I fixed the problem by manually dragging and dropping the "Build/Intermediates/*.o" files into the Xcode project, as if they were C++ files (so that they are listed in the project).
After that the static library was generated in full, with all *.o files and functions linked correctly, and it works fine.

Feature Request: BC6H Signed Float Support (DXGI_FORMAT_BC6H_SF16)

It would be really interesting to have support for signed half data, e.g. to compress SDFs or vector fields (2D or 3D) in BC6H, which is a huge memory saver, especially for Texture3D.

https://msdn.microsoft.com/en-us/library/windows/desktop/hh308952(v=vs.85).aspx

ISPC 1.14 issues and suggested fixes/workarounds

At Unity we're finding that just updating the underlying ISPC compiler to 1.14 version gives a small compression speed increase (3-5% for BC7 & BC6H). That's cool! However the source code needs some fixes:

Integer type defs

ISPC now defines sized integer types, so ispc_texcomp/kernel.ispc needs removal of:

typedef unsigned int8 uint8;
typedef unsigned int32 uint32;
typedef unsigned int64 uint64;

(a similar change is done in #27)

ASTC dual plane bool

Not sure if due to new ISPC, or due to more recent C++ compiler, but the ASTC compressor dual plane flag was producing wrong results. Looks like ISPC produces 0 and 255 values for "bool", but some Clang optimizations assume it will only contain 0 and 1 values. Then the C++ code that does int D = !!block->dual_plane; and expects to produce 0 or 1 ends up producing 0 or 255 too, which when going into the ASTC block bitfields leads to much hilarity. Changing bool dual_plane; to uint8_t dual_plane; in ispc_texcomp_astc.cpp; and uniform bool dual_plane; to uniform uint8_t dual_plane; in kernel_astc.ispc fixes the issue.
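The effect can be illustrated in plain C++ with a byte-sized flag. This is only a sketch of the normalization idea: insert_flag and its bit position are hypothetical and do not reproduce the actual ASTC block packing in ispc_texcomp_astc.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Reduce any non-zero byte (such as the 255 described above) to exactly 1.
// On a uint8_t this comparison is well defined for every value, whereas a
// bool that holds 255 is outside what the C++ compiler is allowed to assume.
inline int flag_bit(std::uint8_t raw) {
    return raw != 0 ? 1 : 0;
}

// Hypothetical packing step: OR a one-bit flag into a bitfield word.
inline std::uint32_t insert_flag(std::uint32_t word, int bit_pos, std::uint8_t raw) {
    assert(bit_pos >= 0 && bit_pos < 32);
    return word | (static_cast<std::uint32_t>(flag_bit(raw)) << bit_pos);
}
```

If the raw value 255 were shifted into the word without normalization, it would set eight adjacent bits instead of one and spill into the neighbouring fields, which is consistent with the corruption described above.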

ASTC solid color blocks

Solid color blocks in ASTC formats are encoded wrongly. In kernel_astc.ispc, ls_refine_scale() function for solid color blocks, sum_w and sum_ww can be zeroes, which makes sgesv2 return NaNs in xx[0] and xx[1], resulting in NaN scale too. Changing this:

if (scale > 0.9999) scale = 0.9999;
if (scale < 0) scale = 0;

to use clamp function fixes the issue:

scale = clamp(scale, 0.0f, 0.9999f); // note: clamp also takes care of possible NaNs
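The reason the original two if statements let a NaN through is that every ordered comparison with NaN is false. The C++ sketch below shows this next to a NaN-safe variant. It is an illustration only: the function names are hypothetical, and mapping NaN to the lower bound is a choice made here, which is not necessarily the bound that the ISPC clamp yields.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the original if-chain: both comparisons are false for NaN,
// so a NaN scale is returned unchanged.
inline float clamp_if_chain(float scale) {
    if (scale > 0.9999f) scale = 0.9999f;
    if (scale < 0.0f) scale = 0.0f;
    return scale;
}

// NaN-safe clamp: the negated comparison is true both for v < lo and
// for NaN, so NaN is mapped to lo (sketch choice).
inline float clamp_nan_safe(float v, float lo, float hi) {
    assert(lo <= hi);
    if (!(v >= lo)) return lo;
    if (v > hi) return hi;
    return v;
}
```

Note that compiling with fast-math style options may let the compiler assume NaNs never occur, in which case checks like these can be optimized away.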

BC6H float types, input & output

Hi -- thanks for the great project!

Can you confirm the inputs and outputs for the BC6H compressor? My investigations seem to suggest it takes signed FP16 inputs (ie 10 bit mantissa), and generates a BC6H_UF16 output (ie, unsigned outputs). Is this correct?

I'm currently using the library in a pipeline with 32 bit float source data. So I could potentially build "unsigned" float 16 inputs (ie, 11 bit mantissa as per the Microsoft docs for BC6). Would there be any benefit to a version that took this kind of float as input, do you think?

Please, add BC4 encoding

BC7 encoder could convert completely transparent block (alpha=0) to non-transparent

The problem lies in the 'ep_quant0367'
https://github.com/GameTechDev/ISPCTextureCompressor/blob/master/ISPC%20Texture%20Compressor/ispc_texcomp/kernel.ispc#L973

If the pixels are fully transparent (alpha=0) but the RGB error is smaller in "b=1" mode, then alpha gets converted to 4 (on a 0..255 scale).
This kind of alpha is fairly visible when drawing 2D images on the screen, so you can see noticeable artifacts in places that should be completely transparent.

A simple workaround is to always force "b=0" mode when the alpha is zero:

replace code:

		for (uniform int p=0; p<4; p++)
			qep[i*4+p] = (err0<err1) ? qep_b[0+p] : qep_b[4+p];

with:

		// ESENTHEL CHANGED: BC7 can encode end points in 2 quantized modes:
		// #1 standard, #2 add "0.5*levels" to all channels (1 extra bit of precision).
		// #2 affects all channels at the same time, so if alpha=0 but the RGB channels
		// have smaller error with the extra 0.5 value, alpha would get the +0.5 too,
		// which could destroy complete transparency. Always force #1 when alpha=0.
		if (channels==4 && ep[i*4+3]<=0.5f) err0 = -1;

		for (uniform int p=0; p<4; p++)
			qep[i*4+p] = (err0<err1) ? qep_b[0+p] : qep_b[4+p];

You could probably optimize it further with respect to dynamic branching.
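The value 4 follows from how BC7 reconstructs endpoints that carry a p-bit: the p-bit is appended as the least significant bit, and the result is expanded to 8 bits by replicating its top bits. The sketch below reproduces that arithmetic for illustration. The function name is hypothetical and the code is independent of kernel.ispc. In mode 7 each channel stores 5 bits plus the p-bit, so a stored alpha of 0 with p=1 decodes to 4.

```cpp
#include <cassert>
#include <cstdint>

// Reconstruct an 8-bit endpoint component from `bits` stored bits (q)
// plus one p-bit appended as the least significant bit.
inline std::uint8_t expand_endpoint(unsigned q, unsigned pbit, unsigned bits) {
    assert(bits >= 4 && bits <= 7 && pbit <= 1 && q < (1u << bits));
    const unsigned total = bits + 1;       // precision including the p-bit
    const unsigned v = (q << 1) | pbit;    // append the p-bit
    // Shift up to 8 bits and fill the low bits with the top bits of v.
    return static_cast<std::uint8_t>((v << (8 - total)) | (v >> (2 * total - 8)));
}
```

With 7 stored bits (mode 6) the same input decodes to 1 rather than 4, so the error is most visible in the lower-precision modes.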

if programCount==1 then ASTC compression fails (PURPLE color)

I'm porting the code to regular C++, so it can be compiled on other platforms without the ISPC compiler.
I've noticed that ASTC compression always fails if programCount==1 (PURPLE color), while programCount>=2 works OK.
It looks like the code assumes programCount>=2 and doesn't work otherwise.
Could you please check?

The simplest solution was to do:

#define programIndex 0
#define programCount 1

But that didn't work (PURPLE color), so I had to do for example this, with programCount>=2 :

for(int programIndex=0; programIndex<programCount; programIndex++)
    ispc::astc_encode_ispc((ispc::rgba_surface*)src, block_scores, dst, list, &list_context, (ispc::astc_enc_settings*)settings, programIndex);

Then it worked.

Maybe this code has something to do with it:

            if (*mode_list < programCount - 1)
            {
                int index = int(mode_list[0] + 1);
                mode_list[0] = index;

                mode_list[index] = (uint64_t(offset) << 32) + mode;
            }
            else
            {
                mode_list[0] = (uint64_t(offset) << 32) + mode;

                astc_encode(src, block_scores.data(), dst, mode_list, settings);
                memset(mode_list, 0, list_size * sizeof(uint64_t));
            }
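To help reason about the quoted snippet, the sketch below isolates two of its pieces: the 64-bit entry packing and the batching condition. The function names are hypothetical and this is not a proposed fix. It only shows that with programCount == 1 the condition can never be true, so every call takes the else branch.

```cpp
#include <cassert>
#include <cstdint>

// Same packing as the quoted code: block offset in the high 32 bits,
// mode in the low 32 bits.
inline std::uint64_t pack_entry(std::uint32_t offset, std::uint32_t mode) {
    return (static_cast<std::uint64_t>(offset) << 32) + mode;
}

inline std::uint32_t entry_offset(std::uint64_t entry) {
    return static_cast<std::uint32_t>(entry >> 32);
}

inline std::uint32_t entry_mode(std::uint64_t entry) {
    return static_cast<std::uint32_t>(entry & 0xFFFFFFFFu);
}

// Mirrors `*mode_list < programCount - 1`: true when another entry can be
// queued before the batch has to be encoded.
inline bool can_queue_more(std::uint64_t queued, int program_count) {
    assert(program_count >= 1);
    return queued < static_cast<std::uint64_t>(program_count - 1);
}
```

In that else branch slot 0 of mode_list is overwritten with a packed entry, although the if branch uses the same slot as the counter. Whether this dual use is related to the failure is not established here.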

BC7 determinism

Hi,

I have a problem with BC7 compression output not being deterministic. I read that there was such an issue before and that it should have been fixed, but here it is again. I tried to force initialization of some local arrays, but this didn't help. Can anything be done about this?

Thanks

Command line parameters?

When using the standalone EXE what are the available command line parameters?

ISPC HDR Texture Compressor still has Apache 2 license headers.

n/t

HDR and 3D texture support

Hey,

It would be great to see HDR and 3D texture support with the ASTC mode of ISPC Texture Compressor. I'm testing the codebase now with plans to do some lightmap baking to ASTC, found here:

https://github.com/boberfly/Urho3D/tree/ispc_texcomp

Cheers

DXT1 compression on images with smooth gradients results in banding

Original:

DXT1 compressed:

Support non-sRGB LDR compression

The comment above CompressBlocksBCn notes that the input LDR image should be 32 bit/pixel (sRGB), but we sometimes need to compress linear textures. It would be better if non-sRGB input were supported as well.

Bad code generation causes incorrect encoding when compiling with clang optimizations

We use ISPCTextureCompressor on a Linux backend. Recently we ran into issues with the code clang generates when optimizing. With optimizations turned on, the generated code causes artifacts in the resulting ASTC images, as seen below:

This is an eyeball texture, and the pink pixels showcase the issue.
We are not sure exactly what is causing this error, only that everything comes out OK without optimizations turned on.
We ended up adding #pragma clang optimize off to ispc_texcomp_astc.cpp to work around the issue and allow the rest of the code to be built with optimizations turned on.

Recently, we ran into a new issue where Clang's new-pass-manager compile flag causes the same behavior, even with our pragma turning optimization off for this code.

Compile error on Ubuntu 16.04: ‘numeric_limits’ is not a member of ‘std’

Ubuntu 16.04, GCC (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, ispc v1.9.2

g++ -O2 -msse2 -fPIC -I. -c ispc_texcomp/ispc_texcomp_astc.cpp -o ispc_texcomp/ispc_texcomp_astc.o
ispc_texcomp/ispc_texcomp_astc.cpp: In function ‘void CompressBlocksASTC(const rgba_surface*, uint8_t*, astc_enc_settings*)’:
ispc_texcomp/ispc_texcomp_astc.cpp:513:45: error: ‘numeric_limits’ is not a member of ‘std’
         block_scores[yy * tex_width + xx] = std::numeric_limits<float>::infinity();
                                             ^
ispc_texcomp/ispc_texcomp_astc.cpp:513:65: error: expected primary-expression before ‘float’
         block_scores[yy * tex_width + xx] = std::numeric_limits<float>::infinity();
                                                                 ^
Makefile.linux:49: recipe for target 'ispc_texcomp/ispc_texcomp_astc.o' failed
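std::numeric_limits is declared in the standard header <limits>, so this error usually means the translation unit does not include that header and no other included header happens to pull it in with this GCC version. The sketch below shows the failing expression compiling once the header is included. The function make_block_scores is hypothetical and only mirrors the initialization at line 513.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>  // declares std::numeric_limits
#include <vector>

// Hypothetical helper: one score per block position, initialized to
// +infinity as in the line that fails to compile.
inline std::vector<float> make_block_scores(int tex_width, int tex_height) {
    assert(tex_width >= 0 && tex_height >= 0);
    return std::vector<float>(
        static_cast<std::size_t>(tex_width) * static_cast<std::size_t>(tex_height),
        std::numeric_limits<float>::infinity());
}
```

Adding #include <limits> near the top of ispc_texcomp/ispc_texcomp_astc.cpp would be the corresponding change in the library source.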