Granite

Granite is my personal Vulkan renderer project.

Why release this?

The most interesting part of this project compared to the other open-source Vulkan renderers so far is probably the render graph implementation.

The project is on GitHub in the hope it might be useful as-is for learning purposes or generating implementation ideas.

Disclaimer

Do not expect any support or help. Pull requests will likely be ignored or dismissed.

License

The code is licensed under MIT. Feel free to use it for whatever purpose.

High-level documentation

See OVERVIEW.md.

Low-level rendering backend

The rendering backend focuses entirely on Vulkan, so it reuses Vulkan enums and data structures where appropriate. However, the API greatly simplifies the more painful points of writing straight Vulkan. It's not designed to be the fastest renderer ever made; it's likely a happy middle ground between "perfect" Vulkan and OpenGL/D3D11 with respect to CPU overhead.

  • Memory manager
  • Deferred destruction and release of API objects and memory
  • Automatic descriptor set management
  • Linear allocators for vertex/index/uniform/staging data
  • Automatic pipeline creation
  • Command buffer tracks state similar to older APIs
  • Uses TRANSFER-queue on desktop to upload linear-allocated vertex/index/uniform data in bulk
  • Vulkan GLSL for shaders; shaders are compiled at runtime with shaderc
  • Pipeline cache save-to-disk and reload
  • Warm up internal hashmaps with Fossilize
  • Easier-to-use fences and semaphores

Missing bits:

  • Multithreaded rendering
  • Precompile all shaders to optimized SPIR-V

Implementation is found in vulkan/.
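
As an illustration of the linear allocator idea listed above, here is a minimal sketch of a bump allocator that hands out aligned offsets into a fixed-size buffer and frees everything at once. The class name, its methods and the reset-per-frame policy are assumptions made for this example; they are not taken from the code in vulkan/.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical bump allocator over a fixed-size buffer.
// Names are illustrative and not Granite's API.
class LinearAllocator
{
public:
    explicit LinearAllocator(size_t capacity) : capacity_(capacity) {}

    // Returns the offset of a block of `size` bytes aligned to `alignment`
    // (which must be a power of two), or std::nullopt if the buffer is full.
    std::optional<size_t> allocate(size_t size, size_t alignment)
    {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        size_t aligned = (offset_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned)
            return std::nullopt;
        offset_ = aligned + size;
        return aligned;
    }

    // Frees every allocation at once, e.g. when the GPU is done with a frame.
    void reset() { offset_ = 0; }

    size_t used() const { return offset_; }

private:
    size_t capacity_;
    size_t offset_ = 0;
};
```

Allocation is a couple of integer operations and there is no per-allocation free, which is why this pattern suits short-lived vertex, index, uniform and staging data.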

High-level rendering backend

A basic scene graph, component system and other higher-level scaffolding live in renderer/. This is probably the most unoptimized and naive part.

PBR renderer

Pretty barebones, half-assed PBR renderer. Very simplified IBL support. Fancy rendering is not the real motivation behind this project.

Post-AA

Fairly straightforward FXAA, SMAA and TAA (no true velocity buffer though).

Automatic shader recompile and texture reload (Linux/Android only)

Shaders and textures are reloaded automatically as soon as they are modified. The implementation uses inotify to do this, so it's exclusive to Linux unless a backend is implemented on Windows (no).

Network VFS

For Linux host and Android device, assets and shaders can be pulled over TCP (via ADB port-forwarding) with network/netfs_server.cpp. Quite convenient.

Validation

In debug build, LunarG validation layers are enabled. Granite is squeaky clean.

Render graph

renderer/render_graph.hpp and renderer/render_graph.cpp contain a fairly complete render graph. It supports:

  • Automatic layout transitions
  • Automatic loadOp/storeOp usage
  • Automatic scaled loadOp for simple lower-res game -> high-res UI rendering scenarios
  • Uses async compute queues automatically
  • Optimal barrier placement: signals as early as possible, waits as late as possible. VkEvent is used for in-queue resources, VkSemaphore for cross-queue resources
  • Basic render target aliasing
  • Can merge two or more passes into a single render pass with multiple subpasses for efficient rendering on tile-based architectures
  • Automatic mip-mapping if requested
  • Uses transient attachments automatically to save memory on tile-based architectures
  • Render target history, read previous frame's results in next frame for feedback
  • Conditional render passes, can preserve render passes if necessary
  • Render passes are reordered for optimal (?) overlap in execution
  • Automatic, optimal multisampled resolve with pResolveAttachments
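
To make the dependency side of the list above concrete, here is a minimal sketch of how passes could be put into a valid execution order from the resources they read and write. It is a plain depth-first topological sort, not the reordering heuristic used in renderer/render_graph.cpp; the Pass struct, the string resource names and order_passes are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pass description: names of the resources it reads and writes.
struct Pass
{
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Returns pass indices ordered so that every pass comes after the passes
// that write the resources it reads. Assumes the dependencies are acyclic.
std::vector<size_t> order_passes(const std::vector<Pass> &passes)
{
    std::vector<size_t> order;
    std::vector<bool> visited(passes.size(), false);

    // Depth-first traversal: emit all producers before the consumer.
    auto visit = [&](auto &&self, size_t index) -> void {
        if (visited[index])
            return;
        visited[index] = true;
        for (const auto &resource : passes[index].reads)
            for (size_t producer = 0; producer < passes.size(); producer++)
                for (const auto &written : passes[producer].writes)
                    if (written == resource)
                        self(self, producer);
        order.push_back(index);
    };

    for (size_t i = 0; i < passes.size(); i++)
        visit(visit, i);
    return order;
}
```

Given a tonemap pass reading "hdr", a lighting pass reading "gbuffer" and writing "hdr", and a G-Buffer pass writing "gbuffer", the result places G-Buffer first, then lighting, then tonemap, regardless of declaration order. A real render graph would additionally cull passes that do not contribute to the final output and pick among valid orders for better overlap.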

I have written up a longer blog post about its implementation here.

The default application scene renderer in application/application.cpp sets up a render graph which does:

  • Conditionally renders a shadow map covering entire scene
  • Renders a close shadow map
  • Automatically pulls in reflection/refraction render passes if present in the scene graph
  • Renders scene G-Buffer with deferred
  • Lighting pass (merged with G-Buffer pass into a single render pass)
  • Bloom threshold pass
  • Bloom pyramid downsampling
  • Async compute is kicked off to get average luminance of scene, adjusts exposure
  • Two upsampling steps to complete blurring in parallel with async
  • Tonemap (HDR + Bloom) rendered to backbuffer (sRGB)
  • (Potentially UI can be rendered on top with merged subpasses)

Scene format

glTF 2.0 with PBR materials is mostly supported. A custom JSON format is also added in order to plug multiple glTF files together for rapid prototyping of test scenes.

Texture formats

  • PNG, JPG, TGA, HDR (via stb)
  • GTX (Granite Texture Format, custom texture format for compressed formats)

ASTC, ETC2 and BCn/DXTn compressed formats are supported.

gltf-repacker

There's a tool to repack glTF models. Textures can be compressed to ASTC or BC using ISPC Texture Compressor. zeux's meshoptimizer library can also optimize meshes. The glTF emitted uses some Granite-specific extras to be more optimal, so it's mostly for internal use.

Compilers

Tested on GCC, Clang, and MSVC 2017.

Platforms

  • SDL3 (Linux / Windows)
  • VK_KHR_display (headless Linux w/ basic keyboard, mouse, gamepad support)
  • libretro Vulkan HW interface
  • Headless (benchmarking)
  • Custom surface plugin
  • Android

Vulkan implementations tested

  • AMD Linux (Mesa, AMDVLK)
  • Intel Linux (Mesa)
  • AMD Windows
  • nVidia Linux
  • Arm Mali (Galaxy S7/S8/S9)
  • Pixel C tablet (Tegra X1)

Build

Plain CMake. Remember to check out submodules with git submodule update --init.

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -G Ninja
ninja -j16 # YMMV :3

For MSVC, it should work to use the appropriate -G flag. There aren't any real samples yet, so not much to do unless you use Granite as a submodule.

viewer/gltf-viewer is a basic glTF viewer used as my sandbox for more complex testing. Try some models from glTF-Sample-Models.

Android

Something à la:

cd viewer
gradle build

Assets used in the default gltf-viewer target are pulled from viewer/assets.

Third party software

These are pulled in as submodules.

Contributors

agnesh, cforfang, joshua-ashton, orbea, themaister

granite's Issues

Invalid push constant ranges

https://github.com/Themaister/Granite/blob/3e4b24dab7e0a0f9ca7496da44cc5aa62a1d60e5/vulkan/shader.cpp#L194..L196

This fragment of code assumes that only used (active) push constants need to be covered by the push constant range in the pipeline layout.

However, the Vulkan spec says:

Similarly, the push constant block declared in each shader (if present) must only place variables at offsets that are each included in a push constant range (...)

So all declared variables should be backed by the range.

The following disabled chunk of code is more correct. It seems the validation layers are not being overly conservative here :) Using it fixes a crash on at least one Vulkan implementation:

// The validation layers are too conservative here, but this is just a performance pessimization.
size_t size =
    compiler.get_declared_struct_size(compiler.get_type(resources.push_constant_buffers.front().base_type_id));
layout.push_constant_offset = 0;
layout.push_constant_range = size;
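
The difference between the two approaches can be shown with a small standalone sketch. It computes the smallest range covering either only the active members or all declared members of a push constant block. The Member and Range structs and covering_range are hypothetical names for this example and do not correspond to SPIRV-Cross or Granite types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reflection data for one member of a push constant block.
struct Member
{
    uint32_t offset;
    uint32_t size;
    bool active; // statically used by the shader
};

struct Range
{
    uint32_t offset;
    uint32_t size;
};

// Smallest range covering the selected members. With only_active == false,
// every declared member is covered, which is what the quoted spec text asks for.
Range covering_range(const std::vector<Member> &members, bool only_active)
{
    uint32_t begin = UINT32_MAX;
    uint32_t end = 0;
    for (const auto &m : members)
    {
        if (only_active && !m.active)
            continue;
        begin = std::min(begin, m.offset);
        end = std::max(end, m.offset + m.size);
    }
    if (begin == UINT32_MAX)
        return {0, 0}; // nothing selected
    return {begin, end - begin};
}
```

For a block with members at offsets 0, 16 and 32 where only the middle one is active, the active-only range is offset 16, size 16, while the declared range starts at offset 0 and spans all three members. The disabled code above takes the simpler route of offset 0 plus the declared struct size.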

How to build from source on MSYS2 MINGW64?

CMake configuration succeeded, but running ninja failed with:

ninja: error: 'third_party/shaderc/third_party/glslang/OGLCompilersDLL/libOGLCompiler.a', needed by 'compiler/libgranite-compiler.a', missing and no known rule to make it

Can push constants be dropped from the texture decoders?

It looks like push constants aren't needed in assets/shaders/decode/*.comp.

The two changes needed for this are:

  1. Replacing registers.resolution with imageSize(OutputImage) in
    if (any(greaterThanEqual(coord.xy, registers.resolution)))
        return;
  2. Replacing registers.error_color with DECODE_8BIT ? uvec4(0xff, 0, 0xff, 0xff) : uvec4(0xffff) in
    imageStore(OutputImage, coord, registers.error_color);

latest version build issues on ubuntu 20

-- Configuring Shaderc to avoid building tests.
-- asciidoctor was not found - no documentation will be generated
CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
Could NOT find PythonInterp: Found unsuitable version "2.7.18", but
required is at least "3" (found /usr/bin/python)
Call Stack (most recent call first):
/usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:391 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.16/Modules/FindPythonInterp.cmake:169 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
third_party/shaderc/cmake/setup_build.cmake:38 (find_package)
third_party/shaderc/CMakeLists.txt:44 (include)

Of course python3 is installed.

cheers,

neshume

Missing assets..

Hi,
I'm just testing Granite for the first time, built on Linux.
I cloned the repo and ran git submodule update --init,
but many samples still report errors about missing assets:
./pcf-test

[ERROR]: open(), error: No such file or directory
[ERROR]: OSFilesystem::open(): MMapFile failed to open file
[ERROR]: Failed to open file: assets://pipelines.json

[ERROR]: open(), error: No such file or directory
[ERROR]: OSFilesystem::open(): MMapFile failed to open file
[ERROR]: Failed to open file: assets://shader_cache.json
[ERROR]: Failed to load shader cache assets://shader_cache.json from disk. Skipping ...

and so on. Also:
./gltf-viewer
[ERROR]: open(), error: No such file or directory
[ERROR]: OSFilesystem::open(): MMapFile failed to open file
[ERROR]: Failed to open file: assets://scene.glb
[ERROR]: application_create() threw exception: Failed to load GLTF file.

Where can I obtain these assets?
Thanks.

Question: Image Upload Synchronization

I have a question about synchronization when it comes to uploading images with initial data.

I've built my own Vulkan implementation very similar to Granite's, using Vulkan.hpp instead, but mostly the same functionally. I'm not sure if I've missed something or if this is a use-case you've simply never covered.

The asset manager in my program loads textures asynchronously on worker threads. This means that images can be created, uploaded, and have mipmaps generated in one frame, but not actually be used that frame in a shader. As far as I can tell, there doesn't appear to be any synchronization for this situation, because I would get errors that command pools are being reset before the image is fully uploaded. This makes sense, since there are no barriers or semaphores keeping the command buffer alive until the image is uploaded.

But I'm not sure where the best place to put such synchronization would be. Would I increment the queue's timeline value and insert a signal on that? Would I create a semaphore for each image upload? Or does this synchronization already exist in Granite and I'm simply misusing it?

texture-decoder-test issues

I noticed texture-decoder-test is not passing for me with ASTC formats; it fails on "4 x 4 UNORM". I modified the test to not exit on the first failure, and I see more failures if I let it continue. Do you think this could be a platform-specific issue or a Granite issue? Is there a known 'good point' where the tests passed?

I've attached a log where I've commented out everything but the ASTC test run and forced all the sub-tests to run:
granite.log

Cannot compile application_glfw.cpp

Hello!

I am trying to build using Visual Studio 2015 Update 3, but the following happens:

application_glfw.cpp(52): error C2664: 'bool Vulkan::Context::init_loader(PFN_vkGetInstanceProcAddr)': cannot convert argument 1 from 'GLFWvkproc (__cdecl *)(VkInstance,const char *)' to 'PFN_vkGetInstanceProcAddr'

This is because the calling convention is incompatible here: it seems GLFW uses __cdecl, while Vulkan function pointers are __stdcall.

Wrong type usage working because of implicit conversion

renderer/render_graph.hpp:251 declares the following member:

VkPipelineStageFlags used_queues = 0;

This compiles because it is implicitly converted where used, but judging by the usage it should be of type RenderGraphQueueFlagBits.
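
A small sketch of why the compiler accepts this: flag types that are plain integer typedefs are the same type as far as C++ is concerned, so mixing them up is silent. The aliases and enum below are stand-ins written for this example (in Vulkan, VkPipelineStageFlags is a typedef of a 32-bit integer); StrongQueueFlags is a hypothetical wrapper and not something that exists in Granite.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-ins for the real types, both plain 32-bit integer aliases.
using PipelineStageFlags = uint32_t;
using QueueFlags = uint32_t;

// Hypothetical queue bits, for illustration only.
enum QueueFlagBits : uint32_t
{
    QUEUE_GRAPHICS_BIT = 1u << 0,
    QUEUE_COMPUTE_BIT = 1u << 1,
};

// Both aliases name the same type, so the compiler cannot tell them apart.
static_assert(std::is_same<PipelineStageFlags, QueueFlags>::value,
              "aliases are interchangeable");

// A strongly typed wrapper turns the mix-up into a compile error instead.
struct StrongQueueFlags
{
    uint32_t bits = 0;

    StrongQueueFlags &operator|=(QueueFlagBits bit)
    {
        bits |= bit;
        return *this;
    }

    bool has(QueueFlagBits bit) const { return (bits & bit) != 0; }
};

static_assert(!std::is_convertible<PipelineStageFlags, StrongQueueFlags>::value,
              "stage flags no longer convert to queue flags");
```

With the wrapper, a member declared as StrongQueueFlags can only be combined with QueueFlagBits values, so declaring it with the wrong flag type would fail to compile at the first use.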

compile with GRANITE_FFMPEG GRANITE_FFMPEG_VULKAN failed

Compiling the latest code fails with errors about unknown members in FFmpeg's hwcontext_vulkan.h:

/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:892:9: error: no member named 'lock_queue' in 'AVVulkanDeviceContext'
vk->lock_queue = [](AVHWDeviceContext *ctx, int, int) {
~~ ^
/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:897:9: error: no member named 'unlock_queue' in 'AVVulkanDeviceContext'
vk->unlock_queue = [](AVHWDeviceContext *ctx, int, int) {
~~ ^
/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:949:11: error: no member named 'img_flags' in 'AVVulkanFramesContext'
vk->img_flags |= VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT;
~~ ^
/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:1192:34: error: no member named 'lock_frame' in 'AVVulkanFramesContext'
inline void lock() const { vk->lock_frame(frames, vk_frame); }
~~ ^
/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:1193:36: error: no member named 'unlock_frame' in 'AVVulkanFramesContext'
inline void unlock() const { vk->unlock_frame(frames, vk_frame); }
~~ ^
/Users/xxx/workspace/Granite/video/ffmpeg_decode.cpp:1207:19: error: no member named 'img_flags' in 'AVVulkanFramesContext'
info.flags = vk->img_flags;
~~ ^
/Users/panbin/workspace/Granite/video/ffmpeg_decode.cpp:1241:16: error: no member named 'queue_family' in 'AVVkFrame'
if (vk_frame->queue_family[0] != VK_QUEUE_FAMILY_IGNORED)

[Question] Pipeline prewarm

I'm opening an issue since there is no discussions tab.

I was wondering how the renderer behaves with D3D11-like logic, where the pipeline is created right before draw/dispatch.

I saw that you use Fossilize to prewarm pipelines. In my engine I'm using VkPipelineCache; should I expect a speedup?
How big would the cache get in a real game, for example?

Thanks in advance

Question: DescriptorPool Design

Hey Themaister, I recently read your blog in https://themaister.net/blog/2019/04/20/a-tour-of-granites-vulkan-backend-part-3/ about the descriptor set management in Granite. In the blog you talked about the driver implementation of descriptor pool.

On certain GPUs, allocating descriptor sets is, or at least used to be very costly. The descriptor pools might not be implemented as true pools (sigh …), so every vkAllocateDescriptorSets would mean a global heap allocation, absolutely horrible for performance.

I would like to know if mainstream PC/console platforms use this mechanism, and I would appreciate it if you could provide some documents or implementation detail about that.
