Shaders should be considered engine source code and stored alongside the rest of the code, rather than grouped with asset files. This also allows shaders to appear in an IDE's project files, instead of having to open them via a file explorer.
Currently, shaders are written as fully self-contained sources, leading to lots of duplication for common operations. Instead, common functionality should be split out into HLSL includes: for example, Mesh.hlsli for mesh-related operations, Vertex.hlsli for standardized programmable vertex pulling, and Color.hlsli for color space conversion utilities.
When writing to a texture, the source placed-footprint resource (the upload buffer) needs to be aligned to D3D12_TEXTURE_DATA_PLACEMENT_ALIGNMENT (512 bytes). This is normally the case, since power-of-two textures are common; however, if a buffer with static update frequency that isn't 512-byte aligned is written first, the next texture write will be misaligned and throw an error.
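A minimal sketch of the fix: round every upload-heap offset up to the placement alignment before placing texture data. The constant value 512 matches D3D12_TEXTURE_DATA_PLACEMENT_ALIGNMENT in d3d12.h; the helper name is mine.

```cpp
#include <cassert>
#include <cstdint>

// D3D12_TEXTURE_DATA_PLACEMENT_ALIGNMENT is defined as 512 in d3d12.h.
constexpr uint64_t kTexturePlacementAlignment = 512;

// Round an upload-heap offset up to the next multiple of `alignment`
// (alignment must be a power of two).
constexpr uint64_t AlignUp(uint64_t offset, uint64_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

Applying this to the running upload-buffer offset before each texture copy makes the buffer-then-texture write order irrelevant.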
When hovering the mouse over the edges of the engine window, the resize cursor image only appears for a single frame, before returning to the default Windows cursor image. ImGui windows have the proper cursor behavior.
In the existing render graph implementation, transient resources are recreated and destroyed every frame, which is a massive CPU bottleneck. Resources should be pooled and reused across frames where possible to avoid this waste.
When waiting for a frame deadline, simply sleeping until the target time is imprecise, with varying amounts of overshoot. A better solution is to raise the timer resolution, sleep until some small unit of time before the deadline, and then spin-wait for the remaining segment. This will significantly reduce frame time variation.
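A platform-neutral sketch of the sleep-then-spin approach (on Windows this would be paired with timeBeginPeriod(1); the 2 ms margin is an assumed value that should exceed the scheduler's wake-up slop):

```cpp
#include <chrono>
#include <thread>

// Sleep coarsely until `margin` before the deadline, then spin for the rest.
void WaitUntil(std::chrono::steady_clock::time_point deadline,
               std::chrono::microseconds margin = std::chrono::milliseconds(2))
{
    const auto coarse = deadline - margin;
    if (std::chrono::steady_clock::now() < coarse)
        std::this_thread::sleep_until(coarse); // imprecise, but cheap
    while (std::chrono::steady_clock::now() < deadline)
        std::this_thread::yield();             // precise spin for the last stretch
}
```

The margin trades a little CPU burn for precision; it only needs to cover the worst-case sleep overshoot at the raised timer resolution.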
The existing render graph is hardcoded and fairly rigid. The editor should expose tools to change the render graph, such as allowing passes to be enabled and disabled in real time. This will need to be implemented after dynamic shader pipelines, where pipelines will need to be produced procedurally based on the render pass's info.
The existing descriptor binding model tries to emulate D3D11, but this isn't the ideal solution. Instead of tightly coupling descriptors with resources, consider looking into a "resource binder" object, or potentially creating descriptors on demand. For material textures, consider not even exposing the underlying resource; instead, expose a collection of SRVs.
A large number of objects are leaking in the DX12 runtime, roughly one object per frame by my estimate. All of the leaked objects have a refcount of 1, so I'm assuming these are leaked transient resources from the render graph.
The existing engine-side system, which uses offline and online descriptor heaps and copies descriptors when constructing descriptor tables, works well. However, external systems such as ImGui aren't integrated with it, which causes unbound resource errors when attempting to use an engine-managed texture within ImGui. An example of this is ImGui::Image(), where the image passed might be a texture from a material, which resides in an offline heap. In PIX, the texture binding will show up as "referenced but not bound to the pipeline stage".
Command lists should record a list of the resources they use for later validation within the render graph to ensure all read/written resources within the pass are declared.
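A sketch of what that validation could look like, with a hypothetical handle type standing in for whatever the engine actually uses:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical handle type; the real engine presumably has its own.
using ResourceHandle = unsigned;

struct RecordingCommandList
{
    std::unordered_set<ResourceHandle> used;
    void MarkUsed(ResourceHandle r) { used.insert(r); }
};

// Returns true if every resource the command list touched was declared
// as a read or write of the pass in the render graph.
bool ValidatePassUsage(const RecordingCommandList& list,
                       const std::unordered_set<ResourceHandle>& declared)
{
    for (ResourceHandle r : list.used)
        if (declared.find(r) == declared.end())
            return false;
    return true;
}
```

Recording would only be enabled in debug builds, so the bookkeeping cost disappears in release.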
Vertices are 56 bytes in size, which makes bandwidth the bottleneck in high-throughput renders. Vertices should be optimized and compressed down to 32 bytes. Look into: computing the tangent/bitangent on the GPU from UVs, and octahedral normal encoding.
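For reference, a CPU-side sketch of octahedral normal encoding: a unit normal maps to two values in [-1, 1], which can then be quantized to 2x16-bit snorm, cutting the 12-byte normal down to 4 bytes. The struct names are placeholders.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// Project the unit sphere onto an octahedron, then unfold it onto a square.
Vec2 OctEncode(Vec3 n)
{
    float s = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
    float x = n.x / s, y = n.y / s;
    if (n.z < 0.0f) // fold the lower hemisphere over the square's corners
    {
        float ox = (1.0f - std::fabs(y)) * (x >= 0.0f ? 1.0f : -1.0f);
        float oy = (1.0f - std::fabs(x)) * (y >= 0.0f ? 1.0f : -1.0f);
        x = ox; y = oy;
    }
    return { x, y };
}

Vec3 OctDecode(Vec2 e)
{
    Vec3 n { e.x, e.y, 1.0f - std::fabs(e.x) - std::fabs(e.y) };
    if (n.z < 0.0f) // undo the fold
    {
        float ox = (1.0f - std::fabs(n.y)) * (n.x >= 0.0f ? 1.0f : -1.0f);
        float oy = (1.0f - std::fabs(n.x)) * (n.y >= 0.0f ? 1.0f : -1.0f);
        n.x = ox; n.y = oy;
    }
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}
```

The same math ports directly to the vertex-pulling HLSL, where the decode runs per vertex.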
The existing camera movement utilities are rough and were quickly thrown together. A proper FPS-style movement system should be implemented, using delta time so movement is not frame-rate dependent. There also needs to be a way to reacquire control, such as when clicking on the viewport window.
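The delta-time part reduces to scaling displacement by the frame's elapsed seconds, so a speed in units/second holds at any frame rate. A minimal sketch (field and type names are placeholders):

```cpp
#include <cassert>
#include <cmath>

// Frame-rate-independent movement: displacement scales with delta time.
struct FlyCamera
{
    float x = 0.0f, z = 0.0f;
    float speedUnitsPerSec = 5.0f;

    void Move(float forwardInput, float strafeInput, float deltaSeconds)
    {
        z += forwardInput * speedUnitsPerSec * deltaSeconds;
        x += strafeInput  * speedUnitsPerSec * deltaSeconds;
    }
};
```

Two 0.5 s frames and one 1 s frame then produce identical displacement, which is exactly the property the current code lacks.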
When rendering the scene, we don't take into account the editor viewport, which causes any differences in aspect ratio to lead to stretching when viewing the scene. We need to compute the available content region of the viewport and pass it back to the renderer.
The engine processes every Windows input event, which is ideal for operations such as typing, where we don't want to lose keystrokes at low framerates or during hitches. However, this behavior is suboptimal for movement inputs, such as controlling a camera: at low framerates, processing every event builds a large message backlog that results in very latent, unresponsive movement.
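One possible fix is to coalesce relative motion events into a single per-frame delta while keeping discrete events (keystrokes) individually queued. A sketch, assuming hypothetical names:

```cpp
#include <cassert>

// Accumulate relative mouse-motion events; the camera consumes one combined
// delta per frame instead of replaying the whole backlog.
struct MouseDeltaAccumulator
{
    int dx = 0, dy = 0;

    void OnMouseMove(int eventDx, int eventDy)
    {
        dx += eventDx;
        dy += eventDy;
    }

    // Called once per frame; returns the combined delta and resets.
    void Consume(int& outDx, int& outDy)
    {
        outDx = dx; outDy = dy;
        dx = dy = 0;
    }
};
```

However deep the message backlog gets, the camera only ever sees the net motion for the frame, so a hitch can't turn into seconds of replayed movement.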
The render overlays in the editor only work if the scene window is docked and in view; otherwise things break down. The overlay proxy should be integrated differently, relying less on direct cursor offsets, following how the console window system works.
There is a visual mismatch between the drawn cursor position and where ImGui believes the cursor is on screen; the further the physical cursor moves from the window origin, the more it desyncs. This can be seen when hovering over the resize icon on the bottom right of ImGui windows: if the window is near the top of the physical window, it works fairly well, but if the window is near the bottom, there is a significant offset.
The existing lighting uses incorrect normal mapping: the normal map is sampled and fed directly into the lighting equation, without being transformed from tangent space into world space via the surface's TBN basis.
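For clarity, the missing step in CPU-side form (the shader version is the same math in HLSL; an orthonormal TBN is assumed here):

```cpp
#include <cassert>
#include <cmath>

struct Float3 { float x, y, z; };

// Transform a tangent-space normal-map sample into world space using the
// surface's tangent (t), bitangent (b), and normal (n).
Float3 ApplyNormalMap(Float3 sampleRgb, Float3 t, Float3 b, Float3 n)
{
    // Remap from [0, 1] texture range to [-1, 1] vector range.
    Float3 ts { sampleRgb.x * 2.0f - 1.0f,
                sampleRgb.y * 2.0f - 1.0f,
                sampleRgb.z * 2.0f - 1.0f };
    // World-space normal = ts.x * T + ts.y * B + ts.z * N.
    return { ts.x * t.x + ts.y * b.x + ts.z * n.x,
             ts.x * t.y + ts.y * b.y + ts.z * n.y,
             ts.x * t.z + ts.y * b.z + ts.z * n.z };
}
```

A flat normal-map texel (0.5, 0.5, 1.0) then correctly reproduces the surface normal, whatever the surface's orientation, which the current direct-sample path gets wrong.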
Tracy uses vcpkg, so we should invoke the shipped installation script in order to acquire the server's dependencies. Without this, attempting to build without manually fetching the dependencies results in a failed TracyServer build.
Integrate the new Agility SDK, which decouples new DirectX features from Windows updates. This will relax the minimum Windows version required to run the engine. Refer to this blog post for details.
Many materials use opacity mapping, which needs to take an alpha-testing path in a shader. A new render pass needs to be introduced for this, which would run after the opaque forward pass. In the Sponza model, the foliage and chain links use opacity mapping.
When logging, if data is passed whose type matches PlatformErrorType, it will be interpreted as a platform error (which may not be intended) due to the lack of context.
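One option is a thin wrapper type so only values explicitly tagged as platform errors take the error path; a plain integer of the same underlying type can no longer be misread. A sketch with hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using PlatformErrorType = uint32_t; // e.g. the underlying Win32 error code type

// Distinct type: constructing one is an explicit statement of intent.
struct PlatformError
{
    PlatformErrorType code;
};

std::string Describe(PlatformErrorType v) // plain data, logged as-is
{
    return "value " + std::to_string(v);
}

std::string Describe(PlatformError e)     // explicitly a platform error
{
    return "platform error " + std::to_string(e.code);
}
```

Overload resolution then picks the interpretation from the type the caller chose, rather than guessing from the raw integer.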
With shader model 6.6, resources can be accessed in shaders exclusively via bindless indices, without the need for multiple tables with overlapping unbounded arrays. Additionally, root signatures can have more root constant data than needed without issue. In theory one root signature with 64 DWORDs of root constants and nothing else should be all that's needed for every shader. Other notes: the runtime combines duplicate root signatures into one. Could use shader code generation for handling static samplers.
Use the info queue to automatically break the program when an error or corruption is detected in the debug layer. This can help track down rendering issues where the application will continue to run in a partially broken state.
If more than one full GPU synchronization is dispatched within a single frame, only the first will be obeyed. This is due to a lack of per-frame sync values, which multi-sync requires.
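The fix amounts to handing out a distinct fence value per sync point instead of one per frame, so each synchronization waits on its own signal. A sketch of the bookkeeping (GPU progress is simulated here; the real version would signal/wait on an ID3D12Fence):

```cpp
#include <cassert>
#include <cstdint>

struct SyncTracker
{
    uint64_t nextValue = 1;  // monotonically increasing value to signal next
    uint64_t completed = 0;  // last value the GPU has reached

    uint64_t Signal() { return nextValue++; }  // caller waits on the returned value
    void GpuReach(uint64_t v) { if (v > completed) completed = v; }
    bool IsDone(uint64_t v) const { return completed >= v; }
};
```

Two full syncs in the same frame now get values N and N+1, and waiting on N+1 is no longer satisfied by the first signal.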
Render passes all share the same zone name in Tracy because the same (static) SourceLocation is used for each pass. It should be possible to create a unique source location for each pass without relying on macros.
I don't see a case where we won't be using programmable pulling for vertices, so exposing traditional vertex buffers is pointless. Removing them will reduce the existing bind flags for buffers.
The current method of getting the scene output in the editor GUI is to use a separate fullscreen render target, which acts as a back buffer texture. This is very wasteful with respect to GPU memory bandwidth, so consider drawing the scene inline with the editor GUI using ImDrawCallbacks.
When inspecting a depth texture in the editor, the depth should be linearized to allow for a more useful visual analysis. To do this, consider using a separate pipeline state for depth texture rendering that remaps the pixel values when drawing.
When a fatal log occurs, it will cause the attached debugger to break in Logging.h, instead of the actual offending line. This requires the user to view the callstack and manually step out a frame to get to the proper line, which isn't the desired behavior.
Shaders are compiled once at engine startup, but we should allow for hot reloading and recompiling all affected pipeline states. This will allow for modifying shader sources while the engine is running, without having to restart in order to see changes.
fmtlib formats HRESULTs poorly, so a custom formatter should be introduced. Ideally, format HRESULTs as unsigned hex with the 0x prefix and uppercase digits. See fmtlib/fmt#235. Look into stringifying the error code as well.
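The target output shape, sketched with snprintf so it stays self-contained (the real version would be a fmt::formatter&lt;HRESULT&gt; specialization; int32_t stands in for HRESULT since windows.h isn't assumed here):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format an HRESULT-like value as "0x" + eight uppercase hex digits.
std::string FormatHResult(int32_t hr)
{
    char buf[11]; // "0x" + 8 digits + null terminator
    std::snprintf(buf, sizeof(buf), "0x%08X", static_cast<uint32_t>(hr));
    return buf;
}
```

The cast through uint32_t avoids sign-extension surprises, since failure HRESULTs have the high bit set.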