d3d12translationlayer's Introduction

D3D12 Translation Layer

The D3D12 Translation Layer is a helper library for translating graphics concepts and commands from a D3D11-style domain to a D3D12-style domain.

A D3D11-style application generally:

  • Records graphics commands in a single-threaded manner.
  • Treats the CPU and GPU timeline as a single timeline. While there are some places where the asynchronicity of the GPU is exposed (e.g. queries or do-not-wait semantics), for the most part, a D3D11-style application can remain unaware of the fact that commands are recorded and executed at a later point in time.
    • Related to this is the fact that CPU-visible memory contents must be preserved from the time the CPU wrote them, until after the GPU has finished reading them in order to maintain the illusion of a single timeline.
  • Creates individual state objects (e.g. blend state, rasterizer state) and compiles individual shaders, and only at the time when Draw is invoked does the application provide the full set of state which will be used.
  • Ignores GPU parallelism and pipelining, trusting that driver introspection will maximize GPU utilization by parallelizing when possible, while preserving D3D11 semantics by synchronizing when necessary.

In contrast, a D3D12-style application must:

  • Be aware of GPU asynchronicity, and manually synchronize the CPU and GPU.
  • Be aware of GPU parallelism, and manually synchronize/barrier/transition resources from one usage to another.
  • Manage memory, including allocation, deallocation, and "renaming" (discussed further below).
  • Provide large bundles of state (called pipeline state objects in D3D12) all at once to enable cross-pipeline compilation and optimization.
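
The first and third responsibilities above (manual CPU/GPU synchronization and manual deallocation) are often combined into a fence-keyed deletion queue. The following is a minimal sketch of that idea, not this library's implementation: `DeferredDeletionQueue`, `Retire`, and `Trim` are hypothetical names, and the completed fence value is passed in as a plain integer. In a real D3D12 application it would typically come from `ID3D12Fence::GetCompletedValue`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical deferred-deletion queue keyed by fence value (illustration only).
class DeferredDeletionQueue {
public:
    // Schedule `destroy` to run once the GPU has passed `fenceValue`.
    void Retire(std::uint64_t fenceValue, std::function<void()> destroy) {
        // Entries are expected in non-decreasing fence order.
        assert(m_pending.empty() || m_pending.back().first <= fenceValue);
        m_pending.emplace_back(fenceValue, std::move(destroy));
    }

    // Run every callback whose fence value the GPU has completed.
    // Returns the number of objects destroyed.
    std::size_t Trim(std::uint64_t completedFenceValue) {
        std::size_t destroyed = 0;
        while (!m_pending.empty() &&
               m_pending.front().first <= completedFenceValue) {
            m_pending.front().second();
            m_pending.pop_front();
            ++destroyed;
        }
        return destroyed;
    }

    std::size_t PendingCount() const { return m_pending.size(); }

private:
    std::deque<std::pair<std::uint64_t, std::function<void()>>> m_pending;
};
```

An application would call `Retire` where a D3D11-style app would simply release the object, and call `Trim` periodically (for example, once per frame) with the latest completed fence value.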

To that end, this library provides an implementation of an API that looks like D3D11, and submits work to D3D12.

Make sure that you visit the DirectX Landing Page for more resources for DirectX developers.

Project Background

This project was started during the development of Windows 10 and D3D12. The Windows graphics team has a large set of D3D11 content which was heavily utilized during design and bringup of the D3D12 runtime and driver models. In order to use that content, a mapping layer, named D3D11On12, was developed.

This mapping layer proved successful and useful, to the point that a second mapping layer was developed, named D3D9On12. As the name implies, this maps from D3D9 to D3D12, and has to solve a lot of the same problems as D3D11On12. So, D3D11On12 was refactored into two pieces: a part that implements the D3D11-specific concepts, and a more general part that translates more traditional graphics constructs into a modern low-level D3D12 API consumer. This more general part is what became the D3D12TranslationLayer.

This code is currently being used by two mapping layers that ship as part of Windows: D3D11On12 and D3D9On12. In addition to the core D3D12TranslationLayer code, we also have released the source to D3D11On12, to serve as an example of how to consume this library.

What does this do?

This translation layer provides the following high-level constructs (and more) for applications to use:

  • Resource binding
    The D3D12 resource binding model is quite different from D3D11 and prior. Rather than having a flat array of resources set on the pipeline which map 1:1 with shader registers, D3D12 takes a more flexible approach which is also closer to modern hardware. The translation layer takes care of figuring out which registers a shader needs, managing root signatures, populating descriptor heaps/tables, and setting up null descriptors for unbound resources.

  • Resource renaming
    D3D11 and older have a concept of DISCARD CPU access patterns, where the CPU populates a resource, instructs the GPU to read from it, and then immediately populates new contents without waiting for the GPU to read the old ones. This is typically implemented via a technique called "renaming", where new memory is allocated during the DISCARD operation, and all future references to that resource in the API will point to the new memory rather than the old. The translation layer provides a separation of a resource from its "identity," which enables cheap swapping of the underlying memory of a resource for that of another one without having to recreate views or rebind them. It also provides easy access to rename operations (allocate new memory with the same properties as the current, and swap their identities).

  • Resource suballocation, pooling, and deferred destruction
    D3D11-style apps can destroy objects immediately after instructing the GPU to do something with them. D3D12 requires applications to hold on to memory and GPU objects until the GPU has finished accessing them. Additionally, D3D11 apps suffer no penalty from allocating small resources (e.g. 16-byte buffers), where D3D12 apps must recognize that such small allocations are infeasible and should be suballocated from larger resources. Furthermore, constantly creating and destroying resources is a common pattern in D3D11, but in D3D12 this can quickly become expensive. The translation layer handles all of these abstractions seamlessly.

  • Batching and threading
    Since D3D11 patterns generally require applications to record all graphics commands on a single thread, there are often other CPU cores that are idle. To improve utilization, the translation layer provides a batching layer which can sit on top of the immediate context, moving the majority of work to a second thread so it can be parallelized. It also provides threadpool-based helpers for offloading PSO compilation to worker threads. Combining these means that compilations can be kicked off at draw-time on the application thread, and only the batching thread needs to wait for them to be completed. Meanwhile, other PSO compilations are starting or completing, minimizing the wall clock time spent compiling shaders.

  • Residency management
    This layer incorporates the open-source residency management library to improve utilization on low-memory systems.
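
As an illustration of the resource renaming item above, here is a minimal sketch of separating a resource from its "identity". It assumes a plain byte buffer stands in for GPU memory; `Identity`, `Resource`, `Rename`, and `Current` are hypothetical names chosen for this example and are not taken from the library's source.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the GPU memory backing a resource.
struct Identity {
    std::vector<unsigned char> bytes;
    explicit Identity(std::size_t size) : bytes(size) {}
};

// Hypothetical resource wrapper. Views and bindings would refer to the
// Resource object, so swapping its identity does not invalidate them.
class Resource {
public:
    explicit Resource(std::size_t size)
        : m_identity(std::make_shared<Identity>(size)) {}

    // DISCARD-style rename: allocate fresh memory with the same size and
    // swap it in. The previous identity is returned so the caller can keep
    // it alive until the GPU has finished reading it.
    std::shared_ptr<Identity> Rename() {
        auto fresh = std::make_shared<Identity>(m_identity->bytes.size());
        std::swap(m_identity, fresh);
        return fresh;  // after the swap, this holds the previous identity
    }

    Identity& Current() { return *m_identity; }

private:
    std::shared_ptr<Identity> m_identity;
};
```

After `Rename`, the CPU can write new contents through `Current()` right away, while the returned identity still holds the bytes the GPU may be reading.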

Building

This project produces a lib named D3D12TranslationLayer.lib. If the WDK is installed alongside the Windows SDK, a second project is also created, which produces a lib named D3D12TranslationLayer_WDK.lib.

The D3D12TranslationLayer project requires C++17, and only supports building with MSVC at the moment.

Contributing

This project welcomes contributions. See CONTRIBUTING for more information. Contributions to this project will flow back to the D3D11On12 and D3D9On12 mapping layers included in Windows 10.

Roadmap

There are three items currently on the roadmap:

  1. Refactoring for additional componentization - currently the translation layer is largely implemented by a monolithic class called the ImmediateContext. It would be difficult for an application consumer to pick and choose bits and pieces of functionality provided by this class, but that would be desirable to ease application porting to D3D12 while enabling the application to take on only those responsibilities with which it can achieve improved performance through app-specific information.
    The high-level thinking here is to require consumers to have an ImmediateContext object, and have sub-components that are registered with that context. For example, the resource state tracker component need not always be present, and applications could provide explicit resource barriers rather than relying on the ImmediateContext to do it for them.
    A key constraint on this componentization is that it should not negatively impact performance.
  2. Supporting initial data upload on D3D12_COMMAND_LIST_TYPE_COPY for discrete GPUs, and using WriteToSubresource for UMA (integrated) GPUs. This should improve performance.
  3. Supporting multi-GPU scenarios, specifically, using multiple nodes on a single D3D12 device. Currently, the D3D12TranslationLayer only supports one node, though it can be a node other than node 0.

Other suggestions or contributions are welcome.

Data Collection

The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft's privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices.

Specifically:
The g_hTracelogging variable has events emitted against it with keywords which may trigger telemetry to be sent, depending on the configuration and registration of the tracelogging provider which is used. If no tracelogging provider is specified (using D3D12TranslationLayer::SetTraceloggingProvider) or if the specified provider is not configured for telemetry, then no telemetry will be sent. In the default configuration, no provider is created and no data is sent.

d3d12translationlayer's People

Contributors

bhouse-intel, colta95, hanfling, iandunn-intel, jacquesvanrhynmsft, jenatali, joecitizen, microsoft-github-policy-service[bot], python3kgae, randytiddmsft, randytiddmsft2, sivileri, strega-nil, vdwtanner

d3d12translationlayer's Issues

D3D9: Picture is very bright, FPS is very poor

I was really hoping I could use this to interop with OpenXR (I'd been doing an OpenGL port with ANGLE for weeks when I stumbled across this on Wikipedia's WDDM page). For some reason, though, it makes my game app very bright (bright white colors), and the frame rate is very low. I can't say exactly how low: I'm on a new PC with D3D12 that should run the app with frame rate to spare several times over, yet D3D9on12 runs well below 60 fps (the average is on the lower side of 30-60 fps with 60 Hz syncing).

TEST RESULTS: For the record, with "vsync" disabled, plain D3D9 runs about 145fps and D3D9on12 runs about 45fps in this case.

EDITED: I should add I'm using the Windows SDK version of d3d9.dll, on up-to-date "fast ring" "insiders build" of Windows 10.

The transition from exclusive state to shared state must be accompanied by copying of actual state values.

m_spExclusiveState[0].IsMostRecentlyExclusiveState = false;

Most of the time, resources are exclusive, and they make the transition on first use on another queue, typically during PostSubmitUpdateState() -> SetSharedResourceState(). Do I understand correctly that we have to copy the actual exclusive state to the shared state in order to be sure that the next transition as a shared resource will use the correct "before" state?

Dirty bit handling can cause binding of stale descriptors

Reported by ThomasY on Discord.

This app bound a constant buffer to the PS stage, bound a pixel shader that used that constant buffer, and drew. This updated the descriptor table for constant buffers.

It then deleted that constant buffer, flushed the command list, changed the constant buffer bound to the PS, and bound a pixel shader that does not use a constant buffer, and drew.

This combination of events caused us to skip updating the PS CB descriptor table, because the currently-bound pixel shader couldn't use the new one, but then still call SetGraphicsRootDescriptorTable on the command list for the previous PS descriptor table, which now contains a descriptor pointing to a deleted resource.

I'm not sure offhand what the fix here would be. Probably to skip this binding until a PS is bound that can actually reference one of these CBs.

Lacking documentation on 9on12 Interop API

I'm trying to get my hands on 9On12 but I can't seem to find any good source of documentation besides your recent blogpost.

I'm running on 19041.208 (insider slow ring). In a v140 project I'm using the 10.0.19041.0 target version. I got the header files but I can not link the Direct3DCreate9On12 function (LNK2001). Do I need to specify something in order to make that work? For now I'm working around this issue with a dynamic call as shown here.

Also...

    D3D9ON12_ARGS arg = {};
    arg.Enable9On12 = true;
    m_pD3D = RunD3D9Proc<decltype(Direct3DCreate9On12)>("Direct3DCreate9On12", D3D_SDK_VERSION, &arg, 1);

the code above succeeds and the device is created and everything works fine BUT...
Querying the interface for IDirect3DDevice9On12 fails with E_NOINTERFACE. Why is that?

    IDirect3DDevice9On12* mpD3D12 = NULL;
    hr = m_pD3D->QueryInterface(__uuidof(IDirect3DDevice9On12), (void**)&mpD3D12);

I would look up how to solve this myself but I just can't find any good resource on this.

Creating a query heap per query might be too expensive

Trying to run Tracy through D3D11On12 takes far too long to start up, with pretty much all of the time spent creating 64*1024 timestamp queries. Each of these timestamp queries creates a 4-query heap, which likely involves allocating a 64KiB query heap to store 32B.

This should probably turn into a device (immediate context) level query heap pool per type where query heap slots can be suballocated out to individual queries.
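
A rough sketch of what such a pool could look like follows. It is only an illustration of the suballocation idea: `QueryHeapPool` and `QuerySlot` are hypothetical names, the heaps are modeled as counters rather than real `ID3D12QueryHeap` objects, and the slots-per-heap value is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical slot handle: which pooled heap, and which query index in it.
struct QuerySlot {
    std::uint32_t heapIndex;
    std::uint32_t slotIndex;
};

// Hypothetical pool for a single query type. Each "heap" stands in for one
// query heap created with `slotsPerHeap` queries.
class QueryHeapPool {
public:
    explicit QueryHeapPool(std::uint32_t slotsPerHeap)
        : m_slotsPerHeap(slotsPerHeap) {
        assert(slotsPerHeap > 0);
    }

    QuerySlot Allocate() {
        // Prefer recycling a previously freed slot.
        if (!m_free.empty()) {
            QuerySlot slot = m_free.back();
            m_free.pop_back();
            return slot;
        }
        // Grow by one heap only when the current heap is exhausted.
        if (m_heapCount == 0 || m_nextSlot == m_slotsPerHeap) {
            ++m_heapCount;  // a real pool would create a query heap here
            m_nextSlot = 0;
        }
        return QuerySlot{m_heapCount - 1, m_nextSlot++};
    }

    // The caller must only free a slot once the GPU is done with it.
    void Free(QuerySlot slot) { m_free.push_back(slot); }

    std::uint32_t HeapCount() const { return m_heapCount; }

private:
    std::uint32_t m_slotsPerHeap;
    std::uint32_t m_heapCount = 0;
    std::uint32_t m_nextSlot = 0;
    std::vector<QuerySlot> m_free;
};
```

With 4096 slots per heap in this model, the 64*1024 queries from the report above would need 16 heaps rather than one heap per query.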

Lacking documentation on Direct3DCreate9On12

Would it be possible to create an example app which is using the D3D9on12 layer end-to-end? For me IDirect3D9 gets created correctly by Direct3DCreate9On12 but when I call:

g_pD3D->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, _hWnd, 0, &g_d3dpp, &g_pd3dDevice);

it fails with HR = 0x8876086C (unable to detect a supported Graphics Card).

Thanks!

Missing d3d9.lib exports

As described here, d3d9.lib exports for Direct3DCreate9On12 and others are missing.

I can confirm this issue on windows version 19041.208 (insider slow ring) on a v140 project using the 10.0.19041.0 target SDK version.

The exports exist in the d3d9.dll however and calling those functions can be achieved with a hack - basically using GetProcAddress to retrieve the exported function and calling it.

D3D11 Dependency

Scanning through the code, I discovered this include:

#include <d3d11_3.h>

I just wonder why. Shouldn't the translation layer be API-agnostic, since it is mainly used to implement higher-level APIs in turn?

Circular dependency in include files

I'm trying to compile the library using CMake and Ninja instead of the Visual Studio generators. I was aware that IntelliSense is smarter when used in tandem with MSBuild, as it picks up global information about the build and doesn't operate solely on translation units. When building with Ninja, I realized that the PCH isn't just a build acceleration method but is actually a hard requirement of the build, so I added a target_precompile_headers invocation, only to find that the project still doesn't build. I started adding #include directives to the headers to please the compiler, but ultimately ended up with cyclic includes. A lot of headers end up including DeviceChild.hpp, which in turn requires ImmediateContext.hpp, which in turn requires Resource.hpp, which then requires DeviceChild.hpp again. Removing any of these includes results in undefined types when compiling the very same pch.hpp the build otherwise relies on.

Can anyone comment on how to untie this knot?
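
One general C++ technique for cycles like this is to replace an #include with a forward declaration wherever a header only needs pointers or references to a type. The sketch below shows the pattern in a single file, with comments marking which header each part would live in. The class bodies are invented for illustration, and whether the library's real headers can be split this way is an assumption.

```cpp
#include <cassert>

// --- would live in DeviceChild.hpp ---
// Forward declaration instead of #include "ImmediateContext.hpp":
// a pointer member does not need the complete type.
class ImmediateContext;

class DeviceChild {
public:
    explicit DeviceChild(ImmediateContext* parent) : m_parent(parent) {}
    ImmediateContext* Parent() const { return m_parent; }

private:
    ImmediateContext* m_parent;
};

// --- would live in Resource.hpp ---
// A base class must be a complete type, so this header really does need
// to include DeviceChild.hpp.
class Resource : public DeviceChild {
public:
    using DeviceChild::DeviceChild;
};

// --- would live in ImmediateContext.hpp ---
// Only Resource pointers are stored here, so a forward declaration of
// Resource would also be enough in a separate header.
class ImmediateContext {
public:
    void Bind(Resource* resource) { m_bound = resource; }
    Resource* Bound() const { return m_bound; }

private:
    Resource* m_bound = nullptr;
};
```

With this layout, DeviceChild.hpp no longer includes ImmediateContext.hpp, which removes one edge of the cycle. Any member function that needs the complete type would move to a .cpp file that includes both headers.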

ProcessBatchWork crash.

OS: Windows 10 Pro, Version 1903, OS Build 18362.657
DLL: d3d11on12.dll, version 10.0.18362.387, WinBuild.160101.0800
PDB Signature: 6e29daca7e48d493f5fe671202e9d6c51 (d3d11on12.pdb from Microsoft Symbol Servers)

After trying to debug an issue with the latest version of PIX that causes the game I'm currently working on to crash, I ended up in the D3D12TranslationLayer::BatchedContext::ProcessBatchWork function. This appears to be a compiler issue, though: after going through the disassembly of the function, I found that it ends up basically trying to read data from rcx + 8 + rcx * 8.

This is a 100% crash during Present.
D3D12TL_MarkedDissassembly.txt (I've marked the crashing line with >>)

Accessing command list on D3D9On12

Hello

Our engine is built on DirectX 9 and we want to migrate to DirectX 12. We want to migrate step by step, so we are using D3D9On12, which gives us access to the DirectX 12 device and command queue but not to the command list.

Is there any way to get access to the command list so that I can record my own commands?

D3D11Transition lack of documentation

Is there any chance that code samples or documentation will appear in the near future?
I've got working code written with the DX11 API and want to port it to DX12. What should I do to achieve that? I tried to link your library instead of d3d11, but it doesn't seem to work.
