ONNX and DirectML execution provider guide – part 2

Introduction

The ONNX Runtime – DirectML execution provider (EP) is based on DirectX® 12, but because there are also many DirectX 11 applications, it’s necessary to find an easy way to integrate DirectML into DX11 to fully utilize the power of neural networks.

In part 2 of our guide, I will discuss a common way to share resources between DirectX 11 and 12 to utilize the performance of DirectML in a DirectX 11 application.

1. Problem description

Sharing resources across DirectX versions is a niche topic in real-time rendering, but a neural network inference engine can combine several SDKs, such as CUDA, HIP, and DirectX 12, so tensor resource sharing issues across SDKs are very common.

Resources are allocated by each logical GPU device. Even when two logical devices sit on the same GPU hardware, sharing a resource between them should not rely on a round trip through CPU memory, because that adds a lot of unnecessary copy traffic.

A much better way is to build a native data pipeline between the logical GPU devices on the same hardware. How do you do that with the ONNX Runtime – DirectML EP and DirectX 11? For that, you need to dive into the DirectX resource sharing APIs.

2. Setting up the shared resource in DirectX 11

Sharing resources from DirectX 11 to 12 needs to meet these conditions:

  1. Only textures can be shared from DirectX 11 to 12.
  2. The texture must not have a mip chain (a single mip level only).
  3. The texture must be created with the D3D11_RESOURCE_MISC_SHARED_NTHANDLE flag.
  4. According to MSDN, it is recommended to also add D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX.

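D3D11_TEXTURE2D_DESC desc = {};

desc.Width = IMAGE_WIDTH;

desc.Height = IMAGE_HEIGHT;

desc.MipLevels = 1; // a shared texture must not have a mip chain

desc.ArraySize = 1;

desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;

desc.SampleDesc.Count = 1;

desc.SampleDesc.Quality = 0;

desc.Usage = D3D11_USAGE_DEFAULT;

desc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET; // assumption: adjust to how your renderer uses the texture

desc.MiscFlags = D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX | D3D11_RESOURCE_MISC_SHARED_NTHANDLE; // sharing flags belong in MiscFlags, not BindFlags

Microsoft::WRL::ComPtr<ID3D11Texture2D> sharedTexture;

HRESULT hr = dx11device->CreateTexture2D(&desc, nullptr, sharedTexture.ReleaseAndGetAddressOf());

if (SUCCEEDED(hr)) {

// The shared texture was created successfully.

}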

It is a bit of a pity that only textures, and not other native resources such as buffers, can be shared. However, most DirectX applications only need a CNN that works on images, so textures are enough for them.
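Because the shared resource is a texture rather than a tensor buffer, something on the DirectX 12 side has to unpack the texels into the layout the model expects. As a CPU-side reference only (a real pipeline would most likely do this on the GPU), the sketch below converts tightly packed R8G8B8A8_UNORM texels into a planar float tensor scaled to [0, 1]. The helper name `rgba8ToNchw`, the NCHW layout, and the decision to drop the alpha channel are assumptions made for illustration; this guide does not specify the model's input format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of DirectX or ONNX Runtime.
// Input:  row-major RGBA8 texels, 4 bytes per texel, no row padding.
// Output: float tensor laid out as [1, 3, height, width] (alpha is dropped).
std::vector<float> rgba8ToNchw(const std::vector<std::uint8_t>& texels,
                               std::size_t width, std::size_t height) {
    assert(texels.size() == width * height * 4);
    const std::size_t plane = width * height;  // elements per channel plane
    std::vector<float> tensor(3 * plane);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t texel = (y * width + x) * 4;
            for (std::size_t c = 0; c < 3; ++c) {
                // UNORM: 0..255 maps linearly to 0.0..1.0.
                tensor[c * plane + y * width + x] = texels[texel + c] / 255.0f;
            }
        }
    }
    return tensor;
}
```

Note that a texture read back through a mapped staging resource usually has a row pitch larger than `width * 4`, so real data would need the padding removed before it matches the tightly packed input assumed here.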

3. Setting up the synchronization object in DirectX 11

Because DirectX 11 and 12 submit work through different pipelines, it is recommended to share a fence object between them to synchronize access to the shared resource.

Sharing a fence from DirectX 11 to 12 must meet these conditions:

  1. Use the ID3D11Device5 interface to access the CreateFence API.
  2. The fence must be created with the D3D11_FENCE_FLAG_SHARED flag.

Microsoft::WRL::ComPtr<ID3D11Device5> deviceDX11_5;

Microsoft::WRL::ComPtr<ID3D11Fence> sharedFence;

UINT64 initialValue = 0; // assumption: start the fence timeline at zero

HRESULT hr = dx11device.As(&deviceDX11_5); // fails if ID3D11Device5 is not available

if (SUCCEEDED(hr) && SUCCEEDED(hr = deviceDX11_5->CreateFence(initialValue, D3D11_FENCE_FLAG_SHARED, IID_PPV_ARGS(&sharedFence)))) {

// The shared fence was created successfully.

}

4. Getting the DirectX 11 shared object in DirectX 12

Now that the shared texture and fence have been created by DirectX 11, you need to open them as DirectX 12 objects so you can use them for inference in DirectML. This is done through the DXGI shared handle API.

The sample code below opens the shared texture for reading, which is sufficient for pre-processing input data. A texture that receives post-processing output also needs the DXGI_SHARED_RESOURCE_WRITE flag.

Microsoft::WRL::ComPtr<ID3D11Texture2D> sharedTexture; // the texture created in section 2

Microsoft::WRL::ComPtr<IDXGIResource1> dxgires;

Microsoft::WRL::ComPtr<ID3D12Resource> frameBuffer;

HRESULT hr = sharedTexture.As(&dxgires);

HANDLE sharedhandle = nullptr;

if (SUCCEEDED(hr) && SUCCEEDED(hr = dxgires->CreateSharedHandle(nullptr, DXGI_SHARED_RESOURCE_READ, nullptr, &sharedhandle))) {

Microsoft::WRL::Wrappers::Event closeHandle(sharedhandle); // used only to close the handle at scope exit

if (SUCCEEDED(hr = dx12device->OpenSharedHandle(sharedhandle, IID_PPV_ARGS(&frameBuffer)))) {

// The shared texture is now available as an ID3D12Resource.

}

}

The sample code below opens the shared fence. It must be both readable and writable for synchronization.

Microsoft::WRL::ComPtr<ID3D11Fence> sharedFence; // the fence created in section 3

Microsoft::WRL::ComPtr<ID3D12Fence> frameBufferFence;

HANDLE sharedfenceHandle = nullptr;

HRESULT hr = sharedFence->CreateSharedHandle(nullptr, GENERIC_ALL, nullptr, &sharedfenceHandle);

if (SUCCEEDED(hr)) {

Microsoft::WRL::Wrappers::Event closeHandle(sharedfenceHandle); // used only to close the handle at scope exit

if (SUCCEEDED(hr = dx12device->OpenSharedHandle(sharedfenceHandle, IID_PPV_ARGS(&frameBufferFence)))) {

// The shared fence is now available as an ID3D12Fence.

}

}
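Both snippets in this section use `Microsoft::WRL::Wrappers::Event` purely as a scope guard: it takes ownership of the raw handle and closes it when it goes out of scope, whether or not `OpenSharedHandle` succeeded. The same pattern can be sketched in portable C++ with `std::unique_ptr` and a custom deleter. Everything named here (`fakeCloseHandle`, `HandleCloser`, `ScopedHandle`, `openOnOtherDevice`, and the `closedHandles` counter) is a hypothetical stand-in so the sketch builds without the Windows SDK; real code would call `CloseHandle` or keep using the WRL wrapper.

```cpp
#include <cassert>
#include <memory>

int closedHandles = 0;  // counts calls to the stand-in close function

// Stand-in for Win32 CloseHandle (hypothetical; it only counts calls).
void fakeCloseHandle(void* handle) {
    if (handle != nullptr) ++closedHandles;
}

// Deleter that closes the handle instead of calling delete.
struct HandleCloser {
    void operator()(void* handle) const { fakeCloseHandle(handle); }
};

// Owns a raw handle and closes it exactly once when it leaves scope.
using ScopedHandle = std::unique_ptr<void, HandleCloser>;

// Mimics "create a shared handle, open it on the other device, drop the handle".
// openSucceeds stands in for the result of the open call.
bool openOnOtherDevice(void* rawHandle, bool openSucceeds) {
    assert(rawHandle != nullptr);
    ScopedHandle guard(rawHandle);    // closed on every return path below
    if (!openSucceeds) return false;  // the early exit still closes the handle
    return true;
}
```

The point of the design is that the handle is only a short-lived token for transferring the object between devices. Once the other device has opened it, the opened object holds its own reference, so the handle itself no longer needs to stay open.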

5. The full pipeline of inference for DirectX 11

Here is the full inference pipeline for mixed DirectX 11 and 12. It is more complicated than the native DirectX 12 version, but not hard to achieve.
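One way to reason about the ordering is to model the shared fence as a monotonically increasing counter that each side either signals or waits on. The sketch below is a CPU-only simulation, not DirectX code: the stage names and their order are an assumed reading of the previous sections, and `Fence`, `Command`, `step`, and `runFrame` are hypothetical names. In a real application, the signal and wait steps would correspond to `ID3D11DeviceContext4::Signal`/`Wait` on the DirectX 11 side and `ID3D12CommandQueue::Signal`/`Wait` on the DirectX 12 side.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Model of a shared fence: a counter that only moves forward.
struct Fence { std::uint64_t value = 0; };

enum class Op { Work, Signal, Wait };

struct Command {
    Op op;
    std::string label;    // used by Work
    std::uint64_t value;  // used by Signal and Wait
};

using Queue = std::deque<Command>;

// Runs the front command of a queue unless it is a Wait that is still blocked.
// Returns true if the queue made progress.
bool step(Queue& q, Fence& fence, std::vector<std::string>& log) {
    if (q.empty()) return false;
    const Command& cmd = q.front();
    if (cmd.op == Op::Wait && fence.value < cmd.value) return false;
    if (cmd.op == Op::Signal) {
        assert(cmd.value >= fence.value);  // fence values never go backwards
        fence.value = cmd.value;
    }
    if (cmd.op == Op::Work) log.push_back(cmd.label);
    q.pop_front();
    return true;
}

// Simulates one frame and returns the order in which the work items ran.
std::vector<std::string> runFrame() {
    Fence fence;
    Queue dx11 = {
        {Op::Work, "dx11: render frame into shared texture", 0},
        {Op::Signal, "", 1},  // hand the texture to DirectX 12
        {Op::Wait, "", 2},    // block until inference is done
        {Op::Work, "dx11: consume inference result", 0},
    };
    Queue dx12 = {
        {Op::Wait, "", 1},    // block until DirectX 11 has rendered
        {Op::Work, "dx12: pre-process texture to tensor", 0},
        {Op::Work, "dx12: DirectML inference", 0},
        {Op::Work, "dx12: post-process tensor to texture", 0},
        {Op::Signal, "", 2},  // hand the result back to DirectX 11
    };
    std::vector<std::string> log;
    bool progressed = true;
    while (progressed) {
        // DirectX 12 is stepped first to show that it really does block.
        const bool a = step(dx12, fence, log);
        const bool b = step(dx11, fence, log);
        progressed = a || b;
    }
    return log;
}
```

Calling `runFrame()` returns the five work items with the DirectX 11 render first and the DirectX 11 consume step last, even though the DirectX 12 queue is always given the first chance to run. Using a new, larger fence value for every handoff is what lets one fence serve both directions.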

6. Getting the DirectX 12 shared object in DirectX 11

This part is optional. Sometimes DirectX 11 also needs to access DirectX 12 resources, which is feasible, but the API is a bit different.

Sharing resources from DirectX 12 to 11 also needs to meet these conditions:

  1. Only textures can be shared from DirectX 12 to 11.
  2. The texture must not have a mip chain (a single mip level only).
  3. The texture must be created with the D3D12_HEAP_FLAG_SHARED flag.

Here is sample code that creates a shared texture in DirectX 12.

D3D12_RESOURCE_DESC textureDesc = {};

textureDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;

textureDesc.Width = TEXTURE_WIDTH;

textureDesc.Height = TEXTURE_HEIGHT;

textureDesc.DepthOrArraySize = 1;

textureDesc.MipLevels = 1;

textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;

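textureDesc.SampleDesc.Count = 1;

textureDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET; // assumption: render-target use; without it, pass nullptr instead of &clearValue

D3D12_HEAP_PROPERTIES heapProps = {};

heapProps.Type = D3D12_HEAP_TYPE_DEFAULT;

D3D12_CLEAR_VALUE clearValue = {}; // an optimized clear value is only valid for render-target or depth-stencil textures

clearValue.Format = textureDesc.Format;

Microsoft::WRL::ComPtr<ID3D12Resource> sharedTexture;

HRESULT hr = dx12device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_SHARED, &textureDesc, D3D12_RESOURCE_STATE_COMMON, &clearValue, IID_PPV_ARGS(&sharedTexture));

if (SUCCEEDED(hr)) {

// The shared texture was created successfully.

}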

Then use the DirectX 11 device to open the shared texture from DirectX 12 (this requires DirectX 11.1).

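Microsoft::WRL::ComPtr<ID3D12Resource> dx12Resource; // the shared texture created above

Microsoft::WRL::ComPtr<ID3D11Texture2D> dx11Resource;

Microsoft::WRL::ComPtr<ID3D11Device1> dx11device1; // OpenSharedResource1 is declared on ID3D11Device1

HANDLE sharedHandle = nullptr;

HRESULT hr = dx11device.As(&dx11device1);

if (SUCCEEDED(hr) && SUCCEEDED(hr = dx12device->CreateSharedHandle(dx12Resource.Get(), nullptr, GENERIC_ALL, nullptr, &sharedHandle))) { // nullptr name: unnamed handle

Microsoft::WRL::Wrappers::Event closeHandle(sharedHandle); // used only to close the handle at scope exit

if (SUCCEEDED(hr = dx11device1->OpenSharedResource1(sharedHandle, IID_PPV_ARGS(&dx11Resource)))) {

// The DirectX 12 texture is now available as an ID3D11Texture2D.

}

}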

Now the DirectX 11 device can access the DirectX 12 shared texture.

7. Conclusion

I’ve covered a lot of information here and I hope this guide can help you with your inference pipeline. For more information about resource sharing in DirectX, please check out the DXGI shared resource and the DirectX 12 shared heaps on MSDN.
