Code Monkey home page Code Monkey logo

spirv-headers's Introduction

SPIR-V Headers

This repository contains machine-readable files for the SPIR-V Registry. This includes:

  • Header files for various languages.
  • JSON files describing the grammar for the SPIR-V core instruction set and the extended instruction sets.
  • The XML registry file.
  • A tool to build the headers from the JSON grammar.

Headers are provided in the include directory, with up-to-date headers in the unified1 subdirectory. Older headers are provided according to their version.

In contrast, the XML registry file has a linear history, so it is not tied to SPIR-V specification versions.

How is this repository updated?

When a new version or revision of the SPIR-V specification is published, the SPIR-V Working Group will push new commits onto master, updating the files under include.

The SPIR-V XML registry file is updated by Khronos whenever a new enum range is allocated.

Pull requests can be made to

  • request allocation of new enum ranges in the XML registry file
  • register a new magic number for a SPIR-V generator
  • reserve specific tokens in the JSON grammar

Registering a SPIR-V Generator Magic Number

Tools that generate SPIR-V should use a magic number in the SPIR-V to help identify the generator.

Care should be taken to follow existing precedent in populating the details of reserved tokens. This includes:

  • keeping generator numbers in numeric order
  • filling out all the existing fields

Reserving tokens in the JSON grammar

Care should be taken to follow existing precedent in populating the details of reserved tokens. This includes:

  • pointing to what extension has more information, when possible
  • keeping enumerants in numeric order
  • when there are aliases, listing the preferred spelling first
  • adding the statement "version" : "None"

How to install the headers

mkdir build
cd build
cmake ..
cmake --build . --target install

Then, for example, you will have /usr/local/include/spirv/unified1/spirv.h

If you want to install them somewhere else, then use -DCMAKE_INSTALL_PREFIX=/other/path on the first cmake command.

Using the headers without installing

Using CMake

A CMake-based project can use the headers without installing, as follows:

  1. Add an add_subdirectory directive to include this source tree.
  2. Use ${SPIRV-Headers_SOURCE_DIR}/include} in a target_include_directories directive.
  3. In your C or C++ source code use #include directives that explicitly mention the spirv path component.
#include "spirv/unified1/GLSL.std.450.h"
#include "spirv/unified1/OpenCL.std.h"
#include "spirv/unified1/spirv.hpp"

See also the example subdirectory. But since that example is inside this repostory, it doesn't use and add_subdirectory directive.

Using Bazel

A Bazel-based project can use the headers without installing, as follows:

  1. Add SPIRV-Headers as a submodule of your project, and add a local_repository to your WORKSPACE file. For example, if you place SPIRV-Headers under external/spirv-headers, then add the following to your WORKSPACE file:
local_repository(
    name = "spirv_headers",
    path = "external/spirv-headers",
)
  1. Add one of the following to the deps attribute of your build target based on your needs:
@spirv_headers//:spirv_c_headers
@spirv_headers//:spirv_cpp_headers
@spirv_headers//:spirv_cpp11_headers

For example:

cc_library(
  name = "project",
  srcs = [
    # Path to project sources
  ],
  hdrs = [
    # Path to project headers
  ],
  deps = [
    "@spirv_tools//:spirv_c_headers",
    # Other dependencies,
  ],
)
  1. In your C or C++ source code use #include directives that explicitly mention the spirv path component.
#include "spirv/unified1/GLSL.std.450.h"
#include "spirv/unified1/OpenCL.std.h"
#include "spirv/unified1/spirv.hpp"

Generating headers from the JSON grammar for the SPIR-V core instruction set

This will generally be done by Khronos, for a change to the JSON grammar. However, the project for the tool to do this is included in this repository, and can be used to test a PR, or even to include the results in the PR. This is not required though.

The header-generation project is under the tools/buildHeaders directory. Use CMake to build and install the project, in a build subdirectory (under tools/buildHeaders). There is then a bash script at bin/makeHeaders that shows how to use the built header-generator binary to generate the headers from the JSON grammar. (Execute bin/makeHeaders from the tools/buildHeaders directory.) Here's a complete example:

cd tools/buildHeaders
mkdir build
cd build
cmake ..
cmake --build . --target install
cd ..
./bin/makeHeaders

Notes:

  • this generator is used in a broader context within Khronos to generate the specification, and that influences the languages used, for legacy reasons
  • the C++ structures built may similarly include more than strictly necessary, for the same reason

Generating C headers for extended instruction sets

The GLSL.std.450.h and OpenCL.std.h extended instruction set headers are maintained manually.

The C/C++ header for each of the other extended instruction sets is generated from the corresponding JSON grammar file. For example, the OpenCLDebugInfo100.h header is generated from the extinst.opencl.debuginfo.100.grammar.json grammar file.

To generate these C/C++ headers, first make sure python3 is in your PATH, then invoke the build script as follows:

cd tools/buildHeaders
python3 bin/makeExtinstHeaders.py

FAQ

  • How are different versions published?

    The multiple versions of the headers have been simplified into a single unified1 view. The JSON grammar has a "version" field saying what version things first showed up in.

  • How do you handle the evolution of extended instruction sets?

    Extended instruction sets evolve asynchronously from the core spec. Right now there is only a single version of both the GLSL and OpenCL headers. So we don't yet have a problematic example to resolve.

License

Copyright (c) 2015-2024 The Khronos Group Inc.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and/or associated documentation files (the
"Materials"), to deal in the Materials without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Materials, and to
permit persons to whom the Materials are furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Materials.

MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
   https://www.khronos.org/registry/

THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

spirv-headers's People

Contributors

alan-baker avatar alelenv avatar amdrexu avatar antiagainst avatar asudarsa avatar bashbaug avatar broxigarchen avatar dgkoch avatar dj2 avatar dmitrybushev avatar dneto0 avatar ehsannas avatar gfxstrand avatar gnl21 avatar heroseh avatar jdknight avatar jeffbolznv avatar johnkslang avatar kloczek avatar kpet avatar mean-ui-thread avatar mrsidims avatar nikitarudenkointel avatar penguinliong avatar raunraun avatar sirlynix avatar tiwaria1 avatar tobski avatar vmaksimo avatar wooyoungqcom avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

spirv-headers's Issues

Storage class of invocation id's in openCL

[Transferring existing Bugzilla SPIR-V issues here.]

Bas Nieuwenhuizen 2016-03-23 17:04:09 PDT

I noticed SPIRV-LLVM emitting the GlobalInvocationID Builtin in the UniformConstant storage class. This surprised me and as that is supposed to be shared across all invocations, while the invocation id is unique per invocation.

So I don't think that that is the correct storage class for the builtin. However Input is only available with Shader Capability, so I don't know a better storage class to put it in.

So what is the correct storage class for GlobalInvocationId? (and related builtins)

Side note:
A lot of the builtins refer to the OpenCL API specification ... which has no mention of them.

davidnetokhr 2016-03-28 08:27:11 PDT

This is a good question.

UniformConstant is the only non-Shader storage class that's read-only. However, the spec also says it's OpenCL __constant memory. That seems a little off, as the number of __constant memory handles can be severely limited by an implementation.

I'll raise this at the working group.

VariablePointers conflicting with logical addressing

The VariablePointers and VariablePointersStorageBuffer capabilities allow OpPtrAccessChain. However per the universal validation rules, a pointer cannot be the result of an OpPtrAccessChain if logical adressing is used. Which makes it pretty useless for the VK_KHR_variable_pointers vulkan extension.

Furthermore, the only reference to these capabilities I see is that they allow OpPtrAccessChain, making it unclear in what situations it is allowed and with what restrictions. I can't even tell how VariablePointers and VariablePointersStorageBuffer are supposed to be different.

I used the 1.2 version of the spec for checking this issue.

initialization of a global pointer

With current version of the spec there is no way to express initialization of a global(program scope) pointer by address if other global variable.

Consider the following OpenCL C code:

int v = 0;
global int* p = &v;

It can be translated to LLVM as:

...
@v = addrspace(1) global i32 0, align 4
@p = addrspace(1) global i32 addrspace(1)* @v, align 4
...

LLVM to SPIR-V translator produces the following lines:

          ...
          %2 = OpTypeInt 32 0
          %3 = OpConstant %2 0
          %4 = OpTypePointer CrossWorkgroup %2
          %6 = OpTypePointer CrossWorkgroup %4
          %5 = OpVariable %4 CrossWorkgroup %3
          %7 = OpVariable %6 CrossWorkgroup %5

The latter OpVariable is initialized by a non-constant instruction, which violates the spec requirement for OpVariable.

var_addr.ll.txt
var_addr.cl.txt
var_addr.spt.txt
var_addr.spv.txt

Seems like #12 and this issue may have the same solution.

What are the merge blocks if a loop has multiple exits?

This is generated from glslang with

float test(float[2] arr) {
    uint i = 0;
    float sum = 42;
    while(true) {
        sum += 1.0;
        i+=1;
        if(i == 100){
            return sum;
        }
    }
    return sum;
}

The way I implemented it is: If the loop headers terminator is an OpBranch, I look for the first branch where one of the targets exits the loop. Then for this branch I don't generate an SelectionMerge. And one of its targets is the loop merge.

But what if we have multiple exits like in the image above?

For example for %22. Is %38 correct? Would %37 also be a correct merge block? Or do successive branches (like `%22) need to have the merge block inside the loop?

OpGroupIAdd shouldn't be forced to have the same return type as argument type

A native ballot() operation is a useful primitive to exploit for warp/wave/simd work compaction.

A ballot() operation is not exposed in SPIR-V or OpenCL (right?) and the existence of architectures with sub_group widths over 32 lanes preclude this from being represented with a uint32_t.

If the OpGroupIAdd opcode was relaxed to support differing return and argument types — specifically, an integer return type and boolean argument — then SPIR-V would be able to optionally efficiently express:

popcount( ballot() & lanes_less_than() )
popcount( ballot() & lanes_less_than_or_equal() )
popcount( ballot() )

This would then allow OpenCL to expose the following potentially optimal sub_group functions:

int sub_group_scan_exclusive_add(bool pred)
int sub_group_scan_inclusive_add(bool pred)
int sub_group_reduce_add(bool pred)

Alternatively, simply recognizing cases where the integer subgroup scan argument is guaranteed to be 0 or 1 would allow a native popcount( ballot() & lanes_mask_xxx() ) sequence to be emitted and the OpGroupIAdd opcode specification left as is.

The native operations described in this blog post might be a good basis for considering either altering the SPIR-V opcode or suggesting vendors optimize for the alternative.

control flow path branching and block mangling

Hi, I may have missed something in the specs, but the OpBranchConditional will create 2 control flow paths. In the
selection construct, you can have the blocks from those 2 control flow paths mangled. Namely, you can mangle the blocks
from the 2 control flow paths and still follow the block domination rules.
Is this mangling allowed by the specs, or did I miss something? Maybe some specs hardening on "branching" inside a construct at
the same depth?

Does BulitinSampleMask require CapabilitySampleRateShading?

Section 3.20 (Decoration) of the SPIR-V spec lists SampleRateShading in the row for SampleMask, and the SPIR-V validator checks for this. However, https://github.com/KhronosGroup/SPIRV-Headers/blob/master/include/spirv/1.2/spirv.core.grammar.json says:

{
      "enumerant" : "SampleMask",
      "value" : 20,
      "capabilities" : [ "Shader" ]
},

In desktop OpenGL, gl_SampleMaskIn does not require sample rate shading, but gl_SampleMask (the output) does. I don't know off the top of my head how this is exposed in Vulkan. Given the disagreement among the various specs, it's not 100% clear what the correct action for a SPIR-V generator is.

I don't think it's worth the effort to have different capabilities for the input and output in SPIR-V. If others agree, then spirv.core.grammar.json should be updated to list the SampleRateShading dependency. My code for emitting decorations and setting the necessary capabilities is generated directly from the JSON, so, whatever the conclusion, that file needs to be correct.

spirv.hpp uses implementation-defined underlying enum types

The glslang code that uses spirv.hpp causes compiler warnings like:

/KhronosGroup/glslang/SPIRV/GlslangToSpv.cpp:1973:33: warning: comparison of constant 4294967295 with expression of type 'spv::BuiltIn' is always true [-Wtautological-constant-out-of-range-compare]
                if (builtIn != spv::BadValue)
                    ~~~~~~~ ^  ~~~~~~~~~~~~~

This is not just a warning but a potential bug. The underlying type of an unscoped enum is an implementation-defined integral type capable of representing all values of the enum. Considering that spv::Builtin only represents a limited number of enum values, the underlying type could be 16-bit or even 8-bit in which case a variable of the enum type will always be unequal to 0xFFFFFFFF (spv::BadValue).

Maybe it would be good to force the enums to be 32-bit by adding a dummy MaxEnum = 0x7FFFFFFF to each enum type.

Discrepancy between Op(InBounds|Ptr)AccessChain and OpCompositeInsert/Extract operands

[Transferring existing Bugzilla SPIR-V issues here.]

mknejp 2015-08-10 16:19:31 PDT

The OpCompositeInsert/Extract instructions take immediate literals to specify the sequence of indices in a composite object whereas Op(InBounds|Ptr)AccessChain takes a mix of immediates and OpConstant depending on whether it's indexing a struct or not.

My assumption is that Op(InBounds|Ptr)AccessChain is meant to support OpSpecConstant, but wouldn't this be also useful for the composite operations? And if that is not the intention, why not make them all immediate?

To me this seems like a bug since the only difference between the two classes of instructions is the level of indirection on the base.

johnk 2015-09-15 08:36:29 PDT

It is correct that the "pointer-relative" instruction (OpAccessChain) has more indirection than the "in register" instruction (OpComposite*). This is to allow dynamic (not necessarily constant) computation of a memory

Emit a "Count" member or constant for enums

We just found and fixed a major buffer overflow in a compiler caused by an out-of-date spirv op count constant. Looking at spirv.hpp, there is no machine-generate count enum member or constant for the various enums. This seems like an easy addition that would help prevent unfortunate bugs.

Can Flat decorate a structure object?

This comes from KhronosGroup/glslang#418. I'm copying relevant spec. text here:

From the SPIR-V specification:

When applied to structure-type members, the Decorations Noperspective, Flat, Patch, Centroid, and Sample can only be applied to the top-level members of the structure type. (Nested objects' types cannot be structures whose members are decorated with these decorations.)

Also:

Flat

Apply to an object or a member of a structure type. Indicates no interpolation will be done. The non-interpolated value will come from a vertex, as described in the API specification. The object or member must be a scalar or vector of floating-point type or integer type. Arrays of these types are also allowed. Only valid for the Input and Output Storage Classes.

(This probably came from before deep inheritance propagation was disallowed in SPIR-V. When we decided to have it apply to top-level members only, this rule should have allowed structures.)

So, we have one rule saying it can only apply to top-level members, and another rule saying it can't apply to structure types. Together, that implies that top-level members cannot be structures, meaning there cannot be nested structures in input and output.

However, structures are not disallowed as top-level members of blocks (which in SPIR-V have a structure type). The only way to make something flat in the following case is to make the block-structure member be decorated as flat, as the rules are clearly saying don't go deeper than that:

struct S {  vec4 v; };
in inName {  flat S s; };  // Allowed?  I think so, but that means a struct is Flat

Given that structures should not be disallowed here, I suspect the rule saying Flat only applies to scalars/vectors is the overly restrictive one, especially given how clearly the other rule mentions not decorating members of nested structures (as opposed to saying "no structures").

Crash in OpVectorShuffle when Component literal is FFFFFFFF

Hello. I tried the code.

GLSL:
  vec4 v;
  v.x = 1.0;
Vulkan:
  VectorSuffle 1 1 2 -1 -1 -1

but it crashed in [index_to_swizzle].
I see SPIR-V Specification 1.00.

A Component literal may also be FFFFFFFF, which means the corresponding result component has no source and is undefined.

Could spirv_glsl not be implemented about FFFFFFFF?

The value of OpMax is incorrect

The SPIR-V binary use 16 bits to store the OpCode. Thus, the limit of OpCode should be OpMax = 0xffff instead of OpMax = 0x7fffffff.

OpLoopMerge + OpBranchConditional

According to the description of OpLoopMerge, the loop header block can contain a conditional branch (OpBranchConditional). Doing so would however require the block to contain an OpSelectionMerge in addition to the OpLoopMerge. This is currently not allowed by the spec, because both descriptions state that (OpLoopMerge|OpSelectionMerge) "must be the second-to-last instruction in its block".

E.g. this code is currently invalid:

   OpSelectionMerge %selmergebb None
        OpLoopMerge %loopmergebb %continuebb None
OpBranchConditional %condition %true %false

Solution: either remove OpBranchConditional from the OpLoopMerge description or refine the "second-to-last" text so that both merge operations can occur in a block.

If doing the latter: shouldn't OpLoopMerge allow OpSwitch as well then?

Root readme needs updating

Now that the github repo will be used for the issue tracking instead of the bugzilla; the root readme.md needs updating. In particular the sentence:

Issues with the header files should be filed in the Khronos public bugzilla database, against the Specification component of the SPIR-V product.

When does OpTypeImage use an access specifier?

I'm new to SPIR-V and OpenCL and found the description of OpTypeImage unclear. It says is that the access specifier parameter is optional without constraints, but all the access specifiers depend on the Kernel capability. Does that imply that image types with an access specifier must only be used in Kernel execution modes and those without must only be used in other execution modes?
That definition would fit into a larger picture since there is no mention of a default access specifier and when access specifiers are used on OpTypeImage they are accompanied by the ImageBasic and ImageReadWrite capabilities that also depend on Kernel.

P.S.: Also since we can have multiple entry points with different execution modes, the question can be asked if it is valid to have access specifiers on an image type that is used in both Kernel and non-Kernel execution modes. In other words, is the access specifier allowed and silently ignored in non-Kernel execution modes?

Tagging a stable version

Would you mind tagging a recent commit with a stable version (no "rc") of SPIRV-Headers so that it can be packaged in Homebrew?

Missing support for AMD extensions

The current public SPIR-V headers are missing headers/json for SPV_AMD_shader_ballot, SPV_AMD_shader_trinary_minmax, SPV_shader_explicit_vertex_parameter, and SPV_AMD_gcn_shader. Since at least one shipping game (DOOM) uses some of these extensions in Vulkan, we plan to implement these extensions in Mesa for radv. Since Mesa uses the C versions of the headers, and the only public version of the AMD extensions header (in glslang) is the C++ version, this means we'll have to ship modified (non-Khronos-official) versions of the SPIR-V headers in Mesa, which seems unfortunate.

SubgroupLocalInvocationId and SubgroupSize should list SubgroupBallotKHR

GL_ARB_spirv_extensions says:

Interactions with ARB_shader_ballot and SPV_KHR_shader_ballot:

    When using these extensions the following GLSL -> SPIR-V mapping is used:

    ...
    * in uint gl_SubGroupInvocationARB; -> SubgroupLocalInvocationId,
    * uniform uint gl_SubGroupSizeARB;  -> SubgroupSize,

However, the only capability listed for these enumerants is Kernel.

SubgroupLocalInvocationId and SubgroupSize also appear to be absent from SPV_KHR_shader_ballot.

Continue Construct and OpKill

Is a continue construct that consists of OpKill valid? If so, the rules on the header block post-dominating the continue construct need to be fixed. Otherwise, there's a bug in spirv-opt where kill() is improperly inlined.

http://shader-playground.timjones.io/1323167d9be3ec6d7a54996e098612de

spirv-opt output:

; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 7
; Bound: 13
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %main "main"
               OpExecutionMode %main OriginUpperLeft
               OpSource GLSL 330
               OpName %main "main"
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
       %main = OpFunction %void None %3
          %5 = OpLabel
               OpLoopMerge %11 %12 None
               OpBranch %12
         %12 = OpLabel
               OpKill
         %11 = OpLabel
               OpUnreachable
               OpFunctionEnd

spirv-opt input:

// Module Version 10000
// Generated by (magic number): 80007
// Id's are bound by 17

                              Capability Shader
               1:             ExtInstImport  "GLSL.std.450"
                              MemoryModel Logical GLSL450
                              EntryPoint Fragment 4  "main"
                              ExecutionMode 4 OriginUpperLeft
                              Source GLSL 330
                              Name 4  "main"
                              Name 6  "kill("
               2:             TypeVoid
               3:             TypeFunction 2
              14:             TypeBool
              15:    14(bool) ConstantTrue
         4(main):           2 Function None 3
               5:             Label
                              Branch 9
               9:             Label
                              LoopMerge 11 12 None
                              Branch 13
              13:             Label
                              BranchConditional 15 10 11
              10:               Label
                                Branch 12
              12:               Label
              16:           2   FunctionCall 6(kill()
                                Branch 9
              11:             Label
                              Return
                              FunctionEnd
        6(kill():           2 Function None 3
               7:             Label
                              Kill
                              FunctionEnd

spirv.h has trailing C99-style comment

While compiling spirv.h with C89 mode (-ansi in GCC), I get a warning:

output/include/spirv_cross/spirv.h:1206:9: warning: extra tokens at end of #endif directive [-Wendif-labels]
 #endif  // #ifndef spirv_H

The rest of the header uses C89-style comments, so it makes me think this instance of a C99 comment is unintentional. It compiles fine on GCC though for some reason, except the warning.

Non constant operand of OpSpecConstantOp

SPIR-V translator generates OpSpecConstantOp instruction with non-const operand, which violates spec requirements for this instruciton:

Operands are the operands required by opcode, and satisfy the semantics of opcode. In addition, all Operands must be the s of other constant instructions.

%11 = OpSpecConstantOp %10 InBoundsPtrAccessChain %8 %9 %9
%8 = OpVariable %7 CrossWorkgroup %6

OpVariable is not a constant instruction.

LLVM code contains ConstantExpr with GlobalVariable as first operand.

@v = addrspace(1) global [2 x i32] [i32 1, i32 2], align 4
@s = addrspace(1) global i32 addrspace(1)* getelementptr inbounds ([2 x i32] addrspace(1)* @v, i32 0, i32 0), align 4

In this case there is no way to satisfy the the semantics of OpSpecConstantOp instruction.

OpSpecConstantOp.ll.txt
OpSpecConstantOp.spt.txt
OpSpecConstantOp.spv.txt
OpSpecConstantOp.cl.txt

Where is support for cross-lane subgroup "shuffle" operations?

[Transferring existing Bugzilla SPIR-V issues here.]

Allan MacKinnon 2015-03-12 08:40:51 PDT

I don't see any SPIR-V opcodes for subgroup cross-lane "shuffle" operations.

There is fairly general cross-lane communication support in NVIDIA sm_30+ hardware as well as Intel Broadwell and, apparently, AMD GCN3.

Simple black box subgroup scan/reduce operations are not expressive enough.

johnk 2015-03-22 14:12:16 PDT

Thanks, we will look into this.

Allan MacKinnon 2015-03-22 14:28:45 PDT

Intel's extension could serve as a guide for the missing SPIR-V ops:

https://www.khronos.org/registry/cl/extensions/intel/cl_intel_subgroups.txt

johnk 2016-02-03 12:34:16 PST

This is being considered internally for a future release (not for 1.0).

Need to refine definition of a back edge used by structured control flow rules

The current definition of a back edge in SPIR-V is not quite correct:

Back Edge: If a depth-first traversal is done on a function’s CFG, starting from the first block of the function, a back edge is a branch to a previously visited block. A back-edge block is the block containing such a branch.

This was a late addition that doesn't quite work with the intent of the structured control flow rules.

Suggestions from @dneto0, in issue KhronosGroup/SPIRV-Tools#270 :

I think the spec's definition of back-edge incorrectly allows cross edges to be classified as back-edges. I think it should instead say the equivalent of "A back-edge is an edge in the CFG from a block B to a block C that dominates B. B may be the same as C."

But I think glslang's code generation is sensible, and if anything the spec should be amended to clearly allow unreachable continue-constructs.

For B-> C to be a back edge the "previously visited" phrasing to me means "I am currently visiting B, and the edge makes me want to visit C, but I have previously visited C". It's the "I am currently visiting B" part that is not satisfied if doing a DFS starting a the entry block of the function.

additional Storage Class is required: static or constant WorkgroupLocal

[Transferring existing Bugzilla SPIR-V issues here.]

Dm 2015-04-05 06:12:48 PDT

random lookup tables (which is data dependent, not work-item) requires to be inside WorkgroupLocal memory, because UniformConstant has slow random indexed access, but maximum workgroup size is (several times) smaller than maximum work items count per EACH compute unit which are in execution state (i.e. their registers and WorkgroupLocal memory is allocated, and they are running or waiting for memory loads) this leads to duplication of constant lookup table in WorkgroupLocal memory, which in turn leads to low processing performance (as WorkgroupLocal memory size is limited).

there is two ways of initialization of table:

  1. calculation of lookup table. In that case initialization code should be executed once for each compute unit for kernel run, or there will be additional performance loss.
  2. load from global or UniformConstant memory, in this case better execute optimal loading once.

P.S. may be other cases which would require static non const access to that (local compute unit) memory.

Dm 2015-04-05 06:19:38 PDT

correction: (which is indexed data dependent way)

PrimitiveId can't be used in Fragments

Currently the PrimitiveId BuiltIn is only allowed in Geometry and Tessellation but according to the Vulkan and OpenGl spec it should also be an allowed input to Fragment stages.

function recursion and re-entry

Hi,
Just skimmed through spirv 1.2 specs and it seems that function recursion and re-entry is now allowed?

Recursion and call-graph re-entry must be detected for those functions because they cannot afford a static allocation of vector registers. It seems related to the control flow graph dynamic thingies.
Roughly speaking:

  • each recursive function (or recursive call-subgraph) will need a stack in GPU memory for each vector unit-->this will hurt badly performance.
  • 're-entered' functions would need a new set of vector unit registers for each re-entry.

Did I miss something? Or do we go CPU-like (stack) vector units in GPU? (namely, we ate the bullet)

UniformConstant initialization

According to "3.7. Storage Class" of the SPIR-V spec, the UniformConstant storage class has the restriction that "Variables declared with this storage class are read-only, and cannot have initializers". This clashes with the necessity of constant value/object initialization in OpenCL.
Note that this was only added in SPIR-V Version 1.00, Revision 5, listed as "Khronos SPIR-V issue #56: Clarify that the meaning of "read-only" in the Storage Classes includes not allowing initializers.". The reasoning behind this is unclear (and the issue/report is not public as far as I can tell).

OpenCL example, built with Intel's compiler (same as a SPIRV-LLVM/SPIR toolchain):

constant float4 carr[] = {
    (float4)(1.0f),
    (float4)(1.0f),
};

kernel void constant_test(global float4* out) {
    out[get_global_id(0)] = carr[get_global_id(0) % 2];
}

violating OpVariable with an initializer marked below:

               OpCapability Addresses
               OpCapability Linkage
               OpCapability Kernel
          %1 = OpExtInstImport "OpenCL.std"
               OpMemoryModel Physical32 OpenCL
               OpEntryPoint Kernel %18 "constant_test"
               OpSource OpenCL_C 102000
               OpName %11 "carr"
               OpName %14 "__spirv_BuiltInGlobalInvocationId"
               OpName %19 "out"
               OpDecorate %31 Constant
         %31 = OpDecorationGroup
               OpDecorate %14 BuiltIn GlobalInvocationId
               OpDecorate %11 LinkageAttributes "carr" Export
               OpDecorate %14 LinkageAttributes "__spirv_BuiltInGlobalInvocationId" Import
               OpDecorate %11 Alignment 16
               OpGroupDecorate %31 %11 %14
          %2 = OpTypeFloat 32
          %4 = OpTypeVector %2 4
          %6 = OpTypeInt 32 0
          %8 = OpTypeArray %4 %7
         %10 = OpTypePointer UniformConstant %8
         %12 = OpTypeVector %6 3
         %13 = OpTypePointer UniformConstant %12
         %15 = OpTypeVoid
         %16 = OpTypePointer CrossWorkgroup %4
         %17 = OpTypeFunction %15 %16
         %25 = OpTypePointer UniformConstant %4
          %3 = OpConstant %2 1
          %5 = OpConstantComposite %4 %3 %3 %3 %3
          %7 = OpConstant %6 2
          %9 = OpConstantComposite %8 %5 %5
         %24 = OpConstant %6 0
-->      %11 = OpVariable %10 UniformConstant %9
         %14 = OpVariable %13 UniformConstant
         %18 = OpFunction %15 None %17
         %19 = OpFunctionParameter %16
         %20 = OpLabel
         %21 = OpLoad %12 %14 Aligned 0
         %22 = OpCompositeExtract %6 %21 0
         %23 = OpUMod %6 %22 %7
         %26 = OpInBoundsPtrAccessChain %25 %11 %24 %23
         %27 = OpLoad %4 %26 Aligned 16
         %28 = OpLoad %12 %14 Aligned 0
         %29 = OpCompositeExtract %6 %28 0
         %30 = OpInBoundsPtrAccessChain %16 %19 %29
               OpStore %30 %27 Aligned 16
               OpReturn
               OpFunctionEnd

I would presume this restriction on UniformConstant was put in place because of its use in Vulkan/GLSL, where UniformConstant is used for samplers and images (13.1 Vulkan spec), which obviously can't have initializers in Vulkan/GLSL, as well as Input and PushConstant variables which understandably can't have initializers either. So it seems like this restriction on UniformConstant was more of an accident than an intentional change when Input/PushConstant were changed. Also note that glslang is currently emitting Function storage class variables for any constant variables in GLSL, which is spec conformant, but seems unfortunate to me, considering that vendor compilers might want to put this into separate/special read-only memory instead of it eating up registers and not being read-only.

Proposed resolution: refine the restriction on UniformConstant so that only initialization of opaque types is not allowed, with one exception for OpenCL (or LiteralSampler capability), where it's also allowed to initialize OpTypeSampler variables with a OpConstantSampler.

Does the exit branch for a loop need to be an OpBranchConditional?

Consider something like this

Where bb6 exits the loop with bb17. I assume that would be illegal because only OpBranchConditional and OpBranch can be "annotated" with an OpLoopMerge.

Couldn't it be relaxed so that an OpSwitch can also become a LoopHeader if the switch has two targets or less? Default + 1 target branch

handling of kernel struct parameters

Note: cross-posted from KhronosGroup/SPIRV-LLVM#151

given the following OpenCL C code:

typedef struct {
    int val;
} test_struct;

kernel void struct_test(global int* buf, test_struct param) {
    buf[get_global_id(0)] = param.val;
}

kernel void int_test(global int* buf, int param) {
    buf[get_global_id(0)] = param;
}

resulting in the following IR (shortened for brevity):

%struct.test_struct = type { i32 }

define spir_kernel void @struct_test(i32 addrspace(1)* %buf, %struct.test_struct* %param) nounwind {
  %1 = getelementptr inbounds %struct.test_struct* %param, i64 0, i32 0
  %2 = load i32* %1, align 4, !tbaa !12
  %3 = tail call spir_func i64 @_Z13get_global_idj(i32 0) nounwind readnone
  %4 = getelementptr inbounds i32 addrspace(1)* %buf, i64 %3
  store i32 %2, i32 addrspace(1)* %4, align 4, !tbaa !12
  ret void
}

define spir_kernel void @int_test(i32 addrspace(1)* %buf, i32 %param) nounwind {
  %1 = tail call spir_func i64 @_Z13get_global_idj(i32 0) nounwind readnone
  %2 = getelementptr inbounds i32 addrspace(1)* %buf, i64 %1
  store i32 %param, i32 addrspace(1)* %2, align 4, !tbaa !12
  ret void
}

resulting in the following SPIR-V (shortened for brevity):

               OpEntryPoint Kernel %12 "struct_test"
               OpEntryPoint Kernel %25 "int_test"
               OpDecorate %5 LinkageAttributes "__spirv_BuiltInGlobalInvocationId" Import
          %2 = OpTypeInt 64 0
          %7 = OpTypeInt 32 0
         %16 = OpConstant %2 0
         %17 = OpConstant %7 0
          %3 = OpTypeVector %2 3
          %4 = OpTypePointer UniformConstant %3
          %6 = OpTypeVoid
          %8 = OpTypePointer CrossWorkgroup %7
          %9 = OpTypeStruct %7
         %10 = OpTypePointer Function %9
         %11 = OpTypeFunction %6 %8 %10
         %18 = OpTypePointer Function %7
         %24 = OpTypeFunction %6 %8 %7
          %5 = OpVariable %4 UniformConstant
         %12 = OpFunction %6 None %11
         %13 = OpFunctionParameter %8
         %14 = OpFunctionParameter %10
         %15 = OpLabel
         %19 = OpInBoundsPtrAccessChain %18 %14 %16 %17
         %20 = OpLoad %7 %19 Aligned 4
         %21 = OpLoad %3 %5 Aligned 0
         %22 = OpCompositeExtract %2 %21 0
         %23 = OpInBoundsPtrAccessChain %8 %13 %22
               OpStore %23 %20 Aligned 4
               OpReturn
               OpFunctionEnd
         %25 = OpFunction %6 None %24
         %26 = OpFunctionParameter %8
         %27 = OpFunctionParameter %7
         %28 = OpLabel
         %29 = OpLoad %3 %5 Aligned 0
         %30 = OpCompositeExtract %2 %29 0
         %31 = OpInBoundsPtrAccessChain %8 %26 %30
               OpStore %31 %27 Aligned 4
               OpReturn
               OpFunctionEnd

Is the way kernel struct parameters are handled really the correct/intended behavior?
Considering that scalar types are directly used in OpFunctionParameter/OpTypeFunction, shouldn't structs be handled the same way instead of going through an "OpTypePointer Function" indirection? Even more, doesn't this indirection say that only a pointer argument will be set/used (4 or 8 bytes), not so much a struct object (which could be any size)?
I know that the issue here is that LLVM/SPIR can only handle struct parameters as pointers, but something like that isn't specified for SPIR-V.

How to solve this?

Option 1 (preferable):
Keep it the way it is right now, but explicitly specify that kernel pointer parameters to Function/private memory actually perform some kind of allocation of the element/pointee type on the device side, and are set as this element/pointee type on the host side (not as the pointer type). Note that private address space pointer kernel arguments are otherwise invalid.

Option 2 (impossible?):
Directly use OpTypeStruct in OpFunctionParameter/OpTypeFunction. This will however require IR/SPIR-V translator changes, since OpTypeStruct is no longer a pointer type (making all GEPs/Op*AccessChain instructions using it invalid). This might be impossible to do though, since there is no way of getting a pointer to this struct then in SPIR-V (afaik).

edit:
Option 3:
Require a OpVariable in OpFunctionParameter/OpTypeFunction for struct types. This way it should be clear what is actually happening + it is still a pointer.

Rust enums

I'd be happy to extend the current build headers tool to generate rust enums. Should I open a pull request?

Spirv-Headers have no tags

It would bee good to give spirv-headers tags otherwise users build inconstant applications.

In this case spriv-headers should be synchronized with spirv-tools.

Clarification on code motion section

Texturing instructions in the Fragment Execution Model that rely on an implicit derivative cannot be moved into control flow that is not known to be uniform control flow within each derivative group.

It's not totally clear to me what cannot be moved means in this context. Is it not allowed to have texture ops with implicit derivatives to be inside non-uniform control flow paths? Compilers should not move those instructions into non-uniform control flow paths?

Duplicate capability constant

In include/spirv/1.1/spirv.h, the value 4433 is used for two enum values (lines 637+)

SpvCapabilityStorageBuffer16BitAccess = 4433,
SpvCapabilityStorageUniformBufferBlock16 = 4433,

It looks like the SpvCapabilityStorageBuffer16BitAccess should be 4432.

Same error on the next two lines:

SpvCapabilityStorageUniform16 = 4434,
SpvCapabilityUniformAndStorageBuffer16BitAccess = 4434,

Is it valid structured control flow: switch-case goto loop merge

switchcase_goto_loop_merge.spv.txt

The switch-case is inside the loop. It branches from switch-case to loop merge block.
I think it's invalid, but I'd like someone to confirm it. Is there some validator that can check it?

    %808 = OpLabel
           OpLoopMerge **%809** %810 None
           OpBranch %811
    ...
           OpSelectionMerge %813 None
           OpSwitch %849 %814 0 %815 1 %816 2 %817
    %814 = OpLabel
           OpBranch **%809**
    %815 = OpLabel
    %818 = OpLoad %211 %213
    %819 = OpLoad %215 %217
    %820 = OpSampledImage %219 %818 %819
    %822 = OpImageSampleImplicitLod %27 %820 %670
           OpBranch **%809**
    ...
    %813 = OpLabel
           OpUnreachable
    %810 = OpLabel
           OpBranch %808
    %809 = OpLabel
    ...

Commentary instruction

I'm writing a Maxwell -> SPIR-V decompiler for an emulator. For debugging purposes, commenting the current instruction to be decompiled is useful because it let's you know what it's trying to do. That's easy to do in GLSL, but in SPIR-V there's no such thing.

To workaround this issue what I'm currently doing is emitting an OpUndef %void and then naming it, that way it ends up being visible in spirv-dis's output:

               OpName %12_MUFU_0x5080000000470003 "12_MUFU_0x5080000000470003"
               OpName %13_IPA_0xe043ff880037ff00 "13_IPA_0xe043ff880037ff00"

...

%12_MUFU_0x5080000000470003 = OpUndef %void
        %315 = OpLoad %float %gpr0
        %316 = OpLoad %float %gpr4
        %317 = OpFDiv %float %float_1 %315
               OpStore %gpr3 %317
%13_IPA_0xe043ff880037ff00 = OpUndef %void
        %321 = OpLoad %float4 %input_attr_0
        %322 = OpCompositeExtract %float %321 0
               OpStore %gpr0 %322

What I want is something that can be easily dropped from the binary like OpLine. In my opinion a OpComment "This is a commentary" would be optimal (OpString in a function body is illegal).

As a note, I'm emitting SPIR-V directly to binary so spirv-as commentaries won't work (and they wouldn't be visible in debugging tools).

Offset/ConstOffset Image Operand with a Cube image

[Transferring existing Bugzilla SPIR-V issues here.]

florian.ziesche 2016-05-01 04:47:27 PDT

As far as I can tell from the spec an Offset/ConstOffset Image Operand (3.14) isn't explicitly forbidden when using it with a cube map.

If this is intended:
"The number of components must equal the number of components in Coordinate" is either incorrect and should be "2 components for a cube image" (since the offset should be applied to (s,t) / inside the 2D face image after transforming the 3D vector), or just mention that the 3rd component is ignored.

If this is not intended:
Mention that it is not supported with Dim Cube.
Note that this is the case for GLSL (and certain other platforms) right now.

Also, minor editorial thing/question: both ConstOffset and Offset say "It is a compile-time error if these fall outside a target-dependent allowed range". Should this be mentioned at all for Offset or should it rather be replaced by "run-time error" or "UB ensues"? Though one might still expect a compile-time error when using Offset with constant offsets.

OpEntryPoint should list _all_ variable IDs referenced by the corresponding function.

The specification requires listing only inputs and outputs for OpEntryPoint. It should require listing all global variable IDs (including uniforms) that are referenced within the function's static call graph.

In order to verify that two SPIR-V modules can be linked, and report meaningful errors, an application cannot rely on the ID list provided by OpEntryPoint.

Listing only inputs and outputs is not sufficient to determine if linkage between stages from different modules (that share one or more uniforms by name or location) is valid. The locations, bindings, and types of shared uniforms must match between stages.

Unless the application assumes that all entry points in a module reference all uniforms declared therein, static analysis is still necessary to perform this very basic kind of validation.

Determining whether a referenced variable ID is an input or an output is easy, determining if a function actually uses a variable is time-consuming.

I am unable to find rationale for this limitation. If it is documented somewhere, I would appreciate a link.

OpCompositeConstruct causes driver crash in rare (but seemingly valid) case

I have a project that generates SPIR-V directly, rather than compiling it from GLSL via glslangValidator.

The code under consideration takes a vector of 3 floats (vec3) and creates a vector of 4 floats (vec4) by concatenating the vec3 with a fourth component, 1.0. The equivalent GLSL is:

vec3 v3 = input.xyz;
vec4 v4 = vec4(v3, 1.0); // <-- This is what I want to accomplish.

To accomplish this, I generated what seemed to be the most straightforward SPIR-V code:

%1 = OpLoad %v3float %...
%2 = OpConstant %float 1
%3 = OpCompositeConstruct %v4float %1 %2

However, when attempting to use this code in a vertex shader, it causes an access violation in the latest AMD driver (16.9.3).

When the same function is performed in GLSL and compiled by glslangValidator, it creates the vec4 a different way. It extracts each float component from the vec3 individually and constructs a vec4 from 4 floats. glslangValidator generates SPIR-V code like this:

%1 = OpLoad %v3float %...
%2 = OpCompositeExtract %float %1 0
%3 = OpCompositeExtract %float %1 1
%4 = OpCompositeExtract %float %1 2
%5 = OpConstant %float 1
%6 = OpCompositeConstruct %v4float %2 %3 %4 %5

This code works in the latest AMD driver.

Now, the SPIR-V spec seems to imply that the original code (the one that crashes) is valid and should do what I expect. OpCompositeConstruct should create a v4float if the input parameters are a v3float and a float. The SPIR-V 1.0 Specification says:

OpCompositeConstruct

Construct a new composite object from a set of constituent objects that will fully form it.

Result Type must be a composite type, whose top-level members/elements/components/columns have the same type as the types of the operands, with one exception. _The exception is that for constructing a vector, the operands may also be vectors with the same component type as the Result Type component type. When constructing a vector, the total number of components in all the operands must equal the number of components in Result Type._

The "exception" allows combining vectors of different sizes into a new vector whose size is the total of all constituents. But since glslangValidator does not generate such code, I suspect that driver programmers have neglected to support this case.

In closing, I have two questions. Do I understand correctly that my original SPIR-V is valid? If so, where can I submit a bug report to AMD?

Reduce requirement for building static call graph for validation

[Transferring existing Bugzilla SPIR-V issues here.]

ratchet freak 2015-08-31 03:51:26 PDT

To verify that function is only used from 1 execution model you currently need to create the static call graph for each entry point in the module.

This is needed to verify that opKill for example is only called from a fragment shader.

This is also made much harder if you use exported and imported functions as poor-man's uniform subroutines. If you export a function you can't enforce that only a fragment shader can call into it.

I suggest having function decorated with which execution model(s) are allowed to call into it. This way you can verify each function separately.

For example I have a generic purpose foo function that I declare that all models can call into. I also have a bar that I declare that only fragment shader can call (because it uses dPdx for example).

foo cannot call bar, but bar can call foo (the execution models of the callee must be a super-set of the caller's).

johnk 2016-02-03 12:59:42 PST

(In reply to ratchet freak from comment #0)

To verify that function is only used from 1 execution model you currently
need to create the static call graph for each entry point in the module.

Sounds right.

This is needed to verify that opKill for example is only called from a
fragment shader.

Generally, a validator, or other offline tools, will want to do such a thing, for multiple reasons, while execution time for an end user may assume code is valid and won't do such a thing.

This is also made much harder if you use exported and imported functions

That sounds like an OpenCL ability right now, not graphics.

as
poor-man's uniform subroutines.

That sounds like graphics. SPIR-V modules contain already fully linked graphical stages.

If you export a function you can't enforce
that only a fragment shader can call into it.

I suggest having function decorated with which execution model(s) are
allowed to call into it. This way you can verify each function separately.

I'll make this an enhancement request for this. I think validation would still build the full static tree to validate the decorations were done correctly though, and run-time execution would still assume the code is correct.

Spec bug: OpConvertPtrToU wrong result type

In the description of OpConvertPtrToU

Result Type must be a scalar or vector of integer type, whose Signedness operand is 0.
should be
Result Type must be a scalar of integer type, whose Signedness operand is 0.

OpAccessChain should allow OpConstantNull for structures

The SPIR-V spec says:

Each of the Indexes must:

...
- be an OpConstant when indexing into a structure.

This is too restrictive. An OpConstantNull can also define an integer constant (of zero), so it should be allowed. For at least some SPIR-V generators, this will effectively mean that OpConstantNull cannot be used for scalar integers since the code that generates the constant may not know what the constant will be used for.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.