3 Replies. Latest reply: Dec 12, 2012 7:56 AM by gsellers

Are dynamically-indexed subroutine arrays in GL4 on HD5xxx hardware half-possible?

bananafish Newbie

To my astonishment, the following successfully compiles in GLSL 410 on a Radeon HD 5570 using FGLRX-updates 2:9.000 ubuntu:

 

#version 410
subroutine uniform testSR testSub[2];
/**
Code here setting up the subroutines for testSR
**/

void main()
{
    testSub[gl_VertexID % 2]();
}

 

...however, only the subroutine at array element 0 was ever executed. My understanding is that instructions within a wavefront run in lock-step and cannot truly branch, so I did not expect this to compile at all, or I expected that it would perhaps be compiled to the equivalent of a switch() block with the subroutines inlined the old-fashioned way. When a constant is passed as the array index, implicitly or otherwise, the call for that index works. For example:

 

void main()
{
    testSub[1]();
}

...correctly executes the subroutine assigned to index 1.
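
For anyone reproducing this, a minimal self-contained sketch of such a test shader might look like the following. The subroutine signature and the names colorRed, colorBlue and vColor are assumptions for illustration only; the original setup code was not posted.

```glsl
#version 410

// Subroutine type: any function declared with subroutine(testSR) can be
// assigned to a uniform of this type.
subroutine void testSR();

out vec4 vColor;

// Two hypothetical implementations, distinguishable by the color they write.
subroutine(testSR) void colorRed()  { vColor = vec4(1.0, 0.0, 0.0, 1.0); }
subroutine(testSR) void colorBlue() { vColor = vec4(0.0, 0.0, 1.0, 1.0); }

// Array of subroutine uniforms; the application selects a function for each
// element (for example with glUniformSubroutinesuiv).
subroutine uniform testSR testSub[2];

void main()
{
    gl_Position = vec4(0.0, 0.0, 0.0, 1.0);
    testSub[1](); // constant index: the case that behaves as expected
}
```

On the API side, the selection made with glUniformSubroutinesuiv is context state rather than program state, so it normally has to be set again after each glUseProgram.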

 

Why is this? Is this a bug?

 

The test code above was inspired by this posting. The wording there was confusing, but the idea of dynamically-indexed subroutine arrays seemed impossible in GPU hardware, as it would imply that each thread could follow its own instruction path without traditional clause-lockouts.

 

Driver info below, in case this is a bug:

sudo modinfo fglrx_updates
filename:       /lib/modules/3.5.0-18-generic/updates/dkms/fglrx_updates.ko
license:        Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY
description:    ATI Fire GL
author:         Fire GL - ATI Research GmbH, Germany
srcversion:     9C5BC5DE95ACE501DA51B24
...
vermagic:       3.5.0-18-generic SMP mod_unload modversions
Ubuntu-Server//amd64//RadeonHD//Headless//OpenCL//JavaCL
  • Re: Are dynamically-indexed subroutine arrays in GL4 on HD5xxx hardware half-possible?
    gsellers Moderator

    Hi,

     

    No, dynamically non-uniform expressions are not supported. The index used to look up into the array of subroutine uniforms is always taken from lane zero of the wavefront. There is no requirement that the shader compiler fail to compile the shader, because determining that an expression is not dynamically uniform is virtually impossible. For example, you could read from a texture containing all black and end up with a uniform expression. The shader compiler has no way to know that you're going to do this, although it is a (highly contrived) use case.
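
    The texture case can be sketched roughly as follows. The sampler name indexTex and the subroutine bodies are hypothetical and not taken from the original shader.

    ```glsl
    #version 410

    subroutine void testSR();

    out vec4 vColor;

    // Hypothetical implementations, only here to make the sketch complete.
    subroutine(testSR) void pathA() { vColor = vec4(1.0); }
    subroutine(testSR) void pathB() { vColor = vec4(0.0); }

    subroutine uniform testSR testSub[2];

    // Hypothetical lookup texture; its contents are only known at draw time.
    uniform sampler2D indexTex;

    void main()
    {
        gl_Position = vec4(0.0, 0.0, 0.0, 1.0);

        // The index depends on texel data the compiler never sees, so it
        // cannot prove whether idx is the same across all lanes.
        float r = texelFetch(indexTex, ivec2(gl_VertexID, 0), 0).r;
        int idx = clamp(int(r), 0, 1);
        testSub[idx]();
    }
    ```

    If indexTex happens to be all black, idx is 0 in every lane and the expression is uniform in practice, even though nothing in the shader source says so.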

     

    There is no bug here. The shader is correctly compiled to execute a jump into an indexed array of subroutines, but that index is always taken from lane zero.

     

    In the example you posted, gl_VertexID % 2 is always zero for lane zero of any wavefront, regardless of the size of the draw, because lane zero holds vertex 0, 64, 128 and so on, all of which are even. If you were, instead, to write gl_VertexID % 3 (and expand the array size appropriately), you would see subroutine 0 executed for the first 64 vertices, subroutine 1 for the second 64 vertices (64 % 3 = 1), subroutine 2 for the third 64 vertices (128 % 3 = 2), and so on.
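
    If per-vertex selection is what is actually wanted, one possible approach is to branch on the varying value and index the array with constants only. This is a sketch under that assumption, not something confirmed for this driver; the subroutine bodies and vColor are hypothetical.

    ```glsl
    #version 410

    subroutine void testSR();

    out vec4 vColor;

    // Hypothetical implementations, only here to make the sketch complete.
    subroutine(testSR) void pathA() { vColor = vec4(1.0); }
    subroutine(testSR) void pathB() { vColor = vec4(0.0); }

    subroutine uniform testSR testSub[2];

    void main()
    {
        gl_Position = vec4(0.0, 0.0, 0.0, 1.0);

        // Each call site uses a constant index, so the index itself is
        // trivially uniform; only the control flow diverges per vertex.
        switch (gl_VertexID % 2) {
        case 0:
            testSub[0]();
            break;
        default:
            testSub[1]();
            break;
        }
    }
    ```

    Lanes that take different cases would be handled like any other divergent branch, so both subroutines may end up being executed within a single wavefront.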

     

    Cheers,

     

    Graham

    Graham Sellers
    Sr. Manager, OpenGL Driver Team, AMD

    Twitter: @grahamsellers

