MultiDrawIndirect

One of the most interesting extensions in modern OpenGL is indirect drawing, and MultiDrawIndirect in particular. Before explaining this extension, let’s first review the drawing commands in OpenGL so that we have the background to understand how MultiDrawIndirect works. If you already know the different drawing commands in OpenGL, feel free to skip the next two sections.

Drawing Commands

Drawing commands in OpenGL can be classified as indexed or non-indexed, and as direct or indirect. That gives a total of four categories: indexed direct, non-indexed direct, indexed indirect and non-indexed indirect. The difference between indexed and non-indexed is pretty straightforward. In an indexed draw you have to provide an index buffer (bound to GL_ELEMENT_ARRAY_BUFFER), which OpenGL will use to fetch the vertices from the vertex buffers. Non-indexed drawing commands, on the other hand, don’t use an index buffer, so vertices are fed sequentially in the order in which they appear in the vertex buffers.

There are a lot of direct drawing commands, both indexed and non-indexed, but they are all just reduced versions of the most general ones: glDrawArraysInstancedBaseInstance() for non-indexed drawing and glDrawElementsInstancedBaseVertexBaseInstance() for indexed drawing. Two parameters of these functions need some explanation: BaseVertex (indexed draws only) and BaseInstance.

Actually, if you’ve read my previous post on Array Textures, you’ve already seen BaseVertex being used. When BaseVertex is specified, OpenGL fetches the vertex index from the buffer bound to GL_ELEMENT_ARRAY_BUFFER and then adds baseVertex to it before using it to index into the vertex buffers. This is useful for combining multiple meshes into a single vertex buffer without having to modify the indices of every single mesh. Here is an example: say you have two meshes, Mesh1 and Mesh2, that you want to combine into a single buffer.

Mesh1:        vertex: {v0, v1, v2}                  index: {0, 1, 2}
Mesh2:        vertex: {w0, w1, w2}                  index: {0, 1, 2}
Mesh1+Mesh2:  vertex: {v0, v1, v2, w0, w1, w2}

Now, if you want to render both meshes, you bind the index buffer of the first mesh and Mesh1 gets rendered. But if you then bind the index buffer of Mesh2, you will be rendering the vertices v0, v1, v2 again, and that’s not what we want. If we use BaseVertex=3 when rendering Mesh2, the first index becomes 0+3=3, the second 1+3=4 and the third 2+3=5, so we fetch the correct vertices (w0, w1, w2).
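The index arithmetic can be mimicked on the CPU. The following is only a rough model of the fetch, not OpenGL code: FetchVertices is a hypothetical helper, and each vertex is represented simply by its name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// CPU-side model of an indexed vertex fetch: baseVertex is added to every
// index read from the index buffer before it is used to look up a vertex.
// Hypothetical helper for illustration only; this is not an OpenGL function.
std::vector<std::string> FetchVertices( const std::vector<std::string>& vertexBuffer,
                                        const std::vector<unsigned int>& indexBuffer,
                                        int baseVertex )
{
  std::vector<std::string> fetched;
  for( unsigned int index : indexBuffer )
  {
    const long long finalIndex = static_cast<long long>(index) + baseVertex;
    //The offset index must still fall inside the combined vertex buffer
    assert( finalIndex >= 0 && static_cast<std::size_t>(finalIndex) < vertexBuffer.size() );
    fetched.push_back( vertexBuffer[static_cast<std::size_t>(finalIndex)] );
  }
  return fetched;
}
```

With the combined buffer {v0,v1,v2,w0,w1,w2} and the indices {0,1,2}, a baseVertex of 0 yields Mesh1 and a baseVertex of 3 yields Mesh2, without touching the indices themselves.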

Instancing

To explain what BaseInstance is, we need to know a little bit about instanced rendering in OpenGL. Instanced rendering is a method provided by OpenGL to render multiple copies of the same geometry with a single draw call. glDrawArraysInstanced and glDrawElementsInstanced accept the argument instanceCount, which tells OpenGL to render instanceCount copies of the geometry. In the vertex shader you can find out which instance the vertex being processed belongs to by consulting the built-in variable gl_InstanceID. But how can you pass per-instance data to the shader using vertex buffers? The answer is instanced arrays. To make OpenGL read an attribute from the vertex arrays once per instance, for example, you can use glVertexAttribDivisor(). glVertexAttribDivisor modifies the rate at which generic vertex attributes advance during instanced rendering: if divisor is non-zero, the attribute advances once every divisor instances. When you have instanced vertex attributes, you can use BaseInstance to offset where in their respective buffers the data is read from. The actual formula for calculating the index from which instanced attributes are fetched is (instance/divisor) + baseInstance, where the division is an integer division.
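As a small worked example of that formula, here is a sketch in plain C++. InstancedAttribIndex is a hypothetical name used only for illustration; OpenGL performs this computation internally.

```cpp
#include <cassert>

// Element of an instanced attribute array that is read for a given instance,
// following the formula (instance / divisor) + baseInstance.
// Hypothetical helper; divisor must be non-zero (zero means "per vertex").
unsigned int InstancedAttribIndex( unsigned int instance,
                                   unsigned int divisor,
                                   unsigned int baseInstance )
{
  assert( divisor != 0 );
  return instance / divisor + baseInstance;  //Integer division
}
```

For instance, with divisor 2 and baseInstance 10, instances 4 and 5 both read element 12 of the attribute array, and instance 6 moves on to element 13. With divisor 1 and a single instance, the element read is simply baseInstance.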

Indirect drawing commands

Indirect drawing commands are a family of commands that allow the parameters of each draw to be stored in a buffer object. There are four indirect drawing commands in OpenGL: glDrawArraysIndirect(), glDrawElementsIndirect(), glMultiDrawArraysIndirect() and glMultiDrawElementsIndirect(). glDrawArraysIndirect and glDrawElementsIndirect are the indirect equivalents of the most generic direct drawing commands (glDrawArraysInstancedBaseInstance and glDrawElementsInstancedBaseVertexBaseInstance respectively), but the parameters of these functions are read from a buffer bound to the GL_DRAW_INDIRECT_BUFFER binding point. The layout of the data in that buffer depends on which function is being used:

typedef struct
{
  GLuint vertexCount;
  GLuint instanceCount;
  GLuint firstVertex;
  GLuint baseInstance;
}DrawArraysIndirectCommand;

typedef struct
{
  GLuint vertexCount;   //Number of indices to draw
  GLuint instanceCount;
  GLuint firstIndex;    //Offset of the first index, in indices
  GLuint baseVertex;
  GLuint baseInstance;
}DrawElementsIndirectCommand;

It’s that simple. Now, this alone is not particularly useful, but it gets more interesting with the introduction of multi-draw indirect calls. glMultiDrawArraysIndirect and glMultiDrawElementsIndirect perform the same operation as glDrawArraysIndirect and glDrawElementsIndirect, but they do it in a loop over an array of DrawArraysIndirectCommand or DrawElementsIndirectCommand structures. You specify the number of draws you want to perform and OpenGL generates all those draw commands for you, reading the parameters of each one from the buffer bound to GL_DRAW_INDIRECT_BUFFER. If your platform supports the ARB_shader_draw_parameters extension, you can use the built-in variable gl_DrawID in the shader to know which draw call is being executed. If it doesn’t, there is a workaround to get the draw id in the shader. The trick is very well explained here. Basically, we are going to exploit the fact that each draw command is an instanced drawing command, even if it renders only one instance. We will set up an instanced vertex attribute containing the draw ids, and then set the baseInstance field of every command to the index of its id within that attribute’s array.
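To see why the baseInstance trick works, the loop that a multi-draw indirect call performs can be modelled on the CPU. This is only a sketch under simplifying assumptions: IndirectCommand and EmulateDrawIds are hypothetical names, nothing is actually rendered, and the draw id attribute is assumed to have a divisor of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same five fields as an indexed indirect draw command (hypothetical type)
struct IndirectCommand
{
  unsigned int vertexCount;
  unsigned int instanceCount;
  unsigned int firstIndex;
  unsigned int baseVertex;
  unsigned int baseInstance;
};

// Walks the command array the way a multi-draw call would and records the
// value that an instanced attribute with divisor 1 would deliver to the
// shader for every instance of every draw.
std::vector<unsigned int> EmulateDrawIds( const std::vector<IndirectCommand>& commands,
                                          const std::vector<unsigned int>& drawIdAttribute )
{
  std::vector<unsigned int> seenByShader;
  for( std::size_t draw(0); draw != commands.size(); ++draw )
  {
    const IndirectCommand& cmd = commands[draw];
    for( unsigned int instance(0); instance != cmd.instanceCount; ++instance )
    {
      //Instanced fetch: (instance / divisor) + baseInstance, with divisor == 1
      const std::size_t element = instance + cmd.baseInstance;
      assert( element < drawIdAttribute.size() );
      seenByShader.push_back( drawIdAttribute[element] );
    }
  }
  return seenByShader;
}
```

When every command renders one instance and command i has baseInstance set to i, the shader reads drawIdAttribute[i] during draw i, which is exactly the information gl_DrawID would have provided.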

To illustrate the use of glMultiDrawElementsIndirect, we are going to make the little program from the last post more efficient by drawing all those little quads in just one draw call. By the way, I have uploaded the code for all the samples so far to a GitHub repository: GLSamples.

#include <iostream> //std::cerr
#include <cstdlib>  //rand
 
#include "GL/glew.h"
#include "GL/freeglut.h"
 
namespace
{
  struct SVertex2D
  {
    float x,y;  //Position
    float u,v;  //Uv
  };
 
  struct SDrawElementsCommand
  {
    GLuint vertexCount;
    GLuint instanceCount;
    GLuint firstIndex;
    GLuint baseVertex;
    GLuint baseInstance;
  };
 
  const GLchar* gVertexShaderSource[] = 
  {
    "#version 430 core\n"
    "layout (location = 0 ) in vec2 position;\n"
    "layout (location = 1 ) in vec2 texCoord;\n"
    "layout (location = 2 ) in uint drawid;\n"
    "out vec2 uv;\n"
    "flat out uint drawID;\n"
    "void main(void)\n"
    "{\n"
    "  gl_Position = vec4(position,0.0,1.0);\n"
    "  uv = texCoord;\n"
    "  drawID = drawid;\n"
    "}\n"
  };
 
  const GLchar* gFragmentShaderSource[] = 
  {
    "#version 430 core\n"
    "out vec4 color;\n"
    "in vec2 uv;\n"
    "flat in uint drawID;\n"
    "layout (binding=0) uniform sampler2DArray textureArray;\n"
    "void main(void)\n"
    "{\n"
    "  color = texture(textureArray, vec3(uv.x,uv.y,drawID) );\n"
    "}\n"
  };
 
  const SVertex2D gQuad[] = { {0.0f,0.0f,0.0f,0.0f},
                              {0.1f,0.0f,1.0f,0.0f},
                              {0.0f,0.1f,0.0f,1.0f},
                              {0.1f,0.1f,1.0f,1.0f}
                            };
 
  const unsigned int gIndex[] = {0,1,2,1,3,2};
 
  GLuint gArrayTexture(0);
  GLuint gVertexBuffer(0);
  GLuint gElementBuffer(0);
  GLuint gIndirectBuffer(0);
  GLuint gDrawIdBuffer(0);
  GLuint gProgram(0);
 
}//Unnamed namespace
void GenerateGeometry()
{
  //Generate 100 little quads
  SVertex2D vVertex[400];
  int index(0);
  float xOffset(-0.95f);
  float yOffset(-0.95f );
  for( unsigned int i(0); i!=10; ++i )
  {
    for( unsigned int j(0); j!=10; ++j )
    {
      for( unsigned int k(0); k!=4; ++k)
      {
        vVertex[index].x = gQuad[k].x+xOffset;
        vVertex[index].y = gQuad[k].y+yOffset;
        vVertex[index].u = gQuad[k].u;
        vVertex[index].v = gQuad[k].v;
        index++;
      }
      xOffset += 0.2f;
    }
    yOffset += 0.2f;
    xOffset = -0.95f;
  }
 
  GLuint vao;
  glGenVertexArrays(1,&vao);
  glBindVertexArray(vao);
 
  //Create a vertex buffer object
  glGenBuffers( 1, &gVertexBuffer );
  glBindBuffer( GL_ARRAY_BUFFER, gVertexBuffer );
  glBufferData( GL_ARRAY_BUFFER, sizeof(vVertex), vVertex, GL_STATIC_DRAW );
 
  //Specify vertex attributes for the shader
  glEnableVertexAttribArray(0);
  glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(SVertex2D), (GLvoid*)0 );
  glEnableVertexAttribArray(1);
  glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(SVertex2D), (GLvoid*)8 );
 
  //Create an element buffer
  glGenBuffers( 1, &gElementBuffer );
  glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, gElementBuffer );
  glBufferData( GL_ELEMENT_ARRAY_BUFFER, sizeof(gIndex), gIndex, GL_STATIC_DRAW );
 
  //Generate draw commands
  SDrawElementsCommand vDrawCommand[100];
  for( unsigned int i(0); i<100; ++i )
  {
    vDrawCommand[i].vertexCount = 6;
    vDrawCommand[i].instanceCount = 1;
    vDrawCommand[i].firstIndex = 0;
    vDrawCommand[i].baseVertex = i*4;
    vDrawCommand[i].baseInstance = i;
  }
 
  glGenBuffers(1, &gIndirectBuffer );
  glBindBuffer( GL_DRAW_INDIRECT_BUFFER, gIndirectBuffer );
  glBufferData( GL_DRAW_INDIRECT_BUFFER, sizeof(vDrawCommand), vDrawCommand, GL_STATIC_DRAW );
 
 
  //Generate an instanced vertex array to identify each draw call in the shader
  GLuint vDrawId[100];
  for( GLuint i(0); i<100; i++ )
  {
    vDrawId[i] = i;
  }
 
  glGenBuffers( 1, &gDrawIdBuffer );
  glBindBuffer( GL_ARRAY_BUFFER, gDrawIdBuffer );
  glBufferData( GL_ARRAY_BUFFER, sizeof(vDrawId), vDrawId, GL_STATIC_DRAW );
 
  glEnableVertexAttribArray(2);
  glVertexAttribIPointer(2, 1, GL_UNSIGNED_INT, 0, (GLvoid*)0 );
  glVertexAttribDivisor(2, 1);
 
  //NOTE: Instead of creating a new buffer for the drawID as we just did, 
  //we could use the "baseInstance" field from the gIndirectBuffer to 
  //provide the gl_DrawID to the shader. The code will look like this:
  /*
  glBindBuffer(GL_ARRAY_BUFFER, gIndirectBuffer );
  glEnableVertexAttribArray(2);
  glVertexAttribIPointer(2, 1, GL_UNSIGNED_INT, sizeof(SDrawElementsCommand), (void*)( 4 * sizeof(GLuint)) );
  glVertexAttribDivisor(2, 1);
  */
}
 
void GenerateArrayTexture()
{
  //Generate an array texture
  glGenTextures( 1, &gArrayTexture );
  glActiveTexture(GL_TEXTURE0);
  glBindTexture(GL_TEXTURE_2D_ARRAY, gArrayTexture);
 
  //Create storage for the texture. (100 layers of 1x1 texels)
  glTexStorage3D( GL_TEXTURE_2D_ARRAY,
                  1,                    //No mipmaps as textures are 1x1
                  GL_RGB8,              //Internal format
                  1, 1,                 //width,height
                  100                   //Number of layers
                );
 
  for( unsigned int i(0); i!=100;++i)
  {
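    //Choose a random color for the i-th image.
    //The casts avoid a narrowing conversion from int inside the braces.
    GLubyte color[3] = { static_cast<GLubyte>(rand()%255),
                         static_cast<GLubyte>(rand()%255),
                         static_cast<GLubyte>(rand()%255) };
 
    //Specify the i-th image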
    glTexSubImage3D( GL_TEXTURE_2D_ARRAY,
                     0,                     //Mipmap number
                     0,0,i,                 //xoffset, yoffset, zoffset
                     1,1,1,                 //width, height, depth
                     GL_RGB,                //format
                     GL_UNSIGNED_BYTE,      //type
                     color);                //pointer to data
  }
 
  glTexParameteri(GL_TEXTURE_2D_ARRAY,GL_TEXTURE_MIN_FILTER,GL_LINEAR);
  glTexParameteri(GL_TEXTURE_2D_ARRAY,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
  glTexParameteri(GL_TEXTURE_2D_ARRAY,GL_TEXTURE_WRAP_S,GL_CLAMP_TO_EDGE);
  glTexParameteri(GL_TEXTURE_2D_ARRAY,GL_TEXTURE_WRAP_T,GL_CLAMP_TO_EDGE);
}
 
GLuint CompileShaders(const GLchar** vertexShaderSource, const GLchar** fragmentShaderSource )
{
  //Compile vertex shader
  GLuint vertexShader( glCreateShader( GL_VERTEX_SHADER ) );
  glShaderSource( vertexShader, 1, vertexShaderSource, NULL );
  glCompileShader( vertexShader );
 
  //Compile fragment shader
  GLuint fragmentShader( glCreateShader( GL_FRAGMENT_SHADER ) );
  glShaderSource( fragmentShader, 1, fragmentShaderSource, NULL );
  glCompileShader( fragmentShader );
 
  //Link vertex and fragment shader together
  GLuint program( glCreateProgram() );
  glAttachShader( program, vertexShader );
  glAttachShader( program, fragmentShader );
  glLinkProgram( program );
 
  //Delete shaders objects
  glDeleteShader( vertexShader );
  glDeleteShader( fragmentShader );
 
  return program;
}
 
void Init(void)
{
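  //Check that the OpenGL version is at least 4.3, which is what the
  //"#version 430" shaders and glMultiDrawElementsIndirect require
  const GLubyte* glVersion( glGetString(GL_VERSION) );
  int major = glVersion[0] - '0';
  int minor = glVersion[2] - '0';
  if( major < 4 || ( major == 4 && minor < 3 ) )
  {
    std::cerr<<"ERROR: Minimum OpenGL version required for this demo is 4.3. Your current version is "<<major<<"."<<minor<<std::endl;
    exit(-1);
  }
 
  //Init glew and make sure the entry points were loaded
  if( glewInit() != GLEW_OK )
  {
    std::cerr<<"ERROR: Failed to initialize GLEW"<<std::endl;
    exit(-1);
  }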
 
  //Set clear color
  glClearColor(1.0, 1.0, 1.0, 0.0);
 
  //Create and bind the shader program
  gProgram = CompileShaders( gVertexShaderSource, gFragmentShaderSource );
  glUseProgram(gProgram);
 
  glUniform1i(0,0); //Sampler refers to texture unit 0
  
  GenerateGeometry();
  GenerateArrayTexture();
}
 
void Display()
{
  glClear( GL_COLOR_BUFFER_BIT );
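  glMultiDrawElementsIndirect( GL_TRIANGLES,      //mode
                               GL_UNSIGNED_INT,   //type of the indices
                               (GLvoid*)0,        //offset into the indirect buffer
                               100,               //number of draw commands
                               0 );               //stride (0 = tightly packed)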
 
  glutSwapBuffers();
}
 
void Quit()
{
  //Clean-up
  glDeleteProgram(gProgram);
  glDeleteBuffers( 1, &gVertexBuffer );
  glDeleteBuffers( 1, &gElementBuffer );
  glDeleteBuffers( 1, &gDrawIdBuffer );
  glDeleteBuffers( 1, &gIndirectBuffer );
  glDeleteTextures( 1, &gArrayTexture );
 
  //Exit application
  exit(0);
}
 
void OnKeyPress( unsigned char key, int x, int y )
{
  //'Esc' key
  if( key == 27 )
    Quit();
}
 
int main( int argc, char** argv )
{
  glutInit(&argc, argv);
  glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB );
  glutInitWindowSize(400,400);
  glutCreateWindow("MultiDrawIndirect Example");
  glutDisplayFunc(Display);  //freeglut requires a display callback
  glutIdleFunc(Display);
  glutKeyboardFunc( OnKeyPress );
 
  Init();
 
  //Enter the GLUT event loop
  glutMainLoop();
}
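In the sample every quad shares the same six indices, so firstIndex is always 0 and baseVertex is just i*4. When meshes of different sizes are packed into one vertex buffer and one element buffer, both offsets have to be accumulated. The following is a minimal sketch of that bookkeeping, assuming the meshes are appended one after another in both buffers; MeshRange, PackedDrawCommand and BuildCommands are hypothetical names that are not part of the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of one mesh inside the shared buffers (hypothetical type)
struct MeshRange
{
  unsigned int vertexCount;  //Vertices the mesh occupies in the vertex buffer
  unsigned int indexCount;   //Indices the mesh occupies in the element buffer
};

// Same field order as SDrawElementsCommand in the sample
struct PackedDrawCommand
{
  unsigned int vertexCount;
  unsigned int instanceCount;
  unsigned int firstIndex;
  unsigned int baseVertex;
  unsigned int baseInstance;
};

// Builds one draw command per mesh, accumulating the offsets as it goes
std::vector<PackedDrawCommand> BuildCommands( const std::vector<MeshRange>& meshes )
{
  std::vector<PackedDrawCommand> commands;
  unsigned int firstIndex(0);
  unsigned int baseVertex(0);
  for( std::size_t i(0); i != meshes.size(); ++i )
  {
    PackedDrawCommand cmd;
    cmd.vertexCount = meshes[i].indexCount;  //Number of indices to draw
    cmd.instanceCount = 1;
    cmd.firstIndex = firstIndex;             //Measured in indices, not bytes
    cmd.baseVertex = baseVertex;
    cmd.baseInstance = static_cast<unsigned int>(i);  //Doubles as the draw id
    commands.push_back( cmd );

    firstIndex += meshes[i].indexCount;
    baseVertex += meshes[i].vertexCount;
  }
  return commands;
}
```

The resulting vector could be uploaded to the indirect buffer in the same way as vDrawCommand in GenerateGeometry().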

OK, there are still some cool extensions, such as bindless and sparse textures, which I will try to cover if I find the time, but with these three extensions alone you can write efficient GL programs with very low driver overhead. I would also like to recommend this article, which has lots of interesting ideas on the topic.
