Monday, 15 December 2014

Stereoscopic rendering with a geometry shader

**Edit 16 Dec 2014 - Unfortunately doing this late at night makes you miss a beat or two.. turns out the difference between rendering stereo in two passes doesn't give as much difference as I hoped for. So little in fact that it is not worth the exercise. Still in case someone finds it useful I've updated this article with the relevant info. **

While my engine is still running on a slow frame rate, mostly because I'm sending way more then I need to the rendering pipeline, I wanted to run a little experiment.

Something I added a little while ago was support for stereoscopic rendering using a split screen approach. This because it works on my 3D TV.

My engine is running roughly at 10fps with the cityscape model (see my previous post), once I turn on the stereoscopic rendering this drops to below 6fps:

This isn't surprising as all the geometry is rendered twice, once for the left eye, then for the right eye nearly halving the frame rate. It's not exactly half because I'm using a deferred renderer.

The idea that I wanted to try out is offloading the stereoscopic logic into a geometry shader. This would allow me to render everything in one pass and cut out on a lot of overhead in parsing all the models.

The first step I took was to insert a simple passthrough geometry shader in the middle. I was surprised to find out that in normal rendering this actually dropped the frame rate to just over 9fps. I found out later that when I adjust the geometry shader to support both mono and stereo rendering the frame rate dropped further to 8fps.
It is thus worth having a separate shader that you use when rendering stereoscopic.

Now in the code below I'm putting the relevant parts from my shaders and it is a very simple single pass renderer without lighting, I haven't tested this code but it's purely meant as an example.

First up is our vertex shader. This becomes nearly a pass through shader as we're doing our projection in the geometry shader:

#version 330

layout (location=0) in vec3 vertices;
layout (location=1) in vec2 texcoords;

out VS_OUT {
  vec2 T;
} vs_out;

void main(void) {
  gl_Position = vec4(vertices, 1.0);
  vs_out.T = texcoords.st;
}

Then the magic happens in our geometry shader, note that we provide two model-view-projection matrices, one for our left eye and one for the right:

#version 330

layout(triangles) in;
layout(triangle_strip, max_vertices=6) out;

uniform mvp[2];

in VS_OUT {
  vec2 T;
} gs_in[];

out GS_OUT {
  vec2 S;
  vec2 T;
  float eye;
} gs_out;

void main(void) {
  // Emit our left eye triangle
  for (int i=0; i < 3; i++) {
    vec4 P = mvpMat[0] * gl_in[i].gl_Position;
    P.x = (P.x * 0.5) - (P.w * 0.5);

    gl_Position = P;
    gs_out.S = P.xy / P.w;
    gs_out.T = gs_in[i].T;
    gs_out.eye = -1.0;
    EmitVertex();
  }
  EndPrimitive();


  // Emit our right eye triangle
  for (int i=0; i < 3; i++) {
    vec4 P = mvpMat[0] * gl_in[i].gl_Position;
    P.x = (P.x * 0.5) + (P.w * 0.5);

    gl_Position = P;
    gs_out.S = P.xy / P.w;
    gs_out.T = gs_in[i].T;
    gs_out.eye = 1.0;
    EmitVertex();
  }
  EndPrimitive();
}

Note, I had to do three changes here:
- set max_vertices=6, this is not the number vertices for a single primitive but the total number of vertices you potentially emit
- for the left eye I had to subtract 0.5, note also that I multiply this by P.w as the hardware will divide our X by W
- output S instead of relying on gl_FragCoord, looks like its deprecated in our fragment shader

Now we have two triangles being output, one for the left side using the left eye matrix and one for the right. The problem we still stuck with are triangles for the left eye that cross the border into the right half of the screen and visa versa. The clipping done by OpenGL only takes care what is offscreen. We're going to have to add a little extra sauce into our fragment shader:

#version 330

uniform float viewportWidth;
uniform sampler2D texturemap;

in GS_OUT {
  vec2 S;
  vec2 T;
  float eye;
} fs_in;

out vec4 color;

void main(void) {
  // discard our pixel if we're rendering for the wrong eye...
  if ((fs_in.eye < 0.0) && (fs_in.S.x > 0.0)) {
    discard;
  } else if ((fs_in.eye > 0.0) && (fs_in.S.x < 0.0)) {
    discard;
  }

  color = texture(texturemap, fs_in.T.st);
}

And our result?

We are up to 6.0 fps, well that was hardly worth the effort. One could argue it is a nicer way to do the stereo rendering but still.

There is room for improvement however and in that there may lie some merit in continuing with the approach.

At the moment we are sending many triangles for the left eye to the right side of the screen only to discard all the fragments. We could add a relative simple check to see if all vertices of the triangle are "on screen" for that eye. Only if at least one is on screen we output the triangle, else we skip it.

I've also been thinking about using a similar trick to speed up building my shadow maps. In theory we could render all required shadow maps in a single pass. More on that later.

The next step for me is changing the way I've got my VAOs setup and start looking into occlusion checks to speed things up by rendering a lot less of the scene then I am right now.

2 comments:

  1. You could do Sutherland Hodgman clipping of your triangles in geometry shader to confine them to a particular half of the screen instead of doing so in fragment shader.

    But also at this point you probably have to concur that your engine may not be CPU limited, but GPU limited. Good thing is, you can usually throw more GPU at things (they scale) while you can't throw more CPU at things, as a high-end CPU is barely faster than a low-end one.

    ReplyDelete
    Replies
    1. Hey Siana, thanks for the input, I didn't see your reaction timely, not sure why blogger doesn't give me a notification.

      Yes there is a lot of room for improvement in the geometry shader, I haven't spend any time on this for while now. Stereoscopic was a gimmick at the time I added it in though it is on my list of things to improve. There are far more important improvements to deal with first though such as organising the 3D scene better so I'm not rendering half of the offscreen content.

      Delete