Tuesday 19 April 2016

A simple shader preprocessor (part 29)

As I started planning out the additions for my deferred lighting renderer I realised I could no longer postpone implementing at least a basic shader preprocessor.

While some parts of the code can be centralised, other parts need to be duplicated even further, and having to fix the same issue in multiple places makes things harder and harder to maintain.

For my goals however I don't need the preprocessor to do much so we can keep everything very simple and we'll limit the functionality to the following:

  • support for a #include to insert the text from a file into our shader
  • supplying a number of "defines" with which we can trigger logic
  • very basic #ifdef, #ifndef and #else logic that use these defines to include or exclude parts of the shader code

Changes to our system library


I was thinking about putting most of this code in our system.h file but decided against that for now. I may yet change this in the future. For now one support function has been added here:
// gets the portion of the line up to the specified delimiter(s)
// return NULL on failure or if there is no text
// returns string on success, calling function is responsible for freeing the text
char * delimitText(const char *pText, const char *pDelimiters) {
  int    len = 0;
  char * result = NULL;
  bool   found = false;
  int    delimiterCount;

  delimiterCount = strlen(pDelimiters) + 1; // always include our trailing 0 as a delimiter ;)

  while (!found) {
    int pos = 0;
    while ((!found) && (pos < delimiterCount)) {
      if (pText[len] == pDelimiters[pos]) {
        found = true;
      };
      pos++;
    };

    if (!found) {
      len++;
    };
  };

  if (len != 0) {
    result = malloc(len + 1);
    if (result != NULL) {
      memcpy(result, pText, len);
      result[len] = 0;
    };
  };

  return result;
};
This function splits off the first part of the text pointed to by pText from the start until it detects one of the delimiters or the end of the string.
This is fairly similar to the code we wrote before to read out material and object files line by line but without using our varchar implementation.
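As a rough sketch of the same idea, the standard strcspn function already counts the characters up to the first delimiter (or the end of the string), so the search loop can be replaced by a single call. The name firstToken is hypothetical and not part of the tutorial source; it is only meant to show what delimitText returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of the tutorial source): returns a newly
// allocated copy of the text before the first delimiter, or NULL if that
// text is empty or the allocation fails. The caller frees the result.
char *firstToken(const char *text, const char *delimiters) {
  assert(text != NULL && delimiters != NULL);

  // strcspn counts the leading characters that are NOT in delimiters and
  // stops at the terminating 0 on its own
  size_t len = strcspn(text, delimiters);
  if (len == 0) {
    return NULL;
  }

  char *result = malloc(len + 1);
  if (result != NULL) {
    memcpy(result, text, len);
    result[len] = 0;
  }
  return result;
}
```

Calling firstToken("textured reflect", " \r\n") gives back a copy of "textured", while a string that starts with a delimiter gives NULL, which is why the callers in this post skip one character and try again in that case.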

Changes to our varchar library


We are going to use varchar.h but in combination with a linked list to store our defines in. For this I've added 3 new functions:
// list container for varchars
// llist * strings = newVarcharList()
llist * newVarcharList() {
  llist * varcharList = newLlist((dataRetainFunc) varcharRetain, (dataFreeFunc) varcharRelease);
  return varcharList;
};
This first function simply returns a linked list setup to accept varchar objects.
// list container for varchars created by processing a string
// empty strings will not be added but duplicate strings will be
// llist * strings = newVCListFromString(text, delimiters)
llist * newVCListFromString(const char * pText, const char * pDelimiters) {
  llist * varcharList = newVarcharList();

  if (varcharList != NULL) {
    int    pos = 0;

    while (pText[pos] != 0) {
      // find our next line
      char * line = delimitText(pText + pos, pDelimiters);
      if (line != NULL) {
        int len = strlen(line);

        varchar * addChar = newVarchar();
        if (addChar != NULL) {
          varcharAppend(addChar, line, len);

          llistAddTo(varcharList, addChar);
        };

        if (pText[pos + len] != 0) {
          // skip our newline character
          pos += len + 1;
        } else {
          // we found our ending
          pos += len;
        };

        free(line);
      } else {
        // skip any empty line...
        pos++;
      };
    };
  };

  return varcharList;
};
This function uses our new delimitText function to pull a given string apart and adds each word in the string as an entry in a new linked list.
// check if our list contains a string
bool vclistContains(llist * pVCList, const char * pText) {
  if ((pVCList != NULL) && (pText != NULL)) {
    llistNode * node = pVCList->first;

    while (node != NULL) {
      varchar * text = (varchar *) node->data;

      if (varcharCmp(text, pText) == 0) {
        return true;
      };

      node = node->next;
    };
  };

  // not found
  return false;
};
And finally a function that checks if a given word is present in our linked list.
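To make the combined effect of splitting and searching concrete, here is a minimal sketch that answers the same question directly on the defines string, without building a list first. The function hasDefine is a hypothetical stand-in, not something from the tutorial source, and it assumes the same " \r\n" delimiters that newShader passes in.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: checks whether name appears as a whole word in a
// delimiter-separated defines string such as "textured reflect".
bool hasDefine(const char *defines, const char *name) {
  const char *delimiters = " \r\n";
  assert(defines != NULL && name != NULL);

  size_t nameLen = strlen(name);
  const char *pos = defines;

  while (*pos != 0) {
    // skip any run of delimiters, then measure the next word
    pos += strspn(pos, delimiters);
    size_t wordLen = strcspn(pos, delimiters);

    // only a word of exactly the same length and content counts as a match
    if ((wordLen > 0) && (wordLen == nameLen) && (memcmp(pos, name, wordLen) == 0)) {
      return true;
    }
    pos += wordLen;
  }

  return false;
}
```

The linked list version in this post has the advantage that the string is only parsed once per shader, while a sketch like this re-scans the string on every lookup.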

Changes to our shader library


The real implementation can be found in our shader library. We've added a new parameter to our newShader function so we can pass it the defines we want to use for that shader:
shaderInfo * newShader(const char *pName, const char * pVertexShader, const char * pTessControlShader, const char * pTessEvalShader, const char * pGeoShader, const char * pFragmentShader, const char *pDefines) {
  shaderInfo * newshader = (shaderInfo *)malloc(sizeof(shaderInfo));
  if (newshader != NULL) {
    llist * defines;
    ...
    // convert our defines
    defines = newVCListFromString(pDefines, " \r\n");

    // attempt to load our shader by name
    if (pVertexShader != NULL) {
      shaders[count] = shaderLoad(GL_VERTEX_SHADER, pVertexShader, defines);
      if (shaders[count] != NO_SHADER) count++;      
    };
    ...
    // no longer need our defines
    if (defines != NULL) {
      llistFree(defines);
    };
    ...
  return newshader;
};
We first convert our new parameter pDefines into a linked list of varchars by calling our new newVCListFromString function.
We then pass our new linked list to each shaderLoad call so it can be used by our preprocessor.
Finally we deallocate our linked list and all the varchars held within.

The only change in shaderLoad is that it no longer calls loadFile directly but instead calls shaderLoadAndPreprocess:
varchar * shaderLoadAndPreprocess(const char *pName, llist * pDefines) {
  varchar * shaderText = NULL;

  // create a new varchar object for our shader text
  shaderText = newVarchar();
  if (shaderText != NULL) {
    // load the contents of our file
    char * fileText = loadFile(shaderPath, pName);

    if (fileText != NULL) {
      // now loop through our text line by line (we do this with a copy of our pointer)
      int    pos = 0;
      bool   addLines = true;
      int    ifMode = 0; // 0 is not in if, 1 = true condition not found, 2 = true condition found

      while (fileText[pos] != 0) {
        // find our next line
        char * line = delimitText(fileText + pos, "\n\r");

        // found a non-empty line?
        if (line != NULL) {
          int len = strlen(line);

          // check for any of our preprocessor checks
          if (strncmp(line, "#include \"", 10) == 0) { // strncmp stops at the end of a short line
            if (addLines) {
              // include this file
              char * includeName = delimitText(line + 10, "\"");
              if (includeName != NULL) {
                varchar * includeText = shaderLoadAndPreprocess(includeName, pDefines);
                if (includeText != NULL) {
                  // and append it....
                  varcharAppend(shaderText, includeText->text, includeText->len);
                  varcharRelease(includeText);
                };
                free(includeName);
              };
            };
          } else if (strncmp(line, "#ifdef ", 7) == 0) {
            if (ifMode == 0) {
              char * ifdefined;

              ifMode = 1; // assume not defined....
              ifdefined = delimitText(line + 7, " ");
              if (ifdefined != NULL) {
                // check if our define is in our list of defines
                if (vclistContains(pDefines, ifdefined)) {
                  ifMode = 2;
                };
                free(ifdefined);
              };
              addLines = (ifMode == 2);              
            } else {
              errorlog(SHADER_ERR_NESTED, "Can't nest defines in shaders");
            };
          } else if (strncmp(line, "#ifndef ", 8) == 0) {
            if (ifMode == 0) {
              char * ifnotdefined;

              ifMode = 1; // assume our condition is false....
              ifnotdefined = delimitText(line + 8, " "); // "#ifndef " is 8 characters long
              if (ifnotdefined != NULL) {
                // check if our define is not in our list of defines
                if (vclistContains(pDefines, ifnotdefined) == false) {
                  ifMode = 2;
                };
                free(ifnotdefined);
              };
              addLines = (ifMode == 2);              
            } else {
              errorlog(SHADER_ERR_NESTED, "Can't nest defines in shaders");
            };
          } else if (strncmp(line, "#else", 5) == 0) {
            if (ifMode == 1) {
              ifMode = 2;
              addLines = true;
            } else {
              addLines = false;
            };
          } else if (strncmp(line, "#endif", 6) == 0) {
            addLines = true;
            ifMode = 0;
          } else if (addLines) {
            // add our line
            varcharAppend(shaderText, line, len);
            // add our line delimiter
            varcharAppend(shaderText, "\n", 1);
          };

          if (fileText[pos + len] != 0) {
            // skip our newline character
            pos += len + 1;
          } else {
            // we found our ending
            pos += len;
          };

          // don't forget to free our line!!!
          free (line);
        } else {
          // skip empty lines...
          pos++;
        };
      };

      // free the text we've loaded, what we need has now been copied into shaderText
      free(fileText);
    };

    if (shaderText->text == NULL) {
      varcharRelease(shaderText);
      shaderText = NULL;
    };
  };

  return shaderText;
};
I'm not going to detail each and every section; I hope the comments do a good enough job of that. In a nutshell, however, we start by creating a new varchar variable called shaderText, which is what we'll end up returning. This means that our shaderLoad function also has a small change to work with a varchar instead of a char pointer as a result.
After this we load the contents of our shader file into a variable called fileText but instead of using this directly we use delimitText to loop through our shader text one line at a time.
For each line we check if it starts with one of our preprocessor commands and if so handle the special logic associated with it. If not we simply add our line to our shaderText variable.

#include is the first preprocessor command we handle. It takes the filename between the quotes and attempts to load that file by calling shaderLoadAndPreprocess recursively, so the included file is preprocessed with the same defines.

This is followed by the code that interprets our #ifdef, #ifndef, #else and #endif preprocessor commands. These basically check if the given define is present in our linked list. They toggle the values of ifMode and addLines that control whether we ignore text in our shader file or add the lines to our shaderText.
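The interplay between ifMode and addLines is easier to follow when it is pulled out of the parsing code. The following is a minimal sketch of just that state machine; the ppState struct and the pp* function names are hypothetical and only mirror the transitions made in shaderLoadAndPreprocess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical names; the values mirror ifMode in shaderLoadAndPreprocess
enum { IF_NONE = 0, IF_WAITING = 1, IF_TAKEN = 2 };

typedef struct ppState {
  int  ifMode;    // one of the IF_* values above
  bool addLines;  // are we currently copying lines to the output?
} ppState;

void ppInit(ppState *state) {
  assert(state != NULL);
  state->ifMode = IF_NONE;
  state->addLines = true;
}

// handles #ifdef (pass "is defined") and #ifndef (pass "is not defined"),
// returns false if we are already inside a block as nesting is not supported
bool ppBeginIf(ppState *state, bool condition) {
  assert(state != NULL);
  if (state->ifMode != IF_NONE) {
    return false;
  }
  state->ifMode = condition ? IF_TAKEN : IF_WAITING;
  state->addLines = condition;
  return true;
}

// #else only enables output if the first branch was not taken
void ppElse(ppState *state) {
  assert(state != NULL);
  if (state->ifMode == IF_WAITING) {
    state->ifMode = IF_TAKEN;
    state->addLines = true;
  } else {
    state->addLines = false;
  }
}

// #endif always returns us to copying lines
void ppEndIf(ppState *state) {
  assert(state != NULL);
  state->ifMode = IF_NONE;
  state->addLines = true;
}
```

Because there is only a single ifMode value rather than a stack, a second #ifdef inside an open block cannot be tracked, which is why the code above reports an error for nested blocks.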

Changes to our shaders


I've made two changes to our shaders. The first is that I've created a new shader file called "shadowmap.fs" that contains our samplePCF, shadow and shadowTest functions, and we use #include in the various fragment shaders where we need these functions.

The second change is that I've combined our flatshader.fs, textured.fs and reflect.fs fragment shaders into a single standard.fs file that looks as follows:
#version 330

// info about our light
uniform vec3      lightPos;                         // position of our light after view matrix was applied
uniform float     ambient = 0.3;      // ambient factor
uniform vec3      lightcol = vec3(1.0, 1.0, 1.0);   // color of the light of our sun

// info about our material
uniform float     alpha = 1.0;                      // alpha for our material
#ifdef textured
uniform sampler2D textureMap;                       // our texture map
#else
uniform vec3      matColor = vec3(0.8, 0.8, 0.8);   // color of our material
#endif
uniform vec3      matSpecColor = vec3(1.0, 1.0, 1.0); // specular color of our material
uniform float     shininess = 100.0;                // shininess

#ifdef reflect
uniform sampler2D reflectMap;                       // our reflection map
#endif

// these are in world coordinates
in vec3           E;                                // normalized vector pointing from eye to V
in vec3           N;                                // normal vector for our fragment

// these in view
in vec4           V;                                // position of fragment after modelView matrix was applied
in vec3           Nv;                               // normal vector for our fragment (inc view matrix)
in vec2           T;                                // coordinates for this fragment within our texture map
in vec4           Vs[3];                            // our shadow map coordinates
out vec4          fragcolor;                        // our output color

#include "shadowmap.fs"

void main() {
#ifdef textured
  // start by getting our color from our texture
  fragcolor = texture(textureMap, T);  
  fragcolor.a = fragcolor.a * alpha;
  if (fragcolor.a < 0.2) {
    discard;
  };
#else
  // Just set our color
  fragcolor = vec4(matColor, alpha);
#endif

  // Get the normalized directional vector between our surface position and our light position
  vec3 L = normalize(lightPos - V.xyz);
  
  // We calculate our ambient color
  vec3  ambientColor = fragcolor.rgb * lightcol * ambient;

  // Check our shadow map
  float shadowFactor = shadow(Vs[0], Vs[1], Vs[2]);
  
  // We calculate our diffuse color, we calculate our dot product between our normal and light
  // direction, note that both were adjusted by our view matrix so they should nicely line up
  float NdotL = max(0.0, dot(Nv, L));
  
  // and calculate our color after lighting is applied
  vec3 diffuseColor = fragcolor.rgb * lightcol * (1.0 - ambient) * NdotL * shadowFactor;

  // now for our specular lighting
  vec3 specColor = vec3(0.0);
  if ((NdotL != 0.0) && (shininess != 0.0)) {
    // slightly different way to calculate our specular highlight
    vec3 halfVector = normalize(L - normalize(V.xyz));
    float nxHalf = max(0.0, dot(Nv, halfVector));
    float specPower = pow(nxHalf, shininess);
  
    specColor = lightcol * matSpecColor * specPower * shadowFactor;
  };

#ifdef reflect
  // add in our reflection, this is one of the few places where world coordinates are paramount. 
  vec3  r = reflect(E, N);
  vec2  rc = vec2((r.x + 1.0) / 4.0, (r.y + 1.0) / 2.0);
  if (r.z < 0.0) {
   r.x = 1.0 - r.x;
  };
  vec3  reflColor = texture(reflectMap, rc).rgb;

  // and add them all together
  fragcolor = vec4(clamp(ambientColor+diffuseColor+specColor+reflColor, 0.0, 1.0), fragcolor.a);
#else
  // and add them all together
  fragcolor = vec4(clamp(ambientColor+diffuseColor+specColor, 0.0, 1.0), fragcolor.a);
#endif
}
Note the inclusion of our #ifdef blocks to change between our various bits of logic while reusing code that is the same in all three shaders.
We can now change our shader loading code in engine.h to the following:
  colorShader = newShader("flatcolor", "standard.vs", NULL, NULL, NULL, "standard.fs", "");
  texturedShader = newShader("textured", "standard.vs", NULL, NULL, NULL, "standard.fs", "textured");
  reflectShader = newShader("reflect", "standard.vs", NULL, NULL, NULL, "standard.fs", "reflect");
If we need it we could very quickly add a fourth shader that combines texture mapping and reflection mapping by simply passing "textured reflect" as our defines.

In the same way I've combined our shadow shaders into a single shader file.

Obviously there is a lot of room for improvement here, but it is a start and enough to keep us going for a little bit longer.

Download the source here

What's next?


Now we're ready to start working on our deferred shader.



Saturday 9 April 2016

Shadow maps #3 (part 28)

Ok, time for the last part of implementing basic shadow maps. The technique we're going to look at in this post is called cascaded shadow maps. It is mostly used to improve the quality of shadows for lights that affect large areas, such as our sunlight. The problem we've had so far is that a high quality shadow map will only produce shadows in a small area, while a shadow map that covers a large area will be of such low quality that things get very blocky.

While using a smoothing technique like we did in our last post does improve this somewhat, shadows still do not look very good up close. The lower quality shadow maps are fine for things that are further away.

Now there are several techniques that can improve this, each with its own strong points and weak points. One alternative I'd like to mention is altering the projection matrix so the projection is skewed based on the distance to the camera, ensuring we have higher detail closer to the camera in a single shadow map.
I'd also like to point to a completely different technique called shadow volumes, I've not implemented those myself but reading about them I'm interested to try some day. They seem to give incredible results but they may be more difficult to implement if you have loads of moving objects. I'm no expert in them yet so I'll refrain from commenting too much.

The technique we'll be using is where we simply render multiple shadow maps and pick the one best suited to what we're rendering. So we have a high quality shadow map for shadows cast close to the camera, we have a medium quality shadow map for things further away, and we have a low quality shadow map for things even further out. The screenshot below shows the three maps where I've changed the color of the shadow cast to highlight the transitions between the shadow maps (I left the code that produces this in our terrain fragment shader but disabled it, if you want to play around with it):


Now I'm keeping things simple here and as a result we're adding more overhead than we need.
First off, where there is overlap in the shadow maps we're rendering a bunch of detail into the lower quality shadow maps that will never be used. We could use a stencil buffer to cut that out but I'm not sure how much that would improve things as we're really not doing anything in the fragment shader anyway. Another improvement I've thought about is using our bounding box checking logic to exclude anything that falls fully within the overlap space, that might make a noticeable difference.
Second, depending on our camera position and the angle of our sun we may not need the other shadow maps at all.
Third, I already mentioned this in my previous posts and this ties into the second point: we're centering our shadow maps on the camera position, so in the worst case half of our shadow maps will never be used. Adjusting our lookat point for our shadows may allow us to cover a greater area with our higher detail shadow map.

These are all issues to deal with later. It's worth noting though that with the changes we're making today the frame rate on my little MacBook Pro has suffered: while we were rendering at a comfortable 60fps unless we moved too high in our scene, it has dropped to 30 to 40fps at the moment.
I have added one small enhancement: I only re-render our shadow maps if our lighting direction has changed (it is usually static) or if our lookat point has moved more than a set distance (we do this by rounding our lookat position).

Last but not least, I've added a small bit of code to react to the - and = (+) keys and move the position of the sun. There is no protection for "night time" so we actually end up lighting the scene from below.

Going from 1 to 3 shadow maps


Obviously we need to add support for our 3 levels of shadow maps first. This starts with adjusting our lightsource structure:
// and a structure to hold information about a light (temporarily moved here)
typedef struct lightSource {
  float             ambient;          // ambient factor for our light
  vec3              position;         // position of our light
  vec3              adjPosition;      // position of our light with view matrix applied
  bool              shadowRebuild[3]; // do we need to rebuild our shadow map?
  vec3              shadowLA[3];      // remembering our lookat point for our shadow map
  texturemap *      shadowMap[3];     // shadowmaps for this light
  mat4              shadowMat[3];     // view-projection matrices for this light
} lightSource;
Note that as we're not yet dealing with anything but our sun as a lightsource I'm not putting any code in yet to support a flexible number of shadow maps.

So we now have 3 shadow maps and 3 shadow matrices to go with them. There is also a set of flags that determine if shadow maps need to be rebuilt and a set of lookat coordinates that we can use to check if we've moved our camera far enough to need to rebuild our shadow maps.

It is important to realise at this point that this won't be enough once we start moving objects around. The easiest fix is to set our rebuild flags whenever something moves, but we may as well remove this check altogether once things start moving around. A better solution would be to render our shadow maps with all the static objects only, and either overlay or add in objects that move around as we render our scenes. That's something for much later however.

Similarly our shader library is enhanced to support the 3 shadow maps as well:
...
// structure for encapsulating a shader, note that not all ids need to be present (would be logical to call this struct shader but it's already used in some of the support libraries...)
typedef struct shaderInfo {
  ...
  GLint   shadowMapId[3];           // ID of our shadow maps
  GLint   shadowMatId[3];           // ID for our shadow matrices
  ...
} shaderInfo;
...
void shaderSetProgram(shaderInfo * pShader, GLuint pProgram) {
  ...
  for (i = 0; i < 3; i++) {
    sprintf(uName, "shadowMap[%d]", i);
    pShader->shadowMapId[i] = glGetUniformLocation(pShader->program, uName);
    if (pShader->shadowMapId[i] < 0) {
      errorlog(pShader->shadowMapId[i], "Unknown uniform %s:%s", pShader->name, uName);
    };
    sprintf(uName, "shadowMat[%d]", i);
    pShader->shadowMatId[i] = glGetUniformLocation(pShader->program, uName);
    if (pShader->shadowMatId[i] < 0) {
      errorlog(pShader->shadowMatId[i], "Unknown uniform %s:%s", pShader->name, uName);
    };
  };
  ...

And we need a similar change to our materials library to inform our shaders of the 3 shadow maps:
...
bool matSelectProgram(material * pMat, shaderMatrices * pMatrices, lightSource * pLight) {
  ...
  for (i = 0; i < 3; i++) {
    if (pMat->matShader->shadowMapId[i] >= 0) {
      glActiveTexture(GL_TEXTURE0 + texture);
      if (pLight->shadowMap[i] == NULL) {
        glBindTexture(GL_TEXTURE_2D, 0);      
      } else {
        glBindTexture(GL_TEXTURE_2D, pLight->shadowMap[i]->textureId);
      }
      glUniform1i(pMat->matShader->shadowMapId[i], texture); 
      texture++;   
    };
    if (pMat->matShader->shadowMatId[i] >= 0) {
      glUniformMatrix4fv(pMat->matShader->shadowMatId[i], 1, false, (const GLfloat *) pLight->shadowMat[i].m);
    };
  };
  ...
These changes should all be pretty straightforward so far.

Rendering our 3 shadow maps


Rendering 3 maps instead of 1 is simply a matter of calling our shadow map code 3 times. For this to work I've changed our renderShadowMapForSun function so I can pass it parameters to let it know which shadow map we're rendering and at what level of detail we want it. I'm just adding the start of the code here as most of the function has stayed the same from our first part. Have a look at the full source on GitHub to see the other changes needed:
...
// render our shadow map
// we'll place this in our engine.h for now but we'll soon make this part of our lighting library
void renderShadowMapForSun(bool * pRebuild, texturemap * pShadowMap, vec3 * pLookat, mat4 * pShadowMat, int pResolution, float pSize) {
  vec3 newLookat;

  // prevent rebuilds if we only move a tiny bit....
  newLookat.x = camera_eye.x - fmod(camera_eye.x, pSize/100.0);
  newLookat.y = camera_eye.y - fmod(camera_eye.y, pSize/100.0);
  newLookat.z = camera_eye.z - fmod(camera_eye.z, pSize/100.0);

  if ((pLookat->x != newLookat.x) || (pLookat->y != newLookat.y) || (pLookat->z != newLookat.z)) {
    vec3Copy(pLookat, &newLookat);
    *pRebuild = true;
  };

  // we'll initialize a shadow map for our sun
  if (*pRebuild == false) {
    // reuse it as is...
  } else if (tmapRenderToShadowMap(pShadowMap, pResolution, pResolution)) {
  ...
I'm highlighting this part of the code because it shows the changes we made to limit the number of times we rebuild our shadow maps. We round our lookat position based on the area our shadow map covers. For our closest shadow map we may only move our camera 15 units (1500 / 100) before we need to rebuild it, while for our lowest detail map it will be 100 units (10000 / 100). Obviously if our light position changes we set our rebuild flags to true and we rebuild all shadow maps.
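The snapping itself can be tried out in isolation. This is a minimal sketch with hypothetical helper names (snapToStep and needsRebuild are not in the tutorial source). It uses integer truncation instead of fmod so it has no dependency on the math library; for moderately sized values it is intended to behave like value - fmod(value, step), as both round towards zero.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical helper: snaps value towards zero onto a multiple of step.
// Assumes value / step fits in a long.
float snapToStep(float value, float step) {
  assert(step > 0.0f);
  long cells = (long)(value / step); // the cast truncates towards zero
  return (float)cells * step;
}

// true if moving from oldPos to newPos ends up in a different snap cell,
// which is the condition under which the shadow map would be rebuilt
bool needsRebuild(float oldPos, float newPos, float step) {
  return snapToStep(oldPos, step) != snapToStep(newPos, step);
}
```

With a step of 15, moving from 31 to 44 stays inside the same cell (both snap to 30) so nothing is rebuilt, while moving from 44 to 46 crosses into the next cell (45) and triggers a rebuild.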

Finally we need to call this method 3 times which we do in our engineRender method:
  ...
  if (pMode != 2) {
    renderShadowMapForSun(&sun.shadowRebuild[0], sun.shadowMap[0], &sun.shadowLA[0], &sun.shadowMat[0], 4096, 1500);
    renderShadowMapForSun(&sun.shadowRebuild[1], sun.shadowMap[1], &sun.shadowLA[1], &sun.shadowMat[1], 4096, 3000);
    renderShadowMapForSun(&sun.shadowRebuild[2], sun.shadowMap[2], &sun.shadowLA[2], &sun.shadowMat[2], 2048, 10000);
  };
  ...
So our highest quality shadow map is a 4096x4096 map that covers an area of 3000x3000 units (2*1500).
Our lowest quality shadow map is a 2048x2048 map that covers an area of 20000x20000 units.
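To compare the three maps it helps to express them as world units per texel. A minimal sketch, assuming (as the 2*1500 above suggests) that the last parameter is half the width of the covered area; unitsPerTexel is a hypothetical helper:

```c
#include <assert.h>

// Hypothetical helper: world units covered by one texel of a square shadow map.
// halfSize is the pSize argument (the map spans 2 * halfSize units) and
// resolution is the width of the map in texels.
float unitsPerTexel(float halfSize, int resolution) {
  assert(resolution > 0);
  return (2.0f * halfSize) / (float)resolution;
}
```

For the three calls above this works out to roughly 0.73, 1.46 and 9.77 units per texel, so each step outwards is noticeably coarser than the previous one.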

Note that this is where the color coded rendering of the shadow maps does come in handy for tweaking what works well, as the right size for our maps depends a lot on the sizes of your objects and what you consider to be close or far.

Changing our shaders

The final ingredient is changing our shaders. Again at this stage we need to update all our shaders but I'm only going to look at the changes once.

In our vertex shader (and in our tessellation evaluation shader for our terrain) we now need to calculate 3 projected positions for our shadow maps. In this case I'm fully writing them out as it's faster than looping:
...
uniform mat4      shadowMat[3];   // our shadows view-projection matrix
...
// shadow map
out vec4          Vs[3];          // our shadow map coordinates

void main(void) {
  ...
  // our shadow map coordinates
  Vs[0] = shadowMat[0] * model * V;
  Vs[1] = shadowMat[1] * model * V;
  Vs[2] = shadowMat[2] * model * V;

  ...
Our fragment shaders need to be adjusted as well. First off we need to change our samplePCF function so it checks a specific shadow map:
...
uniform sampler2D shadowMap[3];                     // our shadow map
in vec4           Vs[3];                            // our shadow map coordinates
...
float samplePCF(float pZ, vec2 pCoords, int pMap, int pSamples) {
  float bias = 0.0000005; // our bias
  float result = 1.0; // our result
  float deduct = 0.8 / float(pSamples); // deduct if we're in shadow

  for (int i = 0; i < pSamples; i++) {
    float Depth = texture(shadowMap[pMap], pCoords + offsets[i]).x;
    if (pZ - bias > Depth) {
      result -= deduct;
    };  
  };
    
  return result;
}
...
And finally we need to change our shadow function to figure out which shadow map to use.

We simply start with our highest quality shadow map and if our projection coordinates are within bounds we use it, else we check a level up:
...
// check if we're in shadow..
float shadow(vec4 pVs0, vec4 pVs1, vec4 pVs2) {
  float factor;
  
  vec3 Proj = pVs0.xyz / pVs0.w;
  if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
    // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
    factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 0, 9);
  } else {
    vec3 Proj = pVs1.xyz / pVs1.w;
    if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
      // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
      factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 1, 4);
    } else {
      vec3 Proj = pVs2.xyz / pVs2.w;
      if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
        // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
        factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 2, 1);
      } else {
        factor = 1.0;
      };
    };
  };

  return factor;
}

void main() {
  ...
  // Check our shadow map
  float shadowFactor = shadow(Vs[0], Vs[1], Vs[2]);
  ...
And that's it.
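The selection logic in the shadow function can also be written as a loop, which makes the fall-through from one map to the next easier to see. This is a CPU-side sketch in C rather than GLSL; the vec4f type and selectCascade function are hypothetical, and it assumes w is never zero.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a GLSL vec4
typedef struct vec4f {
  float x, y, z, w;
} vec4f;

static float absf(float v) {
  return v < 0.0f ? -v : v;
}

// returns the index of the first (highest quality) shadow map whose projected
// coordinates fall within bounds, or -1 if none of them cover this position
int selectCascade(const vec4f *coords, int count) {
  assert(coords != NULL);

  for (int i = 0; i < count; i++) {
    // same divide as Proj = pVs.xyz / pVs.w in the shader
    float x = coords[i].x / coords[i].w;
    float y = coords[i].y / coords[i].w;
    float z = coords[i].z / coords[i].w;

    if ((absf(x) < 0.99f) && (absf(y) < 0.99f) && (absf(z) < 0.99f)) {
      return i;
    }
  }

  return -1;
}
```

A return value of -1 corresponds to the final else branch in the shader, where the factor is simply 1.0 and no shadow is applied.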

For this part I've created a Tag in Github instead of a branch. We'll see which works better.
Download the source code

And a quick video showing the end result:


What's next


I think I've gone as far as I want with shadows for now. The next part may take a while before I get it done as there is a lot involved in rewriting our code so far to a deferred lighting model, but that's what we'll be doing next.

After that we'll start looking at adding additional lights and looking at other shading techniques.
Somewhere in the middle we'll also start looking at adding a simple pre-processor to our shaders so we can start reusing some code and make our shaders easier to put together.


Sunday 3 April 2016

Shadow maps #2 (part 27)

Okay, just a small one today.

When we look at texture mapping, our GPU nicely interpolates the colors between pixels so our textures don't look very blocky when we come too close.
It does the same when we query our shadow map, however there we're just interpolating our Z value. When we then apply that to our rendering we still end up with a very blocky result:

Obviously in this case it is clear our shadow map simply doesn't have the resolution we require to get nice looking shadows for our trees. We'll be looking at resolving that, at least somewhat, in our next post, but we'll never get rid of this completely unless we're willing to waste GPU time and memory on really large shadow maps.

Instead we'll take a page out of our texture mapping book and smooth our shadows. The algorithm we're going to use is commonly known as Percentage Closer Filtering (PCF).

Now note that this technique isn't just for smoothing out shadows to get rid of our blockiness. Another use for it is to soften the shadows more as the distance between the surface and the shadow casting object grows as more ambient light is able to illuminate the surface. We won't be going into that today though.

The algorithm itself requires us to obtain values for surrounding pixels in our shadow map and obtain an average for our shadow. The more of those pixels are in shadow, the darker we render our surface.

To enable doing this we add a table of offsets to our shader:
// Precision ring
//      9 9 9
//      9 1 2
//      9 4 4
const vec2 offsets[] = vec2[](
  vec2( 0.0000,  0.0000),
  vec2( 0.0005,  0.0000),
  vec2( 0.0000,  0.0005),
  vec2( 0.0005,  0.0005),
  vec2(-0.0005,  0.0005),
  vec2(-0.0005,  0.0000),
  vec2(-0.0005, -0.0005),
  vec2( 0.0000, -0.0005),
  vec2( 0.0005, -0.0005)
);
Note that this table gives us the option to use 1, 2, 4 and 9 samples. We could add more rings if we wish to go further.

Now we replace our sample function with one that applies the PCF algorithm:
float samplePCF(float pZ, vec2 pCoords, int pSamples) {
  float bias = 0.0000005; // our bias
  float result = 1.0; // our result
  float deduct = 0.8 / float(pSamples); // deduct if we're in shadow

  for (int i = 0; i < pSamples; i++) {
    float Depth = texture(shadowMap, pCoords + offsets[i]).x;
    if (pZ - bias > Depth) {
      result -= deduct;
    };  
  };
    
  return result;
}
And now we call this new function from our shadow factor function:
// check if we're in shadow..
float shadow(vec4 pVs) {
  float factor;
  
  vec3 Proj = pVs.xyz / pVs.w;
  if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
    // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
    factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 4);
  } else {
    factor = 1.0;
  };

  return factor;
}
Note that for now I've duplicated this code in each shader, and they all use a sample size of 4.

Here is our result with this sample size:


And with a sample size of 9:


Download the source here

What's next?

Okay, that was a short one. I've left the shadow map projection matrix calculation alone for now, I may come back to that at a later time but I haven't found a really good adjustment yet.

In the next tutorial we'll have a look at cascaded shadow maps to get some sharper shadows up close.




Saturday 2 April 2016

Shadow maps #1 (part 26)

So yes, I've decided to swap things around and do shadows first, then change the engine over to using a deferred lighting model.

Part of me wishes I hadn't. It isn't that shadows are difficult but in the state our engine currently is in, we're duplicating a few things. I really need to find time to add a pre-processor into our shader loading code. But we'll make do. Note that for our write up I'll only do things once so where code currently needs to be duplicated, have a look at the finished source code.

Rendering shadows requires knowing whether there is anything between the surface you are rendering and the lightsource that illuminates it.
It gets increasingly complex when more light sources are involved though that is something I won't get into now.

Shadowmaps are a bit of a cheat to allow us to quickly find out if light is being blocked out by another object. With a shadow map we render our scene from the perspective of the light source. As we render our scene the Z-buffer builds up and will eventually paint a picture of what are the closest objects that block out our light.

When we render our real scene besides projecting our vertices to screen space, we also project them using the same mvp we used when rendering it to the shadow map. That allows us to check the Z value for each fragment against our shadow map. If it's larger, we're behind something and we're thus in shadow.

This does require us to render our scene twice (or more if we have more light sources). This adds overhead but we've got a few things going for us:
  • we're only interested in our depth buffer, so we can create very simple and quick shaders that do as few calculations as possible
  • we can be more conservative with what we render: for spotlights we often only need to render a small part of our scene, it's only for our sunlight that we include a lot
  • we may not need to render everything, for instance it makes little sense to render our terrain into our shadow map, nothing is beneath our terrain so there is nothing for it to cast its shadow on
  • when stereo rendering we can reuse our shadow maps for both eyes, we don't need to render them twice
Also once we go beyond these initial stages there are other optimisations. For instance most things in your scene are static, so you could render your shadow maps once with all the static objects, then in your render loop make a copy and render just the objects that move around. In our example below that requires a bit more thought as our light "moves" to deal with the large area a sunlight covers but there are still ways to use this optimisation.

We're just going to render a shadow map for our sunlight. Because the sun is very far away and its light rays hit our surfaces pretty much parallel, we're going to use an orthographic projection for this.
When we eventually add spotlights we'll use a perspective projection to create proper shadows.

Creating our shadow map

Now here one of our previous posts comes in very handy. For our shadow map we're going to render to texture, it's just that our texture is a depth buffer :)

So we start by adding a handy function for this to our texture map library that is a simplified version of our render to texture function we added in our LOD tutorial:
// Prepare our texture as a shadow map (if needed) and makes our
// shadow map frame buffer active
bool tmapRenderToShadowMap(texturemap * pTMap, int pWidth, int pHeight) {
  if (pTMap == NULL) {
    return false;
  };

  // check if we can reuse what we have...
  if ((pTMap->width != pWidth) || (pTMap->height != pHeight)) {
    // chuck our current frame buffer JIC.
    tmapFreeFrameBuffers(pTMap);
  };
Note that we'll decide to rebuild our shadow map if its size changes. Not something we'll use today but it can be handy sometimes. We just need to be conservative as rebuilding our buffers will introduce a fair amount of overhead.
  // create our frame buffer if we haven't already
  if (pTMap->frameBufferId == 0) {
    GLenum status;

    pTMap->filter = GL_LINEAR;
    pTMap->wrap = GL_CLAMP_TO_EDGE; // GL_CLAMP is not available in a core profile
    pTMap->width = pWidth;
    pTMap->height = pHeight;
Obviously we're assuming no texture is loaded so we set our values as we need them to be.
    glGenFramebuffers(1, &pTMap->frameBufferId);
    glBindFramebuffer(GL_FRAMEBUFFER, pTMap->frameBufferId);

    // init our depth buffer
    glBindTexture(GL_TEXTURE_2D, pTMap->textureId);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F, pTMap->width, pTMap->height, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_TEXTURE_MODE, GL_LUMINANCE);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, pTMap->filter);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, pTMap->filter);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, pTMap->wrap);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, pTMap->wrap);    

    // bind our depth texture to our frame buffer
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, pTMap->textureId, 0);
So this bit is nearly identical to how we created our frame buffer and added our Z-buffer in our render to texture example. The difference is that we're not adding any color buffers. I'm also using our textureId as we've already generated a texture object when we construct our object and it seems wasteful to create a second one just to use our depth buffer.
    // and make sure our framebuffer knows we draw nothing else...
    glDrawBuffer(GL_NONE);
    glReadBuffer(GL_NONE);
Now here is a bit of magic, these two commands ensure our frame buffer knows there are no color buffers to write to or read from. We're just writing to our Z-buffer.
    // and check if all went well
    status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
    if (status != GL_FRAMEBUFFER_COMPLETE) {
      errorlog(status, "Couldn't init framebuffer (errno = %i)", status);
      tmapFreeFrameBuffers(pTMap);
      return false;
    } else {
      errorlog(0, "Created shadow map %i,%i", pWidth, pHeight);
    };
  } else {
    // reactivate our framebuffer
    glBindFramebuffer(GL_FRAMEBUFFER, pTMap->frameBufferId);
  };

  return true;
};
And the last bit again is the same as our render to texture function, we check if we've successfully created our frame buffer and reuse our frame buffer next time we call our function.

Our shadow shaders

As I mentioned we need simplified shaders to render our objects to our shadow maps. We have one vertex shader and two fragment shaders. We need the extra fragment shader to deal with texture maps that have an alpha channel, such as those we use for our leaves, as otherwise our leaves would cast square shadows. We don't want that overhead where we don't need it.

Here is our vertex shader:
#version 330

layout (location=0) in vec3 positions;
layout (location=2) in vec2 texcoords;

uniform mat4      mvp;            // our model-view-projection matrix
out vec2          T;              // coordinates for this fragment within our texture map

void main(void) {
  // load up our values
  vec4 V = vec4(positions, 1.0);
  T = texcoords;
  
  // our on screen position by applying our model-view-projection matrix
  gl_Position = mvp * V;
}
And our normal fragment shader:
#version 330

out vec4          fragcolor;

void main() {
  // this does nothing, we're only interested in our Z
  fragcolor = vec4(1.0, 1.0, 1.0, 1.0);
}
And our texture shadow shader:
#version 330

uniform sampler2D textureMap;                       // our texture map

in vec2           T;                                // coordinates for this fragment within our texture map
out vec4          fragcolor;

void main() {
  fragcolor = texture(textureMap, T);
  if (fragcolor.a < 0.2) {
    discard;
  };
}
By now these should be pretty self-explanatory. Even though we do output a fragment color, that output is ignored.

Finally in our load_shaders we actually load these shaders:
  solidShadow = newShader("solidshadow", "shadow.vs", NULL, NULL, NULL, "solidshadow.fs");
  textureShadow = newShader("textureshadow", "shadow.vs", NULL, NULL, NULL, "textureshadow.fs");
It starts to get interesting once we start using our shaders. I've modified our material library to record both the normal shader and the shadow shader for each material. If no shadow shader is set the material doesn't cast a shadow. For this to work we've added two functions to our material library:
  • matSetShadowShader assigns a shadow shader to our material
  • matSelectShadow selects that shader and sets it up
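As a rough idea of what these two functions boil down to, here is a cut-down C sketch. The struct layouts, the shadowShader field name and the simplified matSelectShadow signature are all assumptions made to keep the example self-contained; the real function also takes our matrices and makes the OpenGL calls, so have a look at the finished source for the actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical, cut-down stand-ins for the engine's types
typedef struct shaderInfo {
  unsigned int program;       // would hold the OpenGL program id
} shaderInfo;

typedef struct material {
  shaderInfo * matShader;     // shader used for normal rendering
  shaderInfo * shadowShader;  // shader used for shadow maps, NULL = casts no shadow
} material;

// assigns a shadow shader to our material
void matSetShadowShader(material * pMat, shaderInfo * pShader) {
  if (pMat != NULL) {
    pMat->shadowShader = pShader;
  }
}

// returns true if our material has a shadow shader we can render with
// (simplified: the real function would also select the program and
// set up its uniforms at this point)
bool matSelectShadow(material * pMat) {
  if ((pMat == NULL) || (pMat->shadowShader == NULL)) {
    return false;
  }

  return true;
}
```

The point of returning a bool is that the caller can simply skip any mesh whose material has no shadow shader assigned.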
Note that I've also moved our lightSource struct into this library temporarily and added both a shadowMap texture and shadowMat view-projection matrix variable to this structure. This will soon get its own place.

Finally we assign our shadow shaders to our materials in our load_objects function. Note that I have moved a few things around where I didn't want materials to get a shadow shader:)
  ...
  
  // assign shaders to our materials
  lnode = materials->first;
  while (lnode != NULL) {
    mat = (material * ) lnode->data;

    // assign both solid and shadow shaders, note that our shadow shader will be ignored for transparent shadows
    if (mat->reflectMap != NULL) {  
      matSetShader(mat, reflectShader);
      matSetShadowShader(mat, solidShadow);
    } else if (mat->diffuseMap != NULL) {          
      matSetShader(mat, texturedShader);
      matSetShadowShader(mat, textureShadow);
    } else {
      matSetShader(mat, colorShader);
      matSetShadowShader(mat, solidShadow);
    };
    
    lnode = lnode->next;
  };

  ...

Rendering our shadow map

Now it's time to render our shadow map. First I've enhanced our meshnode library and added a meshNodeShadowMap function to it that renders our node using the shadow shaders. It's a dumbed down version of our meshNodeRender function that only renders non transparent objects for which a shadow shader is available.
// render suitable objects to a shadow map
void meshNodeShadowMap(meshNode *pNode, shaderMatrices * pMatrices) {
  dynarray *      meshesWithoutAlpha  = newDynArray(sizeof(renderMesh));
  mat4            model;
  int             i;

  // prepare our array with things to render, we ignore meshes with alpha....
  mat4Identity(&model);
  meshNodeBuildRenderList(pNode, &model, pMatrices, meshesWithoutAlpha, NULL);

  // we should sort our meshesWithoutAlpha list by material here and then only select our material 
  // if we're switching material  
  for (i = 0; i < meshesWithoutAlpha->numEntries; i++) {
    bool selected = true;
    renderMesh * render = dynArrayDataAtIndex(meshesWithoutAlpha, i);
  
    shdMatSetModel(pMatrices, &render->model);
    if (render->mesh->material != NULL) {
      selected = matSelectShadow(render->mesh->material, pMatrices);
      if (selected) {
        meshRender(render->mesh);
      };
    };
  };

  dynArrayFree(meshesWithoutAlpha);
};
There are a few small tweaks to meshNodeBuildRenderList that allow for a NULL pointer to be used for the dynarrays and prevent rendering our bounding boxes to our depth buffer.

Now it's time to enhance our rendering loop. At the start of engineRender I've added this snippet of code:
  // only render our shadow maps once per frame, we can reuse them if we're doing our right eye as well
  if (pMode != 2) {
    renderShadowMapForSun();
  };
It calls the renderShadowMapForSun function unless we're rendering our right eye (as we're reusing our left eye's map).

The renderShadowMapForSun function is where the magic happens, let's look at it in detail:
// render our shadow map
// we'll place this in our engine.h for now but we'll soon make this part of our lighting library
void renderShadowMapForSun() {
  // we'll initialize a 4096x4096 shadow map for our sun
  if (tmapRenderToShadowMap(sun.shadowMap, 4096, 4096)) {
    mat4            tmpmatrix;
    vec3            tmpvector;
    shaderMatrices  matrices;
    GLint           wasviewport[4];

    // remember our current viewport
    glGetIntegerv(GL_VIEWPORT, &wasviewport[0]);

    // set our viewport
    glViewport(0, 0, 4096, 4096);
So above we've called tmapRenderToShadowMap to create our shadowMap (first time round) and select our frame buffer. We then set our viewport to match. Now here I was a little surprised to find out the viewport is not bound to the framebuffer so this overwrites the viewport configuration we had set in our main.c code. We thus store this beforehand.
We've created a 4096x4096 map which should provide us with enough detail to get started.
    // clear our depth buffer
    glClear(GL_DEPTH_BUFFER_BIT);      

    // enable and configure our backface culling, note that here we cull our front facing polygons
    // to minimize shading artifacts
    glEnable(GL_CULL_FACE);   // enable culling
    glFrontFace(GL_CW);       // clockwise
    glCullFace(GL_FRONT);     // frontface culling

    // enable our depth test
    glEnable(GL_DEPTH_TEST);
    // disable alpha blending  
    glDisable(GL_BLEND);
    // solid polygons
    glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
This should look pretty familiar: we clear our depth buffer and enable what we need to, but there is one strange little tidbit here. We're culling our front faces instead of our back faces.
Assuming our objects are all solid this prevents objects from casting shadows onto themselves.
    // need to create our projection matrix first
    // for our sun we need an orthographic projection as rays of sunlight pretty much are parallel to each other.
    // if this was a spotlight a perspective projection gives the best result
    mat4Identity(&tmpmatrix);
    mat4Ortho(&tmpmatrix, -10000.0, 10000.0, -10000.0, 10000.0, -50000.0, 50000.0);
    shdMatSetProjection(&matrices, &tmpmatrix);
As mentioned, we use an orthographic projection for our sun. Note that our near plane is -50000. Our orthographic projection maps our Z buffer to -1.0 => 1.0, if we set our near plane to 0 any objects between our "eye" and halfway through our scene would fall behind the clipping point. Oops.
The area our map spans is important. The larger our area the further down in the scene we'll be able to render shadows but at the cost of precision. Our map spans an area of 20000x20000 which is enough for our scene to get shadows far away enough without sacrificing too much precision but you'll see it isn't perfect. We'll be looking at ways to improve this in the next two posts.
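To put a number on that loss of precision, here is a small C sketch of the arithmetic. The helper name is made up for this example and isn't part of the engine; it simply divides the area the projection spans by the number of texels in the map.

```c
// World space size covered by a single shadow map texel, for an
// orthographic projection spanning pSpan units over pMapSize texels.
// Hypothetical helper, not part of the engine.
float texelWorldSize(float pSpan, int pMapSize) {
  return pSpan / (float) pMapSize;
}
```

With our 20000 unit span and 4096 texel map, texelWorldSize(20000.0f, 4096) works out to roughly 4.9 world units per texel, so any detail smaller than that can't be resolved in the shadow map. Halving the span or doubling the map size halves that figure.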
    // We are going to adjust our sun's position based on our camera position.
    // We position the sun such that our camera location would be at Z = 0.
    // Our near plane is actually behind our 'sun' which gives us some wiggleroom.
    vec3Copy(&sun.adjPosition, &sun.position);
    vec3Normalise(&sun.adjPosition);  // normalize our sun position vector
    vec3Mult(&sun.adjPosition, 10000.0); // move the sun far enough away
    vec3Add(&sun.adjPosition, &camera_eye); // position in relation to our camera
We readjust the position of our sun so it's not too far away, as it becomes the position of our camera when rendering the shadow map.
    // Now we can create our view matrix, here we use a lookat matrix from our sun looking towards our camera position.
    // There is an argument to use our lookat point instead as in worst case scenarios half of our shadowmap could
    // relate to what is behind our camera but using our lookat point risks not covering enough with our shadowmap.
    //
    // Note that for our 'up-vector' we're using a Z-axis aligned vector. This is because our sun will be straight
    // up at noon and we'd get an unusable view matrix. A Z-axis aligned vector assumes that our sun goes from east
    // to west along the X/Y axis and the Z of our sun will be 0. Our 'up-vector' thus points due north (or south
    // depending on your definition).
    // If you do not align your coordinate system to a compass you'll have to calculate an up-vector that points to your
    // north or south 
    mat4Identity(&tmpmatrix);
    mat4LookAt(&tmpmatrix, &sun.adjPosition, &camera_eye, vec3Set(&tmpvector, 0.0, 0.0, 1.0));
    shdMatSetView(&matrices, &tmpmatrix);
And we use our good old mat4LookAt function to set our view matrix. I won't repeat what I mention about the up-vector in the comments in the code, just read them:)
    // now we override our eye position to be at our camera position, this is important for our LOD calculations
    shdMatSetEyePos(&matrices, &camera_eye);
This is an important small change we added to our matrices object. We can override our eye position which is important here because our LOD calculations would otherwise be incorrect.
    // now remember our view-projection matrix, we need it later on when rendering our scene
    mat4Copy(&sun.shadowMat, shdMatGetViewProjection(&matrices));
We also need to remember our view-projection matrix because we need it later on when rendering our scene.
    // and now render our scene for shadow maps (note that we only render materials that have a shadow shader and we ignore transparent objects)
    if (scene != NULL) {
      meshNodeShadowMap(scene, &matrices);    
    };
Last but not least, we call our meshNodeShadowMap render function.
    // and output back to screen
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glViewport(wasviewport[0],wasviewport[1],wasviewport[2],wasviewport[3]);
  };
};
And as part of our cleanup we reset our viewport back to what it was before. At the end of this we have our shadow map, but we're not using it yet.

Applying our shadows


Now we're ready to actually cast some shadows in our end result. This has become fairly simple at this point in time. First off we need to make sure our shaders know what shadowmap to use and what our shadow view-projection matrix is. Luckily these are both stored in our lightSource structure so we simply need to add a small code fragment to matSelectProgram:
  ...

  if (pMat->matShader->shadowMapId >= 0) {
    glActiveTexture(GL_TEXTURE0 + texture);
    if (pLight->shadowMap == NULL) {
      glBindTexture(GL_TEXTURE_2D, 0);      
    } else {
      glBindTexture(GL_TEXTURE_2D, pLight->shadowMap->textureId);
    }
    glUniform1i(pMat->matShader->shadowMapId, texture); 
    texture++;   
  };
  if (pMat->matShader->shadowMatId >= 0) {
    glUniformMatrix4fv(pMat->matShader->shadowMatId, 1, false, (const GLfloat *) pLight->shadowMat.m);
  };

  ...

So now we need to update our shaders. At this point in time these changes need to be applied to multiple shaders: I've added them to our terrain shader, our flatshader, our textured shader and our reflection shader. We'll only discuss the changes to our textured shader here.

First we start with our standard.vs vertex shader, I'll just highlight the changes:
...
uniform mat4      shadowMat;      // our shadows view-projection matrix
...
// shadow map
out vec4          Vs;             // our shadow map coordinates

void main(void) {
  ...
  // our shadow map coordinates
  Vs = shadowMat * model * V;
  ...
}
So we've added our shadow view-projection matrix as a uniform and added an output called Vs. Then we calculate Vs by projecting our vertex position.

In our fragment shaders we add a new uniform for our shadowMap and an input for Vs at the start:
uniform sampler2D shadowMap;                        // our shadow map
in vec4           Vs;                               // our shadow map coordinates
After that we add two helper functions that use our Vs input to perform our lookup in our shadowMap and return a factor between 0.0 (fully in shadow) and 1.0 (not in shadow). That seems a bit like overkill right now but in part two of this write up we'll expand on this:
// sample our shadow map
float sampleShadowMap(float pZ, vec2 pCoords) {
  float bias = 0.00005;
  float depth = texture(shadowMap, pCoords).x;
  
  if (pZ - bias > depth) {
    return 0.0;
  } else {
    return 1.0;
  };  
}

// check if we're in shadow..
float shadow(vec4 pVs) {
  float factor;
  
  vec3 Proj = pVs.xyz / pVs.w;
  if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
    // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
    factor = sampleShadowMap(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5));
  } else {
    factor = 1.0;
  };

  return factor;
}
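When debugging it can help to run the same mapping on the CPU. Below is a rough C port of the coordinate handling in shadow. The function names and the use of a plain float array are my assumptions for the sake of a self-contained example, they are not part of the engine.

```c
#include <stdbool.h>

// true if a normalized device coordinate falls outside the range we trust
static bool outsideMap(float pValue) {
  return (pValue <= -0.99f) || (pValue >= 0.99f);
}

// Maps our projected shadow coordinates (x, y, z, w) to the values used
// for the shadow map lookup: pU and pV index the map, pZ is compared
// against the stored depth.
// Returns false if the point falls outside the area covered by the map.
bool shadowLookupCoords(const float pVs[4], float * pU, float * pV, float * pZ) {
  // perspective divide, this leaves the values unchanged when w is 1.0
  float x = pVs[0] / pVs[3];
  float y = pVs[1] / pVs[3];
  float z = pVs[2] / pVs[3];

  if (outsideMap(x) || outsideMap(y) || outsideMap(z)) {
    return false;
  }

  // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
  *pU = 0.5f * x + 0.5f;
  *pV = 0.5f * y + 0.5f;
  *pZ = 0.5f * z + 0.5f;

  return true;
}
```

A point in the middle of the shadow map's view volume, (0.0, 0.0, 0.0, 1.0), maps to the centre of the texture at (0.5, 0.5) with a depth of 0.5. Anything for which the function returns false is treated as not in shadow, which matches the factor of 1.0 in the shader.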
And in our main function we'll call shadow to obtain our shadow factor and apply it:
void main() {
  ...
  // Check our shadow map
  float shadowFactor = shadow(Vs);
  ...
  // and calculate our color after lighting is applied
  vec3 diffuseColor = fragcolor.rgb * lightcol * (1.0 - ambient) * NdotL * shadowFactor; 
  ...
    specColor = lightcol * matSpecColor * specPower * shadowFactor;
  ...
}
Note how we simply add our shadow factor into our diffuse and specular colour calculation.

And we have shadows:

Download the source here

What's next?


This is only a start. When you move the camera around you'll see we've got plenty of things that need to improve. Very simply put, we don't have enough detail in our shadow map. We're also sacrificing half our shadow map as part of the shadow map relates to shadows that are behind our camera.

In the next part we'll start looking at "Percentage Closer Filtering" which will be a small post on smoothing out our shadow maps. We'll also look at ways to improve our projection matrix so we sacrifice less detail.

After that we'll look at cascaded shadow maps, where we basically render more than one shadow map for our light source so we can use a higher detail one for shadows that are close to our camera.