Bastiaan's blog: 2016

Tuesday, 23 August 2016

Update of GLFW tutorial

Hey all,

I've found some time to update the makefile for the windows build so everything compiles. I've tested this on a new computer I was able to use at work using Visual Studio 2013 and it works pretty nicely.

I've also updated both windows and mac builds to use the latest GLFW version 3.2.1. I haven't changed any functionality as I haven't had time to look into the changes in this new build but it seems to have a few interesting bits that I'll be looking into.

Other then that I've slowly been working on my forward+ renderer. It is now rendering well over 1000 point lights on my little old Macbook reaching about 45fps at 1280x720 in windowed mode.

On the new windows machine it managed to fluctuate between 80 and 150 fps, that thing is fast.

One nice thing is that the number of lights do not effect performance too much.

Mind you, it's still a good 25% slower then the deferred renderer. Ok I never brought it up to supporting a large volume of lights but I'm still surprised at the results so far. Seeing the simplicity of the scene I'm rendering so far the forward+ renderer should be doing less not having to build 5 really large images. It really makes me wonder where it is loosing most of the speed.

I like the simpler implementation of the forward+ shader however so I'm not giving up on it yet.

Saturday, 13 August 2016

Just an update

Hi All,

Haven't posted for awhile, been hard at work but on and off at reshuffling my code. I haven't checked anything into GitHub yet and probably won't until I'm further down the line but the code now is split into normal header and source files (easier to maintain, faster to compile) and I've turned it back into a forward shader but now with the lighting code build into the fragment shader.

That has yielded some surprising results. Two things that I thought really stood out:
1) it's costing way more performance then I thought. I've only enabled the directional light and the spot lights at the moment and it is already much slower then the deferred lighting approach. Add in any of the spotlights and things just die. Now part of this is looping through the lights which is notoriously bad for the GPU but still I was surprised at how much performance as lost
2) with the deferred lighting approach there is a sort of natural optimisation in resource usage. You select the shadow map for the light and the code stays very simple for that one light.
With multiple lights in a single fragment shader you have to have all the resources active and your shaders has to be able to handle multiple types of lights so there are far more conditions.

Time will tell if changing this over to what I've learned is called Forward+ makes enough of an improvement to make this worth its while. Else I'll go back to the fully deferred renderer. Still the advantage of being able to handle transparent objects without a second renderer implementation is appealing so we'll stick with it.

The journey itself hasn't been without its own bonuses. The first was dealing with the resource issue. With 3 shadow maps for my directional light and 6 shadow maps for each point light, I'm quickly running out of active textures. My graphics card on my Macbook Pro can only handle a measly 16 active textures (this is not a limit to the number of textures, but how many you can use simultaneously).

Layers to the rescue!

Layers are an extension to the mipmapping support in textures. It allows you to store multiple layers in a single texture and my graphics card allows me to create hundreds of layers for a single texture, limited only by the memory on the graphics card. In your shader you simply use a vec3 instead of a vec2 where the z simply selects the layer.

Also what seemed to be a plus point is that you can bind the texture array (a texture with layers) to a framebuffer and then render to the individual layers by selecting the layer in your geometry shader.
I was able to render all 3 shadow maps for my directional light in one pass by having my geometry shader triple every face, once for each layer, with the proper projection applied to each instance.
Unfortunately it killed all performance, I don't know if geometry shaders aren't properly supported on my graphics card or if this is an architectural problem, but I had much more success binding the layers to my frame buffer one at a time and rendering each layer separately.
Worth checking on newer hardware and maybe supporting both approaches after determining if the hardware handles it correctly.

At the end I'm working to having a single texture containing my shadow maps as layers, or maybe splitting the shadow maps in just a few arrays (every layer needs to have the same size and my spot light shadow maps are smaller).

I'll do the same for my light maps, just having a single texture with a layer for each light map. These are static anyway.

Textures for my objects will work the same as always, here it makes no sense to have them active all at once as we're still rendering the objects one at a time.

Uniform Buffer Objects

Another optimisation I've added is that I've started using Uniform Buffer Objects or UBOs.
UBOs have two distinct advantages:
1) you can load a bunch of data in bulk.
2) the data can be shared amongst shaders

For instance, while we're rendering our scene we keep copying our projection matrix and view matrix and all derivatives of those into each shader even thought they don't change. Sure we're more often then not using the model-view-projection matrix which does change so it's not the best example but with a UBO you could load the projection and view matrix into a UBO at the start of a frame, and then use it in every shader.

I've used it right now for loading all information about my lights at the start of the frame. I'm only doing this for my left eye only in stereo rendering as it can be reused for the right eye.

In the same way I plan to move all my material data in a UBO as well, this one I even only need to load once after our scene is loaded. In my shader I simply need to pass an index onto our material data.

There are drawbacks for UBOs:
- they take up space in graphics memory
- you can't use samplers
- UBOs have strict rules around data alignment, you sometimes need to pad the structure to ensure a struct in C(++) matches the struct in GLSL so you can transfer the data. It isn't helped that some implementations are buggy turning a vec3 into a vec4 for instance.

The sampler drawback is the most irritating one. For our materials it means we still set the textures when selecting the shader, we can only store things like color, reflectiveness, etc. in the material data.

For our light UBO we use the fact that we're using layers in our shadowmap and lightmap texture. We simply store the layer number in our data.
This also has the advantage that we bind GL_TEXTURE1 to our shadowmap texture atlas and GL_TEXTURE2 to our lightmap texture atlas and leave those reserved as such.
I'm skipping GL_TEXTURE0 because I'm using that to manipulate textures with and it's used in our font rendering, and I'm using GL_TEXTURE2 and onwards for my material textures.

Anyway, lots that I've done so far. I'm now working on the first attempt of implementing a compute shader to create my light tile-map. Once I have that working I'll check stuff into github and maybe highlight a few bits if the implementation that are interesting.

To be continued...

Friday, 8 July 2016

Spotlights in our deferred rendering

Alright, time to add some spotlights :)

But before we do so I wanted to talk about this project itself. I was already musing about it some time ago but I'm going to change things up a bit.

This little project is now pretty far removed from a tutorial series as originally intended and it really doesn't suit the format it's in right now so it's time to change direction.

This post will be the last in the series and it'll leave a number of things open but c'est la vie. The final nail in the coffin so to speak was reading the information behind light indexed rendering and it's made me want to change direction.

I've also felt that the engine I've been building is slowly developing to be a render engine that I can grow to something that I could actually use. But I've been wanting to bring it back to a more traditional library for inclusion in other projects.

After I finish writing this post the first order of business will be to take the core and restructure it, splitting all the header files into header and source files and compiling the source files into a library. After that I'll initially be turning it back into a forward renderer, after which I'm going to look into implementing the light indexing technique. I'll be blogging about each step and making the code available on github but I won't go into the level of detail I've done so far.

Spotlights

However before that, let's add a couple of spotlights.

A spotlight isn't much different from our point light, all that really changes is that it shines in a limited direction. When we look at our deferred shader there are two parts that we need to deal with, the first is the shape we'll be rendering and the second is the changes to the lighting calculation itself.

On that first part I'm going to cheat. Just for getting things to work I'm still using the same code I used for our point light. This means we're rendering way more then we should but at this stage I don't care. The idea is to eventually render a cone but as per my intro above, I've changed direction and won't be doing so at this point in time.

I'm also going to cheat on how we implement our spotlight. Traditionally you would calculate the angle between our light direction vector and the vector from the origin of the light to our fragment. The greater this angle, the less we illuminate our fragment, until eventually we don't illuminate our fragment at all as we pass the edge of our cone.

In the days that we didn't have oodles of GPU power we cheated using a lightmap:

Now it might seem strange to use this old cheat but there is a really good reason for doing so. You can start doing some really funky things with this because you basically end up projecting this image as our light and it doesn't have to be a boring white circle. It can also be something cool like this:

Yup, we're going to project a bat symbol on our un-expecting little house....

Because our spot light shines in one direction we also only need to create one shadow map which we do using a perspective projection matrix and here's the funky bit, the same calculation we need to do for determining our shadowmap coordinates is the same calculation we need to do to get our light map coordinates.

I've made a few changes to our light structure.

First I've added a type, 0 for directional (which is still handled separately), 1 for a pointlight and 2 for a spotlight. This has allowed me to add code to our shadowmap code to figure out what needs to be created.

I've also added a 'lookat' vector that basically tells us the direction the spotlight is shining to and I've added an extra cached value to track if our lookat has changed and if we need to recalculate our shadowmap.

And there is our light angle value that determines the shape of our light cone.

If you look at the changes to our lsRenderShadowMapsForLight function (used to be our point light function) you'll see that it will calculate only one shadow map for a spotlight and instead of using our 6 lookat vectors uses the vector in our light structure. It also uses our light angle as the FOV value for our projection map.

Second, I've added our spotlight shader :) I'm not going to show the whole code here but there is one bit in the logic that I do want to highlight:

    // we're going to use our shadow maps projection matrix to limit our light
    vec4 Vs = shadowMat[0] * V;
    vec3 Proj = Vs.xyz / Vs.w;
    if ((abs(Proj.x) < 1.00) && (abs(Proj.y) < 1.00) && (abs(Proj.z) < 1.00)) {
      vec2 coords = vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5);
      // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
      shadowFactor = samplePCF(0.5 * Proj.z + 0.5, coords, 0, 9);

      lColor = lColor * texture(lightMap, 1.0-coords).rgb;
    } else {
      // no point in doing this..
      discard;
    };

This is the bit of code that uses our one shadowmap projection matrix, determines the coordinates in our shadowmap, discards the fragment if we're outside of it, and obtains the lights color from our light map.

And that really is it. The end result is:

And here is our bat symbol :)

Well that's it for today. It'll probably be awhile before my next post as I've got a fair amount of work to do restructuring things :)

Sunday, 26 June 2016

Normal mapping (part 31)

Okay, we're taking a quick detour. I've been wanting to add this to the engine for a little while now but just didn't get around to it yet. As I was adding a model to the scene that had some normal maps I figured, why not...

Now I'll be the first to admit I'm not the greatest expert here. The last time I messed around with normal mapping was in the mid 90ies and that was completely faking it on the CPU because an 80386 didn't have the umph to do this.

The idea behind normal mapping is simple. The normal for the surfaces are the key thing that determines the lighting of our surface. It allows us to determine the angle at which light hits the surface and thus how to shade it. Now if there is a grove in our surface we need to introduce additional complexity to our model to properly light this. Take a brick wall for instance, we would really need to model each and every brick with all its jagged edges, to create believable lighting of the wall.

Normal mapping allows us to approximate this by still using a flat surface but adding a texture map that contains all the normals for the surface. So take our brick wall, at the edges of our bricks our normals won't be pointing outwards but sideways around the contours of the bricks. The end result is shading the bricks properly. Now this is very limited as we won't be casting any shadows or changing the geometry in any meaningful way so if you come very close you'll see that we're being tricked but it is very effective for convincingly representing a rough surface.

The biggest issue with normal mapping is that the texture assumes the surface is pointing in one direction (facing us) so we need to rotate all the normals to match the orientation of our surface.
Funny enough, single pass shaders optimise this by doing the opposite and rotating the lights to match our normal map, it's prevents a relatively expensive matrix calculation in our fragment shader.
In a deferred shader we don't have that luxury and will apply our matrix in our fragment shader.

To do this we need not only our normals for our vertices, we also need what are called the tangent and bitangent of our normal. These are two perpendicular vectors to our normals that further define the orientation of our surface. Most implementations I've seen will use the adjacent vertices of our vertex to calculate the tangent and bitangent for that vertex and add that to our data. I was planning on doing the same until I read this post: Deferred rendering and normal mapping.
I have no idea why this works, but it does, so I've used it.

Preparing your mesh

Now the first thing is that we actually need a normal map for our object. The house I've added to our scene had a normal map included but I've also added one for our tree using this brilliant site here: NormalMap Online
This little online tool is amazing. It lets you create all sorts of maps from source images. Now I've used our tree texture directly and we have this results:

A note of warning here, what is behind this is the assumption that darker colors are 'deeper' then lighter colors and that is what drives creating the normal map. It usually gives a convincing result but sometimes it pays to create a separate 'depth' image to create a much better result. Indeed I believe the normal map for the house was created by similar means and gives some incorrect results.
For our purposes today, it will do just fine.

We already had added a "bumpmap" texture to our shader structure for our heightmap implementation so we'll use that here as well. That means I just had to change the material file loader to support wavefronts map_bump directive and add that to our tree material file.

Changes to our shader

You'll see I've added two shaders to our list of shaders, BUMP_SHADER and BUMPTEXT_SHADER and both use our standard vertex shader and standard fragment shader. We simply added "normalmap" as a definition to add in the code we need. The first shader simply applies this on a single color material and the second on a textured material. We could easily combine our environment map into this as well though I have not made any changes to using the normal from our normal map instead of the vertex normal.

In our vertex shader we can see that we simply take parts of our normal-view matrix as our tangent and bitangent:

...
#ifdef normalmap
out vec3          Tangent;        // tangent
out vec3          Binormal;       // binormal
#endif

void main(void) {
  ...

  // N after our normalView matrix is applied
  Nv = normalize(normalView * N);
#ifdef normalmap
  Tangent = normalize(normalView[0]);
  Binormal = normalize(normalView[1]);
#endif 

  ...
}

As I said before, I have no idea why this works, I'm sure if I dive into the math I'll figure it out or find out it is not a complete solution but it seems to have the right effects. All in all I may get back to this and precalc all my tangents and bitangents and make it part of our vertex data.
Do note that the normal, tangent and bitangent get interpolated between our vertices.

In our fragment shader we simply replace the code that outputs our normal as it is to code that takes the normal from our normal map and applies the matrix we end up preparing:

...
#ifdef normalmap
uniform sampler2D bumpMap;                          // our normal map
in vec3           Tangent;                          // tangent
in vec3           Binormal;                         // binormal
#endif
...
void main() {
  ...
#ifdef normalmap
  // TangentToView matrix idea taken from http://gamedev.stackexchange.com/questions/34475/deferred-rendering-and-normal-mapping
  mat3 tangentToView = mat3(Tangent.x, Binormal.x, Nv.x,
                            Tangent.y, Binormal.y, Nv.y,
                            Tangent.z, Binormal.z, Nv.z);
  vec3 adjNormal = normalize((texture(bumpMap, T).rgb * 2.0) - 1.0);
  adjNormal = adjNormal * tangentToView;
  NormalOut = vec4((adjNormal / 2.0) + 0.5, 1.0); // our normal adjusted by view
#else
  NormalOut = vec4((Nv / 2.0) + 0.5, 1.0); // our normal adjusted by view
#endif
  ...
}

So we create a tangentToView matrix using our tangent, bitangent and normal, then get the normal from our normal map, apply our matrix and write it out to our normal geobuffer.

Note btw that the code for writing to our normal geobuffer has changed slightly to bring our -1.0 to 1.0 range into a 0.0 to 1.0 range. You'll see that I've updated our lighting shaders to reverse this. Without this change we lost our negative values in our normal map. I'm fairly surprised this didn't give that much problem during lighting.

Anyways, here's the difference in shading between our tree trunk without normal mapping and with normal mapping:

Tree without normal mapping

Tree with normal mapping

And here are a few renders of the house I added, without normal mapping, normal mapping without our texturemaps and the full end result:

House without normal mapping

House with normal mapping but no textures

House with normal mapping and textures

Well, that's enough for today. Source code will be on github shortly :)

Saturday, 25 June 2016

Deferred lighting rendering #3 (part 30)

So it's time to have a look at adding our point lights. Point lights are in many respects the simplest of localised lights. A light simply shines from a single point in space, the light slowly diminishing in strength as distance to that location increases.

In hindsight I should have added logic for at least one point light before we moved to a deferred rendering approach to better illustrate the differences but it is in handling these lights that deferred rendering starts to shine.

Traditionally in single pass shaders we would either find a loop that runs through all the lights or fixed logic that handles a fixed set of lights (usually generated by a parser). Because this logic is repeated for every fragment rendered to screen whether the fragment is lit by the light or not and whether the fragment will later be overwritten or not there is a lot of performance that is wasted.

Now with the speed of modern GPUs and only including lights that likely illuminate an object a single pass shader tip the balance back in its favour, I'm not sure.

Deferred rendering ensures that we prevent a lot of this overhead by doing the lighting calculation as few times as possible by working on the end result of rendering our scene to our geobuffer.

Our main light calculation

The basis of our light calculation remains the same for our point light as for our directional sunlight. I've skipped the ambient component as I feel either using the ambient sunlight as we have now or using some form of environment mapping gives enough results with this.
So we restrict our light calculation to diffuse and specular highlighting. Those calculations remain the same as with our directional light with the one difference that our light to fragment vector plays a much larger role.

The thing that is new is that the intensity of our light diminishes as we move further away from the light. To be exact, it diminishes by the square of the distance to our light.

For computer graphics however we find that we trick this a little, you can find much better explanations then I can possibly give but the formulas that we'll be using is the following:

float attenuation = constant + (linear * distance) + (exponential * distance squared);
fragcolor = lightcolor / attenuation

I've left out a few details there but lightColor is the color we calculated in the same way had with our sunlight and we divide it with our calculation based on our distance. There are 3 values that we input into this formula next to our distance:

a constant
a linear component we multiply with our distance
an exponential component we multiply with the our distance squared

You can create your engine to allow for the manual input of all 3 values to give loads of flexibility but in our engine I've simplified it. Note that when attenuation is 1.0 we get the color as is. Basically the distance at which our formula results 1.0 is where the light starts loosing its strength.

On a quick sidenote in my shader you'll see that if this formula returns larger then 1.0 I cap it. You can have some fun by letting it become overly bright by putting this threshold higher and add some bloom effects to your lighting, but thats a topic for another day.

I'm using the fact that our light starts to loose its intensity at attenuation = 1.0 to calculate our 3 values by specifying the radius at which I want this to happen and then calculating our 3 values as follows:

our constant is simply 0.2
our linear component is calculated as 0.4 / radius
our exponential component is calculated as 0.4 / radius squared

When distances equals radius our formula will be 0.2 + 0.4 + 0.4 = 1.0

Finally, in theory our light as unlimited range, the intensity will keep getting smaller and smaller but it will never reach 0. But there is a point where our intensity becomes so low that it won't have an effect on our scene anymore. In a single stage renderer you could use this to filter out which lights are close enough to your object to be evaluated, in our deferred renderer we use it to limit how much of our screen we update with our lighting color.
Now truth be told, I'm taking a shortcut here and pretending our linear component is 0.0 and our exponential component is 0.8 / radius squared. This makes the calculation slightly easier but I overestimate the range slightly.
Our range calculation simply becomes: range = radius * sqrt((maxIllum / threshold) - 0.2)
maxIllum is simply the highest of our 3 RGB values and threshold is a threshold at which our light has become to low.

Adding shadowmaps

This is where point lights get a bit ridiculous and why using spotlights can be way more effective. Point lights shine in every direction and thus cast shadows in every direction. The way we solve this is by mapping our shadowmaps on a cube and we thus create 6 individual shadowmaps. One for lookup up from the light, one for down, one for left, right, forwards and backwards.

Then when we do our shadow checks we figure out which of those 6 shadow maps applies. I have to admit, this bit needs some improvement, I used a fairly blunt force approach here mostly because I couldn't be bothered to figure out a better way.

Unlike our shadow maps for our directional lights we use a perspective projection for these shadowmaps. I'm using our distance calculations we performed just now to set our far value. Also these are static shadowmaps which means we calculate them once and reuse them unless our lights position changes instead of redoing them every frame. This saves a bunch of overhead especially if we have loads of lights. To be exact, you could save them and skip the first render step all together.

The problem with static shadowmaps is that they won't update if objects move around, so say your character walks past a point light he/she won't cast a shadow.
We'll deal with this in another article but in short we'll leave any object that moves or is animated out of our static shadowmaps, keep a copy, and render just the objects that move or are animated before rendering our frame.

Again as with our sunlight we can also reuse our shadow maps for both eyes.

The code for creating the shadow maps is nearly identical to the code for our directional light other then the added loop to update 6 maps and the change to calculating our projection and view matrices.
Also note that we only check the rebuild flag for the first map, if one map needs changing we assume all need to change (unlike our directional light where we check them individually):

void lsRenderShadowMapsForPointLight(lightSource * pLight, int pResolution, meshNode * pScene) {
  int i;

  vec3 lookats[] = {
       0.0, -100.0,    0.0, 
     100.0,    0.0,    0.0, 
    -100.0,    0.0,    0.0, 
       0.0,  100.0,    0.0, 
       0.0,    0.0,  100.0, 
       0.0,    0.0, -100.0, 
  };

  // as we're using our light position and its the same for all shadow maps we only check our flag on the first
  if ((pLight->shadowLA[0].x != pLight->position.x) || (pLight->shadowLA[0].y != pLight->position.y) || (pLight->shadowLA[0].z != pLight->position.z)) {
    vec3Copy(&pLight->shadowLA[0], &pLight->position);
    pLight->shadowRebuild[0] = true;
  };

  // we'll initialize our shadow maps for our point light
  if (pLight->shadowRebuild[0] == false) {
    // reuse it as is...
  } else if (pScene == NULL) {
    // nothing to render..
  } else {
    for (i = 0; i < 6; i++) {
      if (pLight->shadowMap[i] == NULL) {
        // create our shadow map if we haven't got one already
        pLight->shadowMap[i] = newTextureMap("shadowmap");
      };

      if (tmapRenderToShadowMap(pLight->shadowMap[i], pResolution, pResolution)) {
        mat4            tmpmatrix;
        vec3            tmpvector, lookat;
        shaderMatrices  matrices;

        // rest our last used material
        matResetLastUsed();

        // set our viewport
        glViewport(0, 0, pResolution, pResolution);

        // enable and configure our backface culling, note that here we cull our front facing polygons
        // to minimize shading artifacts
        glEnable(GL_CULL_FACE);   // enable culling
        glFrontFace(GL_CW);       // clockwise
        glCullFace(GL_FRONT);     // frontface culling

        // enable our depth test
        glEnable(GL_DEPTH_TEST);  // check our depth
        glDepthMask(GL_TRUE);     // enable writing to our depth buffer

        // disable alpha blending  
        glDisable(GL_BLEND);

        // solid polygons
        glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);    

        // clear our depth buffer
        glClear(GL_DEPTH_BUFFER_BIT);      

        // set our projection
        mat4Identity(&tmpmatrix);
        mat4Projection(&tmpmatrix, 90.0, 1.0, 1.0, lightMaxDistance(pLight) * 1.5);
        shdMatSetProjection(&matrices, &tmpmatrix); // call our set function to reset our flags

        // now make a view based on our light position
        mat4Identity(&tmpmatrix);
        vec3Copy(&lookat, &pLight->position);
        vec3Add(&lookat, &lookats[i]);
        mat4LookAt(&tmpmatrix, &pLight->position, &lookat, vec3Set(&tmpvector, 0.0, 1.0, 0.0));
        shdMatSetView(&matrices, &tmpmatrix);

        // and render
        meshNodeShadowMap(pScene, &matrices);

        // now remember our view-projection matrix, we need it later on when rendering our scene
        mat4Copy(&pLight->shadowMat[i], shdMatGetViewProjection(&matrices));

        // we can keep it.
        pLight->shadowRebuild[i] = false;

        // and we're done
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
      };
    };
  };
};

Rendering our lights

Now it is time to actually render our lights. This is done by calling gBufferDoPointLight for each light that needs to be rendered. We make the assumption that our directional light has been rendered and we thus have content for our entire buffer. Each light is now rendered on top of that result by using additive blending. This means that instead of overwriting our pixel the result of our fragment shader is added to the end result.

gBufferDoPointLight assumes our blending has already been setup as we need the same settings for every light. Our loop in our render code therefor looks like this:

    // now use blending for our additional lights
    glEnable(GL_BLEND);
    glBlendEquation(GL_FUNC_ADD);
    glBlendFunc(GL_ONE, GL_ONE);

    // loop through our lights
    for (i = 0; i < MAX_LIGHTS; i++) {
      if (pointLights[i] != NULL) {
        gBufferDoPointLight(geoBuffer, &matrices, pointLights[i]);
      };
    };

As you can see for now we've just got a simple array of pointers to our lights and it currently holds 3 test lights. Eventually I plan to place the lights inside of our scene nodes so we can move lights around with objects (and accept the overhead in recalculating shadow maps). For now this will do just fine.

The rendering of our light itself is implemented in the vertex and fragment shaders called geopointlight. Most implementations I've seen render a full sphere with a radius of our maximum light distance but for now I've stuck with rendering a flat circle and doing so fully within our vertex shader (using a triangle fan):

#version 330

#define PI 3.1415926535897932384626433832795

uniform float   radius = 100.0;
uniform mat4    projection;
uniform vec3    lightPos;

out vec2 V;
out float R;

void main() {
  // doing full screen for a second, we're going to optimize this by drawing a circle !!
  // we're going to do a waver
  // first point is in the center
  // then each point is rotated by 10 degrees

  //           4
  //      3   ---   5
  //       /\  |  /\  
  //     /    \|/    \
  //   2|------1------|6
  //     \    /|\    /
  //       \/  |  \/  
  //      9   ---  7
  //           8

  if (gl_VertexID == 0) {
    vec4 Vproj = projection * vec4(lightPos, 1.0);
    V = Vproj.xy / Vproj.w;
    R = radius;
  } else {
    float ang = (gl_VertexID - 1) * 10;
    ang = ang * PI / 180.0;
    vec4 Vproj = projection * vec4(lightPos.x - (radius * cos(ang)), lightPos.y + (radius * sin(ang)), lightPos.z, 1.0);
    V = Vproj.xy / Vproj.w;
    R = 0.0;
  };
  gl_Position = vec4(V, 0.0, 1.0);
}

Now drawing a circle this way ensure that every pixel that requires our lighting calculation to be applied will be included. For very bright lights this means the entire screen but for small lights the impact is severely minimised.

You can do a few more things if you use a sphere to render the light but there are also some problems with it. We'll revisit this at some other time.

I'm not going to put the entire fragment shader here, its nearly identical to our directional light fragment shader. The main differences are:
- we discard any fragment that doesn't effect our scene
- we ignore the ambient buffer
- we use our boxShadow function to check the correct shadowmap
- we calculate our attenuation and divide our end result with that

Note that if our attenuation is smaller then 1.0 we ignore it. Basically we're within the radius at which our light is at full strength. If we didn't do this we'd see that things close to the light become overly bright. Now that can be a fun thing to play around with. The OpenGL superbible has an interesting example where they write any value where any color component is bigger then 1.0 to a separate buffer. They then blur that buffer and write it back over the end result to create a bloom effect.
But at this stage we keep that bit easy.

Seeing our buffers

Last but not least, I'm now using an array of shaders instead of individual variables and have introduced an enum to manage this.

There are two new shaders both using the same vertex and fragment shader called rect and rectDepth. These two shaders simple draw texture rectangles onto the screen.

At the end of our render loop, if we have our interface turned on (toggle by pressing i) we now see our main buffers.

At the top we see our 5 geobuffer textures.
Then we see our 3 shadow maps for our directional light.
Finally we see our 6 shadow maps of our first point light.

Handy for debugging :)

Here is a shot where we can see the final result of our 3 lights illuminating the scene:

Check out the sourcecode so far here

I'll be implementing some spotlight next time. These are basically easier then our point lights as the shine in a restricted direction and we thus can implement these with a single shadowmap.
But we can also have some fun with the color of these lights.

I'm also going to look into adding support for transparent surfaces.

Last but not least, I want to have a look into volumetric lighting. This is something I haven't played around with before so it is going to take a bit of research on my side.

Tuesday, 21 June 2016

Update on lighting

Hey all,

I've been nice and busy in what little spare time I have and have got point lights working nicely in the engine. Point lights are easiest to implement from a light perspective but are a bit of a pain when it comes to shadowmaps as you're basically creating a cube around the light and rendering a shadow map for each side. The plus side is that you can generally use static shadow maps (rendering them once and then just reusing). I'll look into combining static shadow maps for the environment with shadow maps to deal with moving objects at some later time to get a best of both worlds thing going.

I only have 3 point lights at this time but in theory I should be able to render a lot of them before any noticeable framerate drop. I won't however do that until I implement spot lights. Spotlights only illuminate in a particular direction and can use a single shadow map and in most cases suffice where a point light is overkill.

I updated the previous post with some images I generated from the buffers to hopefully make the buffers a bit clearer, I'll find some time to write up info about the point lights at a later stage. For now I will check in my changes as they are right now so you can have a look at the code and leave you with two images.

First an image where I'm rendering small versions of all the buffers ontop of the UI (though I'm only showing the 6 shadow maps for the first point light):

Then here is the end result of having some fun with lighting:

Saturday, 4 June 2016

Deferred lighting rendering #2 (part 30)

Sorry for the pause between posts. I've been caught up in life and tinkering away with other things lately. Also as I mentioned in my previous post I'm still not sure of the format I want to go forward with.

So we left off in the last part discussing how deferred lighting works. Lets have a look at the actual implementation (it was checked into GitHub a little while ago, I haven't labeled it yet though).

First off, I'll be changing the shaders to work for deferred lighting, that means they no longer work for our render to texture example that we used for our 3rd LOD of our trees. I could off course easily add a set of shaders to render that texture but I didn't feel that would add to our discussion, just distract from it. For now I've disabled that option but obviously the code for doing so is still in the previous examples and with a little bit of work you could change the engine to support both single stage and deferred rendering.

We've also switched off transparency support for now.

We don't change anything to rendering our shadow maps.

gBuffer

At the heart of our new rendering technique is what is often called rendering to a geometric buffer (because it holds various geometric data for our scene).
I've created a new library for this called gbuffer.h which is implemented in the same single file way we're used to right now.

I have to admit that at this stage I'm very tempted to rejig the engine to a normal header files + source files approach so I can compile the engine into a library to include in here. Anyway, I'm getting distracted :)

Note also at this point that I've added all the logic for the lighting stage into this file as well so you'll find structures and methods for those as well. We'll get there in due time.

An instance of a geographic buffer is contained within a structure called gBuffer which contains all the texture and the FBO we'll be using to render to the frame buffer.

In engine.c we define a global pointer to the gBuffer we'll be using and initialise this by calling newGBuffer in our engineLoad function and freeing our gBuffer in engineUnload.
Note that there is a HMD parameter send to the gBuffer routine which when set applies an experimental barrel distortion for head mounted devices such as a Rift or Vive. I won't be talking about that today as I haven't had a chance to hook it up to an actual HMD but I'll spend a post on it on its own once I've done so and worked out any kinks.

The gBuffer creates several textures that are all mapped as outputs on the FBO. These are hardcoded for now. They are only initialised in newGBuffer, they won't actually be created until you use the gBuffer for the first time and are recreated if the buffer needs to change size.

I have an enum called GBUFFER_TEXTURE_TYPE that is a nice helper to index our textures and then there are a number of arrays defined that configure the textures themselves:

// enumeration to record what types of buffers we need
enum GBUFFER_TEXTURE_TYPE {
  GBUFFER_TEXTURE_TYPE_POSITION,  /* Position */
  GBUFFER_TEXTURE_TYPE_NORMAL,    /* Normal */
  GBUFFER_TEXTURE_TYPE_AMBIENT,   /* Ambient */
  GBUFFER_TEXTURE_TYPE_DIFFUSE,   /* Color */
  GBUFFER_TEXTURE_TYPE_SPEC,      /* Specular */
  GBUFFER_NUM_TEXTURES,           /* Number of textures for our gbuffer */
};

...

// precision and color settings for these buffers
GLint  gBuffer_intFormats[GBUFFER_NUM_TEXTURES] = { GL_RGBA32F, GL_RGBA, GL_RGBA, GL_RGBA, GL_RGBA};
GLenum gBuffer_formats[GBUFFER_NUM_TEXTURES] = { GL_RGBA, GL_RGBA, GL_RGBA, GL_RGBA, GL_RGBA };
GLenum gBuffer_types[GBUFFER_NUM_TEXTURES] = { GL_FLOAT, GL_FLOAT, GL_UNSIGNED_BYTE, GL_UNSIGNED_BYTE, GL_UNSIGNED_BYTE };
GLenum gBuffer_drawBufs[GBUFFER_NUM_TEXTURES] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, GL_COLOR_ATTACHMENT2, GL_COLOR_ATTACHMENT3, GL_COLOR_ATTACHMENT4 };
char   gBuffer_uniforms[GBUFFER_NUM_TEXTURES][50] = { "worldPos", "normal", "ambient", "diffuse", "specular" };

When you look at how different people have implemented deferred lighting you'll see they all have a slightly different mix of outputs.
At minimum you'll find an output for position, color and normal. Other outputs depend on what sort of capabilities you want to enable in your lighting. Obviously the more outputs you have, the more overhead you have in clearing those buffers and reading from them in the lighting stage.

There are two other outputs I've added.

One is an ambient color output. Now this one you probably won't see very often. As we saw in our original shaders we simply calculate the ambient color as a fraction of the diffuse color so why store it separately? Well I use it in this case to render our reflection map output to so I know the color is used in full but another use would be for self lighting properties of a material. It is definitely one you probably want to leave out unless you have a specific use for it.

The other is the specular color output. Note that this is the base color that we use for specular output, not the end result. I also 'abuse' the alpha channel to encode the shininess factor. In many engines you'll see that this texture is used only to store the shininess factor because often either the diffuse color is used or the color of the light. If you go down this route you can also use the other 3 channels to encode other properties you want to use in your lighting stage.

Obviously there is a lot of flexibility here in customising what extra information you need. But lets look at the 3 must haves.

Position, this texture stores the position in view space of what we're rendering. We'll need that during our lighting stage so we can calculate our light to object vector for our diffuse and specular lighting especially for things like spotlights. This is by far the largest output buffer using 32bit floats for each color channel. Note that as we're encoding our position in a color, and our color values run from 0.0 to 1.0, we need to scale all our positions to that range.

Diffuse color, this texture stores the color of the objects we're rendering. Basically this texture will look like our end result but without any lighting applied.

Normals, this texture stores the normals of the surfaces we're rendering. We'll need these to calculate the angle at which light hits our surface to determine our intensities.

Changing our shaders

At this stage we're going to seriously dumb down our shaders. First off I've created an include file for our fragment shaders called output.fs:

layout (location = 0) out vec4 WorldPosOut; 
layout (location = 1) out vec4 NormalOut; 
layout (location = 2) out vec4 AmbientOut; 
layout (location = 3) out vec4 DiffuseOut; 
layout (location = 4) out vec4 SpecularOut; 

uniform float posScale = 1000000.0;

Looks pretty similar to our attribute inputs but not we use outputs. Note we've defined an output for each of our textures. The uniform float posScale is simply a configurable factor with which we'll scale our positions to bring them into the aforementioned 0.0 - 1.0 range.

Now I'm only going to look at one fragment shader and a dumbed down version of our standard shader at that but our outputs are used:

#version 330

in vec4           V;                                // position of fragment after modelView matrix was applied
in vec3           Nv;                               // normal vector for our fragment (inc view matrix)
in vec2           T;                                // coordinates for this fragment within our texture map

#include "outputs.fs"

void main() {
  vec4 fragcolor = texture(textureMap, T);
  WorldPosOut = vec4((V.xyz / posScale) + 0.5, 1.0); // our world pos adjusted by view scaled so it fits in 0.0 - 1.0 range
  NormalOut = vec4(Nv, 1.0); // our normal adjusted by view
  AmbientOut = vec4(fragcolor.rgb * ambient, 1.0);
  DiffuseOut = vec4(fragcolor.rgb * (1.0 - ambient), 1.0);
  SpecularOut = clamp(vec4(matSpecColor, shininess / 256.0), 0.0, 1.0);
}

So we can see that we just about copy all the outputs of our vertex shader right into our fragment shader. The only thing we're doing is scaling our position and our shineness factor.

We make similar changes to all our shaders. Note that for some shaders like our skybox shader we abuse our ambient output to ensure no lighting is applied.

Note that I have removed most of the uniform handling on our shaders that relate to lighting from our shader library. I could have left it in place and unused but for now I decided against that. You'll have to put them back in if you want to mix deferred and direct lighting.

Rendering to our gBuffer

Now that we've created our gBuffer and changed our shaders it is time to redirect our output to the gBuffer.

Our changes here are fairly simple. In our engineRender routine we simply add a call to gBufferRenderTo(...) to initialise our textures and make our FBO active. Our output now goes to our gBuffer.

We first clear our output but note though that I no longer clear the color buffer, only the depth buffer. Because of our skybox I know our scene will cover the entire output and thus there really is no need for this overhead.

The rest of the logic is pretty much the same as before as our shaders do most of the work but once our scene is rendered to the gBuffer we do have more work to do.

We're no longer outputting anything to screen, we now need to use our gBuffer to do our lighting pass to get our final output. Before we do so we first need to unset the FBO:

    // set our output to screen
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glViewport(wasviewport[0],wasviewport[1],wasviewport[2],wasviewport[3]);  
    glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);

And then we call gBufferDoMainPass to perform our main lighting pass.
After this we should do our additional calls for other lights we want to apply to our scene but that's something we'll come back to.

Our main lighting shaders

Before we can look at our lighting pass we need some more shaders. At this point all we have implemented is our main pass which applies our global lighting over our whole scene. To apply this we're going to draw two triangles making up a rectangle that fills the entire screen. We need to render each pixel on the screen once and this logic is thus applied fully in our fragment shader.

Our vertex shader (geomainpass.vs) for this pass is thus pretty simple:

#version 330

out vec2 V;

void main() {
  // our triangle primitive
  // 2--------1/5
  // |        /|
  // |      /  |
  // |    /    |
  // |  /      |
  // |/        |
  //0/3--------4

  const vec2 coord[] = vec2[](
    vec2(-1.0,  1.0),
    vec2( 1.0, -1.0),
    vec2(-1.0, -1.0),
    vec2(-1.0,  1.0),
    vec2( 1.0,  1.0),
    vec2( 1.0, -1.0)
  );

  V = coord[gl_VertexID];
  gl_Position = vec4(V, 0.0, 1.0);
}

We don't need any buffers for this and we thus render these two triangles with a simple call to glDrawArrays(GL_TRIANGLES, 0, 3 * 2).

The magic happens in our fragment shader (geomainpass.fs, I've left out the barrel distortion in the code below):

#version 330

uniform sampler2D worldPos;
uniform sampler2D normal;
uniform sampler2D ambient;
uniform sampler2D diffuse;
uniform sampler2D specular;

uniform float posScale = 1000000.0;

// info about our light
uniform vec3      lightPos;                         // position of our light after view matrix was applied
uniform vec3      lightCol = vec3(1.0, 1.0, 1.0);   // color of the light of our sun

#include "shadowmap.fs"

in vec2 V;
out vec4 fragcolor;

void main() {
  // get our values...
  vec2 T = (V + 1.0) / 2.0;
  vec4 ambColor = texture(ambient, T);  
  if (ambColor.a < 0.1) {
    // if no alpha is set, there is nothing here!
    fragcolor = vec4(0.0, 0.0, 0.0, 1.0);
  } else {
    vec4 V = vec4((texture(worldPos, T).xyz - 0.5) * posScale, 1.0);
    vec3 difColor = texture(diffuse, T).rgb;
    vec3 N = texture(normal, T).xyz;
    vec4 specColor = texture(specular, T);

    // we'll add shadows back in a minute
    float shadowFactor = shadow(V);

    // Get the normalized directional vector between our surface position and our light position
    vec3  L = normalize(lightPos - V.xyz);
    float NdotL = max(0.0, dot(N, L));
    difColor = difColor * NdotL * lightCol * shadowFactor;

    float shininess = specColor.a * 256.0;
    if ((NdotL != 0.0) && (shininess != 0.0)) {
      // slightly different way to calculate our specular highlight
      vec3  halfVector  = normalize(L - normalize(V.xyz));
      float nxHalf = max(0.0, dot(N, halfVector));
      float specPower = pow(nxHalf, shininess);
      
      specColor = vec4(lightCol * specColor.rgb * specPower * shadowFactor, 1.0);
    } else {
      specColor = vec4(0.0, 0.0, 0.0, 0.0);
    };

    fragcolor = vec4(ambColor.rgb + difColor + specColor.rgb, 1.0);
  }
}

Most of this code you should recognise from our normal shader as we're basically applying our lighting calculations exactly as they were in our old shaders. The difference being that we're getting our intermediate result from our textures first.

Because these shaders are very different from our normal shaders and we simply initialise them when we build our gBuffer I've created a seperate structure for them called lightShader with related functions. They still use much of the base code from our shader library.

Note also that because we're rendering the entire screen and no longer using a z-buffer, I've uncommented the code that clears our buffers in our main loop.

And thats basically it. There is a bit more too it in nitty gritty work but I refer you to the source code on github.

Next up

I really need to add some images of our intermediate textures and I may add those to this writeup in the near future but I think this has covered enough for today.

I'll spend a short post next time about the barrel distortion if I get a chance to borrow a friends Rift or Vive.

The missing bit of this writeup is adding more lights. At this point in time we've not won anything over our earlier implementation, we've actually lost due to the added overhead.
Once we start adding lights to our scene the benefits of this approach will start to show though I've learned that on the new generation hardware a single pass renderer can render multiple lights more then fast enough to make the extra efforts and headaches not worth it.

Thursday, 5 May 2016

Deferred lighting rendering #1 (part 30)

Alrighty, this one is going to take a bit of time to slowly put together. I'm tinkering with the idea of changing the format of these blog posts. This is really no longer a tutorial but is starting to become a development blog on building my little 3D engine.

The sheer amount of changes I've made to the engine over the past two weeks, many of which are just background things, would just deter from the topic I'm trying to discuss...

So let's just see where this will lead us but I may just start focussing on the core changes and leave it up to you to examine the other stuff by looking at the changes checked in on github.

Speaking of github, I'm still polishing up things so I won't be checking things in just yet. This mostly is more of an introduction into what we'll be doing.

Deferred lighting, I've mentioned it a few times now since I started this series. What is it all about?

Deferred lighting or deferred shading is a technique that implements our rendering into two (or more) separate passes. So far we've been doing all the work in a single pass, every pixel we render to the screen we do all the calculations for to determine its final color in one go.
With deferred lighting we first render our scene to a number of temporary buffers in which we store information about all our objects without doing any lighting calculations, and then we do a pass for each light in our scene to render our final scene.

So why is this a good thing? Well lets start by saying it doesn't have to be :)
When we look at our current example, an outdoor scene with very little detail, well it won't help us at all. In fact after all the work of the last two weeks, the engine is slower then before..

But there is something we can already see in our current implementation of our shaders. Just take a good look at our standard fragment shader. Even with just the calculations of a single light, a very simple directional light, 90% of the fragment shader is all focussed on calculating the amount of light illuminating our fragment.
Now with a simple scene, a bit of sorting of our objects, and good use of our Z-buffer it's a likely assumption we're already discarding all the fragments that would result in all these calculations being done more then once for the same pixel as objects overlap.
But with a complex scene, with many overlapping objects, the likelihood of doing many of these complex calculation for nothing increases.

This is where the first strong point of a deferred shader comes into play. By rendering all the geometry to a set of temporary buffers without doing all our lighting calculations making this step very fast, we can then do a second pass to only do the lighting calculations for those fragments that actually end up being pixels in our end result. The overhead of rendering our scene in two passes can easily be offset by the time we save not doing complex lighting calculations multiple times for the same pixel.

But the second strong point is where this approach comes into its own. Right now we have one light. A light that potentially effects the whole screen, and in that our sun is actually rather unique.
Imagine an indoor scene with a number of lights scattered around the room, maybe a desk lamp somewhere, or the light cast by a small LED clock display.
With a single pass renderer we're checking for the effects of these light for every fragment we attempt to render even though they may only effect a tiny part of our end result.
In a deferred renderer we're using 3D meshes that encompass the area effected by these types of lights. We then render these "lights" onto our end result and only apply the lighting calculations for each light where it counts as part of our 2nd pass.

When we look at an environment where we want to deal with many lights effecting complex geometry a deferred renderer can be much faster then a single pass renderer.
But it is very important to realise that as with many techniques there is a tradeoff.

Basically there are four main weak points of a deferred renderer that I can name off the top of my head:
- for simpler or more open space environments the added overhead won't be won back as I've explained up above.
- as we'll see in this part of the series, transparency is relatively hard to do. You either need to move rendering transparent objects to your lighting pass or introduce additional buffers
- multi layer materials with different lighting properties are very hard to do because we don't do our lighting until we've already "flattened" our material.
- we can't use some optimisations in lighting normally performed in the vertex shader because the input into our lighting scene is in view space. It's a shame we didn't already implement it before we got to this point so I could show the differences but an example of this is normal mapping. You can avoid doing some costly calculations in the fragment shader by projecting your light based on the orientation of the polygon you're rendering and using your normal map directly. But in a deferred shader your have to project all the normals in your normal map to view space.

When we look at the open world environment we've been working with so far a single pass shader would definitely be the better choice. Let's say we turn this into a little flight simulator, or turn this into a little racing simulator, we're probably want to stick with the single pass shaders we've build so far.

But when I look at the platformer I want to build, I'm suddenly less concerned about the drawbacks and much more enthusiastic about the lighting capabilities of our deferred lighting approach.

At this point in time the changes to the renderer to enable deferred shading have replaced the single pass approach. This is one of the areas where I want to do a bit more spit and polish. Ideally I want to create a situation where I can chose which approach I want to take.

Anyways, that's it for now. In the next post we'll start having a closer look at how the deferred lighting rendered is actually works.

Tuesday, 19 April 2016

A simple shader preprocessor (part 29)

As I started planning out the additions for my deferred lighting renderer I realised I could no longer postpone implementing at least a basic shader preprocessor.

While some parts of the code can be moved more central other parts need to be further duplicated and in doing so the need to fix the same issues in multiple places make things harder and harder to maintain.

For my goals however I don't need the preprocessor to do much so we can keep everything very simple and we'll limit the functionality to the following:

support for a #include to insert the text from a file into our shader
supplying a number of "defines" which we can trigger logic
very basic #ifdef, #ifndef and #else logic that use these defines to include or exclude parts of the shader code

Changes to our system library

I was thinking about putting most of this code in our system.h file but decided against that for now. I may yet change this in the future. For now one support function has been added here:

// gets the portion of the line up to the specified delimiter(s)
// return NULL on failure or if there is no text
// returns string on success, calling function is responsible for freeing the text
char * delimitText(const char *pText, const char *pDelimiters) {
  int    len = 0;
  char * result = NULL;
  bool   found = false;
  int    delimiterCount;

  delimiterCount = strlen(pDelimiters) + 1; // always include our trailing 0 as a delimiter ;)

  while (!found) {
    int pos = 0;
    while ((!found) && (pos < delimiterCount)) {
      if (pText[len] == pDelimiters[pos]) {
        found = true;
      };
      pos++;
    };

    if (!found) {
      len++;
    };
  };

  if (len != 0) {
    result = malloc(len + 1);
    if (result != NULL) {
      memcpy(result, pText, len);
      result[len] = 0;
    };
  };

  return result;
};

This function splits off the first part of the text pointed to by pText from the start until it detects one of the delimiters or the end of the string.
This is fairly similar to the code we wrote before to read out material and object files line by line but without using our varchar implementation.

Changes to our varchar library

We are going to use varchar.h but in combination with a linked list to store our defines in. For this I've added 3 new functions:

// list container for varchars
// llist * strings = newVarcharList()
llist * newVarcharList() {
  llist * varcharList = newLlist((dataRetainFunc) varcharRetain, (dataFreeFunc) varcharRelease);
  return varcharList;
};

This first function simply returns a linked list setup to accept varchar objects.

// list container for varchars created by processing a string
// empty strings will not be added but duplicate strings will be
// llist * strings = newVarcharList()
llist * newVCListFromString(const char * pText, const char * pDelimiters) {
  llist * varcharList = newVarcharList();

  if (varcharList != NULL) {
    int    pos = 0;

    while (pText[pos] != 0) {
      // find our next line
      char * line = delimitText(pText + pos, pDelimiters);
      if (line != NULL) {
        int len = strlen(line);

        varchar * addChar = newVarchar();
        if (addChar != NULL) {
          varcharAppend(addChar, line, len);

          llistAddTo(varcharList, addChar);
        };

        if (pText[pos + len] != 0) {
          // skip our newline character
          pos += len + 1;
        } else {
          // we found our ending
          pos += len;
        };

        free(line);
      } else {
        // skip any empty line...
        pos++;
      };
    };
  };

  return varcharList;
};

This method uses our new delimitText function to pull a given string appart and add each word in the string as an entry into a new linked list.

// check if our list contains a string
bool vclistContains(llist * pVCList, const char * pText) {
  if ((pVCList != NULL) && (pText != NULL)) {
    llistNode * node = pVCList->first;

    while (node != NULL) {
      varchar * text = (varchar *) node->data;

      if (varcharCmp(text, pText) == 0) {
        return true;
      };

      node = node->next;
    };
  };

  // not found
  return false;
};

And finally a function that checks if a given word is present in our linked list.

Changes to our shader library

The real implementation can be found in our shader library. We've added a new parameter to our newShader function so we can pass it the defines we want to use for that shader:

shaderInfo * newShader(const char *pName, const char * pVertexShader, const char * pTessControlShader, const char * pTessEvalShader, const char * pGeoShader, const char * pFragmentShader, const char *pDefines) {
  shaderInfo * newshader = (shaderInfo *)malloc(sizeof(shaderInfo));
  if (newshader != NULL) {
    llist * defines;
    ...
    // convert our defines
    defines = newVCListFromString(pDefines, " \r\n");

    // attempt to load our shader by name
    if (pVertexShader != NULL) {
      shaders[count] = shaderLoad(GL_VERTEX_SHADER, pVertexShader, defines);
      if (shaders[count] != NO_SHADER) count++;      
    };
    ...
    // no longer need our defines
    if (defines != NULL) {
      llistFree(defines);
    };
    ...
  return newshader;
};

We first convert our new parameter pDefines into a linked list of varchars by calling our new newVCListFromString function.
We then pass our new linked list to each shaderLoad call so it can be used by our preprocessor.
Finally we deallocate our linked list and all the varchars held within.

The only change in shaderLoad is that it no longer called loadFile directly but instead calls shaderLoadAndPreprocess:

varchar * shaderLoadAndPreprocess(const char *pName, llist * pDefines) {
  varchar * shaderText = NULL;

  // create a new varchar object for our shader text
  shaderText = newVarchar();
  if (shaderText != NULL) {
    // load the contents of our file
    char * fileText = loadFile(shaderPath, pName);

    if (fileText != NULL) {
      // now loop through our text line by line (we do this with a copy of our pointer)
      int    pos = 0;
      bool   addLines = true;
      int    ifMode = 0; // 0 is not in if, 1 = true condition not found, 2 = true condition found

      while (fileText[pos] != 0) {
        // find our next line
        char * line = delimitText(fileText + pos, "\n\r");

        // found a non-empty line?
        if (line != NULL) {
          int len = strlen(line);

          // check for any of our preprocessor checks
          if (memcmp(line, "#include \"", 10) == 0) {
            if (addLines) {
              // include this file
              char * includeName = delimitText(line + 10, "\"");
              if (includeName != NULL) {
                varchar * includeText = shaderLoadAndPreprocess(includeName, pDefines);
                if (includeText != NULL) {
                  // and append it....
                  varcharAppend(shaderText, includeText->text, includeText->len);
                  varcharRelease(includeText);
                };
                free(includeName);
              };
            };
          } else if (memcmp(line, "#ifdef ", 7) == 0) {
            if (ifMode == 0) {
              char * ifdefined;

              ifMode = 1; // assume not defined....
              ifdefined = delimitText(line + 7, " ");
              if (ifdefined != NULL) {
                // check if our define is in our list of defines
                if (vclistContains(pDefines, ifdefined)) {
                  ifMode = 2;
                };
                free(ifdefined);
              };
              addLines = (ifMode == 2);              
            } else {
              errorlog(SHADER_ERR_NESTED, "Can't nest defines in shaders");
            };
          } else if (memcmp(line, "#ifndef ", 8) == 0) {
            if (ifMode == 0) {
              char * ifnotdefined;

              ifMode = 1; // assume not defined....
              ifnotdefined = delimitText(line + 7, " ");
              if (ifnotdefined != NULL) {
                // check if our define is not in our list of defines
                if (vclistContains(pDefines, ifnotdefined) == false) {
                  ifMode = 2;
                };
                free(ifnotdefined);
              };
              addLines = (ifMode == 2);              
            } else {
              errorlog(SHADER_ERR_NESTED, "Can't nest defines in shaders");
            };
          } else if (memcmp(line, "#else", 5) == 0) {
            if (ifMode == 1) {
              ifMode = 2;
              addLines = true;
            } else {
              addLines = false;
            };
          } else if (memcmp(line, "#endif", 6) == 0) {
            addLines = true;
            ifMode = 0;
          } else if (addLines) {
            // add our line
            varcharAppend(shaderText, line, len);
            // add our line delimiter
            varcharAppend(shaderText, "\r\n", 1);
          };

          if (fileText[pos + len] != 0) {
            // skip our newline character
            pos += len + 1;
          } else {
            // we found our ending
            pos += len;
          };

          // don't forget to free our line!!!
          free (line);
        } else {
          // skip empty lines...
          pos++;
        };
      };

      // free the text we've loaded, what we need has now been copied into shaderText
      free(fileText);
    };

    if (shaderText->text == NULL) {
      varcharRelease(shaderText);
      shaderText = NULL;
    };
  };

  return shaderText;
};

I'm not going to detail each and every section, I hope the comments do a good enough job for that. In a nutshell however, we start by creating a new varchar variable called shaderText which is what we'll end up returning. This means that our shaderLoad function also has a small change to work with a varchar instead of a char pointer as a result.
After this we load the contents of our shader file into a variable called fileText but instead of using this directly we use delimitText to loop through our shader text one line at a time.
For each line we check if it starts with one of our preprocessor commands and if so handle the special logic associated with it. If not we simply add our line to our shaderText variable.

#include is the first preprocessor command we handle, it simply checks the filename presented and attempts to load that file by calling shaderLoadAndPreprocess recursively.

This is followed by the code that interprets our #ifdef, #ifndef, #else and #endif preprocessor commands. These basically check if the given define is present in our linked list. They toggle the values of ifMode and addLines that control whether we ignore text in our shader file or add the lines to our shaderText.

Changes to our shaders

I've made two changes to our shaders, the first is that I've created a new shader files called "shadowmap.fs" that contains our samplePCF, shadow and shadowTest functions and we use #include in the various fragment shaders where we need these functions.

The second change is that I've combined our flatshader.fs, textured.fs and reflect.fs fragment shaders into a single standard.fs file that looks as follows:

#version 330

// info about our light
uniform vec3      lightPos;                         // position of our light after view matrix was applied
uniform float     ambient = 0.3;      // ambient factor
uniform vec3      lightcol = vec3(1.0, 1.0, 1.0);   // color of the light of our sun

// info about our material
uniform float     alpha = 1.0;                      // alpha for our material
#ifdef textured
uniform sampler2D textureMap;                       // our texture map
#else
uniform vec3      matColor = vec3(0.8, 0.8, 0.8);   // color of our material
#endif
uniform vec3      matSpecColor = vec3(1.0, 1.0, 1.0); // specular color of our material
uniform float     shininess = 100.0;                // shininess

#ifdef reflect
uniform sampler2D reflectMap;                       // our reflection map
#endif

// these are in world coordinates
in vec3           E;                                // normalized vector pointing from eye to V
in vec3           N;                                // normal vector for our fragment

// these in view
in vec4           V;                                // position of fragment after modelView matrix was applied
in vec3           Nv;                               // normal vector for our fragment (inc view matrix)
in vec2           T;                                // coordinates for this fragment within our texture map
in vec4           Vs[3];                            // our shadow map coordinates
out vec4          fragcolor;                        // our output color

#include "shadowmap.fs"

void main() {
#ifdef textured
  // start by getting our color from our texture
  fragcolor = texture(textureMap, T);  
  fragcolor.a = fragcolor.a * alpha;
  if (fragcolor.a < 0.2) {
    discard;
  };
#else
  // Just set our color
  fragcolor = vec4(matColor, alpha);
#endif

  // Get the normalized directional vector between our surface position and our light position
  vec3 L = normalize(lightPos - V.xyz);
  
  // We calculate our ambient color
  vec3  ambientColor = fragcolor.rgb * lightcol * ambient;

  // Check our shadow map
  float shadowFactor = shadow(Vs[0], Vs[1], Vs[2]);
  
  // We calculate our diffuse color, we calculate our dot product between our normal and light
  // direction, note that both were adjusted by our view matrix so they should nicely line up
  float NdotL = max(0.0, dot(Nv, L));
  
  // and calculate our color after lighting is applied
  vec3 diffuseColor = fragcolor.rgb * lightcol * (1.0 - ambient) * NdotL * shadowFactor;

  // now for our specular lighting
 vec3 specColor = vec3(0.0);
  if ((NdotL != 0.0) && (shininess != 0.0)) {
    // slightly different way to calculate our specular highlight
    vec3 halfVector = normalize(L - normalize(V.xyz));
    float nxHalf = max(0.0, dot(Nv, halfVector));
    float specPower = pow(nxHalf, shininess);
  
    specColor = lightcol * matSpecColor * specPower * shadowFactor;
  };

#ifdef reflect
  // add in our reflection, this is one of the few places where world coordinates are paramount. 
  vec3  r = reflect(E, N);
  vec2  rc = vec2((r.x + 1.0) / 4.0, (r.y + 1.0) / 2.0);
  if (r.z < 0.0) {
   r.x = 1.0 - r.x;
  };
  vec3  reflColor = texture(reflectMap, rc).rgb;

  // and add them all together
  fragcolor = vec4(clamp(ambientColor+diffuseColor+specColor+reflColor, 0.0, 1.0), fragcolor.a);
#else
  // and add them all together
  fragcolor = vec4(clamp(ambientColor+diffuseColor+specColor, 0.0, 1.0), fragcolor.a);
#endif
}

Note the inclusion of our #ifdef blocks to change between our various bits of logic while reusing code that is the same in all three shaders.
We can now change our shader loading code in engine.h to the following:

  colorShader = newShader("flatcolor", "standard.vs", NULL, NULL, NULL, "standard.fs", "");
  texturedShader = newShader("textured", "standard.vs", NULL, NULL, NULL, "standard.fs", "textured");
  reflectShader = newShader("reflect", "standard.vs", NULL, NULL, NULL, "standard.fs", "reflect");

If we need it we could very quickly add a fourth shader that combines texture mapping and reflection mapping by simply passing "textured reflect" as our defines.

In the same way I've combined our shadow shaders into a single shader file.

Obviously there is a lot of room for improvement here, but it is a start and enough to keep us going for a little bit longer.

Download the source here

What's next?

Now we're ready to start working on our deferred shader.

Saturday, 9 April 2016

Shadow maps #3 (part 28)

Ok, time for the last part of implementing basic shadow maps. The technique we're going to look at in this post is called cascading shadow maps. This is a technique mostly used to improve the quality for lights that effect large areas such as our sunlight. The problem we've had so far is that a high quality shadow map will only produce shadows in a small area while a large area shadow maps will be of such low quality things get very blocky.

While using a smoothing technique like we did in our last post does improve this somewhat up close shadows do not look very good. The lower quality shadow maps are fine for things that are further away.

Now there are several techniques that can improve this each with their own strong points and weakpoints. One alternative I'd like to mention is altering the projection matrix so the projection is skewed based on the distance to the camera ensuring we have higher detail shadow maps closer to the camera in a single map.
I'd also like to point to a completely different technique called shadow volumes, I've not implemented those myself but reading about them I'm interested to try some day. They seem to give incredible results but they may be more difficult to implement if you have loads of moving objects. I'm no expert in them yet so I'll refrain from commenting too much.

The technique we'll be using is where we simply render multiple shadow maps and pick the one best suited to what we're rendering. So we have a high quality shadow map for shadows cast close to the camera, we have a medium quality shadow map for things further away, and we have a low quality shadow map for things even further out. The screenshot below shows the three maps where I've changed the color of the shadow cast to highlight the transitions between the shadow maps (I left the code that produces this in our terrain fragment shader but disabled it, if you want to play around with it):

Now I'm keeping things simple here and as a result we're adding more overhead then we need.
First off, where there is overlap in the shadow maps we're rendering a bunch of detail into the lower quality shadow maps that will never be used. We could use a stencil buffer to cut that out but I'm not sure how much that would improve things as we're really not doing anything in the fragment shader anyway. Another improvement I've thought about is using our bounding box checking logic to exclude anything that falls fully within the overlap space, that might make a noticeable difference.
Second, depending on our camera position and the angle of our sun we may not need the other shadow maps at all.
Third, I already mentioned this in my previous posts and this ties into the second point, we're centering our shadow maps on the camera position so in worse case half our our shadow maps will never be used. Adjusting our lookat point for our shadows may allow us to cover a greater area with our higher detail shadow map.

These are all issues for later to deal with. It's worth noting though that with the changes we're making today on my little MacBook Pro the frame rate has suffered and while we were rendering at a comfortable 60fps unless we move to high in our scene, it's dropped to 30 to 40fps at the moment.
I have added one small enhancement and that is that I only re-render our shadow maps if our lighting direction has changed (which usually is a static) or if our lookat point has moved more then a set distance (we do this by rounding our look at position).

Last but not least, I've added a small bit of code to react to the - and = (+) keys and move the position of the sun. There is no protection for "night time" so we actually end up lighting the scene from below.

Going from 1 to 3 shadow maps

Obviously we need to add support for our 3 levels of shadow maps first. This starts with adjusting our lightsource structure:

// and a structure to hold information about a light (temporarily moved here)
typedef struct lightSource {
  float             ambient;          // ambient factor for our light
  vec3              position;         // position of our light
  vec3              adjPosition;      // position of our light with view matrix applied
  bool              shadowRebuild[3]; // do we need to rebuild our shadow map?
  vec3              shadowLA[3];      // remembering our lookat point for our shadow map
  texturemap *      shadowMap[3];     // shadowmaps for this light
  mat4              shadowMat[3];     // view-projection matrices for this light
} lightSource;

Note that as we're not yet dealing with anything but our sun as a lightsource I'm not putting any code in yet to support a flexible number of shadow maps.

So we now have 3 shadow maps and three shadow matrices to go with them. There is also a set of flags that determine if shadow maps need to be rebuild and a set of look at coordinates that we can use to check if we've moved our camera enough in order to need to rebuild our shadow maps.

It is important to realise at this point that this won't be enough once we start moving objects around. The easiest is to update our rebuild flags but we may as well remove this all together once things start moving around. A better solution would be to render our shadow maps with all the static objects only, and either overlay or add in objects that move around as we render our scenes. Thats something for much later however.

Similarly our shader library is enhanced to support the 3 shadow maps as well:

...
// structure for encapsulating a shader, note that not all ids need to be present (would be logical to call this struct shader but it's already used in some of the support libraries...)
typedef struct shaderInfo {
  ...
  GLint   shadowMapId[3];           // ID of our shadow maps
  GLint   shadowMatId[3];           // ID for our shadow matrices
  ...
} shaderInfo;
...
void shaderSetProgram(shaderInfo * pShader, GLuint pProgram) {
  ...
  for (i = 0; i < 3; i++) {
    sprintf(uName, "shadowMap[%d]", i);
    pShader->shadowMapId[i] = glGetUniformLocation(pShader->program, uName);
    if (pShader->shadowMapId[i] < 0) {
      errorlog(pShader->shadowMapId[i], "Unknown uniform %s:%s", pShader->name, uName);
    };
    sprintf(uName, "shadowMat[%d]", i);
    pShader->shadowMatId[i] = glGetUniformLocation(pShader->program, uName);
    if (pShader->shadowMatId[i] < 0) {
      errorlog(pShader->shadowMatId[i], "Unknown uniform %s:%s", pShader->name, uName);
    };
  };
  ...

And we need a similar change to our materials library to inform our shaders of the 3 shadow maps:

...
bool matSelectProgram(material * pMat, shaderMatrices * pMatrices, lightSource * pLight) {
  ...
  for (i = 0; i < 3; i++) {
    if (pMat->matShader->shadowMapId[i] >= 0) {
      glActiveTexture(GL_TEXTURE0 + texture);
      if (pLight->shadowMap[i] == NULL) {
        glBindTexture(GL_TEXTURE_2D, 0);      
      } else {
        glBindTexture(GL_TEXTURE_2D, pLight->shadowMap[i]->textureId);
      }
      glUniform1i(pMat->matShader->shadowMapId[i], texture); 
      texture++;   
    };
    if (pMat->matShader->shadowMatId[i] >= 0) {
      glUniformMatrix4fv(pMat->matShader->shadowMatId[i], 1, false, (const GLfloat *) pLight->shadowMat[i].m);
    };
  };
  ...

These changes should all be pretty straight forward so far.

Rendering our 3 shadow maps

Rendering 3 maps instead of 1 is simply a matter of calling our shadow map code 3 times. For this to work I've changed our renderShadowMapForSun function so I can parse it parameters to let it know which shadow map we're rendering and at what level of detail we want it. I'm just adding the start of the code here as most of the function has stayed the same from our first part. Have a look at the full source on github to see that other changes needed:

...
// render our shadow map
// we'll place this in our engine.h for now but we'll soon make this part of our lighting library
void renderShadowMapForSun(bool * pRebuild, texturemap * pShadowMap, vec3 * pLookat, mat4 * pShadowMat, int pResolution, float pSize) {
  vec3 newLookat;

  // prevent rebuilds if we only move a tiny bit....
  newLookat.x = camera_eye.x - fmod(camera_eye.x, pSize/100.0);
  newLookat.y = camera_eye.y - fmod(camera_eye.x, pSize/100.0);
  newLookat.z = camera_eye.z - fmod(camera_eye.x, pSize/100.0);

  if ((pLookat->x != newLookat.x) || (pLookat->y != newLookat.y) || (pLookat->z != newLookat.z)) {
    vec3Copy(pLookat, &newLookat);
    *pRebuild = true;
  };

  // we'll initialize a shadow map for our sun
  if (*pRebuild == false) {
    // reuse it as is...
  } else if (tmapRenderToShadowMap(pShadowMap, pResolution, pResolution)) {
  ...

I'm highlighting this part of the code because it shows the changes we made to limit the number of times we rebuild our shadow maps. We round our lookat position based on the level of detail we want in our shadow map. For our closest shadow map we may only move our camera 15 units before we need to rebuild our shadow maps while for our higher detail map it will be 100 units. Obviously if our light position changes we set our rebuild flags to true and we rebuild all shadow maps.

Finally we need to call this method 3 times which we do in our engineRender method:

  ...
  if (pMode != 2) {
    renderShadowMapForSun(&sun.shadowRebuild[0], sun.shadowMap[0], &sun.shadowLA[0], &sun.shadowMat[0], 4096, 1500);
    renderShadowMapForSun(&sun.shadowRebuild[1], sun.shadowMap[1], &sun.shadowLA[1], &sun.shadowMat[1], 4096, 3000);
    renderShadowMapForSun(&sun.shadowRebuild[2], sun.shadowMap[2], &sun.shadowLA[2], &sun.shadowMat[2], 2048, 10000);
  };
  ...

So our highest quality shadow map is a 4096x4096 map that covers an area of 3000x3000 units (2*1500).
Our lowest quality shadow map is a 2048x2048 map that covers an area of 20000x20000 units.

Note that this is where the color coded rendering of the shadow maps does come in handy for tweaking what works well as the size of our maps depend a lot on the sizes of your objects and what you consider to be close or far.

Changing our shaders

The final ingredient is changing our shaders. Again at this stage we need to update all our shaders but I'm only going to look at the changes once.

In our vertex shader (and in our tessellation evaluation shader for our terrain) we now need to calculate 3 vertices projected for our shadow maps. In this case I'm fully writing them out as its faster then looping:

...
uniform mat4      shadowMat[3];   // our shadows view-projection matrix
...
// shadow map
out vec4          Vs[3];          // our shadow map coordinates

void main(void) {
  ...
  // our shadow map coordinates
  Vs[0] = shadowMat[0] * model * V;
  Vs[1] = shadowMat[1] * model * V;
  Vs[2] = shadowMat[2] * model * V;

  ...

Our fragment shaders need to be adjusted as well. First off we need to change our samplePCF function so it checks a specific shadow map:

...
uniform sampler2D shadowMap[3];                     // our shadow map
in vec4           Vs[3];                            // our shadow map coordinates
...
float samplePCF(float pZ, vec2 pCoords, int pMap, int pSamples) {
  float bias = 0.0000005; // our bias
  float result = 1.0; // our result
  float deduct = 0.8 / float(pSamples); // deduct if we're in shadow

  for (int i = 0; i < pSamples; i++) {
    float Depth = texture(shadowMap[pMap], pCoords + offsets[i]).x;
    if (pZ - bias > Depth) {
      result -= deduct;
    };  
  };
    
  return result;
}
...

And finally we need to change our shadow function to figure out which shadow map to use.

We simply start with our highest quality shadow map and if our projection coordinates are within bounds we use it, else we check a level up:

...
// check if we're in shadow..
float shadow(vec4 pVs0, vec4 pVs1, vec4 pVs2) {
  float factor;
  
  vec3 Proj = pVs0.xyz / pVs0.w;
  if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
    // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
    factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 0, 9);
  } else {
    vec3 Proj = pVs1.xyz / pVs1.w;
    if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
      // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
      factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 1, 4);
    } else {
      vec3 Proj = pVs2.xyz / pVs2.w;
      if ((abs(Proj.x) < 0.99) && (abs(Proj.y) < 0.99) && (abs(Proj.z) < 0.99)) {
        // bring it into the range of 0.0 to 1.0 instead of -1.0 to 1.0
        factor = samplePCF(0.5 * Proj.z + 0.5, vec2(0.5 * Proj.x + 0.5, 0.5 * Proj.y + 0.5), 2, 1);
      } else {
        factor = 1.0;
      };
    };
  };

  return factor;
}

void main() {
  ...
  // Check our shadow map
  float shadowFactor = shadow(Vs[0], Vs[1], Vs[2]);
  ...

And that's it.

For this part I've created a Tag in Github instead of a branch. We'll see which works better.
Download the source code

And a quick video showing the end result:

What's next

I think I've gone as far as I want with shadows for now. The next part may take while before I get it done as there is a lot involved rewriting our code so far to a deferred lighting model but that's what we'll be doing next.

After that we'll start looking at adding additional lights and looking at other shading techniques.
Somewhere in the middle we'll also start looking at adding a simple pre-processor to our shaders so we can start reusing some code and make our shaders easier to put together.