Tuesday 23 August 2016

Update of GLFW tutorial

Hey all,

I've found some time to update the makefile for the Windows build so everything compiles. I've tested this on a new computer I was able to use at work, using Visual Studio 2013, and it works pretty nicely.

I've also updated both the Windows and Mac builds to use the latest GLFW version, 3.2.1. I haven't changed any functionality as I haven't had time to look into the changes in this new release, but it seems to have a few interesting bits that I'll be looking into.

Other than that, I've slowly been working on my forward+ renderer. It is now rendering well over 1000 point lights on my little old MacBook, reaching about 45fps at 1280x720 in windowed mode.
On the new Windows machine it fluctuates between 80 and 150 fps; that thing is fast.
One nice thing is that the number of lights does not affect performance too much.

Mind you, it's still a good 25% slower than the deferred renderer. Granted, I never brought that one up to supporting a large volume of lights, but I'm still surprised at the results so far. Given the simplicity of the scene I'm rendering, the forward+ renderer should be doing less work since it doesn't have to build 5 really large images. It really makes me wonder where it is losing most of the speed.

I like the simpler implementation of the forward+ shader, however, so I'm not giving up on it yet.

Saturday 13 August 2016

Just an update

Hi All,

I haven't posted for a while; I've been hard at work, on and off, reshuffling my code. I haven't checked anything into GitHub yet and probably won't until I'm further down the line, but the code is now split into normal header and source files (easier to maintain, faster to compile) and I've turned it back into a forward renderer, now with the lighting code built into the fragment shader.

That has yielded some surprising results. Two things really stood out:
1) It's costing way more performance than I thought. I've only enabled the directional light and the spot lights at the moment and it is already much slower than the deferred lighting approach. Add in any of the point lights and things just die. Now part of this is looping through the lights, which is notoriously bad for the GPU, but I was still surprised at how much performance was lost.
2) With the deferred lighting approach there is a sort of natural optimisation in resource usage: you select the shadow map for the one light you're rendering and the code stays very simple for that light.
With multiple lights in a single fragment shader you have to have all the resources active at once, and your shader has to be able to handle multiple types of lights, so there are far more conditions.

Time will tell if changing this over to what I've learned is called forward+ makes enough of an improvement to make it worth the while. If not, I'll go back to the fully deferred renderer. Still, the advantage of being able to handle transparent objects without a second renderer implementation is appealing, so we'll stick with it for now.

The journey itself hasn't been without its bonuses. The first was dealing with the resource issue. With 3 shadow maps for my directional light and 6 shadow maps for each point light, I quickly run out of active textures. The graphics card in my MacBook Pro can only handle a measly 16 active textures (this is not a limit on the number of textures, but on how many you can use simultaneously).
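If you want to check what your own hardware supports, a query like this will tell you (a rough sketch, assuming an OpenGL context is already current and your usual headers/loader are included):

    GLint maxTextureUnits = 0;
    glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, &maxTextureUnits);
    printf("texture units available to the fragment shader: %d\n", maxTextureUnits);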

Layers to the rescue!

Layers come from array textures. They allow you to store multiple images (layers) in a single texture, and my graphics card allows me to create hundreds of layers in a single texture, limited only by the memory on the graphics card. In your shader you simply sample with a vec3 instead of a vec2, where the z component selects the layer.
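To give an idea of what that looks like, here's a rough sketch of creating a layered depth texture for the shadow maps (the size, format and layer count are illustrative, not my actual values):

    // allocate a depth texture with multiple layers, one layer per shadow map
    GLuint shadowMaps;
    glGenTextures(1, &shadowMaps);
    glBindTexture(GL_TEXTURE_2D_ARRAY, shadowMaps);
    glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_DEPTH_COMPONENT32F,
                 1024, 1024,   // width and height of every layer
                 9,            // number of layers
                 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    // in GLSL the sampler becomes a sampler2DArray and you sample with a vec3,
    // where the z component selects the layer:
    //   float depth = texture(shadowMapArray, vec3(uv, layer)).r;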

Another apparent plus point is that you can bind the whole texture array (a texture with layers) to a framebuffer and then render to the individual layers by selecting the layer in your geometry shader.
I was able to render all 3 shadow maps for my directional light in one pass by having my geometry shader emit every face three times, once for each layer, with the proper projection applied to each copy.
Unfortunately that killed performance. I don't know whether geometry shaders aren't properly supported on my graphics card or whether this is an architectural problem, but I had much more success binding the layers to my framebuffer one at a time and rendering each layer separately.
It's worth checking on newer hardware, and maybe supporting both approaches after determining whether the hardware handles it well.
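For what it's worth, the one-layer-at-a-time approach I settled on boils down to something like this (shadowFBO, shadowMaps and renderShadowCasters are placeholders for my own framebuffer, texture array and draw code):

    glBindFramebuffer(GL_FRAMEBUFFER, shadowFBO);
    for (int layer = 0; layer < 3; layer++) {
      // attach just this layer of the array texture as the depth attachment
      glFramebufferTextureLayer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, shadowMaps, 0, layer);
      glClear(GL_DEPTH_BUFFER_BIT);
      // set up the projection for this shadow map, then draw the scene
      renderShadowCasters(layer);
    }
    // the single-pass alternative binds the whole array with glFramebufferTexture
    // and writes gl_Layer from the geometry shader instead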

In the end I'm working towards having a single texture containing all my shadow maps as layers, or maybe splitting the shadow maps over just a few arrays (every layer needs to be the same size and my spot light shadow maps are smaller).

I'll do the same for my light maps, just having a single texture with a layer for each light map. These are static anyway.

Textures for my objects will work the same as always; it makes no sense to have them all active at once as we're still rendering the objects one at a time.

Uniform Buffer Objects

Another optimisation I've added is the use of Uniform Buffer Objects, or UBOs.
UBOs have two distinct advantages:
1) you can load a bunch of data in bulk;
2) the data can be shared amongst shaders.

For instance, while we're rendering our scene we keep copying our projection matrix and view matrix, and all derivatives of those, into each shader even though they don't change. Sure, more often than not we're using the model-view-projection matrix, which does change per object, so it's not the best example, but with a UBO you could load the projection and view matrices once at the start of a frame and then use them in every shader.
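A minimal sketch of what that could look like (the block name and binding point are just examples):

    // C(++) side: create the buffer once and tie it to binding point 0
    struct MatrixBlock {      // must match the std140 layout of the GLSL block
      float projection[16];
      float view[16];
    };
    GLuint matrixUBO;
    glGenBuffers(1, &matrixUBO);
    glBindBuffer(GL_UNIFORM_BUFFER, matrixUBO);
    glBufferData(GL_UNIFORM_BUFFER, sizeof(MatrixBlock), NULL, GL_DYNAMIC_DRAW);
    glBindBufferBase(GL_UNIFORM_BUFFER, 0, matrixUBO);

    // once per frame: fill the block and upload it in one go
    MatrixBlock block;
    // ... copy the projection and view matrices into block ...
    glBindBuffer(GL_UNIFORM_BUFFER, matrixUBO);
    glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(MatrixBlock), &block);

    // every shader that needs these matrices declares the same block:
    //   layout (std140) uniform matrices { mat4 projection; mat4 view; };
    // and is tied to binding point 0 with:
    //   glUniformBlockBinding(program, glGetUniformBlockIndex(program, "matrices"), 0);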

Right now I'm using it for loading all the information about my lights at the start of the frame. In stereo rendering I only do this for the left eye, as the data can be reused for the right eye.

In the same way I plan to move all my material data into a UBO as well; this one I only need to load once, after the scene is loaded. In my shader I then simply pass an index into the material data.

There are also drawbacks to UBOs:
- they take up space in graphics memory
- you can't use samplers
- UBOs have strict rules around data alignment; you sometimes need to pad the structure to ensure a struct in C(++) matches the struct in GLSL so you can transfer the data directly. It doesn't help that some implementations are buggy, turning a vec3 into a vec4 for instance. There's a small sketch of this after this list.
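The alignment issue in a nutshell (this layout is just an example, not my actual light structure): in std140 a vec3 is aligned to 16 bytes, so the C(++) struct needs explicit padding to line up with the GLSL block.

    // C(++) side                      // GLSL side (std140)
    struct LightData {                 // struct LightData {
      float position[3];               //   vec3  position;
      float pad0;                      //   (padding to the next 16-byte boundary)
      float color[3];                  //   vec3  color;
      float shadowLayer;               //   float shadowLayer;
    };                                 // };
    // many people simply use vec4s everywhere to sidestep the buggy
    // vec3-to-vec4 behaviour mentioned above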

The sampler drawback is the most irritating one. For our materials it means we still set the textures when selecting the shader, we can only store things like color, reflectiveness, etc. in the material data.

For our light UBO we use the fact that our shadow maps and light maps are stored as layers. We simply store the layer number in our light data.
This also has the advantage that we can bind GL_TEXTURE1 to our shadow map texture array and GL_TEXTURE2 to our light map texture array and leave those reserved as such.
I'm skipping GL_TEXTURE0 because I use that to manipulate textures with and it's used in our font rendering, and I'm using GL_TEXTURE3 and onwards for my material textures.
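In code that reservation looks roughly like this (shadowMaps, lightMaps, program and the sampler names are placeholders):

    // bind the arrays once and leave them on their reserved texture units
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D_ARRAY, shadowMaps);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D_ARRAY, lightMaps);

    // tell each shader which units its samplers live on (once, after linking)
    glUseProgram(program);
    glUniform1i(glGetUniformLocation(program, "shadowMapArray"), 1);
    glUniform1i(glGetUniformLocation(program, "lightMapArray"), 2);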

Anyway, that's a lot done so far. I'm now working on a first attempt at implementing a compute shader to create my light tile-map. Once I have that working I'll check things into GitHub and maybe highlight a few bits of the implementation that are interesting.

To be continued...