Monday 29 December 2014

Space Colonization Algorithm Part 2

One issue I ran into a few times after finishing part one is that sometimes a branch would end up at an equal distance between two attraction points. When adding a new node I call a method named growBranch that adds the new vertex and node to the tree.
This logic has been enhanced in 3 small ways:
- I look up the "parent" node to make further processing easier
- I update a counter for the number of child nodes; this will allow us to vary the thickness of the tree later on
- I calculate the dot product of the parent node's vector and the new node's vector to see if the branch is backtracking on itself (which is what happens in the aforementioned situation) and, if so, branch out in a separate direction using the cross product of the two vectors

Calculating the dot product of two normalised vectors gives us the cosine of the angle between them, a very handy and oft-used little calculation. A value of 1.0 means the vectors point in the same direction, a value of 0.0 means they are at right angles, and a value of -1.0 means they point in opposite directions (which is what we're checking for).

The cross product of two vectors gives us a vector that is at right angles to both. Very handy.
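To make this concrete, the check boils down to something like the following (a minimal sketch, not the exact code in growBranch; I'm writing dot() and cross() as methods here for clarity, my vec3 class actually overloads % for the dot product, and the vector names are illustrative):

  vec3 parentDir = (parentEnd - parentStart).normalized();
  vec3 newDir = (newEnd - parentEnd).normalized();

  if (parentDir.dot(newDir) < -0.9) {
    // the cosine is close to -1.0, so the new node doubles back on its parent;
    // branch out sideways instead, using a vector at right angles to both.
    // note that we test against a threshold rather than -1.0 itself: for exactly
    // opposite vectors the cross product would be a zero vector
    vec3 sideways = parentDir.cross(newDir).normalized();
    newEnd = parentEnd + sideways * branchSize;
  };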


Interface changes

Another small change is that I no longer rotate the model automatically but rotate it with horizontal scrolling and zoom in/out with vertical scrolling. This works really well on a MacBook Pro, where these actions are controlled with two fingers on the touchpad, but it may be unwanted with a traditional mouse, so you may want to steer away from it.

I'm also drawing larger dots for the vertices of the tree to better show the output of the algorithm. 

Optimising the nodes

The real enhancement and the subject of this blog post is the added optimisation step.
After generating our tree (and zooming in a bit), we can see how many segments we've created:

This level of detail is overkill for what we're doing. Again our trusty dot product comes to the rescue. If we have two nodes whose vectors are pointing in (roughly) the same direction we can merge them into a single node spanning the two end points and remove the vertex in the middle. We can only do this if a node has exactly one child node; if it has multiple branches the detail remains. We thus need to test for that situation.

void treelogic::optimiseNodes() {
  unsigned int node = 0;
  std::vector<unsigned int> children;
  
  while (node < mNodes.size()) {
    unsigned int mergeWith = 0;
    
    // we need to find out how many children we have, we can only optimise if just one is found
    children.clear();
    for (unsigned int n = node+1; n < mNodes.size(); n++) {
      if (mNodes[n].parent == node) {
        children.push_back(n);
      };
    };
    
    // only one child? check if we need to merge
    if (children.size() == 1) {
      vec3  parentVector = mVertices[mNodes[node].b] - mVertices[mNodes[node].a];
      parentVector = parentVector.normalized();      

      vec3  childVector = mVertices[mNodes[children[0]].b] - mVertices[mNodes[children[0]].a];
      childVector = childVector.normalized();
      
      // use the dot product (our vec3 overloads % for this); this gives our cosine,
      // the closer to 1.0 the more the vectors match direction
      if ((parentVector % childVector) > 0.9) {
        mergeWith = children[0];
      };
    };
    
    // and merge
    if (mergeWith!=0) {
      unsigned int eraseVertice = mNodes[node].b; // should be same as mNodes[mergeWith].a, this we'll erase..
      
      // merge our nodes...
      mNodes[mergeWith].a = mNodes[node].a;
      mNodes[mergeWith].childcount = mNodes[node].childcount;
      mNodes[mergeWith].parent = mNodes[node].parent;
      mNodes.erase(mNodes.begin() + node);
      
      // erase our vertice
      mVertices.erase(mVertices.begin() + eraseVertice);
      
      // adjust our nodes
      for (unsigned int n = 0; n < mNodes.size(); n++) {
        if (mNodes[n].parent > node) mNodes[n].parent--;
        if (mNodes[n].a > eraseVertice) mNodes[n].a--;
        if (mNodes[n].b > eraseVertice) mNodes[n].b--;
      };      
    } else {
      node++;
    };
  };
};

This removes most of our in-between nodes:


At this point we can see that we've removed a little too much detail because some of our curves have gone. When there are a lot of segments the angles between consecutive nodes are small, so we end up merging too aggressively. I'm still playing around to see if this can be improved.

Next up however: taking our nodes and turning them into a mesh that we can then process further.

Space Colonization Algorithm Part 1

Hitting a roadblock with the model I was using to test my engine, I found myself working on a few related projects.

The first was implementing a texture blending technique described here: http://www.nongnu.org/techne/research/blending/blending.pdf. I'll write some more about that at a later stage, but it was fun to do and works amazingly well.

The second was that trees and vegetation were going to be needed at some point in time. Now I could simply import models of trees and other things, but I wanted to find out if it would be possible to generate these, and do so in a way that would allow me to create multi-LOD meshes for the same tree. Searching the web you'll find a lot of neat tools and small tutorials.

Most seem to favour a technique called L-System fractals. Google that term if you wish, but it is basically a way to notationally describe a fractal system that can be used to generate a tree (and many other things). However there is a lot of trial and error involved, and the more complex and fully grown a tree becomes, the more roadblocks you seem to hit.

Eventually, buried deep down in the search results, someone mentioned a technique called "Space Colonization", which I promptly ignored because the name suggested it had more to do with space combat games than procedurally generating vegetation. But after it came back a few times, most notably on this amazingly cool blog, I started looking into it. Eventually it led me to the very well written paper "Modeling Trees with a Space Colonization Algorithm" by Adam Runions, Brendan Lane, and Przemyslaw Prusinkiewicz.

I decided to try my hand at implementing it and see how far I'd get. It turned out to be very easy to do the basic generation of the tree. As I've not found many people showing how they've implemented the algorithm, I decided to dedicate a few blog posts to it.
This first one is just about the node generation.
The next step will be to optimise the output by merging branches, then we'll need to generate a model that allows us to render the tree, and finally I need to add in leaves and such.
On top of that I want to add more algorithms for creating differently shaped point clouds, or maybe even allow for drawing out the shape.
Not all necessarily in that order:)

Source code, Github and framework

I've uploaded the project to my GitHub page and will be maintaining it there. I'm working on a MacBook Pro at the moment so everything will only be tested on that platform for now. Obviously you'll need Xcode, but as I'm using makefiles (I'm a little old fashioned) you'll also need to install the command line tools and possibly make (I can't remember, I set it all up years ago). Or of course you can start a new project in your favourite IDE and just drag the source files in.

As I'm using OpenGL, and GLUT is pretty useless once you go beyond OpenGL 2 (on a Mac at least), I've started using GLFW and GLEW. The GitHub repository contains the header files and static universal libraries compiled for Mac OS X. I just added those to make it easier to duplicate the repository and get started, but you'd do well to get the original source code and documentation for both. I won't waste any words here on setting up and compiling these (I might be tempted to do so at some point in time) as the documentation is pretty good.

Right now I've not added a math library but just added those classes and those methods I've needed so far. This may change as the project progresses.

OpenGL 1.0

At this stage the project is using OpenGL 1.0 and, even worse, glBegin/glEnd commands. This is simply because I needed to test the algorithm and it is the quickest way to get output on screen. It won't stay that way for much longer though. I haven't decided yet, but it is likely I'll use tessellation shaders, only available in OpenGL 4, to generate the detail for the mesh. Also, as my main engine runs OpenGL 3 code, I'll eventually change the code over to proper vertex arrays and buffers.
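Just to illustrate, drawing the tree's segments in this style takes little more than the following (a sketch, not the project's exact draw code):

  glBegin(GL_LINES);
  for (unsigned int n = 0; n < mNodes.size(); n++) {
    vec3 a = mVertices[mNodes[n].a];
    vec3 b = mVertices[mNodes[n].b];
    glVertex3f(a.x, a.y, a.z);
    glVertex3f(b.x, b.y, b.z);
  };
  glEnd();

Quick to write, but every vertex is resubmitted to the driver on every frame, which is exactly why this approach disappeared from modern OpenGL.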

Attraction Point Cloud

This is the heart of this algorithm. The basic idea is that you have a cloud of points toward which the tree grows. The shape of the point cloud very much dictates the overall look of the tree, but inherent in the algorithm is the branching you would expect with vegetation.

The code I currently use to test the algorithm isn't very imaginative in the cloud it creates: I simply generate random point positions within a hemisphere and stretch that to roughly represent the shape of a tree. How it's stretched and how the points are organised can be set with parameters to the method that generates the point cloud. The samples in the aforementioned paper use much better algorithms producing better shapes, and I plan to enhance this eventually and take it even further than the paper suggests.

Note that for this reason the sample code also animates the algorithm; visually this works very well to see how the tree grows and how changes to the parameters affect the outcome. When you use it for real you'd obviously just run the algorithm and work with the finished mesh.

The code to generate the attraction points is very simple:
void treelogic::generateAttractionPoints(unsigned int pNumOfPoints, float pOuterRadius, float pInnerRadius, float pAspect, float pOffsetY, bool pClear) {
  // Seed our randomiser
  srand (time(NULL));

  if (pClear) {
    // Clear any existing points (shouldn't be any..)
    mAttractionPoints.clear();
  };

  // Add random attraction points until we reach our goal
  for (unsigned int i = 0; i < pNumOfPoints; i++) {
    vec3 point;

    // random normalized vector within a hemisphere
    point.x = randf();
    point.y = randf(0.0f, 1.0f);
    point.z = randf();
    point = point.normalize();

    // Scale it up to a random radius and stretch if needed
    point *= ((pOuterRadius - pInnerRadius) * randf(0.0f, 1.0f)) + pInnerRadius;
    point.y *= pAspect;
    point.y += pOffsetY;

    // and add it to our buffer
    mAttractionPoints.push_back(attractionPoint(point));
  };
};

You can call it a few times consecutively to create slightly more interesting clouds.
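For example (the parameter values here are purely illustrative):

  // main body of the foliage, clearing out any existing points
  tree.generateAttractionPoints(600, 15.0f, 3.0f, 1.5f, 5.0f, true);
  // add a smaller, denser cluster higher up
  tree.generateAttractionPoints(200, 8.0f, 2.0f, 1.0f, 12.0f, false);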
And it gives a point cloud like so:
All the green dots are the points that were generated.

Growing the tree

The magic happens in the iterative logic that grows the tree. Notice the brown line in the previous screenshot; that is the start of our tree. The idea is that you can start off with a basic shape to further control the end result. All the algorithm needs is at least one starting vertex at (0,0,0), which is taken care of in the constructor of the class.

For every iteration we take each point in our attraction point cloud and check which vertex within our tree lies closest to that point. If the distance is within a certain threshold (dk in the paper, pMaxDistance in the code below) we take it into account, else we ignore the attraction point for now. In this manner we find all attraction points that are closest to each vertex of the tree (these are our sets). For each vertex of the tree we then calculate the average of the normalised vectors to these points and branch out in that direction by a specified distance (D in the paper, pBranchSize in the code below).
Finally, any attraction point that is closer than a specified distance (di in the paper, pCutOffDistance in the code below) to its nearest vertex is discarded.

We repeat this process until all attraction points have been discarded.

The code for our iteration looks like this:
bool treelogic::doIteration(float pMaxDistance, float pBranchSize, float pCutOffDistance, vec3 pBias) {
  unsigned int numVerts = mVertices.size(); // need to know the number of vertices at the start of our process
  unsigned int i, v;
  std::vector<float> numOfAPoints;
  std::vector<vec3> directions;
  
  // init our temporary buffers
  for (v = 0; v < numVerts; v++) {
    numOfAPoints.push_back(0.0);
    directions.push_back(vec3(0.0f, 0.0f, 0.0f));
  };
  
  // find out which vertex is closest to each attraction point:
  i = 0;
  while (i < mAttractionPoints.size()) { // use a while loop, we'll be removing points..
    attractionPoint point = mAttractionPoints[i];
    
    // start with our current distance for our attraction point
    vec3 delta = mVertices[point.closestVertice] - point.position;
    float currentDistance = delta.length();
    
    // as our vertices haven't moved we only need to check any new vertices
    for (v = mLastNumOfVerts; v < mVertices.size(); v++) {
      delta = mVertices[v] - point.position;
      float distance = delta.length();
      if (distance < currentDistance) {
        // this one is now our closest
        point.closestVertice = v;
        currentDistance = distance;
      };
    };
    
    if (currentDistance < pCutOffDistance) {
      // we're done with this one...
      mAttractionPoints.erase(mAttractionPoints.begin() + i);
    } else {
      // copy back our new closest vertice and advance...
      mAttractionPoints[i].closestVertice = point.closestVertice;
      
      if (currentDistance < pMaxDistance) {
        // count this attraction point against its closest vertex
        numOfAPoints[point.closestVertice] += 1.0;
        vec3 norm = mAttractionPoints[i].position - mVertices[point.closestVertice];
        directions[point.closestVertice] += norm.normalize();        
      };
      
      // and advance
      i++;
    };
  };
  
  // Update our last number of vertices
  mLastNumOfVerts = numVerts;
  
  // Now check which vertices need to branch out...
  for (v = 0; v < numVerts; v++) {    
    if (numOfAPoints[v] > 0.0) {
      vec3  vert = mVertices[v];
      directions[v] /= numOfAPoints[v];
      directions[v] = directions[v].normalize() * pBranchSize;        
      vert += directions[v] + pBias;
      
      growBranch(v, vert);      
    };
  };
  
  // as long as we still have attraction points left we must still be growing our tree
  return mAttractionPoints.size() > 0; 
};

I've added a few optimisations into this code.
First of all, I only check vertices added since the last iteration and remember the closest vertex for the next round. As vertices do not move once generated, only new vertices can be closer than the current closest vertex.
Second, instead of building a set of attraction points that are closest to a particular vertex, I simply count the number of attraction points and add their normalised vectors together. Once I'm finished I divide the sum of the vectors by the number of points to get the direction in which to grow my branch.

The end result is a nice stick figure tree:


It's a lot of fun experimenting with the parameters and seeing what different shapes come out. One of the things I added in the sample code is generating a second point cloud after the tree has been built to create roots.

More next time....



Tuesday 23 December 2014

Bounding boxes update

So I've been playing around with the city model, with mixed results. I've added an algorithm to the model loading that splits the model into smaller parts where faces are clustered together, but it is dreadfully slow and unreliable.



That said, it wasn't a total loss; it has proven that the changes to my code have had the desired effect, but it does feel like it is the end of the line for using this model for testing things. As the second screenshot shows, the bounding boxes are very erratic: some objects have been correctly isolated (like the sand bags or whatever they are), some are included in much larger boxes (like the car), and some bounding boxes seem unrelated altogether.

When rendering the entire scene the frame rate suffers greatly because the number of objects being rendered is so much higher, but I'm not too worried about that, because I hope to end up in a situation where, once we move far enough away, we start rendering lower level-of-detail models and exclude unnecessary detail. Moving to areas where far fewer objects are on screen, the frame rate suddenly jumps up, even to a comfortable 60fps (to which GLFW seems to be limited).

It sounds like I'll have to bring forward my plans for creating a scene editor of sorts. This will allow me to load individual objects such as buildings, place them, and relate them to other objects.



Monday 22 December 2014

Add bounding boxes and use those to exclude things from rendering

I've got my initial bounding box logic working. As I wrote before, the bounding boxes are stored at the node level of my BSP tree. For now I only calculate boxes for those nodes that actually refer to a model that needs to be rendered, but the logic is set up so that branches in the tree can have a bounding box that encapsulates all the leaves inside, preventing a whole bundle of objects from being rendered without evaluating the individual objects held within.

Now the bounding boxes are used in two stages.

The first, which I've just implemented, is a simple and straightforward on-screen test of the bounding box in software.
At this point in time I'm taking a shortcut by applying the following rules:
  1. As soon as one of the 8 vertices of the bounding box is on screen I stop checking and render my model. 
  2. If all of the 8 vertices are behind the camera, skip
  3. If all of the 8 vertices are above/below/to the left or to the right of the screen, skip
  4. Else render
#4 is where I've really cut a corner, and I could end up rendering an object that is just off screen. What I should be doing here is check all the inner faces of the bounding box and skip rendering if all inner faces are off screen. I figured however that the one or two objects that are included when they shouldn't be don't justify the extra expense, as the first 3 rules are very easy to check with the following code:
bool isVisible(matrix pModelViewProj) {
  vec3  verts[8];
  vec3  min;
  vec3  max;

  // Apply our matrix to our vertices
  for (int i = 0; i < 8; i++) {
    verts[i] = pModelViewProj * mVertices[i];

    // get the minimum and maximum values.. 
    if (i==0) {
      min = verts[i];
      max = verts[i];
    } else {
      if (min.x > verts[i].x) min.x = verts[i].x;
      if (min.y > verts[i].y) min.y = verts[i].y;
      if (min.z > verts[i].z) min.z = verts[i].z;

      if (max.x < verts[i].x) max.x = verts[i].x;
      if (max.y < verts[i].y) max.y = verts[i].y;
      if (max.z < verts[i].z) max.z = verts[i].z;
    }

    if ((verts[i].x >= -1.0) && (verts[i].x <= 1.0) && (verts[i].y >= -1.0) && (verts[i].y <= 1.0) && (verts[i].z < 1.0)) {
      // no need to check any further, this vertice is on screen!!
      return true;
    };
  };

  if ((max.z < 0.0) || (min.z > 1.0)) {
    // all vertices are behind the camera, no need to check any further
    return false;
  } else if ((max.x < -1.0) || (min.x > 1.0)) {
    // all vertices are to the left or right of the screen, no need to check any further
    return false;
  } else if ((max.y < -1.0) || (min.y > 1.0)) {
    // all vertices are to the top or bottom of the screen, no need to check any further
    return false;
  };

  // We assume any other situation is visible, if our object is somewhere just off a corner
  // we have a false positive
  return true;
};

Later on we'll also use our bounding box to do occlusion testing. Here we will actually attempt to render our bounding box to test if it is behind things already rendered to the screen. But that is for another day.
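For the curious, in OpenGL this is typically done with hardware occlusion queries; a rough sketch of the idea (not the engine's actual code; renderBoundingBox is a hypothetical helper, and GL_ANY_SAMPLES_PASSED needs OpenGL 3.3, with GL_SAMPLES_PASSED as the older alternative):

  GLuint query;
  glGenQueries(1, &query);

  // render just the bounding box without writing colour or depth,
  // only counting whether any fragment would pass the depth test
  glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
  glDepthMask(GL_FALSE);
  glBeginQuery(GL_ANY_SAMPLES_PASSED, query);
  renderBoundingBox();
  glEndQuery(GL_ANY_SAMPLES_PASSED);
  glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
  glDepthMask(GL_TRUE);

  GLuint anySamplesPassed = 0;
  glGetQueryObjectuiv(query, GL_QUERY_RESULT, &anySamplesPassed);
  if (anySamplesPassed != 0) {
    // part of the box would be visible, render the real model
  };
  glDeleteQueries(1, &query);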

Unfortunately I won't know how much impact this change has, as I found out how badly suited my sample city is for the purpose. I wrongly assumed buildings would be separated out into individual models, but they are not. The breakup is such that each group within the wavefront obj file covers the entire model, so 95% of the model is still being rendered even with most of it off screen.

I'll either have to find a better way to split the model up or abandon this model for testing. I'll have to think some more about this.

Friday 19 December 2014

The hunt for more frames per second starts..

No boring screenshot this time because there is very little to show so far.

As per my earlier post, I got to a point where I could do basic rendering of a model using a deferred lighting technique. This means I'm rendering everything to a set of offscreen buffers first and then applying my lighting in a final set of stages.

Currently I am only doing lighting with a single light source (the sun) and applying an atmospheric ray-caster for those pixels on screen that are not covered by any of the polys being rendered. That may not seem very advantageous yet, though with complex scenes it potentially means doing far fewer lighting calculations. Eventually I'll be adding more light sources, but that will come when I'm further along, and I'll blog more about the technique I'm using then.

The model I'm using for testing has nearly 4 million polygons, which the hardware I'm using should have no problem with. My starting point however is a measly 10 to 11fps, and it drops to 3fps if I recalculate my shadows every frame. Nothing to write home about. In my defence, I'm rendering everything straight up: nothing is sorted, nothing is excluded, everything is rendered, on or off screen, occluded or not, at way too high a level of detail.

Scratchpad

This is my little scratchpad of steps I plan to follow in the coming days (xmas holidays yeah so hopefully I'll have time to work on this):
1) break up the model into smaller objects
2) add bounding boxes and use those to exclude things from rendering
3) sort those objects that will be rendered by distance to the camera
4) use our bounding boxes to perform occlusion checks for high poly objects that cover a relatively small area
5) use the BSP tree implementation to group together objects to allow excluding objects where the containing node is deemed off screen
6) use LOD to render a low poly version of a high poly object that is far enough away from the camera

Overview

In my engine a single model has a list of vertices (at the moment each consisting of a vector, normal and texture coordinate) and one or more index lists per material used. The index lists are basically 3 indexes into the list of vertices per polygon. A single model can be reused multiple times; it is referenced from a node in a BSP tree. That node holds the model matrix that determines where in the 3D world this model should be rendered. Each node can hold any number of subnodes whose model matrix positions another model in relation to its parent node. When rendering a house one could think of a structure like this:
- node "house" renders model "house" somewhere down the street
  - subnode "chair 1" renders model "chair" in the living room
  - subnode "chair 2" renders model "chair" in the dining room

Here we can see that a single model "chair" is rendered twice. We can also see a prelude to how our BSP tree can optimise rendering: if our house is off screen, there is no point in checking any of the furniture. Similarly with the LOD system, which is another branch of the tree. It is the high poly version of the house that contains the subnodes; the low poly node only references the low poly model and there is no need to traverse the tree any further, again skipping the contents of the house.
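To give an idea, a node in this tree looks roughly like this (a sketch only; the member names are illustrative, not my actual declarations):

  class node {
  public:
    matrix              modelMatrix;   // positions our model relative to our parent node
    model*              renderModel;   // the (shared) model to render, may be unused for grouping nodes
    boundingbox         bounds;        // encapsulates this node and everything below it
    std::vector<node*>  children;      // subnodes positioned relative to this node
  };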

None of this is very relevant at this time because the model we're using isn't set up to use this hierarchy. At best we'll have a collection of nodes each rendering a model at a certain location, but it is a good starting point for the first couple of steps. It isn't until step 5 that we really need to group things together in the right way, and by the time we reach it we'll dive into this deeper.

Break up the model into smaller objects

The model I'm using is split into three files each counting around the 1 million poly mark (2 below, 1 above if memory serves me). Each model is further divided into smaller chunks by their material. As a result we'll end up with 3 models in memory each having a large number of vertices and each having a large number of index lists for each "chunk".

As there is no point in trying to do our optimisations on those 3 big models, initially our model is pretty useless, but if we break it up by chunk we'll get something that we can use.

However these chunks may not be the size we'd like. One chunk may be too small and represent just the roof of a single house. A chunk could also be too large, such as the chunk that represents the streets.

Still, breaking up our model into a model for each chunk gives us a really nice base to work from. As a result I added functionality to my wavefront object loader to output a model per "material" instead of one big model. I added this as an option because eventually I want to load the models properly and not have this mostly arbitrary division.

The output of the loader is actually a node with a subnode for each model that has been created. Each model is also centered and the matrix of the subnode initialised to move the model into the right spot. This will help greatly later on in building our bounding boxes.

I wasn't expecting any difference as I'm still rendering everything pretty much at random, but surprise surprise, it turns out OpenGL doesn't like the large buffers either, and splitting up the model like this brought the frame rate up to 14 to 15 fps.

We've won our first 3 to 4 fps speed increase by a relatively simple change:)

Next time, we'll start adding bounding boxes.

Monday 15 December 2014

Stereoscopic rendering with a geometry shader

**Edit 16 Dec 2014 - Unfortunately doing this late at night makes you miss a beat or two.. it turns out rendering stereo in a single pass instead of two doesn't make as much of a difference as I hoped for. So little, in fact, that it is not worth the exercise. Still, in case someone finds it useful I've updated this article with the relevant info. **

While my engine is still running at a slow frame rate, mostly because I'm sending way more than I need to the rendering pipeline, I wanted to run a little experiment.

Something I added a little while ago was support for stereoscopic rendering using a split screen approach, because that is what works on my 3D TV.

My engine is running roughly at 10fps with the cityscape model (see my previous post), once I turn on the stereoscopic rendering this drops to below 6fps:

This isn't surprising as all the geometry is rendered twice, once for the left eye, then for the right eye, nearly halving the frame rate. It's not exactly half because I'm using a deferred renderer.

The idea I wanted to try out is offloading the stereoscopic logic into a geometry shader. This would allow me to render everything in one pass and cut out a lot of the overhead of parsing all the models.

The first step I took was to insert a simple passthrough geometry shader in the middle. I was surprised to find out that in normal rendering this actually dropped the frame rate to just over 9fps. I found out later that when I adjust the geometry shader to support both mono and stereo rendering the frame rate dropped further to 8fps.
It is thus worth having a separate shader that you use when rendering stereoscopically.

The code below shows the relevant parts of my shaders for a very simple single-pass renderer without lighting. I haven't tested this exact code; it's purely meant as an example.

First up is our vertex shader. This becomes nearly a pass through shader as we're doing our projection in the geometry shader:

#version 330

layout (location=0) in vec3 vertices;
layout (location=1) in vec2 texcoords;

out VS_OUT {
  vec2 T;
} vs_out;

void main(void) {
  gl_Position = vec4(vertices, 1.0);
  vs_out.T = texcoords.st;
}

Then the magic happens in our geometry shader, note that we provide two model-view-projection matrices, one for our left eye and one for the right:

#version 330

layout(triangles) in;
layout(triangle_strip, max_vertices=6) out;

uniform mat4 mvpMat[2];

in VS_OUT {
  vec2 T;
} gs_in[];

out GS_OUT {
  vec2 S;
  vec2 T;
  float eye;
} gs_out;

void main(void) {
  // Emit our left eye triangle
  for (int i=0; i < 3; i++) {
    vec4 P = mvpMat[0] * gl_in[i].gl_Position;
    P.x = (P.x * 0.5) - (P.w * 0.5);

    gl_Position = P;
    gs_out.S = P.xy / P.w;
    gs_out.T = gs_in[i].T;
    gs_out.eye = -1.0;
    EmitVertex();
  }
  EndPrimitive();


  // Emit our right eye triangle
  for (int i=0; i < 3; i++) {
    vec4 P = mvpMat[1] * gl_in[i].gl_Position;
    P.x = (P.x * 0.5) + (P.w * 0.5);

    gl_Position = P;
    gs_out.S = P.xy / P.w;
    gs_out.T = gs_in[i].T;
    gs_out.eye = 1.0;
    EmitVertex();
  }
  EndPrimitive();
}

Note, I had to make three changes here:
- set max_vertices=6; this is not the number of vertices for a single primitive but the total number of vertices you potentially emit
- for the left eye I had to subtract 0.5; note also that I multiply this by P.w as the hardware will divide our X by W
- output S instead of relying on gl_FragCoord, which looks like it's deprecated in our fragment shader

Now we have two triangles being output, one for the left side using the left eye matrix and one for the right. The problem we're still stuck with is triangles for the left eye that cross the border into the right half of the screen and vice versa. The clipping done by OpenGL only takes care of what is off screen. We're going to have to add a little extra sauce to our fragment shader:

#version 330

uniform float viewportWidth;
uniform sampler2D texturemap;

in GS_OUT {
  vec2 S;
  vec2 T;
  float eye;
} fs_in;

out vec4 color;

void main(void) {
  // discard our pixel if we're rendering for the wrong eye...
  if ((fs_in.eye < 0.0) && (fs_in.S.x > 0.0)) {
    discard;
  } else if ((fs_in.eye > 0.0) && (fs_in.S.x < 0.0)) {
    discard;
  }

  color = texture(texturemap, fs_in.T.st);
}

And our result?

We are up to 6.0 fps; well, that was hardly worth the effort. One could argue it is a nicer way to do the stereo rendering, but still.

There is room for improvement however and in that there may lie some merit in continuing with the approach.

At the moment we are sending many triangles for the left eye to the right side of the screen only to discard all their fragments. We could add a relatively simple check to see if any vertex of the triangle is "on screen" for that eye. Only if at least one is on screen do we output the triangle; otherwise we skip it, as sketched below.
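A sketch of what that check could look like for the left eye in the geometry shader (untested, and for simplicity ignoring vertices behind the camera, where the sign of W flips):

  bool onOurHalf = false;
  for (int i = 0; i < 3; i++) {
    vec4 P = mvpMat[0] * gl_in[i].gl_Position;
    P.x = (P.x * 0.5) - (P.w * 0.5);
    if (P.x / P.w <= 0.0) {
      // this vertex projects onto the left half of the screen
      onOurHalf = true;
    }
  }
  if (onOurHalf) {
    // ...emit the three vertices as before...
  }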

I've also been thinking about using a similar trick to speed up building my shadow maps. In theory we could render all required shadow maps in a single pass. More on that later.

The next step for me is changing the way I've got my VAOs set up and starting to look into occlusion checks to speed things up by rendering a lot less of the scene than I am right now.

Saturday 13 December 2014

Slowly putting a 3D engine together

For the past couple of weeks I've been happily tinkering away at my little 3D engine in what little spare time I have. It is getting to the point where all the basic ingredients are there. It is now a matter of expanding and improving it; it'll still take months before it becomes something useful, but that's not the point. If I wanted something useful I'd just download Unity or Unreal, this is all about learning and having fun.

Now I'm not going to write a tutorial on how to build an engine, or how to get started with OpenGL, or anything like that. There are hundreds of such tutorials out there. I may try and find some time to write about some of the pitfalls I've encountered so people can avoid those. The biggest issue I've found so far is that OpenGL has evolved immensely over the years (and the same probably applies to DirectX) and so many tutorials out there have become obsolete as a result. You can't see the wood for the trees.

The biggest challenge was switching over to OpenGL 3/4, where the old fixed pipeline has basically been removed. While many tutorials already make heavy use of programmable shaders, they often still rely on other parts that are now deprecated or nonfunctional, such as the built-in projection and view matrices, which you need to take care of yourself in OpenGL 3.

The other one I'll mention just in case it helps someone is that code simply will not function unless you use Vertex Array Objects, and as these were optional before OpenGL 3.0 many tutorials skip over that step. It literally took me a few nights staring at a black screen before figuring that one out. Yes I was using VBOs, but without the context provided by a VAO...
VAOs are nothing special btw; they are simply containers of state that you can quickly switch between. So instead of selecting the correct vertex buffer, array buffer and some other nitbits before rendering each model, you set up that state once for a VAO associated with your model and then make it the active VAO before rendering. Without a VAO there is no state and there thus is nothing to render.
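A minimal setup looks something like this (a sketch, not lifted from my engine; vertexData and vertexCount are placeholders):

  GLuint vao, vbo;
  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);   // everything bound below is recorded into this VAO

  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glBufferData(GL_ARRAY_BUFFER, sizeof(vertexData), vertexData, GL_STATIC_DRAW);
  glEnableVertexAttribArray(0);
  glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*)0);

  // later, when rendering, switching to this model's state is a single call:
  glBindVertexArray(vao);
  glDrawArrays(GL_TRIANGLES, 0, vertexCount);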

For those who wish to start doing things in OpenGL the hard way, I'm using:
GLFW as a cross platform framework (now that GLUT is defunct)
GLEW as an extension manager
I'm using my own math and 3D primitives classes that I've slowly put together over years of tinkering, but I don't recommend doing so. A great math library that suits OpenGL fairly well is GLM; I'm tempted to switch as it is far more complete than my own hobby stuff.

The engine itself uses a deferred lighting technique. Together with shadow mapping this means rendering a scene goes through 3 phases:
- rendering the shadow maps
- rendering the geometry buffers (g-buffers)
- final output with lighting
I'm not using the deferred lighting technique to its fullest yet, but that will come soon.
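In rough pseudo-form a frame then looks something like this (a sketch; the FBO and helper names are made up, not my engine's actual code):

  // phase 1: render depth from the light's point of view
  glBindFramebuffer(GL_FRAMEBUFFER, shadowMapFBO);
  renderScene(sunViewProjection);

  // phase 2: fill the g-buffers (position, normal, diffuse, etc.)
  glBindFramebuffer(GL_FRAMEBUFFER, gBufferFBO);
  renderScene(cameraViewProjection);

  // phase 3: one full-screen pass that samples the g-buffers and applies the lighting
  glBindFramebuffer(GL_FRAMEBUFFER, 0);
  renderFullScreenQuad(lightingShader);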

I've also added stereoscopic support, currently geared to what my 3D TV is capable of, but I hope to get my hands on a Rift some day (soon).

Basically there are three parts of my engine I'm working on in parallel.

The first is a terrain system that now runs fully on the GPU based on a height-map. I'll spend a separate blog post on that some time in the near future once it's a little further along. I recently learned about Dual Contouring for rendering more complex terrain and am tempted to explore that avenue, though for my purposes my current approach may be more than sufficient and the best bang for buck performance-wise.

The second is atmosphere rendering that takes care of the background. At this point in time it only does Mie and Rayleigh scattering, mostly based on this article about Sky Rendering. The plan is to mix in a night sky, moon and clouds.

The third allows for managing and rendering 3D models. At the heart of the system is a BSP tree with a built-in LOD system. I'm not using either yet, but they are an essential part of creating larger worlds. Again I'll talk about those in more detail when I'm further along.

My focus at the moment is on what happens once we get to the point where a model at a certain LOD is visible and needs to be rendered. I recently started using a very high poly model of a city which I found here: Damaged Downtown on tf3dm.com
This model is the worst fit for what I want to do, but as a result it's perfect to start improving the engine with.
First off this is a model (split into 3 sub models) of an entire city. That means 90% of what is being rendered is invisible.
Second, everything is at full detail regardless of how far away something is.
Third, the texturing on this thing is insane: nearly every face has its own texture coordinates. This isn't a big problem for the format the model is stored in, but for OpenGL it means that each face needs its own unique vertices and no vertices are shared. For nearly 4 million faces that means 12 million vertices. That's a LOT.

If I were to redo this city the way my engine is designed to work it would be set up very differently, and eventually I'll get there. But for now it serves its purpose, and this is where I am at right now:

So yeah, that is 8.4 frames per second on a 2.6 GHz i7 Retina MacBook Pro. Nothing to write home about. As it is a static scene I also don't need to redo the shadow maps every frame, or the frame rate would be closer to 2 frames per second.

The challenge for the weeks to come is to get this down as close as I can get to 60 fps.



Tuesday 25 November 2014

FontStash and OpenGL 3.0 support

I'll write up a proper post soon about my experience last weekend moving from OpenGL 1.0 (and Glut) through a conversion to using GLEW + GLFW to OpenGL 3.0+ support.

For now, anyone who is using FontStash: the default glfontstash implementation no longer works in OpenGL 3 because OGL3 requires the use of vertex array objects and GLSL shaders. The fixed render pipeline has been removed.

I've put an enhanced glfontstash library here:
https://github.com/BastiaanOlij/fontstash/blob/master/src/glfontstash.h

Just add a:
#define GLFONTSTASH_OPGL3
before including the fontstash library


Tuesday 28 October 2014

Fields based on subwindows

Prologue

Below is a write-up based on one of the talks I did at EurOmnis 2014. Further files related to this can be found on my github page: https://github.com/BastiaanOlij/EurOmnis/tree/master/2014

Introduction

We often find ourselves implementing the same behaviour time and time again on entry fields and such, adding additional keyboard shortcuts, adding a button to popup some predefined options, implementing auto completion, etc.

One solution to this problem is to create a helper object that centralises the code used in these scenarios. The downside of this approach is that any enhancement you make to the helper object potentially requires you to go past every instance where it is used and make modifications.

Omnis has an easy-to-use solution that does not seem obvious at first but is really powerful once you know the rules by which to play. Before we look into this solution it is important to dispel one big myth.

$dataname

When we assign a variable name to the $dataname property of a field, such as an entry field, the illusion is created that this entry field now displays, and makes accessible, this variable. In fact this is not what is happening, as that would break the encapsulation rule in OO programming.

An entry field actually has its own value and it is that value that is being displayed. For convenience we can directly access this value through the $contents property of the field. Even without the $dataname set, the entry field is fully functional.

Omnis will copy the contents of the variable into the field's contents when a redraw is triggered. This is the responsibility of the window: for the window the variable is within scope, and the window has access to the field, so it can copy the value without breaking any of the rules.

In the background it is slightly smarter: it first checks if the variable has changed by comparing it with the current contents, and only if it has changed copies the value and redraws the field. If you've ever built an XCOMP you'll have seen the special messages you need to implement on the XCOMP that drive this.

Omnis only copies the contents of the field back into the variable in one situation, and that is when the field loses focus. It is actually part of the evAfter event: the field informs its parent window that focus is leaving it, the parent window copies the contents into the variable (again, both being in scope for the window) and then runs through the $event code.
This is something you can easily see for yourself if you set $keyevents to true and capture the key events: you'll see the variable never changes while the contents are updated as a result of the keyboard input. Only at the start of the evAfter event does the variable have its new value.

It is this last piece of knowledge that hinders us the most, as with subwindows we may combine a couple of fields to form a single entity and may need to update our value without being able to push it back out to the variable because our main field hasn't got focus. But I'm running ahead of myself.



Omnis is not alone here; to be exact, every development environment I've worked in so far, be it Obj-C Cocoa development, Microsoft Foundation Classes, Javascript, etc., has to deal with the same thing and each implements data binding differently. The easiest place to see this is probably Javascript/HTML. The entry fields all exist within a form and each holds its own data. You access that data directly on the fields in Javascript through the field's text property and have to write your own code to copy this into any variables where you need the data.

Omnis' solution is one of the more elegant ones I've come across and excels in its simplicity.

Foundation for a subwindow field

The funky thing in all of this is that Omnis has added one extra little feature that seems counter-intuitive but allows us to implement more complex fields as subwindows. As it turns out, setting the $dataname on both the subwindow field on the parent window and on the field held within causes the logic to extend to the fields within the subwindow.


It is important to realise that within our subwindow our variable is out of scope and we do not have access to it. We do however have access to the contents of our field.

We start with building an extremely boring subwindow with just our one field on it to prove the concept and lay down some rules.

One rule is a naming convention; you can decide on your own, but I tend to prefix these types of subwindows with "wView". "View" comes from the MVC (Model View Controller) approach that is widely adopted today. A view is defined as an entity that visually presents data and enables the users to interact with that data without any further knowledge of the construct the data is a part of, nor the business rules that surround that data. But I digress..

Create a new window class and call it wViewExample.
  • Set its $issubwindow property to true (this is purely a helper property and has no real effect on the window).
  • I also tend to size the subwindow to how I would use it and set its $style to kNoFrame, but you may want to leave this until the very last moment as it does hinder development.
  • Set its $backgroundtheme to kBGThemeControl. While this has no effect when using our subwindow it does help when developing our field.
Add an entry field onto our subwindow and name it "Contents". Again this is just a convention I follow, you can decide on your own name if you wish but I'm assuming below it is called Contents.
We're going to set a few properties on this field:
  • first of all, leave $fieldstyle empty, we'll be inheriting our fieldstyle from our subwindow field
  • set $backgroundtheme to kBGThemeParent, this will copy the background settings from our subwindow field
  • set $effect to kBorderNone, our subwindow field will already have the required border, no need in doubling it
  • set $edgefloat to kEFPosnClient
  • set $vertcentertext to kTrue, I generally find this to work better visually but it is optional.
  • set $subwindowstyle (text) to true. This ensures our font, fontsize and fontstyle are all taken from our subwindow field
Notice that I've not set $dataname just yet. It is much better to copy our $dataname from our subwindow field, and we can simply do this by adding some code to our $construct:
wViewExample.$construct
-----------------------
Calculate $cinst.$objs.Contents.$dataname as $cinst.$dataname
Calculate $cinst.$objs.Contents.$tooltip as $cinst.$tooltip

Notice that I've also copied $tooltip. You may find other properties that you can set on your subwindow field that you would want to copy into your field but for this example I'll stick with $tooltip.

Note that if I were to instantiate my view as a window it still works fully; these actions are pretty much ignored. While $cinst points to my window instance, these properties don't exist on it. But we'd never use this window as such.

When used as a subwindow it is very important to realise $cinst points to the subwindow field on the parent window, not directly to your subwindow instance. This seems a subtle difference but it is important. Your subwindow instance lives 'inside' your subwindow field. There is a runtime property on your subwindow field called $subinst that gives you access to the subwindow instance.

Now we'll sidetrack slightly: one of the things that may happen is that you will want to change the $dataname or $tooltip at runtime. When you do this you will change the property on the subwindow field, but not on the entry field held within. Luckily there is a simple solution: implementing $assign methods for these properties:
wViewExample.$dataname.$assign(pvNewName)
-----------------------------------------
Do default ;;  This will assign the dataname on our subwindow field
Calculate $cinst.$objs.Contents.$dataname as pvNewName

wViewExample.$tooltip.$assign(pvNewTooltip)
-------------------------------------------
Do default ;;  This will assign the tooltip on our subwindow field
Calculate $cinst.$objs.Contents.$tooltip as pvNewTooltip

We do not implement getters for these properties as that would break access to our properties on the subwindow field. We would implement getters for any additional properties we want to add to our subwindow.

Now we're ready to put our new subwindow on a test window.
  • we create a window called wExample
  • we create an instance variable on our window called ivTest
  • we drag our subwindow field from the Subwindows tab in our component store (this is what the $issubwindow property is for)
  • we give our subwindow field a name, lets say "MySubwindowField"
  • we set the $dataname property of our subwindow field to ivTest
Now there are a few things we notice when doing this:
  1. when the subwindow is dragged from the component store we end up with a field slightly larger than the size we've set our subwindow class to. Hence why I tend to size these conservatively
  2. if you configure your subwindow field either through field styles or by setting its background and text properties you should notice the field within following suit
  3. if you test your window you should notice that the contents of our variable is now shown and changing the data also changes the contents of the variable (after you tab out of the field)

Adding events

The obvious problem is that our evAfter is now contained within our subwindow. Our parent window is never told the user has tabbed out of the field and thus can't react appropriately. We need a way to send events to the parent window.

Omnis does not support a way to fire off standard events. You could call $event but you do not have any control over the event parameters. You could call a method on the containing window through $cwind but this is also not without problems:
  • What if your subwindow is within another subwindow, you'd end up calling the wrong parent
  • What if you have multiple copies of your subwindow on the window, you only have one method, you'd somehow need to know which subwindow is calling
  • What if the developer using your subwindow doesn't know which methods to implement? Or forgets one? You'll run into a nasty error
Omnis however has a very elegant solution, one that is arguably better than the built-in $event approach (and yes, I've put in an enhancement request:))

As I mentioned before, $cinst will be pointing to the subwindow field, but in the absence of a method on the subwindow field the method in the class methods will be called. Now let that sink in, because in every other situation you would expect the method in the class methods to be called.

Let's create a method in our class methods called $evAfter and call it from our after event:
wViewExample.$evAfter
---------------------
;  This is just a stub
Quit method kTrue

wViewExample.Contents.$events
-----------------------------
...
On evAfter    ;; Event Parameters - pClickedField, pClickedWindow, pMenuLine, pCommandNumber, pRow
  If $cinst.$evAfter()=kFalse
    Quit event handler (Discard event)
  Else
    Quit event handler (Pass to next handler)
  End If
...

Note: I tend to quit true or quit false and then issue a discard or pass event on return (the =kFalse is so my code assumes passing the event if nothing is returned).
Alternatively you could just end your $evAfter code with "Quit event handler..." and call "Do $cinst.$evAfter" in your evAfter event. There is no right or wrong here.

If you test your window now, we're no further than we were. We can type in some text, tab out, the $evAfter is called, but our parent window is still none the wiser. However we've already dealt with two issues:
  • if our developer using our component doesn't implement the $evAfter method, nothing breaks
  • the developer can check the interface manager and see which events are supported.
For this last point I'm using the naming convention that any class method starting with $ev is an event that will be supported when implementing on the subwindow field.

So our final piece of the puzzle is implementing $evAfter on our parent window:
wExample.MySubwindowField.$evAfter
----------------------------------
OK message MySubwindowField.$evAfter {Our value was changed to [ivTest]}

Now when we test our window, type in something in our field and tab out, we get a nice message informing us the value has changed. Our event mechanism works!

Obviously all we have now is a glorified entry field that does a lot less than a normal entry field.
Let's make it a bit more special.

A spinner control

Let's change our example into a spinner control. We're going to make a couple of enhancements to our example subwindow:

  1. we are going to assume it holds a numeric value
  2. we are going to implement key events that will increase/decrease the value
  3. we are going to add buttons to increase/decrease our value

For point one the only thing we'll change for now is to set the $align property on our contents field to kCenterJst. We can't force the user to use a numeric variable. We could go to the trouble of using a masked entry field instead of a normal entry field and setting its formatting to numeric; that I'll leave as an assignment for you.

To make our key events work we'll need to set the $keyevents property on our Contents field to true. Then we implement our key event code:
wViewExample.Contents.$events
-----------------------------
...
On evKey     ;; Event Parameters - pKey, pSystemKey
  If pKey='-'|pKey='_'
    Calculate $cobj.$contents as $cobj.$contents-1
    Quit event handler (Discard event)
  Else If pKey='='|pKey='+'
    Calculate $cobj.$contents as $cobj.$contents+1
    Quit event handler (Discard event)
  Else
    Quit event handler (Pass to next handler)
  End If
...

Try out your window (don't forget to save or the properties don't always stick!). You should now be able to increase/decrease the value with the + and - keys respectively.
We are updating the $contents of our field, so our variable doesn't change, but this is fine: as we still have focus on the field, as soon as we tab out Omnis will copy the contents into our variable and handle the evAfter event.

Now we are going to do the same with buttons. Add two buttons to your window "IncreaseBtn" and "DecreaseBtn". Set the $edgefloat for "IncreaseBtn" to kEFposnRightToolbar and for "DecreaseBtn" to kEFposnLeftToolbar. You'll want to give them an appropriate width but I'll leave it up to your imagination to further style these buttons.
I also tend to set $disablefocus to kTrue but that is a personal preference.

Now implement some code:
wViewExample.DecreaseBtn.$event
-------------------------------
On evClick     ;; Event Parameters - pRow( Itemreference )
  Calculate $cinst.$objs.Contents.$contents as $cinst.$objs.Contents.$contents-1
  Quit event handler (Pass to next handler)

wViewExample.IncreaseBtn.$event
-------------------------------
On evClick     ;; Event Parameters - pRow( Itemreference )
  Calculate $cinst.$objs.Contents.$contents as $cinst.$objs.Contents.$contents+1
  Quit event handler (Pass to next handler)

Well that was easy... or was it? Sure, when I press one of the buttons it increases or decreases the number. But as soon as I do something else it changes back??

Why?

Remember, Omnis will copy the contents of your variable into the field on a redraw, and only copy the contents back into the variable when our field loses focus.
When we press the button our field has already lost focus, so the new contents are never copied back into our variable.

There are two ways to deal with this. The easiest is to ensure our field does have focus.
wViewExample.DecreaseBtn.$event
-------------------------------
On evClick     ;; Event Parameters - pRow( Itemreference )
  Do $ctarget.$assign($cinst.$objs.Contents)
  Calculate $cinst.$objs.Contents.$contents as $cinst.$objs.Contents.$contents-1
  Quit event handler (Pass to next handler)

And do the same for the IncreaseBtn's $event. This will make our field work: our entry field gets focus, we change our contents, and when the user tabs out all is well.

This technique however doesn't always work. For simple situations like these it does, but once your subwindow becomes more complex and starts to pop up messages or dropdowns and such, it falls hopelessly short. Omnis just won't handle the focus correctly, or the already queued events will get in the way.

As mentioned a few times now, this is all about scope. When you leave the field Omnis sends an event to the parent window; it is the parent window that copies the contents into the variable, and then the evAfter is triggered.
We can do the same, but we don't want to rely on the developer implementing the needed method on the parent window, so we do this dynamically.

Here is the final bit of code needed to make this work consistently:
wViewExample.doDatanameCallback
-------------------------------
;  $cinst points to our subwindow field so this is actually going to look for our method on our parent window!
;  Note that this logic will fail if our parent window is locked
Set reference lvMethodref to $cinst().$methods.$findname('$tmpUpdateData')
If lvMethodref
  ;  Cool
Else
  Set reference lvMethodref to $cinst().$methods.$add('$tmpUpdateData')
  Do lvMethodref.$lvardefs.$add('pvValue',kInteger,k32bitint,0,kTrue)
  Do lvMethodref.$methodlines.$add(con('Calculate ',$cinst.$dataname,' as pvValue'))
End If
; 
;  And call
Do $cinst.$tmpUpdateData($cinst.$objs.Contents.$contents)

wViewExample.DecreaseBtn.$event
-------------------------------
On evClick     ;; Event Parameters - pRow( Itemreference )
  Calculate $cinst.$objs.Contents.$contents as $cinst.$objs.Contents.$contents-1
  Do method doDatanameCallback
  Do $cinst.$evAfter()
  Quit event handler (Pass to next handler)

wViewExample.IncreaseBtn.$event
-------------------------------
On evClick     ;; Event Parameters - pRow( Itemreference )
  Calculate $cinst.$objs.Contents.$contents as $cinst.$objs.Contents.$contents+1
  Do method doDatanameCallback
  Do $cinst.$evAfter()
  Quit event handler (Pass to next handler)

I've also asked TigerLogic for an enhancement to make this push a built-in feature; it really is the only missing ingredient.

Now things are working correctly. We have a reusable field that lets us increase/decrease our value.
A few enhancements I can think of off the top of my head that would be nice to implement:

  • implement an $evModified method that gets called before $evAfter but only if the value has changed
  • implement a $stepvalue property that allows you to set by what value we increment
  • react to mouse input to increase/decrease the step value

There are plenty more examples in the widgets library, some a lot more complex than this.

Sunday 27 April 2014

Getting back into playing with OpenGL

The single most influential thing that got me into programming was games..

My father worked in IT and we had computers at home since before I was born, which may not seem special today, but in the late '70s, in the Netherlands, people owning their own Personal Computer was rare, very rare indeed.

My first memories are playing simple text based computer games on a TRS-80 or somewhat better looking games on an Atari 2600. While I started learning to program in Basic on the TRS-80 at a very young age, it wasn't until my father bought an Intel 286 based PC with an EGA graphics card that I started getting really interested in graphics programming. Initially this was still sparked by a wish to learn how to write games, but by the late '80s I would be introduced to the so-called demo scene and get hooked purely on doing graphics programming for the sake of making cool effects on screen.

Eventually I would get a job doing "normal" programming (which in an entirely different way is exciting and rewarding in its own right) and graphics programming was put on the back burner. Occasionally I would play around again, learn some of the newer technologies and forget about it for a while again. Some of my friends did move into writing games and I've stayed in touch, which is great as they do not mind letting me in on a bit of what goes on under the hood of modern graphics programming.

My interest has recently been sparked again, purely as a hobby mind you, mostly by the popularity of the new Oculus Rift and my age-old love of VR. Combine that with having bought a TV that allows stereoscopic viewing and a Retina MacBook that has an HDMI port, and voilà, the game is set.

So I've dived back into OpenGL. Why OpenGL and not DirectX (which seems the better choice if I ever actually get my hands on a Rift)? Well simply put, I have a MacBook Pro, OpenGL is natively supported, and the graphics card in my little MacBook is beefy enough for what I do.
I have since read that while OpenGL is well supported on the Linux platform, Microsoft seems to not give it any love because it competes with DirectX, and as there is a large game market using that API it always seems to run a generation ahead of OpenGL. Alas.

It isn't a big problem as long as this stays a hobby, if something comes out of this in years in the future, well we'll deal with it then:)

The last time I played with OpenGL was, uhm, 10 years or so ago? Yeah probably about that if I don't count my stint with WebGL a few years back, so I decided to start from scratch. Learned a few new things about vertex buffer attributes and brushed up on my GLSL skills and I have my starting point.

I've got a landscape thingy that I'm still working on, but I've put that on ice for a little while. The thing that has driven my wife nuts for the last few days, as I've spent hours deep into the night while everyone is asleep, was getting a simple wavefront object loader up and running and writing a simple texture mapped shader on which I can start building some cooler things.

I pulled out a trusty old model from one of my favourite games, Homeworld, to test things with. Here is a nice little screenshot :)

And for those that are able to view things in stereo, a nice little splitscreen screenshot:


This is a good article (imho) on which I based most of the stereo scopic code: Stereo geometry in OpenGL

Thursday 27 March 2014

Developing externals for Omnis Studio

For the past few years I've been building externals for Omnis Classic and later Omnis Studio. Initially relatively small things but as time progressed I ended up having to build larger and more complex externals.

In the Omnis Classic days externals were relatively limited, but the SDK that comes with Omnis Studio allows for a lot of functionality to be implemented, as is proven by some of the commercially available externals such as Brainy Data's excellent oWrite.

Unfortunately the documentation often leaves much to be desired, the sample externals are over simplified and support steers clear from any issues involving external development.

Both at my previous employer and my current employer I've been blessed to have worked on some interesting, yet proprietary, externals. That hasn't stopped me from doodling around in my spare time; one of the more notorious, albeit long forgotten, public externals I did in the Classic days was one that allowed you to "skin" the Omnis interface much like the then-popular WinAmp did.

I have for a long time been working, mostly in my spare time, on a framework for creating externals that takes care of a lot of the nitty-gritty stuff that ends up in each and every external and just gets in the way of making the external do what you want it to do. I've started over a few times and still haven't gotten it right, but it is slowly coming of age.

I've since published the work that has been done so far on my GitHub page: https://github.com/BastiaanOlij

As I add things to this project I'll try to keep blogging about what I've been up to and what was worth learning.

If you find use for the source code that has been made public, please be so kind as to credit the author (that's me). Other than that, use it as you see fit, except for evil; I don't like evil.

On that note, I'd like to take the time to credit and thank the following people:
Kelly Burgess, for his excellent sessions at EurOmnis and his frequent emails with advice
Stefan Csomor, for always offering a helping hand and insight into the more low-level stuff
Michael Monschau, for showing us just how far you can take this
David McKeone, for also contributing to the public knowledge with his excellent GitHub page.