Sunday 12 April 2015

Rendering tiles method 2 (part 7)

So I promised a second approach to the same tile rendering demo in my previous post. With this approach we'll turn the solution on its head.

This solution isn't necessarily better or faster; that depends on how complex your map is and on whether you applied the optimizations I suggested for the previous method.
It is however a very worthwhile technique to learn.

What we are doing here, in essence, is drawing every pixel on screen and figuring out what part of our map should be drawn there. We'll be moving part of our solution into our fragment shader. That in itself should sound some alarm bells as it increases the workload of the GPU. However, we also remove overhead: we no longer generate a complex mesh nor attempt to draw anything that would have been off screen. It is how this balance shifts that determines whether this approach is better or worse.

Simple as our example is, we're probably implementing a slower solution here. However, when you are drawing multiple map layers over each other, it would be possible to combine the logic within a single fragment shader and remove a lot of overhead.

Inverse of a matrix

One new function that was added to our math library generates the inverse of a matrix. The inverse of a matrix is a matrix that applies the inverse of the transformation of that matrix. Say you have a matrix that rotates an object 90 degrees clockwise; the inverse of that matrix would rotate the object 90 degrees counter clockwise.

It isn't always possible to create the inverse of a matrix (a matrix with a determinant of zero has no inverse), but for the transformation matrices we use here it works fine.

In our case we're going to take the inverse of our projection matrix. This allows us to take a screen coordinate, apply our inverse matrix and figure out the coordinate for our "model". Remember that in our original example we generated a 40x40 mesh for rendering our map, it is the coordinates within this 40x40 mesh that we calculate.

Applying our inverse matrix to every pixel we are rendering would be very wasteful but luckily for us we're dealing with a linear projection. As a result we only need to calculate our coordinates at each corner of the screen and interpolate those values, something OpenGL is very good at. We'll look into this a little more once we look at our vertex shader.


Where did my model-view-projection matrix go?!?

First, however, we take a very important sidestep. I've made only minor changes to my sample application. I don't tend to remove things often, as we may change things back for our next example. In our shader loader we are now loading "invtilemap.vs" and "invtilemap.fs" instead of our original tilemap shaders. While I'm still defining my mvp matrix, we are now using the inverse of this matrix in our solution.

What you'll see is that our call:
mvpId = glGetUniformLocation(shaderProgram, "mvp");

will fail (returning -1) even though our mvp variable is declared in our shader; it just isn't used. GLSL automatically filters out any unused variables when compiling the shader, and thus there is no variable to obtain our uniform location for.

There is a lot of debate to be found on the internet about this and various strategies on how to deal with it. It is a good thing to research and make up your own mind about. For now I just log variables I can't find so I can decide whether this is intentional or indicates a fault in my program. Especially when you have multiple shaders you may not want to write a custom loader for each but simply call the same code to load all of them and assume common variables are there.

Our shader loader code has only slightly been changed, the only noteworthy change is the addition of getting our uniform for invmvp, our inverse model-view-projection matrix.

In our render function we then see that we calculate the inverse of our matrix and set it in our shader:
    // set our model-view-projection matrix first
    mat4Copy(&mvp, &projection);
    mat4Multiply(&mvp, &view);

    if (mvpId >= 0) {
      glUniformMatrix4fv(mvpId, 1, false, (const GLfloat *) mvp.m);      
    };
    if (invMvpId >= 0) {
      // also want our inverse..
      mat4Inverse(&invmvp, &mvp);
      glUniformMatrix4fv(invMvpId, 1, false, (const GLfloat *) invmvp.m);      
    };

A little further down we also see that our call to glDrawArrays now tells OpenGL to only draw 2 triangles.

Our inverse shaders

This is due to our main change: we are going to draw each pixel on screen exactly once. As a result we need to draw a single square, thus two triangles, that encompasses the entire screen. In OpenGL we're dealing with a coordinate system that has (1.0, 1.0) in the top right and (-1.0, -1.0) in the bottom left, so that is what we're drawing in our vertex shader:
#version 330

uniform mat4 mvp;
uniform mat4 invmvp;

out vec2 T;

void main() {
  // our triangle primitive
  // 2--------1/5
  // |        /|
  // |      /  |
  // |    /    |
  // |  /      |
  // |/        |
  //0/3--------4

  const vec2 vertices[] = vec2[](
    vec2(-1.0, -1.0),
    vec2( 1.0,  1.0),
    vec2(-1.0,  1.0),
    vec2(-1.0, -1.0),
    vec2( 1.0, -1.0),
    vec2( 1.0,  1.0)
  );

  // Get our vertex
  vec4 V = vec4(vertices[gl_VertexID], 0.0, 1.0);
  
  // and project it as is
  gl_Position = V;
  
  // now apply the inverse of our projection and use the x/y 
  T = (invmvp * V).xy;
}

Notice also that we apply our inverse model-view-projection matrix but only store the x/y into a 2D vector, that is all we're interested in here. As mentioned before, OpenGL will nicely interpolate this value for us.

The real work now happens in our fragment shader so we'll handle that in parts:
#version 330

uniform sampler2D mapdata;
uniform sampler2D tiles;

in vec2 T;
out vec4 fragcolor;
Nothing much special in our header, we now have both texture maps here, our T input from our vertex shader and our fragcolor output.
void main() {
  vec2 to, ti;
  
  // our tiles are 100.0 x 100.0 sized, need to map that
  to = T / 100.0;
  ti = vec2(floor(to.x), floor(to.y));
  to = to - ti;
Here we've defined two variables, "to" which will be our coordinate in our tilemap and "ti" which will be our coordinate in our map data. You may recall from our original solution we created our tiles as 1.0 x 1.0 tiles and multiplied them by 100 to get 100.0 x 100.0 tiles. Here we need to divide them by 100.
We floor "ti" to get integer values and then subtract "ti" from "to" so "to" contains offsets within a single tile.
  // our bitmaps are 32.0 x 32.0 within a 256x256 bitmap:
  to = 31.0 * to / 256.0;
As our tiles are 32x32 tiles within a 256x256 bitmap, just like in our original solution we need to adjust our offset accordingly.
  // now add an offset for our tile
  ti += 20.0;
  int tileidx = int(texture(mapdata, (ti + 0.5) / 40.0).r * 256.0);
  int s = tileidx % 8;
  int t = (tileidx - s) / 8;
  to = to + vec2((float(s * 32) + 0.5) / 256.0, (float(t * 32) + 0.5) / 256.0);
And now we use our "ti" value to figure out which tile we need to draw at our pixel. Note that this is pretty much what we originally did in our vertex shader.
  
  // and get out color
  fragcolor = texture(tiles, to);  
}

And voila, we've got the same result as before we started but using a very different way to get there.

So where from here?

There is much less to say about this than about the previous technique.

For one, if you want to overlay multiple layers of maps it would be good to look into retrieving the pixel values for each layer in one fragment shader and either picking the right color or blending them. It would be much more efficient than rendering each layer separately and having OpenGL blend them for you.

This technique does not initially lend itself well to projecting the map using a 3D projection; however, using the inverse of a 3D projection opens up an entirely different door. Look around on shadertoy.com and you'll find legions of effects that use this approach as a starting point for raycasting type techniques. That however is an entirely different subject for another day.

On a related subject, and this applies to the previous example as well, it will not have escaped you that allowing the user to rotate our map isn't exactly looking nice with the map we are using. That option really only works if we have a pure top down map. I just wanted to show what we can do with our view matrix even if it may not be something you wish to use in a real game environment.

What's next?

I'm not sure yet. I've been going back and forth on what type of mini game to use to get to the next stage. I want to start looking into the OpenGL equivalent of sprites. In its simplest form we'll just be rendering textured quads, but it will allow us to look at rendering with blending turned on at one end, and at GLFW and control input at the other.

It may thus take a while before I write the next blog post as I'll be experimenting with a few ideas before I get something worth blogging about.

Saturday 4 April 2015

Rendering tiles (part 6)

It's finally time to step it up a little and start drawing something usable. As I mentioned in my previous part, I want to stick with 2D techniques for a while.

The first technique I want to look at is two different ways of drawing a 2D tiled map. 2D tiled maps are used in many games to render our background or floor with and are cornerstone to many 2D platformers.

First off, I have to give a wave to Thorbjørn Lindeijer who's behind the excellent http://www.mapeditor.org/. Tiled is a map editor that allows you to design maps you can subsequently use in your application, and the best bit is that it is free to use and supports Windows, Mac and Linux.

Now I'm not going to look into a full loader for the TMX file format that Tiled uses but stick much closer to the basics. I'm going to use the desert sample that comes with Tiled.

There are two parts to our map.

The first are our tiles. For our tiles we have a simple image file in which we have lots of smaller bitmaps all of equal size that we can use repeatedly to make our map:
This looks really handy to the human eye, nice lines in between the tiles, easy to identify, but for a computer they are annoying. Each tile is a nice 32x32 pixels, and that is something computers like. Another thing OpenGL tends to like, though this limitation no longer applies to all hardware, is texture maps whose dimensions are a power of two, and square. You'll see that the png that ended up in our sample code has the black lines removed and is a nice 256x256 texture map.

Another question you may ask is why not create 48 individual image files, one for each tile? Besides it being easier to handle one file instead of 48, it is also an optimization on our hardware. We'll be drawing each of those tiles multiple times and if we'd have to switch between texture maps our hardware will be wasting precious time.

The second part builds our map itself. We've numbered our tiles 0 through 47 (0-7 being the first row, 8-15 the second, etc) and can now create a map of any size where each cell references one of our tiles that needs to be drawn at that cell. Our sample map is 40x40 tiles and all pieced together looks like this:
That's already a pretty big map :) This could still be of a size where you would just load the entire image as a single texture, it's "only" 1280x1280 pixels (we've lost a few pixels in the export), but that said, this is a very small sample map.
For a real implementation you may use tiles of a much higher resolution and a much larger map.

Tiled stores this information in a format called TMX but for our example that format contains way more functionality than I'd want to cover here, so we're taking a simpler approach. Tiled luckily has a CSV export to export a map to CSV files.

To make life easy for our sample logic we're taking that CSV file, adding commas to the end of each line, and copy-pasting it into our source code. This once again is not something you would do for production; you would save the map to a binary file or maybe even to a texture and load that. But for our purposes today, it will suffice:
// map data
unsigned char mapdata[1600] = {
...
};

Now we need to load this information in a way we can use. We are going to do so by loading them as texture maps into our GPU memory.
For our map data this is pretty easy as we already have our data in memory:
  // Need to create our texture objects
  glGenTextures(2, textures);
  
  // now load in our map texture into textures[0]
  glBindTexture(GL_TEXTURE_2D, textures[0]);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, 40, 40, 0, GL_RED, GL_UNSIGNED_BYTE, mapdata);

We start by creating two texture objects by calling glGenTextures. textures is simply an array of two unsigned ints, just like our other GL objects.
We create one for our map data and one for our tile map which we will load in a minute.
We then need to bind the texture we wish to interact with using glBindTexture. Any texture related commands we issue after that affect that texture. First we call glTexParameteri twice to tell OpenGL how we want to filter our image. This determines how OpenGL will interpolate our image when it is rendered at a higher resolution or when it is scaled down. We use GL_NEAREST here as our data map should not be interpolated at all. We also set our image wrapping to clamp to edge. This means that we do not "tile" our image but that the edges of the image are the borders for our texture lookups.
Finally we load our texture map. Note that we're loading it as a single channel image (hence GL_RED).

For our tile map we will use the image loader that can be found in the STB library. I've mentioned STB before as we're using the truetype font logic embedded within to render our text.  It is a wonderful library containing a collection of handy functions to do various things. Loading image files that we can then use as textures is one of those.
  // and we load our tiled map
  data = stbi_load("desert256.png", &x, &y, &comp, 4);
  if (data != 0) {
    glBindTexture(GL_TEXTURE_2D, textures[1]);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, x, y, 0, GL_RGBA, GL_UNSIGNED_BYTE, data);
 
    stbi_image_free(data);
  };
The code here isn't all that different from loading our mapdata. We are using stbi_load to load our PNG file and we are setting our filter to a linear filter.
After we've loaded our texture into GPU memory we no longer need to retain the copy we just loaded and can free the data by calling stbi_image_free.

Note: we do need to include our image library and add our implementation #define in main.c.  
P.S. Another library to look at is SOIL. SOIL is a library that uses stb_image but adds various file formats and many handy wrappers for loading images directly into OpenGL texture objects.

First method - render code


The first method we'll use to render our map is pretty straightforward but somewhat wasteful, though it has some advantages in flexibility. Basically what we are going to do is take our entire 40x40 map and turn it into a 40x40 mesh (well x2, as we need two triangles to render one tile).

We are not however going to build the mesh using vertex buffers. Instead we're going to offload all this to the GPU. Basically we are going to call glDrawArrays (a simpler version of glDrawElements), tell it to draw triangles and tell it the number of vertices involved (40 tiles wide * 40 tiles high * 2 triangles * 3 vertices), and we're doing this without giving it any data.

The result is simply that our vertex shader gets called for each of our vertices and OpenGL will start rendering triangles for every 3 vertices. We'll need to build all our data within our shaders.

Note: we do still need a vertex array object to encapsulate our state even though we hardly load any state.

Before we get to our shaders that do the heavy lifting this is the code in our render loop that eventually will result in our map being drawn:
    // now tell it which textures to use
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, textures[0]);
    glUniform1i(mapdataId, 0);

    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, textures[1]);
    glUniform1i(tileId, 1);
    
    // and draw our triangles
    glBindVertexArray(VAO);
    glDrawArrays(GL_TRIANGLES, 0, 40 * 40 * 3 * 2);
    glBindVertexArray(0);
Here we see that we inform our shader of our two textures we are about to use. We do this by assigning our textures to be the active textures. We can only have a limited number of textures active at any given time, the number depending on the capabilities of the hardware. The GL_TEXTUREn constants may only be defined up to a limited amount but you can access any active texture index by using GL_TEXTURE0 + n instead.

Drawing our map is now a question of making our VAO active (which is pretty much an empty VAO) and calling glDrawArrays.

Note that earlier on in our source code we're loading different shaders and retrieving our mapdataId and tileId uniforms.

Shaders


We're going to stick with our reverse order and look at our fragment shader first. Just to recap, our fragment shader gets called for every pixel we draw on screen to determine its color:
#version 330

uniform sampler2D tiles;

in vec2 T;
out vec4 fragcolor;

void main() {
  fragcolor = texture(tiles, T);  
}

Note that we have two new variables. The first is a uniform (so set from our C source) but of type sampler. This tells OpenGL this is linked to a texture resource.
The second is a new input called T which is a 2D vector, we'll be setting this in our vertex shader. This is the texture coordinate within our tilemap that we'll be drawing.
Our main function simply looks up our texture value using the texture function and assigns it to fragcolor.

Our vertex shader however has become fairly complex so we'll handle that part by part.
#version 330

uniform mat4 mvp;
uniform sampler2D mapdata;

out vec2 T;
We still have our mvp matrix but new is another sampler, this time for our mapdata, and our texture coordinate output T.
void main() {
  // our triangle primitive
  // 2--------1/5
  // |        /|
  // |      /  |
  // |    /    |
  // |  /      |
  // |/        |
  //0/3--------4

  const vec3 vertices[] = vec3[](
    vec3(-0.5,  0.5, 0.0),
    vec3( 0.5, -0.5, 0.0),
    vec3(-0.5, -0.5, 0.0),
    vec3(-0.5,  0.5, 0.0),
    vec3( 0.5,  0.5, 0.0),
    vec3( 0.5, -0.5, 0.0)
  );

  const vec2 texcoord[] = vec2[](
    vec2(         0.0, 31.0 / 256.0),
    vec2(31.0 / 256.0,          0.0),
    vec2(         0.0,          0.0),
    vec2(         0.0, 31.0 / 256.0),
    vec2(31.0 / 256.0, 31.0 / 256.0),
    vec2(31.0 / 256.0,          0.0)
  );
Much like in C code we can define arrays and that is just what we're doing here. We're using two small arrays to create the primitives for the two triangles that make up each tile. The first array contains our vertex coordinates for a unit square centered on (0.0, 0.0, 0.0). The second array contains the texture coordinates for one tile as if our tile is the first tile in our tilemap. Note that as with everything else our coordinate system is unified so our texture map is 1.0 wide by 1.0 high. As our real texture map is 256x256 we need to divide our coordinates.
  // now figure out for which tile we are handling our vertex
  int v = gl_VertexID % 6;
  int i = (gl_VertexID - v) / 6;
  // and for which cell
  int x = i % 40;
  int y = (i - x) / 40;
gl_VertexID is an index given to our shader that tells us which of our 9600 vertices we're currently handling. By taking the modulus 6 of this we know which of the 6 vertices of a tile we are handling; dividing by 6 then gives us i, which indicates the tile we are currently dealing with. From i we can then determine the x and y of the cell within our map.
  // figure out our vertex position
  vec4 V = vec4((vertices[v] + vec3(float(x - 20), float(y - 20), 0.0)), 1.0);
  
  // scale it to a usable size
  V.xy *= 100.0; 

  // and project it
  gl_Position = mvp * V;
Here we calculate V, which initially builds a 40.0 by 40.0 mesh. We scale this up to a 4000.0 by 4000.0 map and finally project it onscreen using our mvp. Remember we set our mvp so that the height of our window is considered to be 1000.0 and the width adjusted for aspect ratio, so our map is roughly 4x larger than what our screen can display.
  // now figure out our texture coord
  int ti = int(texture(mapdata, vec2((float(x) + 0.5) / 40.0, (float(y) + 0.5) / 40.0)).r * 256.0);
  int s = ti % 8;
  int t = (ti - s) / 8;
  T = texcoord[v] + vec2((float(s * 32) + 0.5) / 256.0, (float(t * 32) + 0.5) / 256.0);
}
Finally using our x,y cell coordinates we lookup our value in our mapdata for that cell. Note again our texture map coordinates being unified so we need to divide the coordinates by 40.0 (the +0.5 is to ensure we get the center of our pixel). The output of our texture function is an RGBA value (vec4) but our map only uses the red channel so we grab only our red value. Finally our color also is unified so we need to multiply by 256.0 to get our actual tile index value (ti). We then modulo 8 our tile index to determine the offset of our tile in our tilemap texture and assign the result to T (again unified, so divide by 256.0).
OpenGL does the rest. :)

Navigating our map


If we compiled our sample at this point OpenGL would nicely draw the center of our map, but only about a quarter of it, and we would have no way to interact with it. So it is time to look at some basic keyboard interaction.


We'll add support for WASD controls for moving up/left/down/right. We'll also add support for rotating our map using our O and P keys.

There are two ways of handling this. When we press our key our keyboard callback function gets called with a GLFW_PRESS action, when we release the key we get a GLFW_RELEASE action but more importantly if we keep our key pressed we'll also get GLFW_REPEAT calls every couple of ticks. Checking for GLFW_PRESS and/or GLFW_REPEAT is one way to handle this (and this we will use), the other is checking the status of the key by calling glfwGetKey within our update loop.

Finally important to note is that we will do all our key handling in our main.c but add actions to our engine for our movements. This allows us to call the same actions based on other inputs as well (say a mouse, or joystick, or touch) at some later stage without needing to make changes to our core game engine.

We're going to start with our O and P keys for rotating our map.

For this we're going to define our view matrix properly in our source code first as this is what we'll be modifying using our keystrokes (we're kinda changing our 'camera', the map doesn't move, it is our viewpoint to the map that changes).
All we need to do to kick this off is change our view matrix to be a global variable and to initialize it as an identity matrix in engineLoad instead of in our render loop.

For rotating we add the following method to our engine.c source:
void engineViewRotate(float pAngle) {
  vec3 axis;
  mat4 rotate;
  
  mat4Identity(&rotate);
  mat4Rotate(&rotate, pAngle, vec3Set(&axis, 0.0, 0.0, 1.0));
  mat4Multiply(&rotate, &view);
  mat4Copy(&view, &rotate);
};
And then in our main.c we change our key callback to:
static void key_callback(GLFWwindow* window, int key, int scancode, int action, int mods) {
  if (key == GLFW_KEY_ESCAPE && action == GLFW_PRESS) {
    glfwSetWindowShouldClose(window, GL_TRUE);
  } else if ((action == GLFW_PRESS) || (action == GLFW_REPEAT)) {
    switch (key) {
      case GLFW_KEY_O: {
        engineViewRotate(-1.0);
      } break;
      case GLFW_KEY_P: {
        engineViewRotate( 1.0);        
      } break;
      default: {
        // ignore
      } break;
    };
  };
};
Thanks to being able to rotate our map our WASD function gets a bit more complex because we need to take our rotation into account. Thankfully we can use our view matrix to help us here. We'll use a small part of our matrix to rotate our movement vector and get the following function to move our view:
void engineViewMove(float pX, float pY) {
  vec3 translate;
  
  // we apply our matrix "transposed" to counter-rotate our movement against our rotation
  vec3Set(&translate
    , pX * view.m[0][0] + pY * view.m[0][1]
    , pX * view.m[1][0] + pY * view.m[1][1]
    , 0.0  
  );
  
  mat4Translate(&view, &translate);  
};
And then enhance our keyboard callback:
      case GLFW_KEY_W: {
        engineViewMove( 0.0,  5.0);        
      } break;
      case GLFW_KEY_S: {
        engineViewMove( 0.0, -5.0);        
      } break;
      case GLFW_KEY_A: {
        engineViewMove( 5.0,  0.0);        
      } break;
      case GLFW_KEY_D: {
        engineViewMove(-5.0,  0.0);        
      } break;


So where from here?

So that takes care of our basic approach but it is still far from a complete solution. This is however as far as I will take it for now.

So what are things that you'll want to improve or can do next with this?
  • The most important improvement is that while this approach is fine for a 40x40 map, once it gets larger we've got a lot of overhead. Say our map is 1000x1000 cells; that is a lot of triangles being created every frame, 90% of which are off screen. You could load the 1000x1000 cell map as your texture but only create a, say, 10x10 mesh centered on the screen and use an offset to render only a 10x10 slice of cells in your larger map. That wouldn't take more than a few additional lines of code in the vertex shader to accomplish.
  • In line with that, you'd probably want to convert your mapdata into an image file so you can just load it. Do be sure you don't use a format such as JPEG as it will corrupt your data. 
  • Add multi layered maps. A good optimization is to load the map into a single texture using R for the first layer, G for the second, B for the third and A for the fourth, and render your map 4 times (vary the Z!). Do make sure when using the alpha channel that you do not use 0 as the GPU assumes that is transparent and will discard the RGBA value. Do use your alpha channel in your tile map for your overlapping layers; remember that even when blending is turned off, an alpha of zero means the pixel is not drawn. You could even keep 4 separate view matrices to create parallax scrolling.
  • Use a perspective matrix instead of an orthographical matrix and you can create an F-Zero type map. Add a texture map as a height field and you've got the beginnings of a full 3D terrain renderer (we'll definitely revisit that once we start looking at 3D).
  • Finally, add support for other input devices like the mouse (we'll add mouse support later on in the series).
And our end result:


What's next?

As with the previous large write-up I'll be rereading all this over the next couple of days so expect some edits as I find dumb typos and stuff.

In the next post we will look at a different technique to render the same map as in this write-up. While the method described here is very flexible, it suffers from wasted processing power as we're rendering far more triangles than we actually display on screen. Often enough the added flexibility far outweighs the overhead, especially if we apply the enhancement in our first bullet point up above. Still, there is another technique with which we only render what we need.