A Simple OpenGL-based API for Texture Mapped Text

Look around you. Most of the human-made surfaces in view probably have writing on them. Written text is pervasive in our world. Yet many 3D games and other 3D applications suffer from the lack of readable text within the 3D scenes they render.

Text in scenes adds a real-world richness. Particularly in 3D games, text can convey immersive information. For example, the proverbial writing on the wall may really say "Death lies beyond this corner." The game player will probably do well to take the clue.

So how does a 3D programmer add text into 3D scenes? Good 3D text rendering must be fast but also flexible so that it can be projected, scaled, and rotated as required. This sounds like a job for texture mapping, particularly when accelerated by 3D graphics hardware. As we will see, it is.

Basics of Texture Mapping Text

Texture mapping is well suited for rendering text because textures can be rendered quickly with current 3D hardware and even via clever programming of today's fast CPUs. Textures can be stretched, rotated, scaled, and even projected (assuming the texture mapping is perspective correct) so that texture mapped text looks reasonable in 3D scenes. Other text rendering techniques are drawing bitmaps, rendering characters as connected lines (stroke fonts), or rendering characters as polygons (outline fonts). Bitmaps are fast but do not rotate, scale, or project well. Stroke and outline fonts tend to render slowly (more transformation overhead) and are somewhat hard to adapt to the varying levels of detail in 3D scenes.

A naive approach would treat every word or phrase written on every single surface as a distinct texture. This would let you add text to your scene, but it is expensive because of the number of texture images that would be required for lots of text. Additionally, the text would be static. If the only text in your scene appears on a couple of objects like traffic signs (Stop, Yield, etc.), static textures with text may be fine, but the real world has a lot more variety. Static textures containing text do not handle situations like chalkboards (dynamic text) or books (lots and lots of text).

From a graphics standpoint, text consists of sequences of letters (or characters or glyphs) decalled on a surface. Instead of treating every instance of text as a separate texture, what if we could render text from a set of characters? In 2D graphics, you are used to having a font (or character set) from which you render text. Imagine the same thing in 3D.

Again, there is a naive approach to avoid. Say we created a texture for each glyph in a font. That would work, but that would be a lot of textures. Plus, texture mapping hardware generally adds a cost to switching textures. If you switch textures too often, you are probably wasting performance. Instead, let's keep all the glyphs in a font (or at least all the ones we plan on actually using) in a single texture. Then, when we render different characters in a line of text, we just change our texture coordinates to match where each character resides in our single texture map. Does this really work? Yes.
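To make the idea concrete, here is a minimal sketch of drawing one character as a textured quad, using a hypothetical glyph record rather than the actual TXF API described later. Notice that only the texture coordinates change from character to character; the texture itself stays bound:

  /* Hypothetical glyph record for illustration; the TXF API described
     below stores equivalent per-glyph information for you. */
  typedef struct {
    float s0, t0, s1, t1;  /* corners of the glyph in texture space */
    float width, height;   /* glyph size in modeling units */
  } Glyph;

  void drawGlyph(const Glyph *g)
  {
    glBegin(GL_QUADS);
    glTexCoord2f(g->s0, g->t0); glVertex2f(0.0, 0.0);
    glTexCoord2f(g->s1, g->t0); glVertex2f(g->width, 0.0);
    glTexCoord2f(g->s1, g->t1); glVertex2f(g->width, g->height);
    glTexCoord2f(g->s0, g->t1); glVertex2f(0.0, g->height);
    glEnd();
    glTranslatef(g->width, 0.0, 0.0);  /* advance to the next character */
  }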

An Example

Here is an example (the full source code for this example and the texture data are available on-line, see below):

Notice that the text is rotated, projected, scaled differently, and decalled on a 3D surface, and that the color varies across the text reading "3D". It looks pretty nice. A bonus is that all the rendering operations for texture mapped text are accelerated by good OpenGL hardware.

The scene above is all drawn with a single texture. So what does the texture map look like? Check it out:

Notice that you can find all the glyphs in the 3D scene within the texture image (that shouldn't be too surprising).

Two questions are worth answering. The first question is about the background color. The second question is about the foreground color.

About the Background Color

First, what about the background color? In the texture, the background color is black, but there is no black in the text or on the cube. What happened? Well, the texture isn't really a simple black & white texture. The texture actually uses OpenGL's GL_INTENSITY texture format. The GL_INTENSITY format has both luminance (grayness) and alpha (transparency) information, but the luminance and alpha values at every texel are the same. This texture format turns out to be ideal for texture mapped text.
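For the curious, specifying such a texture takes a single OpenGL call. Here is a sketch (the width, height, and texels variables are placeholders; the TXF API described below does the equivalent for you):

  /* One byte per texel; GL_INTENSITY4 stores a 4-bit value that serves
     as both the luminance and the alpha of each texel. */
  glTexImage2D(GL_TEXTURE_2D, 0, GL_INTENSITY4, width, height, 0,
    GL_LUMINANCE, GL_UNSIGNED_BYTE, texels);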

GL_INTENSITY textures do not take up much space compared to other texture formats. That's good since it leaves more texture memory for other fonts and other color textures. We don't want to waste valuable texture memory if we can avoid it.

So white areas of the texture have a luminance value of 1.0 (white) and an alpha value of 1.0 (fully opaque, not transparent); the black areas have a luminance value of 0.0 (black) and an alpha value of 0.0 (fully transparent). The alpha component lets us do two nice things. First, with OpenGL alpha testing, we can avoid drawing pixels whose alpha falls below a certain threshold (the second nice thing, blending, comes in a moment). You can't tell in the scene above because alpha testing is enabled, but each character is really a single 3D rectangle. Alpha testing only allows the glyph pixels to get updated.
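The state needed is just two calls. Here is a sketch using a 0.5 threshold to match the description below (the demo's actual setup, shown later, uses a lower threshold of 0.0625 to preserve more of the filtered glyph edges):

  glAlphaFunc(GL_GEQUAL, 0.5);  /* discard fragments with alpha below 0.5 */
  glEnable(GL_ALPHA_TEST);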

Let me show you what I mean. The scene below is almost the same as the one above, except I've disabled alpha testing so you can see the actual textured rectangles that are being drawn:

Maybe now the question about the background color makes more sense! Of course, you already know the answer. Alpha testing has eliminated every texture pixel with an alpha of less than 0.5. That means all the background texels get discarded, so you don't see any ugly black in the correct initial version of the scene.

We can actually do a little better than that. We can use alpha blending to blend the texels in with the underlying surface (the green cube). But didn't I say the alpha component in the texture is 1.0 for the foreground of each character? Why would we want to blend when basic blending with an alpha of 1.0 is just a replace? Well, if we use improved texture filtering (say GL_LINEAR or better), the texture filtering will give us alpha values between 0.0 and 1.0, so blending makes sense. Let's see a magnified view of the scene with high-quality texture filtering and alpha blending enabled:

Wow, the edges of the text actually look nicely antialiased! By combining several OpenGL features (alpha blending, alpha testing, intensity textures, and high-quality texture filtering), we can get very nice looking 3D text.
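For reference, here is a sketch of the texture filtering and blending state involved (the blending calls also appear in the complete setup shown later):

  /* Mipmapped minification plus linear magnification yields fractional
     alpha values along glyph edges; blending then feathers those edges
     into the underlying surface. */
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
    GL_LINEAR_MIPMAP_LINEAR);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
  glEnable(GL_BLEND);
  glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);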

As an aside, consider if we tried the above in Direct3D. Uh, it wouldn't really work that well. Why? First off, as best as I can tell (the documentation is so poor), Direct3D does not support an intensity texture format. If that is indeed true (I believe it is), you'll end up using way more texture memory than an OpenGL 1.1 program would. Indeed, I believe Direct3D only supports RGB and RGBA textures according to the documentation I have. Since texture mapped text must have an alpha component (the A), the texture would be FOUR times bigger than what we really need with OpenGL's GL_INTENSITY. That's bad, particularly on consumer 3D graphics cards where texture memory is a very limited resource. And Microsoft claims Direct3D is good for low-end 3D games on inexpensive hardware! Also, the alpha testing and alpha blending capabilities we use (alpha testing is vital!) may not be supported on all Direct3D implementations. If a card doesn't have alpha testing, users of your game or 3D application with that card will be mighty unhappy (either they get slow performance or bad artifacts). You'll have to probe Direct3D's poorly defined capability bits to figure out if you can use alpha testing or not (see dwAlphaCmpCaps). If this sounds like a criticism of Direct3D, it is (though a purely technical one).

About the Foreground Color

Second, the foreground color for the text is blue, not white like in the texture shown above. How did that happen? Be happy it's not white, because you probably don't want pure white text everywhere. It happens that OpenGL allows the current color (post-lighting) to be modulated with the texture color. So in the scene above, the blue letters are drawn with the current color set to (0.2, 0.2, 0.9) RGB (a nice blue). This color gets multiplied (modulated) by the texel color (post-texture-filtering), so texels with a luminance of 1.0 end up exactly that blue. With high-quality texture filtering enabled, you'll actually get less blue around the edges of characters, which helps the antialiasing shown above.
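In OpenGL terms, that modulation is just the GL_MODULATE texture environment combined with the current color; a two-line sketch:

  glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
  glColor3f(0.2, 0.2, 0.9);  /* white texels now render as this blue */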

Also, if you look at the letters "3D" on the left side of the first scene, you'll see the color smoothly varying from red at the top to blue at the bottom. This nice effect falls out of just specifying red at the top vertices of each glyph's textured rectangle and blue at the bottom vertices. OpenGL's smooth shading takes care of interpolating between the two colors.
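Here is a sketch of one such glyph rectangle with per-vertex colors; the glyph's texture coordinates and size are placeholders:

  float s0 = 0.0, t0 = 0.0, s1 = 1.0, t1 = 1.0;  /* placeholder texcoords */
  float w = 1.0, h = 1.0;                        /* placeholder glyph size */

  glShadeModel(GL_SMOOTH);   /* the OpenGL default */
  glBegin(GL_QUADS);
  glColor3f(0.9, 0.2, 0.2);  /* red along the top edge */
  glTexCoord2f(s0, t1); glVertex2f(0.0, h);
  glTexCoord2f(s1, t1); glVertex2f(w, h);
  glColor3f(0.2, 0.2, 0.9);  /* blue along the bottom edge */
  glTexCoord2f(s1, t0); glVertex2f(w, 0.0);
  glTexCoord2f(s0, t0); glVertex2f(0.0, 0.0);
  glEnd();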

One important issue is avoiding depth buffer artifacts when decalling glyphs onto a surface. If you are not careful, the plane of the surface and the plane of the glyph will have nearly identical depth buffer values and they will "fight" so that the glyphs will only appear some of the time. The solution is to use OpenGL's polygon offset functionality to "lift" the glyph slightly above the surface in depth buffer precision. Here is what can happen without polygon offset:

The previous example images all use polygon offset to lift the textured glyphs off the surface, avoiding the sort of depth buffer precision artifacts shown in the above image.
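For reference, here is the OpenGL 1.1 polygon offset state the examples use (the same two calls appear in the setup code later; the comments explain the two parameters):

  /* The first parameter scales the offset by the polygon's depth slope;
     the second is a constant bias in smallest-resolvable depth buffer
     increments.  A negative bias lifts the glyphs toward the viewer. */
  glEnable(GL_POLYGON_OFFSET_FILL);
  glPolygonOffset(0.0, -3.0);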

What about Direct3D? Again, Direct3D is not up to the task. You'd have to query Direct3D's dwTextureBlendCaps capability bits to see if D3DPTBLENDCAPS_MODULATE is supported. What if this capability is not supported? Then kiss fast modulation of your textured text color goodbye. The polygon offset capability of OpenGL 1.1 is not supported by Direct3D at all. You might be able to avoid the decalling artifacts by offsetting the textured glyphs in object space, but that can introduce other artifacts.

Making 3D Texture Mapped Text Easy with a Simple OpenGL-based API

So the pictures and discussion above look promising. But isn't texture mapping hard? And how exactly would you construct a texture image like the one shown above? Do not worry. The explanation that follows shows you both how to use texture mapped fonts with OpenGL and how to generate font texture images for your applications. All the source code for the API shown and the example programs described here is available for free.

So how does the program pictured above render texture mapped fonts? It uses a very simple API that I developed. The API lets you load a texture font file (a .txf file) and then start rendering textured fonts with it. All the routines in the API begin with txf (for TeXtured Font).

Start at the beginning. Here's how you would load a texture font file:

  TexFont *txf;
  char *filename = "rockfont.txf";

  txf = txfLoadFont(filename);
  if (txf == NULL) {
    fprintf(stderr, "Problem loading %s, %s\n",
      filename, txfErrorString());
    exit(1);
  }

Pretty easy. What does that do? It opens and loads the texture font (shown above). The texture font includes both the texture image and (very importantly) all the texture coordinates for the glyphs contained in the texture image. Obviously, we'll need to be able to tell where the various glyphs are within the texture when assigning texture coordinates for the textured glyphs we draw later.

Now, you actually need to establish the texture for the font. This involves setting up a texture object (or a display list for OpenGL 1.0 without the texture object extension) containing the font's texture image. Make sure you are current to an OpenGL rendering context and then call:

  txfEstablishTexture(txf, 0, GL_TRUE);

This routine tells OpenGL to make a texture object and let OpenGL assign the texture object number (what the zero means) and create mipmaps to allow higher-quality texture filtering (what GL_TRUE means). You could specify the specific texture object number by passing a positive integer to txfEstablishTexture instead of zero. You could save a bit of texture memory (25%) by not creating mipmaps and specifying GL_FALSE.

Now, set up some OpenGL state and you'll be ready to render with textured text:

  glEnable(GL_TEXTURE_2D);
  glAlphaFunc(GL_GEQUAL, 0.0625);
  glEnable(GL_ALPHA_TEST);
  glEnable(GL_BLEND);
  glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
  glEnable(GL_POLYGON_OFFSET_FILL);
  glPolygonOffset(0.0, -3.0);

That enables 2D texture mapping, enables alpha testing to drop fragments with alpha values close to 0.0, enables blending for nice edges, and enables polygon offset to avoid the depth buffer artifacts discussed earlier. We could skip the blending for better performance, but then we would lose the nice antialiased edges.

Now, figure out how long the string you want to render is. Say you want to render the string "OpenGL" (a good choice since all the letters are in the rockfont.txf texture image) on a 5 by 5 unit square centered at the origin of our modeling coordinate system. We want to know the length so that we can scale the text to fit in our 3D scene correctly. Figure the "OpenGL" text metrics like this:

  int width, ascent, descent;
  char *text = "OpenGL";

  txfGetStringMetrics(txf, text, strlen(text),
    &width, &ascent, &descent);

The width comes back as 351 units (the ascent above the baseline is 75 units; the descent below the baseline is 32 units). This means that to fit the text perfectly on the 5 by 5 unit square, we'll need to scale the text by 5/351 before rendering and then translate it left 2.5 units (after the scaling). That is easy with OpenGL's modeling matrix manipulation routines:

  glMatrixMode(GL_MODELVIEW);
  glTranslatef(-2.5, 0.0, 0.0);              /* applied second: center the text */
  glScalef(5.0/width, 5.0/width, 5.0/width); /* applied first: 351 units -> 5 */

Now, we just render the texture mapped text:

  txfRenderString(txf, text, strlen(text));

Pretty easy. Fancy effects like making a circle from the words "OpenGL OpenGL" are not much harder: you just do a slight rotate with glRotatef after rendering each character. A routine called txfRenderFancyString lets you embed "escape" sequences in strings to control character coloration effects like the smooth shading in the word "3D" in the images above.
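Here is a sketch of the circle trick, assuming (as the trick implies) that txfRenderString advances the modelview matrix past each glyph it draws:

  char *text = "OpenGL OpenGL ";
  int i, len = strlen(text);

  for (i = 0; i < len; i++) {
    txfRenderString(txf, &text[i], 1);      /* draw one character */
    glRotatef(360.0 / len, 0.0, 0.0, 1.0);  /* slight turn after each one */
  }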

Show Me the Source Code!

The TXF texture mapped font API is implemented by texfont.c; the API uses the TexFont.h header file.

The program shown above is txfdemo.c; a simpler TXF program showing rotating text is simpletxf.c. The program to show the complete glyph set texture image for a .txf file is showtxf.c. Check out the source code.

You'll want a few TXF texture font files. I'll supply a few .txf files that you can download: rockfont.txf, curlfont.txf, default.txf, haeberli.txf and sorority.txf (be sure to download these to a file; you can't view them with your browser; use Shift-LeftClick in Netscape to download). Each font is just over 8 kilobytes. Below is what the sorority.txf font (designed by Paul Haeberli of SGI) looks like:

Notice that this font has a lot more useful characters than the rockfont.txf font. Of course, the glyphs in the font are at a lower resolution.

Generating Your Own Textured Font Files

If your 3D game or other 3D application has fairly limited text rendering requirements and uses a small set of glyphs (say just the capital letters and numbers), you can generate your own .txf file containing just the characters your application needs, at higher resolution. Obviously, the characters in Paul's Sorority font shown above won't look very good if they get heavily scaled up. More resolution in the texture image would help.

So how can you generate your own .txf files? Use the gentexfont utility. This X11-based utility lets you read glyphs from fonts available on any standard X server and combine them together into a texture image. This image and the associated glyph metrics are then written out in a .txf file that you can then load and use with the TXF API described and supplied above.

Here is the source code for gentexfont.c. While the program is X-based (I am not a Windows programmer, sorry), I'm sure someone could adapt it to convert standard Microsoft Windows fonts into .txf files. It doesn't matter how you generate the .txf files though: .txf files generated by the X-based gentexfont program work fine on Windows machines.

Here's an example of the command used to generate the sorority.txf file:

  gentexfont \
    -fn '-sgi-sorority-medium-r-normal--40-*-*-*-p-*--ascii' \
    -file sorority.txf \
    -glist \
'`"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMN\
OPQRSTUVWXYZ01234567890 \!@#$$%^&*()-=+/,.<>;:~{}[]' \
    -bitmap \
    -w 256 \
    -h 256 \
    -gap 3

The font is specific to SGI X servers, so don't expect to find the sorority font on non-SGI X servers. The -fn option names the X font to capture. The -file option says what filename to save the .txf file as. The -glist option lists all the glyphs you want captured (you can duplicate characters on the lists; be sure to include space). The -bitmap option says to save in the compact bitmap format (instead of -byte). The -w and -h options specify the width and height of the texture image. The -gap option specifies the gap in texels between each glyph in the texture image.

A little advice: you can generally fit an entire character set in a 256 by 256 texture (32 kilobytes in OpenGL's 4-bit-per-texel GL_INTENSITY4 internal texture format). You can try cramming glyphs into 128 by 128 textures, but you'll probably have to leave out characters to get things to fit (or settle for a tiny glyph resolution). Increasing the texel gap beyond 1 is worthwhile if you plan on using mipmapping with your fonts (see the next section).

Room for Improvement

In the digital image processing world, a texture containing text glyphs would be said to have "high frequency" data. That means good texture filtering for fonts is hard. For fonts to be readable, people like them to be very sharp (high contrast). This raises issues for texture sampling and filtering. In general, things work pretty well, but you should be aware that textured fonts can look pixelized when highly scaled. A GL_LINEAR magnify texture filter can help a little, but it may make the edges of characters look slightly blurry, and they will still look pixelized under enough magnification. You can play with the "Filtering" pop-up submenu in txfdemo to see how different filtering options affect text sharpness, blurriness, and, in the end, readability.

Mipmapping of glyph texture images can be particularly troublesome. To construct the smaller levels of detail, you have to decimate the larger levels of detail. The problem is that with conventional texture filtering, texels from one glyph can "bleed" into another glyph as you decimate. You could avoid some of the problem by implementing a fancy mipmap filter that is more "aware" of glyph boundaries, but eventually you get down to a 1 by 1 level of detail and all the glyphs bleed into a single pixel (even before that, things are pretty bad). In general, by the time glyphs start using the smaller levels of detail, the text probably isn't going to be readable no matter what, so you don't need to be overly concerned. You can help things by adding a larger texel gap with the -gap option to gentexfont, but this steals some space that could have been used for more glyphs (you make the tradeoff).
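To see why the bleeding happens, consider how one mipmap level is conventionally built from the next. This sketch box-filters an 8-bit intensity image down one level; the averaging of neighboring texels is exactly what mixes one glyph's texels into its neighbor's across a small gap:

  /* Conventional 2x2 box decimation of an 8-bit intensity image.
     Each destination texel averages four source texels, so texels
     from adjacent glyphs mix together as the levels shrink. */
  void decimate(const unsigned char *src, unsigned char *dst,
                int w, int h)
  {
    int x, y;

    for (y = 0; y < h/2; y++) {
      for (x = 0; x < w/2; x++) {
        dst[y*(w/2) + x] =
          (src[2*y*w + 2*x]     + src[2*y*w + 2*x + 1] +
           src[(2*y+1)*w + 2*x] + src[(2*y+1)*w + 2*x + 1]) / 4;
      }
    }
  }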

Better texture filtering (anisotropic filtering) could provide better quality for very "edge on" text (like words written on a road far in the distance viewed by a driver), but it is generally slower and more expensive than conventional mipmapped filtering. Almost no current 3D graphics hardware for PCs and workstations does anisotropic filtering. For most games and other 3D applications, most text that needs reading is generally viewed reasonably straight on. Even with better filtering, "edge on" text can still be hard to read.

Conclusions

I hope that you find this discussion and the associated source code interesting. Texture mapped fonts can add a whole new level of realism to 3D games and other applications. Unfortunately, Microsoft's self-proclaimed "consumer 3D" API is actually quite poor at rendering with the techniques described. Direct3D lacks an intensity texture format; lacks guaranteed alpha testing, alpha blending, and texture modulation; and lacks OpenGL's polygon offset functionality, all of which are needed for good texture mapped text. On the other hand, OpenGL 1.1 is extremely well suited for fast, reasonable quality texture mapped fonts. Happy font rendering.

Mark Kilgard (mjk@sgi.com)
Silicon Graphics