WebFonts Proposal

Paul Haeberli
Silicon Graphics Computer Systems
13 Dec 1995
Last Updated 4 Mar 1996
Last Updated 25 Jan 1999

This note describes a simple and efficient way of describing bitmap fonts for down-loading to web browsers. I call this font format a "WebFont".
Motivation
Current HTML tags supported by Netscape and others provide control over the relative size of text, and let the web page author select Italic and Bold for text used on a page. Unfortunately, the actual bitmaps for the characters are supplied by the platform the browser is running on. This means that documents can and will look quite different on a PC, a Mac, a DEC workstation or an IRIS workstation. Line wrapping will happen in different places, and text can look very poor on some platforms at some sizes.
The fact that text is drawn differently on each platform makes it extremely difficult to actually design for this medium.
However, if descriptions of bitmap fonts could be sent down to the web browser along with the text and images in the page, I believe we could make a major step forward. A grid for a page could be defined using the <table> tags provided by HTML, and the text would be drawn using bitmaps that are selected by the page designer. This would have to be supported by modifications to the browsers.
A Few Benefits
This approach makes it easy to support non-Latin character sets in a very simple, decentralized way. Documents in Sanskrit, Arabic or Hebrew could be represented very nicely using downloaded bitmap fonts.
Another nice thing about this idea is that it democratizes fonts on the web. Any font designer can easily use or make their fonts available for web page designers.
In addition, anti-aliased bitmaps could be transmitted to the client for display at particular sizes.
Format of a WebFont

A WebFont consists of a GIF image, and a binary file that contains information describing where the pixels for each character in a font can be found in this image.

The .gif image may look something like this.

The GIF image may have only two colors or may have up to 256 levels of gray. It can have transparency if it is desired. This GIF image uses 4 levels of gray.
The binary file containing the other information has a format that is described by this simple grammar:
font = magic | advancedivisor | lineadvance | advancecode | glyphsetlist | glyphlist
The font information consists of a magic number, a divisor that is applied to x and y advances, the normal advance between lines of text, a code that tells whether the font has x advances, y advances, or both x and y advances, a list of glyphsets, and a list of glyphs. A glyph is the image of part or all of a printed character. Some characters will be described by one glyph, but others may be more compactly represented as two or more glyphs. A letter with an accent will typically be described by one glyph that is the image of the letter, and one glyph that is the image of the accent.

magic = 23 | 45
The magic number is the character value 23 followed by the character value 45.
advancedivisor = divisor
This value is a short. Remember that all short values are written as two bytes: the most significant byte followed by the least significant byte. Advance values are divided by this value before being used. This is typically 1.
lineadvance = xadvance | yadvance
Two short values specify how much to move horizontally and vertically to advance one line.
advancecode = value
This is a single byte.

if b0 = 0, then type advances from left to right.
if b0 = 1, then type advances from right to left.
if b1 = 0, then lines advance from top to bottom
if b1 = 1, then lines advance from bottom to top

if b2 = 0, then no X advances are given per character.
if b2 = 1, then X advances are given per character.
if b3 = 0, then no Y advances are given per character.
if b3 = 1, then Y advances are given per character.
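
The header fields described so far could be read along these lines. This is a minimal Python sketch, not part of the proposal: the names `FontHeader` and `parse_header` are made up for illustration, the shorts are assumed to be unsigned, and b0 is assumed to be the least significant bit of the advancecode byte.

```python
import struct
from typing import NamedTuple


class FontHeader(NamedTuple):
    advance_divisor: int
    line_xadvance: int
    line_yadvance: int
    right_to_left: bool    # advancecode b0
    bottom_to_top: bool    # advancecode b1
    has_x_advances: bool   # advancecode b2
    has_y_advances: bool   # advancecode b3


# 2 magic bytes + 3 shorts + 1 advancecode byte
HEADER_SIZE = 9


def parse_header(data: bytes) -> FontHeader:
    """Parse magic, advancedivisor, lineadvance and advancecode."""
    if len(data) < HEADER_SIZE:
        raise ValueError("fontinfo data too short for a header")
    if data[0] != 23 or data[1] != 45:
        raise ValueError("bad magic number")
    # ">" selects big-endian (most significant byte first).
    # "H" (unsigned short) is an assumption; the text only says "short".
    divisor, line_x, line_y, code = struct.unpack_from(">HHHB", data, 2)
    return FontHeader(
        advance_divisor=divisor,
        line_xadvance=line_x,
        line_yadvance=line_y,
        right_to_left=bool(code & 0x01),
        bottom_to_top=bool(code & 0x02),
        has_x_advances=bool(code & 0x04),
        has_y_advances=bool(code & 0x08),
    )
```

The glyphset list would start at byte offset `HEADER_SIZE` under these assumptions.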
glyphsetlist = glyphset | glyphsetlist

glyphset = nglyphs | startcode
startcode provides a character code, and nglyphs specifies how many sequential characters, starting at that code, are defined in the font. These values are each shorts. There is a variable length list of glyphsets. The last glyphset is followed by two zero bytes.
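
Reading the glyphset list might look like the following minimal sketch. The function names are hypothetical, the shorts are assumed to be unsigned and big-endian, and the two terminating zero bytes are interpreted as an nglyphs value of 0, which is one plausible reading of the text rather than something it states outright.

```python
import struct


def parse_glyphsets(data: bytes, offset: int):
    """Return ([(startcode, nglyphs), ...], offset just past the terminator)."""
    glyphsets = []
    while True:
        (nglyphs,) = struct.unpack_from(">H", data, offset)
        offset += 2
        if nglyphs == 0:
            # Assumed terminator: the two zero bytes after the last glyphset.
            break
        (startcode,) = struct.unpack_from(">H", data, offset)
        offset += 2
        glyphsets.append((startcode, nglyphs))
    return glyphsets, offset


def character_codes(glyphsets):
    """Expand the glyphsets into the full list of character codes they cover."""
    codes = []
    for startcode, nglyphs in glyphsets:
        codes.extend(range(startcode, startcode + nglyphs))
    return codes
```

For instance, a glyphset with startcode 32 and nglyphs 95 would cover the codes 32 through 126.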
glyphlist = glyph | glyphlist

glyph = sizecode | advance | imagerectlist

sizecode = value
This is a single byte. When b6 is set, it is followed by a second byte giving the number of imagerects.

if b0 = 0, the advancevalue is a single byte.
if b0 = 1, the advancevalue is a single short.
if b1 = 0, the imagerect sizes are given with a single byte each.
if b1 = 1, the imagerect sizes are given with a single short each.
if b2 = 0 and b3 = 0, the imagerect pixoffset is specified by 1 byte.
if b2 = 1 and b3 = 0, the imagerect pixoffset is specified by 2 bytes.
if b2 = 0 and b3 = 1, the imagerect pixoffset is specified by 3 bytes.
if b2 = 1 and b3 = 1, the imagerect pixoffset is specified by 4 bytes.
if b4 = 0, the imagerect dstorgs (xdorg and ydorg) are given with a single byte each.
if b4 = 1, the imagerect dstorgs (xdorg and ydorg) are given with a single short each.
if b5 = 0, the pixel order is row by row from the bottom left
if b5 = 1, the pixel order is col by col from the bottom left
if b6 = 0, then there is only one imagerect given
if b6 = 1, the number of imagerects follows the sizecode as an unsigned byte.
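
The bit table above can be turned into field widths with a few shifts and masks. This is only a sketch: `SizeCode` and `decode_sizecode` are illustrative names, and b0 is assumed to be the least significant bit.

```python
from typing import NamedTuple


class SizeCode(NamedTuple):
    advance_bytes: int      # b0: width of each advance value
    size_bytes: int         # b1: width of xsize and ysize
    pixoffset_bytes: int    # b2, b3: width of pixoffset (1 to 4 bytes)
    dstorg_bytes: int       # b4: width of xdorg and ydorg
    column_major: bool      # b5: pixels stored col by col instead of row by row
    has_rect_count: bool    # b6: an imagerect count byte follows the sizecode


def decode_sizecode(value: int) -> SizeCode:
    def bit(n: int) -> int:
        return (value >> n) & 1

    return SizeCode(
        advance_bytes=2 if bit(0) else 1,
        size_bytes=2 if bit(1) else 1,
        # (b2, b3) = (0,0) -> 1, (1,0) -> 2, (0,1) -> 3, (1,1) -> 4
        pixoffset_bytes=1 + bit(2) + 2 * bit(3),
        dstorg_bytes=2 if bit(4) else 1,
        column_major=bool(bit(5)),
        has_rect_count=bool(bit(6)),
    )
```

Treating b2 and b3 as a two-bit number plus one reproduces the four pixoffset cases without a lookup table.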
advance = value or value | value
If the font has both x and y advances, both are given; otherwise only one advance value is given, and the other defaults to 0. Each advance value is represented with a single byte or a single short depending on the sizecode b0.
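
To illustrate how the advance values and the advancedivisor might combine when laying out a run of glyphs, here is a small sketch. `pen_positions` is a hypothetical helper, and treating right-to-left type as a negated x advance is an assumption; the proposal does not spell out the drawing model.

```python
def pen_positions(advances, divisor=1, right_to_left=False, start=(0.0, 0.0)):
    """Return the pen position at which each glyph would be drawn.

    advances is a sequence of raw (xadvance, yadvance) pairs as stored in
    the fontinfo file; each is divided by the advancedivisor before use.
    """
    x, y = start
    # Assumption: right-to-left type moves the pen in the negative x direction.
    sign = -1 if right_to_left else 1
    positions = []
    for xadv, yadv in advances:
        positions.append((x, y))
        x += sign * xadv / divisor
        y += yadv / divisor
    return positions
```

With a divisor of 2, raw advances of 10, 12 and 8 move the pen by 5, 6 and 4 pixels, which shows how a divisor larger than 1 allows fractional advances.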
imagerectlist = imagerect | imagerectlist

imagerect = xsize | ysize | pixoffset | xdorg | ydorg
Each of the size values is represented as a single byte or a single short depending on the sizecode b1.

The source pixoffset is an offset in the pixel data provided by the GIF image in the standard GIF pixel order from the upper left hand corner to the lower right hand corner. This is represented by 1, 2, 3 or 4 bytes depending on the sizecode b2 and b3.

Each of the destination org values is represented as a single byte or a single short depending on the sizecode b4.
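
Putting the glyph grammar together, one glyph record could be read roughly as follows. This is a sketch under several assumptions that the proposal leaves open: all values are read as unsigned and big-endian, b0 is the least significant bit of the sizecode, and a font with neither x nor y advances stores no advance bytes at all. The helper names are invented for illustration.

```python
def read_uint(data: bytes, offset: int, nbytes: int):
    """Read an nbytes-wide big-endian unsigned value; return (value, new offset)."""
    end = offset + nbytes
    if end > len(data):
        raise ValueError("unexpected end of fontinfo data")
    return int.from_bytes(data[offset:end], "big"), end


def parse_glyph(data: bytes, offset: int, has_x: bool, has_y: bool):
    """Parse sizecode, advance and imagerectlist; return (glyph dict, new offset)."""
    sizecode, offset = read_uint(data, offset, 1)
    adv_n = 2 if sizecode & 0x01 else 1
    size_n = 2 if sizecode & 0x02 else 1
    pix_n = 1 + ((sizecode >> 2) & 1) + 2 * ((sizecode >> 3) & 1)
    org_n = 2 if sizecode & 0x10 else 1

    nrects = 1
    if sizecode & 0x40:
        # b6 set: the imagerect count follows the sizecode as an unsigned byte.
        nrects, offset = read_uint(data, offset, 1)

    xadv = yadv = 0  # a missing advance defaults to 0
    if has_x:
        xadv, offset = read_uint(data, offset, adv_n)
    if has_y:
        yadv, offset = read_uint(data, offset, adv_n)

    rects = []
    for _ in range(nrects):
        fields = []
        # Field order: xsize, ysize, pixoffset, xdorg, ydorg
        for width in (size_n, size_n, pix_n, org_n, org_n):
            value, offset = read_uint(data, offset, width)
            fields.append(value)
        rects.append(tuple(fields))

    glyph = {
        "xadvance": xadv,
        "yadvance": yadv,
        "column_major": bool(sizecode & 0x20),
        "rects": rects,
    }
    return glyph, offset


def pixoffset_to_xy(pixoffset: int, image_width: int):
    """Map an offset in GIF pixel order to (x, y) measured from the upper left."""
    y, x = divmod(pixoffset, image_width)
    return x, y
```

If destination origins can be negative (for example, to place an accent), the reads for xdorg and ydorg would need a signed interpretation instead.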

The GIF image describing the WebFont is read in after the fontinfo file above is read. A naming convention is used to find the name of the GIF image from the name of the fontinfo file. The name of the GIF file is just the name of the fontinfo file followed by ".gif". The suggested extension on WebFonts is ".wf". The fontinfo for the WebFont shown at the top of this page would be "MatrixBook36.wf", and the GIF image with the glyph pixels would be named "MatrixBook36.wf.gif".
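
The naming convention amounts to a one-line rule. A tiny sketch, with `gif_name_for` as a hypothetical helper name:

```python
def gif_name_for(fontinfo_name: str) -> str:
    """Derive the GIF name by appending ".gif" to the full fontinfo name."""
    # The ".wf" extension is kept, not replaced.
    return fontinfo_name + ".gif"
```

Because the suffix is appended rather than substituted, the same rule works for a fontinfo file that does not use the suggested ".wf" extension.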
Performance
A WebFont encoded in this way will typically be transmitted as about 2000 to 3000 bytes of data. This is comparable to a single GIF image. WebFonts can also be cached on the client just as images are.
New HTML tags
To allow HTML page designers to specify a WebFont to use, I propose three new tags. The first one defines a font. The second one selects a defined font for use. The last one redefines which typeface is used for paragraph text and headers.
To define a font for later use on a page you would use:
<fontdef src="MatrixBook36.wf" idno=1>
To select a defined font to be used for a section of text you would use this tag:
<font idno=1>Your text.</font>
As an alternative to using the font tag to select the typeface to use for a particular section of text, another tag can be used to redefine the typefaces used for headers and paragraph text.
<usefont idno=1 tag=p>
<usefont idno=3 tag=p.b>
<usefont idno=2 tag=p.i>
<usefont idno=4 tag=p.bi>
<usefont idno=1 tag=h3>
<usefont idno=3 tag=h3.b>
<usefont idno=2 tag=h3.i>
<usefont idno=4 tag=h3.bi>

An Example
For the WebFont given here, the GIF image uses 4167 bytes, and the fontinfo file requires 770 bytes. This is for a 36 pixel high anti-aliased font with 4 levels of gray. Times-Roman 12 pixels high can be described with 1375 bytes.
I wrote a Java demonstration that reads and displays a typeface defined in this way. Please give it a try if you have a Java-enabled browser. This is just a demonstration; in a real application, the web browser would be modified to read in and display the WebFont.
Generating WebFonts
I use this C program to create a WebFont from an Adobe Type1 font. It accepts arguments that select the size that is desired and the amount of supersampling to use to generate anti-aliased bitmaps. It runs an Adobe PostScript RIP to render each character. The black and white output image is converted into a GIF image with some number of gray levels.
This is the typical sequence of UNIX commands:
% webfont MatrixBook out.bw MatrixBook12.wf 12 4
% togif out.bw MatrixBook12.wf.gif -n 4 -t 1000 -u
It would be very easy to support conversion of TrueType fonts as well.
Use of PNG image format instead of GIF image format
PNG format could be used instead of GIF format to represent bitmaps in the font. This would have three advantages: the metric information could be stored as metainformation in the PNG file, more levels of transparency can be used, and the PNG format does not rely on patented compression technology. Thanks to Hakon Lie for suggesting this.
Adjusting Tracking and Leading
It would be useful to add tags that allow the space between letters, and the space between lines, to be adjusted.
Identifying Illegal Font Use
There may be a technical solution that would make it very easy to find public web pages that use fonts illegally. This could be done by combining a network search engine with a 64-bit tag in each WebFont. This tag would be associated with the web page author, and written to the WebFont when it is created.
Printing
It might be nice to store the name of the Type1 or TrueType font that the WebFont is derived from in the fontinfo file. This would allow high quality printing to happen if the associated outlines are installed on your machine.
The Outline Alternative
The idea of using a hinted outline format like Type1 or TrueType is a nice alternative to all this. I'm just afraid it may be too difficult to get the font suppliers to agree to letting people send these across the net. Can anyone comment on this?
Other Suggestions
Someone mentioned a desire to handle kerning and ligatures. I suppose this could be added without too much work.
Watch the Font Discussion at Verso
David Siegel and associates put together a web site a few years ago that presented various industry viewpoints. Alas, it's gone now.
Interesting Font Hacks Using HTML or Java
Here are several sites that show different ways of getting more control over typography using HTML and/or Java.

GraphicFont is a completely amazing hack by Kevin Hughes. Kevin has developed and documented his Java class that lets you use different fonts in your applets. Where's Kevin Now?
Shodouka is an Internet mediator service that renders Japanese pages for systems without Japanese fonts. This very nice work was done by Ping.
GifWrapper renders each character as a GIF image. This was developed by the folks from LettError, Erik van Blokland and Just van Rossum.
HotTea draws Farsi with Java. This was developed by Anoosh Hosseini.
Greek text renders each character as a separate GIF image. This was developed by Pete Ferreira.
A hack by Sawad Brooks. A multiline message is printed in an unusual typeface. Here each letter is a GIF image. Where is this now?
Here are two of my own experiments. The custom typefaces shown here were created by rendering each word as a separate GIF image. A similar idea is used in this Kanji/Roman alphabet.
Macromedia's Flash format allows typefaces to be downloaded and animated with ease. This is some fine font technology.
Hrant H. Papazian at UCLA has been exploring the idea of making hand tuned grey scale bitmaps for drawing nice looking text. A sample can be found here. He describes this as "A bunch of little hand-crafted anti-aliased GIFs of the letters (only lowercase for now) coupled to a CGI script that takes a text file and converts the ASCII to a stream of GIFs."
If you know of other interesting web font hacks, please let me know!
A Call for Comments
This is just a suggestion, and I can think of other things that could be added. If you have comments or questions, please contact me.

New and modified parts of this proposal are shown in brown.
This proposal does not represent the position of Silicon Graphics.

Paul Haeberli