Draft proposal for Gnome printing architecture

28 Feb 2000: This document is still basically a valid overview of Gnome-Print. In addition to the original PostScript driver, drivers now exist for Gnome Canvas-based preview, direct render to RGB (also using the Canvas to render), and metafiles.

The high level text interface is still not done, but work continues on it under the Pango umbrella.

There are a fuckload of other things out there that overlap somewhat with Gnome-Print, including:

VA and HP are doing something or other about printing, or at least talking about it.
APSL is a project by Corel for printing. It seems to have more to do with printer control than the graphics API per se.
Gimp has its own Print plugin, of course with its own set of drivers and so on.
GhostScript is of course what most people use when they want to print in Unix.
UMI, the Universal Multimedia Interface, has printing within its scope. Not sure if it's actually useful, though.
CUPS, or Common Unix Printing System, seems to deal primarily with printer control.
libprint, an attempt to make an everything-neutral printing thingy.
If you really want to, you can use the X API to print. I don't particularly recommend it, though.
Qt applications (including KDE) do their printing through QPainter objects, ie the same imaging model as for drawing on the screen.

Also, check this page for info about color management issues.

24 Sep 1998: Sample code is now available.

Download the most recent stable release from:

ftp://ftp.gnome.org/pub/GNOME/stable/sources/gnome-print/

See also: a description of RedHat 6.0's Advanced Font System.

Gnome is in need of a unified printing architecture. This document outlines a proposal for such an architecture, geared towards heavily graphics-intensive applications.

The goals of this architecture include:

Absolutely uncompromised output quality
Speed, memory efficiency, and other related performance goals
Ability to work smoothly with PostScript printers, fonts, and other resources
A screen display derived from the Caanvas
An extension path for a wide variety of Unicode scripts
An extension path for a richer set of graphics operators than PostScript supports, especially transparency
To make life as easy as possible for application developers

Overview

Towards these goals, we propose an architecture comprising several different components. The main component that an application program sees is the printing API. This API is implemented as a library (as part of Gnome). Upon initialization, the application receives a printing context, which is conceptually a canvas for the application to paint on using a sequence of paint method invocations. Finally, the application invokes the showpage method, which causes the page to be imaged.

The printing context has a virtualized interface, and may represent a simple translation into PostScript, rasterization for a non-PostScript printer, rasterization for the screen, or translation into a display list file format.
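
Here's a rough sketch of how such a virtualized interface might look in C: a table of function pointers that each backend fills in. All the names (PrintOps, count_ops, extents_ops, print_moveto and so on) are made up for this example and are not the Gnome-Print API. The two toy backends just count calls and track extents, where real ones would generate PostScript or rasterize.

```c
/* All names below are hypothetical and only illustrate the dispatch idea. */
typedef struct PrintContext PrintContext;

/* One function pointer per paint method; each backend supplies a table. */
typedef struct {
    int (*moveto)   (PrintContext *pc, double x, double y);
    int (*lineto)   (PrintContext *pc, double x, double y);
    int (*showpage) (PrintContext *pc);
} PrintOps;

struct PrintContext {
    const PrintOps *ops;      /* which backend handles the calls */
    int    n_calls;           /* counting backend: calls seen so far */
    int    have_point;        /* extents backend: any point seen yet? */
    double x0, y0, x1, y1;    /* extents backend: box around all points */
};

/* Backend 1: just count method invocations. */
static int count_point (PrintContext *pc, double x, double y)
{
    (void) x; (void) y;
    pc->n_calls++;
    return 0;
}
static int count_showpage (PrintContext *pc)
{
    pc->n_calls++;
    return 0;
}
const PrintOps count_ops = { count_point, count_point, count_showpage };

/* Backend 2: track the bounding box of every coordinate passed in. */
static int extents_point (PrintContext *pc, double x, double y)
{
    if (!pc->have_point) {
        pc->x0 = pc->x1 = x;
        pc->y0 = pc->y1 = y;
        pc->have_point = 1;
    } else {
        if (x < pc->x0) pc->x0 = x;
        if (x > pc->x1) pc->x1 = x;
        if (y < pc->y0) pc->y0 = y;
        if (y > pc->y1) pc->y1 = y;
    }
    return 0;
}
static int extents_showpage (PrintContext *pc)
{
    (void) pc;
    return 0;
}
const PrintOps extents_ops = { extents_point, extents_point, extents_showpage };

/* The application-facing calls only dispatch through the table. */
int print_moveto (PrintContext *pc, double x, double y)
{
    return pc->ops->moveto (pc, x, y);
}
int print_lineto (PrintContext *pc, double x, double y)
{
    return pc->ops->lineto (pc, x, y);
}
int print_showpage (PrintContext *pc)
{
    return pc->ops->showpage (pc);
}
```

The application code is identical whichever table the context points at, which is the whole point of the virtualization.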

Another major feature of the printing API is access to fonts. In our conceptual model, fonts are not associated with a particular printer, but are rather generally available resources, and are sent to the printer when necessary.

Along with the printing API, Gnome will include a text formatting API, which will handle the basics of text formatting, including hyphenation, justification, kerning, and ligatures. In the extension path, this API also combines several PostScript fonts into a single virtual font, and also handles bidirectional text formatting.

First cut at the printing API

This draft of the printing API contains the functions needed to get basic PostScript printing working. It is expected that many functions will be added later. However, it shouldn't be too hard to support backward compatibility with these functions.

Initialization
GnomePrintContext *
gnome_print_context_new (GnomePrinter *printer);
The main function to create a new printing context. For doing a print preview, there may be a similar function that returns both a print context and a print preview widget.
GnomePrinter *
gnome_print_default_printer (void);
This just returns the default printer. There will be a similar call for popping up a "Select printer" dialog box.
void
gnome_print_context_close (GnomePrintContext *gpc);

void
gnome_print_context_free (GnomePrintContext *gpc);
The close call sends the rendered pages to the printer (eg by invoking lpr on the temporary file). The free call destroys all data structures and frees up any other resources. If free is called before close, it's considered an abort.

API calls for rendering vector graphics

A shorthand notation will be used here - the C prototype:
int
gnome_print_moveto (GnomePrintContext *gpc, double x, double y);
will be represented as:
printer->moveto (double x, double y)
This notation is intended to be reminiscent of object oriented notations. The return code is zero on success, or an error code on failure.

To make the API as consistent with PostScript as possible, a PrintContext contains quite a bit of implicit state, including a current color, a current path, a current clipping path, a current font, and a host of other settings for the specific graphics operators. It's up to the specific implementation whether to actually represent this state or to simply pass it along to the next stage in the printing pipeline (i.e. to generate a PostScript file).

One anticipated future extension is the ability of the printing context to reflect the graphics state. This will need to be enabled in advance of any painting. Once reflection is enabled, a number of methods become available which return pieces of the implicit state in appropriate data structures. For example, getcurrentpath () returns the current path in a Bezier path data structure. Since many of the operations can do nontrivial manipulations on the state (for example, strokepath ()), this implies that the library actually maintains the state and is capable of serious imaging functions.

Thus, one way to access some of these imaging functions would be to support a null printing context, the only function of which is to support these reflection calls. The implementation of the printing API may also choose to implement reflection by dispatching each method invocation to both a simple pass-through implementation and the null context implementation.

Like PostScript, the methods are invoked "bottom to top," i.e. each painting method paints over what's already present. Thus, fairly sophisticated layering techniques should be possible by carefully ordering the method invocations.
printer->newpath ()
printer->moveto (double x, double y)
printer->lineto (double x, double y)
printer->curveto (double x1, double y1, double x2, double y2, double x3, double y3)
printer->closepath ()
These calls simply append segments to the "current path" object in the printer context.
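
As a sketch of what "appending segments" could mean, here is one possible representation of the current path as a growable array of coded segments. The type and function names (Path, Segment, path_moveto and so on) are hypothetical, chosen for this example only.

```c
#include <stdlib.h>

/* Hypothetical segment codes and path type, for illustration only. */
typedef enum { SEG_MOVETO, SEG_LINETO, SEG_CURVETO, SEG_CLOSE } SegCode;

typedef struct {
    SegCode code;
    double  x1, y1, x2, y2;   /* control points, used by SEG_CURVETO only */
    double  x3, y3;           /* end point of the segment */
} Segment;

typedef struct {
    Segment *seg;
    int      n, n_alloc;
} Path;

/* Append one segment, growing the array as needed. Returns 0 or -1. */
static int path_append (Path *p, Segment s)
{
    if (p->n == p->n_alloc) {
        int      n_new = p->n_alloc ? p->n_alloc * 2 : 8;
        Segment *tmp = realloc (p->seg, n_new * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        p->seg = tmp;
        p->n_alloc = n_new;
    }
    p->seg[p->n++] = s;
    return 0;
}

/* newpath empties the path but keeps the allocation for reuse. */
void path_newpath (Path *p)
{
    p->n = 0;
}

int path_moveto (Path *p, double x, double y)
{
    Segment s = { SEG_MOVETO, 0, 0, 0, 0, x, y };
    return path_append (p, s);
}

int path_lineto (Path *p, double x, double y)
{
    Segment s = { SEG_LINETO, 0, 0, 0, 0, x, y };
    return path_append (p, s);
}

int path_curveto (Path *p, double x1, double y1, double x2, double y2,
                  double x3, double y3)
{
    Segment s = { SEG_CURVETO, x1, y1, x2, y2, x3, y3 };
    return path_append (p, s);
}

/* The end point of a close is implied by the subpath's moveto. */
int path_closepath (Path *p)
{
    Segment s = { SEG_CLOSE, 0, 0, 0, 0, 0, 0 };
    return path_append (p, s);
}

void path_free (Path *p)
{
    free (p->seg);
    p->seg = NULL;
    p->n = p->n_alloc = 0;
}
```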

We may also want to support the rmoveto, rlineto, and rcurveto operators, which are identical except for representing coordinates relative to the current point.

Also, the arc, arcn, and arcto operators would probably be handy, even though they can be fairly easily simulated using curveto.
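
To give an idea of how the simulation works, here is a sketch that traces a full circle with four curveto segments, one per quadrant. It uses the standard constant 4/3 * (sqrt(2) - 1) for the control point distance, which puts the midpoint of each segment exactly on the circle. The CurveTo struct and circle_to_curves function are names invented for this example; the caller is assumed to have done a moveto to (cx + r, cy) first.

```c
/* 4/3 * (sqrt(2) - 1): control point distance for a 90 degree unit arc. */
#define KAPPA 0.5522847498307936

/* Arguments for one curveto call (hypothetical helper type). */
typedef struct { double x1, y1, x2, y2, x3, y3; } CurveTo;

/* Fill seg[0..3] with the curveto arguments for a counterclockwise
   circle of radius r around (cx, cy), starting at (cx + r, cy). */
void circle_to_curves (double cx, double cy, double r, CurveTo seg[4])
{
    /* (dx, dy) is the unit direction of the current quadrant's start. */
    double dx = 1.0, dy = 0.0;
    int i;

    for (i = 0; i < 4; i++) {
        /* Rotating (dx, dy) by 90 degrees counterclockwise gives the
           direction of the end point, which is also the start tangent. */
        double ex = -dy, ey = dx;

        seg[i].x1 = cx + r * (dx + KAPPA * ex);
        seg[i].y1 = cy + r * (dy + KAPPA * ey);
        seg[i].x2 = cx + r * (ex + KAPPA * dx);
        seg[i].y2 = cy + r * (ey + KAPPA * dy);
        seg[i].x3 = cx + r * ex;
        seg[i].y3 = cy + r * ey;

        dx = ex;
        dy = ey;
    }
}
```

Arcs of other angles follow the same pattern with sines and cosines in place of the fixed quadrant directions. The approximation is not exact between the endpoints and the midpoint, but the radial error stays well under a tenth of a percent.
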
printer->setrgbcolor (double r, double g, double b)
Set the color, with (0, 0, 0) as black and (1, 1, 1) as white. It is expected that the universe of color setting options will expand widely as ICC profile support and prepress graphics are added. But this will do nicely for screen display and basic printing.
printer->fill ()
printer->eofill ()
Fill the current path, using either the nonzero or even-odd winding rules.
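
The difference between the two rules can be sketched with a point-in-polygon test. The code below computes the winding number of a polygon around a point by counting signed crossings of a horizontal ray, then applies each rule. It handles only a single polygonal subpath (no curves), and the names are invented for this example.

```c
typedef struct { double x, y; } Point;

/* > 0 if p lies to the left of the directed line a->b, < 0 if right. */
static double is_left (Point a, Point b, Point p)
{
    return (b.x - a.x) * (p.y - a.y) - (p.x - a.x) * (b.y - a.y);
}

/* Signed number of times the closed polygon v[0..n-1] winds around p. */
int winding_number (const Point *v, int n, Point p)
{
    int wn = 0, i;

    for (i = 0; i < n; i++) {
        Point a = v[i], b = v[(i + 1) % n];

        if (a.y <= p.y) {
            if (b.y > p.y && is_left (a, b, p) > 0)
                wn++;               /* upward crossing, p on the left */
        } else {
            if (b.y <= p.y && is_left (a, b, p) < 0)
                wn--;               /* downward crossing, p on the right */
        }
    }
    return wn;
}

/* Nonzero rule (fill): inside if the winding number is not zero. */
int inside_nonzero (const Point *v, int n, Point p)
{
    return winding_number (v, n, p) != 0;
}

/* Even-odd rule (eofill): inside if the winding number is odd. */
int inside_evenodd (const Point *v, int n, Point p)
{
    return winding_number (v, n, p) % 2 != 0;
}
```

For a five-pointed star drawn as a single self-intersecting path, the center is wound around twice, so fill paints it while eofill leaves a hole. The points of the star are painted under both rules.
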
printer->setlinewidth (double width)
printer->setmiterlimit (double limit)
printer->setlinejoin (int jointype)
printer->setlinecap (int captype)
printer->setdash (int n_values, double *values, double offset)

printer->strokepath ()
printer->stroke ()
These are basically straightforward implementations of the PostScript operators.

Font support

The font methods in the print API are low-level. Most applications will probably want to use the higher level interface in the text formatting API.
GnomePrintFont *
findfont (char *fontname, double size);

printer->setfont (GnomePrintFont *font);
The findfont function doesn't work on the basis of implicit state in the print context. Rather, it goes off and finds the font, and returns some kind of handle to it. If the font cannot be found, it returns NULL.
printer->show (char *text);
This works the same way as the show operator in PostScript - it displays the text at the current point (i.e. the point set by moveto) in the current font, and advances the point. The text is represented as an 8-bit null-terminated string in the font's own encoding. Thus, this function is not very good if kerning, ligatures, or non-Roman scripts are desired. For most applications, the text formatting API will be superior.

Thus, this function is a fairly thin layer over PostScript's show operator. One function it will provide, however, is to automatically download the font to the printer if it exists in .pfb format on the machine but is not resident in the printer.

Matrix operations

PostScript uses the concept of a current transformation matrix (CTM) to represent scaling, rotation, and general affine transforms. The matrix is represented as a six-element array. The transformation from user space to device space is as follows:
x_device = x_user * CTM[0] + y_user * CTM[2] + CTM[4];
y_device = x_user * CTM[1] + y_user * CTM[3] + CTM[5];
The initial CTM represents the bottom left corner of the page as (0, 0) in user space, the point one inch above the corner as (0, 72), and the point one inch to the right of the corner as (72, 0). Note that this coordinate system is "upside down" relative to the usual screen coordinate system.
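
As a worked example of the formula, here is the transformation as a C function together with a few sample matrices. The function and matrix names are made up for this sketch, and the "top down" matrix assumes an 11 inch (792 point) page height.

```c
/* Map a user-space point to device space through a six-element CTM. */
void ctm_apply (const double ctm[6], double x_user, double y_user,
                double *x_device, double *y_device)
{
    *x_device = x_user * ctm[0] + y_user * ctm[2] + ctm[4];
    *y_device = x_user * ctm[1] + y_user * ctm[3] + ctm[5];
}

/* Identity: user space is already in points (1/72 inch). */
const double ctm_identity[6] = { 1, 0, 0, 1, 0, 0 };

/* User space measured in inches instead of points. */
const double ctm_inches[6] = { 72, 0, 0, 72, 0, 0 };

/* Screen-style coordinates: origin at the top left, y growing downward.
   Assumes a page 792 points tall. */
const double ctm_top_down[6] = { 1, 0, 0, -1, 0, 792 };
```

With ctm_inches, the user-space point (1, 1) lands at (72, 72). With ctm_top_down, the user-space origin lands at (0, 792), the top left corner of the assumed page.
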
printer->concat (double matrix[6])
printer->setmatrix (double matrix[6])
The concat method executes CTM = matrix X CTM, using matrix multiplication as defined in section 4.3 of the PostScript Language Reference Manual, 2nd ed.

The setmatrix method blows away the current CTM and replaces it with the one given. As such, it's fairly dangerous to use.

The translate, rotate, and scale methods can be implemented as simple wrappers over concat.
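
A minimal sketch of those wrappers, operating on a bare six-element array rather than a print context. The affine_ names are invented for this example. Rotate is left out to keep the sketch free of the math library; it would follow the same pattern with the matrix { cos t, sin t, -sin t, cos t, 0, 0 }.

```c
/* ctm = m x ctm, with both matrices stored as [a b c d tx ty]. */
void affine_concat (double ctm[6], const double m[6])
{
    double r[6];
    int i;

    r[0] = m[0] * ctm[0] + m[1] * ctm[2];
    r[1] = m[0] * ctm[1] + m[1] * ctm[3];
    r[2] = m[2] * ctm[0] + m[3] * ctm[2];
    r[3] = m[2] * ctm[1] + m[3] * ctm[3];
    r[4] = m[4] * ctm[0] + m[5] * ctm[2] + ctm[4];
    r[5] = m[4] * ctm[1] + m[5] * ctm[3] + ctm[5];
    for (i = 0; i < 6; i++)
        ctm[i] = r[i];
}

/* translate is concat with a pure translation matrix. */
void affine_translate (double ctm[6], double tx, double ty)
{
    double m[6] = { 1, 0, 0, 1, tx, ty };
    affine_concat (ctm, m);
}

/* scale is concat with a diagonal matrix. */
void affine_scale (double ctm[6], double sx, double sy)
{
    double m[6] = { sx, 0, 0, sy, 0, 0 };
    affine_concat (ctm, m);
}

/* Apply the matrix to a point, as in the user-to-device formula. */
void affine_point (const double ctm[6], double x, double y,
                   double *x_out, double *y_out)
{
    *x_out = x * ctm[0] + y * ctm[2] + ctm[4];
    *y_out = x * ctm[1] + y * ctm[3] + ctm[5];
}
```

Order matters. Starting from the identity, translate (72, 72) followed by scale (2, 2) maps the user point (1, 1) to (74, 74), while scale followed by translate maps it to (146, 146), because the later operation is applied to user coordinates first.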

The state stack

PostScript's state has a number of elements that are easy to set, but fairly difficult to unset, specifically modifications to the CTM, and the clipping path. Thus, printing a tree-structured page in which individual nodes modify the state is best done by wrapping the traversal of nodes in a gsave/grestore pair. These operators push the entire graphics state on a stack and pop it.
printer->gsave ()
printer->grestore ()
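
A sketch of the state stack as a fixed-depth array of state copies. The GState fields shown are only a subset of the implicit state, and the names, the depth limit, and the choice to report an error on an unmatched grestore are all assumptions of this example.

```c
#define GSTATE_MAX_DEPTH 32

/* A subset of the implicit graphics state (hypothetical layout). */
typedef struct {
    double ctm[6];
    double r, g, b;
    double linewidth;
} GState;

typedef struct {
    GState stack[GSTATE_MAX_DEPTH];
    int    top;                     /* index of the current state */
} GStateStack;

/* Start with identity CTM, black, and a line width of 1. */
void gstack_init (GStateStack *s)
{
    static const GState initial = { { 1, 0, 0, 1, 0, 0 }, 0, 0, 0, 1 };

    s->stack[0] = initial;
    s->top = 0;
}

/* The state that painting methods read and modify. */
GState *gstack_current (GStateStack *s)
{
    return &s->stack[s->top];
}

/* gsave: push a copy of the current state. */
int gstack_gsave (GStateStack *s)
{
    if (s->top + 1 >= GSTATE_MAX_DEPTH)
        return -1;                  /* stack overflow */
    s->stack[s->top + 1] = s->stack[s->top];
    s->top++;
    return 0;
}

/* grestore: discard the current state, exposing the saved one. */
int gstack_grestore (GStateStack *s)
{
    if (s->top == 0)
        return -1;                  /* nothing was saved */
    s->top--;
    return 0;
}
```

A tree traversal would call gstack_gsave before descending into a node and gstack_grestore after returning, so that whatever the node did to the CTM or clip is undone.
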
Clipping
printer->clip ()
printer->eoclip ()
These methods compute the intersection of the current path and the current clip path, and assign the result to the current clip path. There is no way to expand the clip path except for wrapping the operation in a gsave/grestore.

The clip method uses the nonzero winding rule, while eoclip uses the even-odd winding rule.
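
The "intersection only" behavior is easiest to see with rectangles. The sketch below is a deliberate simplification: real clip paths are arbitrary shapes, and ClipRect and its functions are hypothetical names used only here.

```c
/* Axis-aligned stand-in for a clip path (simplification). */
typedef struct { double x0, y0, x1, y1; } ClipRect;

/* New clip = intersection of the current clip and the path's box. */
ClipRect clip_intersect (ClipRect clip, ClipRect path)
{
    ClipRect r;

    r.x0 = clip.x0 > path.x0 ? clip.x0 : path.x0;
    r.y0 = clip.y0 > path.y0 ? clip.y0 : path.y0;
    r.x1 = clip.x1 < path.x1 ? clip.x1 : path.x1;
    r.y1 = clip.y1 < path.y1 ? clip.y1 : path.y1;
    return r;
}

/* An empty clip means nothing further will be painted. */
int clip_is_empty (ClipRect r)
{
    return r.x0 >= r.x1 || r.y0 >= r.y1;
}
```

Because each call can only shrink the region, clipping to the full page after clipping to a small box leaves the small box in place. Getting the larger region back requires restoring a saved state.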

Images

Image support is a large can of worms - there are many, many options that could be supported. Let's keep it simple for now, though.
printer->grayimage (char *data, int width, int height, double matrix[6])
printer->rgbimage (char *data, int width, int height, double matrix[6])
The grayimage method is effectively similar to the image operator in PostScript, and rgbimage is effectively similar to colorimage with ncomp fixed at 3. Bit depth is fixed at 8bpp.

Lots of extensions are possible here. PostScript supports bitdepths larger than 8bpp, and CMYK color spaces. Other extensions we may want to support include larger color spaces (eg 6-color hifi color), and RGBA images. But those are best left for another day.

Another important extension path is support for ICC color profiles. We expect ICC profiles to be the native color management model in the Gnome printing architecture, with PostScript CRD's basically ignored.

Just print the damned page
printer->showpage ()
This method closes out the page. If the document is being printed to a temporary file, it may just add an end-of-page code to the file. If the document is being printed directly to the printer, it may start the actual paper in motion. In the case of the null printing context, it just clears out all of the graphics state.

I am confident that this set is sufficient for most basic printing needs.

Text formatting API

It is expected that most printing in Gnome will be done through the text formatting API rather than the low level font methods of the main printing API. Here is a brief list of the additional features:

Typographic sophistication including kerning and ligatures
Direct support for reflection
Easy access to hyphenation and justification
String encoding is simple Unicode
We will be able to render text on the screen quicker and better

The basic datatype passed from the application to the text formatting API is an attributed text. Conceptually, this is a sequence of 32-bit Unicode/ISO 10646 character codes, each with an associated attribute. In practice, this data will be encoded to save on space. Characters will be UTF-8 encoded. Attributes will be represented as run lengths (i.e. for characters i through j, use this attribute).
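
A sketch of that encoding: UTF-8 bytes for the characters, plus an array of runs mapping character ranges to attributes. The AttrRun layout and the function names are assumptions of this example, and the decoder is minimal (it checks continuation bytes but does not reject overlong forms).

```c
#include <stdint.h>

/* Decode one UTF-8 sequence at s into *cp. Returns the number of bytes
   consumed, or 0 if the sequence is malformed. */
int utf8_decode (const unsigned char *s, uint32_t *cp)
{
    int n, i;

    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        *cp = s[0] & 0x1F;
        n = 2;
    } else if ((s[0] & 0xF0) == 0xE0) {
        *cp = s[0] & 0x0F;
        n = 3;
    } else if ((s[0] & 0xF8) == 0xF0) {
        *cp = s[0] & 0x07;
        n = 4;
    } else {
        return 0;
    }
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;       /* also catches a premature end of string */
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return n;
}

/* Decode a NUL-terminated string into at most max code points.
   Returns the number of characters, or -1 on error. */
int utf8_to_codepoints (const char *text, uint32_t *cps, int max)
{
    const unsigned char *s = (const unsigned char *) text;
    int n = 0;

    while (*s != 0) {
        uint32_t cp;
        int len = utf8_decode (s, &cp);

        if (len == 0 || n == max)
            return -1;
        cps[n++] = cp;
        s += len;
    }
    return n;
}

/* Characters start..end (inclusive) share one attribute. */
typedef struct {
    int start, end;
    int attr_id;            /* hypothetical handle into an attribute table */
} AttrRun;

/* Attribute of character i, or -1 if no run covers it. */
int attr_at (const AttrRun *runs, int n_runs, int i)
{
    int k;

    for (k = 0; k < n_runs; k++)
        if (i >= runs[k].start && i <= runs[k].end)
            return runs[k].attr_id;
    return -1;
}
```

Note that the runs index characters, not bytes, so a multi-byte character still occupies exactly one position.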

The first primary function of the text formatting API will be to convert this attributed text into a list of lines, each of which consists of a list of attributed glyphs. This process is generally known as "hyphenation and justification". In our library, it also includes the steps of kerning and ligatures. This is also the step where "virtual fonts" get resolved into real fonts.

The resulting data structure is opaque to the application (perhaps), but can be queried extensively for geometry info. Example queries are metrics (width of each line, bounding box for each line), and also enough info to resolve an (x, y) coordinate back into a character number.

A special character will be used to represent a "box" for graphics or other in-line element. The dimensions of the box are given as an attribute.

Finally, the list of lines of attributed glyphs is rendered to the printer, using an additional method of the print context.

Here's what I envision for attributes:

Font family (the name of the font, eg "Helvetica")
Size
Expansion/compression (either through scaling (ugh!) or through multiple master)
Matrix slanting (ugh!)
Weight (normal or bold for most fonts, numeric for multiple master)
Other miscellaneous multiple master axes
Italics off/on
Kerning off/on
Ligatures on/normal/maximal (eg ct ligature in some fonts)
Tracking (ie letterspace)
Small caps
Alternate glyphs, font specific (eg a swash variant)
Underline (ugh!)
Strikethrough
Vertical displacement (ie for subscripts and superscripts)
Color

It would probably be best to implement attributes using some kind of extensible tag mechanism. I'm tempted to just use XML for the whole thing, but that might make it harder to handle queries back.

This section needs to get filled in with more detail.

A digression on virtual fonts

I see three applications where virtual fonts really help.

First, I think it's far better to represent small capitals as an attribute rather than a font change. Standard PostScript practice is to include small caps only in a separate "expert" font. Switching back and forth between the regular and expert fonts is a pain at best, and a serious problem when switching to a regular font that does not have a separate small caps font - in the latter case, using a font attribute just causes "false" small caps to be used (i.e. regular capitals set smaller and relatively a little wider).

Second, the standard Adobe encoding only encodes the fi and fl ligatures. Others, including ff, ffi, and ffl, are included in the separate expert font. It shouldn't be any harder to use these.

Third, when mixing Roman and non-Roman scripts, it would be very handy to have a single unified virtual font that covered both scripts, even though it would be implemented as more than one low-level font.

Virtual fonts also open an expansion path, for example for fonts with more than one color, or fonts rendered using images. In these cases, the font may not be tied directly to a traditional PostScript font at all.

Finally, virtual fonts abstract away from a specific font format. They may, for example, provide a consistent interface for using TrueType fonts as well.

Incremental vs. static rendering, or, the Caanvas

The print architecture as described above is geared towards rendering static pages. It is not good for maintaining a highly incremental display.

The primary challenge of incremental display is to compute the minimal region on the display that needs to be repainted, then traverse only that part of the data structure representing the page. To do this effectively requires detailed knowledge of the geometry of the page elements.

I currently believe that the best course of action is to go ahead with the Caanvas according more or less to the existing plan, but to make it interoperate easily with the printing subsystem.

At a minimum, there will be a printing context for dumping into a Caanvas. In addition, all Caanvas objects will contain methods for painting themselves into a printing context. And, of course, the actual capabilities of Caanvas objects should match the printing methods quite closely.

The preferred method of obtaining an on-screen print preview will be to dump into a Caanvas printing context, then display the resulting Caanvas as a Gtk widget. This will handily support scrolling and zooming without any additional intervention of the application.

The direct Caanvas API's will continue to be based more on self-contained data structures rather than implicit state, so as to make the manipulation and editing of the Caanvas object tree easier, and also to support fast computation of "deltas" as objects change. Nonetheless, the conversion between Caanvas data structures and printing methods will remain simple.

Extension paths

A number of extension paths come to mind immediately. It is important to get the basic printing functionality to work well first, but it is worth planning for a number of future extensions.

Transparency

The Caanvas currently supports partially-transparent colors, as well as full RGBA images. However, PostScript does not. What's needed is a way to render pages containing transparency efficiently in PostScript. Such a task is difficult but not impossible.

A sorted display list file format

The best possible printing performance can be obtained using a sorted display list file format. This is also the file format that's best to spool.

Conceptually, a sorted display list consists of a list of elements, each with an associated layer code and bounding box. The bounding box is guaranteed to enclose all successive painting methods.

In an efficient sorted display list, the bounding box starts out enclosing the page, then shrinks as rapidly as possible towards the bottom of the page. Thus, data can be sent to the printer as the file is parsed. By contrast, with PostScript, it is not possible to send the first byte of data to the printer until the last element is handled, because it's always possible that the last element paints the first pixel.

This is not to be undertaken lightly, however.

ICC transforms

For color matching, the Gnome printing model should use ICC transforms. It should be possible to associate an ICC profile with the printer, and, in addition, it should be possible to specify an ICC profile for each image. For many images, no ICC profile is available, and the image can be assumed to be in sRGB.

Prepress

To adequately support prepress, the printing model must at a bare minimum directly support a CMYK color space, as well as the appropriate ICC transforms.

However, doing prepress correctly also requires control over screening, support for trapping, and a host of other tricky things. It would also make sense to support more than four colors at this point, as well.

Internationalization

Rendering non-Roman scripts correctly is difficult. Issues include bidirectional text, placement of diacriticals, infinitely more complex ligature rules, and the need to handle very large fonts well (a typical CJK font is 3-5 megabytes).

Document Structuring Convention

One area of fairly high priority is to get Document Structuring Convention properly supported in the resulting PostScript files. To do this efficiently requires a few additional methods, especially for specifying at the beginning the number of pages and the list of fonts used.

Encapsulated PostScript

On the other side is the ability to import Encapsulated PostScript files. For outputting to PostScript, these can be included inline. For screen display, it's probably best to invoke GhostScript to render the document.

Performance issues

In this section, I muse on how the preceding specification will be best implemented. I treat each printing context separately.

PostScript output

This section covers generation of PostScript to a file.

For the most part, printing to PostScript is just a question of writing the appropriate printf statements for each of the methods. The one tricky area is to download the fonts to the printer before they're needed.
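
A sketch of what "just printf statements" amounts to. To keep the example self-contained it formats into an in-memory buffer; a real driver would presumably write to a file or pipe instead. PsOut and the ps_ function names are invented for this example, and font downloading is not shown.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical in-memory sink standing in for the output file. */
typedef struct {
    char   text[512];
    size_t len;
} PsOut;

/* Append formatted text. Returns 0 on success, -1 on error or overflow. */
static int ps_printf (PsOut *ps, const char *fmt, ...)
{
    size_t  room = sizeof ps->text - ps->len;
    va_list ap;
    int     n;

    va_start (ap, fmt);
    n = vsnprintf (ps->text + ps->len, room, fmt, ap);
    va_end (ap);
    if (n < 0 || (size_t) n >= room) {
        ps->text[ps->len] = '\0';   /* drop the partial output */
        return -1;
    }
    ps->len += (size_t) n;
    return 0;
}

/* Each method is a one-line translation into the PostScript operator. */
int ps_setrgbcolor (PsOut *ps, double r, double g, double b)
{
    return ps_printf (ps, "%g %g %g setrgbcolor\n", r, g, b);
}
int ps_moveto (PsOut *ps, double x, double y)
{
    return ps_printf (ps, "%g %g moveto\n", x, y);
}
int ps_lineto (PsOut *ps, double x, double y)
{
    return ps_printf (ps, "%g %g lineto\n", x, y);
}
int ps_closepath (PsOut *ps)
{
    return ps_printf (ps, "closepath\n");
}
int ps_fill (PsOut *ps)
{
    return ps_printf (ps, "fill\n");
}
int ps_showpage (PsOut *ps)
{
    return ps_printf (ps, "showpage\n");
}
```

Drawing a red triangle through these calls produces seven lines of PostScript, one per method invocation, in the same order as the calls.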

As we add extensions (mostly for transparency), the PostScript output method will become less and less trivial.

Direct rendering to a buffer

One of the other modes that will be supported is direct rendering to a buffer. This is done using the graphics primitives currently being developed for gfonted and the Caanvas.

This is not, however, the best approach for printing to a color inkjet printer. Keep in mind that an RGB buffer covering an 8x10 print area at 720 dpi is about 124 MB.
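
The arithmetic behind that figure, as a small function (the name is made up for this example). It assumes one byte per pixel per plane and no compression.

```c
/* Bytes needed for an uncompressed raster at 8 bits per plane. */
unsigned long raster_bytes (unsigned long width_in, unsigned long height_in,
                            unsigned long dpi, unsigned long planes)
{
    unsigned long width_px  = width_in * dpi;
    unsigned long height_px = height_in * dpi;

    return width_px * height_px * planes;
}
```

For 8 by 10 inches at 720 dpi with 3 planes this gives 5760 * 7200 * 3 = 124,416,000 bytes.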

To solve this problem, display list techniques will be used instead.

Rendering to the Caanvas

One of the more efficient modes of operation will be to render into the Caanvas, effectively creating a display list, then use the Caanvas architecture to render to the screen or actual printer.

Essentially, each fill, stroke, image, or show method invocation becomes a Caanvas object. Each concat or clip method invocation causes a new node in the grouping tree, while gsave and grestore control the structure of the tree.

When the page is stored as a display list, it is much easier to efficiently implement repaint of exposed areas, as well as to support scrolling. In addition, zooming of a display list can be implemented without any work from the application side.

The Caanvas architecture is also one of the better ways of managing output to color inkjet printers. The simple "band" technique of rendering long narrow strips, then sending those to the printer, should work best.

The display list approach has one serious drawback: the entire page must be stored in memory, more likely than not duplicating structures stored by the application.

Rendering to non-PostScript laser printers

Here, the main issue is rendering plain text efficiently. In general, for a display list consisting of mostly plain text, the most efficient course of action is to render each glyph actually used in the document, send those to the printer, then simply invoke the glyphs as the display list is traversed.

For images and more complex graphics, this optimization is irrelevant, and it's best to just render the entire page as a bitmap and send it to the printer.

A concept: image callbacks

One way to deal with the memory usage issues of a large display list is to represent images as callbacks to the application rather than big buffers of pixels. Then, whenever image data is needed, the rendering engine (be it PostScript or whatever) calls the callback, passing in a bounding box and a scaling factor (possibly an entire transformation matrix). This technique also supports low-resolution images for screen display as well as high-resolution images for PostScript output.

There should probably be a "free" call of the callback to indicate that the printing subsystem no longer requires the image data and will not call the callback again.
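
Here's a sketch of what such a callback interface might look like, with a toy application-side image that generates a gray ramp on demand. Everything here is hypothetical: the ImageSource struct, the fetch and release members, and the choice to pass a single scale factor rather than a full matrix.

```c
/* Half-open pixel box: x0 <= x < x1, y0 <= y < y1. */
typedef struct { int x0, y0, x1, y1; } IntRect;

typedef struct ImageSource ImageSource;
struct ImageSource {
    /* Write 8bpp gray pixels for `area`, given in pixels of the image
       as scaled by `scale`, into buf row by row. Returns 0 on success. */
    int  (*fetch)   (ImageSource *src, IntRect area, double scale,
                     unsigned char *buf);
    /* The "free" call: the printing subsystem is done with the image. */
    void (*release) (ImageSource *src);
    void *app_data;
};

/* Application side: a 256 pixel wide horizontal ramp, never stored. */
typedef struct { int released; } RampData;

static int ramp_fetch (ImageSource *src, IntRect area, double scale,
                       unsigned char *buf)
{
    int x, y;

    (void) src;
    if (scale <= 0.0)
        return -1;
    for (y = area.y0; y < area.y1; y++) {
        for (x = area.x0; x < area.x1; x++) {
            int sx = (int) (x / scale);   /* back to full-resolution x */

            if (sx < 0)   sx = 0;
            if (sx > 255) sx = 255;
            *buf++ = (unsigned char) sx;
        }
    }
    return 0;
}

static void ramp_release (ImageSource *src)
{
    ((RampData *) src->app_data)->released = 1;
}

/* Renderer side: ask for just the pixels one region needs. */
int render_region (ImageSource *src, IntRect area, double scale,
                   unsigned char *buf)
{
    return src->fetch (src, area, scale, buf);
}
```

A screen preview would call render_region with a small scale and get a low-resolution version, while a PostScript driver would ask for scale 1.0. Either way the renderer holds only the pixels for the region it requested.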

This technique addresses the main drawback of the display list architecture. The extra complexity is almost certainly worth it.

Printing to color inkjets

We've already covered most of the performance issues, but I'll go over printing to a color inkjet in slightly more detail.

The application starts by getting a "color inkjet" print context. This context dumps the page elements into a display list, as described above.

Upon the showpage method invocation, the display list is complete. The print driver begins to render the page, one band at a time.

A typical band size would be the page width (about 5000 pixels for 7 inches at 720 dpi) by about 64 pixels tall, and 3 or 4 planes deep (depending on whether the page model is RGB or CMYK). This buffer is about a megabyte.

For each band, the print driver traverses the display list. Hopefully, most objects can be culled out using bounding box computations. After traversing the display list, the driver has in hand about a megabyte worth of raw image pixels. It then halftones these (using error diffusion) and writes the halftoned pixels (using the printer's native escape codes) to a temporary file.
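
The culling step can be sketched as a bounding box test against each band's range of scanlines. The BBox type and function names are invented for this example, and the rendering and halftoning themselves are not shown.

```c
/* Half-open box in device pixels, y growing down the page. */
typedef struct { int x0, y0, x1, y1; } BBox;

/* Does the box overlap scanlines band_y0 <= y < band_y1? */
int bbox_in_band (const BBox *b, int band_y0, int band_y1)
{
    return b->y0 < band_y1 && b->y1 > band_y0;
}

/* Walk the display list for one band; return how many items survive
   culling and would actually be rendered. */
int band_count_items (const BBox *items, int n_items,
                      int band_y0, int band_y1)
{
    int i, n = 0;

    for (i = 0; i < n_items; i++)
        if (bbox_in_band (&items[i], band_y0, band_y1))
            n++;
    return n;
}

/* Number of bands needed to cover the page, rounding up. */
int band_count (int page_height, int band_height)
{
    return (page_height + band_height - 1) / band_height;
}

/* Size of one band buffer at 8 bits per plane. */
unsigned long band_bytes (int width, int band_height, int planes)
{
    return (unsigned long) width * band_height * planes;
}
```

With the figures above, band_bytes (5040, 64, 3) is 967,680 bytes and a 10 inch page at 720 dpi takes 113 bands. Note that the display list is walked once per band, so the cost of culling grows with both the number of bands and the number of objects.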

When the print context is closed, the print driver invokes lpr on the resulting temporary file.

For future work, the temporary file should be bypassed and there should be some way to start the paper moving as soon as the display list is in hand.
