Proposal for a Gnome DOM Engine

Raph Levien, 29 March 1999

16 Sep 1999: The XDBM project looks potentially interesting; it seems to have somewhat similar goals to Gdome.

This document presents a proposal for a Gnome DOM engine, as part of the world DOMination project. See also Design considerations for a Gnome DOM for some of the other design considerations.

A standard Gnome DOM engine would be used as the store for virtually all document state in DOM-based applications. In these applications, it would serve as a strict divider between editing and display. Further, tree structure would be the basis for combining components.

Because it is to be used to store all application state, allowing a small, fast representation of the tree is essential. Yet, such small, fast representations are tricky at best - it is far easier to use an existing tree representation such as the one in gnome-xml.

I believe that the best way to approach these problems is to define an interface that supports a number of different tree representations.

Goals and assumptions

The Gnome DOM engine has the following goals:

I'm making some assumptions as well:

The combination of this assumption and the goal not to leak memory is obviously a difficult constraint. My approach to this constraint is to define a "node handle." Memory for the node handle is managed by clients of the DOM engine. In addition, my proposal for the node handle uses a fixed amount of storage, so it can easily be allocated as automatic (i.e. stack) structures, and does not require a lot of malloc/free behavior simply to do a traversal.

Node handles

The client-side representation of all DOM objects is a "node handle". I propose the following C structure:

typedef struct _GdomeHandle GdomeHandle;

struct GdomeHandle {
  GdomeDocument *document;
  union {
    int32 tag;
    void *p;
  } h1, h2, h3;

The implementation of this interface is free to use the h[123] fields in whatever way is required. In the case of wrapping the libxml tree structures, h1.p could simply point to the xmlNode.

The document field contains a vtable for the various methods. The RDOM prototype contains a single vtable for all DOM objects. This seems like a reasonable approach to me. It's a big vtable, but it's simpler than having lots of smaller ones. (I'm not too sure about this. It might be better to have multiple vtables for different objects, but all sharing the same pointer to the implementation-private area)

The document field may also contain arbitrarily much private information. For example, one other plausible implementation of the interface is that the document contains a UTF-8 encoded flat XML representation of the tree, and an array of tag -> physical locations in the XML document. In this case, h1.tag stores the tag, and DOM operations use the array to navigate. I'm of course not claiming that this is a good implementation, only that the interface does not rule it out.

Conversion of the DOM idl to Gdome

The DOM interface is expressed in the standard in idl format. I propose that this idl description be systematically converted to a C-language binding as follows:

Where only one object defines an attribute or method, the canonical name of that attribute or method is the IDL name (eg getElementsByTagName). When more than one object defines a method or attribute, the canonical name is the "short" form of the object name, an underscore, and the IDL name (eg "nlist_length", "nnm_length"). (perhaps the short form of the object name should be used for all attributes and methods) An attribute is expanded into set and get methods, eg. the value attribute becomes:

DOMString get_value();
void set_value(in DOMString value)

The vtable defines a function pointer for each expanded method. When there is a return value that is a DOM object, the type is as follows: there is a GdomeHandle * return value; the arguments are the IDL arguments (with each DOM object replaced by a GdomeHandle *), and finally there is a GdomeHandle *result. Returning a null value is represented by returning a NULL pointer; otherwise, the return value is equal to the result argument, and the fields of the result structure are set to the resulting node handle. There are no "out" parameters in the DOM idl to be concerned with.

Return values of integers and other non-allocated values are completely standard - the arguments are the IDL arguments (again, with GdomeHandle * replacing any DOM objects), and the standard return value. Void returns are handled the same way.

String return values are allocated by the DOM, and the client must call a "string_release" method to release the string. The representation is simply a char * pointer to a UTF-8 encoded string. DOM implementations that allocate strings at stable addresses are free to simply return a pointer, and implement string_release as a no-op.

Finally, Gdome offers a number of convenience functions (which may be either actual functions or macros) for invoking the DOM methods. For example:

GdomeHandle *
Gdome_get_nextSibling (GdomeHandle *self, GdomeHandle *result)
  return self->document->get_nextSibling (self, result);

RDOM implements a prototype of this structure, and I find that there is some overhead compared to the libxml structures, but not much. I am able to create and print to stdout a one million node tree in about 8 seconds on a PII/233. I consider this to be acceptable.

CORBA interface

It should be fairly easy to see how this interface might exported through CORBA - after all, it is based on systematic translation of the DOM idl.

(Elliot, you need to help me with this - I understand that the node handle should map fairly straightforwardly to an Object ID and object reference, but I'm struggling with the details. It also seems to me that the object_release method can just release the 16 bytes or whatever required by the node handle, except for strings, which should be handleable using existing ORBit techniques.

In the other direction (i.e. using the Gdome interface to access remote DOMs via corba), the document field points to the remote CORBA server, while the contents of the h[123] fields are preserved. Note: this only works when the server is also a Gdome implementation, and has the same size pointers. This seems like a serious limitation to me. The pointer size thing may be fixed by requiring 8 bytes per pointer no matter if the local wordsize is 32 bits. For general access to foreign DOM servers, a variation on the Gdome interface is probably required in which the node handles can be arbitrary size and therefore must be allocated with malloc/free.

Integration with existing code

I think it should be fairly straightforward to wrap libxml with the Gdome interface. Thus, applications will be able to use the Gdome interface to access their state without all that much extra work.

One of the directions of gnome-xml is to present the parsed document as a stream of events (using a C-language adaptation of the SAX interface). Code should be written gluing this interface to Gdome, so that any parser or other XML source that uses this SAX interface can generate a Gnome DOM document.

Much of the gnome-dom code will probably need to be scrapped, but with luck it will be possible to learn from its implementation.

Tree mutation events

The level 1 DOM does not provide a mechanism for applications to be notified of changes to the tree. The level 2 working draft does contain an event mechanism, but it looks pretty heavyweight to me. I think we still need to decide what we want to do here. Gnome home