Towards Full Hypermedia Support

Although the style sheet mechanism illustrated in the previous examples provides full operational support, it must be considered too low-level to support more complex hypermedia document formats. A more generic, declarative approach is needed to allow for interoperability between hypermedia applications. Such an approach has been proposed by the HyTime standard [HyTime], which defines syntactic conventions (known as architectural forms) to describe hypertextual features of SGML-encoded documents. The actual look and feel of a HyTime-encoded document, that is, the operational realization of these features, is taken to be application dependent. While a full implementation of the HyTime standard is yet to be realized, a limited subset of the architectural forms is supported by some commercial document processing systems [Panorama]. Issues addressed by the HyTime standard include addressing, linking facilities, and spatial and temporal alignment. Facilities for user interaction are not addressed by the HyTime standard, and synchronization issues must be considered to be only weakly addressed. In this section we discuss to what extent the HyTime standard and other hypertext or hypermedia models, such as the Dexter Hypertext Reference Model [Dexter] or the Amsterdam Hypermedia Model [AHM], may be taken as a point of reference for extending the Web with full hypermedia functionality. Another standard of interest is proposed by the MHEG group [MHEG]. However, whereas MHEG only provides a solution for final-form documents (with the presentation information built in), HyTime and the two other models are of a more generic nature, since they abstract from detailed presentation issues.


The ability to identify (ranges of) elements within an arbitrary document is a critical issue for all hypermedia systems, since addressing is a prerequisite for hyperlinking and alignment. On the Web, addressing is based on the location of the data on a specific server. Research into making the addressing mechanism independent of the location of the data is still in its infancy. Given a certain document, one must also be able to address fragments within that document. The Dexter Hypertext Reference Model [Dexter] introduced the notion of anchors to address items within a document without requiring knowledge of their inner structure. In text documents, anchors are typically encoded within the document structure (e.g. in HTML). To identify items in audio, video or read-only text documents (e.g. on CD-ROM), mechanisms are needed to define and store extrinsic anchors independently of the document itself. To allow for both forms of addressing, HyTime provides a plethora of syntactic constructs representing (aggregate) pointers to data. Standardizing these addressing schemes is required for exchanging hypermedia documents between different hypermedia applications. To spare individual applications from re-implementing these mechanisms, a hypermedia framework needs to support them in a generic way.
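The idea of extrinsic anchors can be illustrated with a minimal Python sketch. It is not HyTime syntax and not taken from the Dexter model; the names Anchor, AnchorStore and resolve_text, the example URLs, and the choice of a (start, end, unit) range are all assumptions made for illustration. The point is only that the anchor lives in a separate store, so the addressed document can remain read-only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """Hypothetical extrinsic anchor: a range inside a document, stored apart from it."""
    anchor_id: str
    document: str   # location of the data (illustrative URL)
    start: float    # start of the range, in the document's own units
    end: float      # end of the range (exclusive)
    unit: str       # e.g. "char" for text, "second" for audio or video


@dataclass
class AnchorStore:
    """Keeps anchors outside the documents they point into."""
    anchors: dict = field(default_factory=dict)

    def add(self, anchor: Anchor) -> None:
        if anchor.end < anchor.start:
            raise ValueError("anchor range must not be negative")
        self.anchors[anchor.anchor_id] = anchor

    def resolve_text(self, anchor_id: str, content: str) -> str:
        """Return the text fragment addressed by a character-based anchor."""
        anchor = self.anchors[anchor_id]
        if anchor.unit != "char":
            raise ValueError("only character anchors can be resolved against text")
        return content[int(anchor.start):int(anchor.end)]


store = AnchorStore()
# A character range in a read-only text and a time range in a video clip.
store.add(Anchor("a1", "http://example.org/readonly.txt", 4, 9, "char"))
store.add(Anchor("a2", "http://example.org/clip.mpg", 12.5, 20.0, "second"))

fragment = store.resolve_text("a1", "The quick brown fox")
print(fragment)
```

The same store holds anchors into text and into time-based media; only the unit differs, which is one simple way to keep the addressing mechanism generic across media types.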


Having powerful addressing mechanisms is not sufficient to fully support hyperlinking. To facilitate document exchange, a hypermedia framework needs to support standardized syntactic elements defining both contextual links (i.e. links defined by the source anchor) and independent links (i.e. links whose definition is independent of the definition of their anchors). Independent links may be stored in another document or in a link base. HyTime provides architectural forms for both types of links. These links can be bidirectional (allowing traversal from source to destination and vice versa) and multi-headed (allowing multiple source and destination anchors). On the Web, support for bidirectional links will require the development of some kind of (globally) distributed link service, since such links can no longer be embodied in the (source) document. The Amsterdam Hypermedia Model [AHM] introduces the notion of link context to specify the scope of both source and destination of hyperlinks in hierarchically structured documents. While such a concept would be useful for most HyTime-encoded documents as well, the HyTime standard does not seem to address this problem. For text-based applications, the style of link presentation can be defined by a style sheet (e.g. by specifying how anchors should be marked). For image, audio and video, however, the presentation of links and anchors seems too complicated to be solved at that level.
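A minimal Python sketch may help to make the notion of independent links concrete. It does not follow the HyTime architectural forms; the classes Link and LinkBase, the method targets_from, and the anchor identifiers are hypothetical. Links are stored in a link base separate from any document, may have several source and destination anchors, and may be traversed in both directions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """Hypothetical independent link: refers to anchors by identifier only."""
    link_id: str
    sources: frozenset
    destinations: frozenset
    bidirectional: bool = False


@dataclass
class LinkBase:
    """Stores links outside the documents that contain the anchors."""
    links: list = field(default_factory=list)

    def add(self, link: Link) -> None:
        self.links.append(link)

    def targets_from(self, anchor_id: str) -> set:
        """Anchors reachable from anchor_id in one traversal step."""
        reachable = set()
        for link in self.links:
            if anchor_id in link.sources:
                reachable |= link.destinations
            # A bidirectional link can also be followed from its destinations.
            if link.bidirectional and anchor_id in link.destinations:
                reachable |= link.sources
        return reachable


base = LinkBase()
# A multi-headed, bidirectional link and a plain one-way link.
base.add(Link("l1", frozenset({"intro", "summary"}), frozenset({"figure1"}), bidirectional=True))
base.add(Link("l2", frozenset({"intro"}), frozenset({"appendix"})))

print(sorted(base.targets_from("intro")))
print(sorted(base.targets_from("figure1")))
```

Because traversal from a destination requires consulting the link base rather than the source document, the sketch also shows why bidirectional links on the Web call for some kind of link service.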

Spatial and temporal alignment

From a HyTime perspective, presenting a document requires the ability to position multimedia components in an (n-dimensional) coordinate space. Note that for text-based documents, the exact positions of the document's elements can usually be calculated by combining the logical structure (e.g. a title) with the declarative rendering information in the style sheet (e.g. which font and indentation to use for titles). This mechanism relies on the fact that the basic logical structure of common document formats (e.g. letters, articles, books) is well understood by both application developers and authors. For hypermedia documents, there is still little understanding of such fundamental structural elements. Another issue is the temporal synchronization between the time-dependent elements of a hypermedia presentation. These dependencies are often hard to express in a style sheet. In contrast to the specification of fonts and indentation, knowledge of the temporal dependencies is often required to understand the semantics of the document. Such important dependencies should therefore be defined by the content and structure, and not by an accompanying style sheet. HyTime provides mechanisms to wrap multimedia objects in an event. Each event has an extent (i.e. an n-dimensional bounding box) which can be positioned in a finite coordinate space. Although these generic mechanisms can be used to specify spatial alignment and temporal scheduling in a uniform manner, the resulting document structures tend to contain overly long, complex and detailed extent specifications. Examples of temporal constraint specification in HyTime can be found in [Erfle].
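The uniform treatment of space and time can be sketched in a few lines of Python. This is an illustration under our own assumptions, not HyTime notation: the classes Extent and Event, the representation of each axis as a (start, length) pair, and the example values are hypothetical. An extent is an n-dimensional bounding box, and two events conflict only if their extents overlap on every axis, time included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical n-dimensional bounding box: one (start, length) pair per axis."""
    dims: tuple  # e.g. ((x, width), (y, height), (t, duration))

    def overlaps(self, other: "Extent") -> bool:
        if len(self.dims) != len(other.dims):
            raise ValueError("extents must share the same coordinate space")
        # Two boxes overlap only if their intervals overlap on every axis.
        return all(
            s1 < s2 + l2 and s2 < s1 + l1
            for (s1, l1), (s2, l2) in zip(self.dims, other.dims)
        )


@dataclass(frozen=True)
class Event:
    """A multimedia object wrapped together with its extent."""
    name: str
    extent: Extent


# Axes: x (pixels), y (pixels), t (seconds).
video = Event("video", Extent(((0, 320), (0, 240), (0, 30))))
caption = Event("caption", Extent(((0, 320), (240, 40), (5, 10))))
logo = Event("logo", Extent(((300, 50), (220, 50), (0, 30))))

print(video.extent.overlaps(caption.extent))
print(video.extent.overlaps(logo.extent))
```

Even in this toy form, every event needs a full coordinate specification on each axis, which hints at why extent specifications in real documents quickly become long and detailed.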

User interaction

Apart from textual anchors to retrieve documents by URL, the Web provides a limited number of facilities for user interaction, including clickable maps and forms. Both of these facilities, however, depend on server-side processing. Adding such facilities as client-side extensions is easy to do, and allows for more appealing interaction metaphors, for example by employing 3D objects or virtual reality extensions. However, fitting these extensions, and the enriched repertoire of user interactions they allow for, into a hypermedia model is very difficult [AHM]. For example, for time-based media, when we take a time-line model as a starting point, we are forced to take into account the indeterminacy caused by possible user interactions. As already observed, the HyTime proposal does not seem suited to this, nor to expressing the indeterminacy caused by network and other delays.