Any errors in the descriptions below are our own.

Annotation Graph Toolkit (AGTK)

The Annotation Graph Toolkit, or AGTK, employs a data model, the annotation graph, which is a directed acyclic graph where edges are labeled with feature-value pairs and nodes can be labeled with time offsets. Structural relationships can be expressed by convention in the edge labelling, but they are not exposed directly in the API as they are in NXT; instead, the focus is on the efficient handling of temporal information. AGTK is written in C++ and comes with a Java port. A query language was planned for AGTK but has not advanced recently. Although AGTK does not provide direct support for writing graphical user interfaces, it does include wrappers for Tcl/Tk and Python, two scripting languages in which writing such interfaces is easier than in C++ itself. The developers expect interfaces to call upon WaveSurfer, a compatible package, to display waveforms and play audio files.
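
To make the data model described above concrete, here is a minimal Python sketch of an annotation graph: nodes that may carry a time offset, and directed edges labeled with feature-value pairs. It is an illustration only and does not reproduce the AGTK API; the names `Node`, `Arc`, `AnnotationGraph`, and `arcs_overlapping` are hypothetical, and the sketch does not check acyclicity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A graph node; the time offset (here in seconds) is optional."""
    node_id: str
    offset: Optional[float] = None


@dataclass
class Arc:
    """A directed edge labeled with feature-value pairs."""
    source: str
    target: str
    features: Dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotationGraph:
    """Hypothetical container; acyclicity is assumed, not enforced."""
    nodes: Dict[str, Node] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node_id: str, offset: Optional[float] = None) -> None:
        self.nodes[node_id] = Node(node_id, offset)

    def add_arc(self, source: str, target: str, **features: str) -> None:
        # Both endpoints must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the arc")
        self.arcs.append(Arc(source, target, dict(features)))

    def arcs_overlapping(self, start: float, end: float) -> List[Arc]:
        """Return arcs whose time span overlaps the interval [start, end)."""
        hits = []
        for arc in self.arcs:
            s = self.nodes[arc.source].offset
            e = self.nodes[arc.target].offset
            if s is None or e is None:
                continue  # arcs without both time anchors cannot be tested
            if s < end and e > start:
                hits.append(arc)
        return hits


graph = AnnotationGraph()
graph.add_node("n0", 0.00)
graph.add_node("n1", 0.42)
graph.add_node("n2", 0.95)
graph.add_arc("n0", "n1", type="word", label="hello")
graph.add_arc("n1", "n2", type="word", label="world")
# The phrase/word relationship is expressed only by convention: the
# phrase arc spans the same nodes as the two word arcs.
graph.add_arc("n0", "n2", type="phrase", label="greeting")

overlapping = [a.features["label"] for a in graph.arcs_overlapping(0.5, 1.0)]
```

Note that nothing in the structure itself says that the phrase contains the two words; a reader of the graph has to infer that from the shared nodes and the labels, which is the sense in which structure is expressed by convention.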


Atlas is intended to generalize the annotation graph and differs from it in two main ways. First, it allows richer relationships between annotation and signal. In annotation graphs, the only relationship between annotation and signal that is supported in the data handling is the timespan on the signal to which the annotation refers, given as a start and end time. NXT is similar to AGTK in this regard. Atlas, however, defines more generic signal regions which can refer to other properties besides the timing. For example, on a video signal, a region could pinpoint a screen location using X and Y coordinates. Second, Atlas explicitly represents structural relationships by allowing annotations to name a set of "children", without constraining how many "parents" an annotation may have. MAIA, the framework for defining the semantics of this relationship and for specifying which types of annotations expect which other types as children, is not complete, and there have been no new developments for some time. The approach has the potential to be very flexible, especially if the semantics of the parent-child relationship can vary depending on the types of data objects that they link. The Atlas data model is implemented in Java, and the developers have in the past planned both a query language and direct support for writing graphical user interfaces.
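
The two differences described above can be sketched in a few lines of Python. This is an illustration only, not the Atlas Java API: the names `Region`, `Annotation`, and `parents_of`, and the anchor keys `start`, `end`, `x`, and `y`, are all hypothetical. A region holds arbitrary named anchor values rather than just a start and end time, and an annotation names its children without any limit on how many parents a child may have.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """A signal region described by named anchor values, not only times."""
    signal: str
    anchors: Dict[str, float]


@dataclass
class Annotation:
    """An annotation that names its children explicitly."""
    kind: str
    region: Region
    content: Dict[str, str] = field(default_factory=dict)
    children: List["Annotation"] = field(default_factory=list)


def parents_of(target: Annotation, annotations: List[Annotation]) -> List[Annotation]:
    """Find every annotation that lists `target` as a child (by identity)."""
    return [a for a in annotations if any(c is target for c in a.children)]


# A region on a video signal that carries a screen location as well as timing.
point = Region("video-1", {"start": 12.0, "end": 12.8, "x": 310.0, "y": 145.0})
gesture = Annotation("gesture", point, {"label": "pointing"})

# Two different annotations both name the gesture as a child.
span = Region("video-1", {"start": 10.0, "end": 15.0})
turn = Annotation("turn", span, {"speaker": "A"}, children=[gesture])
reference = Annotation("reference", span, {"target": "map"}, children=[gesture])

all_annotations = [gesture, turn, reference]
parent_kinds = [a.kind for a in parents_of(gesture, all_annotations)]
```

The sketch leaves out exactly the part the text identifies as unfinished: nothing here says what the parent-child link means, or which kinds of annotation may take which kinds of children.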


MMAX2 is primarily used for annotation of text, but it has the facility to play some kinds of audio signal in synchrony with its data display. Timing information is represented not in the data itself but in the stylesheet that MMAX2 uses to specify a data display format declaratively. MMAX2's data model is rather simpler than NXT's, but it allows one to specify different types of annotation, all of which point independently to the base documents, as well as links between annotations. MMAX2 also has a query language based on the idea of intersections between paths. MMAX2 is easier to set up than NXT, but NXT becomes more useful the more one's work relies on crossing hierarchies, complex structural relationships, or timing information.


EMU also shares some properties with NXT, in that it allows time-aligned labelling of speech data including hierarchical decomposition across different tiers of labels and specifically supports query of the label sets. (This differentiates EMU from tools such as Anvil and TASX that are just coding tools without more general support, although given the availability of XML query languages to deal with their data formats, it's not clear that this really makes a difference.)


Other tools and frameworks worth considering: GATE, WordFreak, and CALLISTO if your data is textual (i.e., you don't need signal playing to annotate) and you can tolerate stand-off using character offsets; The Observer, Event Editor, TASX, Anvil, and ELAN for simple time-stamped labelling of signals (with some tools offering linking between elements).


Last modified 04/19/06