net.sourceforge.nite.nom.nomwrite.impl
Class NOMWriteCorpus

java.lang.Object
  extended by net.sourceforge.nite.nom.nomwrite.impl.NOMWriteCorpus
All Implemented Interfaces:
NOMControl, NOMCorpus, SearchableCorpus, org.xml.sax.ext.LexicalHandler
Direct Known Subclasses:
NOMReadCorpus

public class NOMWriteCorpus
extends java.lang.Object
implements NOMCorpus, org.xml.sax.ext.LexicalHandler, NOMControl, SearchableCorpus

NOMCorpus is the top-level class and represents a multi-rooted directed graph: the NOM structure. The constructor must be passed a pre-loaded NMetaData structure, which is used in many of the methods.

Author:
Jonathan Kilgour, Holger Voormann (SearchableCorpus implementation etc.)

Field Summary
 
Fields inherited from interface net.sourceforge.nite.nom.nomwrite.NOMCorpus
UNTIMED
 
Constructor Summary
NOMWriteCorpus(NiteMetaData meta)
          Construct a NOM corpus ready to load / edit data.
NOMWriteCorpus(NiteMetaData meta, java.io.PrintStream log)
          Construct a NOM corpus ready to load / edit data.
 
Method Summary
 void addDerivedAttribute(java.lang.String oldatt, java.lang.String newatt, double offset)
          add a derived attribute to all the relevant elements in the entire corpus.
 void addDurations(java.lang.String attname)
          add derived 'duration' attributes to all the timed elements in the corpus.
 void clearData()
          Deletes all data in the NOM
 void clearDataForObservation(NObservation ob)
          Removes any currently loaded data relating to the given observation
 void clearDataForObservation(java.lang.String ob)
          Removes any currently loaded data relating to the named observation
 void comment(char[] ch, int start, int length)
          Store comments (part of the LexicalHandler interface)
 void completeLoad()
          finish loading *all* files we know about from the corpus: this only makes sense if lazy loading is switched on, otherwise it will do nothing.
 void deregisterViewer(NOMView display)
          Remove a NOMView from the list of viewers that get notified of changes.
 boolean edited()
          returns true if the corpus has unsaved edits
 void endCDATA()
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 void endDTD()
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 void endEntity(java.lang.String name)
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 void forceAnnotatorCoding(java.lang.String annotator, java.lang.String coding)
          Force one coding to be loaded for a specific annotator when loadData is called.
 java.lang.String generateID(java.lang.String colour)
          generates an Identifier that's globally unique - used when creating elements - we use 'colour' in an NXT-specific way: it's precisely the filename the element will be serialized into, without the '.xml' extension: thus it comprises observation name; '.'; the agent name followed by '.' (if an agent coding); the coding name.
 java.lang.Comparable getAttributeComparableValue(java.lang.Object element, java.lang.String name)
          Returns the value of an attribute of an element as Comparable.
 boolean getBatchMode()
          used internally to indicate when the process is in batch mode, i.e. we're currently loading a set of files.
 java.lang.Comparable getCenterComparableValue(java.lang.Object element)
          Returns the center of start and end time as a Comparable value.
 java.lang.String getCodingFilename(NObservation no, NCoding co, NAgent ag)
          Return the actual file to which this data should be serialized (including any annotator-specific subdirectory).
 double getCorpusDuration()
          returns the duration of the corpus (last end time - earliest start time) (or UNTIMED if there are no timed elements)
 double getCorpusEndTime()
          returns the latest end time of any element in the corpus (or UNTIMED if there is no timed element)
 double getCorpusStartTime()
          returns the earliest start time of any element in the corpus (or UNTIMED if there is no timed element)
 java.lang.Comparable getDurationComparableValue(java.lang.Object element)
          Returns the temporal duration as a Comparable value.
 NOMElement getElementByID(java.lang.String id)
          Return a NOMWriteElement which has the given element ID: you can either pass an unadorned ID, in which case NXT searches for the element in all already-loaded files, or you can specify the 'full' ID like this: colour#id (e.g. q4nc4.f.moves#move.3 would refer to element 'move.3' in the file q4nc4.f.moves.xml).
 NOMElement getElementByID(java.lang.String colour, java.lang.String id)
           
 java.util.Iterator getElements()
          Returns an Iterator which visits each element in the NOM exactly once: this version loads any data that has not already been loaded.
 java.util.Iterator getElements(java.util.List names)
          Returns an Iterator which visits each element with the specified type (= name of element) in the corpora exactly once.
 java.util.List getElementsByName(java.lang.String name)
          Return a list of NOMWriteElements which have the given element name.
 java.util.Iterator getElementsDominatedBy(java.lang.Object rootElement)
          Returns an Iterator which visits each element dominated by the specified element in the corpora exactly once.
 java.util.Iterator getElementsDominating(java.lang.Object childElement)
          Returns an Iterator which visits each element dominating the specified element in the corpora exactly once.
 java.util.Iterator getElementsLoaded()
          Returns an Iterator which visits each element that has already been loaded into the NOM exactly once: this version does not check whether there is data still to be loaded.
 java.util.Iterator getElementsOfSubgraph(java.lang.Object pointingElement)
          Returns an Iterator which visits each element of the specified subgraphs in the corpora exactly once.
 java.util.Iterator getElementsPointedBy(java.lang.Object startElement)
          Returns an Iterator which visits each element which has a pointer from the specified element in the corpora exactly once.
 java.lang.Comparable getEndComparableValue(java.lang.Object element)
          Returns the end time as a Comparable value.
 java.io.PrintStream getErrorStream()
          Return the error PrintStream
 java.lang.String getHrefAttr()
          Link syntax information: get the name of the 'href' attribute
 java.lang.Comparable getIdComparableValue(java.lang.Object element)
          Returns the ID of an element as a Comparable value.
 java.lang.String getLinkAfterID()
          Link syntax information: get the String that appears after an ID
 java.lang.String getLinkBeforeID()
          Link syntax information: get the String that appears before an ID
 java.lang.String getLinkFileSeparator()
          Link syntax information: get the String that separates a filename from an ID
 java.util.List getLoadedObservations()
          returns a List of NObservation elements - each one the name of an observation that has been asked to be loaded (how much, if any, of the observation data is actually loaded depends on lazy loading).
 java.io.PrintStream getLogStream()
          Return the log PrintStream
 NOMMaker getMaker()
          This is used by internal corpus-building routines so we make sure we always use the right constructors.
 int getMaxDepth(NLayer layer)
          Return the deepest nesting of elements in this recursive layer (if the layer is not recursive, returns 1 or 0)
 NMetaData getMetaData()
          returns the metadata associated with this NOM
 java.lang.String getNameOfElement(java.lang.Object element)
          Returns the name/type of the specified element.
 java.util.List getPointersTo(NOMElement to_element)
          Return the reverse index of pointers to the given element
 QueryRewriter getQueryRewriter()
          Return the query rewriter that should be used (or null if it is not set)
 java.lang.String getRangeSeparator()
          Link syntax information: get the String that appears between IDs in a range
 java.util.List getRootElements()
          returns a List of NOMElements: the top level "stream" elements
 NOMElement getRootWithColour(java.lang.String colour)
          returns the root NOMElement which has the given colour: we use 'colour' in an NXT-specific way: it's precisely the filename the element will be serialized into, without the '.xml' extension: thus it comprises observation name; '.'; the agent name followed by '.' (if an agent coding); the coding name.
 java.lang.Comparable getStartComparableValue(java.lang.Object element)
          Returns the start time as a Comparable value.
 java.lang.Comparable getText(java.lang.Object element)
          Returns the value of the text content as Comparable.
 boolean isEditSafe()
          Return true if the corpus can be edited safely - for internal use.
 boolean isLazyLoading()
          Returns true (default) if future calls to load data will be lazy-loaded; false means everything in future load calls is loaded up-front.
 boolean isLoadingFromFile()
          Returns true if data is currently being loaded from file.
 boolean isQueryRewriting()
          true means we have enabled the new query rewrite functionality that can increase the speed of your queries.
 boolean isValidating()
          returns true if the corpus is validating (i.e. if it is checking against the metadata whether changes are valid).
 void loadData()
          Load all data for the corpus into the NOMCorpus.
 void loadData(java.util.List observations, java.util.List codings)
          Load data for a specific set of observations into the NOMCorpus.
 void loadData(NObservation observation)
          Load data for a single observation into the NOMCorpus.
 void loadReliability(NLayer top, NLayer top_common, java.lang.String coder_attribute_name, java.lang.String path, java.util.List observations)
          Load data for the purpose of comparing different coders' data.
 void loadReliability(NLayer top, NLayer top_common, java.lang.String coder_attribute_name, java.lang.String path, java.util.List observations, java.util.List extra_layers)
          Load data for the purpose of comparing different coders' data.
 boolean lock(NOMView view)
          lock the corpus for edits - this is only necessary if more than one application will be writing to the same NOM simultaneously.
 java.util.Iterator NOMWalker()
          Provides an iterator which visits each element in the NOM exactly once.
 void notifyChange()
          Notify all NOMViews that an (unspecified) edit has occurred
 void notifyChange(NOMEdit edit)
          Notify all NOMViews that a specific NOMEdit has occurred
 void notifyChange(NOMEdit edit, NOMView view)
          Notify all NOMViews except the one passed as an argument that a NOMEdit has occurred
 void preferAnnotatorCoding(java.lang.String annotator, java.lang.String coding)
          Prefer one coding to be loaded for a specific annotator when loadData is called.
 void printStructure()
          A method to show the structure of the multi-rooted XML in the NOM
 void registerID(java.lang.String id, java.lang.String colour)
          registers an Identifier as having been used and if necessary, notes an Integer in the ID hash for quick generation of IDs.
 void registerViewer(NOMView display)
          Add a NOMView to the list of viewers that get notified of changes.
 void removePointerIndex(NOMPointer point)
          Remove a pointer from our global index (the index is required so we can delete appropriate pointers to elements that are themselves deleted).
 NOMElement resolveLink(java.lang.String xlink)
          Resolve an individual xlink expression which points to exactly one NOM element.
 NOMElement resolveLink(java.lang.String xlink, int linktype)
          Resolve an individual xlink expression which points to exactly one NOM element - the second argument explicitly names the link type involved.
 void serializeCorpus()
          Serialize all loaded files
 void serializeCorpus(java.util.List observations)
          Serialize all loaded files for the given list of observations
 void serializeCorpusChanged()
          Serialize all files which have been changed.
 boolean serializeInheritedTimes()
          True if we should allow inherited times to be serialized
 boolean serializeMaximalRanges()
          True if we should serialize ranges
 void setDefaultAnnotator(java.lang.String annotator)
          Set the preferred annotator for *all* codings that is used on subsequent loadData calls.
 void setErrorStream(java.io.PrintStream ps)
          Set the error PrintStream
 void setForceStreamElementNames(boolean bool)
          Set to true to make future serialization calls serialize with stream element names conforming to meta.getStreamElementName().
 void setLazyLoading(boolean bool)
          Set to true (default) to lazy-load any future calls to load data; false means everything in future load calls is loaded up-front.
 void setLogStream(java.io.PrintStream ps)
          Set the log PrintStream
 void setQueryRewriter(QueryRewriter writer)
          Enable the query rewrite functionality and select a rewriter to use (if the argument is null, query rewriting will not be enabled).
 void setQueryRewriting(boolean val)
          set to true to enable the new query rewrite functionality that can increase the speed of your queries
 void setSchemaLocation(java.lang.String location)
          If this method is used with a non-null argument, we make sure the schema instance namespace is output on every stream-like element on serialization along with this as the noNamespaceSchemaLocation
 void setSerializeInheritedTimes(boolean bool)
          Set to true to make future serialization calls serialize with inherited times on structural elements.
 void setSerializeMaximalRanges(boolean bool)
          Set to true (default) to make future serialization calls serialize with ranges where possible.
 void setValidation(boolean validate)
          Set validation for the corpus.
 void startCDATA()
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 void startDTD(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 void startEntity(java.lang.String name)
          part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use
 boolean testContactWith(java.lang.Object a, java.lang.Object b)
          Returns true if element A ends at the time element B starts.
 boolean testDominates(java.lang.Object a, java.lang.Object b)
          Returns true if element A dominates element B.
 boolean testDominates(java.lang.Object a, java.lang.Object b, int distance)
          Returns true if element A dominates element B with the specified distance.
 boolean testDominatesSubgraph(java.lang.Object a, java.lang.Object b)
          Returns true if there is a pointer from the first element to another element which is dominated by the second element.
 boolean testDominatesSubgraph(java.lang.Object a, java.lang.Object b, java.lang.String role)
          Returns true if there is a pointer with a specified role from the first element to another element which is dominated by the second element.
 boolean testHasPointer(java.lang.Object from, java.lang.Object to)
          Returns true if there is a pointer from the first to the second element.
 boolean testHasPointer(java.lang.Object from, java.lang.Object to, java.lang.String role)
          Returns true if there is a pointer from the first to the second element with the specified role.
 boolean testIncludes(java.lang.Object a, java.lang.Object b)
          Returns true if element A temporally includes element B.
 boolean testIsEqual(java.lang.Object a, java.lang.Object b)
          Returns true if element A is the same element as element B.
 boolean testIsInequal(java.lang.Object a, java.lang.Object b)
          Returns true if element A is not the same element as element B.
 boolean testLeftAlignedWith(java.lang.Object a, java.lang.Object b)
          Returns true if element A is left aligned with element B.
 boolean testOverlapsLeft(java.lang.Object a, java.lang.Object b)
          Returns true if element A overlaps left element B.
 boolean testOverlapsWith(java.lang.Object a, java.lang.Object b)
          Returns true if element A overlaps element B.
 boolean testPrecedes(java.lang.Object a, java.lang.Object b)
          Returns true if element A precedes element B.
 boolean testPrecedesTemporal(java.lang.Object a, java.lang.Object b)
          Returns true if element A temporally precedes element B.
 boolean testRightAlignedWith(java.lang.Object a, java.lang.Object b)
          Returns true if element A is right aligned with element B.
 boolean testSameDuration(java.lang.Object a, java.lang.Object b)
          Returns true if element A and element B have the same duration.
 boolean testSameExtend(java.lang.Object a, java.lang.Object b)
          Returns true if element A and element B have the same temporal extent (the same start and end times).
 boolean testTimed(java.lang.Object a)
          Returns true if the element A is timed.
 boolean unlock(NOMView view)
          unlock the corpus
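
The temporal tests listed above compare the start and end times of two elements. The following is a minimal sketch of a few of them, not the NXT implementation. Span and TemporalSketch are hypothetical stand-ins for timed NOM elements. Only the definition of contactWith is taken from the summary above; the other definitions are assumptions about what the method names mean.

```java
/** Hypothetical sketch, not part of NXT: a few temporal relations on simple time spans. */
final class TemporalSketch {
    private TemporalSketch() {}

    /** Hypothetical stand-in for a timed element. */
    static final class Span {
        final double start;
        final double end;

        Span(double start, double end) {
            this.start = start;
            this.end = end;
        }
    }

    /** True if A ends at the time B starts (as documented for testContactWith). */
    static boolean contactWith(Span a, Span b) {
        return a.end == b.start;
    }

    /** Assumption: A includes B when B lies entirely within A. */
    static boolean includes(Span a, Span b) {
        return a.start <= b.start && a.end >= b.end;
    }

    /** Assumption: left aligned means equal start times. */
    static boolean leftAligned(Span a, Span b) {
        return a.start == b.start;
    }

    /** Assumption: right aligned means equal end times. */
    static boolean rightAligned(Span a, Span b) {
        return a.end == b.end;
    }

    /** True if both spans last equally long. */
    static boolean sameDuration(Span a, Span b) {
        return (a.end - a.start) == (b.end - b.start);
    }

    /** Midpoint of start and end time. */
    static double center(Span s) {
        return (s.start + s.end) / 2.0;
    }
}
```

Exact comparison of doubles keeps the sketch short; real timing data may need a tolerance.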
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NOMWriteCorpus

public NOMWriteCorpus(NiteMetaData meta)
Construct a NOM corpus ready to load / edit data. This constructor also does some basic loading of non-observation specific data like ontologies and object-sets here.


NOMWriteCorpus

public NOMWriteCorpus(NiteMetaData meta,
                      java.io.PrintStream log)
Construct a NOM corpus ready to load / edit data. This constructor also does some basic loading of non-observation specific data like ontologies and object-sets here.

Method Detail

getBatchMode

public boolean getBatchMode()
used internally to indicate when the process is in batch mode i.e. we're currently loading a set of files.

Specified by:
getBatchMode in interface NOMCorpus

isLoadingFromFile

public boolean isLoadingFromFile()
Returns true if data is currently being loaded from file.

Specified by:
isLoadingFromFile in interface NOMCorpus

loadData

public void loadData(java.util.List observations,
                     java.util.List codings)
              throws NOMException
Load data for a specific set of observations into the NOMCorpus. Incremental loading of data is the default, so a new call to loadData will not zero-out the data loaded in a previous call. If the list of codings is non-null, it will be expected to be a list of NCodings that is the maximal set to be loaded whether lazy loading is on or off.

Specified by:
loadData in interface NOMCorpus
Throws:
NOMException

loadData

public void loadData(NObservation observation)
              throws NOMException
Load data for a single observation into the NOMCorpus. Incremental loading of data is the default, so a new call to loadData will not zero-out the data loaded in a previous call.

Specified by:
loadData in interface NOMCorpus
Throws:
NOMException

setDefaultAnnotator

public void setDefaultAnnotator(java.lang.String annotator)
Set the preferred annotator for *all* codings that is used on subsequent loadData calls. This will be overridden by any codings that are forced to a specific annotator using 'forceAnnotatorCoding', or even preferred using 'preferAnnotatorCoding'. Note that this is the preferred annotator only, and if there is no annotator data for any coding but gold-standard data is present, that will be loaded instead.

Specified by:
setDefaultAnnotator in interface NOMCorpus

forceAnnotatorCoding

public void forceAnnotatorCoding(java.lang.String annotator,
                                 java.lang.String coding)
                          throws NOMException
Force one coding to be loaded for a specific annotator when loadData is called. This loads from the annotator's directory even if it's empty, and there is gold-standard data available.

Specified by:
forceAnnotatorCoding in interface NOMCorpus
Throws:
NOMException

preferAnnotatorCoding

public void preferAnnotatorCoding(java.lang.String annotator,
                                  java.lang.String coding)
                           throws NOMException
Prefer one coding to be loaded for a specific annotator when loadData is called. This means if there's no annotator data for the coding (in fact its enclosing coding-file) we take any 'gold-standard' data instead.

Specified by:
preferAnnotatorCoding in interface NOMCorpus
Throws:
NOMException
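
The interplay of setDefaultAnnotator, forceAnnotatorCoding and preferAnnotatorCoding described above can be sketched as a small decision procedure. This is an illustration under stated assumptions, not NXT code: AnnotatorChoice, its GOLD marker and the hasData parameter are hypothetical, and it is assumed that a forced setting also wins over a preferred one for the same coding.

```java
/** Hypothetical sketch, not part of NXT: which annotator's data a coding is loaded from. */
final class AnnotatorChoice {
    /** Hypothetical marker meaning "load the gold-standard data". */
    static final String GOLD = "<gold-standard>";

    private final java.util.Map<String, String> forced = new java.util.HashMap<>();
    private final java.util.Map<String, String> preferred = new java.util.HashMap<>();
    private String defaultAnnotator; // null when no default has been set

    void force(String annotator, String coding) {
        forced.put(coding, annotator);
    }

    void prefer(String annotator, String coding) {
        preferred.put(coding, annotator);
    }

    void setDefault(String annotator) {
        defaultAnnotator = annotator;
    }

    /**
     * @param coding  the coding about to be loaded
     * @param hasData annotators that actually have data for this coding
     * @return the annotator to load from, or GOLD
     */
    String sourceFor(String coding, java.util.Set<String> hasData) {
        String f = forced.get(coding);
        if (f != null) {
            return f; // forced: used even when the annotator has no data
        }
        String p = preferred.get(coding);
        String candidate = (p != null) ? p : defaultAnnotator;
        if (candidate != null && hasData.contains(candidate)) {
            return candidate;
        }
        return GOLD; // preferred or default annotator has no data: fall back
    }
}
```

For example, with default annotator "anna", a preference for "bob" on coding "moves" and no data from either, sourceFor("moves", ...) falls back to the gold-standard marker.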

loadData

public void loadData()
              throws NOMException
Load all data for the corpus into the NOMCorpus. Incremental loading of data is the default, so a new call to loadData will not zero-out the data loaded in a previous call.

Specified by:
loadData in interface NOMCorpus
Throws:
NOMException

loadReliability

public void loadReliability(NLayer top,
                            NLayer top_common,
                            java.lang.String coder_attribute_name,
                            java.lang.String path,
                            java.util.List observations)
                     throws NOMException
Load data for the purpose of comparing different coders' data. This is a writable implementation, so calling this method is invalid (we don't want anyone attempting to serialize data loaded this way).

Specified by:
loadReliability in interface NOMCorpus
Throws:
NOMException

loadReliability

public void loadReliability(NLayer top,
                            NLayer top_common,
                            java.lang.String coder_attribute_name,
                            java.lang.String path,
                            java.util.List observations,
                            java.util.List extra_layers)
                     throws NOMException
Load data for the purpose of comparing different coders' data. This is a writable implementation, so calling this method is invalid (we don't want anyone attempting to serialize data loaded this way).

Specified by:
loadReliability in interface NOMCorpus
Throws:
NOMException

clearData

public void clearData()
Deletes all data in the NOM

Specified by:
clearData in interface NOMCorpus

clearDataForObservation

public void clearDataForObservation(NObservation ob)
Removes any currently loaded data relating to the given observation

Specified by:
clearDataForObservation in interface NOMCorpus

clearDataForObservation

public void clearDataForObservation(java.lang.String ob)
Removes any currently loaded data relating to the named observation

Specified by:
clearDataForObservation in interface NOMCorpus

getRootElements

public java.util.List getRootElements()
returns a List of NOMElements: the top level "stream" elements

Specified by:
getRootElements in interface NOMCorpus

getRootWithColour

public NOMElement getRootWithColour(java.lang.String colour)
returns the root NOMElement which has the given colour: we use 'colour' in an NXT-specific way: it's precisely the filename the element will be serialized into, without the '.xml' extension: thus it comprises observation name; '.'; the agent name followed by '.' (if an agent coding); the coding name.

Specified by:
getRootWithColour in interface NOMCorpus
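
The composition of a colour described above can be sketched as follows. ColourNames is a hypothetical helper, not part of NXT; the example values reuse the observation "q4nc4", agent "f" and coding "moves" that appear elsewhere on this page.

```java
/** Hypothetical helper, not part of NXT: composes a colour string. */
final class ColourNames {
    private ColourNames() {}

    /**
     * @param observation observation name, e.g. "q4nc4"
     * @param agent       agent name, or null if this is not an agent coding
     * @param coding      coding name, e.g. "moves"
     */
    static String colourFor(String observation, String agent, String coding) {
        StringBuilder sb = new StringBuilder(observation).append('.');
        if (agent != null) {
            sb.append(agent).append('.'); // only present for agent codings
        }
        return sb.append(coding).toString();
    }

    /** The file a root with this colour would be serialized into. */
    static String fileNameFor(String colour) {
        return colour + ".xml";
    }
}
```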

getElementsByName

public java.util.List getElementsByName(java.lang.String name)
Return a list of NOMWriteElements which have the given element name.

Specified by:
getElementsByName in interface NOMCorpus

getElementByID

public NOMElement getElementByID(java.lang.String colour,
                                 java.lang.String id)
Specified by:
getElementByID in interface NOMCorpus

getElementByID

public NOMElement getElementByID(java.lang.String id)
Return a NOMWriteElement which has the given element ID: you can either pass an unadorned ID in which case NXT searches for the element in all already-loaded files, or you can specify the 'full' ID like this: colour#id (e.g. q4nc4.f.moves#move.3 would refer to element 'move.3' in the file q4nc4.f.moves.xml)

Specified by:
getElementByID in interface NOMCorpus
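
A minimal sketch of how a 'full' ID of the form colour#id could be taken apart. ElementIds is a hypothetical helper, not part of NXT, and it assumes '#' is the separator, as in the example above.

```java
/** Hypothetical helper, not part of NXT: splits a 'full' ID into colour and ID. */
final class ElementIds {
    private ElementIds() {}

    /**
     * @return a two-element array {colour, id}; colour is null for an unadorned ID
     */
    static String[] split(String fullId) {
        int hash = fullId.indexOf('#'); // assumed separator, as in colour#id
        if (hash < 0) {
            return new String[] { null, fullId };
        }
        return new String[] { fullId.substring(0, hash), fullId.substring(hash + 1) };
    }
}
```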

getMaxDepth

public int getMaxDepth(NLayer layer)
Return the deepest nesting of elements in this recursive layer (if the layer is not recursive, returns 1 or 0)

Specified by:
getMaxDepth in interface NOMCorpus
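
The idea of "deepest nesting" can be sketched with a plain recursive tree walk. DepthSketch and its Node class are hypothetical stand-ins for elements of a recursive layer. The sketch assumes the depth is 0 when there are no elements and 1 when elements exist but none is nested.

```java
/** Hypothetical sketch, not part of NXT: nesting depth of tree-shaped elements. */
final class DepthSketch {
    private DepthSketch() {}

    /** Hypothetical stand-in for an element of a recursive layer. */
    static final class Node {
        final java.util.List<Node> children = new java.util.ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Depth of the deepest nesting under the given elements; 0 if there are none. */
    static int maxDepth(java.util.List<Node> elements) {
        int deepest = 0;
        for (Node n : elements) {
            deepest = Math.max(deepest, 1 + maxDepth(n.children));
        }
        return deepest;
    }
}
```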

isValidating

public boolean isValidating()
returns true if the corpus is validating (i.e. if it is checking against the metadata whether changes are valid). The default value for validation is true

Specified by:
isValidating in interface NOMCorpus

setValidation

public void setValidation(boolean validate)
Set validation for the corpus. The default value for validation is true.

Specified by:
setValidation in interface NOMCorpus

getMetaData

public NMetaData getMetaData()
returns the metadata associated with this NOM

Specified by:
getMetaData in interface NOMCorpus

getLoadedObservations

public java.util.List getLoadedObservations()
returns a List of NObservation elements - each one the name of an observation that has been asked to be loaded (how much, if any, of the observation data is actually loaded depends on lazy loading).

Specified by:
getLoadedObservations in interface NOMCorpus

setForceStreamElementNames

public void setForceStreamElementNames(boolean bool)
Set to true to make future serialization calls serialize with stream element names conforming to meta.getStreamElementName(). Default is that stream elements will be output as they are input.

Specified by:
setForceStreamElementNames in interface NOMCorpus

setSchemaLocation

public void setSchemaLocation(java.lang.String location)
If this method is used with a non-null argument, we make sure the schema instance namespace is output on every stream-like element on serialization along with this as the noNamespaceSchemaLocation

Specified by:
setSchemaLocation in interface NOMCorpus

setLazyLoading

public void setLazyLoading(boolean bool)
Set to true (default) to lazy-load any future calls to load data; false means everything in future load calls is loaded up-front.

Specified by:
setLazyLoading in interface NOMCorpus

completeLoad

public void completeLoad()
finish loading *all* files we know about from the corpus: this only makes sense if lazy loading is switched on, otherwise it will do nothing.

Specified by:
completeLoad in interface NOMCorpus
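
The relationship between setLazyLoading, loadData and completeLoad can be illustrated with a small model. LazyLoadSketch is hypothetical and works on file names only; the touch method is an invented stand-in for whatever first access triggers an actual load in NXT.

```java
/** Hypothetical sketch, not part of NXT: deferred loading of known files. */
final class LazyLoadSketch {
    private final java.util.Set<String> known = new java.util.LinkedHashSet<>();
    private final java.util.Set<String> loaded = new java.util.LinkedHashSet<>();
    private boolean lazy = true; // the documented default

    void setLazyLoading(boolean lazy) {
        this.lazy = lazy;
    }

    boolean isLazyLoading() {
        return lazy;
    }

    /** Registers files; reads them immediately only when lazy loading is off. */
    void loadData(java.util.List<String> files) {
        known.addAll(files);
        if (!lazy) {
            loaded.addAll(files);
        }
    }

    /** Simulates the first access to a known file, which forces it to load. */
    void touch(String file) {
        if (known.contains(file)) {
            loaded.add(file);
        }
    }

    /** Loads every known file that is still pending; a no-op if nothing is pending. */
    void completeLoad() {
        loaded.addAll(known);
    }

    java.util.Set<String> loadedFiles() {
        return java.util.Collections.unmodifiableSet(loaded);
    }
}
```

With lazy loading off, every file is already loaded after loadData, so completeLoad has nothing left to do.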

isLazyLoading

public boolean isLazyLoading()
Returns true (default) if future calls to load data will be lazy-loaded; false means everything in future load calls is loaded up-front.

Specified by:
isLazyLoading in interface NOMCorpus

setSerializeInheritedTimes

public void setSerializeInheritedTimes(boolean bool)
Set to true to make future serialization calls serialize with inherited times on structural elements. Set to false (default) to only serialize start and end times on timed elements.

Specified by:
setSerializeInheritedTimes in interface NOMCorpus

serializeInheritedTimes

public boolean serializeInheritedTimes()
True if we should allow inherited times to be serialized

Specified by:
serializeInheritedTimes in interface NOMCorpus

serializeMaximalRanges

public boolean serializeMaximalRanges()
True if we should serialize ranges

Specified by:
serializeMaximalRanges in interface NOMCorpus

setSerializeMaximalRanges

public void setSerializeMaximalRanges(boolean bool)
Set to true (default) to make future serialization calls serialize with ranges where possible. Set to false to explicitly list all nite children.

Specified by:
setSerializeMaximalRanges in interface NOMCorpus
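
The difference between listing every child and serializing "ranges where possible" can be sketched as collapsing runs of consecutive children. RangeSketch is hypothetical and not NXT's serializer: it treats the child file as a flat, ordered list of IDs, and the ".." separator used in the usage note is an assumed value.

```java
/** Hypothetical sketch, not part of NXT: collapse consecutive child IDs into ranges. */
final class RangeSketch {
    private RangeSketch() {}

    /**
     * @param fileOrder      all element IDs of the child file, in document order
     * @param children       the child IDs to reference, in document order
     * @param rangeSeparator string placed between the first and last ID of a range
     * @return single IDs and "first<sep>last" ranges covering exactly the children
     */
    static java.util.List<String> toRanges(java.util.List<String> fileOrder,
                                           java.util.List<String> children,
                                           String rangeSeparator) {
        java.util.List<String> out = new java.util.ArrayList<>();
        int i = 0;
        while (i < children.size()) {
            int start = i;
            int pos = fileOrder.indexOf(children.get(i));
            // Extend the run while the next child is also the next element in the file.
            while (pos >= 0
                    && i + 1 < children.size()
                    && pos + 1 < fileOrder.size()
                    && fileOrder.get(pos + 1).equals(children.get(i + 1))) {
                i++;
                pos++;
            }
            out.add(start == i
                    ? children.get(start)
                    : children.get(start) + rangeSeparator + children.get(i));
            i++;
        }
        return out;
    }
}
```

With file order w1 to w5 and children w1, w2, w3, w5, the sketch yields the range w1..w3 followed by the single ID w5. Listing all children explicitly would instead produce four separate references.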

getLinkFileSeparator

public java.lang.String getLinkFileSeparator()
Link syntax information: get the String that separates a filename from an ID

Specified by:
getLinkFileSeparator in interface NOMCorpus

getLinkBeforeID

public java.lang.String getLinkBeforeID()
Link syntax information: get the String that appears before an ID

Specified by:
getLinkBeforeID in interface NOMCorpus

getLinkAfterID

public java.lang.String getLinkAfterID()
Link syntax information: get the String that appears after an ID

Specified by:
getLinkAfterID in interface NOMCorpus

getRangeSeparator

public java.lang.String getRangeSeparator()
Link syntax information: get the String that appears between IDs in a range

Specified by:
getRangeSeparator in interface NOMCorpus

getHrefAttr

public java.lang.String getHrefAttr()
Link syntax information: get the name of the 'href' attribute

Specified by:
getHrefAttr in interface NOMCorpus
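
The four link syntax strings above (file separator, before-ID, after-ID and range separator) could be combined into a link roughly as follows. This is a sketch under stated assumptions: LinkSyntaxSketch is hypothetical, the way the pieces are concatenated is an assumption, and the values in the usage note are illustrative. The real values depend on the metadata link syntax setting.

```java
/** Hypothetical sketch, not part of NXT: assemble link strings from syntax parts. */
final class LinkSyntaxSketch {
    private final String fileSeparator;
    private final String beforeId;
    private final String afterId;
    private final String rangeSeparator;

    LinkSyntaxSketch(String fileSeparator, String beforeId, String afterId, String rangeSeparator) {
        this.fileSeparator = fileSeparator;
        this.beforeId = beforeId;
        this.afterId = afterId;
        this.rangeSeparator = rangeSeparator;
    }

    /** Link to a single element in a file. */
    String link(String file, String id) {
        return file + fileSeparator + beforeId + id + afterId;
    }

    /** Link to a range of elements in one file (assumed composition). */
    String range(String file, String firstId, String lastId) {
        return link(file, firstId) + rangeSeparator + beforeId + lastId + afterId;
    }
}
```

With the assumed values "#", "id(", ")" and "..", a link to move.3 in q4nc4.f.moves.xml comes out as q4nc4.f.moves.xml#id(move.3).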

NOMWalker

public java.util.Iterator NOMWalker()
Provides an iterator which visits each element in the NOM exactly once. We guarantee to traverse each "document" in document order, where "document" refers to a file that is read in or a pseudo-file that is created internally when data is loaded for a particular purpose. These "documents" are not considered to be ordered.

Specified by:
NOMWalker in interface NOMCorpus
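
The traversal guarantee above (document order within each "document", no promised order between documents) can be sketched with a pre-order walk over a list of trees. WalkerSketch and its Node class are hypothetical; each root stands for one document, and the sketch simply visits the documents in list order.

```java
/** Hypothetical sketch, not part of NXT: walk several documents in document order. */
final class WalkerSketch {
    private WalkerSketch() {}

    /** Hypothetical stand-in for an element within a single document. */
    static final class Node {
        final String id;
        final java.util.List<Node> children = new java.util.ArrayList<>();

        Node(String id, Node... kids) {
            this.id = id;
            children.addAll(java.util.Arrays.asList(kids));
        }
    }

    /** Returns an iterator visiting every element of every document exactly once. */
    static java.util.Iterator<Node> walk(java.util.List<Node> documentRoots) {
        java.util.List<Node> order = new java.util.ArrayList<>();
        for (Node root : documentRoots) {
            visit(root, order);
        }
        return order.iterator();
    }

    /** Pre-order visit: a parent comes before its children, as in document order. */
    private static void visit(Node n, java.util.List<Node> order) {
        order.add(n);
        for (Node child : n.children) {
            visit(child, order);
        }
    }
}
```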

serializeCorpus

public void serializeCorpus()
                     throws NOMException
Serialize all loaded files

Specified by:
serializeCorpus in interface NOMCorpus
Throws:
NOMException

serializeCorpusChanged

public void serializeCorpusChanged()
                            throws NOMException
Serialize all files which have been changed.

Specified by:
serializeCorpusChanged in interface NOMCorpus
Throws:
NOMException

serializeCorpus

public void serializeCorpus(java.util.List observations)
                     throws NOMException
Serialize all loaded files for the given list of observations

Specified by:
serializeCorpus in interface NOMCorpus
Throws:
NOMException
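
The difference between serializeCorpus and serializeCorpusChanged comes down to tracking which loaded files have unsaved edits. A minimal sketch of that bookkeeping, assuming one dirty flag per colour, is shown here. DirtyTrackingSketch is hypothetical: its methods return the colours that would be written instead of writing files, and the assumption that a save clears the edited state is not confirmed by this page.

```java
/** Hypothetical sketch, not part of NXT: track which loaded files have unsaved edits. */
final class DirtyTrackingSketch {
    private final java.util.Set<String> loaded = new java.util.TreeSet<>();
    private final java.util.Set<String> changed = new java.util.TreeSet<>();

    void load(String colour) {
        loaded.add(colour);
    }

    /** Records an edit to an element that lives in the given loaded file. */
    void edit(String colour) {
        if (loaded.contains(colour)) {
            changed.add(colour);
        }
    }

    /** True while at least one file has unsaved edits. */
    boolean edited() {
        return !changed.isEmpty();
    }

    /** Colours that a full serialization would write: every loaded file. */
    java.util.List<String> serializeCorpus() {
        changed.clear(); // assumption: saving clears the edited state
        return new java.util.ArrayList<>(loaded);
    }

    /** Colours that a changed-only serialization would write. */
    java.util.List<String> serializeCorpusChanged() {
        java.util.List<String> out = new java.util.ArrayList<>(changed);
        changed.clear();
        return out;
    }
}
```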

getCodingFilename

public java.lang.String getCodingFilename(NObservation no,
                                          NCoding co,
                                          NAgent ag)
Return the actual file to which this data should be serialized (including any annotator-specific subdirectory).

Specified by:
getCodingFilename in interface NOMCorpus

resolveLink

public NOMElement resolveLink(java.lang.String xlink)
Resolve an individual xlink expression which points to exactly one NOM element. Note that the format of the link depends on the metadata link syntax setting.

Specified by:
resolveLink in interface NOMCorpus

resolveLink

public NOMElement resolveLink(java.lang.String xlink,
                              int linktype)
Resolve an individual xlink expression which points to exactly one NOM element - the second argument explicitly names the link type involved. It can be one of XPOINTER_LINKS or LTXML1_LINKS (defined in the NMetaData class)

Specified by:
resolveLink in interface NOMCorpus

comment

public void comment(char[] ch,
                    int start,
                    int length)
Store comments (part of the LexicalHandler interface)

Specified by:
comment in interface org.xml.sax.ext.LexicalHandler

endCDATA

public void endCDATA()
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
endCDATA in interface org.xml.sax.ext.LexicalHandler

endDTD

public void endDTD()
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
endDTD in interface org.xml.sax.ext.LexicalHandler

endEntity

public void endEntity(java.lang.String name)
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
endEntity in interface org.xml.sax.ext.LexicalHandler

startCDATA

public void startCDATA()
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
startCDATA in interface org.xml.sax.ext.LexicalHandler

startDTD

public void startDTD(java.lang.String name,
                     java.lang.String publicId,
                     java.lang.String systemId)
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
startDTD in interface org.xml.sax.ext.LexicalHandler

startEntity

public void startEntity(java.lang.String name)
part of the org.xml.sax.ext.LexicalHandler implementation - not for programming use

Specified by:
startEntity in interface org.xml.sax.ext.LexicalHandler

registerViewer

public void registerViewer(NOMView display)
Add a NOMView to the list of viewers that get notified of changes.

Specified by:
registerViewer in interface NOMControl

deregisterViewer

public void deregisterViewer(NOMView display)
Remove a NOMView from the list of viewers that get notified of changes.

Specified by:
deregisterViewer in interface NOMControl

isEditSafe

public boolean isEditSafe()
Return true if the corpus can be edited safely - for internal use. The corpus is always safe to edit if it is not shared; if the corpus is shared, edits are permitted only if a process has locked the corpus.

Specified by:
isEditSafe in interface NOMCorpus

notifyChange

public void notifyChange()
Notify all NOMViews that an (unspecified) edit has occurred

Specified by:
notifyChange in interface NOMControl

notifyChange

public void notifyChange(NOMEdit edit)
                  throws NOMException
Notify all NOMViews that a specific NOMEdit has occurred

Specified by:
notifyChange in interface NOMControl
Throws:
NOMException

notifyChange

public void notifyChange(NOMEdit edit,
                         NOMView view)
                  throws NOMException
Notify all NOMViews except the one passed as an argument that a NOMEdit has occurred

Specified by:
notifyChange in interface NOMControl
Throws:
NOMException
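
The register / notify pattern described above can be sketched with stand-in types. This is a minimal illustration, not the NXT implementation: ViewerRegistrySketch, SketchView and its handleChange method are hypothetical names, and a plain String stands in for a NOMEdit.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the NOMControl side of a corpus; not the real NXT class.
class ViewerRegistrySketch {
    // Hypothetical stand-in for a NOMView: it only counts notifications.
    static class SketchView {
        int notifications = 0;
        void handleChange(String edit) { notifications++; }
    }

    private final List<SketchView> viewers = new ArrayList<>();

    void registerViewer(SketchView v) {
        if (!viewers.contains(v)) viewers.add(v);
    }

    void deregisterViewer(SketchView v) {
        viewers.remove(v);
    }

    // Notify every registered viewer except the one passed in (may be null).
    void notifyChange(String edit, SketchView source) {
        for (SketchView v : viewers) {
            if (v != source) v.handleChange(edit);
        }
    }

    public static void main(String[] args) {
        ViewerRegistrySketch corpus = new ViewerRegistrySketch();
        SketchView editor = new SketchView();
        SketchView display = new SketchView();
        corpus.registerViewer(editor);
        corpus.registerViewer(display);
        // The editor made the edit itself, so only the display is told.
        corpus.notifyChange("set attribute", editor);
        System.out.println(editor.notifications + " " + display.notifications);
    }
}
```

Excluding the originating view avoids a viewer redrawing in response to an edit it has already applied.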

lock

public boolean lock(NOMView view)
lock the corpus for edits - this is only necessary if more than one application will be writing to the same NOM simultaneously.

Specified by:
lock in interface NOMCorpus

unlock

public boolean unlock(NOMView view)
unlock the corpus

Specified by:
unlock in interface NOMCorpus
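
The rule stated for isEditSafe (an unshared corpus is always safe; a shared corpus is safe only while locked) can be modelled as below. This is a sketch under assumptions: the class name, the shared flag, and the meaning of the boolean return values of lock and unlock are illustrative guesses, not taken from the NXT source.

```java
// Hypothetical model of the lock / unlock / isEditSafe rule.
class EditLockSketch {
    private final boolean shared;
    private Object lockHolder = null; // the view currently holding the lock

    EditLockSketch(boolean shared) {
        this.shared = shared;
    }

    // Assumption: returns false if a different view already holds the lock.
    boolean lock(Object view) {
        if (lockHolder != null && lockHolder != view) return false;
        lockHolder = view;
        return true;
    }

    // Assumption: only the holder can release the lock.
    boolean unlock(Object view) {
        if (lockHolder == null || lockHolder != view) return false;
        lockHolder = null;
        return true;
    }

    // Unshared corpora are always safe; shared ones only while locked.
    boolean isEditSafe() {
        return !shared || lockHolder != null;
    }

    public static void main(String[] args) {
        EditLockSketch corpus = new EditLockSketch(true);
        Object view = new Object();
        System.out.println(corpus.isEditSafe());
        corpus.lock(view);
        System.out.println(corpus.isEditSafe());
    }
}
```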

edited

public boolean edited()
returns true if the corpus has unsaved edits

Specified by:
edited in interface NOMCorpus

getPointersTo

public java.util.List getPointersTo(NOMElement to_element)
Return the reverse index of pointers to the given element

Specified by:
getPointersTo in interface NOMCorpus

removePointerIndex

public void removePointerIndex(NOMPointer point)
Remove a pointer from our global index (the index is required so we can delete appropriate pointers to elements that are themselves deleted).

Specified by:
removePointerIndex in interface NOMCorpus
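
A reverse pointer index of the kind described for getPointersTo and removePointerIndex might look like the following. It is a minimal sketch using element IDs as plain strings; PointerIndexSketch and its method names are hypothetical and the real index stores NOMPointer objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reverse index: target element id -> ids of pointing elements.
class PointerIndexSketch {
    private final Map<String, List<String>> pointersTo = new HashMap<>();

    void addPointer(String from, String to) {
        pointersTo.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    void removePointer(String from, String to) {
        List<String> sources = pointersTo.get(to);
        if (sources == null) return;
        sources.remove(from);
        if (sources.isEmpty()) pointersTo.remove(to);
    }

    // Everything that points at 'to'; empty if nothing does.
    List<String> getPointersTo(String to) {
        return pointersTo.getOrDefault(to, Collections.emptyList());
    }

    public static void main(String[] args) {
        PointerIndexSketch index = new PointerIndexSketch();
        index.addPointer("move1", "word7");
        index.addPointer("move2", "word7");
        // Before deleting word7, these are the pointers that must go too.
        System.out.println(index.getPointersTo("word7"));
    }
}
```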

generateID

public java.lang.String generateID(java.lang.String colour)
Generates an identifier that is globally unique; used when creating elements. We use 'colour' in an NXT-specific way: it is precisely the filename the element will be serialized into, without the '.xml' extension. Thus it comprises the observation name; '.'; the agent name followed by '.' (if an agent coding); the coding name.

Specified by:
generateID in interface NOMCorpus
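
The composition of a 'colour' described above can be illustrated as follows. ColourSketch and its methods are hypothetical helpers written for this example, and the observation, agent and coding names are made up; a null agent stands for a coding that is not an agent coding.

```java
// Hypothetical helper that assembles a colour string from its parts.
class ColourSketch {
    // observation '.' [agent '.'] coding
    static String colour(String observation, String agent, String coding) {
        StringBuilder sb = new StringBuilder(observation).append('.');
        if (agent != null) {
            sb.append(agent).append('.');
        }
        return sb.append(coding).toString();
    }

    // The file an element of this colour would be serialized into.
    static String fileName(String colour) {
        return colour + ".xml";
    }

    public static void main(String[] args) {
        System.out.println(fileName(colour("o1", "g", "moves")));
        System.out.println(fileName(colour("o1", null, "words")));
    }
}
```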

registerID

public void registerID(java.lang.String id,
                       java.lang.String colour)
registers an Identifier as having been used and if necessary, notes an Integer in the ID hash for quick generation of IDs. Should only be used internally to the net.sourceforge.nite.nom.nomwrite.impl package. 'colour' is the filename the element will be serialized into without the .xml extension.

Specified by:
registerID in interface NOMCorpus

printStructure

public void printStructure()
A method to show the structure of the multi-rooted XML in the NOM


getElements

public java.util.Iterator getElements()
Returns an Iterator which visits each element in the NOM exactly once: this version loads any data that has not already been loaded.

Specified by:
getElements in interface SearchableCorpus
Returns:
an Iterator which visits each element in the NOM exactly once

getElementsLoaded

public java.util.Iterator getElementsLoaded()
Returns an Iterator which visits each element that has already been loaded into the NOM exactly once: this version does not check whether there is data still to be loaded.

Returns:
an Iterator which visits each element in the NOM exactly once

getText

public java.lang.Comparable getText(java.lang.Object element)
Returns the value of the text content as Comparable.

Specified by:
getText in interface SearchableCorpus
Parameters:
element - the element containing the text, that will be returned
Returns:
the value of the text content as Comparable

getAttributeComparableValue

public java.lang.Comparable getAttributeComparableValue(java.lang.Object element,
                                                        java.lang.String name)
Returns the value of an attribute of an element as Comparable.

Specified by:
getAttributeComparableValue in interface SearchableCorpus
Parameters:
element - the element with the requested attribute
name - the name of the attribute
Returns:
the value of an attribute of an element as Comparable

testIsEqual

public boolean testIsEqual(java.lang.Object a,
                           java.lang.Object b)
Returns true if element A is the same element as element B.

Specified by:
testIsEqual in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A is the same element as element B

testIsInequal

public boolean testIsInequal(java.lang.Object a,
                             java.lang.Object b)
Returns true if element A is not the same element as element B.

Specified by:
testIsInequal in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A is not the same element as element B

testDominates

public boolean testDominates(java.lang.Object a,
                             java.lang.Object b)
Returns true if element A dominates element B. Notice that an element also dominates itself.

Specified by:
testDominates in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A dominates element B

testDominates

public boolean testDominates(java.lang.Object a,
                             java.lang.Object b,
                             int distance)
Returns true if element A dominates element B with the specified distance. Notice that with distance=0 this method is equal to testIsEqual(java.lang.Object, java.lang.Object). A distance < 0 is also possible and means that element B dominates element A.

Specified by:
testDominates in interface SearchableCorpus
Parameters:
a - element A
b - element B
distance - distance between element A and element B
Returns:
true if element A dominates element B with the specified distance
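
The distance semantics above (0 means identity, negative values swap the roles of A and B) can be sketched on a small parent-to-children map. This is an illustration under assumptions: DominanceSketch is a hypothetical class, elements are plain strings, and the graph is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of dominance over a parent -> children map.
class DominanceSketch {
    private final Map<String, List<String>> children = new HashMap<>();

    void addChild(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // True if b is reachable from a by any number of child steps (0 included).
    boolean dominates(String a, String b) {
        if (a.equals(b)) return true;
        for (String c : children.getOrDefault(a, Collections.emptyList())) {
            if (dominates(c, b)) return true;
        }
        return false;
    }

    // True if b is reachable from a in exactly 'distance' child steps.
    boolean dominates(String a, String b, int distance) {
        if (distance < 0) return dominates(b, a, -distance); // roles swap
        if (distance == 0) return a.equals(b);               // identity
        for (String c : children.getOrDefault(a, Collections.emptyList())) {
            if (dominates(c, b, distance - 1)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DominanceSketch g = new DominanceSketch();
        g.addChild("s", "np");
        g.addChild("np", "w1");
        System.out.println(g.dominates("s", "w1", 2));
        System.out.println(g.dominates("w1", "s", -2));
    }
}
```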

testPrecedes

public boolean testPrecedes(java.lang.Object a,
                            java.lang.Object b)
Returns true if element A precedes element B.

Specified by:
testPrecedes in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A precedes element B

testHasPointer

public boolean testHasPointer(java.lang.Object from,
                              java.lang.Object to)
Returns true if there is a pointer from the first to the second element.

Specified by:
testHasPointer in interface SearchableCorpus
Parameters:
from - start element of the pointer
to - target element of the pointer
Returns:
true if there is a pointer from the first to the second element

testHasPointer

public boolean testHasPointer(java.lang.Object from,
                              java.lang.Object to,
                              java.lang.String role)
Returns true if there is a pointer from the first to the second element with the specified role.

Specified by:
testHasPointer in interface SearchableCorpus
Parameters:
from - start element of the pointer
to - target element of the pointer
role - the role of the pointer
Returns:
true if there is a pointer from the first to the second element with the specified role

testDominatesSubgraph

public boolean testDominatesSubgraph(java.lang.Object a,
                                     java.lang.Object b)
Returns true if there is a pointer from the first element to another element which is dominated by the second element. This method may be useful for type hierarchies.

Specified by:
testDominatesSubgraph in interface SearchableCorpus
Parameters:
a - start element of the pointer
b - element which dominates target element of the pointer
Returns:
true if there is a pointer from the first element to another element which is dominated by the second element

testDominatesSubgraph

public boolean testDominatesSubgraph(java.lang.Object a,
                                     java.lang.Object b,
                                     java.lang.String role)
Returns true if there is a pointer with a specified role from the first element to another element which is dominated by the second element. This method may be useful for type hierarchies.

Specified by:
testDominatesSubgraph in interface SearchableCorpus
Parameters:
a - start element of the pointer
b - element which dominates target element of the pointer
role - the role of the pointer
Returns:
true if there is a pointer with a specified role from the first element to another element which is dominated by the second element
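
The subgraph test combines the two relations: follow a pointer from the first element, then ask whether the second element dominates the pointer's target. The sketch below shows this for a small type hierarchy. SubgraphSketch, its maps, and the example element names are all hypothetical, and roles are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a pointer from 'a' lands somewhere under 'b'.
class SubgraphSketch {
    private final Map<String, List<String>> children = new HashMap<>();
    private final Map<String, List<String>> pointers = new HashMap<>();

    void addChild(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    void addPointer(String from, String to) {
        pointers.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // An element dominates itself and everything below it.
    boolean dominates(String a, String b) {
        if (a.equals(b)) return true;
        for (String c : children.getOrDefault(a, Collections.emptyList())) {
            if (dominates(c, b)) return true;
        }
        return false;
    }

    boolean dominatesSubgraph(String a, String b) {
        for (String target : pointers.getOrDefault(a, Collections.emptyList())) {
            if (dominates(b, target)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SubgraphSketch g = new SubgraphSketch();
        g.addChild("animal", "dog");  // type hierarchy
        g.addPointer("w1", "dog");    // word w1 is typed as 'dog'
        System.out.println(g.dominatesSubgraph("w1", "animal"));
    }
}
```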

testSameExtend

public boolean testSameExtend(java.lang.Object a,
                              java.lang.Object b)
Description copied from interface: SearchableCorpus
Returns true if element A and element B have the same temporal extent, that is, a_start == b_start and a_end == b_end.

Specified by:
testSameExtend in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A and element B have the same temporal extent

testTimed

public boolean testTimed(java.lang.Object a)
Returns true if the element A is timed. Timed means either the element has explicit start and end time or all its children are timed.

Specified by:
testTimed in interface SearchableCorpus
Parameters:
a - element A
Returns:
true if the element A is timed
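
The recursive definition of "timed" can be sketched as below. TimedSketch is a hypothetical stand-in for an element; treating a childless element without explicit times as untimed is an assumption made here, since the description does not cover that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element with optional explicit times and child elements.
class TimedSketch {
    final Double start; // null when there is no explicit start time
    final Double end;   // null when there is no explicit end time
    final List<TimedSketch> children = new ArrayList<>();

    TimedSketch(Double start, Double end) {
        this.start = start;
        this.end = end;
    }

    TimedSketch add(TimedSketch child) {
        children.add(child);
        return this;
    }

    boolean isTimed() {
        if (start != null && end != null) return true;
        // Assumption: a leaf without explicit times counts as untimed.
        if (children.isEmpty()) return false;
        for (TimedSketch c : children) {
            if (!c.isTimed()) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        TimedSketch phrase = new TimedSketch(null, null)
                .add(new TimedSketch(0.0, 0.4))
                .add(new TimedSketch(0.4, 0.9));
        System.out.println(phrase.isTimed());
    }
}
```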

testOverlapsLeft

public boolean testOverlapsLeft(java.lang.Object a,
                                java.lang.Object b)
Returns true if element A overlaps left element B. Means a_start <= b_start and a_end > b_start and a_end <= b_end.

Specified by:
testOverlapsLeft in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A overlaps left element B

testLeftAlignedWith

public boolean testLeftAlignedWith(java.lang.Object a,
                                   java.lang.Object b)
Returns true if element A is left aligned with element B. Element A and B start at the same time, so a_start is equal to b_start.

Specified by:
testLeftAlignedWith in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A is left aligned with element B

testRightAlignedWith

public boolean testRightAlignedWith(java.lang.Object a,
                                    java.lang.Object b)
Returns true if element A is right aligned with element B. Element A and B stop at the same time, so a_end is equal to b_end.

Specified by:
testRightAlignedWith in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A is right aligned with element B

testIncludes

public boolean testIncludes(java.lang.Object a,
                            java.lang.Object b)
Returns true if element A temporally includes element B. Means a_start <= b_start and a_end >= b_end.

Specified by:
testIncludes in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A temporally includes element B.

testSameDuration

public boolean testSameDuration(java.lang.Object a,
                                java.lang.Object b)
Returns true if element A and element B have the same duration. Means a_start == b_start and a_end == b_end.

Parameters:
a - element A
b - element B
Returns:
true if element A and element B have the same duration

testOverlapsWith

public boolean testOverlapsWith(java.lang.Object a,
                                java.lang.Object b)
Returns true if element A overlaps element B. Means a_end > b_start and b_end > a_start.

Specified by:
testOverlapsWith in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A overlaps element B

testContactWith

public boolean testContactWith(java.lang.Object a,
                               java.lang.Object b)
Returns true if element A ends at the time element B starts. Means a_end == b_start.

Specified by:
testContactWith in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A ends at the time element B starts

testPrecedesTemporal

public boolean testPrecedesTemporal(java.lang.Object a,
                                    java.lang.Object b)
Returns true if element A temporally precedes element B. Means a_end <= b_start.

Specified by:
testPrecedesTemporal in interface SearchableCorpus
Parameters:
a - element A
b - element B
Returns:
true if element A temporally precedes element B
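
The temporal tests above reduce to comparisons of start and end times. The sketch below restates the documented conditions on a hypothetical TemporalRelationsSketch interval type; it is not the NXT implementation and ignores untimed elements.

```java
// Hypothetical interval type restating the documented conditions.
final class TemporalRelationsSketch {
    final double start;
    final double end;

    TemporalRelationsSketch(double start, double end) {
        this.start = start;
        this.end = end;
    }

    // a_start <= b_start and a_end > b_start and a_end <= b_end
    boolean overlapsLeft(TemporalRelationsSketch b) {
        return start <= b.start && end > b.start && end <= b.end;
    }

    boolean leftAlignedWith(TemporalRelationsSketch b) { return start == b.start; }

    boolean rightAlignedWith(TemporalRelationsSketch b) { return end == b.end; }

    // a_start <= b_start and a_end >= b_end
    boolean includes(TemporalRelationsSketch b) {
        return start <= b.start && end >= b.end;
    }

    // a_end > b_start and b_end > a_start
    boolean overlapsWith(TemporalRelationsSketch b) {
        return end > b.start && b.end > start;
    }

    // a_end == b_start
    boolean contactWith(TemporalRelationsSketch b) { return end == b.start; }

    // a_end <= b_start
    boolean precedes(TemporalRelationsSketch b) { return end <= b.start; }

    public static void main(String[] args) {
        TemporalRelationsSketch a = new TemporalRelationsSketch(0, 5);
        TemporalRelationsSketch b = new TemporalRelationsSketch(3, 8);
        System.out.println(a.overlapsLeft(b) + " " + a.precedes(b));
    }
}
```

Note that under these definitions two elements in contact (a_end == b_start) also satisfy the temporal precedence test but do not overlap.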

getNameOfElement

public java.lang.String getNameOfElement(java.lang.Object element)
Returns the name/type of the specified element. If the specified element isn't a NOMElement null will be returned.

Specified by:
getNameOfElement in interface SearchableCorpus
Parameters:
element - the element with the name, that will be returned
Returns:
the name/type of the specified element

getStartComparableValue

public java.lang.Comparable getStartComparableValue(java.lang.Object element)
Returns the start time as a Comparable value. If the specified element isn't a NOMElement null will be returned.

Specified by:
getStartComparableValue in interface SearchableCorpus
Parameters:
element - the element with the start time, that will be returned
Returns:
the start time as a Comparable value

getEndComparableValue

public java.lang.Comparable getEndComparableValue(java.lang.Object element)
Returns the end time as a Comparable value. If the specified element isn't a NOMElement null will be returned.

Specified by:
getEndComparableValue in interface SearchableCorpus
Parameters:
element - the element with the end time, that will be returned
Returns:
the end time as a Comparable value

getDurationComparableValue

public java.lang.Comparable getDurationComparableValue(java.lang.Object element)
Returns the temporal duration as a Comparable value. If the specified element isn't a NOMElement null will be returned.

Specified by:
getDurationComparableValue in interface SearchableCorpus
Parameters:
element - the element with the temporal duration, that will be returned
Returns:
the temporal duration as a Comparable value

getCenterComparableValue

public java.lang.Comparable getCenterComparableValue(java.lang.Object element)
Returns the center of start and end time as a Comparable value. If the specified element isn't a NOMElement null will be returned.

Specified by:
getCenterComparableValue in interface SearchableCorpus
Parameters:
element - the element with the center of start and end time, that will be returned
Returns:
the center of start and end time as a Comparable value

getIdComparableValue

public java.lang.Comparable getIdComparableValue(java.lang.Object element)
Returns the ID of an element as a Comparable value. If the specified element isn't a NOMElement null will be returned.

Specified by:
getIdComparableValue in interface SearchableCorpus
Parameters:
element - the element with the ID, that will be returned
Returns:
the ID of an element as a Comparable value

getElementsDominatedBy

public java.util.Iterator getElementsDominatedBy(java.lang.Object rootElement)
Returns an Iterator which visits each element dominated by the specified element in the corpora exactly once. The algorithm is based on the idea that there are no multiple paths between two different elements.

Specified by:
getElementsDominatedBy in interface SearchableCorpus
Parameters:
rootElement - the element which should dominate the requested elements
Returns:
an Iterator which visits each element dominated by the specified element in the corpora exactly once
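
Because the algorithm assumes there are no multiple paths between two different elements, a plain traversal already visits each element exactly once and needs no "visited" set. The sketch below illustrates this with a stack-based iterator over a parent-to-children map. DominatedIteratorSketch is a hypothetical class, elements are plain strings, and including the root itself in the results is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical depth-first iterator over everything below a root element.
class DominatedIteratorSketch implements Iterator<String> {
    private final Map<String, List<String>> children;
    private final Deque<String> stack = new ArrayDeque<>();

    DominatedIteratorSketch(Map<String, List<String>> children, String root) {
        this.children = children;
        stack.push(root); // assumption: the root counts as dominated by itself
    }

    @Override
    public boolean hasNext() {
        return !stack.isEmpty();
    }

    @Override
    public String next() {
        if (stack.isEmpty()) throw new NoSuchElementException();
        String current = stack.pop();
        // No visited set: single paths guarantee no element is pushed twice.
        for (String c : children.getOrDefault(current, Collections.emptyList())) {
            stack.push(c);
        }
        return current;
    }

    // Convenience helper: drain the iterator into a list.
    static List<String> collect(Map<String, List<String>> children, String root) {
        List<String> result = new ArrayList<>();
        Iterator<String> it = new DominatedIteratorSketch(children, root);
        while (it.hasNext()) result.add(it.next());
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("s", Arrays.asList("np", "vp"));
        tree.put("np", Arrays.asList("w1"));
        System.out.println(collect(tree, "s"));
    }
}
```

If the single-path assumption did not hold, an element reachable by two routes would be returned twice by this sketch.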

getElementsDominating

public java.util.Iterator getElementsDominating(java.lang.Object childElement)
Returns an Iterator which visits each element dominating the specified element in the corpora exactly once.

Specified by:
getElementsDominating in interface SearchableCorpus
Parameters:
childElement - the element which should be dominated by the requested elements
Returns:
an Iterator which visits each element dominating the specified element in the corpora exactly once

getElementsPointedBy

public java.util.Iterator getElementsPointedBy(java.lang.Object startElement)
Returns an Iterator which visits each element which has a pointer from the specified element in the corpora exactly once.

Specified by:
getElementsPointedBy in interface SearchableCorpus
Parameters:
startElement - the element where a pointer pointing to the requested elements starts
Returns:
an Iterator which visits each element which has a pointer from the specified element in the corpora exactly once

getElements

public java.util.Iterator getElements(java.util.List names)
Returns an Iterator which visits each element with the specified type (= name of element) in the corpora exactly once.

Specified by:
getElements in interface SearchableCorpus
Parameters:
names - list of element types (names)
Returns:
an Iterator which visits each element with the specified type (= name of element) in the corpora exactly once

getElementsOfSubgraph

public java.util.Iterator getElementsOfSubgraph(java.lang.Object pointingElement)
Returns an Iterator which visits each element of the specified subgraphs in the corpora exactly once.

Specified by:
getElementsOfSubgraph in interface SearchableCorpus
Parameters:
pointingElement - the element pointing to the subgraphs
Returns:
an Iterator which visits each element of the specified subgraphs in the corpora exactly once.

getCorpusStartTime

public double getCorpusStartTime()
returns the earliest start time of any element in the corpus (or UNTIMED if there is no timed element)

Specified by:
getCorpusStartTime in interface NOMCorpus

getCorpusEndTime

public double getCorpusEndTime()
returns the latest end time of any element in the corpus (or UNTIMED if there is no timed element)

Specified by:
getCorpusEndTime in interface NOMCorpus

getCorpusDuration

public double getCorpusDuration()
returns the duration of the corpus (last end time - earliest start time) (or UNTIMED if there are no timed elements)

Specified by:
getCorpusDuration in interface NOMCorpus
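
The three corpus-level time values relate as sketched below. CorpusTimesSketch is a hypothetical class that works on an array of {start, end} pairs, and its UNTIMED constant is an illustrative sentinel (NaN); the actual value of NOMCorpus.UNTIMED is not given here.

```java
// Hypothetical computation of corpus start, end and duration.
class CorpusTimesSketch {
    // Illustrative sentinel only; not necessarily the value NXT uses.
    static final double UNTIMED = Double.NaN;

    static double corpusStart(double[][] spans) {
        double min = UNTIMED;
        for (double[] s : spans) {
            if (Double.isNaN(min) || s[0] < min) min = s[0];
        }
        return min;
    }

    static double corpusEnd(double[][] spans) {
        double max = UNTIMED;
        for (double[] s : spans) {
            if (Double.isNaN(max) || s[1] > max) max = s[1];
        }
        return max;
    }

    // Last end time minus earliest start time, or UNTIMED if nothing is timed.
    static double corpusDuration(double[][] spans) {
        double start = corpusStart(spans);
        double end = corpusEnd(spans);
        return Double.isNaN(start) ? UNTIMED : end - start;
    }

    public static void main(String[] args) {
        double[][] spans = { {1.5, 2.0}, {0.5, 4.0}, {3.0, 3.5} };
        System.out.println(corpusDuration(spans));
    }
}
```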

getLogStream

public java.io.PrintStream getLogStream()
Return the log PrintStream


setLogStream

public void setLogStream(java.io.PrintStream ps)
Set the log PrintStream


getErrorStream

public java.io.PrintStream getErrorStream()
Return the error PrintStream


setErrorStream

public void setErrorStream(java.io.PrintStream ps)
Set the error PrintStream


addDerivedAttribute

public void addDerivedAttribute(java.lang.String oldatt,
                                java.lang.String newatt,
                                double offset)
add a derived attribute to all the relevant elements in the entire corpus. The name of the attribute is given and the value is derived from a different attribute value plus the offset given.


addDurations

public void addDurations(java.lang.String attname)
add derived 'duration' attributes to all the timed elements in the corpus. The name of the duration attribute is given.
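
The two derivations above amount to simple arithmetic on attribute values. The following sketch applies them to a single element represented as an attribute map. DerivedAttributeSketch is hypothetical, and the attribute names "start" and "end" are assumptions made for the example (in NXT the time attribute names come from the metadata).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-element version of the two derivations.
class DerivedAttributeSketch {
    // newatt = oldatt + offset; elements lacking oldatt are left alone.
    static void addDerived(Map<String, String> attrs, String oldatt,
                           String newatt, double offset) {
        String old = attrs.get(oldatt);
        if (old == null) return;
        attrs.put(newatt, String.valueOf(Double.parseDouble(old) + offset));
    }

    // attname = end - start; untimed elements are left alone.
    static void addDuration(Map<String, String> attrs, String attname) {
        String s = attrs.get("start"); // assumed attribute name
        String e = attrs.get("end");   // assumed attribute name
        if (s == null || e == null) return;
        double duration = Double.parseDouble(e) - Double.parseDouble(s);
        attrs.put(attname, String.valueOf(duration));
    }

    public static void main(String[] args) {
        Map<String, String> word = new HashMap<>();
        word.put("start", "1.5");
        word.put("end", "4.0");
        addDerived(word, "start", "shiftedstart", 2.0);
        addDuration(word, "dur");
        System.out.println(word);
    }
}
```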


getMaker

public NOMMaker getMaker()
This is used by internal corpus-building routines so we make sure we always use the right constructors.

Specified by:
getMaker in interface NOMCorpus

setQueryRewriting

public void setQueryRewriting(boolean val)
set to true to enable the new query rewrite functionality that can increase the speed of your queries

Specified by:
setQueryRewriting in interface SearchableCorpus

isQueryRewriting

public boolean isQueryRewriting()
true means we have enabled the new query rewrite functionality that can increase the speed of your queries. false (the default) means we haven't

Specified by:
isQueryRewriting in interface SearchableCorpus

setQueryRewriter

public void setQueryRewriter(QueryRewriter writer)
Enable the query rewrite functionality and select a rewriter to use (if the argument is null, query rewriting will not be enabled).

Specified by:
setQueryRewriter in interface SearchableCorpus

getQueryRewriter

public QueryRewriter getQueryRewriter()
Return the query rewriter that should be used (or null if it is not set)

Specified by:
getQueryRewriter in interface SearchableCorpus