TimeML: A Formal Specification Language for Events and Temporal Expressions
Bob Ingria and James Pustejovsky
TimeML Working Group Members: Branimir Boguraev, Jose Castano, Rob Gaizauskas, Bob Ingria, Graham Katz, Bob Knippen, Jessica Littman, Inderjeet Mani, James Pustejovsky, Antonio Sanfilippo, Andrew See, Andrea Setzer, Roser Saurí, Beth Sundheim, Svetlana Symonenko.
1.0 Introduction
This document represents the current specification of TimeML. This revision specifies the syntax of TimeML, i.e. essentially its tags and their attributes, with examples illustrating their basic use. Since the pure syntax of TimeML will often leave open how a particular phenomenon should be annotated (e.g. should modals in English be marked up as SIGNALs or EVENTs), this document leaves a number of issues underspecified. Fuller discussion of the conventions by which TimeML should be applied can be found in the accompanying annotation guidelines (Pustejovsky, et al. (2002)).
The document begins with the "leaf nodes" of TimeML: the tags that, in most cases, enclose text describing the basic temporal elements within a document. The next section introduces SIGNAL, the tag that wraps expressions that specify how temporal elements should be related. The third section deals with links, empty tags that explicitly annotate the temporal relations either marked by signals or indicated purely syntactically. The next section deals with miscellaneous other tags. The last section deals with open questions.
Inasmuch as XML is case-sensitive, it is necessary for TimeML to specify exactly the case of all its elements. This document follows the convention of indicating tag names and attribute values in all upper case (e.g. EVENT, PROGRESSIVE) and attribute names in lower or mixed case (e.g. tense, relatedToTime). Since attribute values are typically atomic (one-word) while attribute names often consist of multiple words, this convention would seem to maximize readability of the annotation. (Multi-word attribute values use the underscore character to separate their component parts.)
This document also follows the attribute naming convention introduced in Setzer (2001). Attributes that range over values of XML datatype ID---a unique index---are short, consisting of one or two characters indicating the name of the element, followed by 'id' (e.g. tid, eiid). Attributes that range over values of XML datatype IDREF---references to IDs---typically consist of the name of the element indexed, followed by 'ID' (e.g. eventID) or a descriptive name (e.g. relatedToTime).
The values of the various ID attributes are specified as beginning with one or two characters, followed by an integer. This scheme is mandated by the syntax of XML. While attribute values of type ID can consist of any sequence of letters, digits, and the hyphen, underscore, and period characters, they must begin with either an underscore or a letter. Therefore "e23" is a valid XML ID; but "23" is not. This naming convention also helps make the examples a bit more readable, especially in the case of link tags, which can contain multiple IDREFs of different kinds.
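The ID scheme just described can be checked mechanically. The following Python sketch is illustrative only and is not part of the specification: the prefix table is assembled from the schema definitions given in this document (e, ei, t, tf, s, l), and the names ID_PATTERNS and id_kind are hypothetical.

```python
import re

# Prefixes taken from the schema definitions in this specification:
# EventID e<integer>, EventInstanceID ei<integer>, TimeID t<integer>,
# TemporalFunctionID tf<integer>, SignalID s<integer>, LinkID l<integer>.
ID_PATTERNS = {
    "EventID": re.compile(r"e[0-9]+"),
    "EventInstanceID": re.compile(r"ei[0-9]+"),
    "TimeID": re.compile(r"t[0-9]+"),
    "TemporalFunctionID": re.compile(r"tf[0-9]+"),
    "SignalID": re.compile(r"s[0-9]+"),
    "LinkID": re.compile(r"l[0-9]+"),
}


def id_kind(value):
    """Return the TimeML ID type whose pattern `value` matches, or None."""
    for kind, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(value):
            return kind
    return None


print(id_kind("e23"))  # EventID
print(id_kind("23"))   # None: an XML ID may not begin with a digit
```

Because every pattern starts with a letter, any value accepted here also satisfies the XML requirement that an ID begin with a letter or an underscore.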
Finally, in the descriptions of the values of attributes, where XML DTD and XML schema definitions would differ, the schema definition is indicated between {}.
Though this document describes the full TimeML language, many of the example annotations provided show the result of annotation only through the output of initial automatic tagging combined with human annotation/editing, but do not include elements (e.g. attributes and/or attribute values) that may be introduced by later processing components (e.g. the closure tool). In particular, TIMEX3 tags that are treated as temporal functions typically appear in the examples in an underspecified form. However, those elements that do appear are sufficient for the output of manual annotation.
Finally, note that all examples in this document have been validated against a TimeML DTD corresponding to the BNF given here, using the oXygen XML editor, version 1.1.
2.0 Temporal Entities
The EVENT tag is used to annotate those elements in a text that mark the semantic events described by it. Syntactically, EVENTs are typically verbs, although event nominals, such as "crash" in "...killed by the crash", will also be annotated as EVENTs.
The EVENT tag is also used to annotate a subset of the states in a document. This subset of states includes those that are either transient or explicitly marked as participating in a temporal relation. See the TimeML annotation guidelines for more details.
attributes ::= eid class
eid ::= ID
{eid ::= EventID
EventID ::= e<integer>}
class ::= 'OCCURRENCE' | 'PERCEPTION' | 'REPORTING' | 'ASPECTUAL' | 'STATE' | 'I_STATE' | 'I_ACTION'
MAKEINSTANCE is a realization link; it indicates different instances of a given event. Since different instances can have different attribute values, the tense and aspect of the event are represented within this tag. In addition, if the instance is modified by a negation or modal operator, this is represented in the appropriate attributes within this tag. One can create as many instances as are motivated by the text. All relations indicated by the other links are stated over these instances. Because of this, every EVENT introduces at least one corresponding MAKEINSTANCE.
attributes ::= eiid eventID tense aspect nf_morph [polarity] [modality] [signalID] [cardinality]
eiid ::= ID
{eiid ::= EventInstanceID
EventInstanceID ::= ei<integer>}
eventID ::= IDREF
{eventID ::= EventID}
tense ::= 'PAST' | 'PRESENT' | 'FUTURE' | 'NONE'
aspect ::= 'PROGRESSIVE' | 'PERFECTIVE' | 'PERFECTIVE_PROGRESSIVE' | 'NONE'
nf_morph ::= 'ADJECTIVE' | 'NOUN' | 'PRESPART' | 'PASTPART' | 'INFINITIVE' | 'NONE'
polarity ::= 'NEG' | 'POS' {default, if absent, is 'POS'}
modality ::= CDATA
signalID ::= IDREF
{signalID ::= SignalID}
cardinality ::= CDATA
A MAKEINSTANCE can be considered to be a functional object that takes an EventID as its input and produces an EventInstanceID as its output.
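The requirement that every EVENT introduce at least one MAKEINSTANCE can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard xml.etree.ElementTree parser, the function name events_without_instances is hypothetical, and the sample fragment is invented for illustration.

```python
import xml.etree.ElementTree as ET


def events_without_instances(timeml_text):
    """Return the eid of every EVENT that no MAKEINSTANCE refers to."""
    root = ET.fromstring(timeml_text)
    event_ids = [event.get("eid") for event in root.iter("EVENT")]
    instantiated = {inst.get("eventID") for inst in root.iter("MAKEINSTANCE")}
    return [eid for eid in event_ids if eid not in instantiated]


# Invented fragment: e1 has an instance, e2 does not.
sample = """<TimeML>John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
and
<EVENT eid="e2" class="OCCURRENCE">left</EVENT>
</TimeML>"""

print(events_without_instances(sample))  # ['e2']
```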
We expect that the tense and aspect attributes will have their values filled in by a pre-processing program, according to the following paradigm:
Active voice, tense="PRESENT":
Verb group | aspect=
teaches | "NONE"
is teaching | "PROGRESSIVE"
has taught | "PERFECTIVE"
has been teaching | "PERFECTIVE_PROGRESSIVE"
Active voice, tense="PAST":
Verb group | aspect=
taught | "NONE"
was teaching | "PROGRESSIVE"
had taught | "PERFECTIVE"
had been teaching | "PERFECTIVE_PROGRESSIVE"
Active voice, tense="FUTURE":
Verb group | aspect=
will teach | "NONE"
will be teaching | "PROGRESSIVE"
will have taught | "PERFECTIVE"
will have been teaching | "PERFECTIVE_PROGRESSIVE"
Note: Forms marked with (?) do not seem fully acceptable. They are included to show the full logical paradigm.
Passive voice, tense="PRESENT":
Verb group | aspect=
is taught | "NONE"
is being taught | "PROGRESSIVE"
has been taught | "PERFECTIVE"
has been being taught (?) | "PERFECTIVE_PROGRESSIVE"
Passive voice, tense="PAST":
Verb group | aspect=
was taught | "NONE"
was being taught | "PROGRESSIVE"
had been taught | "PERFECTIVE"
had been being taught (?) | "PERFECTIVE_PROGRESSIVE"
Passive voice, tense="FUTURE":
Verb group | aspect=
will be taught | "NONE"
will be being taught (?) | "PROGRESSIVE"
will have been taught | "PERFECTIVE"
will have been being taught (?) | "PERFECTIVE_PROGRESSIVE"
The nf_morph attribute captures distinctions among the grammatical categories of phrases which are marked as events, but do not contain finite verbs. This attribute only contains a value other than NONE if the tense and aspect attributes both contain NONE as their value.
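The aspect column of the paradigm above and the nf_morph constraint can both be expressed as small functions. The sketch below is a heuristic that covers only the verb groups listed in the paradigm; it is not the pre-processing program this document refers to, and the names BE_FORMS, HAVE_FORMS, aspect_of, and nf_morph_is_consistent are hypothetical.

```python
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been"}
HAVE_FORMS = {"have", "has", "had"}


def aspect_of(verb_group):
    """Guess the aspect value of a verb group from the paradigm tables."""
    words = verb_group.lower().split()
    # Perfective: the group contains a form of "have".
    perfective = any(word in HAVE_FORMS for word in words)
    # Progressive: a form of "be" directly followed by an -ing form.
    progressive = any(
        first in BE_FORMS and second.endswith("ing")
        for first, second in zip(words, words[1:])
    )
    if perfective and progressive:
        return "PERFECTIVE_PROGRESSIVE"
    if perfective:
        return "PERFECTIVE"
    if progressive:
        return "PROGRESSIVE"
    return "NONE"


def nf_morph_is_consistent(tense, aspect, nf_morph):
    """nf_morph may differ from NONE only if tense and aspect are both NONE."""
    if nf_morph == "NONE":
        return True
    return tense == "NONE" and aspect == "NONE"


print(aspect_of("has been teaching"))                      # PERFECTIVE_PROGRESSIVE
print(nf_morph_is_consistent("PAST", "NONE", "INFINITIVE"))  # False
```

The heuristic would misclassify groups outside the paradigm (for example, a main-verb use of "have"), so it should be read as a restatement of the tables rather than as a general tagger.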
signalID indicates a SIGNAL that either motivates the existence of the MAKEINSTANCE, or which indicates the value of the cardinality attribute (see annotation of "John taught twice on Monday but only once on Tuesday" below for an example of this).
The possible value of cardinality is given as CDATA, i.e. any ASCII text. In reality, its values are most likely to range over the integers, along with a limited number of quantificational elements such as "EVERY", "MOST", etc. It may be possible to create a more constraining datatype (e.g. "Cardinality"), based on the string datatype, which constrains it to a fixed set of word tokens, and any sequence of digits, but we have not yet done this.
The values of polarity and modality are determined by modifiers found near the event in the text. Formerly, this information was annotated using a SIGNAL and an SLINK. Some examples:
(1) should have bought
should have
<EVENT eid="e1" class="OCCURRENCE">
bought
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="PERFECTIVE" modality="SHOULD"/>
(2) did not teach
did not
<EVENT eid="e1" class="OCCURRENCE">
teach
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE" polarity="NEG"/>
(3) must not teach twice
must not
<EVENT eid="e1" class="OCCURRENCE">
teach
</EVENT>
<SIGNAL sid="s1">
twice
</SIGNAL>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE" polarity="NEG" modality="MUST" signalID="s1" cardinality="2"/>
The TIMEX3 tag is primarily used to mark up explicit temporal expressions, such as times, dates, durations, etc. It is modeled on Setzer's (2001) TIMEX tag, as well as the TIDES (Ferro, et al. (2002)) TIMEX2 tag. Since it differs both in attribute structure and in use, it seemed best to give it a separate name, which reveals its heritage while at the same time indicating that it is different from its forebears.
attributes ::= tid type [functionInDocument] [beginPoint] [endPoint] [quant] [freq] [temporalFunction] (value | valueFromFunction) [mod] [anchorTimeID]
tid ::= ID
{tid ::= TimeID
TimeID ::= t<integer>}
type ::= 'DATE' | 'TIME' | 'DURATION' | 'SET'
beginPoint ::= IDREF
{beginPoint ::= TimeID}
endPoint ::= IDREF
{endPoint ::= TimeID}
quant ::= CDATA
freq ::= Duration
functionInDocument ::= 'CREATION_TIME' | 'EXPIRATION_TIME' | 'MODIFICATION_TIME' | 'PUBLICATION_TIME' | 'RELEASE_TIME'| 'RECEPTION_TIME' | 'NONE' {default, if absent, is 'NONE'}
temporalFunction ::= 'true' | 'false' {default, if absent, is 'false'}
{temporalFunction ::= boolean}
value ::= Duration | Date | Time | WeekDate | WeekTime | Season | PartOfYear | PaPrFu
valueFromFunction ::= IDREF
{valueFromFunction ::= TemporalFunctionID
TemporalFunctionID ::= tf<integer>}
mod ::= 'BEFORE' | 'AFTER' | 'ON_OR_BEFORE' | 'ON_OR_AFTER' |'LESS_THAN' | 'MORE_THAN' | 'EQUAL_OR_LESS' | 'EQUAL_OR_MORE' | 'START' | 'MID' | 'END' | 'APPROX'
anchorTimeID ::= IDREF
{anchorTimeID ::= TimeID}
functionInDocument, an optional attribute, indicates the function of the TIMEX3 in providing a temporal anchor for other temporal expressions in the document. If this attribute is not explicitly supplied, the default value is "NONE". The non-empty values take their names from the temporal metadata tags in the Prism draft standard (available at http://www.prismstandard.org/techdev/prismspec1.asp), and are intended to have the same interpretations:
There are several times that mark the major milestones in the life of a news resource: The time the story is published, the time it may be released (if not immediately), the time it is received by a customer, and the time that the story expires (if any). Dates and times should be represented using the W3C-defined profile of ISO 8601 [W3C-NOTE-datetime].
Table 4: Elements for Time and Date Information
Element | Role
prism:creationTime | Date and time the identified resource was first created.
prism:expirationTime | Date and time when the right to publish material expires.
prism:modificationTime | Date and time the resource was last modified.
prism:publicationTime | Date and time when the resource is released to the public.
prism:releaseTime | Earliest date and time when the resource may be distributed.
prism:receptionTime | Date and time when the resource was received on current system.
Note that there can be as many instances of TIMEX3s containing a functionInDocument attribute with a non-empty value as there are TIMEX3s that express different functions. In practice, there will probably be no more than two, one with CREATION_TIME and another with PUBLICATION_TIME, since these are likely to be the only attributes that will appear in the text of documents to be annotated. Note that RELEASE_TIME does not indicate when the document was actually released. It is a specification of when the document is allowed to be released. This comes up in documents that are syndicated and where the issuing organization wants to delay publication by syndicators, so as not to be scooped.
Note also that the Prism standard, at least in its temporal indicators, is interested only in the document as an artifact, a piece of intellectual property. This means that the Prism values do not indicate the function of a TIMEX3 relative to the internal narrative of the document. The specification of the TimeML language can fill this gap by adding values for the functionInDocument attribute that capture narrative functions. At present, we leave the specification of possible values as is, and will defer the obvious extension until annotation of existing texts indicates that this is a pressing issue.
temporalFunction, an optional attribute, indicates whether the TIMEX3 is used as a temporal function; e.g. "two weeks ago". If this attribute is not explicitly supplied, the default value is "false". It is used in conjunction with anchorTimeID, which indicates the TIMEX3 to which its denotation is applied. It also appears with valueFromFunction, a pointer to a temporal function that determines its value. As was noted above, TIMEX3 tags that behave as temporal functions are often underspecified in the example annotations below.
The datatypes specified for the value attribute---Duration, Date, Time, WeekDate, WeekTime, Season, PartOfYear, PaPrFu---are XML datatypes based on the 2002 TIDES guideline, which extends the ISO 8601 standard for representing dates, times, and durations. See the 2002 TIDES guidelines for details about the value attribute, and see the TimeML Schema (www.timeml.org/timeMLdocs/TimeML.xsd) for complete definitions of each of these datatypes.
mod is an optional attribute adopted from TIDES. It is used for temporal modifiers that cannot be expressed either within value proper, or via links or temporal functions. Some examples:
(4) no more than 60 days
<TIMEX3 tid="t1" type="DURATION" value="P60D" mod="EQUAL_OR_LESS">
no more than 60 days
</TIMEX3>
(5) the dawn of 2000
<TIMEX3 tid="t2" type="DATE" value="2000" mod="START">
the dawn of 2000
</TIMEX3>
anchorTimeID is used to point to another TIMEX3 in the case of expressions such as "last week", which have a functional interpretation. The value of anchorTimeID provides the reference point to which the functional interpretation applies.
quant and freq are used to specify sets that denote quantified times in a TIMEX3. quant is generally a literal from the text that quantifies over the expression. freq contains an integer value and a time granularity to represent any frequency contained in the set, just as a period of time is represented in a duration. Some examples:
(6) twice a month
<TIMEX3 tid="t3" type="SET" value="P1M" freq="2X">
twice a month
</TIMEX3>
(7) three days every month
<TIMEX3 tid="t4" type="SET" value="P1M" quant="EVERY" freq="3D">
three days every month
</TIMEX3>
(8) daily
<TIMEX3 tid="t5" type="SET" value="P1D" quant="EVERY">
daily
</TIMEX3>
beginPoint and endPoint are used to anchor durations to other time expressions in the document. If there is no explicit tid to assign to one of these values, then an empty TIMEX3 tag is created to represent the unspecified point. Conversely, if both the beginning and end points of a duration are explicitly stated in the document, an empty TIMEX3 tag is created to represent the unspecified duration. Some examples:
(9) two weeks from June 7, 2003
<TIMEX3 tid="t6" type="DURATION" value="P2W" beginPoint="t61" endPoint="t62">
two weeks
</TIMEX3>
<SIGNAL sid="s1">
from
</SIGNAL>
<TIMEX3 tid="t61" type="DATE" value="2003-06-07">
June 7, 2003
</TIMEX3>
<TIMEX3 tid="t62" type="DATE" value="2003-06-21" temporalFunction="true" anchorTimeID="t6"/>
(10) 1992 through 1995
<TIMEX3 tid="t71" type="DATE" value="1992">
1992
</TIMEX3>
<SIGNAL sid="s1">
through
</SIGNAL>
<TIMEX3 tid="t72" type="DATE" value="1995">
1995
</TIMEX3>
<TIMEX3 tid="t7" type="DURATION" value="P4Y" beginPoint="t71" endPoint="t72" temporalFunction="true"/>
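The value of an empty end-point TIMEX3, as in example (9), follows arithmetically from the begin point and the duration. The Python sketch below illustrates that computation under simplifying assumptions: it handles only week and day durations (PnW, PnD), and the names DURATION and end_point are hypothetical rather than part of TimeML or of any closure tool.

```python
import re
from datetime import date, timedelta

# Simplified duration syntax: weeks and/or days only, e.g. "P2W" or "P10D".
DURATION = re.compile(r"P(?:(\d+)W)?(?:(\d+)D)?")


def end_point(begin, duration):
    """Return the date reached by adding `duration` to the date `begin`."""
    match = DURATION.fullmatch(duration)
    if match is None or not any(match.groups()):
        raise ValueError("unsupported duration: " + duration)
    weeks, days = (int(group) if group else 0 for group in match.groups())
    return begin + timedelta(weeks=weeks, days=days)


# Example (9): two weeks from June 7, 2003.
print(end_point(date(2003, 6, 7), "P2W").isoformat())  # 2003-06-21
```

Month and year durations are deliberately left out, since their length depends on the calendar position of the begin point.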
attributes ::= sid
sid ::= ID
{sid ::= SignalID
SignalID ::= s<integer>}
SIGNAL is used to annotate sections of text, typically function words, that indicate how temporal objects are to be related to each other. The material marked by SIGNAL constitutes several types of linguistic elements:
indicators of temporal relations, such as temporal prepositions (e.g. "on", "during"), other temporal connectives (e.g. "when"), and subordinators (e.g. "if"). This functionality of the SIGNAL tag was introduced by Setzer (2001).
indicators of temporal quantification, such as "twice", "three times", etc.
Link tags encode the various relations that exist between the temporal elements of a document. The motivations for having multiple types of links are the following:
TLINK is a temporal link. It represents the relation between two temporal elements.
attributes ::= [lid] [origin] (eventInstanceID | timeID) [signalID] (relatedToEventInstance | relatedToTime) relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
timeID ::= IDREF
{timeID ::= TimeID}
signalID ::= IDREF
{signalID ::= SignalID}
relatedToEventInstance ::= IDREF
{relatedToEventInstance ::= EventInstanceID}
relatedToTime ::= IDREF
{relatedToTime ::= TimeID}
relType ::= 'BEFORE' | 'AFTER' | 'INCLUDES' | 'IS_INCLUDED' | 'DURING' |
'SIMULTANEOUS' | 'IAFTER' | 'IBEFORE' | 'IDENTITY' |
'BEGINS' | 'ENDS' | 'BEGUN_BY' | 'ENDED_BY'
The value of the optional origin attribute will be supplied by closure. This information and the link ID (lid) are primarily used by the closure algorithm. All links in TimeML may have these two attributes, but neither will be included in the examples presented here.
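The attribute grammar for TLINK can be read as three checks: exactly one source, exactly one target, and a relType drawn from the listed values. The following Python sketch restates those checks over a plain dictionary of attributes; the names TLINK_REL_TYPES and tlink_problems are hypothetical, and the sketch does not replace validation against the DTD or schema.

```python
# relType values copied from the TLINK grammar in this section.
TLINK_REL_TYPES = {
    "BEFORE", "AFTER", "INCLUDES", "IS_INCLUDED", "DURING",
    "SIMULTANEOUS", "IAFTER", "IBEFORE", "IDENTITY",
    "BEGINS", "ENDS", "BEGUN_BY", "ENDED_BY",
}


def tlink_problems(attrs):
    """Return a list of grammar violations for one TLINK's attributes."""
    problems = []
    # (eventInstanceID | timeID): exactly one of the two must be present.
    if ("eventInstanceID" in attrs) == ("timeID" in attrs):
        problems.append("need exactly one of eventInstanceID, timeID")
    # (relatedToEventInstance | relatedToTime): likewise exactly one.
    if ("relatedToEventInstance" in attrs) == ("relatedToTime" in attrs):
        problems.append(
            "need exactly one of relatedToEventInstance, relatedToTime"
        )
    if attrs.get("relType") not in TLINK_REL_TYPES:
        problems.append("unknown relType: %r" % attrs.get("relType"))
    return problems


good = {"timeID": "t1", "relatedToTime": "t2", "relType": "IS_INCLUDED"}
print(tlink_problems(good))  # []
```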
Examples:
(11) John taught 20 minutes every Monday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">
20 minutes
</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">
every Monday
</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
(12) John taught twice on Monday but only once on Tuesday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<SIGNAL sid="s1">
twice
</SIGNAL>
<SIGNAL sid="s2">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="xxxx-wxx-1">
Monday
</TIMEX3>
but only
<SIGNAL sid="s3">
once
</SIGNAL>
<SIGNAL sid="s4">
on
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="xxxx-wxx-2">
Tuesday
</TIMEX3>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE" signalID="s1" cardinality="2"/>
<MAKEINSTANCE eiid="ei2" eventID="e1" tense="PAST" aspect="NONE" signalID="s3" cardinality="1"/>
<TLINK eventInstanceID="ei1" signalID="s2" relatedToTime="t1" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei2" signalID="s4" relatedToTime="t2" relType="IS_INCLUDED"/>
(13) John taught 5 minutes after the explosion.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DURATION" value="PT5M" beginPoint="t2" endPoint="t3">
5 minutes
</TIMEX3>
<SIGNAL sid="s1">
after
</SIGNAL>
the
<EVENT eid="e2" class="OCCURRENCE">
explosion
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE" nf_morph="NOUN"/>
<TIMEX3 tid="t2" type="TIME" value="xxxx-xx-xx" temporalFunction="true" anchorTimeID="t1"/>
<TIMEX3 tid="t3" type="TIME" value="xxxx-xx-xx" temporalFunction="true" anchorTimeID="t1"/>
<TLINK eventInstanceID="ei2" signalID="s1" relatedToTime="t1" relType="BEGINS"/>
<TLINK eventInstanceID="ei2" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t3" relType="IS_INCLUDED"/>
Treatment of Temporal Functions:
(14) John taught from September to December last year.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
from
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="xxxx-09">
September
</TIMEX3>
<SIGNAL sid="s2">
to
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="xxxx-12">
December
</TIMEX3>
<TIMEX3 tid="t5" type="DURATION" value="P4M" beginPoint="t1" endPoint="t2" temporalFunction="true"/>
<TIMEX3 tid="t3" type="DATE" value="1995" temporalFunction="true" anchorTimeID="t4">
last year
</TIMEX3>
<TIMEX3 tid="t4" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK timeID="t1" signalID="s1" relatedToTime="t5" relType="BEGINS"/>
<TLINK timeID="t2" signalID="s2" relatedToTime="t5" relType="ENDS"/>
<TLINK eventInstanceID="ei1" relatedToTime="t5" relType="DURING"/>
(15) John taught last week.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX" temporalFunction="true" anchorTimeID="t2">
last week
</TIMEX3>
<TIMEX3 tid="t2" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="IS_INCLUDED"/>
Note: The TLINK relates TIMEX3 expressions. This is the only representation that will adequately express the temporal anchoring of this event.
(16) John taught last week on Monday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX" temporalFunction="true" anchorTimeID="t2">
last week
</TIMEX3>
<SIGNAL sid="s1">
on
</SIGNAL>
<TIMEX3 tid="t3" type="DATE" value="XXXX-WXX-1" temporalFunction="true" >
Monday
</TIMEX3>
<TIMEX3 tid="t2" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="IS_INCLUDED"/>
<TLINK timeID="t3" signalID="s1" relatedToTime="t1" relType="IS_INCLUDED"/>
SLINK is a subordination link that is used for contexts involving modality, evidentials, and factives. An SLINK is used in cases where an event instance subordinates another event instance type. These are cases where a verb takes a complement and subordinates the event instance referred to in this complement.
attributes ::= [lid] [origin] [eventInstanceID] [signalID] subordinatedEventInstance relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
subordinatedEventInstance ::= IDREF
{subordinatedEventInstance ::= EventInstanceID}
signalID ::= IDREF
{signalID ::= SignalID}
relType ::= 'MODAL' | 'EVIDENTIAL' | 'NEG_EVIDENTIAL'
| 'FACTIVE' | 'COUNTER_FACTIVE' | 'CONDITIONAL'
Note that eventInstanceID is optional because an event can be subordinated (e.g. in a conditional) without being subordinated to a particular event.
The following EVENT classes interact with SLINK:
Some lexical notes:
Verbs that introduce I_STATE EVENTs that induce SLINK:
Verbs that introduce I_ACTION EVENTs that induce SLINK:
Examples:
(17) If Graham leaves today, he will not hear Sabine.
<SIGNAL sid="s1">
if
</SIGNAL>
Graham
<EVENT eid="e1" class="OCCURRENCE">
leaves
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-XX-XX" temporalFunction="true" >
today
</TIMEX3>
he will not
<EVENT eid="e2" class="OCCURRENCE">
hear
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="FUTURE" aspect="NONE" polarity="NEG" modality="WILL"/>
Sabine.
<SLINK eventInstanceID="ei1" subordinatedEventInstance="ei2" signalID="s1" relType="CONDITIONAL"/>
<TLINK eventInstanceID="ei1" relatedToEventInstance="ei2" relType="BEFORE"/>
(18) Bill denied that John taught on Monday.
Bill
<EVENT eid="e1" class="I_ACTION">
denied
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
that John
<EVENT eid="e2" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX-1">
Monday
</TIMEX3>
<TLINK eventInstanceID="ei2" signalID="s1" relatedToTime="t1" relType="IS_INCLUDED"/>
<SLINK eventInstanceID="ei1" subordinatedEventInstance="ei2" relType="NEG_EVIDENTIAL"/>
(19) Bill wants to teach on Monday.
Bill
<EVENT eid="e1" class="I_STATE" >
wants
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE" >
teach
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE" nf_morph="INFINITIVE"/>
<SIGNAL sid="s2">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX-1">
Monday
</TIMEX3>
<TLINK eventInstanceID="ei2" signalID="s2" relatedToTime="t1" relType="IS_INCLUDED"/>
<SLINK eventInstanceID="ei1" signalID="s1" subordinatedEventInstance="ei2" relType="MODAL"/>
(20) Bill attempted to save her.
Bill
<EVENT eid="e1" class="I_ACTION">
attempted
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE">
save
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE" nf_morph="INFINITIVE"/>
her
<SLINK eventInstanceID="ei1" signalID="s1" subordinatedEventInstance="ei2" relType="MODAL"/>
ALINK is an aspectual link; it indicates an aspectual connection between two events. In some ways, it is like a cross between TLINK and SLINK in that it indicates both a relation between two temporal elements and aspectual subordination.
attributes ::= [lid] [origin] eventInstanceID [signalID] relatedToEventInstance relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
signalID ::= IDREF
{signalID ::= SignalID}
relatedToEventInstance ::= IDREF
{relatedToEventInstance ::= EventInstanceID}
relType ::= 'INITIATES' | 'CULMINATES' | 'TERMINATES' | 'CONTINUES' | 'REINITIATES'
Some examples:
(21) The boat began to sink.
The boat
<EVENT eid="e1" class="ASPECTUAL">
began
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE" >
sink
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE" nf_morph="INFINITIVE"/>
<ALINK eventInstanceID="ei1" signalID="s1" relatedToEventInstance="ei2" relType="INITIATES"/>
(22) The search party stopped looking for the survivors.
The search party
<EVENT eid="e1" class="ASPECTUAL">
stopped
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<EVENT eid="e2" class="OCCURRENCE">
looking
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="PROGRESSIVE"/>
<ALINK eventInstanceID="ei1" relatedToEventInstance="ei2" relType="TERMINATES"/>
for the survivors
In various discussions of the full TERQAS groups, the utility of being able to mark confidence values for various aspects of the annotation was pointed out. In general, it would be useful to allow confidence values to be assigned to any tag, and, in fact, to any attribute of any tag.
A convenient way to do this would be to create a confidence tag, which would consume no input, and which would have the following attributes:
attributes ::= tagType tagID [attributeName] confidenceValue
tagType ::= CDATA
tagID ::= IDREF
attributeName ::= CDATA
confidenceValue ::= CDATA
{confidenceValue ::= 0 < x < 1}
where
tagType would range over the names of all the tags of TimeML;
tagID would range over the set of actual tag IDs within the current document (XML type IDREF);
attributeName would range over the names of all the attributes of all the tags of TimeML;
confidenceValue would range over the rationals (i.e. would have a floating point value) between 0 and 1.
So, for example, given this annotation:
(23) The TWA flight
<EVENT eid="e1" class="OCCURRENCE">
crashlanded
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
on Easter Island
<TIMEX3 tid="t1" type="DURATION" value="P2W" beginPoint="t2" endPoint="t3">
two weeks ago
</TIMEX3>
<TIMEX3 tid="t3" type="DATE" value="1999-12-06" temporalFunction="true" anchorTimeID="t1"/>
...
<TIMEX3 tid="t2" type="DATE" functionInDocument="CREATION_TIME" value="1999-12-20">
12-20-1999
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t3" relType="IS_INCLUDED"/>
If we wanted to indicate that we were unsure that we had annotated "two weeks ago" correctly, we could add this annotation:
(23') <CONFIDENCE tagType="TIMEX3" tagID="t1" confidenceValue="0.50"/>
where the lack of the optional attribute, attributeName, indicates that the confidence applies to the whole tag.
On the other hand, if we wanted to indicate that we weren't sure if the tense of "crashlanded" was really "PAST", we could add this annotation:
(23'') <CONFIDENCE tagType="MAKEINSTANCE" tagID="ei1" attributeName="tense" confidenceValue="0.75"/>
Abstracting confidence measures as a separate tag frees the annotation from having to include a confidence value attribute in every tag and eliminates the problem of uncertainty over the exact attribute of a tag the confidence value applies to.
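Because confidence is held in a separate tag, a consumer can gather all confidence values into one lookup table without touching the annotated tags themselves. The Python sketch below shows one way this might be done with the standard xml.etree.ElementTree parser; the function name confidence_table and the wrapper fragment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def confidence_table(timeml_text):
    """Map (tagID, attributeName) to a float confidence value.

    attributeName is None when the confidence applies to the whole tag.
    """
    root = ET.fromstring(timeml_text)
    table = {}
    for tag in root.iter("CONFIDENCE"):
        key = (tag.get("tagID"), tag.get("attributeName"))
        table[key] = float(tag.get("confidenceValue"))
    return table


# Invented wrapper around the CONFIDENCE tag of example (23').
fragment = """<TimeML>
<TIMEX3 tid="t1" type="DURATION" value="P2W">two weeks ago</TIMEX3>
<CONFIDENCE tagType="TIMEX3" tagID="t1" confidenceValue="0.50"/>
</TimeML>"""

print(confidence_table(fragment))  # {('t1', None): 0.5}
```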
As for how confidence values should be assigned in manual annotation, we feel that, in a large-scale annotation effort such as TIMEBANK, two conditions should be satisfied:
Therefore, the annotation of a scalar value such as confidence should have at least two features:
The constraint on human annotators to a subset of the possible values should be documented in the annotation guidelines and implemented in the annotation tool. And it would probably be best if the annotation tool did not present numbers but rather natural language descriptions such as those suggested above, which would be represented in the underlying annotation numerically. For example, the annotator might pick "moderately certain", which would enter the annotation as .5.
Moreover, for manual annotation, it does not seem that the 0 and 1 values will be used/useful. Presumably if the annotator doesn't trust an annotation at all s/he won't add it. And, as was suggested above, 1, at least for manual annotation, should be the default or unmarked value, and so need not be noted, since it would bulk up the files considerably, even if it were used only on entire tags.
Inasmuch as every well-formed XML document must have a single root node, we supply TimeML as this node. For example, a sample annotated TimeML document might look like this:
<?xml version="1.0"?>
<!DOCTYPE TimeML SYSTEM "TimeML.dtd">
<TimeML>
FAMILIES SUE OVER AEROFLOT CRASH DEATHS
The Russian airline Aeroflot has been
<EVENT eid="e1" class="OCCURRENCE">
hit
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="PERFECTIVE"/>
with a writ for loss and damages,
<EVENT eid="e2" class="OCCURRENCE">
filed
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="PAST" aspect="NONE"/>
in Hong Kong by the families of seven passengers
<EVENT eid="e3" class="OCCURRENCE">
killed
</EVENT>
<MAKEINSTANCE eiid="ei3" eventID="e3" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
in
</SIGNAL>
an air
<EVENT eid="e4" class="OCCURRENCE">
crash
</EVENT>
<MAKEINSTANCE eiid="ei4" eventID="e4" tense="NONE" aspect="NONE"/>.
All 75 people
<EVENT eid="e7" class="STATE">
on board
</EVENT>
<MAKEINSTANCE eiid="ei7" eventID="e7" tense="NONE" aspect="NONE"/>
<TLINK eventInstanceID="ei7" relatedToEventInstance="ei5" relType="INCLUDES"/>
the Aeroflot Airbus
<EVENT eid="e5" class="OCCURRENCE" >
died
</EVENT>
<MAKEINSTANCE eiid="ei5" eventID="e5" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei5" signalID="s2" relatedToEventInstance="ei6" relType="IAFTER"/>
<SIGNAL sid="s2">
when
</SIGNAL>
it
<EVENT eid="e6" class="OCCURRENCE">
ploughed
</EVENT>
<MAKEINSTANCE eiid="ei6" eventID="e6" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei6" signalID="s3" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei6" relatedToEventInstance="ei4" relType="IDENTITY"/>
into a Siberian mountain
<SIGNAL sid="s3">
in
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="1994-03">
March 1994
</TIMEX3>.
...
<TIMEX3 tid="t1" type="DATE" value="1996-03-27">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="BEFORE"/>
<TLINK eventInstanceID="ei2" relatedToEventInstance="ei1" relType="BEFORE"/>
<TLINK eventInstanceID="ei3" relatedToEventInstance="ei2" relType="BEFORE"/>
<TLINK eventInstanceID="ei3" signalID="s1" relatedToEventInstance="ei4" relType="IS_INCLUDED"/>
</TimeML>
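A document like the one above can be checked for references that point at no declared ID. A validating XML parser performs this check for attributes declared as IDREF; the Python sketch below is only an approximation using the non-validating xml.etree.ElementTree parser. The attribute lists are collected from the grammars in this document, valueFromFunction is left out because the tag that declares temporal function IDs is not described here, and the function name dangling_references is hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes of type ID and IDREF, collected from the grammars above.
ID_ATTRS = ("eid", "eiid", "tid", "sid", "lid")
IDREF_ATTRS = (
    "eventID", "signalID", "eventInstanceID", "timeID",
    "relatedToEventInstance", "relatedToTime",
    "subordinatedEventInstance", "beginPoint", "endPoint",
    "anchorTimeID", "tagID",
)


def dangling_references(timeml_text):
    """Return (tag, attribute, value) for each reference to an unknown ID."""
    root = ET.fromstring(timeml_text)
    defined = {
        element.get(attr)
        for element in root.iter()
        for attr in ID_ATTRS
        if element.get(attr) is not None
    }
    dangling = []
    for element in root.iter():
        for attr in IDREF_ATTRS:
            ref = element.get(attr)
            if ref is not None and ref not in defined:
                dangling.append((element.tag, attr, ref))
    return dangling


# Invented fragment: the TLINK points at t9, which is never declared.
broken = """<TimeML>
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="1996-03-27">03-27-96</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t9" relType="BEFORE"/>
</TimeML>"""

print(dangling_references(broken))  # [('TLINK', 'relatedToTime', 't9')]
```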
Ferro, Lisa, Gerber, Laurie, Mani, Inderjeet, Sundheim, Beth, and Wilson, George. (2002) Instruction Manual for the Annotation of Temporal Expressions, MITRE Washington C3 Center, McLean, Virginia.
Setzer, Andrea (2001) Temporal Information in Newswire Articles: An Annotation Scheme and Corpus Study, Doctoral Dissertation, University of Sheffield, Sheffield, UK.
Pustejovsky, James, Saurí, Roser, Setzer, Andrea, and Ingria, Bob. (2002) TimeML Annotation Guidelines.