TimeML: A Formal Specification Language for Events and Temporal Expressions
Bob Ingria and James Pustejovsky
TimeML Working Group Members: Branimir Boguraev, Jose Castano, Rob Gaizauskas, Bob Ingria, Graham Katz, Bob Knippen, Jessica Littman, Inderjeet Mani, James Pustejovsky, Antonio Sanfilippo, Andrew See, Andrea Setzer, Roser Saurí, Beth Sundheim, Svetlana Symonenko.
1.0 Introduction
This document
represents the current specification of TimeML. This revision specifies the
syntax of TimeML, i.e. essentially its tags and their attributes, with examples
illustrating their basic use. Since the pure syntax of TimeML will often leave
open how a particular phenomenon should be annotated (e.g. should modals in
English be marked up as SIGNALs or EVENTs), this document leaves a number of
issues underspecified. Fuller discussion of the conventions by which TimeML
should be applied can be found in the accompanying annotation guidelines
(Pustejovsky, et al. (2002)).
The document
begins with the “leaf nodes” of TimeML: the tags that include texts (in most
cases) that describe the basic temporal elements within a document. The next
section introduces SIGNAL, the tag that wraps expressions that specify how
temporal elements should be related. The third section deals with links, empty
tags that explicitly annotate the temporal relations either marked by signals
or indicated purely syntactically. The next section deals with miscellaneous
other tags. The last section deals with open questions.
Inasmuch as XML is
case-sensitive, it is necessary for TimeML to specify exactly the case of all
its elements. This document follows the convention of indicating tag names and
attribute values in all upper case (e.g. EVENT, PROGRESSIVE) and attribute
names in lower or mixed case (e.g. tense, relatedToTime). Since attribute
values are typically atomic (one-word) while attribute names often consist of
multiple words, this convention would seem to maximize readability of the
annotation. (Multi-word attribute values use the underscore character to
separate their component parts.)
This document also
follows the attribute naming convention introduced in Setzer (2001). Attributes
that range over values of XML datatype ID---a unique index---are short,
consisting of one or two characters indicating the name of the element,
followed by ‘id’ (e.g. tid, eiid). Attributes that range over values of XML
datatype IDREF---references to IDs---typically consist of the name of the
element indexed, followed by ‘ID’ (e.g. eventID) or a descriptive name (e.g.
relatedToTime).
The values of the
various ID attributes are specified as beginning with one or two characters,
followed by an integer. This scheme is mandated by the syntax of XML. While
attribute values of type ID can consist of any sequence of letters, digits, and
the hyphen, underscore, and period characters, they must begin with either an
underscore or a letter. Therefore "e23"
is a valid XML ID; but "23" is not. This
naming convention also helps make the examples a bit more readable, especially
in the case of link tags, which can contain multiple IDREFs of different kinds.
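The ID scheme described above can be checked mechanically. The following is a minimal sketch in Python, not part of the TimeML specification: the prefixes are those given in the schema definitions of this document, while the table `ID_PREFIXES` and the function name `is_valid_timeml_id` are hypothetical names introduced for illustration.

```python
import re

# Prefixes as given in the schema definitions of this document.
# The dictionary itself is an illustrative assumption, not a TimeML artifact.
ID_PREFIXES = {
    "EventID": "e",
    "EventInstanceID": "ei",
    "TimeID": "t",
    "SignalID": "s",
    "LinkID": "l",
    "TemporalFunctionID": "tf",
}


def is_valid_timeml_id(value, id_type):
    """Return True if value is the prefix for id_type followed by an integer."""
    prefix = ID_PREFIXES[id_type]
    return re.fullmatch(re.escape(prefix) + r"[0-9]+", value) is not None
```

Under this sketch, "e23" is accepted as an EventID, while "23" is rejected because it lacks the leading letter that XML requires of an ID.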
Finally, in the
descriptions of the values of attributes, where XML DTD and XML schema definitions
would differ, the schema definition is indicated between {}.
Though this
document describes the full TimeML language, many of the example annotations
provided show the result of annotation only through the output of initial
automatic tagging combined with human annotation/editing, but do not include
elements (e.g. attributes and/or attribute values) that may be introduced by
later processing components (e.g. the closure tool). In particular, TIMEX3 tags
that are treated as temporal functions typically appear in the examples in an
underspecified form. However, those elements that do appear are sufficient for
the output of manual annotation.
Finally, note that
all examples in this document have been validated against a TimeML DTD corresponding
to the BNF given here, using the oXygen XML editor, version 1.1.
2.0 Temporal Entities
The EVENT tag is
used to annotate those elements in a text that mark the semantic events
described by it. Syntactically, EVENTs are typically verbs, although event
nominals, such as “crash” in “...killed by the crash”, will also be annotated
as EVENTs.
The EVENT tag is also used to annotate a subset of the states in a document. This subset of states includes those that are either transient or explicitly marked as participating in a temporal relation. See the TimeML annotation guidelines for more details.
attributes ::= eid class
eid ::= ID
{eid ::= EventID
EventID ::= e<integer>}
class ::= 'OCCURRENCE' | 'PERCEPTION' | 'REPORTING' | 'ASPECTUAL' | 'STATE' | 'I_STATE' | 'I_ACTION'
MAKEINSTANCE is a
realization link; it indicates different instances of a given event. Since
different instances can have different attribute values, the tense and aspect
of the event are represented within this tag. In addition, if the instance is
modified by a negation or modal operator, this is represented in the
appropriate attributes within this tag.
One can create as many instances as are motivated by the text. All
relations indicated by the other links are stated over these instances. Because
of this, every EVENT introduces at least one corresponding MAKEINSTANCE.
attributes ::= eiid eventID tense aspect [polarity] [modality] [signalID] [cardinality]
eiid ::= ID
{eiid ::= EventInstanceID
EventInstanceID ::= ei<integer>}
eventID ::= IDREF
{eventID ::= EventID}
tense ::= 'PAST' | 'PRESENT' | 'FUTURE' | 'NONE'
aspect ::= 'PROGRESSIVE' | 'PERFECTIVE' | 'PERFECTIVE_PROGRESSIVE' | 'NONE'
polarity ::= 'NEG' | 'POS' {default, if absent, is ‘POS’}
modality ::= CDATA
signalID ::= IDREF
{signalID ::= SignalID}
cardinality ::= CDATA
A MAKEINSTANCE can
be considered to be a functional object that takes an EventID as its input and
produces an EventInstanceID as its output.
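The requirement that every EVENT introduces at least one MAKEINSTANCE lends itself to an automatic check. The sketch below uses Python's standard `xml.etree.ElementTree` module; the function name `events_without_instances` and the sample text are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def events_without_instances(xml_text):
    """List the eid of every EVENT that no MAKEINSTANCE refers to."""
    root = ET.fromstring(xml_text)
    event_ids = [event.get("eid") for event in root.iter("EVENT")]
    instantiated = {inst.get("eventID") for inst in root.iter("MAKEINSTANCE")}
    return [eid for eid in event_ids if eid not in instantiated]


# Hypothetical sample: e2 deliberately has no MAKEINSTANCE.
SAMPLE = """<TimeML>John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
and
<EVENT eid="e2" class="OCCURRENCE">left</EVENT>
</TimeML>"""
```

Calling `events_without_instances(SAMPLE)` reports the one EVENT that still needs an instance.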
We expect that the
tense and aspect
attributes will have their values filled in by a pre-processing program,
according to the following paradigm:
Active voice:
Verb group | tense= | aspect=
teaches | "PRESENT" | "NONE"
is teaching | "PRESENT" | "PROGRESSIVE"
has taught | "PRESENT" | "PERFECTIVE"
has been teaching | "PRESENT" | "PERFECTIVE_PROGRESSIVE"
taught | "PAST" | "NONE"
was teaching | "PAST" | "PROGRESSIVE"
had taught | "PAST" | "PERFECTIVE"
had been teaching | "PAST" | "PERFECTIVE_PROGRESSIVE"
will teach | "FUTURE" | "NONE"
will be teaching | "FUTURE" | "PROGRESSIVE"
will have taught | "FUTURE" | "PERFECTIVE"
will have been teaching | "FUTURE" | "PERFECTIVE_PROGRESSIVE"
Note: Forms marked
with (?) do not seem fully acceptable. They are included to show the full
logical paradigm.
Passive voice:
Verb group | tense= | aspect=
is taught | "PRESENT" | "NONE"
is being taught | "PRESENT" | "PROGRESSIVE"
has been taught | "PRESENT" | "PERFECTIVE"
has been being taught (?) | "PRESENT" | "PERFECTIVE_PROGRESSIVE"
was taught | "PAST" | "NONE"
was being taught | "PAST" | "PROGRESSIVE"
had been taught | "PAST" | "PERFECTIVE"
had been being taught (?) | "PAST" | "PERFECTIVE_PROGRESSIVE"
will be taught | "FUTURE" | "NONE"
will be being taught (?) | "FUTURE" | "PROGRESSIVE"
will have been taught | "FUTURE" | "PERFECTIVE"
will have been being taught (?) | "FUTURE" | "PERFECTIVE_PROGRESSIVE"
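As an illustration of what such a pre-processing program might do, the sketch below derives the aspect value from the auxiliaries of a finite English verb group. It is an assumption-laden approximation that covers only the paradigm listed above: the function name `aspect_of` and the word lists are hypothetical, and it does not handle bare participles or assign tense.

```python
HAVE_FORMS = {"has", "have", "had"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been"}


def aspect_of(verb_group):
    """Approximate the aspect value for a finite verb group such as 'has been teaching'."""
    tokens = verb_group.lower().split()
    if not tokens:
        raise ValueError("empty verb group")
    auxiliaries, head = tokens[:-1], tokens[-1]
    # A form of 'have' among the auxiliaries signals the perfect.
    perfective = any(token in HAVE_FORMS for token in auxiliaries)
    # Active progressive: a form of 'be' directly before an -ing head.
    active_progressive = (
        head.endswith("ing") and bool(auxiliaries) and auxiliaries[-1] in BE_FORMS
    )
    # Passive progressive: 'being' among the auxiliaries.
    passive_progressive = "being" in auxiliaries
    progressive = active_progressive or passive_progressive
    if perfective and progressive:
        return "PERFECTIVE_PROGRESSIVE"
    if perfective:
        return "PERFECTIVE"
    if progressive:
        return "PROGRESSIVE"
    return "NONE"
```

Requiring a form of "be" before the -ing head keeps verbs whose base form ends in "ing" (for example "will sing") from being misread as progressive.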
signalID indicates a SIGNAL
that either motivates the existence of the MAKEINSTANCE, or which indicates the
value of the cardinality attribute (see annotation of “John taught twice on
Monday but only once on Tuesday” below for an example of this).
The possible value of cardinality is given as CDATA, i.e. any character string. In reality, its values are most likely to range over the integers, along with a limited number of quantificational elements such as "EVERY", "MOST", etc. It may be possible to create a more constraining datatype (e.g. “Cardinality”), based on the string datatype, which constrains it to a fixed set of word tokens, and any sequence of digits, but we have not yet done this.
The values of polarity and modality are
determined by modifiers found near the event in the text. Formerly, this
information was annotated using a SIGNAL and a SLINK. Some examples:
(1) should have bought
should have
<EVENT eid="e1" class="OCCURRENCE">
bought
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="PERFECTIVE" modality="SHOULD"/>
(2) did not teach
did not
<EVENT eid="e1" class="OCCURRENCE">
teach
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE" polarity="NEG"/>
(3) must not teach twice
must not
<EVENT eid="e1" class="OCCURRENCE">
teach
</EVENT>
<SIGNAL sid="s1">
twice
</SIGNAL>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE" polarity="NEG" modality="MUST" signalID="s1" cardinality="2"/>
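A consumer of these annotations has to apply the documented default for polarity when the attribute is absent. The following sketch, using Python's standard `xml.etree.ElementTree`, shows one way to read a MAKEINSTANCE; the function name `instance_features` is a hypothetical helper, not part of TimeML.

```python
import xml.etree.ElementTree as ET


def instance_features(makeinstance_xml):
    """Read the attributes of one MAKEINSTANCE, applying the polarity default."""
    elem = ET.fromstring(makeinstance_xml)
    return {
        "tense": elem.get("tense"),
        "aspect": elem.get("aspect"),
        # The specification states that polarity defaults to POS if absent.
        "polarity": elem.get("polarity", "POS"),
        # Optional attributes with no stated default are reported as None.
        "modality": elem.get("modality"),
        "cardinality": elem.get("cardinality"),
    }


# The MAKEINSTANCE of example (1), "should have bought".
EXAMPLE_1 = (
    '<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" '
    'aspect="PERFECTIVE" modality="SHOULD"/>'
)
```

For `EXAMPLE_1` the helper reports the modality "SHOULD" and fills in the polarity "POS", since the annotation does not state one.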
The TIMEX3 tag is
primarily used to mark up explicit temporal expressions, such as times, dates,
durations, etc. It is modeled on Setzer's (2001) TIMEX tag, as well as the
TIDES (Ferro, et al. (2002)) TIMEX2 tag. Since it differs both in attribute
structure and in use, it seemed best to give it a separate name, which reveals
its heritage while at the same time indicating that it is different from its
forebears.
attributes ::= tid type [functionInDocument] [beginPoint] [endPoint] [quant] [freq] [temporalFunction] (value | valueFromFunction) [mod] [anchorTimeID]
tid ::= ID
{tid ::= TimeID
TimeID ::= t<integer>}
type ::= 'DATE' | 'TIME' | 'DURATION' | 'SET'
beginPoint ::= IDREF
{beginPoint ::= TimeID}
endPoint ::= IDREF
{endPoint ::= TimeID}
quant ::= CDATA
freq ::= CDATA
{freq ::= duration}
functionInDocument ::= 'CREATION_TIME' | 'EXPIRATION_TIME' | 'MODIFICATION_TIME' | 'PUBLICATION_TIME' |
'RELEASE_TIME'| 'RECEPTION_TIME' | 'NONE' {default, if absent, is 'NONE'}
temporalFunction ::= 'true' | 'false' {default, if absent, is 'false'}
{temporalFunction ::= boolean}
value ::= CDATA
{value ::= duration | dateTime | time | date | gYearMonth | gYear | gMonthDay | gDay | gMonth}
valueFromFunction ::= IDREF
{valueFromFunction ::= TemporalFunctionID
TemporalFunctionID ::= tf<integer>}
mod ::= 'BEFORE' | 'AFTER' | 'ON_OR_BEFORE' | 'ON_OR_AFTER' |'LESS_THAN' | 'MORE_THAN' |
'EQUAL_OR_LESS' | 'EQUAL_OR_MORE' | 'START' | 'MID' | 'END' | 'APPROX'
anchorTimeID ::= IDREF
{anchorTimeID ::= TimeID}
functionInDocument, an
optional attribute, indicates the function of the TIMEX3 in providing a
temporal anchor for other temporal expressions in the document. If this
attribute is not explicitly supplied, the default value is "NONE". The non-empty values take their names from the
temporal metadata tags in the Prism draft standard (available at http://www.prismstandard.org/techdev/prismspec1.asp),
and are intended to have the same interpretations:
There are several times that mark the major milestones
in the life of a news resource: The time the story is published, the time it
may be released (if not immediately), the time it is received by a customer,
and the time that the story expires (if any). Dates and times should be
represented using the W3C-defined profile of ISO 8601 [W3C-NOTE-datetime].
Table 4: Elements for Time and Date Information
Element Role
prism:creationTime Date and time the identified resource was first created.
prism:expirationTime Date and time when the right to publish material expires.
prism:modificationTime Date and time the resource was last modified.
prism:publicationTime Date and time when the resource is released to the public.
prism:releaseTime Earliest date and time when the resource may be distributed.
prism:receptionTime Date and time when the resource was received on current system.
Note that there
can be as many instances of TIMEX3s containing a functionInDocument attribute with a non-empty value as there are TIMEX3s
that express different functions. In practice, there will probably be no more
than two, one with CREATION_TIME and another with PUBLICATION_TIME, since these are likely to be the only attributes that will appear in
the text of documents to be annotated. Note that RELEASE_TIME does not indicate when the document was actually
released. It is a specification of when the document is allowed to be released.
This comes up in documents that are syndicated and where the issuing
organization wants to delay publication by syndicators, so as not to be
scooped.
Note also that the
Prism standard, at least in its temporal indicators, is interested only in the
document as an artifact, a piece of intellectual property. This means that the
Prism values do not indicate the function of a TIMEX3 relative to the internal
narrative of the document. The specification of the TimeML language can fill
this gap by adding values for the functionInDocument
attribute that capture narrative functions. At present, we leave the
specification of possible values as is, and will defer the obvious extension
until annotation of existing texts indicates that this is a pressing issue.
temporalFunction, an
optional attribute, indicates whether the TIMEX3 is used as a temporal
function; e.g. “two weeks ago”. If this attribute is not explicitly supplied,
the default value is "false". It is used in conjunction with anchorTimeID, which indicates the TIMEX3 to which its denotation
is applied. It also appears with valueFromFunction,
a pointer to a temporal function that determines its value. As was noted above,
TIMEX3 tags that behave as temporal functions are often underspecified in the
example annotations below.
The values
specified for the value attribute---duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, and gMonth---are
the XML time datatypes based on the ISO 8601 standard. See http://www.w3.org/TR/xmlschema-2/
for the definitions of these and the other built-in XML schema datatypes. Since
the TIDES guidelines, which we follow in this area, extend the ISO 8601 values,
we will need to extend these data types to include these additional values.
mod is an optional
attribute adopted from TIDES. It is used for temporal modifiers that cannot be
expressed either within value proper, or via links
or temporal functions. Some examples:
(4) no more than 60 days
<TIMEX3 tid="t1" type="DURATION" value="P60D" mod="EQUAL_OR_LESS">
no more than 60 days
</TIMEX3>
(5) the dawn of 2000
<TIMEX3 tid="t2" type="DATE" value="2000" mod="START">
the dawn of 2000
</TIMEX3>
anchorTimeID is used to
point to another TIMEX3 in the case of expressions such as “last week”, which
have a functional interpretation. The value of anchorTimeID provides the reference point to which the functional
interpretation applies.
quant and freq are used to specify sets that denote quantified
times in a TIMEX3. quant is generally a literal from the text that quantifies
over the expression. freq contains an integer value and a time granularity to
represent any frequency contained in the set, just as a period of time is
represented in a duration. Some
examples:
(6) twice a month
<TIMEX3 tid="t3" type="SET" value="P1M" freq="2X">
twice a month
</TIMEX3>
(7) three days every month
<TIMEX3 tid="t4" type="SET" value="P1M" quant="EVERY" freq="3D">
three days every month
</TIMEX3>
(8) daily
<TIMEX3 tid="t5" type="SET" value="P1D" quant="EVERY">
daily
</TIMEX3>
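Since freq combines an integer with a granularity marker, a tool reading TIMEX3 tags will typically split the two. The sketch below is one possible way to do so in Python; the function name `parse_freq` is hypothetical, and the interpretation of the granularity letters is deliberately left to the caller because this document does not enumerate them.

```python
import re


def parse_freq(freq):
    """Split a freq value such as '2X' or '3D' into (count, granularity)."""
    match = re.fullmatch(r"([0-9]+)([A-Za-z]+)", freq)
    if match is None:
        raise ValueError("unrecognized freq value: " + freq)
    return int(match.group(1)), match.group(2)
```

Applied to the examples above, "2X" yields the count 2 with marker "X", and "3D" yields the count 3 with marker "D".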
beginPoint and endPoint are
used to anchor durations to other time expressions in the document. If there is no explicit tid to assign to one of these values, then a
non-consuming TIMEX3 is created to represent the unspecified point. Conversely, if both the beginning and
end points of a duration are explicitly stated in the document, a non-consuming
TIMEX3 is created to represent the unspecified duration. Some examples:
(9) two weeks from June 7, 2003
<TIMEX3 tid="t6" type="DURATION" value="P2W" beginPoint="t61" endPoint="t62">
two weeks
</TIMEX3>
<SIGNAL sid="s1">
from
</SIGNAL>
<TIMEX3 tid="t61" type="DATE" value="2003-06-07">
June 7, 2003
</TIMEX3>
<TIMEX3 tid="t62" type="DATE" value="2003-06-21" temporalFunction="true" anchorTimeID="t6"/>
(10) 1992 through 1995
<TIMEX3 tid="t71" type="DATE" value="1992">
1992
</TIMEX3>
<SIGNAL sid="s1">
through
</SIGNAL>
<TIMEX3 tid="t72" type="DATE" value="1995">
1995
</TIMEX3>
<TIMEX3 tid="t7" type="DURATION" value="P4Y" beginPoint="t71" endPoint="t72" temporalFunction="true"/>
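The value of a non-consuming end point, as in example (9), follows from the begin point and the duration. The sketch below works through that arithmetic with Python's standard `datetime` module. It is an illustration under stated assumptions, not a TimeML temporal function: the name `add_duration` is hypothetical, and only day- and week-based durations are handled.

```python
import re
from datetime import date, timedelta


def add_duration(start, duration):
    """Add a day- or week-based ISO 8601 duration (e.g. 'P2W', 'P60D') to a date."""
    match = re.fullmatch(r"P([0-9]+)([DW])", duration)
    if match is None:
        raise ValueError("unsupported duration: " + duration)
    amount, unit = int(match.group(1)), match.group(2)
    days = amount * 7 if unit == "W" else amount
    return start + timedelta(days=days)


# Example (9): begin point June 7, 2003, duration P2W.
END_POINT = add_duration(date(2003, 6, 7), "P2W").isoformat()
```

Here `END_POINT` comes out as "2003-06-21", which is the value carried by the non-consuming TIMEX3 in example (9).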
attributes ::= sid
sid ::= ID
{sid ::= SignalID
SignalID ::= s<integer>}
SIGNAL is used to
annotate sections of text, typically function words, that indicate how temporal
objects are to be related to each other. The material marked by SIGNAL
constitutes several types of linguistic elements:
indicators of temporal relations
such as temporal prepositions (e.g. “on”, “during”) and other temporal connectives (e.g. “when”) and subordinators (e.g. “if”). This functionality of the SIGNAL tag was introduced by Setzer (2001).
indicators of temporal quantification
such as “twice”, “three times”, etc.
Link tags encode
the various relations that exist between the temporal elements of a document.
The motivations for having multiple types of links are the following:
TLINK is a
temporal link. It represents the relation between two temporal elements.
attributes ::= [lid] [origin] (eventInstanceID | timeID) [signalID] (relatedToEventInstance | relatedToTime) relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
timeID ::= IDREF
{timeID ::= TimeID}
signalID ::= IDREF
{signalID ::= SignalID}
relatedToEventInstance ::= IDREF
{relatedToEventInstance ::= EventInstanceID}
relatedToTime ::= IDREF
{relatedToTime ::= TimeID}
relType ::= 'BEFORE' | 'AFTER' | 'INCLUDES' | 'IS_INCLUDED' | 'DURING' |
'SIMULTANEOUS' | 'IAFTER' | 'IBEFORE' | 'IDENTITY' |
'BEGINS' | 'ENDS' | 'BEGUN_BY' | 'ENDED_BY'
The value of the
optional origin attribute will be supplied by closure. This
information and the link ID (lid) are
primarily used by the closure algorithm.
All links in TimeML may have these two attributes, but neither will be
included in the examples presented here.
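The TLINK grammar above imposes two "exactly one of" choices and a closed set of relType values, which can be verified before a link is passed to later processing. The following Python sketch illustrates such a check; the function name `tlink_problems` and the message wording are assumptions, while the attribute names and relType values are taken from the BNF.

```python
# relType values as listed in the TLINK grammar.
TLINK_REL_TYPES = {
    "BEFORE", "AFTER", "INCLUDES", "IS_INCLUDED", "DURING",
    "SIMULTANEOUS", "IAFTER", "IBEFORE", "IDENTITY",
    "BEGINS", "ENDS", "BEGUN_BY", "ENDED_BY",
}


def tlink_problems(attrs):
    """Return a list of violations of the TLINK grammar for one attribute dict."""
    problems = []
    # (eventInstanceID | timeID): exactly one must be present.
    if ("eventInstanceID" in attrs) == ("timeID" in attrs):
        problems.append("need exactly one of eventInstanceID, timeID")
    # (relatedToEventInstance | relatedToTime): exactly one must be present.
    if ("relatedToEventInstance" in attrs) == ("relatedToTime" in attrs):
        problems.append("need exactly one of relatedToEventInstance, relatedToTime")
    if attrs.get("relType") not in TLINK_REL_TYPES:
        problems.append("unknown relType: " + str(attrs.get("relType")))
    return problems
```

An empty list means the link satisfies these three constraints; the check says nothing about whether the referenced IDs exist.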
Examples:
(11) John taught 20 minutes every Monday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">
20 minutes
</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">
every Monday
</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
(12) John taught twice on Monday but only once on Tuesday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<SIGNAL sid="s1">
twice
</SIGNAL>
<SIGNAL sid="s2">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="xxxx-wxx-1">
Monday
</TIMEX3>
but only
<SIGNAL sid="s3">
once
</SIGNAL>
<SIGNAL sid="s4">
on
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="xxxx-wxx-2">
Tuesday
</TIMEX3>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE" signalID="s1" cardinality="2"/>
<MAKEINSTANCE eiid="ei2" eventID="e1" tense="PAST" aspect="NONE" signalID="s3" cardinality="1"/>
<TLINK eventInstanceID="ei1" signalID="s2" relatedToTime="t1" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei2" signalID="s4" relatedToTime="t2" relType="IS_INCLUDED"/>
(13) John taught 5 minutes after the explosion.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DURATION" value="PT5M" beginPoint="t2" endPoint="t3">
5 minutes
</TIMEX3>
<SIGNAL sid="s1">
after
</SIGNAL>
the
<EVENT eid="e2" class="OCCURRENCE">
explosion
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE"/>
<TIMEX3 tid="t2" type="TIME" value="xxxx-xx-xx" temporalFunction="true" anchorTimeID="t1"/>
<TIMEX3 tid="t3" type="TIME" value="xxxx-xx-xx" temporalFunction="true" anchorTimeID="t1"/>
<TLINK eventInstanceID="ei2" signalID="s1" relatedToTime="t1" relType="BEGINS"/>
<TLINK eventInstanceID="ei2" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t3" relType="IS_INCLUDED"/>
Treatment of
Temporal Functions:
(14) John taught from September to December last year.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
from
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="xxxx-09">
September
</TIMEX3>
<SIGNAL sid="s2">
to
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="xxxx-12">
December
</TIMEX3>
<TIMEX3 tid="t5" type="DURATION" value="P4M" beginPoint="t1" endPoint="t2" temporalFunction="true"/>
<TIMEX3 tid="t3" type="DATE" value="1995" temporalFunction="true" anchorTimeID="t4">
last year
</TIMEX3>
<TIMEX3 tid="t4" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK timeID="t1" signalID="s1" relatedToTime="t5" relType="BEGINS"/>
<TLINK timeID="t2" signalID="s2" relatedToTime="t5" relType="ENDS"/>
<TLINK eventInstanceID="ei1" relatedToTime="t5" relType="DURING"/>
(15) John taught last week.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX" temporalFunction="true" anchorTimeID="t2">
last week
</TIMEX3>
<TIMEX3 tid="t2" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="IS_INCLUDED"/>
Note: The TLINK
relates Timex3 expressions. This is the only representation that will
adequately express the temporal anchoring of this event.
(16) John taught last week on Monday.
John
<EVENT eid="e1" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX" temporalFunction="true" anchorTimeID="t2">
last week
</TIMEX3>
<SIGNAL sid="s1">
on
</SIGNAL>
<TIMEX3 tid="t3" type="DATE" value="XXXX-WXX-1" temporalFunction="true" >
Monday
</TIMEX3>
<TIMEX3 tid="t2" type="DATE" value="1996-03-27" functionInDocument="CREATION_TIME">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="IS_INCLUDED"/>
<TLINK timeID="t3" signalID="s1" relatedToTime="t2" relType="IS_INCLUDED"/>
SLINK is a
subordination link that is used for contexts involving modality, evidentials,
and factives. An SLINK is used in cases where an event instance subordinates
another event instance type. These are cases where a verb takes a complement
and subordinates the event instance referred to in this complement.
attributes ::= [lid] [origin] [eventInstanceID] [signalID] subordinatedEventInstance relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
subordinatedEventInstance ::= IDREF
{subordinatedEventInstance ::= EventInstanceID}
signalID ::= IDREF
{signalID ::= SignalID}
relType ::= 'MODAL' | 'EVIDENTIAL' | 'NEG_EVIDENTIAL'
| 'FACTIVE' | 'COUNTER_FACTIVE'
Note that eventInstanceID is optional because an event can be subordinated
(e.g. in a conditional) without being subordinated to a particular event.
The following
EVENT classes interact with SLINK:
Some lexical
notes:
Verbs that
introduce I_STATE EVENTs
that induce SLINK:
Verbs that
introduce I_ACTION EVENTs that induce SLINK:
Examples:
(17) If Graham leaves today, he will not hear Sabine.
<SIGNAL sid="s1">
if
</SIGNAL>
Graham
<EVENT eid="e1" class="OCCURRENCE">
leaves
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE"/>
<TIMEX3 tid="t1" type="DATE" value="XXXX-XX-XX" temporalFunction="true" >
today
</TIMEX3>
he will not
<EVENT eid="e2" class="OCCURRENCE">
hear
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="FUTURE" aspect="NONE" polarity="NEG" modality="WILL"/>
Sabine.
<SLINK subordinatedEventInstance="ei1" signalID="s1" relType="MODAL"/>
<TLINK eventInstanceID="ei1" relatedToEventInstance="ei2" relType="BEFORE"/>
<SLINK eventInstanceID="ei1" subordinatedEventInstance="ei2" relType="MODAL"/>
(18) Bill denied that John taught on Monday.
Bill
<EVENT eid="e1" class="I_ACTION">
denied
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
that John
<EVENT eid="e2" class="OCCURRENCE">
taught
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX-1">
Monday
</TIMEX3>
<TLINK eventInstanceID="ei2" signalID="s1" relatedToTime="t1" relType="IS_INCLUDED"/>
<SLINK eventInstanceID="ei1" subordinatedEventInstance="ei2" relType="NEG_EVIDENTIAL"/>
(19) Bill wants to teach on Monday.
Bill
<EVENT eid="e1" class="I_STATE" >
wants
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE" >
teach
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE"/>
<SIGNAL sid="s2">
on
</SIGNAL>
<TIMEX3 tid="t1" type="DATE" value="XXXX-WXX-1">
Monday
</TIMEX3>
<TLINK eventInstanceID="ei2" signalID="s2" relatedToTime="t1" relType="IS_INCLUDED"/>
<SLINK eventInstanceID="ei1" signalID="s1" subordinatedEventInstance="ei2" relType="MODAL"/>
(20) Bill attempted to save her.
Bill
<EVENT eid="e1" class="I_ACTION">
attempted
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE">
save
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE"/>
her
<SLINK eventInstanceID="ei1" signalID="s1" subordinatedEventInstance="ei2" relType="MODAL"/>
ALINK is an
aspectual link; it indicates an aspectual connection between two events. In
some ways, it is like a cross between TLINK and SLINK in that it indicates both
a relation between two temporal elements, as well as aspectual subordination.
attributes ::= [lid] [origin] eventInstanceID [signalID] relatedToEventInstance relType
lid ::= ID
{lid ::= LinkID
LinkID ::= l<integer>}
origin ::= CDATA
eventInstanceID ::= IDREF
{eventInstanceID ::= EventInstanceID}
signalID ::= IDREF
{signalID ::= SignalID}
relatedToEventInstance ::= IDREF
{relatedToEventInstance ::= EventInstanceID}
relType ::= 'INITIATES' | 'CULMINATES' | 'TERMINATES' | 'CONTINUES' | 'REINITIATES'
Some examples:
(21) The boat began to sink.
The boat
<EVENT eid="e1" class="ASPECTUAL">
began
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
to
</SIGNAL>
<EVENT eid="e2" class="OCCURRENCE" >
sink
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="NONE"/>
<ALINK eventInstanceID="ei1" signalID="s1" relatedToEventInstance="ei2" relType="INITIATES"/>
(22) The search party stopped looking for the survivors.
The search party
<EVENT eid="e1" class="ASPECTUAL">
stopped
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<EVENT eid="e2" class="OCCURRENCE">
looking
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="NONE" aspect="PROGRESSIVE"/>
<ALINK eventInstanceID="ei1" relatedToEventInstance="ei2" relType="TERMINATES"/>
for the survivors
In various
discussions of the full TERQAS groups, the utility of being able to mark
confidence values for various aspects of the annotation was pointed out. In
general, it would be useful to allow confidence values to be assigned to any
tag, and, in fact, to any attribute of any tag.
A convenient way to do this would be to create a confidence tag, which would consume no input, and which would have the following attributes:
attributes ::= tagType tagID [attributeName] confidenceValue
tagType ::= CDATA
tagID ::= IDREF
attributeName ::= CDATA
confidenceValue ::= CDATA
{confidenceValue ::= 0 < x < 1}
where
tagType
would range over the names of all the tags of TimeML
tagID
would range over the set of actual tag IDs within the current document (XML type IDREF)
attributeName
would range over the names of all the attributes of all the tags of TimeML
confidenceValue
would range over the rationals (i.e. would have a floating point value) between 0 and 1
So, for example, given this annotation:
(23) The TWA flight
<EVENT eid="e1" class="OCCURRENCE">
crashlanded
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
on Easter Island
<TIMEX3 tid="t1" type="DURATION" value="P2W" beginPoint="t2" endPoint="t3">
two weeks ago
</TIMEX3>
<TIMEX3 tid="t3" type="DATE" value="1999-12-06" temporalFunction="true" anchorTimeID="t1"/>
...
<TIMEX3 tid="t2" type="DATE" functionInDocument="CREATION_TIME" value="1999-12-20">
12-20-1999
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t3" relType="IS_INCLUDED"/>
If we wanted to
indicate that we were unsure that we had annotated “two weeks ago” correctly,
we could add this annotation:
(23') <CONFIDENCE tagType="TIMEX3" tagID="t1" confidenceValue="0.50"/>
where the lack of
the optional attribute, attributeName, indicates
that the confidence applies to the whole tag.
On the other hand,
if we wanted to indicate that we weren't sure if the tense of “crashlanded” was
really "PAST", we could add this annotation:
(23'') <CONFIDENCE tagType="MAKEINSTANCE" tagID="ei1" attributeName="tense" confidenceValue="0.75"/>
Abstracting confidence
measures as a separate tag frees the annotation from having to include a
confidence value attribute in every tag and eliminates the problem of
uncertainty over the exact attribute of a tag the confidence value applies to.
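To make the proposal concrete, the sketch below builds a CONFIDENCE element with Python's standard `xml.etree.ElementTree`. Since CONFIDENCE is itself only a proposed tag, this is purely illustrative: the function name `make_confidence` and the two-decimal formatting are assumptions, and the range check follows the schema definition given above (0 < x < 1).

```python
import xml.etree.ElementTree as ET


def make_confidence(tag_type, tag_id, value, attribute_name=None):
    """Serialize a CONFIDENCE element for a whole tag or for one attribute."""
    # Range taken from the schema definition {confidenceValue ::= 0 < x < 1}.
    if not 0 < value < 1:
        raise ValueError("confidenceValue must lie strictly between 0 and 1")
    elem = ET.Element("CONFIDENCE", {"tagType": tag_type, "tagID": tag_id})
    # Omitting attributeName means the confidence applies to the whole tag.
    if attribute_name is not None:
        elem.set("attributeName", attribute_name)
    elem.set("confidenceValue", format(value, ".2f"))
    return ET.tostring(elem, encoding="unicode")
```

For instance, `make_confidence("TIMEX3", "t1", 0.5)` produces an element equivalent to the one in (23'), with no attributeName.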
As for how
confidence values should be assigned in manual annotation, we feel that, in a
large-scale annotation effort such as TIMEBANK, two conditions should be
satisfied:
Therefore, the
annotation of a scalar value such as confidence should have at least two
features:
The constraint on
human annotators to a subset of the possible values should be documented in the
annotation guidelines and implemented in the annotation tool. And it would
probably be best if the annotation tool did not present numbers but rather
natural language descriptions such as those suggested above, which would be
represented in the underlying annotation numerically. For example, the
annotator might pick “moderately certain”, which would enter the annotation as
.5.
Moreover, for
manual annotation, it does not seem that the 0 and 1 values will be
used/useful. Presumably if the annotator doesn't trust an annotation at all
s/he won't add it. And, as was suggested above, 1, at least for manual
annotation, should be the default or unmarked value, and so need not be noted,
since it would bulk up the files considerably, even if it were used only on
entire tags.
Inasmuch as every
well-formed XML document must have a single root node, we supply TimeML as this node. For example, a sample annotated TimeML
document might look like this:
<?xml version="1.0"?>
<!DOCTYPE TimeML SYSTEM "TimeML.dtd">
<TimeML>
FAMILIES SUE OVER AEROFLOT CRASH DEATHS
The Russian airline Aeroflot has been
<EVENT eid="e1" class="OCCURRENCE">
hit
</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PRESENT" aspect="PERFECTIVE"/>
with a writ for loss and damages,
<EVENT eid="e2" class="OCCURRENCE">
filed
</EVENT>
<MAKEINSTANCE eiid="ei2" eventID="e2" tense="PAST" aspect="NONE"/>
in Hong Kong by the families of seven passengers
<EVENT eid="e3" class="OCCURRENCE">
killed
</EVENT>
<MAKEINSTANCE eiid="ei3" eventID="e3" tense="PAST" aspect="NONE"/>
<SIGNAL sid="s1">
in
</SIGNAL>
an air
<EVENT eid="e4" class="OCCURRENCE">
crash
</EVENT>
<MAKEINSTANCE eiid="ei4" eventID="e4" tense="NONE" aspect="NONE"/>.
All 75 people
<EVENT eid="e7" class="STATE">
on board
</EVENT>
<MAKEINSTANCE eiid="ei7" eventID="e7" tense="NONE" aspect="NONE"/>
<TLINK eventInstanceID="ei7" relatedToEventInstance="ei5" relType="INCLUDES"/>
the Aeroflot Airbus
<EVENT eid="e5" class="OCCURRENCE" >
died
</EVENT>
<MAKEINSTANCE eiid="ei5" eventID="e5" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei5" signalID="s2" relatedToEventInstance="ei6" relType="IAFTER"/>
<SIGNAL sid="s2">
when
</SIGNAL>
it
<EVENT eid="e6" class="OCCURRENCE">
ploughed
</EVENT>
<MAKEINSTANCE eiid="ei6" eventID="e6" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei6" signalID="s3" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei6" relatedToEventInstance="ei4" relType="IDENTITY"/>
into a Siberian mountain
<SIGNAL sid="s3">
in
</SIGNAL>
<TIMEX3 tid="t2" type="DATE" value="1994-03">
March 1994
</TIMEX3>.
...
<TIMEX3 tid="t1" type="DATE" value="1996-03-27">
03-27-96
</TIMEX3>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="BEFORE"/>
<TLINK eventInstanceID="ei2" relatedToEventInstance="ei1" relType="BEFORE"/>
<TLINK eventInstanceID="ei3" relatedToEventInstance="ei2" relType="BEFORE"/>
<TLINK eventInstanceID="ei3" signalID="s1" relatedToEventInstance="ei4" relType="IS_INCLUDED"/>
</TimeML>
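A document of this shape can be sanity-checked by confirming that every IDREF-valued attribute points at an ID declared somewhere in the document. The sketch below does this with Python's standard `xml.etree.ElementTree`; a validating XML parser with the DTD would perform the same check. The function name `dangling_references` and the sample text are hypothetical, and the attribute lists cover only the ID and IDREF attributes defined in this document (valueFromFunction is left out because the element it points to is not defined here).

```python
import xml.etree.ElementTree as ET

# ID-valued and IDREF-valued attribute names taken from the BNF in this document.
ID_ATTRS = ("eid", "eiid", "tid", "sid", "lid")
IDREF_ATTRS = (
    "eventID", "signalID", "eventInstanceID", "timeID",
    "relatedToEventInstance", "relatedToTime", "subordinatedEventInstance",
    "beginPoint", "endPoint", "anchorTimeID",
)


def dangling_references(xml_text):
    """Return (tag, attribute, value) for every IDREF with no matching ID."""
    root = ET.fromstring(xml_text)
    declared = set()
    for elem in root.iter():
        for name in ID_ATTRS:
            if elem.get(name) is not None:
                declared.add(elem.get(name))
    dangling = []
    for elem in root.iter():
        for name in IDREF_ATTRS:
            ref = elem.get(name)
            if ref is not None and ref not in declared:
                dangling.append((elem.tag, name, ref))
    return dangling


# Hypothetical sample in which the TLINK points at an undeclared TIMEX3.
SAMPLE_DOC = """<TimeML>John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" tense="PAST" aspect="NONE"/>
<TLINK eventInstanceID="ei1" relatedToTime="t9" relType="IS_INCLUDED"/>
</TimeML>"""
```

Running `dangling_references(SAMPLE_DOC)` flags the TLINK whose relatedToTime has no corresponding TIMEX3.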
Ferro, Lisa, Gerber, Laurie, Mani, Inderjeet, Sundheim, Beth, and Wilson, George (2002) Instruction Manual for the Annotation of Temporal Expressions, MITRE Washington C3 Center, McLean, Virginia.
Setzer, Andrea (2001) Temporal Information in Newswire Articles: An Annotation Scheme and Corpus Study, Doctoral Dissertation, University of Sheffield, Sheffield, UK.
Pustejovsky, James, Saurí, Roser, Setzer, Andrea, and Ingria, Bob (2002) TimeML Annotation Guidelines.