Guidelines for Annotating Events and Temporal Information in Newswire Texts

Andrea Setzer

Contents

1  Introduction
2  Events
    2.1 Linguistic Level
    2.2 Annotating Events
3  Time Expressions
    3.1 Linguistic Level
    3.2 Annotating Time Expressions
4   Temporal Relations and Event Identity
    4.1  Linguistic Level
    4.2  Annotating Temporal Relations and Event identity
5  The process of annotation
6  To be ignored for the moment
7 Others
8 An example

1  Introduction

The overall aim is to identify events in newswire texts and to relate them to a calendrical time on a (fictional or real) timeline. We devised an annotation scheme which is as simple as possible and yet tries to identify all information necessary and available to fulfill our task.

There are four basic annotation types: events, times, temporal relations and event identity. We briefly introduce these here.

Events    Intuitively, an event is something that happens, something that one can imagine placing on a time line. Events can be viewed as conceptually instantaneous or as happening over a period of time and events can stand in coreference relationships to each other.

Times    Events happen at or during certain times. Times can be viewed as either points or intervals.

Temporal Relations    Events stand in certain temporal relations to other events and to times, for example before and after.

Event Identity    In newspaper articles, events are often introduced and then returned to later to give more information about them.

We do not want to annotate anything else, for example states (non-essential relations between entities, or the holding of a non-essential attribute, as in The plane can carry four people) and details of events (such as the logical subject). The ultimate test of whether something is to be annotated as an event is whether it is anchorable in time and whether one would like to place it on a time map.

NOTE: Headlines are not to be annotated at the moment either!
 

NOTE:  Events, times and the signals (see below) indicating the temporal relation between them have unique IDs
       throughout the text. To make the annotation easier, I created an ID counter that just counts up from one.
       This counter does not distinguish between IDs for events, times or signals, but if you stick with
       it you will at least assign unique IDs to everything. If an annotation is deleted, the counter does not take
       this into account either, so there will be gaps (which does not matter for the annotation). If an already
       annotated text is loaded into the tool, then the highest ID (plus one) is the ID that is displayed (so here
       too there might be gaps in the numbers, which is fine). You don't have to use this ID counter! Some
       annotators said they would find it helpful, so I built it in.
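The behaviour of the counter described in the note can be pictured with a small sketch. This is not the code of the annotation tool; the class name `IdCounter` and its method are hypothetical and only mirror the behaviour stated above (one shared sequence, no reuse after deletion, resuming at the highest ID plus one).

```python
class IdCounter:
    """Hypothetical sketch of a shared ID counter (not the tool's own code)."""

    def __init__(self, existing_ids=()):
        # When an annotated text is loaded, resume at the highest ID plus one.
        self._next = max(existing_ids, default=0) + 1

    def next_id(self):
        # One sequence serves events, times and signals alike.
        issued = self._next
        self._next += 1
        return issued


counter = IdCounter()
first, second = counter.next_id(), counter.next_id()
# Deleting the annotation with ID 1 does not free the number again,
# so the next ID handed out is simply the next one in the sequence.
third = counter.next_id()

# Loading a text whose IDs are 1, 4 and 9 leaves gaps, which is fine.
resumed = IdCounter(existing_ids=[1, 4, 9])
```

Gaps in the sequence are harmless because the IDs only need to be unique, not consecutive.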

2  Events

As mentioned before, events are something that `happens' and we `want to place in time'. Examples of events are the following:
 


Events must be anchorable in time, even though we might not be able to accurately do so with the information in the article. To get a better idea what events are, we can compare them with states like
 

or


States are either not anchorable in time, like the first example, or are true over time spans longer than that covered by the article (second and third example).

We distinguish different classes of events, the reason being that those classes influence the `temporal interpretation' of the associated events. This is best illustrated with the following example.


The plane hitting the water is what we call an occurrence event, one of the event classes we are interested in placing on a time map. The fisherman seeing the crash is also an occurrence event, but is related to the crash event in a particular way. The fisherman saw the crash, which means that both events happened at (roughly) the same time. To be able to exploit this special relationship (it helps to locate the crash event in time), we distinguish perception events (the fisherman observing) from occurrence events (the plane crashing). The third class of events is called reporting events (said Petty Officer Jeff Fenn), which serve a particular purpose in the genre of newspaper texts, that of giving the source of the information.

The final class of events we distinguish is aspectual events, such as

which basically follow the structure mentioned above and involve aspectual verbs like start, stop, finish etc. Their temporal consequence is that the aspectual event indicates the start or ending of the related event.
 
 

2.1 Linguistic Level

Events are usually conveyed by clauses containing finite verbs or certain nominalisations. For example:

Events can also be represented by non-finite clauses, as in

Sometimes several events can be subsumed under one event, as in

These events will be treated as one event unless one event is singled out and thus justifies being treated separately. At the moment there is no possibility to indicate this part-whole relationship.

Events can stand in certain coreference relations to each other, and the one we want to annotate is event identity. It is impossible to identify the full range of ways in which this can be expressed in text, although coreference relations between participants, animate or inanimate, are good indicators.
 

2.2 Annotating Events

We treat events as black boxes without details like logical subject or logical object, unless one of the arguments is an event itself. To make the actual annotation easier we annotate a representative of the event and not the whole clause covering it. The first candidate for event representative is the head of the finite verb group. If the event is conveyed by a nominalisation, then we choose the head of the nominalisation as the representative. If the event is represented by a non-finite clause, we annotate the non-finite verb as the representative.

The following (simplified) examples show the basics. They are simplified because they do not show attributes; we will come back to those later.

Events have a number of optional and obligatory attributes associated with them which are described below, including examples which only show the attributes relevant to the example. An example with all attributes is given further below.

eid    The event ID uniquely identifies the event in the text.
          NOTE: Please make sure that you are using UNIQUE identifiers for each event mentioned in the text, i.e.
          each time you annotate something as an event. Identical events mentioned more than once in the
          text get a different identifier each time they are mentioned. Event identity is expressed by
          using the attributes "relatedToEvent" and "eventRelType".
          For example, if we find the following in the text, then both times the crash is mentioned a
          DIFFERENT ID is given:
          The plane crashed, <event eid=2> killing </event> all passengers on board. After the <event eid=3>
          crash </event> search teams ...
          potential values: positive integers
          optional: no
          example: The plane <event eid=3> crashed </event>

class   An event belongs to one of these classes.
          potential values: OCCURRENCE, PERCEPTION, REPORTING, ASPECTUAL
          optional: no
          See next attribute for example.

argid   Reporting, perception and aspectual events usually have another event or other events as an
          argument, and the argument ID identifies these. Multiple event IDs separated by commas can
          be filled in.
          potential values: positive integers
          optional: yes
          example: The MoD <event eid=5 class=reporting argid=7> announced </event> that the jet
                   fighter <event eid=7 class=occurrence> ploughed </event> into the mountain.
                   The <event eid=8 class=occurrence> search </event>
                   <event eid=9 class=aspectual type=start argid=8 relatedToTime=6
                   timeRelType=is_included> began </event>
                   <signal sid=3> on </signal> <timex tid=6> Thursday </timex>

tense   The tense of the verb conveying the event plays a crucial role in determining the time the event
          took place, and its simple tense is recorded in this attribute. This does not hold for
          nominalisations and non-finite clauses, since these are tenseless by definition. For these the
          annotator can fill in the tense if it is clear whether the event conveyed happened in the past,
          is happening or will happen in the future, but this is not necessary. For events conveyed by
          finite clauses, however, the tense SHOULD BE FILLED IN.
          potential values:  PAST, PRESENT, FUTURE
          optional: yes
          example: The plane <event eid=3 tense=past> crashed </event>.
                   Mr Smith will <event eid=11 tense=future> make </event> an announcement
                   tomorrow afternoon.

aspect   It is sometimes helpful to know whether the aspect of the verb is progressive or perfective,
          which can be indicated using the attribute aspect.
          potential values: PROGRESSIVE, PERFECTIVE
          optional:  yes
          example: ...several vessels and a helicopter were <event eid=13 tense=past
                   aspect=PROGRESSIVE> combing </event> the area.

relatedToEvent  The ID of the event the current event is related to is stored here.
           potential values: positive integers
           optional:  yes
           See next attribute for example.

eventRelType   The type of temporal relation those two events are related by is stored in this attribute.
          potential values: BEFORE, AFTER, INCLUDES, IS_INCLUDED, SIMULTANEOUS
          optional:  yes
          example: All 75 people on board the Aeroflot Airbus <event eid=4 relatedToEvent=5
                   eventRelType=simultaneous signalID=7> died </event>
                   <signal sid=7> when </signal> it <event eid=5> ploughed </event> into a Siberian
                   mountain.

relatedToTime   The ID of the time-object the current event is related to is stored here.
          potential values: positive integers
          optional:  yes
          See next attribute for example.

timeRelType  The type of temporal relation the event and time-object are related by is stored in this
          attribute.
          potential values: BEFORE, AFTER, INCLUDES, IS_INCLUDED, SIMULTANEOUS
                   (see section 4 for a description of these relations)
          optional:  yes
          example: A small single-engine plane <event eid=9 relatedToTime=5 timeRelType=is_included
                   signalID=9> crashed </event> into the Atlantic Ocean <signal sid=9> on </signal>
                   <timex tid=5> Wednesday </timex>.

signalID  The ID of the text span that signals the temporal relation holding between two entities can be
          kept in this attribute.
          potential values: positive integers
          optional:  yes
          See relatedToEvent and relatedToTime for example.
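To make the attribute rules above concrete, here is a minimal sketch that reads the attributes of an opening event tag and checks the obligatory ones. The function names `parse_event_tag` and `check_event` are hypothetical and not part of the annotation tool; the sketch assumes unquoted attribute values as in the examples, and it also accepts `identity` for eventRelType because section 4.2 uses that value for event identity.

```python
import re

EVENT_CLASSES = {"OCCURRENCE", "PERCEPTION", "REPORTING", "ASPECTUAL"}
RELATIONS = {"BEFORE", "AFTER", "INCLUDES", "IS_INCLUDED", "SIMULTANEOUS"}
# Assumption: event identity (section 4.2) is stored as eventRelType=identity.
ALLOWED = {"eventRelType": RELATIONS | {"IDENTITY"}, "timeRelType": RELATIONS}


def parse_event_tag(tag):
    """Return the attributes of an opening <event ...> tag as a dict of strings."""
    match = re.fullmatch(r"<event\s+([^>]*)>", tag.strip())
    if match is None:
        raise ValueError("not an opening event tag")
    # Values are unquoted; argid may hold several IDs separated by commas.
    return dict(re.findall(r"(\w+)=([\w,]+)", match.group(1)))


def check_event(attrs):
    """Return a list of problems; an empty list means the tag looks fine."""
    problems = []
    if not attrs.get("eid", "").isdigit() or int(attrs["eid"]) < 1:
        problems.append("eid must be a positive integer")
    if attrs.get("class", "").upper() not in EVENT_CLASSES:
        problems.append("class is missing or unknown")
    for name, allowed in ALLOWED.items():
        if name in attrs and attrs[name].upper() not in allowed:
            problems.append(f"{name} has an unknown value")
    return problems


good = parse_event_tag(
    "<event eid=9 class=occurrence relatedToTime=5 timeRelType=is_included signalID=9>"
)
bad = parse_event_tag("<event eid=4 timeRelType=included>")
```

Here `check_event(good)` reports no problems, while `check_event(bad)` flags the missing class and the unknown relation name.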

Note about event class:
Causal and subevent relationships would come to mind as relationships one might want to annotate. But they are very difficult to define and to distinguish, which is why they are not included as an event class. If the annotator encounters such a relation then this can only be annotated using the attribute eventRelType and choosing the appropriate temporal relation (usually cause precedes effect and subevents are temporally included in their `container' events).  Examples for what can be feasibly interpreted as a causality or subevent relation are the following:

3  Time Expressions

3.1 Linguistic Level

Like events, times can be viewed as having extent (intervals) or as being punctual (points). Rather than trying to reduce one perspective to the other, as has happened in much philosophical discussion on time, we shall simply treat both as time objects. It must, however, be possible to associate a calendrical time with the time object. This is not possible with expressions like, for example, recently, and accordingly such an expression is not a time object. Examples of time expressions are:

However, time expressions can be quite complex, as in the next example, where the whole expression denotes a point in time:

The general structure is that a time interval is explicitly related to an event, and the calendrical time this refers to is the time 17 seconds after the event.
 

3.2 Annotating Time Expressions

All time expressions have to be anchorable in time to be annotated. This means that it has to be possible to attach a calendrical date to the time expression, even though we might not be able to do so accurately (for example, we might only be able to associate the year 2000 or spring 2000 with a time expression). We simply annotate the text span representing the time object, so last Tuesday needs to have last included because that part is needed to identify the Tuesday in question. Whenever possible, though, a calendar date has to be attached to the time expression. This can often be worked out by looking at the date of the article, so that expressions like "last Tuesday" can be associated with a calendar date.

Following general convention, and the approach taken in MUC, we distinguish between two classes of time objects, dates and times, units which are larger or smaller than a day, respectively. In addition, we flag complex time objects, as in the example before, as complex.

Like events, times have a unique ID so they can be uniquely identified in the text. All time expressions, simple and complex, have the following attributes:

tid    The  ID uniquely identifies the time object in the text.
          potential values: positive integers
          optional:  no

type  The type of the time object.
          potential values:  DATE, TIME, COMPLEX
          optional:  no

calDate   The calendrical date represented by the time object.
          potential values: [[[[HH]MM]DD]MM]YYYY or (`SPR'|`SUM'|`AUT'|`WIN')YYYY (24 hour clock)
          optional:  no
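As an illustration of how a calDate value can be read, the following sketch splits a value into named fields. It is only a sketch under assumptions: the function name `parse_cal_date` is hypothetical, and it assumes that the fields are written in the order hours, minutes, day, month, year with the leading fields optional, so that it accepts YYYY, MMYYYY, DDMMYYYY, HHMMDDMMYYYY and a season code followed by YYYY.

```python
import re


def parse_cal_date(value):
    """Split a calDate string into named fields (field order is an assumption)."""
    season = re.fullmatch(r"(SPR|SUM|AUT|WIN)(\d{4})", value)
    if season:
        return {"season": season.group(1), "year": int(season.group(2))}
    if not value.isdigit() or len(value) not in (4, 6, 8, 12):
        raise ValueError(f"unrecognised calDate: {value!r}")
    # The year always comes last; less precise dates drop the leading fields.
    fields = {"year": int(value[-4:])}
    if len(value) >= 6:
        fields["month"] = int(value[-6:-4])
    if len(value) >= 8:
        fields["day"] = int(value[-8:-6])
    if len(value) == 12:
        # Assumption: a clock time is written as HHMM in front of the date.
        fields["hour"], fields["minute"] = int(value[:2]), int(value[2:4])
    return fields


full_date = parse_cal_date("29111996")   # day, month and year
month_only = parse_cal_date("041995")    # month and year
spring = parse_cal_date("SPR2000")       # season and year
```

The sketch does not check that the numbers form a valid date (for example that the month is between 1 and 12); that would be a natural extension.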

Examples of time expressions are:

The more complex time objects mentioned above have the following additional attributes:

eid    The ID of the event the time interval is related to.
          potential values: positive integers
          optional:  yes

signalID    The ID of the signal representing the type of temporal relation.
          potential values: positive integers
          optional:  yes

relType    The temporal relation holding between the time interval and the event.
          potential values:  BEFORE, AFTER, INCLUDES, IS_INCLUDED, SIMULTANEOUS
          optional:  yes

An example of an annotated complex time expression is the following:

And the following example shows a situation involving an indefinite duration (see section below): where the indefinite duration is not annotated at all.
 

Miscellaneous Non-Time Expressions
 

Other

Special days, such as holidays referred to by name (for example All Saints' Day), are to be tagged.
 

4   Temporal Relations and Event Identity

4.1  Linguistic Level

Events stand in certain temporal relations to other events and to times. Times may be temporally related to other times as well, although this does not happen very often in the articles we have analysed so far. We do not mark up relations between times and times. All that needs to be done is to fill in the calendar date for times, and the temporal relations between them will be worked out automatically later.

Temporal relations are either expressed explicitly or implicitly. Explicitly expressed are those where, for example, temporal prepositional phrases, temporal adverbial phrases and temporal subordinate conjunctions are used. The first example shows how an event is related to another event (by using the subordinate conjunction when) and then how an event is related to a time (by using the temporal preposition in).

Two more examples to show the use of temporal subordinate conjunctions.

Temporal prepositional phrases are usually employed when relating events to times, as shown here:

Neither relating events to events nor relating events to times need necessarily be explicit, as this example shows:

Here the preposition on is omitted when relating the crash event to Sunday. And the killing event is related to the crash event by using a non-finite subordinate clause.

Events can stand in certain coreference relationships to each other and the relation we want to annotate is identity, as in the following example.

where the first sentence introduces the event and the second sentence refers to the same event - event identity for crashed.

The full set of temporal relations we suppose at present is:

is_included      The plane crashed on Wednesday.

                 The event is completely included in the time or other event.

includes         By midafternoon, several vessels were combing the area.

                 This is the reverse of is_included, where we know that the combing event includes
                 midafternoon completely.

after            The plane crashed after the pilot and his crew ejected.

                 This relation, as well as the following before, is chosen when an event is clearly after
                 (or before) another event.

before           ...before the craft fell, its three rotor blades shot off.

simultaneous     All 75 people on board the Aeroflot Airbus died when it ploughed into a Siberian
                 mountain in March 1994.

                 Events can happen at exactly the same time (like identical events) or overlap and we
                 know how they overlap exactly. But often we only know that two events happen
                 `roughly at the same time' without knowing exactly whether and how they overlap.
                 In both cases (when we know exactly how they are overlapping or we only roughly
                 know) we choose the relation simultaneous.

4.2  Annotating Temporal Relations and Event identity

Events may be related to times or to other events. If two events are related, then one of them carries the ID of the other as an attribute to link them, and also the type of the relation. Which event has those attributes associated with it depends on the type of relation (apart from simultaneous, which is symmetric). If, for example, event a is before event b, then event a could carry the ID of event b and the temporal relation before, or event b could carry the ID of event a and the temporal relation after.
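The choice of which event carries the attributes can be pictured as flipping a relation to its inverse. The following is a minimal sketch; the table `INVERSE` and the function `flip` are hypothetical names introduced only for illustration.

```python
# Each relation paired with the relation seen from the other entity's side.
INVERSE = {
    "before": "after",
    "after": "before",
    "includes": "is_included",
    "is_included": "includes",
    "simultaneous": "simultaneous",  # the same in both directions
}


def flip(source_id, relation, target_id):
    """Re-express 'source relation target' from the target's point of view."""
    return target_id, INVERSE[relation.lower()], source_id


# "event 1 before event 2" says the same as "event 2 after event 1".
flipped = flip(1, "before", 2)
```

Either form records the same information, so the annotator is free to attach the attributes to whichever event is more convenient.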

If the word or text span signaling that two events are related is realised, as in the example below, then not only is the signal annotated but the ID of the signal is also stored as an attribute of the event, so the link between the signaling word and the events is not lost. If an event is related to a time object, then the event carries the ID of the time object and the type of relation in its attributes. Should the signaling word be realised, again as in the example, then the ID of the signal becomes an attribute of the event. The signal has only one attribute, the unique identifier sid.

The following examples illustrate the approach.

If the temporal relation is implicit and the signal (a preposition, for example) is omitted, then the ID of the signal is simply left out, as in the following example.

The annotation of two identical events is very similar, the only difference being that the relation type is set to identity.
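Because the links are stored as plain IDs, a simple check can catch links that point at something that was never annotated. The sketch below is an illustration only: the function `dangling_links` and the representation of an event as a dictionary are assumptions, not part of the annotation tool.

```python
def dangling_links(events, time_ids, signal_ids):
    """List (eid, attribute, value) triples that point at a missing ID."""
    event_ids = {event["eid"] for event in events}
    targets = {
        "relatedToEvent": event_ids,
        "relatedToTime": set(time_ids),
        "signalID": set(signal_ids),
    }
    return [
        (event["eid"], name, event[name])
        for event in events
        for name, known in targets.items()
        if name in event and event[name] not in known
    ]


# "died when it ploughed": event 4 is linked to event 5 through signal 7.
events = [
    {"eid": 4, "relatedToEvent": 5, "eventRelType": "simultaneous", "signalID": 7},
    {"eid": 5},
]
complete = dangling_links(events, time_ids=[], signal_ids=[7])
missing_signal = dangling_links(events, time_ids=[], signal_ids=[])
```

With the signal annotated the check finds nothing; without it, the signalID of event 4 is reported as pointing nowhere.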
 

5  The process of annotation

Please undertake only steps 1 to 4 for the time being, and for the FIRST text only, until we have established that the Guidelines are good enough! (A manual for the annotation tool (ps version) can be found here)

It is recommended to go through 5 phases when annotating.

  1. Annotate all events and times, without paying attention to relating them.
  2. Relate events to events or times where this is explicitly signaled. This will be done by annotating the signaling text span and adding the appropriate attributes to the event (or one of the events).
  3. Relate all time objects that are not yet linked to an event.
  4. Go through the events and annotate event identity.
  5. After all events, times, most explicit and some implicit temporal relations have been annotated, press the "Calculate Closure" button. The program will prompt for unknown temporal relations (and try to infer as many as possible from the information given so far). Please fill these in as accurately as possible, but do not hesitate to mark relations as "unknown" when unsure.

     Mistakes cannot be corrected in this phase, because of the computational complexity involved in backtracking all the inferences that were drawn since the last input. The program draws all possible inferences from the information given in the annotation and then asks for more information about temporal relations it does not know yet. Every time an answer is put in, it draws more inferences. So even if it seems to ask about every possible relation between every time and every event, it actually does not do that. But it does ask a lot (depending on the text and on the inferences possible).

     All annotations will be "switched off" (i.e. made invisible) once the computing of the closure starts, and only the entities involved in the question will be highlighted, to make this task a little bit easier.

     Afterwards, press the "View Results" button and save the results under an appropriate name.
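To give a feel for what the closure phase does, here is a deliberately small sketch. It is not the algorithm of the annotation tool: it only uses the transitivity of before (if a is before b and b is before c, then a is before c), and the function names `before_closure` and `open_questions` are hypothetical.

```python
from itertools import product


def before_closure(pairs):
    """Return every (a, b) with 'a before b' implied by transitivity."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(tuple(closed), repeat=2):
            if b == c and (a, d) not in closed:
                closed.add((a, d))
                changed = True
    return closed


def open_questions(ids, closed):
    """Pairs whose order is still unknown and would have to be asked about."""
    return [
        (a, b)
        for a in ids
        for b in ids
        if a < b and (a, b) not in closed and (b, a) not in closed
    ]


# Annotated: event 7 before 6, event 6 before 5, event 9 before 8.
known = {(7, 6), (6, 5), (9, 8)}
closed = before_closure(known)
questions = open_questions([5, 6, 7, 8, 9], closed)
```

In this toy input one relation (7 before 5) is inferred without asking, and six pairs remain open. Every answer the annotator gives can trigger further inferences, which is why the number of questions is usually smaller than the number of possible pairs.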

6  To be ignored for the moment

  1. Negated sentences (i.e. where the verb is negated).
  2. Conditionals (e.g. If China would have responded ...), counterfactuals and hypothetical expressions.

7 Others

Annotate the date of the article at the bottom as "doa".
 

8 An example


NAVY BLAMES SHOWING OFF FOR JET CRASH

   WASHINGTON  The Navy said(e1) Friday(t1) that a pilot was probably
showing off(e2) for his parents when(s1) he crashed(e3) an F-14A jet fighter in
Nashville in(s2) January(t2), killing(e4) himself, a fellow officer and three
people on the ground.

(e1)  Event:    eid=1 class=reporting tense=past argEvent=2
(t1)  Time:     tid=1 type=date calDate=29111996
(e2)  Event:    eid=2 class=occurrence tense=past aspect=progressive relatedToEvent=3
                eventRelType=simultaneous
(s1)  Signal:   sid=1
(e3)  Event:    eid=3 class=occurrence tense=past relatedToTime=2 timeRelType=is_included
                signalID=2
(s2)  Signal:   sid=2
(t2)  Time:     tid=2 type=date calDate=011996
(e4)  Event:    eid=4 class=occurrence tense=past relatedToEvent=3
                eventRelType=is_included
                NOTE: I would have annotated this as a subevent, but since I can't do this I
                      included the temporal relation "is_included".


   The pilot, Lt. Comdr. John Stacy Bates, had been grounded(e5) for a
month in(s3) April 1995(t3) after(s4) he lost(e6) control of another F-14A after(s5)
taking off(e7) from the aircraft carrier USS Lincoln. The plane crashed(e8)
after(s6) the pilot and his crew ejected(e9). The crew members were
rescued(e10).

(e5)  Event:    eid=5 class=occurrence tense=past aspect=perfective relatedToTime=3
                timeRelType=is_included signalID=3
(s3)  Signal:   sid=3
(t3)  Time:     tid=3 type=date calDate=041995
(s4)  Signal:   sid=4
(e6)  Event:    eid=6 class=occurrence tense=past relatedToEvent=5 eventRelType=before signalID=4
(s5)  Signal:   sid=5
(e7)  Event:    eid=7 class=occurrence tense=past relatedToEvent=6 eventRelType=before
(e8)  Event:    eid=8 class=occurrence tense=past
                NOTE: This is not the same crash event as the one in the first paragraph!
(s6)  Signal:   sid=6
(e9)  Event:    eid=9 class=occurrence tense=past relatedToEvent=8 eventRelType=before
(e10) Event:    eid=10 class=occurrence tense=past

   Rear Adm. Bernard Smith, who investigated(e11) the Nashville crash(e12),
said Bates' judgment was influenced by his parents' presence at the
field'' and his desire to show them risky takeoff and flight
maneuvers.

(e11) Event:    eid=11 class=occurrence tense=past
(e12) Event:    eid=12 class=occurrence tense=past relatedToEvent=3 eventRelType=identity

   The admiral said(e13) that in(s7) taking off(e14) from the Nashville airport
on Jan. 29, Bates ascended(e15) at an angle steeper than 50 degrees,
violating(e16) Navy rules.

(e13) Event:    eid=13 class=reporting tense=past argEvent=14,15,16
(s7)  Signal:   sid=7
(e14) Event:    eid=14 class=occurrence tense=past aspect=progressive
                NOTE: This is not the same take off as event 7!
(e15) Event:    eid=15 class=occurrence tense=past relatedToEvent=14
                eventRelType=is_included signalID=7
(e16) Event:    eid=16 class=occurrence tense=past aspect=progressive relatedToEvent=15
                eventRelType=simultaneous

   After(s8) the near-perpendicular takeoff(e17), in the clouds, Smith said(e18),
the pilot became(e19) disoriented, and in all likelihood did not realize
that he was heading(e20) earthward until(s9) his jet pierced(e21) the clouds at
2,300 feet.

(s8)  Signal:   sid=8
(e17) Event:    eid=17 class=occurrence tense=past relatedToEvent=14
                eventRelType=identity
(e18) Event:    eid=18 class=reporting tense=past argEvent=17,19,20,21,22
(e19) Event:    eid=19 class=occurrence tense=past relatedToEvent=17 eventRelType=after
                signalID=8
(e20) Event:    eid=20 class=occurrence aspect=progressive tense=past relatedToEvent=17
                eventRelType=after
(e21) Event:    eid=21 class=occurrence tense=past aspect=progressive
(s9)  Signal:   sid=9
(e22) Event:    eid=22 class=occurrence tense=past relatedToEvent=21 eventRelType=after
                signalID=9
   By then, the admiral said, it was too late to prevent the plane
from smashing into the house of Elmer and Ada Newsom.
(e23) Event:    eid=23 class=occurrence tense=past aspect=progressive relatedToEvent=3
                eventRelType=is_included

NOTE: "prevent" could be interpreted as an event, but is hypothetical and thus excluded. And because "prevent" is not annotated, the reporting event "said" is not annotated because reporting events are only marked up if their argument events are marked up.

   In addition to Bates, 33, and his radar intercept officer, Lt.
Graham Alden Higgins, 28, those killed(e24) were the Newsoms and Ewing
Wair, who was visiting(e25) them.

(e24) Event:    eid=24 class=occurrence tense=past relatedToEvent=3
                eventRelType=is_included
                NOTE: I interpret the killing event as a subevent of the crash, so the
                      temporal relation is is_included.
(e25) Event:    eid=25 class=occurrence tense=past aspect=progressive relatedToEvent=24
                eventRelType=simultaneous
                NOTE: This relation would be open for discussion. One could say that the
                      visiting event includes the crash event, but I would argue that that
                      is not all that clear and thus I put "simultaneous".

   The crash would never have happened had the pilot made a normal
take-off or taken the cloudy weather into account, Smith said after
an 11-week review of the incident.
NOTE: No events are annotated here for the same reason as above (events being hypothetical).


04-12-96

(doa)  This is the date of the article. No attributes involved.