TempEval Test Data

February 28th, 2007

This document is largely a rehash of the readme.html document in the training data distribution, but the last paragraphs of the "Data Description" section are different and a section on how to submit results is added to the end.

We describe the TempEval data, the way they were created, the validation and scoring scripts that are bundled with the data, and the format for submissions. This document does not replace the task description on the SemEval and TempEval websites, but complements it.

Data Description

The TempEval annotation language is a simplified version of TimeML. The TimeML specifications, annotation guidelines and document type definition (all for TimeML version 1.2.1) are included here for easy reference. For TempEval, we use the following five tags:
<TempEval>
The document root.
<s>
The sentence tag. All sentence tags in the TempEval data are created automatically with the Alembic natural language processing tools. A sentence tag can contain TIMEX3 tags and EVENT tags, but no TLINK tags.
<TIMEX3>
Tags the time expressions in the text. It is identical to the TIMEX3 tag in TimeML. See the TimeML specifications and guidelines for further details on this tag and its attributes. Each document has one special TIMEX3 tag, the Document Creation Time, which is interpreted as an interval that spans the whole day.
<EVENT>
Tags the events in the text. The TempEval EVENT tag merges the information of two TimeML tags: EVENT and MAKEINSTANCE. TimeML uses these two tags to distinguish the two instances of a single event in sentences like "He taught on Wednesday and Friday". This complication was not necessary for the TempEval data. Both tags and their attributes are described in the TimeML specifications and guidelines. For TempEval task C, one extra attribute is added: mainevent, with possible values YES and NO.
<TLINK>
A simplified version of the TimeML TLINK tag. The relation types of the TimeML version form a fine-grained set based on James Allen's interval logic (James Allen, "Maintaining Knowledge about Temporal Intervals", Communications of the ACM 26(11), 832-843, November 1983). For TempEval, we use only three relations, plus three disjunctions over them: BEFORE, OVERLAP, AFTER, BEFORE-OR-OVERLAP, OVERLAP-OR-AFTER, and VAGUE. Here, OVERLAP refers to two events (or an event and a time interval) with a non-empty overlap. VAGUE is used for cases where no particular relation can be established.
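
To make the tag inventory concrete, here is a hypothetical fragment showing how the tags nest. The attribute names follow TimeML conventions, but the values are invented for illustration; the DTD included in this distribution is the authoritative reference for the exact attribute inventory.

```xml
<TempEval>
  <TIMEX3 tid="t0" type="DATE" value="1998-02-06"
          functionInDocument="CREATION_TIME">02/06/1998</TIMEX3>
  <s>The company
    <EVENT eid="e1" class="OCCURRENCE" tense="PAST">announced</EVENT>
    the merger on
    <TIMEX3 tid="t1" type="DATE" value="1998-02-05">Thursday</TIMEX3>.</s>
  <TLINK lid="l1" relType="UNKNOWN" eventID="e1" relatedToTime="t1"/>
</TempEval>
```

Note that the TLINK sits outside the sentence tag, as required by the sentence tag description above.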
The test data contain all event and timex information, including, for task C, markers indicating the main events of each sentence. In addition, the test data in data/taskAB contain all TLINK tags required by tasks A and B, and the data in data/taskC contain all links needed for task C. However, the relType attribute of each TLINK is set to UNKNOWN. The task is to replace the UNKNOWN values with one of the six allowed values listed above.
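
As a sketch of the mechanics involved, the following Python fragment (not part of the distribution) locates every TLINK whose relType is UNKNOWN and fills in a value. The constant VAGUE stands in for a real classifier's prediction, the sample document uses only a subset of the attributes found in the actual data, and the code assumes the documents parse as plain XML.

```python
import xml.etree.ElementTree as ET

def fill_unknown(xml_text, predict=lambda tlink: "VAGUE"):
    """Replace relType="UNKNOWN" on every TLINK with a predicted value.

    The default predictor is a constant placeholder; a real system
    would choose one of the six TempEval relations per link.
    """
    root = ET.fromstring(xml_text)
    for tlink in root.iter("TLINK"):
        if tlink.get("relType") == "UNKNOWN":
            tlink.set("relType", predict(tlink))
    return ET.tostring(root, encoding="unicode")

# Tiny invented sample in the spirit of the test data.
sample = (
    '<TempEval>'
    '<s>He <EVENT eid="e1">resigned</EVENT> on '
    '<TIMEX3 tid="t1">Friday</TIMEX3>.</s>'
    '<TLINK lid="l1" relType="UNKNOWN" eventID="e1" relatedToTime="t1"/>'
    '</TempEval>'
)
result = fill_unknown(sample)
print(result)
```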

It is not necessary to determine which events are on the Event Target List (ETL). Recall that the Event Target List consists of those events that occur 20 times or more in the corpus. A complete list of stems, ordered by frequency, is included in the docs directory (only stems occurring more than once are on the list).

The data directory has two subdirectories, one with the data for tasks A and B and one with the data for task C. Both contain 20 documents.

It should be noted that the test set included here does not quite follow the specifications in the task definition, which stated that we would annotate 20-25 documents drawn from a source like TimeBank and that these documents would contain at least 5 instances of every event on the ETL. This proved infeasible. Instead, we opted for the more traditional approach of splitting TimeBank into a training set and a test set.

Annotation Procedure

The EVENT and TIMEX3 annotations were taken from TimeBank (http://timeml.org/site/timebank/timebank.html). The annotation procedure for TLINKs included dual annotation by seven annotators using a web-based annotation interface (see the screen shot page for more details). After this phase, two experienced annotators adjudicated all cases where the two annotators chose different relation types. For task C, there was an extra annotation phase in which the main events were selected. Annotation guidelines for main event annotation are included in this distribution.

Validation

Included with the test data are a Perl validation script and a Document Type Definition (DTD) for TempEval annotation. All files in the test set have been validated. Note that this DTD differs from the one given with the training set in that it allows UNKNOWN as a value of the relType attribute.

To validate TempEval files using the DTD, open a terminal window (Linux/Unix/MacOSX) or a command prompt (Windows) and type the following:

% perl validate.pl ../data/taskAB
% perl validate.pl ../data/taskC
This will write validation errors and warnings to the standard output. All lines marked INFO-300 can be ignored; in general, they report on reference counts. On Unix/Linux systems, these lines can be filtered out with:
% perl validate.pl ../data/taskAB | grep -v INFO-300
% perl validate.pl ../data/taskC | grep -v INFO-300

The script requires the Perl modules XML::Checker and XML::RegExp, both available from CPAN (http://www.cpan.org).

Evaluation

Also included with the training data is a Perl scoring script. It measures precision and recall using a strict and a relaxed scoring scheme. See the evaluation document in the docs directory for more details.
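
For intuition only, the difference between the two schemes can be sketched in Python. Mapping each label to the set of basic relations it allows, and granting partial credit for set overlap, is an invented stand-in for the official weighting; the evaluation document in the docs directory defines the actual scheme.

```python
# Each TempEval label, mapped to the set of basic relations it allows.
ALLOWS = {
    "BEFORE": {"BEFORE"},
    "OVERLAP": {"OVERLAP"},
    "AFTER": {"AFTER"},
    "BEFORE-OR-OVERLAP": {"BEFORE", "OVERLAP"},
    "OVERLAP-OR-AFTER": {"OVERLAP", "AFTER"},
    "VAGUE": {"BEFORE", "OVERLAP", "AFTER"},
}

def score(guesses, answers):
    """Return (strict, relaxed) scores over paired relation labels.

    Strict: full credit only for an exact label match.
    Relaxed: partial credit proportional to the overlap between the
    relation sets the two labels allow (illustrative formula only).
    """
    strict = sum(g == a for g, a in zip(guesses, answers)) / len(answers)
    relaxed = sum(
        len(ALLOWS[g] & ALLOWS[a]) / len(ALLOWS[g] | ALLOWS[a])
        for g, a in zip(guesses, answers)
    ) / len(answers)
    return strict, relaxed

strict, relaxed = score(
    ["BEFORE", "BEFORE-OR-OVERLAP", "AFTER"],
    ["BEFORE", "BEFORE", "OVERLAP"],
)
print(strict, relaxed)
```

Here BEFORE-OR-OVERLAP against a gold BEFORE gets no strict credit but half relaxed credit, which is the kind of distinction the relaxed scheme is designed to capture.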

Submitting Results

Results need to be submitted to the SemEval organizers by file upload. Please prepare a zip file or gzipped tar file that contains two directories:
taskAB/
taskC/
Directory taskAB should contain all 20 documents from data/taskAB with UNKNOWN values of the relType attribute replaced with one of the six TempEval relations. Similarly, taskC should contain all 20 documents from data/taskC with UNKNOWN values replaced. Participants who choose not to participate in task C can leave this directory empty.
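
As a convenience, the archive can be built with a few lines of Python; any packaging tool, such as zip or tar, works equally well. The file name doc1.tml and the .tml extension below are made up for the demo.

```python
import os
import tempfile
import zipfile

def package_submission(base, out, dirs=("taskAB", "taskC")):
    """Zip the result directories under base into out, with paths
    relative to base so the archive unpacks as taskAB/ and taskC/."""
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for d in dirs:
            zf.writestr(d + "/", "")  # keep the directory entry even if empty
            top = os.path.join(base, d)
            for root, _, files in os.walk(top):
                for name in sorted(files):
                    path = os.path.join(root, name)
                    zf.write(path, os.path.relpath(path, base))

# Demo on a throwaway tree; a real submission would contain the 20
# annotated documents per task.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "taskAB"))
os.makedirs(os.path.join(base, "taskC"))  # may stay empty if task C is skipped
with open(os.path.join(base, "taskAB", "doc1.tml"), "w") as f:
    f.write("<TempEval/>")
out = os.path.join(base, "submission.zip")
package_submission(base, out)
print(zipfile.ZipFile(out).namelist())
```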

Questions or Comments?

Please direct questions to tempeval@timeml.org.