SemEval 2010: VP Ellipsis Processing
The Phenomenon
Verb Phrase Ellipsis (VPE) occurs in the English language when an auxiliary or modal verb abbreviates an entire verb phrase recoverable from the linguistic context, as in the following examples:
- Both Dr. Mason and Dr. Sullivan [oppose federal funding for abortion], as *does* President Bush, except in cases where a woman's life is threatened.
- They also said that vendors were [delivering goods] more quickly in October than they *had* for each of the five previous months.
- He spends his days [sketching passers-by], or trying *to*.
Here each occurrence of VPE is marked with asterisks, and its antecedent is enclosed in square brackets.
The Task
The proposed shared task consists of two subtasks: (1) automatically detecting VPE in free text; and (2) selecting the textual antecedent of each detected VPE. Subtask 1 is reasonably difficult (Nielsen 2004 reports an F-score of 71% on Wall Street Journal data).
Subtask 2 is challenging. Using a "head match" evaluation, Hardt 1997 reports a success rate of 62% for a baseline system based on recency only, and an accuracy of 84% for an improved system that takes recency, clausal relations, parallelism, and quotation into account. We will make the task more realistic (and more difficult) by replacing head match with precision and recall computed over each token of the antecedent.
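To make the token-level evaluation concrete, the following is a minimal sketch of how precision and recall over antecedent tokens could be computed for a single VPE. It is not the official scorer: the names `Span`, `span_tokens`, and `token_precision_recall` are hypothetical, and the sketch assumes that an antecedent is given as a line number plus begin/end token positions, with the end position inclusive.

```python
from typing import NamedTuple, Set, Tuple


class Span(NamedTuple):
    """Hypothetical antecedent span: line number plus begin/end token positions.

    Assumption: the end position is inclusive.
    """
    line: int
    begin: int
    end: int


def span_tokens(span: Span) -> Set[Tuple[int, int]]:
    """Expand a span into the set of (line, token position) pairs it covers."""
    return {(span.line, pos) for pos in range(span.begin, span.end + 1)}


def token_precision_recall(gold: Span, predicted: Span) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over the tokens of one antecedent."""
    gold_set = span_tokens(gold)
    pred_set = span_tokens(predicted)
    overlap = len(gold_set & pred_set)

    precision = overlap / len(pred_set) if pred_set else 0.0
    recall = overlap / len(gold_set) if gold_set else 0.0
    # F1 is the harmonic mean; it is defined as 0 when nothing overlaps.
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


# A system that finds only the first three of five gold antecedent tokens.
gold = Span(line=1, begin=7, end=11)
predicted = Span(line=1, begin=7, end=9)
precision, recall, f1 = token_precision_recall(gold, predicted)
```

In this example every predicted token is correct, so precision is 1.0, but only three of the five gold tokens are recovered, so recall is 0.6. Token-level scoring thus gives partial credit for an incomplete antecedent and penalises both missing and superfluous tokens.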
We will provide texts where sentence boundaries are detected and each sentence is tokenised and printed on a new line. An occurrence of VPE is marked by a line number plus token positions of the auxiliary or modal verb. Textual antecedents are assumed to be on one line, and are marked by the line number plus begin/end token position.
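The stand-off scheme described above can be illustrated with a short sketch that maps an annotation back to the tokens it refers to. The exact file format is not specified here, so everything in the sketch is an assumption: the `StandoffSpan` class and the `resolve` function are hypothetical names, line numbers and token positions are taken to be 1-based, end positions are taken to be inclusive, and tokens are taken to be separated by single spaces. The tokenisation of the example sentence is likewise only illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StandoffSpan:
    """Hypothetical stand-off reference into a tokenised text.

    Assumptions: 1-based line and token numbering, inclusive end position.
    """
    line: int
    begin: int
    end: int


def resolve(lines: List[str], span: StandoffSpan) -> List[str]:
    """Return the tokens that a stand-off span points to."""
    tokens = lines[span.line - 1].split()
    # Convert the 1-based inclusive range to a 0-based Python slice.
    return tokens[span.begin - 1 : span.end]


# One tokenised sentence per line, as in the provided texts.
text = ["He spends his days sketching passers-by , or trying to ."]

vpe = StandoffSpan(line=1, begin=10, end=10)        # the stranded "to"
antecedent = StandoffSpan(line=1, begin=5, end=6)   # "sketching passers-by"

print(resolve(text, vpe))         # ['to']
print(resolve(text, antecedent))  # ['sketching', 'passers-by']
```

Keeping the annotation separate from the text in this way means the tokenised sentences stay untouched, and each VPE and antecedent is recoverable from a line number and two token positions.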
The Data
As development data, we provide stand-off annotation for nearly 500 manually annotated occurrences of VPE in the Wall Street Journal part (all 25 sections) of the Penn Treebank. It is available here.
Organisation
Participation
If you'd like to participate in this shared task, please contact the organisers by email. We will set up a mailing list to keep all participants up to date on all matters of the task.
Organisers
- Johan Bos (University of Rome "La Sapienza")
- Jennifer Spenader (University of Groningen)
SemEval 2010
The VP Ellipsis Processing shared task is one of the evaluation exercises organised at SemEval 2010. The time period for SemEval 2010 has not yet been finalised, but it will be held over a two-month period in the first part of 2010. The trial data for the VP Ellipsis Processing task will be released no later than July 2009 (perhaps even earlier). Postscript: The shared task has been postponed to a future instance of SemEval.
References
- Johan Bos and Jennifer Spenader (2011): An annotated corpus for the analysis of VP ellipsis, Language Resources and Evaluation
- Daniel Hardt (1997): An Empirical Approach to VP Ellipsis, Computational Linguistics 23(4):525-541
- Leif A. Nielsen (2004): Verb phrase ellipsis detection using automatically parsed text, Proceedings of the 20th International Conference on Computational Linguistics (Geneva, Switzerland), pp. 1093-1099