EVALDA
The EVALDA project has been financed by the French Ministry of Research in the context of its Technolangue programme. The aim of the project was to establish a permanent evaluation infrastructure for the language engineering sector in France and for the French language.
The aim of such a project was to put together reusable components such as organisation, logistics, language resources, evaluation protocols, methodologies and metrics, as well as major actors in the field (scientific advisory boards, panels of experts, partners etc). This made it possible to capitalise on the results of previous experiments, and also to favour collaborative research and the setting up of new and improved evaluation campaigns. It was imperative that the evaluations envisaged in this project could be reproduced by third parties, using the resources assembled over the course of the project, in order to enable a genuine comparison of system performance and benchmarking of the state of the art in language engineering. All evaluation resources were made available in the ELRA catalogue at the end of the project in the form of evaluation packages.
A second aim of the project was to set up evaluation campaigns involving several linguistic technologies including both written and spoken media. Industrial and academic partners took part in the project. The campaigns were largely based around black box evaluation protocols and quantitative methods, drawing and expanding upon previous evaluation campaigns, such as ARC-AUPELF, GRACE, TREC etc.
Each evaluation campaign was largely independent; however, a certain amount of synergy between the campaigns was envisaged. This involved the sharing of know-how, resources or even personnel.
The linguistic technologies to evaluate were chosen from those that appeared to be the most crucial in the field. Details on the selected projects are provided in the sections below, along with the link to the corresponding Evaluation Package in the catalogue.
Action de Recherche Concertée sur l’Alignement de Documents et son Evaluation
Evaluation of bilingual text and vocabulary alignment systems. Following the success of ARCADE I, this follow-up campaign aims to evaluate alignments between more distant or 'exotic' languages such as Greek, Russian, Japanese and Chinese.
Introduction
The ARCADE project, started in 1995 and completed in 1999, was designed to provide standard methods for the evaluation and comparison of French-English parallel text alignment systems. ARCADE II aims to explore the techniques of multilingual text alignment through a fine-grained evaluation of the existing techniques and the development of new alignment methods.
ARCADE II consists of two tracks devoted to the evaluation of alignment at sentence level and word level respectively. It differs from the previous ARCADE campaign in its multilingual aspect and its investigation of lexical alignment. The languages concerned include 5 European languages (English, French, German, Italian and Spanish) and 6 languages with different writing systems (Arabic, Russian, Chinese, Japanese, Greek and Persian). Multilingual reference corpora have been made available for the evaluation exercise.
Contact : Khalid Choukri - choukri@elda.org
For more information (in French), please visit technolangue.net.
Méthodologie d’Evaluation automatique de la compréhension hors et en contexte du DIAlogue
Evaluation of man-machine dialogue systems. The task envisaged is hotel room reservation (including some local tourist information).
MEDIA Evaluation Package & MEDIA Speech Database for French
Introduction
The aim of the MEDIA evaluation campaign is to test an automatic evaluation methodology for man-machine dialogue systems. The evaluation methodology is based on a paradigm that uses test sets taken from a corpus of real-world dialogues, a semantic representation of dialogue and common evaluation metrics. This protocol is designed to test the understanding capacity of dialogue systems both with and without taking the context of the dialogue into account.
In order to validate the evaluation protocol and the semantic representations, an evaluation campaign will take place in which each partner in the project tests their system. The task chosen is hotel room reservation, with tourist information as an additional point of entry into the dialogue.
The final Media Workshop took place at the Sainte-Marthe University in Avignon, France, on July 6-7 2006.
Contact : Khalid Choukri - choukri@elda.org
For more information (in French), please visit technolangue.net.
Campagne d’Evaluation de Systèmes de Traduction Automatique
Evaluation of machine translation systems. French is to be the pivotal language; however, several source and target languages paired with French are envisaged (English, Spanish, German, Arabic), according to the capabilities of the participants' systems.
Introduction
The CESTA campaign proposes a series of evaluation campaigns of machine translation systems for various language pairs towards French. The statistical metrics BLEU/NIST (IBM) are being used for the evaluations and adapted to French as a target language, along with other automatic metrics based on grammatical and semantic scores (X-Score and D-Score). The Weighted N-gram Model (WNM), WER and PER are also used. The other aim of CESTA is to conduct a meta-evaluation, comparing the automatic results with human judgments.
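Two of the metrics named above, WER (word error rate) and PER (position-independent error rate), are simple enough to sketch. The following Python example is a minimal illustration that assumes the commonly used textbook definitions of both metrics, whitespace tokenisation and a single reference translation; it is not the CESTA scoring software, and the function names `wer` and `per` are chosen here for illustration only.

```python
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate: like WER, but word order is ignored."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Number of words shared by both sentences, counted as multisets.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - matches
    return errors / len(ref)


if __name__ == "__main__":
    reference = "le chat dort sur le canapé"
    hypothesis = "le chat sur le canapé dort"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"PER = {per(reference, hypothesis):.3f}")
```

In this example the hypothesis contains the same words as the reference in a different order, so PER is zero while WER is not. This is the reason the two metrics are often reported together: the gap between them gives a rough indication of how much of the error is due to word order.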
Coordinator
ELDA
Participants
Université de Lille 3, IDIS/CESARTES
Ecole Polytechnique Fédérale de Lausanne, LIA
Université de Leeds
Temis S.A.
Systran S.A.
Softissimo S.A.
CIMOS S.A.
Université de Grenoble, IMAG
Université de Montréal, Dept. Linguistique et Traduction
Université de Montréal, RALI
Université de Genève, ISSCO
University of Aachen, RWTH
Universitat Politècnica de Catalunya (UPC)
SDL International
Comprendium S.L.
Contact : Khalid Choukri - choukri@elda.org
For more information (in French), please visit technolangue.net.
CESART - Evaluation de Systèmes d’Acquisition de Ressources Terminologiques
Evaluation of terminology extraction tools, including tools for extracting ontologies and semantic relations. Evaluation is to take place with reference to a predetermined list of terms/relations.
Introduction
The CESART project deals with the user-oriented evaluation of tools for acquiring terminological resources. This kind of user-oriented evaluation relies on the support of experts in information management who are capable of assessing terminological data and confirming usage. The aim is to propose and validate an evaluation protocol that makes it possible to objectively evaluate and compare different systems for terminology applications such as terminological resource creation and semantic relation extraction. The project also aims to create quality-controlled resources such as domain-specific corpora, automatic scoring tools, etc.
CESART consists of two tracks devoted to the evaluation of term extraction and term structuring. Five French-language terminology acquisition tools participated in the CESART evaluation exercise. As these tools are based on different models and designed for different applications, two evaluation tasks have been defined, term extraction and semantic relation extraction (synonymy), in order to take into account the context in which these tools are used.
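When the output of a term extraction tool is compared with a predetermined reference list of terms, one common way to summarise the result is with precision, recall and F1. The Python sketch below illustrates this under stated assumptions: exact matching after lowercasing and trimming, and a function name (`score_terms`) and sample terms invented for the illustration. The text above does not specify the measures actually used in CESART, whose protocol also relies on expert judgment.

```python
def score_terms(extracted, reference):
    """Return (precision, recall, f1) of extracted terms against a reference list."""
    # Normalise both lists so that case and surrounding spaces do not matter.
    extracted_set = {term.strip().lower() for term in extracted}
    reference_set = {term.strip().lower() for term in reference}
    correct = len(extracted_set & reference_set)
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(reference_set) if reference_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical tool output and reference list, for illustration only.
    extracted = ["Base de données", "réseau de neurones", "le système"]
    reference = [
        "base de données",
        "réseau de neurones",
        "apprentissage automatique",
        "analyse syntaxique",
    ]
    precision, recall, f1 = score_terms(extracted, reference)
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest possible choice; a real protocol would probably need to decide how to treat morphological variants and partially correct terms, which is where expert assessment comes in.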
Contact : Khalid Choukri - choukri@elda.org
For more information, please visit technolangue.net (in French).
Evaluation des Analyseurs Syntaxiques du français
An evaluation campaign designed to test syntactic parsers. A side effect of the campaign is the creation of a syntactically parsed reference text composed of several genres of text (newspapers, literary texts, electronic texts etc).
Introduction
The EASy project is dedicated to the evaluation of syntactic analysers for the French language. The project is financed by the French Ministry of Research in the context of the Technolangue programme.
The aim of the EASy campaign is to design and test an evaluation methodology for comparing syntactic analysers on French, and to produce a large validated linguistic resource obtained by automatically combining the annotated corpora produced. The corpora consist of texts taken from various domains (literature, medicine, technical, general, ...) and of different types: newspapers, questions, websites, oral transcriptions, ...
The project will last 24 months. The evaluation campaign is currently running and will last until 15th December 2004.
Contacts
Khalid Choukri
Coordinators
ELDA
Corpora providers
Participants
ERSS
Contact : Khalid Choukri - choukri@elda.org
For more information (in French), please visit technolangue.net.
Evaluation en Question-Réponse
Evaluation of Question/Answering systems. Three reference corpora are envisaged : a large general corpus (newspapers, general texts), a web corpus and a corpus made up of medical texts.
Introduction
The EQueR Evaluation Campaign provides an evaluation framework for Question/Answering systems for the French language. It aims at giving pertinent input to this research activity by providing it with a state of the art, especially in France.
EQueR includes two tasks of automatic answer retrieval: a generic task over a heterogeneous collection of texts, mainly newspaper articles, and a specialised task over a corpus of medical texts.
Participants
ELDA / ELRA, Organiser
CISMEF Centre Hospitalier de Rouen
Systal / Pertimm S.A.S.
France Telecom R&D, DMI/GRI
iSmart S.A.R.L.
CNRS/Université d’Avignon, Laboratoire d’Informatique d’Avignon (LIA)
CEA, Laboratoire d’ingénierie de la connaissance multimédia multilingue (LIC2M)
CNRS, Laboratoire d’Informatique pour la Mécanique et les Sciences de l’Ingénieur (LIMSI)
Université de Neuchâtel, Laboratoire Interfacultaire d’Informatique
Sinequa S.A.S.
Assistance Publique / Hôpitaux de Paris, Sciences et Technologies de l’Information Médicale (STIM)
Synapse S.A.
Scientific committee
Brigitte Grau, LIMSI - Chair
Patrice Bellot, LIA
Michel Benoit, iSmart
Malek Boualem, France Telecom R&D
Mohand Boughanem, IRIT
Patrick Constant, Systal
Olivier Ferret, CEA
Martine Hurault-Plantet, LIMSI
Dominique Laurent, Synapse
Claude de Loupy, Sinequa
Jacques Savoy, Université de Neuchâtel
Pierre Zweigenbaum, STIM
Contact : Khalid Choukri - choukri@elda.org
For more information (in French), please visit technolangue.net.
Evaluation des Systèmes de Transcription Enrichie d’émissions Radiophoniques
Evaluation of automatic broadcast news transcription systems. This campaign includes the evaluation of segmentation tasks and identification of named entities.
ESTER Evaluation Package & ESTER Corpus
Introduction
The purpose of the ESTER campaign is to evaluate the performance of broadcast news transcription systems.
Contact : Khalid Choukri - choukri@elda.org
For more information, please visit technolangue.net (in French).
Evaluation des Synthétiseurs de parole en français
Evaluation of speech synthesis systems. This campaign is to feature a novel method for the evaluation of prosody in synthesised speech.
Introduction
The EVASY project is dedicated to the evaluation of speech synthesis systems for the French language. The project is financed by the French Ministry of Research in the context of the Technolangue programme.
This evaluation campaign is intended to expand upon the ARC-AUPELF (now AUF) campaign of 1996-1999, the only previous evaluation campaign for text-to-speech systems for the French language. The EvaSy campaign is subdivided into three components :
For more information about the project and the related work-in-progress report, please contact:
Contacts
Khalid Choukri - choukri@elda.org
Christophe d’Alessandro (LIMSI)
Coordinator
ELDA
Consortium Partners
DELIC
Bell Labs - Lucent Technologies
CRISCO
Elan Speech
ICP
LIMSI
LATL
LIA
MULTITEL ASLB
For more information (in French), please visit technolangue.net.