EVALDA

The EVALDA project was financed by the French Ministry of Research as part of its Technolangue programme. The aim of the project was to establish a permanent evaluation infrastructure for the language engineering sector in France and for the French language.

The infrastructure was meant to bring together reusable components such as organisation, logistics, language resources, evaluation protocols, methodologies and metrics, as well as major actors in the field (scientific advisory boards, panels of experts, partners, etc.). This made it possible to capitalise on the results of previous experiments, and also to encourage collaborative research and the setting up of new and improved evaluation campaigns. It was imperative that the evaluations envisaged in this project could be reproduced by third parties, using the resources assembled over the course of the project, in order to enable a genuine comparison of system performance and benchmarking of the state of the art in language engineering. All evaluation resources were made available in the ELRA catalogue at the end of the project in the form of an evaluation package.

A second aim of the project was to set up evaluation campaigns involving several linguistic technologies, covering both written and spoken media. Industrial and academic partners took part in the project. The campaigns were largely based on black-box evaluation protocols and quantitative methods, drawing and expanding upon previous evaluation campaigns such as ARC-AUPELF, GRACE and TREC.

Each evaluation campaign was largely independent; however, a certain amount of synergy between the campaigns was envisaged. This involved the sharing of know-how, resources or even personnel.

The linguistic technologies to evaluate were chosen according to which appeared to be the most crucial in the field. Details on the selected projects are provided in the notebook below, along with the link to the corresponding Evaluation Package in the catalogue.

Méthodologie d’Evaluation automatique de la compréhension hors et en contexte du DIAlogue (Methodology for the automatic evaluation of dialogue understanding, in and out of context)

Evaluation of man-machine dialogue systems. The task considered is hotel room reservation, including some local tourist information.

MEDIA Evaluation Package & MEDIA Speech Database for French

Introduction

The aim of the MEDIA evaluation campaign is to test an automatic evaluation methodology for man-machine dialogue systems. The methodology is based on a paradigm that uses test sets taken from a corpus of real-world dialogues, a semantic representation of dialogue and common evaluation metrics. The protocol is designed to test the understanding capacity of dialogue systems both with and without taking the context of the dialogue into account.

In order to validate the evaluation protocol and the semantic representations, an evaluation campaign will take place in which each partner in the project tests their system. The task chosen is hotel room reservation, with tourist information as an additional point of entry into the dialogue.

The final MEDIA Workshop took place at the Sainte-Marthe University in Avignon, France, on July 6-7, 2006.

Contact: Khalid Choukri - choukri@elda.org

For more information (in French), please visit technolangue.net.