These proceedings contain the final versions of the papers presented at the 7th International Workshop on Finite-State Methods and Natural Language Processing, FSMNLP 2008. The workshop was held in Ispra, Italy, on September 11–12, 2008. The event was the seventh instance in the series of FSMNLP workshops, and the third that was arranged as a stand-alone event. In 2008 FSMNLP was merged with the FASTAR workshop.
The aim of the FSMNLP workshops is to bring together members of the research and industrial community working on finite-state based models in language technology, computational linguistics, web mining, linguistics, and cognitive science on one hand, and, on related theory and methods in fields such as computer science and mathematics, on the other. Thus, the workshop series is a forum for researchers and practitioners working on applications as well as theoretical and implementation aspects. The special theme of FSMNLP 2008 centered around high performance finite-state devices in large-scale natural language text processing systems and applications.
In the context of FSMNLP 2008, we received in total 37 submisions, of which 13 were selected as regular papers, 6 as short papers and 1 as demo paper. The acceptance rate for regular papers was 46,4%. Most of the papers were evaluated by at least four Programme Committee members, with the help of external reviewers. Only 15% of the papers were reviewed by three reviewers. In addition to the submitted papers, four lectures were given by invited speakers. The invited speakers and the authors of the papers represented (at least) Croatia, Finland, France, Gemany, Italy, Luxembourg, Netherlands, Portugal, Puerto Rico, Sweden, U.K., and the USA.
We would like to thank all workshop participants for their contributions and lively interaction during the two days. The presented papers covered a range of interesting NLP applications, including machine learning and translation, logic, computational phonology, morphology and semantics, data mining, information extraction and disambiguation, as well as programming, optimization and compression of finite-state networks. The applied methods included weighted algorithms, kernels, and tree automata. In addition, relevant aspects of software engineering, standardization, and European funding programmes were discussed.
We are greatly indebted to the members of the Programme Committee and the external referees for reviewing the papers and maintaining the high standard of the FSMNLP workshops. The members of the Programme Committee of FSMNLP 2008 were Cyril Allauzen (Google Research, New York, USA), Francisco Casacuberta (Instituto Tecnologico De Informática, Valencia, Spain), Jean-Marc Champarnaud (Université de Rouen, France), Maxime Crochemore (Department of Computer Science, King's College London, U.K.), Jan Daciuk (Gdańsk University of Technology, Poland), Karin Haenelt (Fraunhofer Gesellschaft and University of Heidelberg, Germany), Thomas Hanneforth (University of Potsdam, Germany), Colin de la Higuera (Jean Monnet University, Saint-Etienne, France), André Kempe (Yahoo Search Technologies, Paris, France), Derrick Kourie (Dept. of Computer Science, University of Pretoria, South Africa), Andras Kornai (Budapest Institute of Technology, Hungary and MetaCarta, Cambridge, USA), Marcus Kracht (Univeristy of California, Los Angeles, USA), Hans-Ulrich Krieger (DFKI GmbH, Saarbrücken, Germany), Eric Laporte (Université de Marne-la-Vallée, France), Stoyan Mihov (Bulgarian Academy of Sciences, Sofia, Bulgaria), Herman Ney (RWTH Aachen University, Germany), Kemal Oflazer (Sabanci University, Turkey), Jakub Piskorski (Joint Research Center of the European Commission, Italy), Michael Riley (Google Research, New York, USA), Strahil Ristov (Ruder Boskovic Institute, Zagreb, Croatia), Wojciech Rytter (Warsaw University, Poland), Jacques Sakarovitch (Ecole nationale supérieure des Télécommunications, Paris, France), Max Silberztein (Université de Franche-Comté, France), Wojciech Skut (Google Research, Mountain View, USA), Bruce Watson (Dept. of Computer Science, University of Pretoria, South Africa) (PC co-chair), Shuly Wintner (University of Haifa, Israel), Atro Voutilainen (Connexor Oy, Finland), Anssi Yli-Jyrä (University of Helsinki and CSC – IT Center for Science, Espoo, Finland) (PC co-chair), Sheng Yu (University of Western Ontario, Canada), and Lynette van Zijl (Stellenbosch University, South Africa). The external reviewers were Marco Almeida, Marie-Pierre Beal, Oliver Bender, Jan Bungeroth, Pascal Caron, Loek Cleophas, Matthieu Constant, Stefan Hahn, Christopher Kermorvant, Sylvain Lombardy, Patrick Marty, Evgeny Matusov, Takuya Nakamura, Ernest Ketcha Ngassam, Jyrki Niemi, Sébastien Paumier, Maciej Pilichowski, Adam Przepiórkowski, Magnus Steinby, Yael Sygal, David Vilar, Hsu-Chun Yen, Francois Yvon, Artur Zaroda, and Djelloul Ziadi.
FSMNLP 2008 was organised by the Institute for the Protection and Security of the Citizen of the Joint Research Centre (JRC) of the European Commission in Ispra, Italy, in cooperation with the host of the next FSMNLP event, the FASTAR group of the University of Pretoria in South Africa. The Organizing Committee in 2008 had five JRC representatives: Regina Corradini, Daniela Negri, Jakub Piskorski (OC chair), Hristo Tanev, and Vanni Zavarella, and two members from the Department of Computer Science, University of Pretoria, South Africa: Derrick Kourie and Bruce Watson. A complementary role in long-term planning and coordination was played by the Steering Committee: Lauri Karttunen (Palo Alto Research Center, USA and Stanford University, USA) Kimmo Koskenniemi (University of Helsinki, Finland), Kemal Oflazer (Sabanci University, Turkey) and Anssi Yli-Jyrä (University of Helsinki and CSC – IT Centre for Science, Espoo).
The current year's event is pivotal to the series of FSMNLP workshops since it starts the tradition of organizing the workshops on a yearly basis. Locations for successive events, including FSMNLP 2008 in Ispra and FSMNLP 2009 in Pretoria were proposed already in FSMNLP 2007 in Potsdam. The success of FSMNLP 2008 indicates that there is a growing and wide interdisciplinary community with shared interest in finite-state methods and natural language processing. Therefore, we are looking forward to the FSMNLP 2009 that is to be held in Pretoria, South Africa next year!
In October 2008
Jakub Piskorski, Bruce Watson, Anssi Yli-Jyrä