Using Learning to Rank Approach for Parallel Corpora Based Cross Language Information Retrieval

Azarbonyad, Hosein; Shakery, Azadeh; Faili, Heshaam

doi:10.3233/978-1-61499-098-7-79

Abstract

Learning to Rank (LTR) refers to machine learning techniques for training a model in a ranking task. LTR has been shown to be useful in many applications in information retrieval (IR). Cross language information retrieval (CLIR) is one of the major IR tasks that can potentially benefit from LTR to improve the ranking accuracy. CLIR deals with the problem of expressing query in one language and retrieving the related documents in another language. One of the most important issues in CLIR is how to apply monolingual IR methods in cross lingual environments. In this paper, we propose a new method to exploit LTR for CLIR in which documents are represented as feature vectors. This method provides a mapping based on IR heuristics to employ monolingual IR features in parallel corpus based CLIR. These mapped features are considered as training data for LTR. We show that using LTR trained on mapped features can improve CLIR performance. A comprehensive evaluation on the English-Persian CLIR suggests that our method has significant improvements over parallel corpora based methods and dictionary based methods.

Contact

IOS Press Copyright 2024

Contact

IOS Press Copyright 2024

This website uses cookies

This website uses cookies