

Multiple business scenarios require automated generation of descriptive, human-readable long text from structured input data, where the source is typically a high-resource language and the target is a low- or medium-resource language. We define Cross-Lingual Fact to Long Text Generation (XFLT) as a novel natural language generation (NLG) task that involves generating descriptive, human-readable long text in a target language from structured input data (such as fact triples) in a source language. XFLT is challenging because of (a) the hallucinatory nature of state-of-the-art NLG models, (b) the lack of good-quality training data, and (c) the lack of a suitable cross-lingual NLG metric. Unfortunately, previous work focuses on different but related problem settings (cross-lingual fact-to-short-text or monolingual graph-to-text generation) and makes no effort to handle hallucinations. In this paper, we contribute a novel dataset, XLALIGN, with over 64,000 paragraphs across 12 different languages paired with English facts. We propose a novel solution to the XFLT task that addresses these challenges by training multilingual Transformer-based encoder-decoder models with coverage prompts and grounded decoding. Our approach further improves XFLT quality by defining task-specific reward functions and training on them using reinforcement learning. On XLALIGN, we compare this solution with several strong baselines using a new metric, cross-lingual PARENT. We also make our code and data publicly available at https://drive.google.com/file/d/1sHgcwXKribjrm2grbs-LzXUUqXQitD2N/.
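
To make the setup concrete, the sketch below shows one plausible way to linearize English fact triples with a coverage prompt and to score generated text with a simple fact-coverage reward. Everything here is illustrative: the tags (<lang>, <coverage>, [S]/[P]/[O]), the function names, and the verbatim-match reward are assumptions for exposition, not the paper's actual prompt format or reward definition.

# Minimal sketch (Python): hypothetical input linearization and a toy
# fact-coverage reward for XFLT. Tag names and the reward are
# illustrative assumptions, not the paper's specification.

from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), in English

def linearize(triples: List[Triple], target_lang: str) -> str:
    """Build one encoder input string: language tag, coverage prompt, facts."""
    facts = " ".join(f"[S] {s} [P] {p} [O] {o}" for s, p, o in triples)
    # Hypothetical coverage prompt nudging the decoder to mention every fact.
    prompt = f"<coverage> mention all {len(triples)} facts"
    return f"<lang> {target_lang} {prompt} {facts}"

def coverage_reward(generated: str, triples: List[Triple]) -> float:
    """Toy reward: fraction of fact objects found verbatim in the output.
    Purely illustrative; a real cross-lingual reward would need entity
    alignment or translation rather than English string matching."""
    if not triples:
        return 0.0
    hits = sum(obj.lower() in generated.lower() for _, _, obj in triples)
    return hits / len(triples)

triples = [
    ("Narendra Modi", "birthPlace", "Vadnagar"),
    ("Narendra Modi", "office", "Prime Minister of India"),
]
print(linearize(triples, "hi"))                      # Hindi as target language
print(coverage_reward("... Vadnagar ...", triples))  # 0.5

In a reinforcement-learning setup, a reward of this kind could be combined with a fluency term and optimized with policy-gradient methods; the task-specific reward functions actually used are defined in the paper itself.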