
Ebook: Controlled Natural Language

Controlled natural languages (CNLs) are based on natural language and apply restrictions on vocabulary, grammar, and/or semantics. They fall broadly into 3 groups. Some are designed to improve communication for non-native speakers of the respective natural language; in others, the restrictions are to facilitate the use of computers to analyze texts, for example, to improve computer-aided translation; and a third group of CNLs are designed to enable reliable automated reasoning and formal knowledge representation from seemingly natural texts.
This book presents the 11 papers, selected from 14 submissions, that were delivered at the sixth in the series of workshops on Controlled Natural Language (CNL 2018), held in Maynooth, Ireland, in August 2018. The papers cover the full spectrum of controlled natural languages, ranging from human-oriented to machine-processable controlled languages and from more theoretical results to interfaces, reasoning engines, and the real-life application of CNLs.
The book will be of interest to all those working with controlled natural language, whatever their approach.
CNL 2018 is the sixth in the series of workshops on Controlled Natural Language (CNL), which was first organised in 2009. The 2018 edition was organised in Co. Kildare, Ireland, on 27 and 28 August.
As with previous editions of the workshop, this year's papers cover the wide spectrum of Controlled Natural Languages, ranging from human-oriented to machine-processable controlled languages, and from more theoretical results to interfaces, reasoning engines, and real-life applications of CNLs.
This year we invited both long and short papers, and we received a total of 14 submissions. All papers were peer-reviewed by at least three, and in most cases four, members of the workshop's Programme Committee. Based on the reviews, 11 of the 14 papers were accepted. As usual, the workshop remains truly international: PC members represented organisations from 13 countries, and the authors of accepted papers were affiliated with institutions in 10 countries on 3 continents.
In addition to the presentation of the accepted papers, the programme includes 3 invited speakers, Claire Gardent (Loria, France), Albert Gatt (University of Malta, Malta) and Teresa Lynn (Dublin City University, Ireland), and a late-breaking results posters & demos session. We would like to thank the Programme Committee for their reviews and feedback, the authors for their contributions, and the invited speakers for accepting our invitation to present their work at the workshop. We would also like to thank Maynooth University for hosting the workshop and the workshop sponsors Science Foundation Ireland, Maynooth University and Digital Grammars Gothenburg AB.
Brian Davis
C. Maria Keet
Adam Wyner
July 13, 2018
CNL'18, Maynooth, Co Kildare, Ireland
We present an editor for controlled languages which is a combination of a syntax editor and a predictive editor. It shows a bird's-eye view which lets the user explore what is possible in the language. Still, unlike with syntax editors, the user is not expected to understand the underlying abstract syntax or ontology behind the language. It also lets the user enter arbitrary phrases, for which the editor finds the closest matching phrases in the language.
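The closest-match idea can be illustrated with a minimal sketch. The abstract does not say how the editor computes closeness, so this example substitutes character-level similarity from Python's standard difflib module; the function name closest_phrases and the three sample phrases are hypothetical and not taken from the paper.

```python
import difflib

# Hypothetical phrases of a small controlled language (not from the paper).
PHRASES = [
    "the customer pays the invoice",
    "the supplier ships the order",
    "the bank approves the loan",
]


def closest_phrases(text, phrases, n=3, cutoff=0.5):
    """Return up to n phrases most similar to the free-form input.

    Assumption: similarity is difflib's character-based ratio, used here
    only as a stand-in for whatever matching the editor actually performs.
    """
    normalized = " ".join(text.lower().split())  # lower-case, collapse spaces
    return difflib.get_close_matches(normalized, phrases, n=n, cutoff=cutoff)


# An input that is not itself a valid phrase of the language.
suggestions = closest_phrases("Customer pay invoice", PHRASES)
```

Here the best suggestion is the phrase about the customer and the invoice; a real editor would more plausibly match against the grammar than against a fixed list of strings.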
Controlled natural languages have long been used as a surface form for formal descriptions, allowing easy transitioning between natural language specifications and implementable specifications. In this paper we motivate the use of a controlled natural language in the representation and verification of financial services regulations. The verification context is that of payment applications that come with a model of their promised behaviour and which are deployed on a payments ecosystem. The semantics of this financial services regulations controlled natural language (FSRCNL) can produce compliance checks that analyse the promised model and/or monitor the application itself after it is deployed.
Controlled natural languages (CNLs) have the benefit of combining the readability of natural languages with the accuracy of formal languages. They have been used to help users express facts, rules or queries. While generally easy to read, CNLs remain difficult to write because of the constrained syntax. A common solution is a grammar-based auto-completion mechanism to suggest the next possible words in a sentence. However, this solution has two limitations: (a) partial sentences may have no semantics, which prevents giving intermediate results or feedback, and (b) the suggestion is often limited to adding words at the end of the sentence. We propose a more responsive and flexible CNL authoring by designing it as a sequence of sentence transformations. Responsiveness is obtained by having a complete, and hence interpretable, sentence at each time. Flexibility is obtained by allowing insertion and deletion on any part of the sentence. Technically, this is realized by working directly on the abstract syntax, rather than on the concrete syntax, and by using Huet's zippers to manage the focus on a query part, the equivalent of the text cursor of a word processor.
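As an illustration of the zipper idea, the following minimal sketch shows a focus moving through an abstract syntax tree and a transformation applied at the focus, with a complete sentence available before and after the edit. The Node, Context and Zipper classes, the category labels and the example sentence are assumptions made for this sketch and are not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Abstract syntax node: a category with children, or a leaf word."""
    label: str
    children: Tuple["Node", ...] = ()


@dataclass(frozen=True)
class Context:
    """One step of the path from the focus back up to the root."""
    parent_label: str
    left: Tuple[Node, ...]    # siblings to the left of the focus
    right: Tuple[Node, ...]   # siblings to the right of the focus
    up: Optional["Context"]


@dataclass(frozen=True)
class Zipper:
    focus: Node
    context: Optional[Context] = None

    def down(self, i):
        """Move the focus to the i-th child."""
        kids = self.focus.children
        ctx = Context(self.focus.label, kids[:i], kids[i + 1:], self.context)
        return Zipper(kids[i], ctx)

    def up(self):
        """Move the focus to the parent, rebuilding it around the focus."""
        c = self.context
        if c is None:
            raise ValueError("already at the root")
        return Zipper(Node(c.parent_label, c.left + (self.focus,) + c.right), c.up)

    def replace(self, node):
        """Transform the sentence by swapping the subtree under focus."""
        return Zipper(node, self.context)

    def root(self):
        """Return the whole tree, including any edit made at the focus."""
        z = self
        while z.context is not None:
            z = z.up()
        return z.focus


def linearize(node):
    """Concrete syntax: the leaf words read from left to right."""
    if not node.children:
        return node.label
    return " ".join(linearize(child) for child in node.children)


def words(*ws):
    return tuple(Node(w) for w in ws)


# Hypothetical sentence: S -> NP VP, VP -> "knows" NP.
sentence = Node("S", (
    Node("NP", words("every", "person")),
    Node("VP", (Node("knows"), Node("NP", words("a", "city")))),
))

# Focus on the object noun phrase and insert a modifier in mid-sentence.
at_object = Zipper(sentence).down(1).down(1)
edited = at_object.replace(Node("NP", words("a", "large", "city"))).root()
before, after = linearize(sentence), linearize(edited)
```

Because every value is immutable, the original tree is untouched by the edit, so both the old and the new sentence stay complete and can be interpreted at each step.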
Growth in isiZulu language learning is hampered by the predominantly manual approach to creating and marking homework and test exercises. Extant computer-assisted language learning platforms cannot handle the intricacies of agglutination in isiZulu and related languages. We seek to address this by designing a controlled natural language-based exercise generator and marker for isiZulu. This consists of question and answer sentence templates for exercise types, reusable algorithm snippets as a grammar library, a small corpus of words and sentences to be used by the system, a constrained sentence generator to combine the right type of words, and finally the exercise creation and automated marking system. The preliminary evaluation shows encouraging results.
The hazard analysis and risk assessment (HARA) is a safety activity, which is performed during the concept phase of the functional safety standard ISO 26262. The results of this activity are usually documented by using a natural language. On the one hand, natural languages are expressive and powerful, but on the other hand, they are also ambiguous and complex. The usage of controlled natural languages (CNLs) is a means to reduce the drawbacks of natural languages. In this paper, we introduce controlled natural languages for the rationales of the three risk parameters: severity, exposure, and controllability to extend our set of CNLs for the HARA. In the first place, the application of controlled languages leads to more harmonized descriptions and rationales. Subsequently, an automatic processing based on these languages shall be implemented to enable the detection of inconsistencies across different HARA documentations.
Scientific communication still mainly relies on natural language written in scientific papers, which makes the described knowledge very difficult to access with automatic means. We can therefore only make limited use of formal knowledge organization methods to support researchers and other interested parties with features such as automatic aggregations, fact checking, consistency checking, question answering, and powerful semantic search. Existing approaches to solve this problem by improving the scientific communication methods have either very restricted coverage, require formal logic skills on the side of the researchers, or depend on unreliable machine learning for the formalization of knowledge. Here, I propose an approach to this problem that is general, intuitive, and flexible. It is based on a unique kind of controlled natural language, called AIDA, consisting of English sentences that are atomic, independent, declarative, and absolute. Such sentences can then serve as nodes in a network of scientific claims linked to publications, researchers, and domain elements. I present here some small studies on preliminary applications of this language. The results indicate that it is well accepted by users and provides a good basis for the creation of a knowledge graph of scientific findings.
Controlled Natural Languages (CNLs) have many applications including document authoring, automatic reasoning on texts and reliable machine translation, but their application is not limited to these areas. We explore a new application area of CNLs, the use of CNLs in computer-assisted language learning. In this paper we present a web application for language learning using CNLs as well as a detailed description of the properties of the family of CNLs it uses.
A prototype object-oriented natural-language programming system for computer/video games is described, in which sentences written in object-oriented English are automatically converted to functional, executable game code in JavaScript. In addition, new attributive words are automatically learned while converting the text to code. Any syntactic or semantic errors are reported during compilation to help users debug and fine-tune the game. With fewer than 20 plain English sentences, up to 1800 lines of game code can be generated.
Understanding texts in Attempto Controlled English (ACE) is considered undemanding, yet it hides some problems. To deal with these problems I propose an experiment based on Kuhn's ontographs that tests the understanding of simple ACE texts. Furthermore, I suggest comparing the relation between authors and readers with human verbal conversations. My conclusion is that the correct understanding of an ACE text is possible, but requires contributions from both authors and readers, in effect their cooperation.
While machine-processable Controlled Natural Languages (CNLs) as a natural language interface have proven a popular, unambiguous and user-friendly method for non-experts to engineer formal knowledge bases, human-oriented CNLs remain under-researched despite having found favor within industry over many years. Whether such human-oriented CNLs, like their machine-processable counterparts, can be captured automatically as formal knowledge remains an open question. In addition, rewriting all or most of a human-oriented CNL into a machine-oriented CNL could unlock significant silos of general-purpose domain knowledge, contained within existing human-oriented CNL content, for exploitation by knowledge-based systems. This paper explores the feasibility of rewriting a human-oriented CNL, Simplified English, into a well-known machine-oriented CNL, ACE, and describes preliminary results.
The correct modelling of negation in a computational grammar is essential in order for the grammar to be useful in natural language processing, and controlled language development and applications. An important and unique feature of Afrikaans is how it deals with negation. We present an exposition of a substantial fragment of negation in Afrikaans and discuss our implementation thereof in GF. Examples are given as illustration of the relevant issues. The paper is concluded with a discussion of results and plans for future work.