Ebook: Legal Knowledge and Information Systems
Technological advances related to legal information, knowledge representation, engineering, and processing have aroused growing interest within the research community and the legal industry in recent years. These advances relate to areas such as computational and formal models of legal reasoning, legal data analytics, legal information retrieval, the application of machine learning techniques to different legal tasks, and the experimental evaluation of these systems.
This book presents the proceedings of JURIX 2023, the 36th International Conference on Legal Knowledge and Information Systems, held from 18–20 December 2023 in Maastricht, the Netherlands. This annual conference has become recognized as an international forum where academics and professionals working at the intersection of law and artificial intelligence can exchange knowledge and experience. A total of 92 submissions were received for the conference, of which 18 were selected as long papers, 30 as short papers and 7 as demo papers following a rigorous review process. This represents an acceptance rate of around 20% for long papers (60% overall). Topics covered include formal approaches applied to various aspects of legal reasoning; machine learning and information retrieval methods applied to various natural language processing tasks; hybrid approaches to working on the frontier between symbolic and sub-symbolic methods; experimental inquiries into the interfaces between computational systems and legal systems; and network analysis in law.
Providing a comprehensive overview of recent advances in the field, the book will be of interest to all those working at the intersection between law and AI.
We are pleased to present the proceedings of JURIX 2023, the 36th International Conference on Legal Knowledge and Information Systems. Organised under the auspices of the Foundation for Legal Knowledge-Based Systems (https://www.jurix.nl), the JURIX annual conference has become established as an internationally renowned forum for the exchange of ideas concerning theoretical models and practical applications in the broadly construed domain of artificial intelligence (AI) and law research, including legal information-system and legal-knowledge-system research. Traditionally, the field of AI & Law has been concerned with legal knowledge representation and engineering, logic, computational models of legal reasoning, legal data analytics, and legal information retrieval, but recent years have witnessed the rise to prominence of the application of machine-learning tools to legally relevant tasks. Furthermore, the constantly growing influence of AI on different spheres of society has prompted interest in the explainability, trustworthiness and responsibility of computational systems within the community. Indeed, since the first editions, JURIX conferences have become a venue for interdisciplinary research and the integration of methods, approaches, and conceptual frameworks from different branches of computer science and jurisprudence, including cognitive and socio-technical dimensions.
The 2023 edition of JURIX, which runs from 18 to 20 December, is hosted by the Maastricht Law and Tech Lab of Maastricht University (UM) in Maastricht, the Netherlands. We received 92 submissions by 248 authors from 32 countries for this edition, of which 18 were selected for publication as long papers (10 pages), 30 as short papers (6 pages), and 7 as demo papers (4 pages). This translates into an acceptance rate of 19.5% for long papers and 52.1% for long and short papers. This is the result of a balance between inclusiveness and a competitive and rigorous review process, which was carried out by a Program Committee composed of 80 recognised experts in the field. The accepted papers cover a wide range of topics, including formal approaches (case-based reasoning, deontic logic, formal argumentation, and other formalisms) applied to various aspects of legal reasoning, machine learning (large language models, including conversational) and information-retrieval methods applied to various natural-language processing tasks, hybrid approaches working on the frontiers between symbolic and sub-symbolic methods, experimental inquiries into the interfaces between computational systems and legal systems, and network analysis in law.
Three invited speakers from different and complementary areas honoured JURIX 2023 by kindly agreeing to deliver their keynote lectures: prof. Jaap Hage, prof. Piek Vossen, and prof. Iris van Rooij.
Jaap Hage is emeritus professor of Legal Theory at Maastricht University. His research is focused on legal logic, with an emphasis on the logic of rules, basic legal concepts (ontology of law), and social ontology. His publications include the following books: ‘Reasoning with Rules’ (1997), ‘Studies in Legal Logic’ (2005), and ‘Foundations and Building Blocks of Law’ (2018). Jaap was also one of the early participants in JURIX, as well as its program chair in 1995.
Piek Vossen is professor in Computational Lexicology at the VU (Free University) Amsterdam, and head of the Computational Linguistics & Text Mining Lab. His groundwork on cross-language conceptual modelling and interoperability led him to found the Global-WordNet-Association (GWA) for building WordNets in languages and connecting these through semantic graphs. Funded by the prestigious Spinoza prize (2013), he studied three foundations for language understanding: identity, reference, and perspective, resulting in the GraSP model as the ‘theory-of-mind’ for robots communicating with people within the Hybrid Intelligence gravitation programme.
Iris van Rooij is professor of Computational Cognitive Science at Radboud University, and principal investigator at the Donders Institute for Brain, Cognition and Behaviour. She is also a Guest Professor in the Department of Linguistics, Cognitive Science, and Semiotics, and the Interacting Minds Centre at Aarhus University, Denmark. Her research interests lie at the interface of psychology, philosophy and theoretical computer science, with a focus on the theoretical foundations of the computational explanations of cognition.
This year, continuing a tradition, JURIX has also been accompanied by satellite co-located events: four workshops (ALP 2023, AI & Access to Justice, AI4Legs-II, Annotation of Legal Data), and the Doctoral Consortium. ALP 2023 – the first workshop on AI, Law and Philosophy – aims to take the time for philosophical reflection on where we stand and what can be expected from the interaction between AI and Law. The workshop brings together young researchers and experienced experts interested in the interface between AI, law and philosophy, aiming to connect contemporary tracks of research in the fields of AI and Law to the tradition of computational legal theory. The AI & Access to Justice workshop aims to bring together lawyers, computer scientists, and social-science researchers to discuss findings and proposals around how AI might be used to improve access to justice, as well as how to hold AI models accountable for the public good. AI4Legs-II – the second Workshop on AI for Legislation – aims to discuss the state-of-the-art of the most advanced applications of AI in support of better regulation and law-making systems, applying interdisciplinary instruments drawn from the philosophy of law, constitutional law, legal informatics, AI & Law, computational linguistics, computer science, HCI, and legal design. The Annotation of Legal Data Workshop aims to provide a platform for in-depth discussions, knowledge sharing, demonstrations, and practical insights into the challenges and opportunities of annotating legal data. The workshop is designed to bring together researchers and experts interested in exploring the nuances of annotating legal data, focusing on topics such as software tools, annotator training, inter-rater and intra-rater reliability, and the publication of data and metadata.
Organising this conference edition would not have been possible without the support of many people and institutions. Special thanks are due to the local team of Gijs van Dijck, Jerry Spanakis, Konrad Kollnig, Aurelia Tamo-Larrieux, the Maastricht Law and Tech Lab, and the Law Events Office at Maastricht University. We would like to thank the workshop organisers for their proposals and for the effort involved in arranging the events. We owe our gratitude to Monica Palmirani, who kindly assumed the function of the Doctoral Consortium Chair, and we are particularly grateful to the 80 members of the Program Committee for their work in the rigorous review process and for their participation in the discussions. Finally, we would like to thank the former JURIX and ICAIL program chairs for their support and advice, and the current JURIX executive committee and steering committee members for their support, advice, and for taking care of all JURIX initiatives.
Giovanni Sileno, JURIX 2023 Program chair
Jerry Spanakis, JURIX 2023 Local co-chair
Gijs van Dijck, JURIX 2023 Local co-chair
In recent years a considerable amount of research has been devoted to formal theories of precedential constraint. In this note I consider a recent paper which explores the use of factor hierarchies in this connection. In that work it was shown both that cases constrained with the use of a hierarchy may be unconstrained if the hierarchy is flattened, and that cases unconstrained with a hierarchy may be constrained when the hierarchy is flattened. I discuss the nature of factor hierarchies and attempt to explain these results.
Bench-Capon argues that intermediate factors have no role to play in precedential constraint. We offer a contrasting perspective.
This article continues the research initiated in [1,2], which established a connection between Boolean classifiers and legal case-based reasoning. We relax the assumption that case bases are such that all situations have been decided in favour of the defendant or the plaintiff and we introduce an inductive strategy for assigning plausible outcomes to undecided cases. Using counterfactual reasoning, we propose a method to determine whether, at each step of the induction, a feature is a factor, i.e., it consistently favours a single outcome, or is irrelevant, i.e., it does not favour any outcome, or is ambiguous, i.e., it favours opposite outcomes.
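The three-way distinction between factors, irrelevant features and ambiguous features can be illustrated with a minimal sketch. It assumes a fully specified Boolean classifier over binary features, which is simpler than the partially decided case bases the abstract addresses, and it does not reproduce the inductive strategy itself. The function names `classify_feature`, `and_not` and `xor` are hypothetical and chosen only for illustration.

```python
from itertools import product

def classify_feature(f, n, i):
    """Classify feature i of a Boolean classifier f over n binary features.

    Counterfactual test (illustrative assumption): switch feature i from 0 to 1
    while holding everything else fixed, and record which outcome the switch
    produces whenever the outcome changes.
    """
    favoured = set()
    for bits in product([0, 1], repeat=n):
        if bits[i] == 1:
            continue  # only start from situations where the feature is absent
        with_feature = bits[:i] + (1,) + bits[i + 1:]
        if f(bits) != f(with_feature):
            favoured.add(f(with_feature))
    if not favoured:
        return "irrelevant"   # adding the feature never changes the outcome
    if len(favoured) == 1:
        return "factor"       # it always pushes towards the same outcome
    return "ambiguous"        # it pushes towards opposite outcomes

# Hypothetical toy classifiers: outcome 1 = plaintiff, 0 = defendant.
def and_not(x):
    return int(x[0] and not x[1])

def xor(x):
    return x[0] ^ x[1]

labels = [classify_feature(and_not, 3, i) for i in range(3)]
```

In the toy classifier `and_not`, the first two features each favour a single outcome and the third never matters, whereas in `xor` each feature can tip the outcome either way.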
We extend the result model for precedent-based reasoning with incomplete case bases. In contrast to regular case bases, these consist of incomplete cases for which not all dimension values need to be specified, but rather each dimension is assigned a set of possible values. The outcome of cases then applies for each (combination of) the possible dimension values. Building on earlier proposed notions of justification and stability for incomplete focus cases, we introduce the notion of possible justification statuses, which are required to maintain consistency of the incomplete case base. We demonstrate how these theoretic notions can be applied in practice for human-in-the-loop decision support, discuss their computational complexity and provide efficient algorithms.
In recent years, a model of a fortiori argumentation, developed to describe legal reasoning based on precedent, has been successfully applied in the field of artificial intelligence to improve interpretability of data-driven decision systems. In order to make this model more broadly applicable for this purpose, work has been done to expand the knowledge representation on the basis of which it functions, as the original model accommodates only binary propositional information. In particular, two separate expansions of the original model emerged; one which accounts for non-binary input information, and a second which accommodates hierarchically structured reasoning. In the present work we unify these expansions to a single model, incorporating both dimensional and hierarchical information.
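For readers unfamiliar with the binary starting point mentioned above, the following is a minimal sketch of a fortiori constraint over propositional factors, under the usual reading that a precedent forces its outcome on any new case that is at least as strong for the winning side. It covers neither the dimensional nor the hierarchical expansion, and the names `Case` and `forced_outcomes` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    pro_plaintiff: frozenset  # factors favouring the plaintiff
    pro_defendant: frozenset  # factors favouring the defendant
    outcome: str              # "plaintiff" or "defendant"

def forced_outcomes(case_base, pro_p, pro_d):
    """Return the outcomes forced a fortiori on a new fact situation.

    A precedent decided for side s forces s when the new situation has at
    least the precedent's pro-s factors and at most its factors against s.
    """
    forced = set()
    for c in case_base:
        if c.outcome == "plaintiff":
            if c.pro_plaintiff <= pro_p and pro_d <= c.pro_defendant:
                forced.add("plaintiff")
        else:
            if c.pro_defendant <= pro_d and pro_p <= c.pro_plaintiff:
                forced.add("defendant")
    return forced

# Hypothetical precedent and two new situations.
precedent = Case(frozenset({"p1"}), frozenset({"d1"}), "plaintiff")
stronger = forced_outcomes([precedent], frozenset({"p1", "p2"}), frozenset({"d1"}))
weaker = forced_outcomes([precedent], frozenset({"p1"}), frozenset({"d1", "d2"}))
```

Here `stronger` contains `"plaintiff"` because the new situation adds a pro-plaintiff factor, while `weaker` is empty because the extra pro-defendant factor leaves the case unconstrained.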
Data-driven AI systems can make the right decisions for the wrong reasons, which can lead to irresponsible behavior. The rationale of such machine learning models can be evaluated and improved using a previously introduced hybrid method. This method, however, was tested using synthetic data under ideal circumstances, whereas labelled datasets in the legal domain are usually relatively small and often contain missing facts or inconsistencies. In this paper, we therefore investigate rationales under such imperfect conditions. We apply the hybrid method to machine learning models that are trained on court cases, generated from a structured representation of Article 6 of the ECHR, as designed by legal experts. We first evaluate the rationale of our models, and then improve it by creating tailored training datasets. We show that applying the rationale evaluation and improvement method can yield relevant improvements in terms of both performance and soundness of rationale, even under imperfect conditions.
One way of reasoning with uncertainties in the context of law is to use probabilities. However, methods for reasoning about the probability of guilt in a court case require us to specify a prior probability of guilt, which is the probability of guilt before any evidence is known. There is no accepted approach for specifying the prior probability of guilt, but multiple solutions have been proposed. In this paper, we consider three approaches: a prior that is based on the population, a prior based on the number of agents that have similar opportunity as the suspect and a prior that represents a legal norm. For comparing and evaluating the approaches, we use an agent-based model as a ground truth in which all probabilities are known. With the data generated in the ground truth model, we investigate how the choice of prior influences the posterior probability of guilt for both guilty and innocent agents. Using a decision threshold, we can determine the effect of the three approaches on the rates of correct and incorrect convictions and acquittals. We find that the opportunity prior results in higher rates of both correct convictions and false convictions and requires more assumptions and access to data and knowledge than the legal prior and population prior.
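The sensitivity of the posterior to the choice of prior can be seen in a small worked example using the odds form of Bayes' rule. All numbers below (population size, number of agents with opportunity, likelihood ratio, decision threshold) are hypothetical and are not taken from the paper; the legal-norm prior is omitted because the abstract does not give its value.

```python
def posterior_guilt(prior, likelihood_ratio):
    """Posterior probability of guilt via posterior odds = prior odds * LR."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical figures, for illustration only.
population_prior = 1 / 10_000   # one culprit among 10,000 inhabitants
opportunity_prior = 1 / 50      # one culprit among 50 agents with opportunity
likelihood_ratio = 1_000        # combined strength of the evidence
threshold = 0.95                # hypothetical decision threshold

post_population = posterior_guilt(population_prior, likelihood_ratio)
post_opportunity = posterior_guilt(opportunity_prior, likelihood_ratio)

convict_population = post_population >= threshold
convict_opportunity = post_opportunity >= threshold
```

With these assumed figures the same evidence yields a posterior of roughly 0.09 under the population prior and roughly 0.95 under the opportunity prior, so only the latter crosses the threshold.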
This paper presents a formal model of specific reasoning patterns in conflict of laws (CoL). CoL arises when multiple countries have jurisdiction due to the diverse nationalities of the involved factors. When initiating legal action in one country, the question of which country’s substantial law to apply emerges, possibly involving the CoL regulations of other countries (in cases of transmission and renvoi). Moreover, parties contemplating legal action in a case falling under CoL often engage in a deliberation process known as forum shopping: determining which country’s CoL regulations would result in the most favorable outcome for them. Our model integrates deontic logic (specifically Input/Output logic) with proof theory and formal argumentation techniques to model both types of reasoning.
Given a common pool of facts and legal rules, Judges on a panel may form different justifications for decisions, which are then voted upon. It is clear that a Judge’s personal values and purposes play a role in developing their opinion, which is a form of teleological reasoning. The paper introduces the Value-based Formal Reasoning (VFR) framework, which describes how a Judge’s personal values can be used in the construction of a justification for a decision.
We design a framework for assisted normative reasoning based on Aristotelian diagrams and algorithmic graph theory which can be employed to address heterogeneous tasks of deductive reasoning. Here we focus on two problems of normative determination: we show that the algorithms used to address these problems are computationally efficient and their operations are traceable by humans. Finally, we discuss an application of our framework to a scenario regulated by the GDPR.
Case-based reasoning is known to play an important role in several legal settings. We focus on a recent approach to case-based reasoning, supported by an instantiation of abstract argumentation whereby arguments represent cases and attack between arguments results from outcome disagreement between cases and a notion of relevance. We explore how relevance can be learnt automatically with the help of decision trees, and explore the combination of case-based reasoning with abstract argumentation (AA-CBR) and learning of case relevance for prediction in legal settings. Specifically, we show that, for two legal datasets, AA-CBR with decision-tree-based learning of case relevance performs competitively in comparison with decision trees, and that AA-CBR with decision-tree-based learning of case relevance results in a more compact representation than their decision tree counterparts, which could facilitate cognitively tractable explanations.
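The abstract-argumentation side of such an approach can be sketched as follows. The code computes the grounded extension of a small attack graph in which arguments stand for cases; it assumes the attack relation is already given and does not reproduce the decision-tree-based learning of relevance. The argument names and the function `grounded_extension` are hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation framework.

    arguments: iterable of argument names
    attacks:   set of (attacker, target) pairs
    """
    arguments = list(arguments)
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted, rejected = set(), set()
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if a in accepted or a in rejected:
                continue
            if attackers[a] <= rejected:
                # every attacker is already defeated (or there are none)
                accepted.add(a)
                changed = True
            elif attackers[a] & accepted:
                # attacked by an accepted argument
                rejected.add(a)
                changed = True
    return accepted

# Hypothetical chain: case c2 attacks case c1, which attacks the default.
args = ["default", "c1", "c2"]
atts = {("c1", "default"), ("c2", "c1")}
extension = grounded_extension(args, atts)
```

In this toy graph `c2` defeats `c1`, which reinstates `default`; on the assumption that the prediction is read off from whether the default argument is accepted, the default outcome would be predicted here.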
Current theories of precedential constraint attempt to incorporate dimensions into the reasons for decisions. We argue that this is an unnecessary complication, and precedential constraint can be handled using only factors. In our account the role of dimensions is to organise facts, and their effect operates at the factor ascription level, prior to precedential constraint being applied.
This paper introduces a framework of conceptual structures allocated to statutory expressions during interpretive heuresis. Drawing from cognitive science research on conceptual structures, the study seeks to enhance existing computational models of legal reasoning across various domains. A comprehensive set of conceptual structures applicable in statutory interpretation is reconstructed. This framework increases awareness of potential interpretive options and contributes to the transparency of legal reasoning.
Although permissions are of crucial importance in several settings, they have garnered less attention within the deontic logic community than obligations. In previous work we showed how to reconstruct deontic logic using Kelsen’s quasi-causal conception of norms, restricting ourselves to the notion of obligation. Here we extend the account to permission, and show how to analyse the notion of strong permission through a Kelsenian lens. In our framework various forms of conflicts between obligation and permission are disentangled.
Existing approaches to modelling contracts often rely on deontic logic to reason about norms, and only treat time qualitatively. Using L4, a textual domain specific language (DSL) for the law, we offer a more operational interpretation of norms, based on states and transitions, that also accounts for the granular timing of events. In this paper, we present a higher-level rendering of the loan agreement from Flood & Goodenough in L4, and an accompanying operational semantics amenable to execution and static analysis. We also implement this semantics in Maude and show how this lets us visualize the execution of the loan agreement.
The EU AI Act is the first step toward a comprehensive legal framework for AI. It introduces provisions for AI systems based on their risk levels in relation to fundamental rights. Providers of AI systems must conduct Conformity Assessments before market placement. Recent amendments added Fundamental Rights Impact Assessments for high-risk AI system users, focusing on compliance with EU and national laws, fundamental rights, and potential impacts on EU values. The paper suggests that automating business process compliance can help standardize these assessments and outlines some methodological guidelines.
Data breaches and other security incidents are an emerging challenge in the digital era. The General Data Protection Regulation (GDPR) requires conducting an impact assessment to understand the effects of the breach, and to then notify authorities and affected individuals in certain cases. Communication of this information typically takes place via conventional mediums such as emails and forms on the websites of authorities, and is a manual process. To assist in developing tools to support data breach investigations, and to enable automated systems for assisting with breach assessments and GDPR compliance, we present a machine-readable specification for the representation and documentation of information related to data breaches and their communications. The specification uses current requirements from the GDPR obligations and authoritative guidelines. To represent information, it extends the Data Privacy Vocabulary (DPV) by introducing new concepts required for data breach relevant information.
Insurance Portfolio Analysis (IPA) is the process of comparing multiple, potentially overlapping insurance portfolios with an eye to detecting and characterizing redundancies and gaps in coverage. Unfortunately, insurees usually do not have the time or patience to compare policies from multiple insurance providers, and they often do not have the legal background needed to understand the complex legal wording of the contracts associated with those policies. Past work has shown that, by encoding policies as logic programs, it is possible to automatically determine compliance of specific claims with a policy’s terms and conditions. In this paper, we show that it is also possible to automatically analyze multiple-policy portfolios for gaps and redundancies by assessing coverage over multiple hypothetical claims. We formalize the process of IPA and show how to use well-studied techniques for logic program containment testing to automate the process.
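The idea of assessing coverage over hypothetical claims can be illustrated with a simplified sketch. It replaces logic programs and containment testing with plain Python predicates evaluated over an explicit list of claims, so it is a stand-in for the idea rather than the technique described above. The policy names, claim fields and the function `analyse_portfolio` are hypothetical.

```python
def analyse_portfolio(policies, claims):
    """Find gaps (claims no policy covers) and redundancies (claims covered twice or more).

    policies: dict mapping a policy name to a predicate claim -> bool
    claims:   list of hypothetical claims, here plain dicts
    """
    gaps, redundancies = [], []
    for claim in claims:
        covering = sorted(name for name, covers in policies.items() if covers(claim))
        if not covering:
            gaps.append(claim)
        elif len(covering) > 1:
            redundancies.append((claim, covering))
    return gaps, redundancies

# Hypothetical policies encoded as predicates over claim attributes.
policies = {
    "travel": lambda c: c["kind"] == "theft" and c["abroad"],
    "home": lambda c: c["kind"] in {"theft", "fire"} and not c["abroad"],
    "card": lambda c: c["kind"] == "theft",
}

# Hypothetical claims to probe the portfolio with.
claims = [
    {"kind": "theft", "abroad": True},
    {"kind": "fire", "abroad": False},
    {"kind": "flood", "abroad": False},
]

gaps, redundancies = analyse_portfolio(policies, claims)
```

Enumerating claims only checks the cases one thinks to list; a containment-based analysis, as described in the abstract, reasons over the policies themselves rather than over a sample of claims.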
The purpose limitation principle is a GDPR cornerstone that aims to minimize data processing risks by limiting instances of personal data access and usage. We model purpose as an action or sequences of actions and formalize action relationships to derive purpose-based permissions. Based on these permissions, we introduce a novel purpose-based access control model with a purpose matching algorithm illustrated with a healthcare research use case.
Exhaustively understanding the impact of rule amendments and unforeseen cases on existing norms requires connecting their rule-based and case-based representations. However, those connections have not been explored in depth, especially for norms that are represented as soft constraints. This paper aims to explore the connection between constraint hierarchies and case models as representative formalisms of rule-based and case-based representations of soft-constraint norms respectively. To explore the connection, we express norm scopes and preferences in both formalisms as diagrams. Based on tightening and arranging diagrams, we found a translation of constraint hierarchies with one constraint per level into case models. This provides new insights into understanding prototypical cases made by rule-based soft-constraint norms.
Manual annotation is just as burdensome as it is necessary for some legal text analytic tasks. Given the promising performance of Generative Pretrained Transformers (GPT) on a number of different tasks in the legal domain, it is natural to ask whether they can help with text annotation. Here we report a series of experiments using GPT-4 and GPT-3.5 as a pre-annotation tool to determine whether a sentence in a legal opinion describes a legal factor. These GPT models assign labels that human annotators subsequently confirm or reject. To assess the utility of pre-annotating sentences at scale, we examine the agreement among gold-standard annotations, GPT’s pre-annotations, and law students’ annotations. The agreements among these groups support the conclusion that using GPT-4 as a pre-annotation tool is a useful starting point for large-scale annotation of factors.
Encoding legislative text in a formal representation is an important prerequisite to different tasks in the field of AI & Law. For example, rule-based expert systems focused on legislation can support laypeople in understanding how legislation applies to them and provide them with helpful context and information. However, the process of analyzing legislation and other sources to encode it in the desired formal representation can be time-consuming and represents a bottleneck in the development of such systems. Here, we investigate to what degree large language models (LLMs), such as GPT-4, are able to automatically extract structured representations from legislation. We use LLMs to create pathways from legislation, according to the JusticeBot methodology for legal decision support systems, evaluate the pathways and compare them to manually created pathways. The results are promising, with 60% of generated pathways being rated as equivalent to or better than manually created ones in a blind comparison. The approach suggests a promising path to leverage the capabilities of LLMs to ease the costly development of systems based on symbolic approaches that are transparent and explainable.