Ebook: Legal Knowledge and Information Systems
In the research community and the legal industry, interest continues to grow in technological advances related to legal information, knowledge representation, engineering, and processing. These advances span computational and formal models of legal reasoning, legal data analytics and information retrieval, the application of machine learning techniques to legal tasks, and the evaluation of such systems.
This book presents the proceedings of JURIX 2024, the 37th International Conference on Legal Knowledge and Information Systems, held from 11 to 13 December in Brno, Czech Republic. The annual JURIX conference has become an international forum for academics and professionals to exchange knowledge and experiences at the intersection of law and artificial intelligence, and a total of 90 submissions were received for the conference. Following a rigorous review process, 21 long-paper submissions were selected for presentation and publication together with 17 short papers, representing an acceptance rate of 23% for long papers and 42% overall. An additional 16 submissions were accepted as posters. Topics covered included formal approaches applied to various aspects of legal reasoning; machine learning; natural language processing and information retrieval methods as applied to various legal tasks; hybrid approaches to working on the frontier between symbolic and sub-symbolic methods; experimental inquiries on the interface between computational systems and legal systems; and network analysis in law.
Covering a wide range of topics and providing an overview of recent advances, the book will be of interest to all those working at the intersection between artificial intelligence and law.
We are pleased to present the proceedings of JURIX 2024, the 37th International Conference on Legal Knowledge and Information Systems. Organised under the auspices of the Foundation for Legal Knowledge-Based Systems (https://www.jurix.nl), the JURIX annual conference has become established as an internationally renowned forum for the exchange of ideas concerning theoretical models and practical applications in the broadly construed domain of artificial intelligence (AI) and law research, including legal information systems and legal knowledge systems research. Traditionally, the field of AI and Law has been concerned with legal knowledge representation and engineering, logic, and computational models of legal reasoning, legal data analytics, and legal information retrieval, but recent years have witnessed the rise to prominence of the application of machine learning tools to legally relevant tasks. Furthermore, the constantly growing influence of AI on different spheres of social life has prompted interest in the explainability, trustworthiness, and responsibility of computational systems. Indeed, from the outset, the JURIX conferences have been a venue for interdisciplinary research, integrating methods, approaches, and conceptual frameworks from different branches of computer science and jurisprudence, including cognitive and socio-technical dimensions.
The 2024 edition of JURIX, which runs from 11 to 13 December, is hosted by the Masaryk University in Brno, Czechia. For this edition, we received 90 submissions from 220 authors from 32 countries. Following a rigorous review process, carried out by a programme committee of 84 experts recognised in the field, 21 submissions were selected for publication as long papers and 17 as short papers, representing a 23.3% acceptance rate for long papers (42.2% overall). An additional 17 submissions were ultimately accepted as posters. The accepted papers cover a wide range of topics, including formal approaches (case-based reasoning, deontic logic, formal argumentation, and other formalisms) applied to various aspects of legal reasoning, machine learning, natural language processing and information retrieval methods as applied to various legal tasks, hybrid approaches working on the frontiers between symbolic and sub-symbolic methods, experimental inquiries on interfaces between computational systems and legal systems, and network analysis in law.
Two invited speakers, Henry Prakken and Ondrej Bojar, have honoured JURIX 2024 by kindly agreeing to deliver keynote lectures. Henry Prakken is a professor of AI and Law in the Responsible AI group of the Department of Information and Computing Sciences at Utrecht University. His main research interests concern AI and Law and computational models of argumentation. He is a past president of the International Association for AI and Law (IAAIL), of the JURIX Foundation for Legal Knowledge-Based Systems, and of the steering committee of the COMMA conferences on Computational Models of Argument. He is on the editorial board of several journals, including Artificial Intelligence and Law. Between 2017 and 2022, he was an associate editor of Artificial Intelligence. Ondrej Bojar is a professor and machine translation researcher at the Institute of Formal and Applied Linguistics at Charles University in Prague. His research interests include interactive machine translation, machine-translation post-editing, and psycholinguistic aspects of machine translation. He co-authored Moses, a system for statistical machine translation, and is a long-term organizer of the WMT Conference on Machine Translation. We are very grateful to them for accepting our invitation and for their inspiring talks.
Continuing the tradition, JURIX is once more accompanied by satellite co-located events: three workshops (CLAIRvoyant, AI4Legs-III, AI4A2J) and the Doctoral Consortium. CLAIRvoyant: ConventicLE on Artificial Intelligence Regulation explores topics such as the implications of the European AI Act for business (legal and ethical aspects), the impact of the AI Act on high-risk sectors (e.g. healthcare, justice, transport, finance), and standards for legal AI. AI4Legs-III, the 3rd Workshop on AI for Legislation, discusses challenging questions involving the use of AI, and technology in general, to support the legislative process, with interdisciplinary instruments coming from the philosophy of law, constitutional law, legal informatics including AI and law, computational linguistics, computer science, HCI, and legal design. AI4A2J, AI for Access to Justice, focuses on innovations in AI which help to close the access to justice gap: the majority of legal problems around the world go unsolved because potential litigants lack the time, money, or ability to participate in the court processes that could resolve them.
Organising this edition would not have been possible without the support of many people and institutions. Special thanks are due to the local organising team, Jakub Harasta, Tereza Novotna, and Jakub Misek, and the Institute of Law and Technology at Masaryk University. We would like to thank the workshop organisers for their proposals and for the effort involved in organising the events. We owe our gratitude to Monica Palmirani, who kindly assumed the function of the Doctoral Consortium Chair. We are particularly grateful to the 84 members of the Programme Committee for their work in the rigorous review process and for their participation in the discussions. Finally, we would like to thank Giovanni Sileno, Michal Araszkiewicz and Morgan Gray for their support and advice, and the current JURIX General Board for their support, advice, and for taking care of all JURIX initiatives.
Jaromir Savelka, JURIX 2024 Programme Chair
Jakub Harasta, JURIX 2024 Organisation Co-Chair
Tereza Novotna, JURIX 2024 Organisation Co-Chair
Jakub Misek, JURIX 2024 Organisation Co-Chair
We introduce a novel conceptual Case Frame model that represents the content of cases involving statutory interpretation within civil law frameworks, accompanied by an associated argument scheme enriched with critical questions. By validating our approach with a modest dataset, we demonstrate its robustness and practical applicability. Our model not only provides a structured method for analyzing statutory interpretation but also highlights the distinct needs of lawyers operating under statutory law compared to those reasoning with common law precedents. The model presented here is a step towards developing a hybrid Machine Learning-Argumentation system that includes a module for constructing well-structured arguments from textual datasets.
This paper employs Large Language Models (LLMs) to support the development of expert systems in the legal domain. Our goal is to tackle one of the most critical issues related to the creation and management of rule-based systems, namely the knowledge representation bottleneck. To do so, we employ GPT-4o in combination with an existing expert system developed using the Prolog language, presenting a case study based on multiple tasks. The first task deals with the formalization of legal articles in Prolog given a stable knowledge base and factual structure, including the revision of existing facts. The second task deals with the implementation of case law for updating the expert system. To do so, it identifies the influence of case law on the application of existing norms, creates new rules, and implements them in the system. This paper contributes to the field of law and Artificial Intelligence (AI) by investigating the relationship between LLMs and legal expert systems, exploring its usefulness for knowledge engineers, and contributing to the research on hybrid architectures combining generative and symbolic AI.
With the digital transition, the need to control the processing of digital information has significantly increased. In the EU in particular, Law Enforcement Agencies (LEAs) are increasingly required to exchange information. In recent years, many regulations have emerged to control data processing and exchange: beyond the GDPR, texts such as the Law Enforcement Directive (LED) were introduced specifically to regulate the data processing of LEAs. Although many new formalisms have emerged to represent legal norms and rules, few are provided with a reasoning mechanism, and the explainability of the results of systems using these formalisms remains a major issue in critical decision situations. This paper proposes a framework to operationalise formal rules derived from regulations and to guide users in their decision process when data is processed by LEAs, focusing on both the operability of the rules through reasoning and the explainability of the results of the reasoning.
Consistency of case bases is a way to avoid the problem of retrieving conflicting constraining precedents for new cases to be decided. However, in legal practice the consistency requirements for case bases may not be satisfied. As pointed out in [6], a model of precedential constraint should take into account the hierarchical structure of the specific legal system under consideration and the temporal dimension of cases. This article continues the research initiated in [18,9], which established a connection between Boolean classifiers and legal case-based reasoning. On this basis, we enrich the classifier models with an organisational structure that takes into account both the hierarchy of courts and which courts issue decisions that are binding/constraining on subsequent cases. We focus on common law systems. We also introduce a temporal relation between cases. Within this enriched framework, we can formalise the notions of overruled cases and cases decided per incuriam: such cases are not to be considered binding on later cases. Finally, we show under which conditions principles based on the hierarchical structure and on the temporal dimension can provide an unambiguous decision-making process for new cases in the presence of conflicting binding precedents.
Judicial discretion is a central question in both the theory and the practice of law, but it has so far received very little explicit attention in AI & Law. What is more, it is often regarded as marking the limit of what can be formalized in law, which might have serious implications for the future of computational law. In this paper, we introduce a deontic logic extended with nuanced permissions that aims to capture the characteristics, normative framework, and reasoning process of discretionary judicial decision-making. We illustrate the modeling capacity of the Discretionary Judicial Decision Logic (DJDL) by formalizing examples from an area of law where discretion plays an openly crucial role: family law, more precisely child custody cases.
Factors are a foundational component of legal analysis and computational models of legal reasoning. These factor-based representations enable lawyers, judges, and AI and Law researchers to reason about legal cases. In this paper, we introduce a methodology that leverages large language models (LLMs) to discover lists of factors that effectively represent a legal domain. Our method takes as input raw court opinions and produces a set of factors and associated definitions. We demonstrate that a semi-automated approach, incorporating minimal human involvement, produces factor representations that can predict case outcomes with moderate success, if not yet as well as expert-defined factors can.
Despite some improvements in compliance metrics after the implementation of the European General Data Protection Regulation (GDPR), privacy policies have become longer and more ambiguous. They often fail to fully meet GDPR requirements, thus leaving users without a reliable way to understand how their data is processed. We present a novel corpus composed of 30 privacy policies of online platforms and a new set of annotation guidelines to assess the level of comprehensiveness of information. We focus on the categories of data processed, classifying each clause either as fully informative or as insufficiently informative. In our experimental evaluation, we perform six different classification and detection tasks, comparing BERT models and generative Large Language Models.
This study introduces the Collision Clarification Generator (CCG), a Large Language Model-based system designed to assist in documenting traffic accidents. The CCG comprises three modules: Questioning, Information Extraction, and Accident Sequence Generation, which collectively streamline the process of gathering and structuring accident information. The system employs predefined question templates and a standardized Traffic Accident Record Format (TARF) to ensure comprehensive data collection.
Evaluation of the CCG involved both human assessment and LLM-based automatic evaluation. Results showed an F1 score of 0.909 in human evaluation, and scores exceeding 7 out of 10 for accuracy and completeness in LLM-based assessment. These findings demonstrate the CCG’s effectiveness in accurately documenting accident information, potentially facilitating subsequent legal and insurance processes.
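The F1 score reported above is the harmonic mean of precision and recall. As a quick illustration (using hypothetical counts, not the study's actual tallies), the computation can be sketched as:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, from raw evaluation counts."""
    precision = tp / (tp + fp)  # fraction of extracted items that are correct
    recall = tp / (tp + fn)     # fraction of relevant items that were extracted
    return 2 * precision * recall / (precision + recall)

# e.g. 10 true positives, 1 false positive, 1 false negative
print(round(f1_score(10, 1, 1), 3))  # → 0.909
```

Note that a score of 0.909 can arise from many different precision/recall combinations; the example counts here are chosen purely for illustration.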
Computable legal contracts offer a formal structure and semantics that can help identify incompatibilities among clauses, such as clauses that will never be used or clauses whose simultaneous application is impossible. In this paper we study a methodology for spotting incompatibilities in contracts written in Stipula, a domain-specific language for legal contracts. Drawing on real cases, we identify recurring incompatible code patterns and propose techniques for integrating their analysis into the Stipula toolchain.
We report on a study undertaken to analyse AI performance on two tasks involved in automating the processing of cases from the European Court of Human Rights: classification of legal case outcomes and keyword prediction. Results show variation across Articles and Court levels, and challenge the common viewpoint that larger legal corpora combined with larger models will be sufficient for effective automated legal reasoning. Legal summarisation, as reflected in keyword prediction, proved more challenging than outcome classification. Our results suggest the need for improved case law retrieval and understanding of contextual factors for effective automated legal decision support.
We modify the restraining bolt technique, originally designed for safe reinforcement learning, to regulate agent behavior in alignment with social, ethical, and legal norms. Rather than maximizing rewards for norm compliance, our approach minimizes penalties for norm violations. We demonstrate in case studies the effectiveness of our approach in capturing benchmark challenges in normative reasoning, such as contrary-to-duty obligations, exceptions, and temporal obligations.
This paper reports on an experiment on using case-based reasoning in Dutch administrative law. The use case is decision support for human medical experts at the Dutch Central Office of Driving Certification who have to decide whether a citizen who applies for a driving licence is fit to drive. Case-based reasoning is investigated for this purpose because of its potential advantages over machine-learning approaches as regards transparency and explainability. Traditional case-based reasoning, AI & Law models of precedential constraint, and their combination are all evaluated for predictive accuracy relative to a large case base with more than 30,000 cases. A combined model is found to have the highest accuracy. The results indicate that human-in-the-loop support with a tool based on the combined model may be feasible, but whether this is indeed so requires further investigation.
In the legal domain, research efforts are being made to enhance precedent retrieval by exploiting features based on meta-data, catchphrases, citations, sentences, paragraphs, etc. It has been observed that the text surrounding a citation both helps identify the referenced judgment and supplies additional information about that judgment and its connection to the formulated argument. In this paper, we exploit the text surrounding a citation to improve the document representation of the referenced judgment. Experiments conducted on Indian court judgments show that the proposed Preceding citation Anchor Text (PAT)-based approach captures certain nuances that are not captured by the text present in the referenced judgment, indicating that there is scope to exploit PAT to improve the performance of precedent retrieval systems.
Legal intake, the process of finding out if an applicant is eligible for help from a free legal aid program, takes significant time and resources. In part this is because eligibility criteria are nuanced, open-textured, and require frequent revision as grants start and end. In this paper, we investigate the use of large language models (LLMs) to reduce this burden. We describe a digital intake platform that combines logical rules with LLMs to offer eligibility recommendations, and we evaluate the ability of 8 different LLMs to perform this task. We find promising results for this approach to help close the access to justice gap, with the best model reaching an F1 score of .82, while minimizing false negatives.
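The combination of logical rules with LLM-based recommendations described above can be illustrated with a minimal sketch. All names here (`POVERTY_LINE`, `rule_screen`, `llm_screen`, `recommend`) and the criteria are hypothetical, and the LLM call is stubbed; the paper's actual platform, rules, and prompts are not described in this abstract.

```python
# Hypothetical hybrid intake triage: deterministic rules decide clear-cut
# cases, and an LLM (stubbed here) is consulted only when the rules are
# inconclusive, which helps minimize false negatives on nuanced applications.
POVERTY_LINE = 15_000  # hypothetical annual income threshold


def rule_screen(applicant: dict):
    """Return True/False when the rules are decisive, None otherwise."""
    if applicant["income"] > 2 * POVERTY_LINE:
        return False  # clearly over the income limit
    if applicant["income"] <= POVERTY_LINE and applicant["case_type"] in {
        "eviction",
        "benefits",
    }:
        return True  # squarely within a funded grant
    return None  # open-textured case: defer to the LLM


def llm_screen(applicant: dict) -> bool:
    """Placeholder for an LLM call weighing open-textured eligibility criteria."""
    # A real system would prompt a model with the grant's criteria and the
    # applicant's narrative; here we return a fixed answer for illustration.
    return applicant["case_type"] == "family"


def recommend(applicant: dict) -> bool:
    decision = rule_screen(applicant)
    return decision if decision is not None else llm_screen(applicant)
```

The design choice is that hard rules handle the unambiguous cases cheaply and deterministically, reserving the model, and its associated cost and uncertainty, for the nuanced remainder.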
Mediation is a dispute resolution method featuring a neutral third party (the mediator), who intervenes to help the parties resolve their dispute. In this paper, we investigate to what extent large language models (LLMs) are able to act as mediators. We investigate whether LLMs are able to analyze dispute conversations, select suitable intervention types, and generate appropriate intervention messages. Using a novel, manually created dataset of 50 dispute scenarios, we conduct a blind evaluation comparing LLMs with human annotators across several key metrics. Overall, the LLMs showed strong performance, even outperforming our human annotators across key dimensions. Specifically, in 62% of the cases, the LLMs chose intervention types that were rated as better than or equivalent to those chosen by humans. Moreover, in 84% of the cases, the intervention messages generated by the LLMs were rated as better than or equal to the intervention messages written by humans. LLMs likewise performed favourably on metrics such as impartiality, understanding, and contextualization. Our results demonstrate the potential of integrating AI in online dispute resolution (ODR) platforms.
In this paper we build on a formal model of reasoning with dimensions to analyze data from the COMPAS program—a widely used and studied tool for predicting recidivism. We extend the underlying theory of the model by introducing a notion of consistency and apply it to assess whether COMPAS follows this principle in its risk assessments and supervision level recommendations. Our analysis yields three key findings. First, the program’s risk score assignments appear highly inconsistent, but we argue this is due to important input features missing from the dataset. Second, the program’s recommended supervision levels do exhibit a high degree of consistency. Third, we uncover errors in the dataset related to the conversion of raw scores to decile scores. These findings cast doubts on previous studies conducted on the COMPAS dataset, and demonstrate the need for evaluation studies like ours.
In this work, we propose a hybrid approach for legal norm retrieval that combines the structural information modeled in knowledge graphs with the textual content of legal documents. Our method utilizes the intricate relationships within the Japanese Civil Code, supplemented by relevant precedents, references, commentary, and mentions in legal textbooks on Japanese law. We assess the effectiveness of our approach in Task 3 of the Competition on Legal Information Extraction/Entailment (COLIEE), using both a transformer model and BM25 as a more explainable retrieval model. In our experiments, we examine the contributions of the different legal document types, showing the positive impact of the knowledge graph and auxiliary information.
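BM25, mentioned above as the more explainable retrieval model, ranks documents by term-frequency and inverse-document-frequency statistics. A minimal self-contained sketch of Okapi BM25 scoring (illustrative only; the COLIEE system's actual implementation, tokenization, and parameters are not described in this abstract) might look like:

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n  # average document length

    # Document frequency: in how many documents each term appears.
    df: Counter = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = k1 * (1 - b + b * len(d) / avgdl)  # length normalization
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores
```

Because every score decomposes into per-term contributions, BM25 rankings can be explained term by term, which is presumably why it serves as the more explainable baseline alongside the transformer model.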
With the adoption of the AI Act, fundamental rights impact assessment (FRIA) processes become highly relevant for both public and private institutions; yet such processes can be challenging, especially for small- to medium-sized organizations. One recent study that piloted a partial automation of FRIA is Anticipating Harms of AI (AHA!), relying on the use of a large language model and crowd-sourcing; unfortunately, the paper provides limited insight into its internal workings. Therefore, this work presents AFRIA, a processing pipeline that performs specific aspects of FRIA, conceived with AHA! as inspiration. In order to assess to what extent AFRIA is a successful reconstruction of AHA!, we analyzed the percentage of meaningful harms that AFRIA generates and the distribution of harm categories, and compared them to AHA!'s results, finding a satisfactory convergence. Beyond inspiration from AHA!, we also looked into the requirements of the AI Act and scholarly critique to make AFRIA more meaningful for identifying impacts on fundamental rights, targeting categories of human rights impacts, potential harm mitigation measures, and the severity and likelihood of the harms. The results show opportunities, but also limitations, in the type of support this technology can bring.
The development of autonomous vehicles (AVs) requires a comprehensive understanding of both explicit and implicit traffic rules to ensure legal compliance and safety. While explicit traffic laws are well-defined in statutes and regulations, implicit rules derived from judicial interpretations and case law are more nuanced and challenging to extract. This research investigates the potential of Large Language Models (LLMs), particularly GPT-4o, in automating the extraction of implicit traffic rules from judicial decisions. By utilizing various prompt engineering techniques, including Standard Prompts, Chain-of-Thought (CoT), Chain-of-Instructions (CoI), and Layer-of-Thought (LoT) prompts, this study aims to assess the effectiveness of GPT-4o in identifying normative content relevant to specific traffic laws. The contributions of this paper include an assessment of LLMs for legal text processing, the automation of implicit rule extraction, and the development of a scalable framework that can continuously update as new legal precedents emerge. The results indicate promising avenues for integrating automated normative extraction in AV systems, improving both the safety and legal compliance of autonomous driving technologies.
In recent years, Brazil's federal judicial system has embraced digitalization, making a large amount of legal process information available to citizens and legal experts. Despite the advances, a significant portion of the data produced and stored in legal systems presents itself in the form of natural language text, including numerous petitions and legal decisions. This creates barriers to the automated querying and analysis of legal process data, especially considering the importance of the content of legal decisions in these tasks. In this paper, we report on an automated semantic annotation pipeline for judicial decision texts obtained from the official National Uniformization Panel (TNU) jurisprudence website. NLP models are trained in a few-shot learning context with a training set annotated by legal experts. The semantic annotation approach is evaluated using precision and recall. The results of the semantic annotation are represented as RDF-based nanopublications aligned with a reference domain ontology. The annotations are accompanied by provenance information, including identification of the machine learning model used.
Argumentation is often an attempt to resolve disagreement, but it is not always possible to reach a resolution. This is illustrated in law where multi-judge trials often end with a split decision. Not only do the judges disagree as to outcome (dissenting opinions), but also as to the reasons for a given outcome (concurring opinions). These disagreements can be explained in terms of different values held by the judges concerned. But while the role of values in determining which arguments are accepted has been widely explored, values can also determine which arguments can be constructed. The paper provides an analysis of this phenomenon.
One can find various temporal deontic logics in the literature, most focusing on discrete time. The literature on real-time constraints and deontic norms is much sparser. Thus, many analysis techniques which have been developed for deontic logics have not been considered for continuous time. In this paper we focus on the notion of conflict analysis, which has been extensively studied for discrete-time deontic logics. We present a sound, but not complete, algorithm for detecting conflicts in timed contract automata, prove its correctness, and illustrate the analysis on a case study.