Ebook: Compendium of Neurosymbolic Artificial Intelligence
If only it were possible to develop automated and trainable neural systems that could justify their behavior in a way that could be interpreted by humans, as a symbolic system can. The field of Neurosymbolic AI aims to combine two disparate approaches to AI: symbolic reasoning and neural or connectionist approaches such as Deep Learning. The quest to unite these two types of AI has led to the development of many innovative techniques that extend the boundaries of both disciplines.
This book, Compendium of Neurosymbolic Artificial Intelligence, presents 30 invited papers which explore various approaches to defining and developing a successful system to combine these two methods. Each strategy has clear advantages and disadvantages, and most aim to find some useful middle ground between the rigid transparency of symbolic systems and the more flexible yet highly opaque neural applications. The papers are organized by theme, with the first four being overviews or surveys of the field. These are followed by papers covering neurosymbolic reasoning; neurosymbolic architectures; various aspects of Deep Learning; and finally two chapters on natural language processing. All papers were reviewed internally before publication. The book is intended to follow and extend the work of the previous book, Neuro-symbolic artificial intelligence: The state of the art (IOS Press; 2021), which laid out the breadth of the field at that time.
Neurosymbolic AI is a young field which is still being actively defined and explored, and this book will be of interest to those working in AI research and development.
The field of Neurosymbolic AI, also known as Neuro-Symbolic or Neural-Symbolic AI, aims to unite two disparate approaches to AI: symbolic reasoning and neural networks. Symbolic reasoning is often mathematically and logically explicit, and therefore suitable for formal deductive reasoning applications and knowledge representation. Neural or connectionist approaches such as Deep Learning, on the other hand, usually follow inductive strategies, making them well suited for statistical and empirical tasks such as classification and prediction. Research on uniting both of these types of AI has led to many innovative techniques that extend the boundaries of both disciplines.
It is important to emphasize at the very beginning that Neurosymbolic AI is a young field, still actively being defined and explored. Many recent surveys such as [1, 2, 3] have sought to define the discipline, but efforts are in many ways still scattered. To give an example, one of the challenges in this domain is that, due to the sheer conceptual difficulty of the task, it is never entirely clear a priori what a successful system should look like. Many approaches adopt techniques from neural networks to improve symbolic tasks, where statistics such as accuracy are very relevant for evaluating predictions and classifications. However, when we do the reverse and use symbolic systems to perform neural operations, these metrics are potentially redundant or meaningless, like asking how accurate a formal proof is. Considering this difference, we can see quite clearly one of the largest distinguishing factors between different approaches to integration, namely: are we attempting to do symbolic reasoning with neural networks, performing neural-network tasks using symbols, using one type of AI to augment a task that is purely within the domain of the other, or some novel combination of the two? The answer to this question guides how such a system is designed, developed, and evaluated, and how it can be compared with others. In this book we will see examples of each of these approaches.
Broadly speaking, the hope of many of these efforts is to find some useful middle ground between the rigid transparency of symbolic systems and the more flexible yet highly opaque neural applications. Each strategy has clear advantages and disadvantages that it would at times be useful to combine. Symbolic reasoning, for example, is usually entirely transparent, and system behavior can be inspected at any level of execution. This useful characteristic is largely absent in neural systems, which often behave as black boxes. Yet the transparency comes at a cost: to achieve this interpretability, symbolic systems must be manually written by human engineers, which is often extremely time-consuming. If it were somehow possible to have automated and trainable neural systems that can also justify their behavior in a way that humans can interpret, as a symbolic system does, this would in effect be a best-of-both-worlds scenario. This goal, along with many others in neurosymbolic reasoning, is of course a long way off, but it is clear that advances in this direction would greatly serve the cause of AI in general.
This book is intended to follow and extend the work in the previous book [4], which began to lay out the current breadth of the field. The chapters in the current book are organized by theme, with the chapters within each theme appearing in alphabetical order by surname of the first author. The first four chapters are overview or survey papers on the field; the following three discuss the fundamentals of neurosymbolic reasoning; after that come five chapters on neurosymbolic architectures; then four chapters on symbolic reasoning using Deep Learning; five on symbolic inference with Deep Learning; three on improving Deep Learning with symbolic methods; four on explainable Deep Learning; and finally two chapters on natural language processing.
All chapters were invited contributions selected from an open call for abstracts, in which authors proposed papers combining their previous work. Papers were reviewed internally before publication. We thank all individuals who contributed to the publication of this book.
References
[1] Bader S, Hitzler P. Dimensions of neural-symbolic integration – A structured survey. In: Artëmov SN, Barringer H, d’Avila Garcez AS, et al., editors. We Will Show Them! Essays in Honour of Dov Gabbay, Volume One. College Publications; 2005. p. 167–194.
[2] Garcez Ad, Bader S, Bowman H, et al. Neural-symbolic learning and reasoning: A survey and interpretation. In: Neuro-symbolic artificial intelligence: The state of the art. (Frontiers in Artificial Intelligence and Applications; Vol. 342). IOS Press; 2022.
[3] Sarker MK, Zhou L, Eberhart A, et al. Neuro-symbolic artificial intelligence: Current trends. AI Communications. 2021;34(3):197–209.
[4] Hitzler P, Sarker MK, editors. Neuro-symbolic artificial intelligence: The state of the art. (Frontiers in Artificial Intelligence and Applications; Vol. 342). IOS Press; 2021. Available from: https://doi.org/10.3233/FAIA342.
We propose that symbols are first and foremost external communication tools used between intelligent agents that allow knowledge to be transferred in a more efficient and effective manner than having to experience the world directly. But, they are also used internally within an agent through a form of self-communication to help formulate, describe and justify subsymbolic patterns of neural activity that truly implement thinking. Symbols, and our languages that make use of them, not only allow us to explain our thinking to others and ourselves, but also provide beneficial constraints (inductive bias) on learning about the world. In this paper we present relevant insights from neuroscience and cognitive science, about how the human brain represents symbols and the concepts they refer to, and how today’s artificial neural networks can do the same. We then present a novel neuro-symbolic hypothesis and a plausible architecture for intelligent agents that combines subsymbolic representations for symbols and concepts for learning and reasoning. Our hypothesis and associated architecture imply that symbols will remain critical to the future of intelligent systems NOT because they are the fundamental building blocks of thought, but because they are characterizations of subsymbolic processes that constitute thought.
Ontologies are used in various domains, with RDF and OWL being prominent standards for ontology development. RDF is favored for its simplicity and flexibility, while OWL enables detailed domain knowledge representation. However, as ontologies grow larger and more expressive, reasoning complexity increases, and traditional reasoners struggle to perform efficiently. Despite optimization efforts, scalability remains an issue. Additionally, advancements in automated knowledge base construction have created large and expressive ontologies that are often noisy and inconsistent, posing further challenges for conventional reasoners. To address these challenges, researchers have explored neuro-symbolic approaches that combine neural networks’ learning capabilities with symbolic systems’ reasoning abilities. In this chapter, we provide an overview of the existing literature in the field of neuro-symbolic deductive reasoning supported by RDF(S), the description logics EL and ALC, and OWL 2 RL, discussing the techniques employed, the tasks they address, and other relevant efforts in this area.
We propose a set of architectural patterns that describe a large variety of neuro-symbolic systems. As in other areas of computer science (knowledge engineering, software engineering, ontology engineering, process mining and others), such design patterns provide a unified vocabulary to describe a large variety of systems, help to systematise the literature, clarify which combinations of techniques serve which purposes, and encourage re-use of software components. We have validated our set of compositional design patterns against a large body of recent literature, and we apply them to a number of systems described in the different sections of this volume.
Sections 1.1-1.7 of this chapter are based on our earlier publications [1] and [2]; the new contributions of this chapter are in Sections 1.5 and 1.6.
In line with the general trend in artificial intelligence research to create intelligent systems that combine learning and symbolic techniques (a.k.a. neuro-symbolic systems), a new sub-area has emerged that focuses on combining machine learning (ML) components with techniques developed by the Semantic Web (SW) community – Semantic Web Machine Learning (SWeML for short). Due to the rapid growth of this area and its impact on several communities in the last two decades, there is a need to better understand the space of these SWeML Systems, their characteristics, and trends. Of particular interest are the emerging variations of processing patterns used in these systems in terms of their inputs/outputs and the order of the processing units. While several such neuro-symbolic system patterns were identified previously from a large number of papers, there is currently no insight into their adoption in the field, e.g., about the completeness of the introduced system patterns, or about their usage frequency. To fill that gap, we performed a systematic study and analyzed nearly 500 papers published in the last decade in this area, where we focused on evaluating the type and frequency of such system patterns. Overall we discovered 41 different system patterns, which we categorized into six pattern types. In this chapter we detail these pattern types, exemplify their use in concrete papers and discuss their characteristics in terms of their semantic and machine learning modules.
In this paper, we motivate three approaches for integrating symbolic logic systems and deep learning methods. First, we consider whether the hidden layers of neural networks can be used to represent and reason about Boolean functions via so-called Tractable Circuits. Second, we discuss a method for encoding domain knowledge into the training and outputs of neural networks via so-called MultiplexNets. Finally, we show how we can instantiate deep learning architectures that perform exact function learning, via so-called Signal Perceptrons.
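As background for the first of these ideas, here is a minimal sketch (ours, not the chapter's construction) of the classical observation that threshold units can encode Boolean gates exactly, so that a Boolean function can live in the weights of a small network; the formula and weights below are illustrative.

```python
import numpy as np

def step(z):
    # Heaviside threshold: 1 if z >= 0, else 0.
    return 1.0 if z >= 0 else 0.0

def unit(x, w, b):
    # One threshold neuron: step(w . x + b).
    return step(np.dot(w, x) + b)

def formula(x1, x2, x3):
    # (x1 AND NOT x2) OR x3, evaluated by a two-layer threshold network.
    not_x2 = unit([x2], [-1.0], 0.5)               # NOT gate
    h = unit([x1, not_x2], [1.0, 1.0], -1.5)       # AND gate
    return unit([h, x3], [1.0, 1.0], -0.5)         # OR gate

for x in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)]:
    print(x, "->", int(formula(*x)))
```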
While neuro-inspired and symbolic artificial intelligence have long been considered ideal complements, approaches to hybridizing these concepts often lack a unifying grand theory. The way the philosophical concept of constructivism has been adapted for educational purposes, however, provides a fruitful source of inspiration for this purpose. To this end, we have been developing a framework termed Constructivist Machine Learning, which applies constructivist learning principles and exploits metadata on the grounds of Stachowiak's General Model Theory in order to bridge the gap between neuro-inspired and symbolic approaches. In this chapter, we summarize our previous work in order to introduce the reader to the most important ideas and concepts.
Symbolic knowledge is vital in various human cognitive functions, including reasoning, skill acquisition, and communication. Its integration is also essential for creating AI with human-like capabilities, such as robustness, creativity, and interpretability. Nevertheless, current machine learning approaches still dominantly emphasize learning from large data sets, struggling to effectively and scalably incorporate symbolic knowledge. This leads to fundamental limitations, such as brittle results when faced with complex or novel concepts, and difficulty in understanding or explaining the decision processes of models. Past attempts at integrating symbolic information with neural networks frequently rely on manually created knowledge bases that are defined in specific configurations, thereby impeding their generalizability to new applications and domains. This chapter seeks to introduce interaction and co-evolving mechanisms between neural models and symbolic knowledge bases. It starts from constructing a panoramic learning framework for learning with all experiences (data, rules, knowledge graphs, etc.). Subsequently, it delves into a novel inversion problem of extracting symbolic knowledge from black-box neural models. Finally, based on the components mentioned above, the chapter will explore a blueprint of a lifelong neural-symbolic system that accommodates human intervention.
This chapter proposes neuro-causal models, a novel neuro-symbolic model architecture that uses a synthesis of deep generative models and causal graphical models to automatically infer higher level symbolic information from lower level "raw features", while also allowing for rich relationships among the symbolic variables. Neuro-causal models retain the flexibility of modern deep neural network architectures while simultaneously capturing statistical semantics such as identifiability and causality, which are important for discussing ideal target representations and their tradeoffs. We consider a general setting for this problem: no assumptions are placed on these relationships, and the number of hidden variables, their state spaces, and their relationships are presumed unknown. The primary objective is to provide explicit conditions under which all of this can be recovered uniquely, and to develop practical algorithms for learning these representations from data.
Commonsense reasoning is an attractive test bed for neuro-symbolic techniques, because it is a difficult challenge where pure neural and symbolic methods fall short. In this chapter, we review commonsense reasoning methods that combine large-scale knowledge resources with generalizable neural models to achieve both robustness and explainability. We discuss knowledge representation and consolidation efforts that harmonize heterogeneous knowledge. We cover representative neuro-symbolic commonsense methods that leverage this commonsense knowledge to reason over questions and stories. The range of reasoning mechanisms includes procedural reasoning, reasoning by analogy, and reasoning by imagination. We discuss different strategies to design systems with native explainability, such as engineering the knowledge dimensions used for pretraining, generating scene graphs, and learning to produce knowledge paths.
The topic of this chapter is cognitive neuroarchitectures, which have been developed in connectionism since the mid-eighties as theoretical models that aim to explain, on the basis of empirical-experimental data and in as neurophysiologically plausible a manner as possible, the perceptual and linguistic performance of self-organizing neuronal networks in the human brain related to the binding problem. Thus, neurocognition can be viewed as organized by integrative (phase) synchronization mechanisms that orchestrate the flow of neurocognitive information in self-organizing networks with positive and/or negative feedback loops in subcortical and cortical areas of the brain. This dynamic perspective on cognition contributes significantly to bridging the gap between the discrete, abstract symbolic description of propositions in the mind and their continuous, numerical, and dynamic modeling in terms of cognitive neuroarchitectures in connectionism. This dynamic binding mechanism in connectionist cognitive neuroarchitectures thus has the advantage of enabling more accurate modeling of cognitive processes by conceptualizing these neuroarchitectures as nonlinear dynamical systems. This means that neurocognition is modeled by a neurodynamics in abstract n-dimensional phase spaces, in the form of nonlinear vector fields or vector flows.
The prediction made by a learned model is rarely the end outcome of interest to a given agent. In most real-life scenarios, a certain policy is applied on the model's prediction and on some relevant context to reach a decision. It is the (possibly temporally distant) effects of this decision that bring value to the agent. Moreover, it is those effects, and not the model's prediction, that need to be evaluated as far as the agent's satisfaction is concerned. The formalization of such scenarios naturally raises certain questions: How should a learned model be integrated with a policy to reach decisions? How should the learned model be trained and evaluated in the presence of such a policy? How is the training affected by the type of access one has to the policy? How can the policy be represented and updated in a way that is cognitively compatible with a human, so that it offers an explainable layer of reasoning on top of the learned model?
This chapter offers a high-level overview of past work on the integration of modular reasoning with autodidactic learning and with user-driven coaching, as it applies to neural-symbolic architectures that sequentially combine a neural module with an arbitrary, symbolically represented (and possibly non-differentiable) policy. In this context, the chapter offers responses to the questions above when the policy can be reasoned with only in a deductive manner, or in a deductive and an abductive manner. It further discusses how the policy can be learned or updated in an elaboration-tolerant and cognitively light manner through machine coaching, and highlights the connections of the dialectical coaching process with the central role that argumentation plays in human reasoning.
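To make the sequential neural-then-policy setup concrete, the following hedged sketch pipes a stand-in classifier's probabilities through a symbolically represented, non-differentiable policy; all names, rules, and thresholds are invented for illustration and are not from the chapter.

```python
def neural_model(image):
    # Stand-in for a trained perception module: returns class probabilities.
    return {"stop_sign": 0.92, "yield_sign": 0.05, "other": 0.03}

POLICY = [
    # (condition over prediction and context, decision), checked in order.
    (lambda p, ctx: p["stop_sign"] > 0.9 and ctx["speed_kmh"] > 0, "brake"),
    (lambda p, ctx: p["yield_sign"] > 0.9, "slow_down"),
    (lambda p, ctx: True, "continue"),
]

def decide(image, context):
    probs = neural_model(image)
    for i, (condition, decision) in enumerate(POLICY):
        if condition(probs, context):
            # The fired rule doubles as a human-readable justification.
            return decision, f"rule {i} fired"

print(decide(None, {"speed_kmh": 50}))  # -> ('brake', 'rule 0 fired')
```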
“For the entire nervous system is nothing but a system of paths between a sensory terminus a quo and a muscular, glandular, or other terminus ad quem.” William James (Principles of Psychology, 1890, p. 108, italics by James)
This chapter provides a larger perspective and background on the neural blackboard architectures and the underlying theory that have been developed over the last decades. The aim of these is to model compositional ‘symbolic’ processing, e.g. as found in language, in a neural manner. Neural blackboard architectures achieve this with a form of ‘logistics of access’ that is different from symbolic architectures. In particular, conceptual representations remain ‘in situ’ and hence content addressable in any compositional structure of which they are a part. ‘Symbolic’ processing then consists of the creation and control of temporal connection paths in neural blackboards that possess a ‘small world’ connection structure. In language, a connection path provides the intrinsic structure of a sentence. In this way, arbitrary sentence structures can be created and processed, and simulations can reproduce and predict brain activity observed in sentence processing. Next to presenting an overview, the chapter will discuss theoretical and modeling foundations and compare them with forms of symbolic processing as found in other AI architectures.
Knowledge bases are now first-class citizens of the Web. Circa 50% of the 3.2 billion websites in the 2022 crawl of Web Data Commons contain knowledge base fragments in RDF. The 82 billion assertions known to exist in these websites are complemented by a roughly comparable number of triples available in dumps. As this data is now the backbone of a number of applications, it stands to reason that machine learning approaches able to exploit the explicit semantics exposed by RDF knowledge bases must scale to large knowledge bases. In this chapter, we present approaches based on continuous and symbolic representations that aim to achieve this goal by addressing some of the main scalability bottlenecks of existing class expression learning approaches. While we focus on the description logic ALC, the approaches we present are far from being limited to this particular expressiveness.
Numerous large knowledge graphs, such as DBpedia, Wikidata, Yago and Freebase, have been developed in the last decade, which contain millions of facts about various entities in the world. These knowledge graphs have proven to be incredibly useful for intelligent Web search, question understanding, in-context advertising, social media mining, and biomedicine. As some researchers have pointed out, a knowledge graph is not just a graph database; it should also have a layer of conceptual knowledge, which is usually represented as a set of first-order rules. However, it is challenging to automatically extract first-order rules from large knowledge graphs; in particular, traditional models are usually unable to handle rule learning at this scale. This chapter aims to present state-of-the-art techniques and models for learning first-order rules using representation learning. After first recalling some basics of rule learning in knowledge graphs, we introduce useful techniques of embedding-based rule learning through major models such as RLvLR and TyRuLe, which embed paths in knowledge graphs into latent spaces for candidate rule search. Then we evaluate the rule learning efficiency and the quality of automatically learned rules by applying them to link prediction. Before concluding the chapter, we also discuss some open research problems in the area.
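As generic background for the link-prediction evaluation mentioned above (not the RLvLR or TyRuLe method itself), here is a minimal sketch of embedding-based triple scoring in the TransE style, in which a triple (h, r, t) is plausible when the embedding of h plus that of r lands near that of t; the entities, relation, and random embeddings are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
entities = {e: rng.normal(size=dim) for e in ["alice", "bob", "carol"]}
relations = {r: rng.normal(size=dim) for r in ["knows"]}

def score(h, r, t):
    # TransE-style plausibility: smaller distance ||h + r - t|| is better,
    # so we negate it to get a score to maximize.
    return -np.linalg.norm(entities[h] + relations[r] - entities[t])

# Rank candidate tails for the query (alice, knows, ?).
ranked = sorted(entities, key=lambda t: -score("alice", "knows", t))
print(ranked)
```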
Lifted Relational Neural Networks (LRNNs) were introduced in 2015 [1] as a framework for combining logic programming with neural networks for efficient learning of latent relational structures, such as various subgraph patterns in molecules. In this chapter, we will briefly re-introduce the framework and explain its current relevance in the context of contemporary Graph Neural Networks (GNNs). Particularly, we will detail how the declarative nature of differentiable logic programming in LRNNs can be used to elegantly capture various GNN variants and generalize to novel, even more expressive, deep relational learning concepts. Additionally, we will briefly demonstrate practical use and computation performance of the framework.
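The correspondence between a weighted relational rule and one round of message passing can be sketched in plain NumPy (this is an illustration of the idea, not the LRNN framework's actual syntax):

```python
# A rule such as
#     h1(X) :- W * h0(Y), edge(X, Y).
# read over all nodes at once is one GNN layer: each node X aggregates the
# features h0(Y) of its neighbours Y, applies a shared weight matrix W, and
# passes the result through a nonlinearity.
import numpy as np

A = np.array([[0, 1, 0],      # adjacency matrix: edge(X, Y)
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H0 = np.eye(3)                # initial node features h0
W = np.random.default_rng(1).normal(size=(3, 3))

def relu(x):
    return np.maximum(x, 0)

# One layer: for each X, sum over Y with edge(X, Y) of h0(Y) @ W.
H1 = relu(A @ H0 @ W)
print(H1)
```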
This chapter provides an overview of ERIC (Extracting Relations Inferred from Convolutions), a solution to the extraction of explanations and human-comprehensible knowledge from Convolutional Neural Networks (CNNs). ERIC reduces the behaviour of one or more convolutional layers to a discrete logic program over a set of atoms, each corresponding to an individual convolutional kernel. Extracted programs yield performance that correlates with that of the original model. When the logic rules are analysed alongside the data as a visual concept learner, ERIC has demonstrated the discovery of relevant concepts when applied to classification tasks, including those in fields with specialised knowledge such as radiology. Concepts with sharper edges seem to have a positive influence on the fidelity of extracted programs, to the extent that ERIC was able to yield high fidelity on MNIST and in a traffic sign classification task of up to 43 classes. Extracted concepts may also be transferred to a different CNN trained on a related but different problem in the same domain; for example, concepts identified for pleural effusion were transferable to a COVID-19 classification task. Also in the medical domain, ERIC has demonstrated the capability of identifying concepts used by CNNs that are not justified anatomically or used by medical doctors in their decision making. This chapter also briefly reviews Elite Backpropagation (EBP), which trains CNNs so that each class is associated with a small set of elite kernels, improving the performance of ERIC by inducing more compact rules while maintaining high fidelity.
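A hedged sketch of the quantization step at the heart of this style of extraction, with invented thresholds and random stand-in activations rather than ERIC's actual procedure: each kernel's activation map is reduced to one Boolean atom by thresholding its norm, and logic rules are then induced over those atoms.

```python
import numpy as np

rng = np.random.default_rng(0)
n_kernels = 5
activation_maps = rng.normal(size=(n_kernels, 8, 8))  # stand-in CNN outputs

def to_atoms(maps, thresholds):
    # atom_k is true iff kernel k's normalized activation norm exceeds
    # its threshold; the resulting truth assignment feeds rule induction.
    norms = np.linalg.norm(maps.reshape(len(maps), -1), axis=1) / maps[0].size
    return norms > thresholds

atoms = to_atoms(activation_maps, thresholds=np.full(n_kernels, 0.12))
print(atoms)  # e.g. [ True False ...]
```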
Knowledge graphs (KGs) are inherently incomplete, both because world knowledge itself is incomplete and because of bias in what is included in the KG. Additionally, world knowledge constantly expands and evolves, deprecating existing facts or introducing new ones. However, we would still like to be able to answer queries as if the graph were complete. In this chapter, we give an overview of several methods that have been proposed to answer queries in such a setting. We first provide an overview of the different query types that these methods can support and the datasets typically used for evaluation, as well as an insight into their limitations. Then, we give an overview of the different approaches and describe them in terms of expressiveness, supported graph types, and inference capabilities.
Integrations of case-based reasoning (CBR) with neural approaches are appealing because of their complementary characteristics. This chapter presents research on neuro-symbolic integrations to support CBR, with the goals of reducing knowledge engineering and improving performance for CBR systems. It summarizes three strands of research: first, on extracting features for case retrieval from deep neural networks, to use in concert with expert-generated features; second, on applying neural networks to learn to adapt the solutions of retrieved cases to fit new situations; and third, on harmonizing similarity learning with case adaptation learning, in order to focus retrieval on adaptable cases. It summarizes strengths, weaknesses, and tradeoffs of these approaches, and points to future challenges for neuro-CBR integrations.
Knowledge about space and time is necessary to solve problems in the physical world. Spatio-temporal knowledge, however, is required beyond interacting with the physical world, and is also often transferred to the abstract world of concepts through analogies and metaphors. As spatial and temporal reasoning is ubiquitous, different attempts have been made to integrate this into AI systems. In the area of knowledge representation, spatial and temporal reasoning has been largely limited to modeling objects and relations and developing reasoning methods to verify statements about objects and relations. On the other hand, neural network researchers have tried to teach models to learn spatial relations from data with limited reasoning capabilities. Bridging the gap between these two approaches in a mutually beneficial way could allow us to tackle many complex real-world problems. In this chapter, we view this integration problem from the perspective of Neuro-Symbolic AI. Specifically, we propose a synergy between logical reasoning and machine learning that will be grounded on spatial and temporal knowledge. A (symbolic) spatio-temporal knowledge base and a base of possibly grounded examples could provide a dependable causal seed upon which machine learning models could generalize. Describing some successful applications, remaining challenges, and evaluation datasets pertaining to this direction is the main topic of this contribution.
Recently there has been a lot of focus on developing deep learning models for symbolic reasoning tasks. One such task involves solving combinatorial problems, which can be viewed as instances of a constraint satisfaction problem, albeit with unknown constraints. The task of the neural model is then to discover the unknown constraints using training data consisting of many solved instances. There are broadly two approaches for learning the unknown constraints: the first creates a purely neural model that maps an input puzzle directly to its solution, thereby representing the constraints implicitly in the model's weights [1, 2, 3], and the second invokes a symbolic reasoner, such as an Integer Linear Program (ILP) solver, learning the constraints explicitly in the solver's language, e.g., linear inequalities for an ILP solver [4, 5]. In this chapter, we discuss both the implicit and the explicit approach. For the implicit approach, we review three neural architectures for solving combinatorial problems, viz., Neural Logic Machines (NLM) [2], Recurrent Relational Networks (RRN) [1], and the extensions of RRN proposed by Nandwani et al. [3] to tackle output space invariance (solving 16×16 Sudoku after training on 9×9 Sudokus). Under the second approach, with its explicit representation of constraints, we present two methods (CombOptNet [5] and ILP-Loss [4]) for end-to-end training of a Neural-ILP architecture: a neuro-symbolic model in which a neural perception layer is followed by an ILP layer that represents reasoning. CombOptNet, proposed by Paulus et al. [5], trains slowly owing to a call to an ILP solver in each learning iteration. ILP-Loss, proposed by Nandwani et al. [4], is on the other hand solver-free during training and thus much more scalable. In the end, we present specific experiments from [3] and [4] that compare the different methods discussed in this chapter.
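The Neural-ILP arrangement can be sketched as follows; here a brute-force search stands in for the ILP solver, and the "perception" scores are random placeholders rather than a trained network's outputs.

```python
# Hedged sketch of the Neural-ILP idea: a neural layer scores discrete
# choices, then a symbolic solver picks the best assignment satisfying
# explicit constraints (here, an all-different toy constraint that an ILP
# solver would express as linear inequalities).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4
scores = rng.normal(size=(n, n))  # scores[i, v]: net's confidence cell i = v

def solve(scores):
    # Brute force over all assignments with pairwise-distinct values,
    # standing in for a call to a real ILP solver.
    best, best_val = None, -np.inf
    for assignment in itertools.permutations(range(n)):
        val = sum(scores[i, v] for i, v in enumerate(assignment))
        if val > best_val:
            best, best_val = assignment, val
    return best

print(solve(scores))
```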
Ontologies represent human domain expertise symbolically in a way that is accessible to human experts and suitable for a variety of applications. As a result, they are widely used in scientific research. A challenge for the neuro-symbolic field is how to use the knowledge encoded in ontologies together with sub-symbolic learning approaches. In this chapter we describe a general neuro-symbolic architecture for using knowledge from ontologies to improve the generalisability and accuracy of predictions of a deep neural network applied to chemical data. The architecture consists of a multi-layer network with a multi-step training process: first, the network is given a self-supervised pre-training step with a masked-language objective in order to learn the input representation. Second, the network is given an ontology pre-training step in which it learns to predict membership in the classes of the ontology, as a way of learning organising knowledge from the ontology. Finally, we show that visualisation of the attention weights of the ontology-trained network allows some form of interpretability of the network's predictions. In general, we propose a three-layered architecture for neuro-symbolic integration, with layers for 1) encoding, 2) ontological classification, and 3) ontology-driven logical loss.
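A hedged sketch of the ontology pre-training stage, with illustrative module names and sizes rather than the chapter's implementation: after masked-language pre-training (omitted here), the encoder is trained to predict ontology-class membership as a multi-label problem.

```python
import torch
import torch.nn as nn

vocab_size, dim, n_classes = 100, 32, 10  # illustrative sizes

class OntologyPretrainer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # would be MLM-pretrained
        self.head = nn.Linear(dim, n_classes)       # one logit per class

    def forward(self, tokens):                      # tokens: (batch, seq)
        h = self.embed(tokens).mean(dim=1)          # crude pooling stand-in
        return self.head(h)                         # multi-label logits

model = OntologyPretrainer()
loss_fn = nn.BCEWithLogitsLoss()                    # membership is multi-label
tokens = torch.randint(0, vocab_size, (4, 16))
labels = torch.randint(0, 2, (4, n_classes)).float()  # membership targets
loss = loss_fn(model(tokens), labels)
loss.backward()
print(float(loss))
```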
Structured output prediction problems are ubiquitous in machine learning. The prominent approach leverages neural networks as powerful feature extractors, but otherwise assumes the independence of the outputs. These outputs, however, jointly encode an object, e.g. a path in a graph, and are therefore related through the structure underlying the output space. We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training by minimizing the network's violation of such dependencies, steering the network towards predicting distributions that satisfy the underlying structure. At the same time, it is agnostic to the arrangement of the symbols and depends only on the semantics expressed thereby, while also enabling efficient end-to-end training and inference. We also discuss key improvements and applications of the semantic loss. One limitation of the semantic loss is that it does not exploit the association of every data point with certain features certifying its membership in a target class. We should therefore prefer minimum-entropy distributions over valid structures, which we obtain by additionally minimizing the neuro-symbolic entropy. We empirically demonstrate the benefits of this more refined formulation. Moreover, the semantic loss is designed to be modular and can be combined with both discriminative and generative neural models. We illustrate this point by integrating the semantic loss into generative adversarial networks, yielding constrained adversarial networks, a novel class of deep generative models able to efficiently synthesize complex objects obeying the structure of the underlying domain.
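For a concrete instance: the semantic loss of a constraint is the negative log-probability, under the network's independent output distribution, of sampling any assignment that satisfies the constraint. The sketch below computes its closed form for the exactly-one constraint, whose satisfying assignments are the one-hot vectors.

```python
import numpy as np

def semantic_loss_exactly_one(p):
    # -log sum over x |= alpha of prod_i p_i^{x_i} (1 - p_i)^{1 - x_i},
    # where alpha = "exactly one output is true", so x ranges over the
    # one-hot vectors of length n.
    total = 0.0
    for i in range(len(p)):
        total += p[i] * np.prod([1 - p[j] for j in range(len(p)) if j != i])
    return -np.log(total)

print(semantic_loss_exactly_one(np.array([0.9, 0.05, 0.05])))  # near-valid: small loss
print(semantic_loss_exactly_one(np.array([0.5, 0.5, 0.5])))    # ambiguous: larger loss
```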
Deep learning approaches have become popular in sentiment analysis because of their competitive performance. The downside of these approaches is that they do not provide understandable explanations of how the sentiment values are calculated. Previous approaches that used sentiment lexicons for sentiment analysis can do so, but their performance is lower than that of deep learning approaches. Therefore, it is natural to wonder if the two approaches can be combined to exploit their advantages. In this chapter, we present a neuro-symbolic approach that combines both symbolic and deep learning approaches for sentiment analysis tasks.
The symbolic approach exploits a sentiment lexicon and shifter patterns, which cover the operations of inversion/reversal, intensification, and attenuation/downtoning. The deep learning approach uses a pre-trained language model (PLM) to construct the sentiment lexicon. Our experimental results show that the proposed approach leads to promising results, substantially better than those of a pure lexicon-based approach. Although the results did not reach the level of the deep learning approach, a great advantage is that sentiment prediction can be accompanied by understandable explanations. For some users, it is very important to see how the sentiment is derived, even if performance is a little lower.
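A toy sketch of lexicon-plus-shifter scoring (the lexicon, patterns, and scoring scheme are invented for illustration, not the chapter's resources): inversion flips the sign of the next sentiment word, while intensifiers and downtoners scale it.

```python
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
INVERTERS = {"not", "never"}
SCALERS = {"very": 1.5, "slightly": 0.5}  # intensification / attenuation

def score(sentence):
    total, sign, scale = 0.0, 1.0, 1.0
    for token in sentence.lower().split():
        if token in INVERTERS:
            sign = -sign
        elif token in SCALERS:
            scale *= SCALERS[token]
        elif token in LEXICON:
            total += sign * scale * LEXICON[token]
            sign, scale = 1.0, 1.0  # shifters apply to the next hit only
    return total

print(score("not very good"))  # inversion + intensification -> -1.5
print(score("slightly bad"))   # attenuation -> -0.5
```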
Vector Symbolic Architecture (VSA) is a powerful computing model that is built on a rich algebra in which all representations—from atomic to composite structures—are high-dimensional holographic distributed vectors of the same, fixed dimensionality. VSA is mainly characterized by the following intriguing properties: (i) quasi-orthogonality of a randomly chosen vector to other random vectors with very high probability, aka concentration of measure; (ii) exponential growth of the number of such quasi-orthogonal vectors with the dimensionality, which provides a sufficiently large capacity to accommodate novel concepts over time; (iii) availability of these vectors to be composed, decomposed, probed, and transformed in various ways using a set of well-defined operations. Motivated by these properties, this chapter presents a summary of recently developed methodologies on the integration of VSA with deep neural networks that enabled impactful applications to few-shot [1] and continual [2, 3] learning. Resorting to VSA-based embedding allows deep neural networks to quickly learn from few training samples by storing them in an explicit memory, where many more class categories can be continually expressed in the abstract vector space of VSA with fixed dimensions, without causing interference among the learned classes. Experiments on various image datasets show that the considered neuro-symbolic AI approach outperforms pure deep neural network baselines with remarkable accuracy, scalability, and compute/memory efficiency.
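The core VSA operations are easy to sketch with bipolar vectors: binding by elementwise multiplication, bundling by the sign of the sum, and similarity by a normalized dot product; random high-dimensional vectors are quasi-orthogonal, which is what makes unbinding work. A minimal NumPy illustration (ours, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000

def random_hv():
    return rng.choice([-1.0, 1.0], size=d)

def bind(a, b):
    # Associates two vectors; self-inverse here since b * b = 1 elementwise.
    return a * b

def bundle(*vs):
    # Superposes vectors; the result stays similar to each input.
    return np.sign(np.sum(vs, axis=0))

def sim(a, b):
    return float(a @ b) / d

color, shape, red, circle = (random_hv() for _ in range(4))
record = bundle(bind(color, red), bind(shape, circle))

print(sim(bind(record, color), red))     # ~0.5: unbinding recovers "red"
print(sim(bind(record, color), circle))  # ~0.0: quasi-orthogonal noise
```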