Ebook: HHAI2022: Augmenting Human Intellect
Hybrid human-artificial intelligence is a new research area concerned with all aspects of AI systems that assist humans, and vice versa. The emphasis is on the need for adaptive, collaborative, responsible, interactive and human-centered artificial intelligence systems that can leverage human strengths and compensate for human weaknesses while taking into account social, ethical and legal considerations. The challenge is to develop robust, trustworthy AI systems that can ‘understand’ humans, adapt to complex real-world environments and interact appropriately in a variety of social settings.
This book presents the proceedings of the 1st International Conference on Hybrid Human-Artificial Intelligence (HHAI2022), held in Amsterdam, The Netherlands, from 13–17 June 2022. HHAI2022 was the first international conference focusing on the study of AI systems that amplify rather than replace human intelligence by cooperating synergistically, proactively, responsibly and purposefully with humans. Scholars from the fields of AI, human-computer interaction, cognitive and social sciences, computer science, philosophy, and others were invited to submit their best original work on hybrid human-artificial intelligence. The book contains 24 main-track papers, 17 poster and demo papers, and 1 Hackathon paper, selected from a total of 96 submissions, and topics covered include human-AI interaction and collaboration, co-learning and co-creation; learning, reasoning and planning with humans and machines in the loop; integration of learning and reasoning; law and policy challenges around human-centered AI systems; and societal awareness of AI.
The book provides an up-to-date overview of this novel and timely field of study, and will be of interest to all those working with aspects of artificial intelligence, in whatever field.
1. Introduction
This volume contains the proceedings of the 1st International Conference on Hybrid Human-Artificial Intelligence (HHAI2022), held during June 13–17, 2022, in Amsterdam, The Netherlands. HHAI2022 was the first international conference focusing on the study of Artificial Intelligence systems that cooperate synergistically, proactively, responsibly and purposefully with humans, amplifying instead of replacing human intelligence.
Scholars from diverse fields (from AI to human-computer interaction, cognitive and social sciences, computer science, philosophy, and others) were invited to submit their best work on Hybrid Human-Artificial Intelligence, whether original, new, in progress, visionary, or previously published. This editorial presents the highlights from this fruitful conference.
Hybrid Human-Artificial Intelligence is a new research area that is concerned with all aspects of AI systems that assist humans and vice versa, emphasizing the need for adaptive, collaborative, responsible, interactive and human-centered artificial intelligence systems that leverage human strengths and compensate for human weaknesses, while taking into account social, ethical and legal considerations. This novel and timely field of study is driven by current developments in AI, but also requires fundamentally new approaches and solutions.
The first edition of what is intended to become a series of HHAI conferences welcomed research on different challenges in Hybrid Human-Artificial Intelligence. The following list of topics is illustrative, not exhaustive:
Human-AI interaction and collaboration
Adaptive human-AI co-learning and co-creation
Learning, reasoning and planning with humans and machines in the loop
User modeling and personalisation
Integration of learning and reasoning
Transparent, explainable and accountable AI
Fair, ethical, responsible and trustworthy AI
Technical and critical perspectives on human-AI interaction
Meaningful human control over AI systems
Values and politics in the design and use of human-AI interaction
Law and policy challenges around human-centered AI systems
Societal awareness of AI
Multimodal machine perception of real world settings
Social signal processing
2. Contributions
To stimulate the exchange of novel ideas and interdisciplinary perspectives, three types of papers were accepted: i) full papers, which presented original and impactful work, ii) working papers, which presented work in progress or new and visionary ideas, and iii) extended abstracts, which presented existing, pre-published work of relevance to HHAI. In evaluating the novelty and impact of these contributions, the support of our PC members was key; they brought a broad and diverse range of experiences, perspectives and backgrounds (ranging from theoretical AI to law, policy, philosophy, and others). With the exception of 4 abstracts of previously published work, every paper was reviewed by at least 3 reviewers.
Overall, we received a total of 96 submissions. The word cloud in Figure 1 gives an overview of the most prominent topics of the accepted contributions. Analysing the emerging topics in the submissions, we saw that the technical side of human-AI interaction and collaboration, together with learning, reasoning and planning with humans in the loop, was as prominent at this first edition of HHAI as the responsible AI line of work (fairness, ethics, transparency, etc.), with many papers at the intersection of these domains. The conference also featured a significant amount of work on adaptive human-AI co-learning and co-creation, user modelling and personalisation, technical and critical perspectives on human-AI interaction, and meaningful human control over AI systems.
The conference program included presentations of 16 full research papers, 12 working papers and 5 extended abstracts. This volume includes the full papers and the abstracts of a subset of the working papers.
For the main research program, two awards were presented. The paper Challenges of the adoption of AI in High Risk High consequence time compressed decision-making environments by Bart van Leeuwen, Richard Gasaway and Gerke Spaling was chosen as best working paper, while the best research paper award went to HyEnA: A Hybrid Method for Extracting Arguments from Opinions by Michiel van der Meer, Enrico Liscio, Catholijn M. Jonker, Aske Plaat, Piek Vossen and Pradeep K. Murukannaiah.
HHAI2022 featured 4 invited keynotes, which together illustrate how diverse and multidisciplinary this field is:
- With whom do we hybridise? Principle Agents of AI, by Joanna Bryson (Professor of Ethics and Technology at The Hertie School of Governance). Abstract: Artificial Intelligence is a set of techniques facilitating our capacity to navigate the information spaces afforded by substantial improvements in digital technologies and infrastructures. But whose is this “our” – who is gaining in capacities and at what costs? In this talk I will review a sampling of AI impacts in the individual, national, and global spheres. I will present my recent research in transparency for and through AI, governance of those that produce AI, and the transnational dynamics that may be obscuring and even compromising our agency. I use this evidence to suggest that ultimately ethics and responsibility are only sensible framings for relationships between peers, and artefacts are never true peers with organisms.
- Creating Human-Computer Partnerships, by Wendy Mackay (Professor of Human-Computer Interaction at Inria Paris-Saclay). Abstract: Despite incredible advances in hardware, much of today’s software remains stuck in assumptions that date back to the 1970s. As software becomes ever more ‘intelligent’, users often find themselves in a losing battle, unable to explain what they really want. Their role can easily shift from generating new content to correcting or putting up with the system’s errors. This is partly due to the assumptions from AI that treat human users primarily as a source of data for their algorithm – the so-called “human-in-the-loop” – while traditional Human-Computer Interaction practitioners focus on creating the “user experience” with simple icon and menu interfaces, without considering the details of the user’s interaction with an intelligent system. I argue that we need to develop methods for creating human-computer partnerships that take advantage of advances in machine learning, but also leave the user in control. I illustrate how we use generative theory, especially instrumental interaction and reciprocal co-adaptation, to create interactive intelligent systems that are discoverable, appropriable and expressive. Our goal is to design robust interactive systems that augment rather than replace human capabilities, and are actually worth learning over time.
- Designing AI systems with a variety of users in mind, by Fernanda Viegas (Professor of Computer Science at Harvard, Principal Scientist at Google). Abstract: How should people relate to artificial intelligence technology? Is it a tool to be used, a partner to be consulted, or perhaps a source of inspiration and awe? As technology advances, choosing useful human/AI relationship framings will become an increasingly important question for designers, technologists and users. I’ll discuss a series of research projects – ranging from data visualizations and tools for medical practitioners to guidelines for designers – that illustrate how AI can play each of these roles. By providing users with a diversity of engagement possibilities, I hope to develop more responsible and effective ways to construct, use and evaluate this technology.
- The HumanE AI Net vision of Human Centric AI, by Paul Lukowicz (Professor of Computer Science at the German Research Center for Artificial Intelligence). The EU-funded HumanE-AI-Net project brings together leading European research centres, universities and industrial enterprises into a network of centres of excellence. Leading global artificial intelligence (AI) laboratories will collaborate with key players in areas such as human-computer interaction, cognitive, social and complexity sciences. The project aims to draw researchers out of their narrowly focused fields and connect them with people exploring AI on a much wider scale. The challenge is to develop robust, trustworthy AI systems that can ‘understand’ humans, adapt to complex real-world environments and interact appropriately in complex social settings. HumanE-AI-Net will lay the foundations for designing the principles of a new science that will make AI grounded in European values and closer to Europeans. Paul’s talk was supported by EurAI.
HHAI2022 also featured 8 workshops, for which we specifically encouraged contributions that were likely to stimulate critical or controversial discussions about any of the areas of the HHAI conference series. These workshops were:
- 1st Workshop on the representation, sharing and evaluation of multimodal agent interaction (mmai2022)
- Heterodox Methods for Interpretable and Efficient Artificial Intelligence
- Imagining the AI Landscape after the AI Act (IAIL 2022)
- Common Ground Theory and Method Development Workshop: Exploring, Understanding, and Enhancing Human-Centricity in Hybrid Work Settings
- Knowledge Representation for Hybrid-Intelligence (KR4HI)
- HI ESDiT Collaboration on AI, Human Values and the Law
- Human-Centered Design of Symbiotic Hybrid Intelligence
- The (Eco)systemic challenges in AI
At the first HHAI Hackathon, H3AI, the winning team approached the problem of fake news by providing fact checkers with a new modus operandi that leverages the capabilities of AI. The team’s envisioned AI system would identify atomic questions from a potentially fake news article, put them up for review by the fact checker and crowdsource the answers, thus reducing the fact checker’s workload while still leveraging human intellect.
In collaboration with Amsterdam Data Science, we also organised an Industry Meetup with around 30 participants, in which researchers and practitioners met to discuss HHAI-related problems.
Finally, a Posters and Demos Track complemented the conference and offered an opportunity to present late-breaking results and showcase innovative implementations. Each submission was reviewed on the basis of its self-standing contribution to Hybrid Human-AI by a minimum of 2 reviewers from a program committee spanning diverse disciplines.
Out of 28 submissions, we accepted 10 posters and 7 demos, which were presented in an interactive and informal context with snacks and drinks. We also provided space for 10 posters accompanying full-paper presentations from the main track of the conference.
All submissions for the Poster and Demo track competed for Best Poster and Best Demo awards based on participants’ votes. An honorable mention for an accompanying poster went to Mehul Verma and Erman Acar for their poster Learning to Cooperate with Human Evaluative Feedback and Demonstrations. The Best Poster Award went to Enrico Liscio, Catholijn M. Jonker and Pradeep K. Murukannaiah for Identifying Context-Specific Values via Hybrid Intelligence, while the Best Demo was presented by Dou Liu, Claudia Alessandra Libbi and Delaram Javdani Rikhtehgar on What would you like to visit next? Using a Knowledge-Graph Driven Museum Guide in a Virtual Exhibition.
Acknowledgements
This first edition of Hybrid Human-Artificial Intelligence was organized by the Hybrid Intelligence Centre (https://www.hybrid-intelligence-centre.nl/) and the Humane-AI European Network (https://www.humane-ai.eu/), which also contributed financially to the conference.
HHAI2022 would not have been possible without the generous support of a number of additional sponsors. There were three Platinum sponsors, among them SIKS (http://www.siks.nl/); further sponsors and supporting organisations included the Artificial Intelligence Journal (https://aij.ijcai.org/), TNO (http://www.tno.nl), Huawei (https://www.huawei.com/nl/), the Amsterdam Convention Bureau (https://www.iamsterdam.com/en/business/meetings/amsterdam-convention-bureau), Sony CSL (https://csl.sony.fr/), IOS Press (https://www.iospress.com/), EurAI (https://www.eurai.org/), and the Network Institute (https://networkinstitute.org/).
Thanks to everybody, including attendees at the conference and our PC members, for making HHAI 2022 a successful event.
Stefan Schlobach (General Chair)
Maria Perez-Ortiz (Program co-Chair)
Myrthe Tielman (Program co-Chair)
When mathematical modelling is applied to capture a complex system, multiple models are often created that characterise different aspects of that system. Often, a model at one level will produce a prediction which is contradictory at another level but both models are accepted because they are both useful. Rather than aiming to build a single unified model of a complex system, the modeller acknowledges the infinity of ways of capturing the system of interest, while offering their own specific insight. We refer to this pragmatic applied approach to complex systems — one which acknowledges that they are incompressible, dynamic, nonlinear, historical, contextual, and value-laden — as Open Machine Learning (Open ML). In this paper we define Open ML and contrast it with some of the grand narratives of ML of two forms: 1) Closed ML, ML which emphasizes learning with minimal human input (e.g. Google’s Alpha Zero) and 2) Partially Open ML, ML which is used to parameterize existing models. To achieve this, we use theories of critical complexity to both evaluate these grand narratives and contrast them with the Open ML approach. Specifically, we deconstruct grand ML ‘theories’ by identifying thirteen ‘games’ played in the ML community. These games lend false legitimacy to models, contribute to over-promise and hype about the capabilities of artificial intelligence, reduce wider participation in the subject, lead to models that exacerbate inequality and cause discrimination and ultimately stifle creativity in research. We argue that best practice in ML should be more consistent with critical complexity perspectives than with rationalist, grand narratives.
The key arguments underlying a large and noisy set of opinions help understand the opinions quickly and accurately. Fully automated methods can extract arguments but (1) require large labeled datasets and (2) work well for known viewpoints, but not for novel points of view. We propose HyEnA, a hybrid (human + AI) method for extracting arguments from opinionated texts, combining the speed of automated processing with the understanding and reasoning capabilities of humans. We evaluate HyEnA on three feedback corpora. We find that, on the one hand, HyEnA achieves higher coverage and precision than a state-of-the-art automated method, when compared on a common set of diverse opinions, justifying the need for human insight. On the other hand, HyEnA requires less human effort and does not compromise quality compared to (fully manual) expert analysis, demonstrating the benefit of combining human and machine intelligence.
In this study, a formal framework aiming to drive the interaction between a human operator and a team of unmanned aerial vehicles (UAVs) is experimentally tested. The goal is to enhance human performance by controlling the interaction between agents based on an online monitoring of the operator’s mental workload (MW) and performance. The proposed solution uses MW estimation via a classifier applied to cardiac features. The classifier output is introduced as a human MW state observation in a Partially Observable Markov Decision Process (POMDP) which models the human-system interaction dynamics, and aims to control the interaction to optimize the human agent’s performance. Based on the current belief state about the operator’s MW and performance, along with the mission phase, the POMDP policy solution controls which task should (or should not) be suggested to the operator, assuming the UAVs are capable of supporting the human agent. The framework was evaluated using an experiment in which 13 participants performed 2 search and rescue missions (with/without adaptation) with varying workload levels. In accordance with the literature, when the adaptive approach was used, the participants reported significantly lower MW, physical and temporal demands, frustration, and effort, and their flying score was also significantly improved. These findings demonstrate how such a POMDP-based adaptive interaction control can improve performance while reducing operator workload.
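As a toy illustration (not the paper’s implementation) of the belief update such a POMDP controller performs, the sketch below assumes two hidden workload states and a binary, noisy classifier output as the observation; all transition and observation probabilities are placeholder values.

    import numpy as np

    # Hypothetical two-state workload model: 0 = low MW, 1 = high MW.
    # T[a] is the state-transition matrix (rows: current state, columns: next state)
    # for action a; O[s, o] is the probability the cardiac-feature classifier
    # reports observation o when the true workload state is s.
    T = {"suggest_task": np.array([[0.7, 0.3], [0.2, 0.8]]),
         "hold_back":    np.array([[0.9, 0.1], [0.6, 0.4]])}
    O = np.array([[0.8, 0.2],
                  [0.3, 0.7]])

    def belief_update(belief, action, observation):
        """Standard POMDP belief update: predict with T, correct with O, normalise."""
        predicted = belief @ T[action]
        corrected = predicted * O[:, observation]
        return corrected / corrected.sum()

    belief = np.array([0.5, 0.5])  # uniform prior over the operator's workload
    belief = belief_update(belief, "suggest_task", observation=1)
    print(belief)  # posterior mass shifts toward the high-workload state

The policy computed over such beliefs is what decides, at each mission phase, whether a task is suggested to the operator.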
Cooperation is a widespread phenomenon in nature that has also been a cornerstone in the development of human intelligence. Understanding cooperation, therefore, on matters such as how it emerges, develops, or fails is an important avenue of research, not only in a human context, but also for the advancement of next generation artificial intelligence paradigms which are presumably human-compatible. With this motivation in mind, we study the emergence of cooperative behaviour between two independent deep reinforcement learning (RL) agents provided with human input in a novel game environment. In particular, we investigate whether evaluative human feedback (through interactive RL) and expert demonstration (through inverse RL) can help RL agents to learn to cooperate better. We report two main findings. Firstly, we find that the amount of feedback given has a positive impact on the accumulated reward obtained through cooperation. That is, agents trained with a limited amount of feedback outperform agents trained without any feedback, and the performance increases even further as more feedback is provided. Secondly, we find that expert demonstration also helps agents’ performance, although with more modest improvements compared to evaluative feedback. In conclusion, we present a novel game environment to better understand the emergence of cooperative behaviour and show that providing human feedback and demonstrations can accelerate this process.
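For readers unfamiliar with interactive RL, the following minimal sketch (not the authors’ code) shows one common way evaluative human feedback can be folded into a tabular Q-learning update as an additional reward term; the environment, feedback values and weighting are hypothetical.

    from collections import defaultdict

    ALPHA, GAMMA, FEEDBACK_WEIGHT = 0.1, 0.95, 0.5
    Q = defaultdict(float)  # Q[(state, action)]

    def q_update(state, action, env_reward, human_feedback, next_state, actions):
        """Tabular Q-learning step with human evaluative feedback shaping the reward."""
        shaped = env_reward + FEEDBACK_WEIGHT * human_feedback  # feedback in {-1, 0, +1}
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += ALPHA * (shaped + GAMMA * best_next - Q[(state, action)])

    # Example: a human trainer rewards a cooperative move.
    actions = ["cooperate", "defect"]
    q_update(state=0, action="cooperate", env_reward=1.0,
             human_feedback=+1, next_state=1, actions=actions)
    print(Q[(0, "cooperate")])

The paper uses deep RL agents; the tabular form above only illustrates where the feedback term enters the update.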
There has been significant debate in the NLP community about whether or not attention weights can be used as an explanation – a mechanism for interpreting how important each input token is for a particular prediction. The validity of “attention as explanation” has so far been evaluated by computing the rank correlation between attention-based explanations and existing feature attribution explanations using LSTM-based models. In our work, we (i) compare the rank correlation between five more recent feature attribution methods and two attention-based methods, on two types of NLP tasks, and (ii) extend this analysis to also include transformer-based models. We find that attention-based explanations do not correlate strongly with any recent feature attribution methods, regardless of the model or task. Furthermore, we find that none of the tested explanations correlate strongly with one another for the transformer-based model, leading us to question the underlying assumption that we should measure the validity of attention-based explanations based on how well they correlate with existing feature attribution explanation methods. After conducting experiments on five datasets using two different models, we argue that the community should stop using rank correlation as an evaluation metric for attention-based explanations. We suggest that researchers and practitioners should instead test various explanation methods and employ a human-in-the-loop process to determine if the explanations align with human intuition for the particular use case at hand.
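For concreteness, the rank-correlation comparison the authors criticise can be computed directly with a Spearman correlation between two per-token importance vectors; the scores below are made up.

    from scipy.stats import spearmanr

    # Hypothetical per-token importance scores for one input sentence, produced by
    # an attention-based explanation and by a feature-attribution method.
    attention_scores   = [0.02, 0.40, 0.05, 0.30, 0.23]
    attribution_scores = [0.10, 0.05, 0.50, 0.20, 0.15]

    rho, p_value = spearmanr(attention_scores, attribution_scores)
    print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")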
Power grids are becoming more complex to operate in the digital age given the current energy transition to cope with climate change. As a result, real-time decision-making is getting more challenging as the human operator has to deal with more information, more uncertainty, more applications, and more coordination. While supervision has been primarily used to help them make decisions over the last decades, it cannot reasonably scale up anymore. There is a great need for rethinking the human-machine interface under more unified and interactive frameworks. Taking advantage of the latest developments in Human-Machine Interface and Artificial Intelligence, we present our vision of a new assistant framework relying on a hypervision interface and greater bidirectional interaction. We review the known principles of decision-making driving our assistant design alongside its supporting assistance functions. We finally share some guidelines to make progress towards the development of such an assistant.
From diagnosis to patient scheduling, AI is increasingly being considered across different clinical applications. Despite increasingly powerful clinical AI, uptake into actual clinical workflows remains limited. One of the major challenges is developing appropriate trust with clinicians. In this paper, we investigate trust in clinical AI in a wider perspective beyond user interactions with the AI. We offer several points in the clinical AI development, usage, and monitoring process that can have a significant impact on trust. We argue that the calibration of trust in AI should go beyond explainable AI and focus on the entire process of clinical AI deployment. We illustrate our argument with case studies from practitioners implementing clinical AI in practice to show how trust can be affected by different stages in the deployment cycle.
We propose methods for an AI agent to estimate the value preferences of individuals in a hybrid participatory system, considering a setting where participants make choices and provide textual motivations for those choices. We focus on situations where there is a conflict between participants’ choices and motivations, and operationalize the philosophical stance that “valuing is deliberatively consequential.” That is, if a user’s choice is based on a deliberation of value preferences, the value preferences can be observed in the motivation the user provides for the choice. Thus, we prioritize the value preferences estimated from motivations over the value preferences estimated from choices alone. We evaluate the proposed methods on a dataset of a large-scale survey on energy transition. The results show that explicitly addressing inconsistencies between choices and motivations improves the estimation of an individual’s value preferences. The proposed methods can be integrated in a hybrid participatory system, where artificial agents ought to estimate humans’ value preferences to pursue value alignment.
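A toy sketch of the prioritisation described above, assuming both estimators return a score per value and that the motivation-based estimator returns None when a motivation does not mention a value; names and numbers are illustrative, not taken from the paper.

    def merge_value_preferences(from_choices, from_motivations):
        """Prefer the motivation-based estimate whenever one is available,
        falling back to the choice-based estimate otherwise."""
        merged = dict(from_choices)
        for value, score in from_motivations.items():
            if score is not None:  # the motivation explicitly reflects this value
                merged[value] = score
        return merged

    choices     = {"sustainability": 0.4, "affordability": 0.8, "autonomy": 0.5}
    motivations = {"sustainability": 0.9, "affordability": None, "autonomy": None}
    print(merge_value_preferences(choices, motivations))
    # {'sustainability': 0.9, 'affordability': 0.8, 'autonomy': 0.5}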
The development and the spread of increasingly autonomous digital technologies in our society pose new ethical challenges beyond data protection and privacy violation. Users are unprotected in their interactions with digital technologies and at the same time autonomous systems are free to occupy the space of decisions that is prerogative of each human being. In this context the multidisciplinary project Exosoul aims at developing a personalized software exoskeleton which mediates actions in the digital world according to the moral preferences of the user. The exoskeleton relies on the ethical profiling of a user, similar in purpose to the privacy profiling proposed in the literature, but aiming at reflecting and predicting general moral preferences. Our approach is hybrid, first based on the identification of profiles in a top-down manner, and then on the refinement of profiles by a personalized data-driven approach. In this work we report our initial experiment on building such top-down profiles. We consider the correlations between ethics positions (idealism and relativism), personality traits (honesty/humility, conscientiousness, Machiavellianism and narcissism), and worldview (normativism), and then we use a clustering approach to create ethical profiles predictive of users’ digital behaviors concerning privacy violation, copyright infringements, caution and protection. Data were collected by administering a questionnaire to 317 young individuals. In the paper we discuss two clustering solutions (k = 2 and k = 4) in terms of validity and predictive power of digital behavior.
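A minimal sketch of the clustering step, assuming the questionnaire yields one numeric vector per respondent over the seven scales mentioned above; the synthetic data below merely stands in for the real responses.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    # Placeholder data: 317 respondents x 7 scales (idealism, relativism,
    # honesty/humility, conscientiousness, Machiavellianism, narcissism, normativism).
    X = StandardScaler().fit_transform(rng.normal(size=(317, 7)))

    for k in (2, 4):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        print(f"k={k}: silhouette = {silhouette_score(X, labels):.2f}")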
We propose SUPPLE (Sequence-Update Pattern-Based Processing with Logical Expansions), a new dialog management framework that takes the core concept of a dialog sequence as its main starting point. SUPPLE naturally enables the integration of the flexible and re-usable conversation patterns from the Natural Conversation Framework (NCF). Whereas NCF primarily provides a design framework, we developed a dialog engine and authoring framework that builds on the notion of a pattern for specifying sequence structure. In our approach we combine patterns with the key concepts of update strategies and agenda adopted from the Information State Update (ISU) approach. The main contributions of our work are the introduction of concepts and mechanisms for automatically managing dialog sequences. The framework is implemented as a cognitive agent, and we show through a cooking assistant case study how the agent keeps track of a recipe instruction agenda while allowing for user- as well as agent-initiated sequence expansions. This enables conversational agents to co-regulate the conversation and thus allows for more flexibility.
Personal values represent what people find important in their lives, and are key drivers of human behavior. For this reason, support agents should provide help that is aligned with the personal values of the users. To do this, the support agent not only should know the value preferences of the user, but also how different situations in the user’s life affect these personal values. We represent situations using their psychological characteristics, and we build predictive models that given the psychological characteristics of a situation, predict whether the situation promotes, demotes or does not affect a personal value. In this work, we focus on predictions for the value ‘enjoyment of life’, and use different machine learning classifiers, all of them performing better than chance when training on data from multiple people. The best predictive model is a multi-layer perceptron classifier, which achieves an accuracy of 72%. Further, we hypothesize that the accuracy of such models would drop when tested on individual data sets. The data supports our hypothesis, and the accuracy of the best performing model drops by at least 11% when tested on individual data. To tackle this, we propose an active learning procedure to build personalized prediction models having the user in the loop. Results show that this approach outperforms the previously built model while using only 30% of the training data. Our findings suggest that how situations affect personal values can have subjective interpretations, but we can account for those subjective interpretations by involving the user when building a prediction model.
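A simplified sketch (not the authors’ code) of the two stages described: an MLP trained on pooled data, followed by an uncertainty-based active-learning loop in which the user would be asked to label the queried situations; the synthetic features and labels are placeholders for the psychological characteristics and the promote/demote/no-effect annotations.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(1)
    X_pool = rng.normal(size=(200, 8))      # situation characteristics (placeholder)
    y_pool = rng.integers(0, 3, size=200)   # 0 = demotes, 1 = no effect, 2 = promotes

    labeled = list(range(10))               # small labeled seed set
    for _ in range(5):                      # active-learning rounds
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
        clf.fit(X_pool[labeled], y_pool[labeled])
        probs = clf.predict_proba(X_pool)
        uncertainty = 1.0 - probs.max(axis=1)   # least-confident sampling
        uncertainty[labeled] = -1.0             # never re-query labeled items
        query = int(uncertainty.argmax())
        labeled.append(query)                   # in practice: ask the user for this label
    print(f"labeled examples after querying: {len(labeled)}")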
With accelerated progress in autonomous agent capabilities, mixed human and agent teams will become increasingly commonplace in both our personal and professional spheres. Hence, further examination of factors affecting collaboration efficacy in these types of teams is needed to inform the design and use of effective human-agent teams. Ad hoc human-agent teams, where team members interact without prior experience with teammates and only for a limited number of interactions, will be commonplace in dynamic environments with short opportunity windows for collaboration between diverse groups. We study ad-hoc team scenarios pairing a human with an agent where both need to assess and adapt to the capabilities of the partner to maximize team performance. In this work, we investigate the relative efficacy of two human-agent collaboration protocols that differ in the team member responsible for allocating tasks to the team. We designed, implemented, and experimented with an environment in which human-agent teams repeatedly collaborate to complete heterogeneous task sets.
The notion of trust plays a central role in understanding the interaction between humans and AI. Research from social and cognitive psychology in particular has shown, however, that individuals’ perceptions of trust can be biased. In this empirical investigation, we focus on the single and combined effects of attitudes towards AI and motivated reasoning in shaping such biased trust perceptions in the context of news consumption. In doing so, we rely on insights from works on the machine heuristic and motivated reasoning. In a 2 (author) x 2 (congruency) between-subjects online experiment, we asked N = 477 participants to read a news article purportedly written either by AI or a human author. We manipulated whether the article represented pro or contra arguments of a polarizing topic, to elicit motivated reasoning. We also assessed participants’ attitudes towards AI in terms of competence and objectivity. Through multiple linear regressions, we found that (a) increased perceptions of AI as objective and ideologically unbiased increased trust perceptions, whereas (b), in cases where participants were swayed by their prior opinion to trust content more when they agreed with the content, the AI author reduced such biased perceptions. Our results indicate that it is crucial to account for attitudes towards AI and motivated reasoning to accurately represent trust perceptions.
Theory of mind refers to the human ability to reason about mental content of other people such as beliefs, desires, and goals. In everyday life, people rely on their theory of mind to understand, explain, and predict the behaviour of others. Having a theory of mind is especially useful when people collaborate, since individuals can then reason on what the other individual knows as well as what reasoning they might do. Realization of hybrid intelligence, where an agent collaborates with a human, will require the agent to be able to do similar reasoning through computational theory of mind. Accordingly, this paper provides a mechanism for computational theory of mind based on abstractions of single beliefs into higher-level concepts. These concepts can correspond to social norms, roles, as well as values. Their use in decision making serves as a heuristic to choose among interactions, thus facilitating collaboration on decisions. Using examples from the medical domain, we demonstrate how having such a theory of mind enables an agent to interact with humans efficiently and can increase the quality of the decisions humans make.
Widespread application of uninterpretable machine learning systems for sensitive purposes has spurred research into elucidating the decision making process of these systems. These efforts have their background in many different disciplines, one of which is the field of AI & law. In particular, recent works have observed that machine learning training data can be interpreted as legal cases. Under this interpretation the formalism developed to study case law, called the theory of precedential constraint, can be used to analyze the way in which machine learning systems draw on training data – or should draw on them – to make decisions. These works predominantly stay on the theoretical level, hence in the present work the formalism is evaluated on a real world dataset. Through this analysis we identify a significant new concept which we call landmark cases, and use it to characterize the types of datasets that are more or less suitable to be described by the theory.
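As background, the core a fortiori test from the theory of precedential constraint (the paper’s starting point, not its landmark-case contribution) can be stated in a few lines; the factor sets below are hypothetical.

    def forced_outcome(precedent_pro, precedent_con, new_pro, new_con):
        """A precedent decided for one side forces the same outcome on a new case
        if the new case contains all of the precedent's factors for that side and
        no factors for the other side beyond those in the precedent."""
        return precedent_pro <= new_pro and new_con <= precedent_con

    # Binary training examples reinterpreted as factor sets.
    precedent_pro = {"f1", "f3"}   # factors favouring the decided label
    precedent_con = {"f2"}         # factors favouring the opposite label
    print(forced_outcome(precedent_pro, precedent_con,
                         new_pro={"f1", "f3", "f5"}, new_con=set()))  # True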
Humans use AI-assistance in a wide variety of high- and low-stakes decision-making tasks today. However, human reliance on the AI’s assistance is often sub-optimal — with people exhibiting under- or over-reliance on the AI. We present an empirical investigation of human-AI assisted decision-making in a noisy image classification task. We analyze the participants’ reliance on AI assistance and the accuracy of human-AI assistance as compared to the human or AI working independently. We demonstrate that participants do not show automation bias which is a widely reported behavior displayed by humans when assisted by AI. In this specific instance of AI-assisted decision-making, people are able to correctly override the AI’s decision when needed and achieve close to the theoretical upper bound on combined performance. We suggest that the reason for this discrepancy from previous research findings is because 1) people are experts at classifying everyday images and have a good understanding of their ability in performing the task, 2) people engage in the metacognitive act of deliberation when asked to indicate confidence in their decision, and 3) people were able to build a good mental model of the AI by incorporating feedback that was provided after each trial. These findings should inform future experiment design.
In this paper, we consider some key characteristics that AI should exhibit to enable hybrid agencies that include subject-matter experts and their AI-enabled decision aids. We will hint at the design requirements of guaranteeing that AI tools are: open, multiple, continuous, cautious, vague, analogical and, most importantly, adjunct with respect to decision-making practices. We will argue that especially adjunction is an important condition to design for. Adjunction entails the design and evaluation of human-AI interaction protocols aimed at improving AI usability, while also guaranteeing user satisfaction and human and social sustainability. It does so by boosting people’s cognitive motivation for interacting analytically with the outputs, reducing overreliance on AI and improving performance.