Ebook: HHAI 2024: Hybrid Human AI Systems for the Social Good
The field of hybrid human-artificial intelligence (HHAI), although primarily driven by developments in AI, also requires fundamentally new approaches and solutions. Multidisciplinary in nature, it calls for collaboration across various research domains, such as AI, HCI, the cognitive and social sciences, philosophy and ethics, and complex systems, to name but a few.
This book presents the proceedings of HHAI 2024, the 3rd International Conference on Hybrid Human-Artificial Intelligence, held from 10 to 14 June 2024 in Malmö, Sweden. The focus of HHAI 2024 was on artificially intelligent systems that cooperate synergistically, proactively and purposefully with humans, amplifying rather than replacing human intelligence. A total of 62 submissions were received for the main track of the conference, of which 31 were accepted for presentation after a thorough double-blind review process. These comprised 9 full papers, 5 blue sky papers, and 17 working papers, making the final acceptance rate for full papers 29%; the acceptance rate across all tracks of the main program was 50%. This book contains all submissions accepted for the main track, as well as the proposals for the Doctoral Consortium and extended abstracts from the Posters and Demos track. Topics covered include human-AI interaction and collaboration; learning, reasoning and planning with humans and machines in the loop; fair, ethical, responsible, and trustworthy AI; societal awareness of AI; and the role of design and compositionality of AI systems in interpretable/collaborative AI, among others.
Providing a current overview of research and development, the book will be of interest to all those working in the field and will facilitate the ongoing exchange and development of ideas across a range of disciplines.
This volume presents the proceedings of the 3rd International Conference on Hybrid Human-Artificial Intelligence (HHAI 2024), held in Malmö, Sweden, from 10–14 June 2024. The focus of HHAI 2024 was on artificially intelligent systems that cooperate synergistically, proactively, and purposefully with humans, amplifying rather than replacing human intelligence.
The HHAI field is driven by developments in AI, but it also requires fundamentally new approaches and solutions. For this reason, we encourage collaboration across research domains such as AI, HCI, the cognitive and social sciences, philosophy and ethics, complex systems, and others. For this third international conference, we invited scholars from these fields to submit their best, original work – new as well as in progress – and visionary ideas on hybrid human-artificial intelligence. The following list of topics is illustrative, not exhaustive:
∙ Human-AI interaction and collaboration
∙ Adaptive human-AI co-learning and co-creation
∙ Learning, reasoning and planning with humans and machines in the loop
∙ User modelling and personalisation
∙ Integration of learning and reasoning
∙ Transparent, explainable, and accountable AI
∙ Fair, ethical, responsible, and trustworthy AI
∙ Societal awareness of AI
∙ Multimodal machine perception of real-world settings
∙ Social signal processing
∙ Representation learning for communicative or collaborative AI
∙ Symbolic and narrative-based representations for human-centric AI
∙ The role of design and compositionality of AI systems in interpretable/collaborative AI
Contributions about all types of technology, from robots and conversational agents to multi-agent systems and machine learning models, were welcome.
Acknowledgments
This edition of Hybrid Human-Artificial Intelligence was organised by the Hybrid Intelligence Centre (https://www.hybrid-intelligence-centre.nl) and the Humane-AI European Network, which also contributed financially to the conference. It was supported by the Wallenberg AI, Autonomous Systems and Software Program – Humanity and Society (WASP-HS) (https://wasp-hs.org) and the AI Policy Lab (https://aipolicylab.se), and hosted by Malmö University with support from Umeå University. Esra Karabiber and the Malmö University Conference Service also provided invaluable support in organising the conference.
We would like to take this opportunity to thank everybody who submitted their work for review and all those who presented their work at the conference. Special thanks also to the members of the programme committee, the organisers of the pre-conference workshops, tutorials, and creative events, and the sponsors of the conference for their contributions.
Fabian Lorig (General Chair)
Jason Tucker (General Chair)
Adam Dahlgren Lindström (General Chair)
Frank Dignum (General Chair)
Pradeep Murukannaiah (Program Chair)
Andreas Theodorou (Program Chair)
Pınar Yolum (Program Chair)
Word associations have been extensively used in psychology to study the rich structure of human conceptual knowledge. Recently, the study of word associations has been extended to investigating the knowledge encoded in LLMs. However, because of how the LLM word associations are accessed, existing approaches have been limited in the types of comparisons that can be made between humans and LLMs. To overcome this, we create LLM-generated word association norms modeled after the Small World of Words (SWOW) human-generated word association norms consisting of over 12,000 cue words. We prompt the language models with the same cues and participant profiles as those in the SWOW human-generated norms, and we conduct preliminary comparative analyses between humans and LLMs that explore differences in response variability, biases, concreteness effects, and network properties. Our exploration provides insights into how LLM-generated word associations can be used to investigate similarities and differences in how humans and LLMs process information.
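As an illustrative sketch (not the authors’ pipeline) of how such LLM-generated norms might be collected, the snippet below prompts a model with a cue word and a participant profile and records three responses per cue, mirroring the SWOW format; the generate() helper and the example cues and profile are placeholders for whichever model and norm data are actually used:

```python
def generate(prompt: str) -> str:
    """Placeholder for an actual LLM call (any chat/completion API can be plugged in here)."""
    raise NotImplementedError

cues = ["dog", "freedom", "music"]                      # SWOW covers over 12,000 cue words
profile = {"age": 34, "gender": "female", "language": "English"}

norms = {}
for cue in cues:
    prompt = (
        f"You are a {profile['age']}-year-old {profile['gender']} native "
        f"{profile['language']} speaker in a word association study. "
        f"Give the first three words that come to mind for the cue '{cue}', "
        "separated by commas."
    )
    # responses = generate(prompt).split(",")           # enable once generate() calls a real model
    responses = ["<r1>", "<r2>", "<r3>"]                # dummy output so the sketch runs as-is
    norms[cue] = [r.strip() for r in responses]

print(norms)
```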
Jobseekers typically not only seek job vacancies matching their skills but also a company aligning with their values. This relates to Industry 5.0, a European Commission initiative emphasizing a more fulfilling role for workers. This study explores the relationship between skills and values in job vacancy selection and suggests several ways of combining these aspects for decision making. The first baseline system only uses skills, the second assigns equal importance to both skills and values, and the third, a hybrid intelligence system, leverages Pareto optimality, leaving the ultimate decision on the trade-off between skills match and values match to the jobseeker. Additionally, a small-scale user study explores the impact of values on vacancy selection and evaluates the proposed matching systems. The results show that participants seek a balanced trade-off between skills and values. Accordingly, systems considering both skills and values outperform the baseline system. The system with equal weights and the Pareto optimality-based system have similar performances, possibly due to the large overlap in their output. Future work with more participants in a real-world application is needed to further validate our first exploration of the relationship between skills and values.
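To make the Pareto-based option concrete, here is a minimal, hypothetical sketch (not the paper’s system): each vacancy is scored on skill match and value match, and only non-dominated vacancies are kept, leaving the final trade-off to the jobseeker:

```python
def pareto_front(scores):
    """Indices of vacancies not dominated on both skill match and value match."""
    front = []
    for i, (s_i, v_i) in enumerate(scores):
        dominated = any(
            s_j >= s_i and v_j >= v_i and (s_j > s_i or v_j > v_i)
            for j, (s_j, v_j) in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# (skill match, value match) per vacancy, e.g. similarity scores in [0, 1]
scores = [(0.9, 0.2), (0.7, 0.7), (0.4, 0.9), (0.6, 0.6)]
print(pareto_front(scores))  # [0, 1, 2]; vacancy 3 is dominated by vacancy 1
```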
This research focuses on establishing trust in multiagent systems where human and AI agents collaborate. We propose a computational notion of actual trust, emphasising the modelling of an agent’s capacity to deliver tasks. Unlike reputation-based trust or performing a statistical analysis on past behaviour, our approach considers the specific setting in which agents interact. We integrate non-deterministic semantics for capturing inherent uncertainties within the behaviour of a multiagent system, but stress the importance of verifying an agent’s actual capabilities. We provide a conceptual analysis of actual trust’s characteristics and highlight relevant trust verification tools. By advancing the understanding and verification of trust in collaborative systems, this research contributes to responsible and trustworthy human-AI interactions, enhancing reliability in various domains.
As intelligent systems become more autonomous, humans increasingly delegate important tasks and decisions to them. On the one hand, this approach seems very supportive of humans; on the other, it generates apprehension about a future dominated by machines. These contrasting viewpoints encapsulate what in the literature is usually referred to as augmenting, enhancing or amplifying humans versus replacing them. However, these concepts lack clear and shared definitions. To fill this gap, we conducted a semi-systematic literature review to elicit existing definitions, if any. We found that replacement is generally viewed negatively while a hybrid approach is often preferred, as there is a hesitancy to embrace complete automation, primarily driven by a lack of trust in AI systems. To make these concepts applicable, it is essential to identify shared and actionable definitions. Building on these insights, our upcoming research aims to develop a framework that fosters their measurement.
Current advances in large language models (LLMs) and generative AI (GenAI) have produced both enthusiasm and concerns in the academic world, industry, and society in general. While optimistic views foresee unprecedented increases in efficiency and productivity, concerns have been expressed about the potential of these technologies to bring about significant changes in most areas of human activity, which may not always have predictable or positive outcomes. One of the challenges often evoked in this context, not yet fully addressed, is the impact of AI-powered agents on the educational sector, and especially on aspects such as students’ agency and control, creativity, and motivation in pedagogic activities that involve the use of this type of agent. The aim of the study is to address this question, starting from the hypothesis that preliminary simulations of AI-based pedagogic scenarios can help instructors to better understand the inner mechanisms of these technologies and their possible impact on the learning, assignment completion and evaluation processes. The paper presents a set of experiments with simulated student-agent interactions generated by AI chatbots and proposes a formal framework for assessing this form of “imitation game” and its possible applications in real teaching-learning environments.
Avoiding the privacy violations caused by privacy-invading technologies is difficult enough for an individual, yet the complexity escalates when online collaborations and social media jeopardize the privacy of multiple parties over co-owned content. While existing approaches offer solutions for possible conflicts among users’ privacy preferences, they either assume static rules for the preferences of users or require the users to declare separate decisions for each content item. In either case, the long-term satisfaction of all users remains uncertain. Reinforcement learning (RL) emerges at this point as a suitable candidate for balancing users’ utilities, that is, their satisfaction with decisions over time. The decentralized and dynamic nature of the problem suggests an RL setting that involves multiple agents interacting not only with the humans whom they model and represent but also with each other. Furthermore, as the knowledge of agents about the factors that lead to other users’ preferences will be limited, the setting has to handle partial observability. Although this introduces new challenges for the framework, it also brings a potential generalization of any solution to multi-party conflicts in different real-life contexts with minor adaptations. This study delves deeper into the features of the proposed framework and the ways to construct it.
Decision trees are widely adopted in Machine Learning tasks due to their operational simplicity and interpretability. However, following the decision path taken by a tree can be difficult in complex scenarios or for users who have no familiarity with them. Prior research showed that converting outcomes to natural language is an accessible way to facilitate understanding for non-expert users in several tasks. More recently, there has been a growing effort to use Large Language Models (LLMs) as a tool for providing natural language texts. In this paper, we examine the proficiency of LLMs in explaining decision tree predictions in simple terms through the generation of natural language explanations. By exploring different textual representations and prompt engineering strategies, we identify capabilities that strengthen LLMs as competent explainers as well as highlight potential challenges and limitations, opening further research possibilities on natural language explanations for decision trees.
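As a hedged illustration of the kind of textual representation involved (not the paper’s exact prompts), the sketch below walks the decision path of a scikit-learn tree for one instance and turns it into a prompt that an LLM could then verbalize:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

def path_to_facts(clf, x, feature_names):
    """Collect the threshold comparisons along the decision path for one sample."""
    node_path = clf.decision_path([x]).indices
    leaf = clf.apply([x])[0]
    facts = []
    for node in node_path:
        if node == leaf:
            break
        feat, thr = clf.tree_.feature[node], clf.tree_.threshold[node]
        relation = "<=" if x[feat] <= thr else ">"
        facts.append(f"{feature_names[feat]} = {x[feat]:.2f} is {relation} {thr:.2f}")
    return "; ".join(facts)

sample = iris.data[0]
predicted = iris.target_names[clf.predict([sample])[0]]
prompt = (f"Explain in simple terms why the model predicted '{predicted}' "
          f"given these decision steps: {path_to_facts(clf, sample, iris.feature_names)}")
print(prompt)  # this prompt would then be sent to an LLM to generate the explanation
```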
In this paper, we present Feature Space Navigator, an interactive interface that allows exploration of the decision boundary of a model. The proposal aims to overcome the limitations of the techno-solutionist approach to explanations based on factual and counterfactual generation, reaffirming interactivity as a core value in designing the conversation between the model and the user. Starting from an instance, users can explore the feature space by selectively modifying the original instance on the basis of their own knowledge and experience. The interface visually displays how model predictions react in response to the adjustments introduced by the users, letting them identify relevant prototypes and counterfactuals. Our proposal leverages the autonomy and control of users, who can explore the behavior of the decision model according to their own knowledge base, reducing the need for a dedicated explanation algorithm.
For people with early dementia (PwD), it can be challenging to remember to eat and drink regularly and to maintain healthy, independent living. Existing intelligent home technologies primarily focus on activity recognition but lack adaptive support. This research addresses this gap by developing an AI system inspired by the Just-in-Time Adaptive Intervention (JITAI) concept. It adapts to individual behaviors and provides personalized interventions within the home environment, reminding and encouraging PwD to manage their eating and drinking routines. Considering the cognitive impairment of PwD, we design a human-centered AI system based on healthcare theories and caregivers’ insights. It employs reinforcement learning (RL) techniques to deliver personalized interventions. To avoid overwhelming interaction with PwD, we develop an RL-based simulation protocol. This allows us to evaluate different RL algorithms in various simulation scenarios, not only finding the most effective and efficient approach but also validating the robustness of our system before implementation in real-world human experiments. The simulation results demonstrate the promising potential of adaptive RL for building a human-centered AI system with perceived expressions of empathy to improve dementia care. To further evaluate the system, we plan to conduct real-world user studies.
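The following is a minimal, purely illustrative sketch of this kind of simulation-based RL protocol (the hand-coded response probabilities and rewards are assumptions, not the study’s model): a tabular Q-learner decides whether to prompt a simulated person to drink, trading off the benefit of a drinking event against the cost of interrupting them:

```python
import random

STATES, ACTIONS = 4, 2          # state: hours since last drink (bucketed); actions: 0 = wait, 1 = prompt
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = [[0.0] * ACTIONS for _ in range(STATES)]

def simulated_person(state, action):
    """Toy behaviour model: prompting raises the chance of drinking but carries an annoyance cost."""
    drink_prob = min(0.1 * state + (0.5 if action == 1 else 0.0), 1.0)
    drank = random.random() < drink_prob
    reward = (1.0 if drank else 0.0) - (0.3 if action == 1 else 0.0)
    next_state = 0 if drank else min(state + 1, STATES - 1)
    return next_state, reward

state = 0
for _ in range(50_000):
    if random.random() < EPSILON:
        action = random.randrange(ACTIONS)
    else:
        action = max(range(ACTIONS), key=lambda a: Q[state][a])
    next_state, reward = simulated_person(state, action)
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

# Inspect the learned action values per state for the simulated person
for s, values in enumerate(Q):
    print(f"state {s}: wait={values[0]:.2f}  prompt={values[1]:.2f}")
```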
In this paper, we explore the synergies between Digital Humanities (DH) as a discipline and Hybrid Intelligence (HI) as a research paradigm. In DH research, the use of digital methods and specifically that of Artificial Intelligence is subject to a set of requirements and constraints. We argue that these are well-supported by the capabilities and goals of HI. Our contribution includes the identification of five such DH requirements: Successful AI systems need to be able to 1) collaborate with the (human) scholar; 2) support data criticism; 3) support tool criticism; 4) be aware of and cater to various perspectives and 5) support distant and close reading. We take the CARE principles of Hybrid Intelligence (collaborative, adaptive, responsible and explainable) as theoretical framework and map these to the DH requirements. In this mapping, we include example research projects. We finally address how insights from DH can be applied to HI and discuss open challenges for the combination of the two disciplines.
Despite the rapid integration of artificial intelligence (AI) into various research domains and the lives of everyday people, challenges with communicating and understanding these AI systems arise. The lack of a consistent method of communication highlights the need for a transdisciplinary approach to explain the inner workings of AI systems in a cohesive and accessible manner. We thus propose an ontological visual framework using semantically enhanced symbols, providing a symbolic language for conveying the structure, purpose, and characteristics of AI systems. The framework encompasses a generalizable glyph set of various AI system components, ensuring both common and obscure architectures can be represented. In this paper, we present the underlying logical formalisms that dictate the behavior of this visual framework as a means to significantly enhance the comprehensibility and understandability of AI system behaviors.
In order to enhance collaboration between humans and artificially intelligent agents, it is crucial to equip the computational agents with capabilities commonly used by humans. One of these capabilities is Theory of Mind (ToM) reasoning, the human ability to reason about the mental contents of others, such as their beliefs, desires, and goals. For an agent to benefit efficiently from a functioning computational ToM of its human partner in a collaboration, it needs to track the partner’s mental attitudes in a computationally practical way and to create approximate ToM models that can be maintained effectively. In this paper, we propose a computational ToM mechanism based on abstracting beliefs and knowledge into higher-level human concepts, referred to as abstractions. These abstractions, similar to those guiding human interactions (e.g., trust), form the basis of our modular agent architecture. We address an important challenge related to maintaining abstractions effectively, namely abstraction consistency. We propose different approaches to study this challenge in the context of a scenario inspired by a medical domain and provide an experimental evaluation over agent simulations.
Domain experts are one of the most important knowledge sources when building a knowledge base. However, communication about uncertain states and events is prone to misinterpretations and misunderstandings, because people prefer to convey probability estimations by verbal probability expressions (VPEs), which have high between-subject variability. Additionally, several biases exist when expressing uncertainty verbally. Nevertheless, the application of VPEs might be necessary. Therefore, means must be identified to manage VPEs and to translate them into numeric values appropriately. In this paper, we propose a co-learning approach with examples to efficiently and effectively communicate (subjective) probabilities of states and events in teams, in which human and AI team members are familiarized with the translation between VPEs and numeric values until both parties are capable of using solely numeric values.
As large language models (LLMs) continue to make significant strides, their integration into agent-based simulations offers transformational potential for understanding complex social systems. However, such integration is not trivial and poses numerous challenges. Based on this observation, in this paper we explore architectures and methods to systematically develop LLM-augmented social simulations and discuss potential research directions in this field. We conclude that integrating LLMs with agent-based simulations offers a powerful toolset for researchers and scientists, allowing for more nuanced, realistic, and comprehensive models of complex systems and human behaviours.
In a world increasingly reliant on artificial intelligence, it is more important than ever to consider its ethical implications. One key under-explored challenge is labeler bias — bias introduced by individuals who label datasets — which can create inherently biased datasets for training and subsequently lead to inaccurate or unfair decisions in healthcare, employment, education, and law enforcement. Hence, we conducted a study (N=98) to investigate and measure the existence of labeler bias using images of people from different ethnicities and sexes in a labeling task. Our results show that participants hold stereotypes that influence their decision-making process and that labeler demographics impact assigned labels. We also discuss how labeler bias influences datasets and, subsequently, the models trained on them. Overall, a high degree of transparency must be maintained throughout the entire artificial intelligence training process to identify and correct biases in the data as early as possible.
Demographic factors and the increasing demand for improved production efficiency are steering the transformation within the manufacturing domain towards smart manufacturing. This entails introducing artificial intelligence (AI), data analytics, and automation to improve the efficiency, productivity, and flexibility of manufacturing processes. With the integration of AI, there is a shift from humans merely interacting with technology to actively collaborating with it, especially with AI-enabled agents. This shift brings changes in work practices and tasks. Hence, a comprehensive understanding of the phenomenon becomes central for the design of human-AI collaboration that genuinely contributes to effective production and supports operators’ well-being. This scoping review aims to shed light on the evolving landscape of human-AI collaboration in smart manufacturing by presenting six key concepts derived from an analysis of 23 academic papers. Based on the findings, we propose a framework that offers an initial basis for the design of human-AI collaborative systems for smart manufacturing.
Virtual Heritage exhibitions aim to engage a diverse audience through the integration of Virtual Reality and various AI technologies, including Artificial Agents and Knowledge Graphs. Understanding the nuances of human-agent interactions is crucial to fully harness the potential of these technologies and deliver personalized and captivating experiences. Evaluating the alignment of Virtual Heritage applications with the vision of Hybrid Intelligence – where humans and machines collaborate toward a common goal – presents a significant challenge. In this paper, we investigate the assessment of Hybrid Intelligence within the Virtual Heritage domain using Knowledge Engineering methods. Through the analysis of six different scenarios presented as workflows of tasks and input/output data, we identify and compare classical Knowledge Engineering tasks with HI-specific tasks to measure the level of HI-ness achieved. Our study focuses on evaluating the synergy achieved by mixed teams during various tasks as a measure of HI-ness. The findings provide insights into the effectiveness of Knowledge Engineering to identify HI aspects within existing applications, the potential for quantifying and improving HI-ness in an application, and the identification of modeling limitations.
Generative AI presents vast opportunities but also risks. Misuse, whether intentional or not, can lead to significant “real-world” consequences. We presented subjects (n=139) with five vignettes describing incidents involving generative AI and explored the relationship between their level of AI literacy, attitude towards AI, trust in AI chatbots, and their reactions to the vignettes. Attitude and trust, measured before and after the vignettes, declined significantly. However, these changes, as well as the reactions to the vignettes, were unrelated to AI literacy. Yet, higher AI literacy was associated with more frequent use of AI chatbots, higher trust, and more positive attitudes towards AI. So while AI literacy appeared to be related to the general perception and usage of generative AI, it was not linked to the evaluation of incidents involving generative AI. The implications for trust calibration and appropriate reliance are discussed.
The general availability of large language models, and thus their unrestricted use in sensitive areas of everyday life such as education, remains a subject of major debate. We argue that employing generative artificial intelligence (AI) tools warrants informed usage, and we examined their impact on problem-solving strategies in higher education. In a study, students with a background in physics were assigned to solve physics exercises, with one group having access to an internet search engine (N=12) and the other group being allowed unrestricted use of ChatGPT (N=27). We evaluated their performance, strategies, and interaction with the provided tools. Our results showed that nearly half of the solutions provided with the support of ChatGPT were mistakenly assumed to be correct by students, indicating that they overly trusted ChatGPT even in their field of expertise. Likewise, in 42% of cases, students used copy & paste to query ChatGPT — an approach only used in 4% of search engine queries — highlighting the stark differences in interaction behavior between the groups and indicating limited task reflection when using ChatGPT. In our work, we demonstrated a need to (1) guide students on how to interact with LLMs and (2) create awareness of potential shortcomings for users.
For immersive experiences such as virtual reality, explorable worlds are often fundamental. Generative artificial intelligence looks promising for accelerating the creation of such environments. However, it remains unclear how existing interaction modalities can support user-centered world generation and how users remain in control of the process. Thus, in this paper, we present a virtual reality application to generate virtual environments and compare three common interaction modalities (voice, controller, and hands) in a pre-study (N = 18), revealing a combination of initial voice input and continued controller manipulation as the most suitable. We then investigate three levels of process control (all-at-once, creation-before-manipulation, and step-by-step) in a user study (N = 27). Our results show that although all-at-once reduced the number of object manipulations, participants felt more in control when using the step-by-step approach.
The engineering of reliable and trustworthy AI systems needs to mature. While facing unprecedented challenges, there is much to be learned from other engineering disciplines. We focus on the five pillars of (i) Models & Explanations, (ii) Causality & Grounding, (iii) Modularity & Compositionality, (iv) Human Agency & Oversight, and (v) Maturity Models. Based on these pillars, a new AI engineering discipline might emerge, which we aim to support using corresponding methods and tools for ‘Trust by Design’. A use case concerning mobility and energy consumption in an urban context is discussed.
The arrival of generative Artificial Intelligence (AI) in educational settings offers a unique opportunity to explore the intersection of human cognitive processes and AI, especially in complex tasks like writing. This study adopts a process-oriented approach to investigate the self-regulated learning (SRL) strategies employed by 21 doctoral and master’s students during a writing task facilitated by generative AI. It aims to identify and analyze the SRL strategies that emerge within the framework of hybrid intelligence, emphasizing the collaboration between human intellect and artificial capabilities. Utilizing a learning analytics methodology, specifically lag sequential analysis (LSA), the research examines process data to reveal the patterns of learners’ interactions with generative AI in writing, shedding light on how learners navigate different SRL strategies. This analysis facilitates an understanding of how learners adaptively manage their writing task with the support of the AI tool. By delineating the SRL strategies in AI-assisted writing, this research provides valuable implications for the design of educational technologies and the development of pedagogical interventions aimed at fostering successful human-AI collaboration in various learning environments.
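To illustrate the lag sequential analysis step with a made-up coded sequence (the codes and data below are hypothetical, not the study’s), the snippet counts lag-1 transitions between SRL actions and computes adjusted residuals to flag transitions that occur above chance:

```python
from collections import Counter
from math import sqrt

# Hypothetical sequence of coded SRL actions during AI-assisted writing
sequence = ["plan", "prompt_ai", "read_ai", "revise", "plan", "prompt_ai", "read_ai",
            "evaluate", "revise", "prompt_ai", "read_ai", "revise", "evaluate"]

pairs = list(zip(sequence, sequence[1:]))            # lag-1 transitions
n = len(pairs)
obs = Counter(pairs)
row = Counter(a for a, _ in pairs)                   # how often each code starts a transition
col = Counter(b for _, b in pairs)                   # how often each code ends a transition
codes = sorted(set(sequence))

for a in codes:
    for b in codes:
        expected = row[a] * col[b] / n
        if expected == 0:
            continue
        # Allison-Liker adjusted residual; |z| > 1.96 suggests an above-chance transition
        z = (obs[(a, b)] - expected) / sqrt(expected * (1 - row[a] / n) * (1 - col[b] / n))
        print(f"{a} -> {b}: observed={obs[(a, b)]}, expected={expected:.2f}, z={z:.2f}")
```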
This research presents a compelling exploration at the juncture of feminism and Artificial Intelligence (AI), seeking to discern pathways for empowering women through technological advancements and whether Hybrid Human AI helps reclaim womanhood. It employs feminist theory to contextualise the discourse within today’s socio-political landscape. The research methodology integrates co-design and participatory techniques, fostering a playful environment where humour and wit catalyse participants to confront and address personal experiences. By leveraging satire, the study endeavours to create safe spaces for women to collaborate with AI constructively and responsibly, utilising their experiences as case studies. The result highlights the potential of AI to assist women with social awareness, addressing their needs and reclaiming agency over their everyday lives. The insights indicate that we must rethink cyberfeminism in the light of equitable and inclusive AI technologies. When engaging with ethical considerations surrounding AI design, this paper emphasises transparency and women’s autonomy in decision-making. Through irony and speculative methodologies, the outcome points towards experimenting with identity and claiming agency by designing AI assistance through daily life decisions. While the contemporary discourse around AI focuses mainly on labour, privacy and workforce disruption, this research argues that we can use AI to envision empowering futures for women.