Ebook: Artificial Intelligence and Human-Computer Interaction

Proceedings of the 2nd International Conference (ArtInHCI 2024), Kunming, China, 25-27 October 2024

Series

Frontiers in Artificial Intelligence and Applications

Volume

404

Published

2025

Editors

Yalan Ye, Huiyu Zhou

ISBN

978-1-64368-583-0 (online)

Subject(s)

Artificial Intelligence, Human-Computer Interaction

Open Access

Description

The importance of artificial intelligence (AI) to all our lives is now undeniable, and with interactions between humans, computers, and AI continuing to increase, this area has become the focus of growing interest.

This book presents the proceedings of ArtInHCI2024, the 2nd International Conference on Artificial Intelligence and Human-Computer Interaction, held as a hybrid event from 25 to 27 October 2024 in Kunming, China. The ArtInHCI conference series was conceived with the aim of promoting academic exchange within and across disciplines, addressing theoretical and practical challenges and advancing current understanding and application; a process which it is hoped will also serve to spread amity, establish connections and enable future collaboration. ArtInHCI2024 provided a platform for the discussion of a number of hot topics, including deep learning, artificial neural networks, computer vision and pattern recognition, and the conference focused on research challenges as well as those of application. A total of 191 submissions were received for the conference, and after initial screening, 142 were submitted to a rigorous, double-blind peer review procedure based on relevance, writing skills, scientific quality and soundness, and contribution or practical implications. Following a final decision-making process, 93 of the papers were selected for presentation and publication here, an acceptance rate of 48.7%.

Covering a wide range of topics in the sphere of AI and human/computer interaction, the book will be of interest to all those working in the field.

Contents

Front Matter

Pages

i - xx

Category

Front Matter

Preface

This volume in the series, Frontiers in Artificial Intelligence and Applications (FAIA), presents the proceedings of the 2nd International Conference on Artificial Intelligence and Human-Computer Interaction (ArtInHCI2024). The conference was successfully held in Kunming, China from 25 to 27 October 2024.

ArtInHCI2024 organized discussions on a number of hot topics, including deep learning, artificial neural networks, computer vision and pattern recognition, and focused on the challenges of research as well as of application. The conference consisted of an onsite session and an online session, showcasing various items such as keynote speeches, oral reports, poster presentations, and Q&A. We were fortunate to have with us experts and scholars from around the globe to share their latest findings and insights, including Professor Ji Zhang, University of Southern Queensland (UniSQ), Australia; Professor Huiyu Zhou, University of Leicester, UK; Professor Liang Liao, Zhongyuan University of Technology, China; Assoc. Prof. Aslina Baharum, Sunway University, Malaysia; Prof. Xiaohui Zou, Peking University, Director of Interdisciplinary Knowledge Modeling Research Group, Special & Researcher, Hengqin Searle Technology Co. Ltd., China; Prof. Ljiljana Trajkovic, Simon Fraser University, Burnaby, British Columbia, Canada; Assoc. Prof. Teh Sin Yin, Universiti Sains Malaysia; Assoc. Prof. Le Nguyen Quoc Khanh, Taipei Medical University (TMU), Taiwan, China; Asst. Prof. Teoh Wei Lin, Heriot-Watt University Malaysia, Malaysia; Asst. Prof. Chong Zhi Lin, Universiti Tunku Abdul Rahman, Malaysia; Asst. Prof. Luís Silva, NOVA School of Science and Technology, Portugal; and Assoc. Prof. Yuping Song, Shanghai Normal University, China.

The ArtInHCI conference was conceived with the aim of promoting academic exchange within and across disciplines, addressing theoretical and practical challenges and advancing current understanding and application; a process which we hope will also spread amity, establish connections and enable future collaboration.

The ArtInHCI organizing committee extend their sincerest gratitude to all those who have supported the conference in their various ways: the authors who have chosen this platform to publish their works and communicate with peers, the participants who took an interest and attended the conference, the chairs and committee members whose professional expertise and judgment have been indispensable, the keynote speakers who so generously shared their vision and passion, and the reviewers who upheld the faith in scholarship and contributed their experience and honest opinions. It has been a pleasure and honor to work alongside them, and we look forward to further cooperation with them at future ArtInHCI conferences.

The Editors

Artificial Intelligence and Machine Learning

Uncertainty-Based Dynamic Weighted Experience Replay for Human-in-the-Loop Deep Reinforcement Learning

Authors

Xia Tian, Yu Kang, Yunbo Zhao, Yaqing Zhou, Pengfei Li

Pages

2 - 8

DOI

10.3233/FAIA250100

Category

Research Article

Abstract

Human-in-the-loop reinforcement learning (HIRL) enhances sampling efficiency in deep reinforcement learning by incorporating human expertise and experience into the training process. However, HIRL methods still heavily depend on expert guidance, which is a key factor limiting their further development and large-scale application. In this paper, an uncertainty-based dynamic weighted experience replay approach (UDWER) is proposed to solve the above problem. Our approach enables the algorithm to detect decision uncertainty, triggering human intervention only when uncertainty exceeds a threshold. This reduces the need for continuous human supervision. Additionally, we design a dynamic experience replay mechanism that prioritizes machine self-exploration and human-guided samples with different weights based on decision uncertainty. We also provide a theoretical derivation and related discussion. Experiments in the Lunar Lander environment demonstrate improved sampling efficiency and reduced reliance on human guidance.
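
The idea of requesting human input only above an uncertainty threshold can be pictured with a minimal sketch. The abstract does not say how decision uncertainty is measured or how replay weights are computed, so the entropy-based measure, the threshold value, the weighting rule and every name below (`decision_uncertainty`, `select_action`, `replay_weight`, `human_policy`) are illustrative assumptions rather than the authors' UDWER implementation.

```python
import math

def softmax(values, temperature=1.0):
    """Convert action values into a probability distribution."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def decision_uncertainty(q_values):
    """Normalized entropy of the softmax policy, in [0, 1] (assumed measure)."""
    probs = softmax(q_values)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(q_values))

def select_action(q_values, human_policy, threshold=0.9):
    """Ask the human only when uncertainty exceeds the (assumed) threshold."""
    u = decision_uncertainty(q_values)
    if u > threshold:
        return human_policy(), "human", u
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    return greedy, "agent", u

def replay_weight(source, uncertainty):
    """Hypothetical rule: human samples count more when the agent was unsure."""
    if source == "human":
        return 1.0 + uncertainty
    return 1.0 - 0.5 * uncertainty

confident_q = [0.1, 5.0, 0.2, 0.0]   # one action clearly dominates
unsure_q = [1.0, 1.01, 0.99, 1.0]    # near-uniform action values

action_a, source_a, u_a = select_action(confident_q, human_policy=lambda: 3)
action_b, source_b, u_b = select_action(unsure_q, human_policy=lambda: 3)
```

With these toy values the agent acts on its own in the first case and defers to the human in the second; in a real training loop the chosen source and uncertainty would be stored with each transition so that a replay buffer can weight it.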

Using Particle Filters as an Optimizer for Model Predictive Control

Authors

Yen-Wen Chung, Ming-Yuan Peng, Chien-Lin Chiang, I-Ju Chen, Yi-Yuan Chiang

Pages

9 - 15

DOI

10.3233/FAIA250101

Category

Research Article

Abstract

Model Predictive Control (MPC) has become an important control method for autonomous vehicles and complex robotic systems. However, MPC requires solving an optimization problem to ensure optimal control inputs, which can be computationally expensive for nonlinear and high-dimensional systems. This paper proposes using Particle Filters (PF) to execute the solving process of MPC to enhance efficiency and accuracy. Our approach applies PF to solve quadratic programming problems and integrates it into the MPC framework. We investigate two specific applications: lane-keeping control for autonomous vehicles and control of a robotic arm mounted on a differential drive mobile platform. Experimental results show that using PF can effectively optimize MPC problems, significantly reduce computation time, and improve control accuracy, particularly in handling complex nonlinear systems in the mentioned applications. This paper demonstrates the potential of PF as an optimizer in MPC and suggests further testing of this approach in more complex control problems to verify its broad applicability and reliability.
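
As a rough illustration of using particles to solve a quadratic problem, the sketch below minimizes an unconstrained two-dimensional quadratic cost by repeatedly weighting, resampling and perturbing a particle set. It is a generic importance-resampling scheme under assumed settings (particle count, jitter schedule, exponential weighting, the toy matrices `Q` and `c_vec`), not the formulation used in the paper, and it omits the constraints and receding-horizon loop of a real MPC controller.

```python
import math
import random

def quadratic_cost(x, Q, c_vec):
    """Evaluate 0.5 * x^T Q x + c^T x for a 2D point x."""
    quad = sum(x[i] * Q[i][j] * x[j] for i in range(2) for j in range(2))
    return 0.5 * quad + sum(c_vec[i] * x[i] for i in range(2))

def particle_minimize(cost, n_particles=500, n_iters=40, seed=0):
    """Minimize `cost` by weighting, resampling and perturbing particles."""
    rng = random.Random(seed)
    particles = [[rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)]
                 for _ in range(n_particles)]
    sigma = 1.0                      # jitter scale, shrunk every iteration
    best, best_cost = None, math.inf
    for _ in range(n_iters):
        costs = [cost(p) for p in particles]
        i_min = min(range(n_particles), key=lambda i: costs[i])
        if costs[i_min] < best_cost:
            best, best_cost = list(particles[i_min]), costs[i_min]
        # Lower cost gives a higher weight (shifted by the minimum for stability).
        weights = [math.exp(-(ci - costs[i_min])) for ci in costs]
        resampled = rng.choices(particles, weights=weights, k=n_particles)
        particles = [[v + rng.gauss(0.0, sigma) for v in p] for p in resampled]
        sigma *= 0.85
    return best, best_cost

Q = [[2.0, 0.0], [0.0, 4.0]]
c_vec = [-2.0, -8.0]                 # analytic minimizer is x = (1, 2), cost -9
x_best, f_best = particle_minimize(lambda x: quadratic_cost(x, Q, c_vec))
```

Because the toy problem has a closed-form minimizer, the particle estimate can be checked directly against it; in an MPC setting the cost would instead be evaluated by rolling the system model forward over the prediction horizon.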

Multimodal Cognitive Workload Recognition with Missing Modalities

Authors

Yalan Ye, Wang Xiao, Qiang Zhao, Hongyu Jiang, Fan Li, Pei Guo, Wenxia Huang, Yujie He

Pages

16 - 22

DOI

10.3233/FAIA250102

Category

Research Article

Abstract

Recognizing the cognitive workload (CW) of operators is crucial to avoid human factor failures. Recently, multimodal CW recognition has attracted increasing attention since it leverages complementary information from different physiological modalities to enhance CW recognition performance. However, in real-world scenarios, not all modalities are always available. The performance of multimodal CW recognition may degrade when any modality is missing, especially electroencephalogram (EEG). Although existing methods are capable of addressing the issue by explicitly recovering the missing modalities based on the available ones, they struggle to generate high-quality missing modality features due to their neglect of inter-modality relationships. In this paper, we propose a novel multimodal learning framework for CW recognition to address the incomplete modality problem. To recover the missing modality, a mutual information-assisted recovery strategy, which can maximize the mutual information between the missing and the available modalities, is used to train a feature generation module. Furthermore, to efficiently utilize complementary multimodal information, we employ a feature fusion strategy based on a channel attention mechanism to help the model focus on the key information. As a result, the proposed framework can achieve good CW recognition performance in missing modality scenarios. Our method attains an average accuracy of 75.04% on a public dataset, which is the highest among all compared methods and demonstrates the effectiveness of our framework.

Error Correction Model Based on NR-GRU for Real Estate Prices Prediction

Authors

Guangcan Cui, Yuetong Zhang, Fakai Yan, Chunyu Kao

Pages

23 - 30

DOI

10.3233/FAIA250103

Category

Research Article

Abstract

The real estate price is an essential index for measuring the real estate industry, urban economy, and investment policy. This paper introduces a non-parametric regression (NR)-deep learning framework, which uses the non-parametric model to predict the trend of the real estate price series and then uses the deep learning model to capture the residual information, thereby correcting errors and improving the accuracy of real estate price prediction. The empirical results show that error correction can improve the prediction accuracy by an order of magnitude, with the six evaluation criteria improving by far more than a factor of 10. In addition, under the error correction framework, the NR-gated recurrent unit (GRU) model has certain advantages in processing nonlinear complex error sequences. Compared with the SVR and LSTM models under the same framework, its average improvement in the evaluation criteria is about 5.20% and 0.09%, respectively, and the DM statistics are all positive.

Research on Personalized Course Recommendation for Online MOOC Platforms Based on Emotion Recognition

Authors

Bing Li, Yuqi Hou, Biao Yang

Pages

31 - 41

DOI

10.3233/FAIA250104

Category

Research Article

Abstract

In the era of smart education, online courses as the avant-garde force in the educational field are leading the way in innovating teaching methods. Although online learning platforms provide students with convenient channels for learning, issues such as course quality, personalized service, and learning motivation still exist. This study, based on China University MOOC, proposes a personalized online course recommendation method based on emotion recognition, aimed at deeply understanding students’ emotional states to enhance the accuracy and personalization of course recommendations. Initially, this paper collected a dataset of course user comments from China University MOOC and built an emotional dictionary in the education domain to analyze users’ emotional states. Combining emotion analysis, user characteristics, and course features, the SAFM and SDFM models were proposed, incorporating a negative sampling method to generate personalized course recommendations. The experiments prove that this method effectively enhances students’ learning motivation and participation, offering new insights for the development of online education platforms.

Exploring Text Classification Methods for Bulletin Board System Posts: A Comparative Analysis of BERT, BiLSTM, and Different Loss Functions

Authors

Wenhan Hu, Mini Han Wang

Pages

42 - 54

DOI

10.3233/FAIA250105

Category

Research Article

Abstract

Multi-Label Text Classification (MLTC) is a crucial task in natural language processing (NLP), enabling the assignment of multiple labels to a single text sample, which aligns with the diverse and multifaceted nature of discussions typically found in Bulletin Board Systems. This study presents an investigation into text classification methodologies, leveraging a dataset comprising 388,693 entries, with 234,237 entries manually annotated for model training. The dataset encompasses diverse text data from prominent social platforms, including GitHub, H5-based forums, WeChat, QQ group chats, and more. Four distinct methods for text classification are compared: BERT and BiLSTM models with Binary Cross-Entropy (BCE) loss, BERT for feature extraction followed by BiLSTM and BCE, BERT and BiLSTM models with Focal Loss (FL), and BERT for feature extraction followed by BiLSTM and FL. The experimentation reveals insights into their performance, indicating that models utilizing pre-trained BERT for feature extraction outperform those without pre-training. Focal Loss emerges as a superior alternative to Binary Cross-Entropy, demonstrating efficacy in handling class imbalance and noisy data, thereby improving overall model accuracy and robustness. These findings underscore the importance of thoughtful model architecture and loss function selection. Future research directions include exploring ensemble methodologies, alternative pre-training techniques for BERT, and enhancing model interpretability. Keeping pace with NLP advancements and integrating cutting-edge techniques into future investigations holds promise for further advancements in model efficacy and practical utility.
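
The difference between the two losses compared in the abstract can be seen in a small sketch. It uses the standard binary focal loss form, in which the cross-entropy term is scaled by alpha_t * (1 - p_t) ** gamma; the values `gamma=2.0` and `alpha=0.25`, the helper names, and the toy probabilities are assumptions chosen for illustration, not settings reported by the authors.

```python
import math

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one label: p is the predicted probability, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def multilabel_loss(probs, labels, loss_fn):
    """Average a per-label loss over all labels of one sample."""
    return sum(loss_fn(p, y) for p, y in zip(probs, labels)) / len(labels)

probs = [0.95, 0.10, 0.30]           # toy per-label probabilities
labels = [1, 0, 1]                   # toy multi-label target
bce_value = multilabel_loss(probs, labels, bce)
focal_value = multilabel_loss(probs, labels, focal)

# How strongly focal loss shrinks an easy positive versus a hard positive.
easy_ratio = focal(0.95, 1) / bce(0.95, 1)
hard_ratio = focal(0.30, 1) / bce(0.30, 1)
```

The two ratios show the intended effect: a well-classified label is down-weighted far more than a poorly classified one, so hard or rare labels contribute a larger share of the total loss.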

Towards Explainable Deep Learning with Attention Mechanisms for Fake Image Detection

Authors

Suen Kei Lau, Sook-Ling Chua, Lee Kien Foo

Pages

55 - 61

DOI

10.3233/FAIA250106

Category

Research Article

Abstract

The proliferation of fake images on the internet has become increasingly alarming. Advanced techniques including generative adversarial networks can generate visually realistic images that can mislead people and create false information. This poses a threat and can cause serious impacts. Many methods based on deep learning have been proposed to detect fake images. These methods have demonstrated the ability to achieve highly accurate results in detecting fake images. However, due to their “black-box” nature, the decision-making process in these models lacks explainability. In this paper, we integrate the convolutional block attention module into ResNet-18 to improve the explainability of the deep learning model for fake image detection. The results showed that our method achieved higher performance than the baseline method.

A Comprehensive Analysis of Plant Invasion Classification Using Random Forest Method

Authors

Ziming Cui, Xinyue Qian, Mingrui Cai

Pages

62 - 71

DOI

10.3233/FAIA250107

Category

Research Article

Abstract

Plant invasion presents a substantial challenge to ecosystems on a global scale, thus requiring the development of effective categorization techniques for precise recognition and control. This article offers an in-depth investigation of plant invasion categorization using the Random Forest (RF) approach. The fundamental principles of RF and its relevance to invasion ecology are deliberated. By means of a methodical examination, we assess the effectiveness of RF in discerning invasive plant species from non-invasive ones based on a variety of ecological characteristics. Our results emphasize the efficiency of RF in classification assignments, underscoring its potential as a valuable instrument for the management of invasive species and conservation endeavors.

Prediction of RMB Offshore Exchange Rate Driven by Investor Sentiment: Innovative Application of Deep Learning Model

Authors

Nan Wang, Bochuan Zhou, Po Ning

Pages

72 - 79

DOI

10.3233/FAIA250108

Category

Research Article

Abstract

The offshore Renminbi (CNH) exchange rate against the United States dollar (USD) better reflects the immediate changes in market supply and demand and investor sentiment due to its exemption from foreign exchange control in mainland China. This paper explores how to improve the forecasting accuracy of the offshore RMBUSD exchange rate by constructing an investor sentiment index based on online forums and news comments. This paper first collects and analyzes many financial news headlines on the English for Treasury website, and applies the BERT model from natural language processing to identify and quantify the sentiment tendencies in the news headlines, in order to construct a daily investor sentiment index. Subsequently, this sentiment index is combined with traditional financial market and macroeconomic indicators, and a variety of advanced machine learning and deep learning methods, including Random Forest, Support Vector Machines, Long Short-Term Memory Networks (LSTM), and Gated Recurrent Units (GRUs), are applied to forecast the offshore RMB exchange rate. It is found that the introduction of the sentiment index significantly improves the accuracy of the prediction models. Especially in the LSTM and GRU models, the inclusion of the sentiment index makes the models perform better in capturing the nonlinear features of exchange rate fluctuations.

Rotation Invariant Color Image Recognition Based on Moments and Micro-Convolution Neural Network

Authors

Jing Wang, Bing He, Wenqiang Xi, Cheng Peng, Guancheng Lin

Pages

80 - 89

DOI

10.3233/FAIA250109

Category

Research Article

Abstract

Image recognition using deep learning, especially deep convolutional neural networks (DNN), has achieved great success owing to the increasing use of computer vision in our daily life. In this paper, we introduce a novel framework for an image recognition system based on quaternion fractional-order radial orthogonal moments and deep learning. The proposed image recognition system is derived by combining quaternion fractional-order polar harmonic-Fourier moments (QFr-PHFMs) and a Micro-Convolution Neural Network (Micro-CNN). The proposed methodology can use a small number of network layers and achieve high recognition accuracy, especially when processing images under high noise and smoothing filter conditions. Theoretical analyses and experimental results showed that the proposed methodology offers enhanced image recognition compared with different moment-based feature-extraction algorithms and existing CNN methods.

The Effects of Generative Artificial Intelligence Technologies on Writing Tasks in Foreign Language Learning

Authors

Anna P. Avramenko, Anna A. Nasonova, Alexey A. Tarasov, Vladimir V. Ternovski

Pages

90 - 104

DOI

10.3233/FAIA250110

Category

Research Article

Abstract

The proliferation of generative artificial intelligence (AI) technologies has introduced challenges related to the use of automatically generated texts within foreign language learning settings. This study is aimed at developing clear-cut principles of incorporating large language models (LLM) into the English teachers’ workflow. Our approach employed examining research papers and conducting a small-scale study designed to obtain and evaluate texts produced by state-of-the-art LLMs in response to the General English course writing assignments. The experiment revealed peculiarities and limitations of the chosen LLMs in generating texts for study purposes. The educational potential of this technology as well as suggested conditions for its effective integration into instructional practices were also presented. We concluded that the transformation of writing assignments is necessary, with a focus on fostering critical thinking skills. Furthermore, fostering students’ information and communication technology (ICT) skills while engaging with an LLM chatbot is considered of paramount importance.

Enhancing LLM’s Expressive Abilities Through Defining Roles and Evaluation Indicators

Authors

Changxu Ma, Na Li, Bo Lu

Pages

105 - 112

DOI

10.3233/FAIA250111

Category

Research Article

Abstract

Recently, in the field of natural language processing, there has been an increasing emphasis on enhancing the language expressiveness of generative pre-training models. To address this challenge, this paper proposes an approach that involves specified identities or roles and evaluation metrics. By introducing specified identities or roles, the model can adopt various communicative roles tailored to specific contexts and needs, thereby better adapting to different scenarios. In terms of model evaluation, we used ten sample models and provided each model with 3000 questions. Other models and humans rated the answers given by the model on a scale of 0 to 10, and the average score was then obtained. This average score is then provided as feedback to the model, encouraging it to reflect and provide more accurate answers. Finally, the paper explores the potential application prospects of this approach in human-computer dialogues, personalized Q&A systems, and other domains, demonstrating its value in enhancing natural language processing technology.

Detecting Suicidal Ideations on Reddit with Transformer Models

Authors

Eldar Yeskuatov, Sook-Ling Chua, Lee Kien Foo

Pages

113 - 119

DOI

10.3233/FAIA250112

Category

Research Article

Abstract

Early detection of suicidal ideations is one of the key suicide prevention strategies. However, there are challenges that obstruct the detection of suicidal ideations. Mainly, the stigma surrounding mental health, and suicide in particular, obstructs traditional risk screening methods, such as questionnaires and interviews. These methods rely on at-risk individuals to explicitly communicate their suicidal thoughts. At the same time, people with suicidal ideations are increasingly turning to online forums such as Reddit to share their experiences and seek emotional support. Consequently, these platforms have emerged as a large source of textual data for detecting suicidal ideations using machine learning and natural language processing methods. This paper aims to explore the effectiveness of transformer-based models for detecting suicidal ideations on Reddit forums. In this study, the transformer models were fine-tuned and compared against machine learning and deep learning baselines. Our experimental results show that the fine-tuned base-BERT transformer model demonstrates superior performance in detecting suicidal ideations compared to baseline machine learning and deep learning models, achieving an F1-score of 99%.

Rolling Decomposition Prediction of Gold Price Based on Nonparametric and Deep Learning Models

Authors

Jinrui Ruan, Ying Zheng, Chunyu Kao, Yuping Song

Pages

120 - 127

DOI

10.3233/FAIA250113

Category

Research Article

Abstract

Gold futures, as an essential hedge asset, have received much attention. In this paper, the VMD-reconstruction-integration framework combining rolling windows is proposed to predict gold futures prices. The Fine to Coarse (FTC) method is used to reconstruct the decomposed IMFs, and the non-parametric regression (NR) model and extreme learning machine (ELM) model are used to predict the reconstructed long-term trend term and short-term disturbance term respectively, which effectively avoided the information leakage problem in the decomposition process and improved the prediction accuracy of gold futures price. The empirical results show that after avoiding the decomposition leakage problem, the model under the decomposition framework still has a certain improvement effect. In addition, the R-VMD-NR/ELM model has the best prediction effect, and compared with the R-VMD-NR/SVR model, the six evaluation criteria improved by 0.8883, 9.7188, 0.0021, 1.0492, 0.008, and 0.02, respectively.

Temperature Abnormality Monitoring Model for Industrial Equipment-Based Image and Infrared Thermography

Authors

Ting-Yi Chiang, I-long Lin

Pages

128 - 133

DOI

10.3233/FAIA250114

Category

Research Article

Abstract

The study proposes a model for monitoring temperature anomalies in industrial machinery based on image and thermography data, focusing on motors and electrical panels. It aims to enhance the operational stability of the machine through multimodal data fusion analysis using Convolutional Neural Networks (CNN). The model includes four parts: data collection, data preprocessing, anomaly detection model training, and deployment.

Improved AH2E2 Contrast Enhancement for Deep Learning on Night-Scenario Canal Segmentation

Authors

Ta-Wen Kuan, Siying Cai, Xiaodong Yu, Yuechun Wang, Ying Chen, Yuh-Chung Lin, Shun Nian Luo

Pages

134 - 143

DOI

10.3233/FAIA250115

Category

Research Article

Abstract

This work on canal semantic segmentation is a preliminary investigation of visual intelligence for an SSSB (Self-Sailing Sweeper Boat), which must keep its sailing within a restricted region while sweeping the canal. Its contribution and novelty lie in addressing night-scenario canal segmentation under low-light conditions, in which the boundaries between the canal region and the bank cannot be precisely distinguished for ground-truth labeling prior to dataset training and validation, which in turn decreases segmentation efficiency. To this end, this work investigates contrast enhancement based on both Histogram Equalization (HE) and Adaptive Histogram Equalization (AHE), together named AH²E², on night-scenario canal images to increase boundary visibility and support precise labeling of the ground-truth region for canal segmentation. Three U-Net-based approaches, namely the Primordial U-Net, ResU-Net, and AttresU-net, are then combined with HE and AHE, giving six combinations that are examined for evaluation. To further inspect the tradeoff between training cost and segmentation efficiency in terms of the required number of epochs and the number of ground-truth labels used, the experiments vary the number of ground-truth labels from 150 to 750 in steps of 150 and set the number of epochs to 100. Experimental results show that HE contrast enhancement with the AttresU-net learning approach achieves better segmentation efficiency than the other five combinations. However, HE+AttresU-net is also observed to incur a higher training cost than the other five combinations. Further discussion is given in the experiments.

Design and Implementation of a Multimodal Language Recognition System

Authors

Jie He, Yue Tan, Chen Liu

Pages

144 - 158

DOI

10.3233/FAIA250116

Category

Research Article

Abstract

In recent years, speech recognition technology and image recognition technology have gradually become the main ways of human-computer interaction, and research on speech recognition against noisy backgrounds has also gradually emerged. Although the recognition accuracy of isolated words has reached 99% in the testing environment, from a practical perspective, the accuracy of speech recognition is significantly reduced under the influence of background noise. In order to further improve the accuracy of language recognition, this article designs and implements a multimodal language recognition system based on a Markov model, which runs on Windows 10 and is written in C++. The audio features selected are MFCC features and FBank features, while the image features selected are the geometric and shape features of the lip fitting curve. It has been verified that the multimodal language recognition system performs better than pure speech recognition in noisy environments.

An End-to-End Network for Fast Object Detection Using Stereo Images

Authors

Yangjun Xu, Lingsen Cheng, Xin Jin, Yingchun Yang, Qiaodi Zeng

Pages

159 - 165

DOI

10.3233/FAIA250117

Category

Research Article

Abstract

Compared with the monocular-based object detection approach, the binocular-based one can exploit much richer cues like the depth information. However, existing binocular-based methods typically require calculating explicit disparity maps of scenes as a depth cue, which brings extra computational cost and may cause errors in intermediate depth inference. In order to overcome these shortcomings, we propose an end-to-end neural network for binocular-based object detection. The network has an asymmetric two-stream architecture. One stream, called the Implicit Depth Mining Network (IDMN), takes charge of the depth cue extraction from stereo images. The other stream, called the Multi-Modal Detection Network (MMDN), exploits the appearance cue from a monocular image and then fuses the appearance cue and depth cue for object detection. Such a model exploits depth information but does not need to explicitly calculate the disparity map or depth map, so it can work efficiently in practice. Experimental results indicate that our method achieves a good trade-off between effectiveness and efficiency.

ConvNeXt Based Hybrid Models with Multi-Modal Feature Fusion for ECG Classification

Authors

Ameneshewa Abush Sahlu, Sintayehu Mandefro Gizaw, Mekonen Hiwot Yimer, Abebe Mebratu Mekbibu, Molla Woretaw Teshome, Teshome Nitsihit Yehualashet

Pages

166 - 175

DOI

10.3233/FAIA250118

Category

Research Article

Abstract

Cardiovascular disease (CVD) remains a leading cause of global mortality, requiring accurate and early diagnosis. This study proposes a ConvNeXt-based multi-module feature fusion approach that enhances feature extraction, interpretability, and spatial-temporal representation in ECG classification. Using the MIT-BIH dataset, ECG signals are denoised with Discrete Wavelet Transform (DWT) and balanced using SMOTE, then encoded into GASF, GADF, and MTF images. These images are fused to improve generalization within the 2D module. The first module uses a ConvNeXt architecture with CBAM for feature extraction from the fused 2D representations, while the second combines CNN, SENet, and BiLSTM to analyze 1D ECG signals. This fusion maximizes the benefits of both data types, leading to robust cardiac abnormality detection. The approach achieves training and validation accuracies of 99.97% and 99.90%, respectively, demonstrating its potential for practical cardiovascular diagnostics.
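
For readers unfamiliar with the image encodings named in the abstract, the following sketch shows how a one-dimensional signal can be turned into Gramian Angular Summation and Difference Fields (GASF and GADF) using their standard definitions: the series is rescaled to [-1, 1], mapped to angles with arccos, and the matrices hold cos(phi_i + phi_j) and sin(phi_i - phi_j). The function names and the toy `beat` samples are assumptions for illustration; the paper's preprocessing (DWT denoising, SMOTE balancing, the MTF encoding and image fusion) is not reproduced here.

```python
import math

def rescale(series):
    """Min-max rescale a non-constant series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    return [((x - hi) + (x - lo)) / (hi - lo) for x in series]

def gramian_angular_fields(series):
    """Return the (GASF, GADF) matrices for a 1D series."""
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    scaled = [min(1.0, max(-1.0, x)) for x in rescale(series)]
    phi = [math.acos(x) for x in scaled]
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf

beat = [0.0, 0.1, 0.9, -0.3, 0.05, 0.0]  # toy samples, not real ECG data
gasf, gadf = gramian_angular_fields(beat)
```

Each matrix is an n-by-n image for a signal of n samples, which is what allows a 2D network such as ConvNeXt to consume a time series; GASF is symmetric while GADF is antisymmetric with a zero diagonal.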

Day-Ahead NOx Emission Prediction Based on SPCLPM

Authors

Wei Zhang, Xiaoyi Cao, Mingming Wang, Lanna Du

Pages

176 - 184

DOI

10.3233/FAIA250119

Category

Research Article

Abstract

Aiming at the problem of day-ahead NOx emission prediction for thermal power units, a sequence-to-point model, SPCLPM, is proposed, which combines Conv1D and LSTM in one prediction model. The model is fed with 12 features selected according to the NOx generation mechanism. A one-dimensional convolutional network is used to automatically extract the dependencies between the selected features while maintaining the chronological order. An LSTM is used to extract the temporal characteristics, and the prediction results are output through a fully connected layer. The model is trained on the monitoring data of four thermal power units. The experimental results show that the prediction performance of SPCLPM is significantly better than that of an LSTM model without one-dimensional convolution and a traditional random forest model, and that it can more accurately track and predict the trend of NOx emissions over the next 24 hours.

Coastline Extraction Based on Deep Learning Using Gaofen-1 and Ziyuan-3 Satellite Imagery

Authors

Hui Li, Chenyi Jiang, Jinzhuang Shi, Qiyuan Xie, Linhai Jing

Pages

185 - 193

DOI

10.3233/FAIA250120

Category

Research Article

Abstract

The potential of Chinese high-spatial-resolution satellite imagery and deep learning models for coastline extraction was explored in this work. The performances of three deep learning models, U-Net, ResUnet, and SegNet, were compared using 2-m resolution pansharpened products of Gaofen-1 (GF1) and Ziyuan-3 (ZY3) imagery. The prediction results of ResUnet were significantly more accurate than those of U-Net and SegNet. The trained ResUnet model was then used to predict and extract the coastlines of Haikou City, Hainan Province. The 2-m resolution coastline products of Haikou City in 2016, 2018, and 2019 were obtained. The results showed no significant changes in the coastline of Haikou City from 2016 to 2019.

Bitcoin Volatility Forecasting Based on Time Series Decomposition and Deep Learning Model

Authors

Yankun Sun, Xiaolong Tang, Yile Jiang

Pages

194 - 201

DOI

10.3233/FAIA250121

Category

Research Article

Abstract

In recent years, various digital currencies have emerged, among which Bitcoin has been widely accepted as an alternative to sovereign currencies for commodity trading. However, the dramatic volatility of Bitcoin prices can pose a risk to global financial markets. In this paper, we first construct a more comprehensive forecasting index system from seven aspects, and then construct a VMD-GRU model. This model uses variational mode decomposition (VMD) to decompose the time series into intrinsic mode functions (IMFs) and uses a gated recurrent unit (GRU) to forecast the different IMFs. This paper also compares the forecast results with classical machine learning models and deep learning models, and the results show that the forecast accuracy of the VMD-GRU model is more than 16% better than that of the other models.

The Influence of the Layout of Different Frequencies on the Accuracy of SSVEP-Based BCI System

Authors

Lan Niu, Li An, Jingxiang Deng, Jianxiong Bin, Yingbin Zhao, Xiaoyang Kang

Pages

202 - 208

DOI

10.3233/FAIA250122

Category

Research Article

Abstract

The BCI system based on Steady-State Visual Evoked Potential (SSVEP) is widely used because of its high system stability, information transfer rate (ITR), and accuracy. Accuracy is an important indicator for evaluating BCI system performance. There has been a lot of research on the effect of SSVEP paradigm parameters, such as frequency, shape, and color, on recognition accuracy. In this paper, we focus on the influence of the layout of different frequencies on the performance of BCI systems. We carried out experimental verification in an SSVEP-based BCI system. The experimental results show that adjusting the frequency layout improves the recognition accuracy of the system from 93.61% to 96.11%.
