

Machine learning (ML) has become a mainstream approach to combating transaction fraud. For financial institutions and businesses, detecting fraudulent transactions in real time with low latency is highly important, as it enables rapid identification and prevention. Mitigating fraud with ML while also reducing latency remains challenging, and performing inference within programmable network devices offers a potential solution. In this paper, we introduce MIND, which conducts ML-based fraud detection within programmable network devices. MIND is prototyped on both software and hardware network devices, including BMv2, Intel Tofino, and NVIDIA BlueField-2 DPU, and is evaluated with three publicly available transaction datasets. Experimental results demonstrate that MIND detects transaction fraud in real time, with a throughput of 6.4 terabits per second and microsecond-scale latency. Compared with server-based solutions, MIND processes over 800× more transactions per second and reduces per-transaction latency by a factor of over 1300. At the same time, MIND attains 99.94% of the server-based benchmarks’ accuracy and 93.66% of their F1-score, exhibiting only marginal degradation in classification performance. MIND therefore offers substantial savings in the number of servers, leading to reduced costs and energy consumption, while providing a better customer experience.