Many binary prediction problems involve imbalanced datasets, in which the minority class is vastly outnumbered by the majority class. This is especially true in applications of machine learning to the detection of fraud, errors, or other exceptions. In this paper, we address the problem of extreme imbalance, i.e., settings in which the ratio of majority to minority instances exceeds 500. Given the scarcity of minority examples, oversampling to balance the classes incurs a prohibitive computational cost; we therefore explore and extend undersampling approaches. Specifically, we propose a modeling framework (i.e., a sequence of modeling steps) that seeks to leverage as much of the training data as possible. Our results indicate a better trade-off between false positives and false negatives, which makes the framework more suitable for real-life applications.
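The paper does not detail its modeling steps in the abstract, but the core idea — undersampling while still leveraging as much of the training data as possible — is commonly realized by training an ensemble over disjoint undersampled subsets of the majority class. The sketch below is an illustration of that general technique under assumed data and parameters (the synthetic features, the logistic-regression base learner, and the `n_models` setting are all hypothetical, not taken from the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic extreme-imbalance data: 50,000 majority vs. 100 minority
# examples, i.e. an imbalance ratio of 500 as in the paper's setting.
n_maj, n_min = 50_000, 100
X_maj = rng.normal(0.0, 1.0, size=(n_maj, 5))   # class 0 (majority)
X_min = rng.normal(1.5, 1.0, size=(n_min, 5))   # class 1 (minority)

def undersampling_ensemble(X_maj, X_min, n_models=10):
    """Train one model per disjoint majority-class chunk.

    Each chunk has the same size as the minority class and is paired
    with ALL minority examples, so each member sees a balanced sample
    and, collectively, the ensemble covers n_models * |minority|
    distinct majority examples without repetition.
    """
    idx = rng.permutation(len(X_maj))
    chunk = len(X_min)
    models = []
    for i in range(n_models):
        sub = X_maj[idx[i * chunk:(i + 1) * chunk]]
        X = np.vstack([sub, X_min])
        y = np.r_[np.zeros(len(sub)), np.ones(len(X_min))]
        models.append(LogisticRegression().fit(X, y))
    return models

def predict_proba(models, X):
    # Average the minority-class probability across ensemble members.
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

models = undersampling_ensemble(X_maj, X_min)
scores = predict_proba(models, np.vstack([X_maj[:100], X_min]))
```

Because each member trains on a balanced sample, no single model is dominated by the majority class, while averaging over members exploits far more majority data than a single undersampled fit — one way to obtain the false-positive/false-negative trade-off the abstract describes.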