The crux of data compression is to process a string of bits in order, predicting each subsequent bit as accurately as possible. The accuracy of this prediction is reflected directly in compression effectiveness. Dynamic Markov Compression (DMC) uses a simple finite-state model that grows and adapts in response to each bit, and achieves state-of-the-art compression on a variety of data streams. While its performance on text is competitive with the best known techniques, its major strength is that it lacks prior assumptions about language and data encoding, and therefore works well for binary data such as executable programs and aircraft telemetry. The DMC model alone may be used to predict any activity represented as a stream of bits. For example, DMC plays “Rock, Paper, Scissors” quite effectively against humans. Recently, DMC has been shown to be applicable to email and web spam detection, where it is one of the best known techniques. The reasons for its effectiveness in this domain are not completely understood, because DMC performs poorly on some other standard text classification tasks. I conjecture that the reason is DMC's ability to process non-linguistic information, such as email headers, and to predict the nature of polymorphic spam rather than relying on fixed features to identify it. In this presentation I describe DMC and its application to classification and prediction, particularly in an environment where particular patterns of data and behavior cannot be anticipated and may be chosen by an adversary so as to defeat classification and prediction.
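To make the idea of a finite-state model that "grows and adapts in response to each bit" concrete, here is a minimal sketch of a DMC-style bit predictor. It is an illustration under stated assumptions, not the implementation discussed in the presentation: the class name `DMCPredictor`, the function `code_length_bits`, the single-state starting model, the cloning thresholds and the initial count are all hypothetical choices made for brevity. Each state keeps a count for the bits 0 and 1; a heavily used transition causes its target state to be cloned, so that the model can tell apart the different contexts leading to that state.

```python
import math


class DMCPredictor:
    """Illustrative DMC-style predictor (all parameters are assumptions)."""

    def __init__(self, min_visits=2.0, min_other=2.0, init_count=0.2):
        self.min_visits = min_visits  # edge count needed before cloning
        self.min_other = min_other    # traffic into the target from elsewhere
        # Assumed starting model: one state whose transitions loop to itself.
        self.counts = [[init_count, init_count]]  # counts[state][bit]
        self.next = [[0, 0]]                      # next[state][bit] -> state
        self.state = 0

    def p_one(self):
        """Estimated probability that the next bit is 1."""
        c0, c1 = self.counts[self.state]
        return c1 / (c0 + c1)

    def update(self, bit):
        """Adapt the counts, clone the target state if warranted, and move."""
        s = self.state
        t = self.next[s][bit]
        edge = self.counts[s][bit]
        total_t = sum(self.counts[t])
        if edge > self.min_visits and total_t - edge > self.min_other:
            # Clone t: the copy receives a share of t's counts proportional
            # to how much of t's traffic arrives along this edge.
            ratio = edge / total_t
            moved = [c * ratio for c in self.counts[t]]
            self.counts[t] = [c - m for c, m in zip(self.counts[t], moved)]
            self.counts.append(moved)
            self.next.append(list(self.next[t]))
            t = len(self.next) - 1
            self.next[s][bit] = t
        self.counts[s][bit] += 1.0
        self.state = t


def code_length_bits(bits):
    """Ideal compressed size, in bits, of a sequence under the adaptive model."""
    model = DMCPredictor()
    total = 0.0
    for b in bits:
        p1 = model.p_one()
        total += -math.log2(p1 if b else 1.0 - p1)
        model.update(b)
    return total, model
```

Calling `code_length_bits([0, 1] * 1000)` returns the number of bits an ideal arithmetic coder would need when driven by these predictions, which shows the link between prediction accuracy and compression effectiveness: the better the model predicts each bit, the smaller the total. The same per-bit probabilities could be used for prediction alone, without any coder attached.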