File carving is the process of recovering files from storage media without relying on file system metadata. This ability is particularly important in forensic investigation. Because file fragmentation is inevitable in storage systems, fragment classification is an important step in the file recovery process. With the growth in storage capacity and usage of mobile phones, large amounts of personal data are stored on such devices, and this data is of great interest for forensic analysis during investigations. In this paper, we present an approach for classifying the fragment types most commonly found on mobile phones: JPG, MP3, MP4, MOV and SQLite. Departing from conventional approaches that rely on unigram statistics, we employ bigram statistics to capture the frequency of local byte order, which retains meaningful and exploitable patterns in the fragments. While bigram statistics capture more information, they also contain a large amount of redundant data, which greatly increases the computational workload. We therefore perform dimensionality reduction through Principal Component Analysis (PCA) to extract only the most significant dimensions for classifying the targeted file types. Classified with a Support Vector Machine (SVM), the resulting features achieve an average classification accuracy of 96.19%, compared with 88.40% when using unigram statistics alone.
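The difference between unigram and bigram statistics can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `unigram_hist` and `bigram_hist`, the normalization by count, and the two toy fragments are assumptions made for illustration. The sketch shows two fragments with identical unigram histograms whose bigram histograms differ, because only the bigram view records local byte order.

```python
from collections import Counter


def unigram_hist(fragment: bytes) -> list:
    """Normalized frequency of each of the 256 byte values (hypothetical helper)."""
    counts = Counter(fragment)
    total = len(fragment) or 1
    return [counts.get(value, 0) / total for value in range(256)]


def bigram_hist(fragment: bytes) -> list:
    """Normalized frequency of each ordered byte pair, 256 * 256 bins (hypothetical helper)."""
    hist = [0.0] * 65536
    pairs = len(fragment) - 1
    if pairs <= 0:
        return hist
    # Slide a two-byte window over the fragment; the bin index encodes the order.
    for first, second in zip(fragment, fragment[1:]):
        hist[first * 256 + second] += 1.0
    return [count / pairs for count in hist]


# Two toy fragments made of the same bytes in a different order (assumed data).
a = bytes([0xFF, 0xD8, 0xFF, 0xE0])
b = bytes([0xFF, 0xFF, 0xD8, 0xE0])

same_unigrams = unigram_hist(a) == unigram_hist(b)  # byte counts match
same_bigrams = bigram_hist(a) == bigram_hist(b)     # byte-pair counts differ
print(same_unigrams, same_bigrams)
```

Each bigram feature vector has 65,536 dimensions against 256 for unigrams, and most bins are empty for any single fragment. This is the redundancy that motivates the PCA step before classification.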