APPGRIT: A Parallel Pipeline for Graph Representation in Text Mining

Zamanian, Alireza; Bakhshi Germi, Saeed; Mahmoudpour, Ehsan

doi:10.3233/978-1-61499-882-2-90

Abstract

Graph-based approaches have been shown to be efficient in information extraction, especially in text mining. Compared with methods such as vector space models, a graph representation of a document loses less information during feature extraction. However, constructing graph models is more CPU- and memory-intensive, so HPC solutions seem unavoidable in this case. This paper proposes a pipeline method for constructing a graph model that allows an arbitrary level of parallel processing and distributed computing. The method also enables a wide range of data visualization opportunities. It is shown that big data hardware and software infrastructures can be used without any algorithmic limitation. Results show a significant decrease in runtime.
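
To make the idea of a parallelizable graph-construction pipeline concrete, the following is a minimal sketch and not the method described in the paper. It assumes a simple word co-occurrence graph, where two words are linked if they appear in the same sentence; the function names `sentence_edges` and `build_graph` are hypothetical. Each sentence is processed independently (a map step), and the partial edge counts are then merged (a reduce step), which is the property that lets such a pipeline be distributed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def sentence_edges(sentence):
    """Map step (assumed co-occurrence model): count word pairs in one sentence."""
    # Deduplicate and sort so each undirected edge has one canonical key.
    words = sorted(set(sentence.lower().split()))
    return Counter(combinations(words, 2))


def build_graph(sentences, workers=4):
    """Reduce step: merge per-sentence edge counts into one weighted edge list."""
    graph = Counter()
    # Threads only illustrate the structure; for CPU-bound work a process
    # pool or a cluster framework would take the place of this executor.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(sentence_edges, sentences):
            graph.update(partial)
    return graph


if __name__ == "__main__":
    edges = build_graph(["graph models text", "text graph mining"])
    for (u, v), weight in sorted(edges.items()):
        print(u, v, weight)
```

Because the map step shares no state between sentences, the number of workers can be changed without altering the resulting graph; only the merge needs to see all partial results.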

IOS Press Copyright 2024
