The convergence between HPC and Big Data processing can also be pursued by providing high-level parallel programming tools for developing Big Data analysis applications. Software systems for social data mining provide algorithms and tools for extracting useful knowledge from user-generated social media data. ParSoDA (Parallel Social Data Analytics) is a high-level library for developing parallel data mining applications that run on HPC systems and extract useful knowledge from large datasets gathered from social media. The library aims to reduce the programming skills needed to implement scalable social data analysis applications. To reach this goal, ParSoDA defines a general structure for a social data analysis application that consists of a number of configurable steps, and it provides a predefined (but extensible) set of functions that can be used in each step. User applications based on the ParSoDA library can run on both Apache Hadoop and Apache Spark clusters. The goal of this paper is to assess the flexibility and usability of the ParSoDA library. Regarding flexibility, we show through code snippets how programmers can extend the ParSoDA functions when they need custom behavior. Regarding usability, we compare the programming effort required to code a social media application with and without the ParSoDA library. The comparison shows that ParSoDA reduces the number of lines of code by about 65%, since the programmer only has to implement the application logic without having to configure the execution environment and its related classes.
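The idea of an application built from configurable steps, each filled with predefined or user-written functions, can be illustrated with a minimal Python sketch. This is not the ParSoDA API: the class `SocialPipeline`, its methods, the step names (filter, map, group), the record fields (`user`, `text`, `geo`), and all function names are hypothetical and chosen only for illustration; the sketch also runs sequentially in memory rather than on Hadoop or Spark.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record type: one social media post as a plain dictionary.
Item = Dict[str, object]


# "Predefined" functions that a library of this kind might ship (assumed names).
def is_geotagged(item: Item) -> bool:
    """Keep only posts that carry coordinates."""
    return item.get("geo") is not None


def lowercase_text(item: Item) -> Item:
    """Return a copy of the post with its text lowercased."""
    return {**item, "text": str(item["text"]).lower()}


class SocialPipeline:
    """Illustrative step-based pipeline; NOT the actual ParSoDA interface."""

    def __init__(self) -> None:
        self._filters: List[Callable[[Item], bool]] = []
        self._mappers: List[Callable[[Item], Item]] = []
        self._group_key: Callable[[Item], str] = lambda item: str(item["user"])

    def set_filters(self, *fns: Callable[[Item], bool]) -> "SocialPipeline":
        self._filters = list(fns)
        return self

    def set_mappers(self, *fns: Callable[[Item], Item]) -> "SocialPipeline":
        self._mappers = list(fns)
        return self

    def set_group_key(self, fn: Callable[[Item], str]) -> "SocialPipeline":
        self._group_key = fn
        return self

    def run(self, items: Iterable[Item]) -> Dict[str, List[Item]]:
        """Apply the filter step, then the map step, then group by key."""
        groups: Dict[str, List[Item]] = {}
        for item in items:
            if not all(f(item) for f in self._filters):
                continue
            for mapper in self._mappers:
                item = mapper(item)
            groups.setdefault(self._group_key(item), []).append(item)
        return groups


# A user-defined extension: custom behavior is just another function
# with the same signature as the predefined filters.
def mentions(keyword: str) -> Callable[[Item], bool]:
    def _filter(item: Item) -> bool:
        return keyword in str(item["text"]).lower()
    return _filter


posts: List[Item] = [
    {"user": "ann", "text": "HPC meets Big Data", "geo": (41.9, 12.5)},
    {"user": "bob", "text": "Lunch time", "geo": (45.4, 9.2)},
    {"user": "ann", "text": "Scaling HPC jobs", "geo": None},
    {"user": "cat", "text": "hpc clusters", "geo": (40.8, 14.2)},
]

result = (
    SocialPipeline()
    .set_filters(is_geotagged, mentions("hpc"))
    .set_mappers(lowercase_text)
    .run(posts)
)
```

In this sketch the application code only selects and combines functions for each step; the custom `mentions` filter plugs in next to the predefined ones without any change to the pipeline class, which is the kind of extensibility the abstract describes.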