Scientific research has become very data and compute intensive because of the progress in data acquisition and measurement devices, which is particularly true in Life Sciences. To cope with this deluge of data, scientists use distributed computing and storage infrastructures. The use of such infrastructures introduces by itself new challenges to the scientists in terms of proper and efficient use. Scientific workflow management systems play an important role in facilitating the use of the infrastructure by hiding some of its complexity. Althought most scientific workflow management systems are provenance-aware, not all of them come with provenance functionality out of the box. In this paper we describe the improvement and integration of a provenance system into an e-infrastructure for biomedical research based on the MOTEUR workflow management system. The main contributions of the paper are: presenting an OPM implementation using relational database backend for the provenance store, providing an e-infrastructure with a comprehensive provenance system, defining a generic approach to provenance implementation, potentially suitable for other workflow systems and application domains and demonstrating the value of this system based on use cases presenting the provenance data through a user-friendly web interface.
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
Tel.: +1 703 830 6300
Fax: +1 703 830 2300 firstname.lastname@example.org
(Corporate matters and books only) IOS Press c/o Accucoms US, Inc.
For North America Sales and Customer Service
West Point Commons
Lansdale PA 19446
Tel.: +1 866 855 8967
Fax: +1 215 660 5042 email@example.com