The W3C standardized Semantic Web languages enable users to capture data without a schema in a manner which is intuitive to them. The challenge is that, for the data to be useful, it should be possible to query the data and to query it efficiently, which necessitates a schema. Understanding the structure of data is thus important to both users and storage implementers: The structure of the data gives insight to users in how to query the data while storage implementers can use the structure to optimize queries. In this paper we propose that data mining routines be used to infer candidate n-ary relations with related uniqueness- and null-free constraints, which can be used to construct an informative Armstrong RDF dataset. The benefit of an informative Armstrong RDF dataset is that it provides example data based on the original data which is a fraction of the size of the original data, while capturing the constraints of the original data faithfully. A case study on a DBPedia person dataset showed that the associated informative Armstrong RDF dataset contained 0.00003% of the statements of the original DBPedia dataset.
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
Tel.: +1 703 830 6300
Fax: +1 703 830 2300 firstname.lastname@example.org
(Corporate matters and books only) IOS Press c/o Accucoms US, Inc.
For North America Sales and Customer Service
West Point Commons
Lansdale PA 19446
Tel.: +1 866 855 8967
Fax: +1 215 660 5042 email@example.com