

In this paper we describe an experiment on using message-level schema matching to construct Web services networks. The aim of the study is to empirically determine a similarity threshold at which schema matching yields Web service message annotations of the same quality as those produced manually by domain experts. Since we use message annotations to construct Web services networks, determining the proper threshold is essential for building a dataset for large-scale Web services network studies in realistic settings.
First, we apply the schema matching system COMA to the Web service operations in the SAWSDL-TC1 dataset. We then determine suitable upper and lower bounds for the threshold by comparing the matches obtained at various thresholds with the matches derived from the manually crafted annotations in SAWSDL-TC1. Next, we construct Web services networks at various thresholds within these bounds, compute a selection of commonly used network metrics for them, and compare the results with other findings in the Web services network literature to select the most appropriate threshold value. Finally, we extend the experiment to a larger real-world dataset of 8000+ operations. The study shows that automatically constructed Web services networks exhibit topological properties comparable to those of manually constructed networks.
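The threshold sweep described above can be illustrated with a minimal sketch. It is not the implementation used in the study: the operation names, the similarity scores, and the reference edges are invented toy values, the helper names (`build_network`, `precision_recall`, `density`) are hypothetical, and COMA's actual output format is not assumed. Precision and recall stand in for the comparison against manual annotations, and edge density stands in for the network metrics.

```python
# Toy similarity scores between the output message of one operation and the
# input message of another (all values are hypothetical, not from COMA).
similarities = {
    ("getCity", "getWeather"): 0.9,
    ("getCity", "bookHotel"): 0.6,
    ("getWeather", "bookHotel"): 0.3,
    ("bookHotel", "getCity"): 0.1,
}

# Edges implied by manual annotations (hypothetical reference network).
reference = {("getCity", "getWeather"), ("getCity", "bookHotel")}


def build_network(scores, threshold):
    """Keep a directed edge (src, dst) when its score reaches the threshold."""
    return {
        (src, dst)
        for (src, dst), score in scores.items()
        if score >= threshold and src != dst
    }


def precision_recall(predicted, expected):
    """Compare automatically derived edges with the reference edges."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


def density(edges, n_nodes):
    """Fraction of possible directed edges (no self-loops) that are present."""
    if n_nodes < 2:
        return 0.0
    return len(edges) / (n_nodes * (n_nodes - 1))


if __name__ == "__main__":
    nodes = {name for pair in similarities for name in pair}
    for threshold in (0.2, 0.5, 0.8):
        edges = build_network(similarities, threshold)
        p, r = precision_recall(edges, reference)
        d = density(edges, len(nodes))
        print(f"threshold={threshold}: P={p:.2f} R={r:.2f} density={d:.2f}")
```

On this toy input, a low threshold admits a spurious edge (lower precision), while a high threshold drops a valid one (lower recall), which is the trade-off that motivates bounding the threshold from both sides before comparing network metrics.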