Dissemination can be seen as a communication process between scientists. Over the course of several publications, they expose and support their findings, while discussing claims stated in these publications. Unfortunately, such discourse structures are trapped within the content of the publications, thus making the semantics discoverable only by humans, and only by reading the publications. In addition, the lack of advances in scientific publishing, where electronic publications are still used as simple projections of paper documents, combined with the current growth in the amount of scientific research being published, transforms the process of finding relevant literature into a cumbersome task.
The solution lies in taking advantage of the full support provided by electronic publications and making the different discourse structures explicit. Consequently, the resulting knowledge becomes crystallised and can be shared with and by others. From a technological perspective, Semantic Web technologies provide viable ways for representing this knowledge in a machine-understandable form, as semantic metadata, and for transforming simple electronic publications into semantic publications.
The work in this thesis paves the way towards a Semantic Publishing Ecosystem by developing Semantic Authoring and Publishing mechanisms, with the generic goal of alleviating, at least partly, the information overload problem. More concretely, Semantic Authoring is about enriching scientific publications, while they are being authored, with explicit rhetorical and argumentation discourse structures, in addition to an explicit linear structure for identification and localisation, and bibliographic information. At the same time, Semantic Publishing is about creating semantic publications by embedding these structures, encoded as semantic metadata, into the publication documents. Semantic Publishing also includes the publishing, use and retrieval of semantic publications on the Web.
Our hypothesis is that the Semantic Authoring and Publishing processes bring added value to researchers and improve their daily activities by enabling new functionalities for structuring, retrieving and browsing scientific publications. Furthermore, based on Semantic Authoring and Publishing, the rhetorical and argumentation discourse structures can be formalised and made machine-interpretable using knowledge representation technology. We devise solutions that: capture information present in scientific publications according to its structural, rhetorical and argumentation roles; acquire such information based on manual and automatic approaches, the latter with satisfactory efficiency; and store, publish and expose the resulting semantic publications in a machine- and human-processable way.