Text documents are rich repositories of causal knowledge. Journal publications typically contain analytical explanations of observations based on scientific experiments, while analyst reports, news articles, and even consumer-generated text contain not only the authors' viewpoints but often causal explanations for those viewpoints. As interest in data science shifts from mere correlation towards understanding causality, there is also surging interest in extracting causal constructs from text to provide augmented information for better decision making. Causality extraction from text is viewed as a relation extraction problem, which requires identifying causal sentences as well as detecting the cause and effect clauses separately. In this paper, we present a joint model for causal sentence classification and extraction of cause and effect clauses, using a sequence-labeling architecture cascaded with a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) language model. The cause and effect clauses are further processed to identify named entities and build a causal graph using domain constraints. We conducted multiple experiments to assess the generalizability of the model. We observe that when fine-tuned with sentences from a mixed corpus, and further trained to solve both tasks correctly, the model learns the nuances of expressing causality independent of the domain. The proposed model has been evaluated against multiple state-of-the-art models proposed in the literature and was found to outperform all of them.
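To make the sequence-labeling step concrete, the following minimal sketch shows how per-token labels might be decoded into cause and effect clauses and then turned into graph edges. It is an illustration only, not the model described above: the BIO-style tag set (`B-C`, `I-C`, `B-E`, `I-E`, `O`) and the function names `decode_spans` and `causal_edges` are assumptions, and the tags are supplied by hand rather than predicted by BERT. Pairing every cause with every effect is also a simplification; the named-entity and domain-constraint processing is not shown.

```python
from typing import Dict, List, Tuple


def decode_spans(tokens: List[str], tags: List[str]) -> Dict[str, List[str]]:
    """Group BIO-tagged tokens into clause strings keyed by label.

    Assumed (hypothetical) tag set: B-C/I-C for cause, B-E/I-E for effect, O otherwise.
    """
    spans: Dict[str, List[str]] = {"C": [], "E": []}
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" and label in spans:
            # A new clause starts; close any clause that is still open.
            if current_label is not None:
                spans[current_label].append(" ".join(current_tokens))
            current_label, current_tokens = label, [token]
        elif prefix == "I" and label == current_label:
            # Continue the clause that is currently open.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent tag: close the open clause, if any.
            if current_label is not None:
                spans[current_label].append(" ".join(current_tokens))
            current_label, current_tokens = None, []

    if current_label is not None:
        spans[current_label].append(" ".join(current_tokens))
    return spans


def causal_edges(spans: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Link every cause clause to every effect clause (simplifying assumption)."""
    return [(cause, effect) for cause in spans["C"] for effect in spans["E"]]


# Hand-labeled example sentence; a trained tagger would produce the tags.
tokens = ["Heavy", "rain", "caused", "severe", "flooding"]
tags = ["B-C", "I-C", "O", "B-E", "I-E"]
spans = decode_spans(tokens, tags)
edges = causal_edges(spans)
```

In a joint setup of the kind outlined above, a sentence for which the tagger emits no cause or effect tags would yield empty span lists and therefore no edges, which matches treating it as non-causal.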
IOS Press, Inc.