Improving Legal Document Summarization Using Graphical Models

Saravanan, M.; Ravindran, B.; Raman, S.

Abstract

In this paper, we propose a novel approach for applying probabilistic graphical models to automatic text summarization in the legal domain. Identifying the rhetorical roles of the sentences in a legal document is the key text mining step in this task. A Conditional Random Field (CRF) is applied to segment a given legal document into seven labeled components, with each label representing a rhetorical role. Feature sets with varying characteristics are employed to significantly improve the CRF's performance. Our system is then enriched by applying a term distribution model with structured domain knowledge to extract key sentences related to the rhetorical categories. The final structured summary was observed to reach close to 80% accuracy when compared with the ideal summary generated by experts in the area.
