It has never been easier to get access to content. Instead, we face an ever-increasing overload that undermines the ability to identify high-quality content relevant to the user. Automatic summarization techniques have been developed to distill content down to its key points, thereby shortening the time required to grasp the essence of a document and judge its relevance. Summarization is not a deterministic task and depends very much on the writing style of the person creating the summary. In this work we present a method that, given a set of human-created summaries for a corpus, establishes which automatic extractive summarization technique best preserves the style of the human summary writer. To demonstrate our approach, we use a corpus of 1000 articles from Science Daily with the corresponding human-written summaries and benchmark three extractive summarization techniques (a BERT-based summarizer, a keyword-scoring-based summarizer and a Luhn summarizer), identify the best style-preserving method and discuss the results.
IOS Press, Inc.