This study proposes a novel unsupervised approach to extracting keywords from Japanese legal documents that exploits knowledge of Japanese syntax. Because Japanese keywords usually occur within chunks, keyword extraction is treated as the task of finding the chunks that carry a document's important content. To find them, every chunk in a given document is assigned a weight indicating its importance. Highly weighted chunks are taken as candidate keywords, which are then post-processed to obtain the final keywords. Although the proposed method employs simple techniques, experimental results on Japanese legal documents show that the chunk-based approach achieves better performance (10.5% higher F1-score) than the graph-based ranking approach, the most popular unsupervised method.
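The pipeline described above (weight every chunk, keep the highest-weighted chunks as candidates, post-process them into keywords) can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or the post-processing rules, so everything concrete here is an assumption made for illustration: the function name `extract_keywords`, the `PARTICLES` set, the summed term-frequency weight, and the particle-stripping step are all hypothetical, and the input is assumed to be already segmented into chunks of tokens.

```python
from collections import Counter

# Hypothetical particle list, used only for this illustration.
PARTICLES = {"は", "が", "を", "に", "の", "で", "と", "も"}


def extract_keywords(chunks, top_k=3):
    """Rank pre-segmented chunks and return the top_k candidate keywords.

    chunks: list of chunks, each a list of token strings.
    The weighting below is an assumed stand-in, not the paper's scheme.
    """
    # Document-level frequency of every non-particle token.
    tf = Counter(
        tok for chunk in chunks for tok in chunk if tok not in PARTICLES
    )

    scored = {}
    for chunk in chunks:
        # Assumed post-processing: drop particles from the chunk.
        content = [tok for tok in chunk if tok not in PARTICLES]
        if not content:
            continue
        candidate = "".join(content)
        # Assumed weight: sum of the frequencies of the chunk's tokens.
        weight = sum(tf[tok] for tok in content)
        # Identical candidates are merged, keeping the highest weight.
        scored[candidate] = max(scored.get(candidate, 0), weight)

    # Highest weight first; ties are broken by string order for determinism.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [candidate for candidate, _ in ranked[:top_k]]


if __name__ == "__main__":
    example = [
        ["契約", "は"],
        ["当事者", "の"],
        ["契約", "を"],
        ["解除", "する"],
        ["契約", "解除", "を"],
    ]
    print(extract_keywords(example, top_k=2))
```

In practice the chunks would come from a Japanese chunker (bunsetsu segmentation), and the weight would be replaced by whatever importance measure the method actually defines.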