In this paper we present initial results from our effort to automatically detect references in decisions of the courts in the Czech Republic and to link these references to their content. We focus on references to case-law and legal literature. To deal with the wide variety in how references are expressed, we use a novel distributed approach to reference recognition. Instead of attempting to recognize each reference as a whole, we focus on its lower-level constituents. We assembled a corpus of 350 decisions and annotated it with more than 50,000 annotations corresponding to different reference constituents. Here we present our first attempt to detect these constituents automatically.