Polish statutory law is currently distributed as PDF, HTML and plain-text files, in which the structure of the rules and the references to internal and external regulations are given only implicitly. As a result, automatic processing of the regulations in legal information systems is complicated, since the semi-structured text must first be converted to a structured form. In this research, we show how character-level language models help in this task. We apply them to two problems: detecting cross-references to structural units (e.g. articles and points) and detecting cross-references to statutory laws (titles of laws and ordinances). We obtain 98.7% macro-average F1 on the first problem and 95.8% F1 on the second.
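Because the first result is reported as a macro-average, it may help to recall how that figure is formed: F1 is computed separately for each class and the per-class scores are then averaged with equal weight. The following is a minimal sketch of that arithmetic only. The label names (`article`, `point`, `O`) and the per-token comparison are assumptions made for illustration, not the evaluation protocol used in this work.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was not the gold label
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels: one per token, naming the kind of unit referenced.
gold = ["article", "article", "point", "O", "O", "O"]
pred = ["article", "point", "point", "O", "O", "O"]
print(per_class_f1(gold, pred))
print(round(macro_f1(gold, pred), 4))
```

Every class contributes equally to a macro-average, so a rarely referenced kind of structural unit weighs as much as a frequent one.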