In the development of health question answering (QA) systems, identifying the topic of a question is crucial for understanding users' information needs and for facilitating answer extraction. This paper presents a machine-learning method to automatically identify the topics of health-related questions asked in Chinese by the general public. We collected 2000 questions from a Chinese consumer health website and characterized them using 17 types of features, including lexical, grammatical, statistical, and semantic features. The method was applied to identify six health question topics: Condition Management, Healthy Lifestyle, Diagnosis, Health Provider Choosing, Treatment, and Epidemiology. The average F1-scores for identifying these six topics were 99.63%, 99.13%, 98.55%, 96.35%, 76.02%, and 71.77%, respectively.
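For readers unfamiliar with the metric, the per-topic F1-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed and averaged for one topic. The function names, the averaging over evaluation folds, and the counts are illustrative assumptions and are not taken from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 for one topic from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(fold_counts):
    """Average the F1-score of one topic over several evaluation folds."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for a single topic in two folds;
# these are made-up numbers, not results from the paper.
example_folds = [(8, 2, 2), (9, 1, 1)]
average = mean_f1(example_folds)
print(f"average F1: {average:.2%}")
```

Whether the reported averages were taken over cross-validation folds or over repeated runs is not stated in the abstract, so the averaging step here is only one plausible reading.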