Most existing word embedding models consider only the relationships between words and their local contexts (e.g., the ten words surrounding the target word). However, information beyond the local context (the global context), which reflects the rich semantic meaning of words, is usually ignored. In this paper, we present a general framework for exploiting global information to learn word and text representations. Our models can be easily integrated into existing local word embedding models and thus introduce global information of varying degrees, according to the needs of different downstream tasks. Moreover, we view our models from a co-occurrence matrix perspective, based on which a novel weighted term-document matrix is factorized to generate text representations. We conduct a range of experiments to evaluate the word and text representations learned by our models. Experimental results show that our models outperform or are competitive with state-of-the-art models. The source code of the paper is available at https://github.com/zhezhaoa/cluster-driven.
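To make the matrix-factorization view concrete, the following is a minimal sketch of factorizing a weighted term-document matrix into dense text representations. The TF-IDF weighting and truncated SVD used here are illustrative stand-ins, not the paper's method; the paper defines its own weighting scheme and factorization.

```python
# Illustrative sketch: factorize a weighted term-document matrix to obtain
# dense text representations. TF-IDF weighting is an assumption here; the
# paper's weighted term-document matrix is defined differently in the text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "global context carries rich semantic information",
    "local context windows span only a few words",
    "matrix factorization yields dense text vectors",
]

# Build a weighted term-document matrix (documents x vocabulary).
X = TfidfVectorizer().fit_transform(docs)

# Factorize it; each row of doc_vecs is a low-dimensional text representation.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_vecs = svd.fit_transform(X)
print(doc_vecs.shape)  # (3, 2)
```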