More and more information is available on the Web, and current search engines do a great job of making it accessible. Yet, because they optimize for a large number of users, they usually provide good answers only to “most of us”, and they have yet to provide satisfactory mechanisms for searching audiovisual content. In this talk I will present ongoing work at L3S that addresses these challenges, carried out in the context of several European Union-funded projects on personal information management and Web search.
Regarding personalization, I will talk about personalizing Web search based on user content, which goes beyond the simple user profiles used in other systems. The algorithms presented improve Web queries by expanding them with terms collected from each user's personal information repository, thus implicitly personalizing the search output. The additional query keywords are generated by analyzing user data at increasing levels of granularity, ranging from term- and compound-level analysis up to global co-occurrence statistics, as well as by using external thesauri. Extensive empirical analysis shows that some of these approaches perform very well, especially on ambiguous queries, producing a very strong increase in the quality of the output rankings.
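To make the idea of co-occurrence-based expansion concrete, here is a minimal sketch in Python. It is not the algorithm presented in the talk: the function names (tokenize, expand_query), the tiny stopword list, and the document-level co-occurrence count are all assumptions chosen for illustration. The sketch counts how often each term appears in the same personal document as a query term and appends the top k such terms to the query.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def expand_query(query, personal_docs, k=2):
    """Append up to k terms that co-occur with the query in the user's documents.

    Co-occurrence is counted per document: a term scores one point for every
    personal document that contains it together with at least one query term.
    """
    query_terms = tokenize(query)
    query_set = set(query_terms)
    scores = Counter()
    for doc in personal_docs:
        doc_terms = set(tokenize(doc))
        if doc_terms & query_set:
            for term in doc_terms - query_set:
                scores[term] += 1
    # Highest count first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    docs = [
        "Jaguar XK engine restoration notes",
        "Jaguar dealership engine service appointment",
        "Grocery list: milk, eggs",
    ]
    print(expand_query("jaguar", docs, k=1))
```

With these made-up documents, the ambiguous query "jaguar" is expanded with "engine", steering the search toward the car rather than the animal. A real system would need better term weighting and a much larger repository; this only illustrates the expansion step.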
Regarding search for audiovisual content, I will focus on exploiting user-generated information, and discuss what kinds of tags are used for different resources and how they can help search. Collaborative tagging has become an increasingly popular means of sharing and organizing Web resources, leading to a huge amount of user-generated metadata. These tags represent different aspects of the resources they describe, and it is not obvious whether and how the tags, or subsets of them, can be used for search. I will present an in-depth study of tagging behavior for different kinds of resources: Web pages, music, and images. The results are promising and provide more insight into both the use of different kinds of tags for improving search and possible extensions of tagging systems to support the creation of potentially search-relevant tags.