Image geolocalization is receiving increasing attention due to its importance in several applications, such as image retrieval, criminal investigations and fact-checking. Previous works focused on several instances of image geolocalization including place recognition, GPS coordinates estimation and country recognition. In this paper, we tackle an even more challenging problem, which is recognizing the city where an image has been taken. Due to the vast number of cities in the world, we cast the problem as a verification problem, whereby the system has to decide whether a certain image has been taken in a given city or not. In particular, we present a system that given a query image and a small set of images taken in a target city, decides if the query image has been shot in the target city or not. To allow the system to handle the case of images, taken in cities that have not been used during training, we use a Siamese network based on Vision Transformer as a backbone. The experiments we run prove the validity of the proposed system which outperforms solutions based on state-of-the-art techniques, even in the challenging case of images shot in different cities of the same country.
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
Tel.: +1 703 830 6300
Fax: +1 703 830 2300 firstname.lastname@example.org
(Corporate matters and books only) IOS Press c/o Accucoms US, Inc.
For North America Sales and Customer Service
West Point Commons
Lansdale PA 19446
Tel.: +1 866 855 8967
Fax: +1 215 660 5042 email@example.com