Unsupervised Dual-Layer Aggregation for Feature Fusion on Image Retrieval Tasks

Type

Work presented at event

Abstract

Advances in image representation have led to impressive progress in many image understanding tasks, primarily supported by Convolutional Neural Networks (CNNs) and, more recently, by Transformer models. Despite these advances, assessing the similarity among images for retrieval in unsupervised scenarios remains challenging, and it is still mostly grounded in traditional pairwise measures such as the Euclidean distance. The scenario is even more challenging when different visual features are available, since features must be selected and fused without any label information. In this paper, we propose an Unsupervised Dual-Layer Aggregation (UDLA) method, based on contextual similarity approaches, for selecting and fusing CNN and Transformer-based visual features trained through transfer learning. In the first layer, the selected features are fused in pairs, with a focus on precision. A subset of these pairs is then selected for a second-layer aggregation focused on recall. An experimental evaluation on different public datasets showed the effectiveness of the proposed approach, which achieved results significantly superior to the best isolated feature and also superior to a recent fusion approach used as a baseline.
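
The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the UDLA method itself: the abstract does not specify the fusion operators or the selection criterion, so everything below is an assumption made for illustration. Pairwise fusion is approximated by an element-wise product of similarity matrices (which rewards agreement between features), pair selection uses a hypothetical reciprocal-neighbour score as a label-free quality estimate, and the second layer is a plain sum (which lets any selected pair contribute a match). The function names and the parameters `k` and `n_pairs` are likewise hypothetical.

```python
import itertools
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of an (n, d) matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def reciprocal_score(sim, k):
    """Assumed label-free quality estimate: the fraction of top-k
    neighbour relations that are reciprocal."""
    n = sim.shape[0]
    topk = np.argsort(-sim, axis=1)[:, :k]
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], topk] = True
    return (member & member.T).sum() / (n * k)


def dual_layer_fusion(feature_sets, k=5, n_pairs=2):
    """Illustrative two-layer fusion of several (n, d_i) feature matrices."""
    # Shift cosine similarity from [-1, 1] to [0, 1] so products stay ordered.
    sims = [(cosine_similarity_matrix(f) + 1.0) / 2.0 for f in feature_sets]

    # Layer 1 (assumption): fuse every pair of features by element-wise
    # product, so a pair scores high only when both features agree.
    pairs = {
        (i, j): sims[i] * sims[j]
        for i, j in itertools.combinations(range(len(sims)), 2)
    }

    # Selection (assumption): keep the pairs with the best reciprocal score.
    ranked = sorted(pairs, key=lambda p: reciprocal_score(pairs[p], k),
                    reverse=True)
    selected = ranked[:n_pairs]

    # Layer 2 (assumption): sum the selected pairs, so a match found by
    # any selected pair contributes to the final similarity.
    fused = sum(pairs[p] for p in selected)
    return fused, selected


# Usage with random stand-ins for three feature extractors.
rng = np.random.default_rng(0)
features = [rng.normal(size=(20, 8)) for _ in range(3)]
fused, selected = dual_layer_fusion(features, k=5, n_pairs=2)
ranking_for_query_0 = np.argsort(-fused[0])
```

Each row of `fused` can be sorted in descending order to obtain a ranked list for the corresponding query image. Real contextual similarity methods refine the rankings iteratively rather than combining raw similarities once, so this sketch only conveys the layering, not the expected retrieval quality.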

Language

English

Citation

Brazilian Symposium on Computer Graphics and Image Processing.