GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL

dc.contributor.author: de Almeida, Lucas Barbosa [UNESP]
dc.contributor.author: Valem, Lucas Pascotti [UNESP]
dc.contributor.author: Pedronette, Daniel Carlos Guimarães [UNESP]
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.date.accessioned: 2023-07-29T13:38:35Z
dc.date.available: 2023-07-29T13:38:35Z
dc.date.issued: 2022-01-01
dc.description.abstract [en]: Despite the impressive advances obtained by supervised deep learning approaches on retrieval and classification tasks, acquiring labeled data for training remains a challenging bottleneck. In this scenario, it becomes imperative to develop more effective content-based retrieval approaches capable of taking advantage of multimodal information and advances in unsupervised learning. Based on these observations, we propose two novel approaches that combine Graph Convolutional Networks (GCNs) with rank-based manifold learning methods. The GCN models were trained in an unsupervised way, using the Deep Graph Infomax algorithm, and the proposed approaches employ recent rank-based manifold learning methods. Multimodal information is exploited through pre-trained CNNs via transfer learning for extracting audio, image, and video features. The proposed approaches were evaluated on three public action recognition datasets. Highly effective results were obtained, with relative gains of up to +29.44% in MAP compared to baseline approaches without GCNs. The experimental evaluation also considered classical and recent baselines in the literature.
dc.description.affiliation: Department of Statistics, Applied Mathematics and Computing (DEMAC), São Paulo State University (UNESP)
dc.description.affiliationUnesp: Department of Statistics, Applied Mathematics and Computing (DEMAC), São Paulo State University (UNESP)
dc.description.sponsorship: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
dc.description.sponsorshipId: FAPESP: #2018/15597-6
dc.description.sponsorshipId: FAPESP: #2020/03311-0
dc.description.sponsorshipId: FAPESP: #2020/11366-0
dc.format.extent: 2811-2815
dc.identifier: http://dx.doi.org/10.1109/ICIP46576.2022.9897911
dc.identifier.citation: Proceedings - International Conference on Image Processing, ICIP, p. 2811-2815.
dc.identifier.doi: 10.1109/ICIP46576.2022.9897911
dc.identifier.issn: 1522-4880
dc.identifier.scopus: 2-s2.0-85146715017
dc.identifier.uri: http://hdl.handle.net/11449/248246
dc.language.iso: eng
dc.relation.ispartof: Proceedings - International Conference on Image Processing, ICIP
dc.source: Scopus
dc.subject: graph convolutional networks
dc.subject: manifold learning
dc.subject: rank aggregation
dc.subject: video multimodal retrieval
dc.title [en]: GRAPH CONVOLUTIONAL NETWORKS AND MANIFOLD RANKING FOR MULTIMODAL VIDEO RETRIEVAL
dc.type: Conference paper
