Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/114664
DC Field | Value | Language
dc.contributor.authorLi, Qiqi-
dc.contributor.authorMa, Longfei-
dc.contributor.authorJiang, Zheng-
dc.contributor.authorLi, Mingyong-
dc.contributor.authorJin, Bo-
dc.date.accessioned2024-04-04T10:12:55Z-
dc.date.available2024-04-04T10:12:55Z-
dc.date.issued2023-
dc.identifier.issn1546-2226pt
dc.identifier.urihttps://hdl.handle.net/10316/114664-
dc.description.abstractIn recent years, cross-modal hash retrieval has become a popular research field because of its high efficiency and low storage cost. Cross-modal retrieval technology can be applied to search engines, cross-modal medical processing, etc. The existing mainstream method is to use a multi-label matching paradigm to finish the retrieval tasks. However, such methods do not use the fine-grained information in multi-modal data, which may lead to suboptimal results. To avoid cross-modal matching degenerating into label matching, this paper proposes an end-to-end fine-grained cross-modal hash retrieval method that focuses more on the fine-grained semantic information of multi-modal data. First, the method refines the image features and, instead of representing text features with multiple labels, processes the text with BERT. Second, the method uses the inference capabilities of the transformer encoder to generate global fine-grained features. Finally, in order to better judge the effect of the fine-grained model, this paper uses datasets from the image-text matching field instead of the traditional label-matching datasets. The method is evaluated on the Microsoft COCO (MS-COCO) and Flickr30K datasets and compared with previous classical methods. The experimental results show that this method achieves more advanced results in the cross-modal hash retrieval field.pt
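The abstract's core idea (hash retrieval) can be illustrated with a toy sketch. This is not the authors' TECMH implementation: it assumes hypothetical pre-computed feature vectors and only shows the generic hashing step, where continuous features from any encoder (e.g. BERT for text, a transformer encoder for images) are binarized by sign, and retrieval ranks gallery items by Hamming distance to the query code.

```python
# Toy sketch of the hashing step in cross-modal retrieval (NOT the
# paper's TECMH model): binarize encoder features into hash codes,
# then rank gallery items by Hamming distance to the query.

def to_hash_code(features):
    """Binarize a real-valued feature vector into a {0, 1} hash code."""
    return [1 if x >= 0 else 0 for x in features]

def hamming_distance(a, b):
    """Count the bit positions where two hash codes differ."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_features, gallery_features):
    """Return gallery indices sorted by Hamming distance to the query."""
    q = to_hash_code(query_features)
    codes = [to_hash_code(g) for g in gallery_features]
    return sorted(range(len(codes)), key=lambda i: hamming_distance(q, codes[i]))

# Hypothetical 4-bit features for one text query and three images.
text_query = [0.9, -0.2, 0.4, -0.7]
images = [
    [0.8, -0.1, 0.3, -0.9],   # same hash code as the query -> distance 0
    [-0.5, 0.6, 0.2, -0.1],   # differs in 2 bits
    [-0.3, 0.7, -0.8, 0.2],   # differs in 4 bits
]
print(retrieve(text_query, images))  # -> [0, 1, 2]
```

Binary codes make large-scale retrieval cheap (bitwise XOR + popcount instead of float similarity), which is the "high efficiency and low storage" advantage the abstract refers to.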
dc.language.isoengpt
dc.publisherTech Science Presspt
dc.relationThis work was partially supported by Chongqing Natural Science Foundation of China (Grant No. CSTB2022NSCQ-MSX1417), the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202200513) and Chongqing Normal University Fund (Grant No. 22XLB003), Chongqing Education Science Planning Project (Grant No. 2021-GX-320) and Humanities and Social Sciences Project of Chongqing Education Commission of China (Grant No. 22SKGH100)pt
dc.rightsopenAccesspt
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/pt
dc.subjectDeep learningpt
dc.subjectcross-modal retrievalpt
dc.subjecthash learningpt
dc.subjecttransformerpt
dc.titleTECMH: Transformer-Based Cross-Modal Hashing for Fine-Grained Image-Text Retrievalpt
dc.typearticle-
degois.publication.firstPage3713pt
degois.publication.lastPage3728pt
degois.publication.issue2pt
degois.publication.titleComputers, Materials and Continuapt
dc.peerreviewedyespt
dc.identifier.doi10.32604/cmc.2023.037463pt
degois.publication.volume75pt
dc.date.embargo2023-01-01*
uc.date.periodoEmbargo0pt
item.openairetypearticle-
item.fulltextCom Texto completo-
item.languageiso639-1en-
item.grantfulltextopen-
item.cerifentitytypePublications-
item.openairecristypehttp://purl.org/coar/resource_type/c_18cf-
crisitem.author.researchunitISR - Institute of Systems and Robotics-
crisitem.author.parentresearchunitUniversity of Coimbra-
crisitem.author.orcid0000-0001-9255-5772-
Appears in Collections:FCTUC Eng.Electrotécnica - Artigos em Revistas Internacionais
I&D ISR - Artigos em Revistas Internacionais
This item is licensed under a Creative Commons License.