Vision Transformer (ViT): A Technical Analysis - 鹏展-penggeon

[Summary] Paper: [2010.11929] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Overview: Vision Transformer (ViT) applies the standard Transformer architecture directly to image classification. …
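
To make the "16x16 words" in the paper title concrete, here is a minimal sketch of the first step only: cutting an image into non-overlapping 16x16 patches and flattening each one into a vector, so the image becomes a sequence of tokens. The function name `split_into_patches`, the nested-list image layout, and the 224 x 224 RGB input are assumptions chosen for illustration, not details taken from this post. The learned linear projection, position embeddings, and the Transformer encoder itself are left out.

```python
def split_into_patches(image, patch_size=16):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches."""
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    patches = []
    # Walk the patch grid row by row, left to right.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)  # append this pixel's channel values
            patches.append(patch)
    return patches


# Hypothetical input: a 224 x 224 RGB image filled with zeros (assumed size).
image = [[[0, 0, 0] for _ in range(224)] for _ in range(224)]
patches = split_into_patches(image)
print(len(patches), len(patches[0]))  # prints: 196 768
```

Under these assumptions the image yields (224 / 16)^2 = 196 patches, each of length 16 * 16 * 3 = 768, and that sequence of 196 vectors is what a Transformer would consume in place of word tokens.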