VirTex: Learning Visual Representations from Textual Annotations

VirTex is a pretraining approach that uses semantically dense captions to learn visual representations. We train a CNN and Transformers from scratch on COCO Captions, and transfer the CNN to downstream ...
