Abstract: This article introduces a task named visual grounding of remote sensing ship (VGRSS) images. The goal of VGRSS is to locate ship objects in remote sensing images under the guidance of natural language.
Abstract: Visual question answering (VQA) aims to build an interactive system that infers an answer from an input image and a text-based question. Recently, VQA for remote sensing has ...