Abstract: Recently, visual-language navigation (VLN) - entailing robot agents to follow navigation instructions - has shown great advance. However, existing literature put most emphasis on ...
Abstract: Multimodal large language models (MLLMs) act as essential interfaces, connecting humans with AI technologies in multimodal applications. However, current MLLMs face challenges in accurately ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results