
Two papers from our lab have been accepted to ICCV 2025.

2025.06.26


■Bibliographic Information
Jungdae Lee*, Taiki Miyanishi*, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Yutaka Matsuo, Nakamasa Inoue. “CityNav: A Large-Scale Dataset for Real-World Aerial Navigation”. International Conference on Computer Vision (ICCV 2025).
(* denotes equal contribution)
■Abstract
Vision-and-language navigation (VLN) aims to develop agents capable of navigating in realistic environments. While recent cross-modal training approaches have significantly improved navigation performance in both indoor and outdoor scenarios, aerial navigation over real-world cities remains underexplored primarily due to limited datasets and the difficulty of integrating visual and geographic information. To fill this gap, we introduce CityNav, the first large-scale real-world dataset for aerial VLN. Our dataset consists of 32,637 human demonstration trajectories, each paired with a natural language description, covering 4.65 km^2 across two real cities: Cambridge and Birmingham. In contrast to existing datasets composed of synthetic scenes such as AerialVLN, our dataset presents a unique challenge because agents must interpret spatial relationships between real-world landmarks and the navigation destination, making CityNav an essential benchmark for advancing aerial VLN. Furthermore, as an initial step toward addressing this challenge, we provide a methodology of creating geographic semantic maps that can be used as an auxiliary modality input during navigation. In our experiments, we compare performance of three representative aerial VLN agents (Seq2seq, CMA and AerialVLN models) and demonstrate that the semantic map representation significantly improves their navigation performance.

■Bibliographic Information
Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Masato Taki, Yutaka Matsuo. “GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields”. International Conference on Computer Vision (ICCV 2025).
■Abstract
The advancement of 3D language fields has enabled intuitive interactions with 3D scenes via natural language. However, existing approaches are typically limited to small-scale environments, lacking the scalability and compositional reasoning capabilities necessary for large, complex urban settings. To overcome these limitations, we propose GeoProg3D, a visual programming framework that enables natural language-driven interactions with city-scale high-fidelity 3D scenes. GeoProg3D consists of two key components: (i) a Geography-aware City-scale 3D Language Field (GCLF) that leverages a memory-efficient hierarchical 3D model to handle large-scale data, integrated with geographic information for efficiently filtering vast urban spaces using directional cues, distance measurements, elevation data, and landmark references; and (ii) Geographical Vision APIs (GV-APIs), specialized geographic vision tools such as area segmentation and object detection. Our framework employs large language models (LLMs) as reasoning engines to dynamically combine GV-APIs and operate GCLF, effectively supporting diverse geographic vision tasks. To assess performance in city-scale reasoning, we introduce GeoEval3D, a comprehensive benchmark dataset containing 952 query-answer pairs across five challenging tasks: grounding, spatial reasoning, comparison, counting, and measurement. Experiments demonstrate that GeoProg3D significantly outperforms existing 3D language fields and vision-language models across multiple tasks. To our knowledge, GeoProg3D is the first framework enabling compositional geographic reasoning in high-fidelity city-scale 3D environments via natural language.

Copyright ©Matsuo-Iwasawa Lab. All Rights Reserved.