Vision-Language Navigation Finding Refrigerator in Lounge
Author: Ji Zhang
Uploaded: 2025-09-07
Views: 805
This video shows the result of vision-language navigation for the instruction "find the refrigerator in the lounge". The environment is initially unknown and is perceived gradually by the system, which segments it into rooms as it proceeds. A VLM reasons about which room to explore and when to transition between rooms. In-room exploration is carried out by the TARE exploration algorithm. Semantic mapping uses YOLO + Segment Anything Model and validates the results with a VLM. In addition, a base autonomy system is in charge of SLAM/state estimation, terrain traversability analysis, collision avoidance, etc. The vehicle is equipped with a 3D lidar, a 360 camera, and a gaming laptop with an RTX 4090 GPU. The vehicle speed is 0.85 m/s.
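The room-level decision loop described above can be pictured with a small toy sketch. This is not the authors' implementation: every name here (Room, navigate, keyword_chooser, build_demo_rooms) is hypothetical, the room graph is given up front rather than discovered online, a keyword match stands in for the VLM, and a lookup table stands in for TARE exploration plus validated semantic mapping.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Room:
    """A segmented room. All fields are illustrative placeholders."""
    name: str
    label: str                                           # e.g. "lounge"
    neighbors: List[str] = field(default_factory=list)   # adjacent room names
    objects: List[str] = field(default_factory=list)     # validated detections
    explored: bool = False


def keyword_chooser(candidates: List[Room], target_label: str) -> Room:
    """Stand-in for the VLM: prefer a room whose label matches the target."""
    for room in candidates:
        if room.label == target_label:
            return room
    return candidates[0]


def navigate(
    rooms: Dict[str, Room],
    start: str,
    target_object: str,
    target_label: str,
    choose_room: Callable[[List[Room], str], Room],
    explore_room: Callable[[Room], List[str]],
    max_steps: int = 20,
) -> Optional[List[str]]:
    """Return the sequence of rooms visited until the target is found, else None."""
    visited: List[str] = []
    current = start
    for _ in range(max_steps):
        room = rooms[current]
        visited.append(current)
        if not room.explored:
            # Stand-in for in-room exploration and semantic mapping.
            room.objects = explore_room(room)
            room.explored = True
        if room.label == target_label and target_object in room.objects:
            return visited
        # Prefer unexplored neighbors, then any other unexplored known room.
        candidates = [rooms[n] for n in room.neighbors if not rooms[n].explored]
        if not candidates:
            candidates = [r for r in rooms.values() if not r.explored]
        if not candidates:
            return None
        current = choose_room(candidates, target_label).name
    return None


def build_demo_rooms() -> Dict[str, Room]:
    """A made-up three-room layout used only for this example."""
    return {
        "hallway": Room("hallway", "hallway", neighbors=["office", "lounge"]),
        "office": Room("office", "office", neighbors=["hallway"]),
        "lounge": Room("lounge", "lounge", neighbors=["hallway"]),
    }


# Made-up ground truth that the exploration stub reveals room by room.
DEMO_CONTENTS = {"hallway": [], "office": ["desk"], "lounge": ["sofa", "refrigerator"]}


def demo_explore(room: Room) -> List[str]:
    return list(DEMO_CONTENTS[room.name])


path = navigate(build_demo_rooms(), "hallway", "refrigerator", "lounge",
                keyword_chooser, demo_explore)
print(path)  # ['hallway', 'lounge']
```

In the sketch, the chooser and the explorer are passed in as plain functions so that a real VLM query or exploration planner could replace the stubs without changing the loop.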
Vision-language navigation website:
https://cmu-vln.github.io
Base autonomy system:
https://github.com/jizhang-cmu/autono...
Vehicle platform:
https://www.tarerobotics.com
Background music:
"Sappheiros - Embrace" is under a Creative Commons (BY 3.0) license:
https://creativecommons.org/licenses/...
https://open.spotify.com/artist/5ZVHX...
Music powered by BreakingCopyright: • 🍀 Chill Instrumental [Non Copyrighted Musi...