Vision-Language Navigation Finding Blue Trash Can in Classroom
Author: Ji Zhang
Uploaded: 2025-09-20
Views: 393
This video shows the result of vision-language navigation for the instruction "find the blue trash can in the classroom". The environment is initially unknown and is perceived gradually by the system, which segments it into rooms as it proceeds. A VLM reasons about which room to explore and when to transition between rooms. In-room exploration is carried out by the TARE exploration algorithm. Semantic mapping uses YOLO and the Segment Anything Model, and validates the results with the VLM. In addition, a base autonomy system is in charge of SLAM/state estimation, terrain traversability analysis, collision avoidance, etc. The vehicle is equipped with a 3D lidar, a 360-degree camera, and a gaming laptop with an RTX 4090 GPU. The speed is 0.85 m/s.
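The room-level decision loop described above can be illustrated with a rough Python sketch. This is not the authors' implementation: every name here (`Room`, `navigate`, `keyword_room_chooser`, `substring_validator`) is hypothetical, the VLM is replaced by simple string matching, TARE exploration is reduced to marking a room as explored, and the YOLO + Segment Anything detections are replaced by a fixed list of labels per room. It also assumes all rooms are known up front, whereas the real system discovers and segments them gradually.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Room:
    name: str
    # Labels a detector would report in this room (hypothetical stand-in
    # for the YOLO + Segment Anything semantic map).
    objects: List[str] = field(default_factory=list)
    explored: bool = False


def navigate(
    rooms: Dict[str, Room],
    instruction: str,
    choose_room: Callable[[str, List[str]], str],
    validate: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the room where the target was confirmed, or None if not found."""
    while True:
        candidates = [name for name, room in rooms.items() if not room.explored]
        if not candidates:
            return None  # every room explored without a confirmed match
        # Stand-in for the VLM deciding which room to explore next.
        room = rooms[choose_room(instruction, candidates)]
        # Stand-in for in-room exploration (TARE in the video).
        room.explored = True
        for label in room.objects:
            # Stand-in for the VLM validating a detection.
            if validate(instruction, label):
                return room.name


def keyword_room_chooser(instruction: str, candidates: List[str]) -> str:
    """Toy room chooser: prefer a room named in the instruction."""
    for name in candidates:
        if name in instruction:
            return name
    return candidates[0]


def substring_validator(instruction: str, label: str) -> bool:
    """Toy validator: accept a label that appears verbatim in the instruction."""
    return label in instruction


if __name__ == "__main__":
    rooms = {
        "hallway": Room("hallway", ["red chair"]),
        "classroom": Room("classroom", ["whiteboard", "blue trash can"]),
    }
    result = navigate(
        rooms,
        "find the blue trash can in the classroom",
        keyword_room_chooser,
        substring_validator,
    )
    print(result)
```

Passing the room chooser and the validator in as callables keeps the loop independent of any particular model, so the string-matching stubs could in principle be swapped for calls to an actual VLM.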
Vision-language navigation website:
https://cmu-vln.github.io
Base autonomy system:
https://github.com/jizhang-cmu/autono...
Vehicle platform:
https://www.tarerobotics.com
Background music:
"Scott Buckley - Snowfall" is under a Creative Commons (BY 3.0) license:
https://creativecommons.org/licenses/...
/ musicbyscottb
Music powered by BreakingCopyright: • 'Snowfall' by Scott Buckley 🇦🇺 | Piano Amb...