Vision-Language-Action Model & Diffusion Policy Switching Enables Dexterous Control of a Robot Hand
Author: Create Lab
Uploaded: 2024-10-14
Views: 1678
Paper: http://arxiv.org/abs/2410.14022
Website: https://vla-diffu-switch.github.io/
To advance autonomous dexterous manipulation, we propose a hybrid control method that combines the complementary strengths of a fine-tuned Vision-Language-Action (VLA) model and a diffusion model.
The VLA model provides language-commanded high-level planning, which is highly generalizable, while the diffusion model handles low-level interactions, offering the precision and robustness required for specific objects and environments. By incorporating a switching signal into the training data, we enable event-based transitions between the two models for a pick-and-place task in which the target object and placement location are commanded through language. This approach is deployed on our anthropomorphic ADAPT Hand 2, a 13-DoF robotic hand that incorporates compliance through series elastic actuation, providing resilience to unplanned interactions. To our knowledge, this is the first use of a multi-fingered hand controlled with a VLA model.
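The abstract does not specify the controller's interfaces, but the event-based switching idea can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the names VLAPolicy, DiffusionPolicy, SWITCH_THRESHOLD, and control_step are hypothetical, and the assumption that the VLA model predicts a scalar switching signal (learned from the switching labels in the training data) that is thresholded to trigger the handover is an inference from the description above.

```python
import numpy as np

SWITCH_THRESHOLD = 0.5  # assumed threshold on the learned switching signal


class VLAPolicy:
    """Placeholder for the fine-tuned VLA model (generalizable, language-commanded)."""

    def predict(self, image: np.ndarray, instruction: str):
        # Returns an action plus a scalar switching signal in [0, 1] that
        # rises as the hand approaches the commanded object (assumed behavior).
        action = np.zeros(13 + 6)  # 13-DoF hand + 6-DoF arm (assumed layout)
        switch_signal = 0.0
        return action, switch_signal


class DiffusionPolicy:
    """Placeholder for the low-level diffusion policy (precise, multi-modal grasping)."""

    def predict(self, image: np.ndarray, proprio: np.ndarray) -> np.ndarray:
        return np.zeros(13 + 6)  # denoised action sample


def control_step(vla, diffusion, image, proprio, instruction, in_low_level):
    """One step of the hybrid controller: run the VLA model until the
    switching event fires, then hand control to the diffusion policy."""
    if not in_low_level:
        action, switch_signal = vla.predict(image, instruction)
        if switch_signal <= SWITCH_THRESHOLD:
            return action, in_low_level
        in_low_level = True  # event-based transition to low-level control
    # Low-level phase: the diffusion policy handles contact-rich interaction.
    action = diffusion.predict(image, proprio)
    return action, in_low_level
```

One design consequence of thresholding a learned signal rather than hand-coding a trigger is that the transition point is inferred from the same demonstrations used to train the policies, so no separate object-proximity detector is needed.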
We demonstrate that this model-switching approach achieves an over-80% success rate, compared to under 40% when using the VLA model alone, enabled by the VLA model's accurate near-object arm motion and the diffusion model's multi-modal grasping with error-recovery capabilities.