Sandris Dubovs: VL-Nav
Leverages a 3D scene graph and image memory to help Vision-Language Models (VLMs) replan tasks in real time.
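The source does not show the actual data structures, but the idea of pairing a symbolic 3D scene graph with an image memory that a VLM can query during replanning can be sketched roughly as follows (all class names, fields, and labels here are hypothetical illustrations, not the paper's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A semantic node in a (hypothetical) 3D scene graph."""
    label: str            # open-vocabulary label, e.g. "sofa"
    position: tuple       # (x, y, z) goal coordinates in the map frame
    floor: int = 0

@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)

    def add(self, node):
        self.nodes.append(node)

    def query(self, keyword):
        # Resolve a language goal to candidate nodes by label matching;
        # a real system would use VLM/embedding similarity instead.
        return [n for n in self.nodes if keyword in n.label]

# Image memory: keyframes the VLM can be re-prompted with when replanning.
image_memory = []  # list of (pose, image) pairs

g = SceneGraph()
g.add(Node("waterproof jacket on coat rack", (2.0, 1.5, 0.9), floor=1))
g.add(Node("sofa", (0.5, 3.0, 0.4), floor=0))

goals = g.query("jacket")
print(goals[0].position)  # → (2.0, 1.5, 0.9), handed to the planner
```

The point of the symbolic layer is that once the VLM resolves "something waterproof" to a graph node, the navigation goal is just that node's stored position.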
For related open-source frameworks, check repositories like oobvlm on GitHub.
You can find the full technical details on arXiv: VL-Nav.
"Traditional robot navigation often fails when faced with complex, multi-step instructions or unknown environments, resulting in inefficient 'aimless wandering.' VL-Nav addresses this by intertwining neural semantic understanding with symbolic 3D scene graphs. This allows the robot to decompose abstract commands—like finding a waterproof jacket based on a rain report—into logical navigation goals."
2. Key Technical Features (Good for Specs)
Proven to navigate successfully across different floors and transitions (e.g., using elevators or stairs) in complex building layouts.
3. Performance Summary (Good for Validation)
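Cross-floor traversal of this kind reduces to search over a topological graph whose edges include inter-floor transitions (elevators, stairs). A minimal sketch with a hypothetical building layout, using plain breadth-first search rather than whatever planner the paper actually uses:

```python
from collections import deque

# Rooms and transitions; the elevator edges connect floor 0 to floor 1.
# (Hypothetical layout for illustration only.)
graph = {
    "lobby_f0": ["hall_f0", "elevator_f0"],
    "hall_f0": ["lobby_f0"],
    "elevator_f0": ["lobby_f0", "elevator_f1"],  # inter-floor edge
    "elevator_f1": ["elevator_f0", "office_f1"],
    "office_f1": ["elevator_f1"],
}

def plan(start, goal):
    """Breadth-first search over the topological scene graph."""
    frontier, parents = deque([start]), {start: None}
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                frontier.append(nxt)
    return None

print(plan("lobby_f0", "office_f1"))
# → ['lobby_f0', 'elevator_f0', 'elevator_f1', 'office_f1']
```

Because the elevator is just another edge, multi-floor routes fall out of ordinary graph search with no special casing.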
View demonstrations on robots like the Unitree G1 and Go2 at the SAIR Lab Project Page.
Uses a CVL (Curiosity-driven Vision-Language) score to prioritize exploring unknown areas that align with human descriptions.
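The source does not give the CVL formula; a minimal sketch, assuming it combines a VLM alignment term with a novelty (curiosity) term via a weighted sum (the weights, decay, and candidate frontiers below are all made up for illustration):

```python
def cvl_score(alignment, visit_count, w_align=0.7, w_curiosity=0.3):
    """Hypothetical curiosity-driven vision-language score.

    alignment:   how well a frontier's view matches the instruction, in [0, 1]
    visit_count: how often this area has already been observed;
                 curiosity decays as the area becomes familiar.
    """
    curiosity = 1.0 / (1.0 + visit_count)
    return w_align * alignment + w_curiosity * curiosity

# Candidate frontiers: (name, VLM alignment score, times observed).
frontiers = [("doorway", 0.9, 3), ("dark corridor", 0.4, 0), ("closet", 0.6, 1)]
ranked = sorted(frontiers, key=lambda f: cvl_score(f[1], f[2]), reverse=True)
print([f[0] for f in ranked])
# → ['doorway', 'dark corridor', 'closet']
```

Note how the never-visited corridor outranks the closet despite weaker language alignment: the curiosity term is what pushes the robot toward unknown areas instead of re-checking familiar ones.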