Nvidia has revealed new AI and simulation tools that will advance robot learning and humanoid development.
The world’s largest tech company, valued at $3.432 trillion, revealed the tools this week at the Conference on Robot Learning (CoRL) in Munich, Germany, and said they will enable robotics developers to speed up their work on AI-powered robots.
The lineup includes general availability of the Nvidia Isaac Lab robot learning framework; six new humanoid robot learning workflows for Project GR00T, an initiative to accelerate humanoid robot development; and new world model development tools for video data curation and processing, including the Nvidia Cosmos Tokenizer and Nvidia NeMo Curator for video processing.
The open-source Cosmos Tokenizer provides robotics developers with superior visual tokenization by breaking down images and videos into high-quality tokens with exceptionally high compression rates. It runs 12 times faster than existing tokenizers, while NeMo Curator curates video data seven times faster than unoptimized pipelines.
Timed with CoRL, Nvidia presented 23 papers and nine workshops on robot learning, and released training and workflow guides for developers. Additionally, Hugging Face and Nvidia announced that they are collaborating to accelerate open-source robotics research for the developer community with LeRobot, Nvidia Isaac Lab and Nvidia Jetson.
Accelerating robot development with Isaac Lab
Nvidia Isaac Lab is an open-source robot learning framework built on Nvidia Omniverse, a platform for developing OpenUSD applications for industrial digitalization and physical AI simulation.
Developers can use Isaac Lab to train large-scale robot policies. This open-source unified robot learning framework can be applied to any embodiment – from humanoids to quadrupeds and collaborative robots – to handle increasingly complex movements and interactions.
Leading commercial robot manufacturers, robotics application developers, and robotics research institutions around the world are adopting Isaac Lab, including 1X, Agility Robotics, The AI Institute, Berkeley Humanoid, Boston Dynamics, Field AI, Fourier, Galbot, Mentee Robotics, Skild AI, Swiss-Mile, Unitree Robotics, and Xpeng Robotics.
Project GR00T: Foundations for general-purpose humanoid robots
The humanoids are coming. Building advanced humanoids is extremely difficult, demanding multilayered technological and interdisciplinary approaches to make robots perceive, move and learn skills effectively for human-robot and robot-environment interactions.
Project GR00T is an initiative to develop accelerated libraries, foundation models and data pipelines to accelerate the global humanoid robot developer ecosystem.
Six new Project GR00T workflows provide humanoid developers with blueprints for realizing the most challenging humanoid robot capabilities. Among them is GR00T-Gen, for building generative AI-powered, OpenUSD-based 3D environments.
“Humanoid robots are the next wave of embodied AI,” Jim Fan, senior research manager for embodied AI at Nvidia, said in a statement. “Nvidia research and engineering teams are collaborating across the company and our developer ecosystem to help build Project GR00T to advance the growth and development of global humanoid robot developers.”
Today, robot developers are building world models: AI representations of the world that can predict how objects and environments respond to a robot's actions. Building these world models is incredibly compute- and data-intensive, requiring thousands of hours of curated, real-world image or video data.
Nvidia Cosmos tokenizers provide efficient, high-quality encoding and decoding to simplify the development of these world models. They set a new standard for minimal distortion and temporal instability, enabling high-quality video and image reconstruction.
By providing high-quality compression and 12x faster visual reconstruction, Cosmos Tokenizer paves the way for scalable, robust and efficient development of generative applications across a wide range of visual domains.
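To make the idea of a "compression rate" concrete, here is a minimal arithmetic sketch of how many tokens a clip shrinks to when a tokenizer downsamples it in time and space. The function names and the 4x temporal and 8x spatial factors are hypothetical values chosen for illustration; the article does not state the Cosmos Tokenizer's actual factors, and this is not its API.

```python
from math import ceil


def token_grid(frames, height, width, t_factor, s_factor):
    """Shape (time, height, width) of the token grid after downsampling.

    t_factor and s_factor are illustrative compression factors,
    not figures published for any particular tokenizer.
    """
    return (
        ceil(frames / t_factor),
        ceil(height / s_factor),
        ceil(width / s_factor),
    )


def compression_ratio(frames, height, width, t_factor, s_factor):
    """Pixel positions in the clip divided by token positions in the grid."""
    t, h, w = token_grid(frames, height, width, t_factor, s_factor)
    return (frames * height * width) / (t * h * w)


# Hypothetical example: a 32-frame, 256x256 clip with 4x temporal
# and 8x spatial compression.
grid = token_grid(32, 256, 256, t_factor=4, s_factor=8)
ratio = compression_ratio(32, 256, 256, t_factor=4, s_factor=8)
print(grid, ratio)
```

Under these assumed factors, each token stands in for 4 x 8 x 8 = 256 pixel positions, which is why higher compression directly reduces the sequence length a world model has to train on.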
1X, a humanoid robot company, has updated the 1X World Model Challenge dataset to use the Cosmos tokenizer.
“The Nvidia Cosmos tokenizer achieves truly superior temporal and spatial compression of our data while maintaining visual fidelity,” Eric Zhang, vice president of AI at 1X Technologies, said in a statement. “This allows us to train world models with long-horizon video generation in an even more computationally efficient manner.”
Other humanoid and general-purpose robot developers, including Xpeng Robotics and Hillbot, are developing with the Nvidia Cosmos tokenizer to manage high-resolution photos and videos.
NeMo Curator
NeMo Curator now includes a video processing pipeline. It enables robot developers to improve the accuracy of their world models by processing large-scale text, image and video data.
Curating video data is challenging because of its sheer size: it requires scalable pipelines and efficient orchestration to balance load across GPUs. In addition, the filtering, captioning and embedding models need to be optimized to maximize throughput.
NeMo Curator overcomes these challenges by streamlining data curation with automated pipeline orchestration, significantly reducing processing time. It supports linear scaling in multi-node multi-GPU systems, efficiently handling more than 100 petabytes of data. This simplifies AI development, reduces costs and accelerates time to market.
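As a rough illustration of what load balancing across GPUs involves, the sketch below assigns video clips to workers with a generic greedy heuristic: longest clip first, each to the currently least-loaded GPU. This is a textbook scheduling technique, not NeMo Curator's actual orchestration, and the function name, clip IDs and durations are all hypothetical.

```python
import heapq


def balance_clips(clip_seconds, num_gpus):
    """Greedily spread clips over GPUs, longest clip first.

    clip_seconds maps a clip ID to its estimated processing cost in seconds.
    Returns (assignment, loads), both keyed by GPU index.
    """
    # Min-heap of (current load, gpu index); the least-loaded GPU pops first.
    heap = [(0.0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    assignment = {gpu: [] for gpu in range(num_gpus)}

    ordered = sorted(clip_seconds.items(), key=lambda kv: kv[1], reverse=True)
    for clip_id, seconds in ordered:
        load, gpu = heapq.heappop(heap)
        assignment[gpu].append(clip_id)
        heapq.heappush(heap, (load + seconds, gpu))

    loads = {
        gpu: sum(clip_seconds[c] for c in clips)
        for gpu, clips in assignment.items()
    }
    return assignment, loads


# Hypothetical clip costs in seconds.
clips = {"a": 90, "b": 60, "c": 45, "d": 30, "e": 15}
assignment, loads = balance_clips(clips, num_gpus=2)
print(assignment)  # {0: ['a', 'd'], 1: ['b', 'c', 'e']}
print(loads)       # {0: 120, 1: 120}
```

With these made-up costs both GPUs end up with 120 seconds of work, so neither sits idle while the other finishes. Keeping per-GPU work even in this way is a precondition for throughput to scale as GPUs are added.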
Availability
Nvidia Isaac Lab 1.2 is now available and open source on GitHub. The Nvidia Cosmos tokenizer is now available on GitHub and Hugging Face. NeMo Curator for video processing will be available at the end of the month.
The new Nvidia Project GR00T workflows are coming soon to help robot companies build humanoid robot capabilities with greater ease.
For researchers and developers learning to use Isaac Lab, beginner developer guides and tutorials are now available, including a migration guide from Isaac Gym to Isaac Lab.
Credit : venturebeat.com