Meta introduces HOT3D: A dataset for advancing hand-object interaction research
A comprehensive 3D video dataset designed to support progress in robotics, AR/VR, and computer vision applications
Meta Reality Labs has introduced HOT3D, a publicly available dataset designed to enhance machine learning research on hand-object interactions.
The dataset comprises over 833 minutes of multi-view egocentric video (more than 3.7 million images), captured with Meta's Project Aria glasses and Quest 3 VR headset, showing 19 subjects interacting with 33 diverse objects during real-world tasks.
Key features of HOT3D include:
Multi-modal data: RGB/monochrome image streams, eye gaze tracking, and 3D point clouds.
Comprehensive annotations: 3D poses of objects, hands, and cameras, plus 3D models of hands and objects (a sketch of how such a per-frame record could be organized follows this list).
Real-world scenarios: Demonstrations range from basic object manipulation to complex activities like typing or using kitchen utensils.
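To make the annotation list above more concrete, here is a minimal Python sketch of how a per-frame record combining camera, hand, and object poses could be organized. The field names and shapes are illustrative assumptions, not the actual HOT3D schema; hand parameters are shown in a MANO-style layout.

```python
from dataclasses import dataclass
from typing import Dict
import numpy as np


@dataclass
class Pose6DoF:
    """Rigid 6DoF pose: unit quaternion (x, y, z, w) plus translation in meters."""
    quaternion_xyzw: np.ndarray  # shape (4,)
    translation: np.ndarray      # shape (3,)


@dataclass
class HandAnnotation:
    """Hand pose in MANO-style parameters (names are illustrative, not HOT3D's format)."""
    pose_params: np.ndarray  # shape (48,): global orientation + 15 joint rotations (axis-angle)
    shape_betas: np.ndarray  # shape (10,): identity shape coefficients
    wrist_pose: Pose6DoF     # wrist placement in the world frame


@dataclass
class FrameAnnotation:
    """Everything annotated for a single multi-view frame."""
    timestamp_ns: int
    camera_poses: Dict[str, Pose6DoF]      # per camera stream, world-from-camera
    hand_poses: Dict[str, HandAnnotation]  # keys such as "left" and "right"
    object_poses: Dict[str, Pose6DoF]      # per object identifier, world-from-object
```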
Ground-truth annotations were captured with a professional motion-capture system, and hand poses are provided in both the UmeTrack and MANO formats. In baseline experiments, multi-view methods trained on HOT3D significantly outperformed their single-view counterparts on three tasks: 3D hand tracking, 6DoF object pose estimation, and 3D lifting of unknown in-hand objects.
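To illustrate what a 6DoF object pose annotation means in practice, the short sketch below (not HOT3D's actual tooling) applies a quaternion-plus-translation pose to an object's 3D model points and projects them with a simple pinhole camera. The camera is assumed to sit at the world origin with identity orientation, and the pinhole model is a simplification; Project Aria's cameras use fisheye models.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def apply_pose(points_obj: np.ndarray, quat_xyzw: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Map Nx3 points from object coordinates to world coordinates with a 6DoF pose."""
    R = Rotation.from_quat(quat_xyzw).as_matrix()  # 3x3 rotation matrix
    return points_obj @ R.T + translation          # rotate, then translate


def project_pinhole(points_cam: np.ndarray, fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Project Nx3 camera-frame points to pixel coordinates with a pinhole model."""
    z = points_cam[:, 2:3]
    uv = points_cam[:, :2] / z
    return uv * np.array([fx, fy]) + np.array([cx, cy])


# Toy usage: the corners of a 10 cm cube placed 0.5 m in front of the camera.
cube = np.array([[x, y, z] for x in (0.0, 0.1) for y in (0.0, 0.1) for z in (0.0, 0.1)])
world_pts = apply_pose(cube, quat_xyzw=np.array([0, 0, 0, 1]), translation=np.array([0, 0, 0.5]))
pixels = project_pinhole(world_pts, fx=600, fy=600, cx=320, cy=240)
print(pixels.round(1))
```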
Publicly released, HOT3D aims to drive innovation in robotics, AR/VR systems, and human-machine interfaces by providing a strong foundation for computer vision and machine learning research.