Foundation Models for Open-Vocabulary Perception
Traditional object detection required training a separate model for every object class
Traditional object detection required training a separate model for every object class
Before 2022, robotic manipulation required a separate hand-engineered controller for each task