Foundation Models for Open-Vocabulary Perception

Traditional object detection required training a detector on a fixed, predefined set of object classes.