Breaking the data bottleneck: Salesforce’s ProVision speeds multimodal AI training with image scene graphs

Salesforce is using structured representations of image semantics to power programs that synthesize instruction datasets for AI training.

To address these gaps, the AI research team at Salesforce has come up with ProVision, a framework that combines scene graphs with human-written programs to systematically synthesize vision-centric instruction data. Once the scene graphs are ready, they feed programs written in Python with textual templates; these programs serve as full-fledged data generators that create question-and-answer pairs for AI training pipelines. “These generators are crafted to…compare, retrieve, and reason about basic visual concepts of objects, attributes, and relations based on the detailed information encoded in each scene graph,” the researchers behind the framework wrote in a paper.
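To make the idea concrete, here is a minimal sketch of what a template-based generator over a scene graph might look like. It is not ProVision's actual code: the scene-graph layout, the function names (`attribute_questions`, `relation_questions`, `generate_qa_pairs`), and the templates are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

QAPair = Tuple[str, str]

# Hypothetical scene graph for one image: objects with attributes,
# plus (subject, predicate, object) relations between object ids.
scene_graph: Dict = {
    "objects": {
        "o1": {"name": "dog", "attributes": ["brown"]},
        "o2": {"name": "sofa", "attributes": ["red"]},
    },
    "relations": [("o1", "sitting on", "o2")],
}


def attribute_questions(graph: Dict) -> List[QAPair]:
    """Fill an attribute template once per (object, attribute) pair."""
    pairs = []
    for obj in graph["objects"].values():
        for attr in obj["attributes"]:
            question = f"How would you describe the {obj['name']}?"
            pairs.append((question, attr))
    return pairs


def relation_questions(graph: Dict) -> List[QAPair]:
    """Fill a relation template once per (subject, predicate, object) triple."""
    pairs = []
    for subj_id, predicate, obj_id in graph["relations"]:
        subj = graph["objects"][subj_id]["name"]
        obj = graph["objects"][obj_id]["name"]
        pairs.append((f"What is the {subj} {predicate}?", obj))
    return pairs


def generate_qa_pairs(graph: Dict) -> List[QAPair]:
    """Run every generator over the graph and collect the results."""
    return attribute_questions(graph) + relation_questions(graph)


if __name__ == "__main__":
    for question, answer in generate_qa_pairs(scene_graph):
        print(f"Q: {question}  A: {answer}")
```

Because each answer is read directly from the graph, the generated pairs are only as accurate as the scene graph itself, which is why the scene graphs have to be prepared first.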

Read this on VentureBeat