Apple researchers taught an AI model to reason about app interfaces
A new Apple-backed study, conducted in collaboration with Aalto University in Finland, introduces ILuvUI: a vision-language model trained to understand mobile app interfaces from screenshots and from natural language conversations.

As the researchers explain, most vision-language models are currently trained on natural images, like dogs or street signs, so they don't perform as well when asked to interpret more structured environments, like app UIs.

The final dataset included Q&A-style interactions, detailed screen descriptions, predicted action outcomes, and even multi-step plans (like “how to listen to the latest episode of a podcast” or “how to change brightness settings”).
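To make that dataset composition concrete, here is a minimal sketch of what a single training sample might look like. The paper does not publish this schema; every field name and value below is an illustrative assumption based only on the categories the article lists.

```python
# Hypothetical shape of one ILuvUI-style training sample.
# All field names and values are illustrative assumptions, not from the paper.
from dataclasses import dataclass, field

@dataclass
class UITrainingSample:
    screenshot_path: str                 # the app screenshot the model sees
    screen_description: str              # detailed text description of the UI
    qa_pairs: list[tuple[str, str]] = field(default_factory=list)  # (question, answer)
    predicted_outcome: str = ""          # expected result of taking an action
    multi_step_plan: list[str] = field(default_factory=list)       # ordered steps

sample = UITrainingSample(
    screenshot_path="podcast_app_home.png",
    screen_description="Home screen of a podcast app showing a list of episodes.",
    qa_pairs=[
        ("Which control plays the newest episode?",
         "The play icon next to the top item in the episode list."),
    ],
    predicted_outcome="Tapping the play icon starts playback of the latest episode.",
    multi_step_plan=[
        "Open the Library tab",
        "Select the subscribed show",
        "Tap the most recent episode",
        "Press play",
    ],
)
```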