Apple researchers have developed an AI system that can understand ambiguous references

Dennis Sellers

8 months ago

Apple researchers have developed a new artificial intelligence system that can understand ambiguous references to on-screen entities as well as conversational and background context, enabling more natural interactions with voice assistants, according to a paper published on Friday — as noted by VentureBeat.

The system, called ReALM (Reference Resolution As Language Modeling), leverages large language models to convert the complex task of reference resolution — including understanding references to visual elements on a screen — into a pure language modeling problem. This allows ReALM to achieve substantial performance gains compared to existing methods, according to VentureBeat.
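To make the idea of treating reference resolution as a language modeling problem more concrete, here is a minimal sketch in Python. It is not Apple's implementation. The `Entity` class, the prompt wording, and the `build_prompt` and `parse_answer` helpers are all hypothetical, chosen only to show how on-screen items might be written out as plain text so that a language model can be asked which ones a request refers to.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    """Hypothetical on-screen item; the field names are assumptions."""
    entity_id: int
    kind: str   # e.g. "phone_number" or "address"
    text: str   # the text shown on screen


def build_prompt(query: str, entities: List[Entity]) -> str:
    """Serialize the screen entities and the user request into one text prompt."""
    lines = ["Entities on screen:"]
    for e in entities:
        lines.append(f"{e.entity_id}. [{e.kind}] {e.text}")
    lines.append(f"User request: {query}")
    lines.append("Which entity numbers does the request refer to?")
    return "\n".join(lines)


def parse_answer(answer: str, entities: List[Entity]) -> List[Entity]:
    """Map the numbers in a model's text answer back to known entities."""
    by_id = {e.entity_id: e for e in entities}
    picked: List[Entity] = []
    for token in re.findall(r"\d+", answer):
        entity = by_id.get(int(token))
        # Ignore numbers that match no entity, and skip duplicates.
        if entity is not None and entity not in picked:
            picked.append(entity)
    return picked


# Illustrative data only; no model is called in this sketch.
screen = [
    Entity(1, "address", "12 Example Street"),
    Entity(2, "phone_number", "555-0100"),
]
prompt = build_prompt("call that number", screen)
resolved = parse_answer("2", screen)  # stands in for a model's reply
```

In this sketch the model only ever sees and produces text, which is the sense in which the task becomes "pure" language modeling. How ReALM actually encodes the screen is described in Apple's paper, not here.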

Here are some notes of interest from Apple’s research paper:

“Human speech typically contains ambiguous references such as ‘they’ or ‘that’, whose meaning is obvious (to other humans) given the context. Being able to understand context, including references like these, is essential for a conversational assistant that aims to allow a user to naturally communicate their requirements to an agent, or to have a conversation with it … In addition, enabling the user to issue queries about what they see on their screen is a crucial step in ensuring a true hands-free experience in voice assistants.

“We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. Our larger models substantially outperform GPT-4.”

The report is just the latest indication of Apple’s work in artificial intelligence. For example, in an October 2023 “Power On” newsletter, Bloomberg’s Mark Gurman said that “one of the most intense and widespread endeavors at Apple Inc. right now is its effort to respond to the AI frenzy sweeping the technology industry.”

He said that, as noted before, the company built its own large language model called Ajax and rolled out an internal chatbot dubbed “Apple GPT” to test out the functionality. The critical next step is determining if the technology is up to snuff with the competition and how Apple will actually apply it to its products, according to Gurman.

He said that Apple’s senior vice presidents in charge of AI and software engineering, John Giannandrea and Craig Federighi, are spearheading the effort. On Apple CEO Tim Cook’s team, they’re referred to as the “executive sponsors” of the generative AI push.

Article provided with permission from AppleWorld.Today
