MacTech.com

Apple wants Siri to better handle speech interpretation based on a user’s gaze, other speakers present, etc.

Apple has filed for a patent (number 20230035941 A1) for “speech interpretation based on environmental context.” It involves enhanced functionality for Siri.

About the patent filing

In the patent filing, Apple notes that intelligent automated assistants (or digital assistants) such as Siri allow users to interact with devices or systems using natural language in spoken and/or text forms. The tech giant adds that such assistants can make user interactions with a device much more efficient than conventional means, such as text entry. User requests to a digital assistant may be provided to a device under various environmental circumstances. For example, acoustics and other audible sounds in the device's environment may make speech interpretation more difficult.

However, Apple says certain environmental sounds may also help the device determine whether a given utterance is actually directed to the digital assistant. In particular, a user's speech may resemble a request to a digital assistant when the user is in fact engaged in a conversation with another person or entity.

Apple wants to incorporate an improved system for speech interpretation in Siri based on environmental context. What sort of context? Where a user is looking, whether another person is also speaking, and so on.

Summary of the patent filing

Here’s Apple’s abstract of the patent filing: “Systems and processes for speech interpretation based on environmental context are provided. For example, a user gaze direction is detected, and a speech input is received from a first user of the electronic device. In accordance with a determination that the user gaze is directed at a digital assistant object, the speech input is processed by the digital assistant. 

“In accordance with a determination that the user gaze is not directed at a digital assistant object, contextual information associated with the electronic device is obtained, wherein the contextual information includes speech from a second user. Determination is made whether the speech input is directed to a digital assistant of the electronic device. In accordance with a determination that the speech input is directed to a digital assistant of the electronic device, the speech input is processed by the digital assistant.”
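The decision flow the abstract describes can be sketched in a few lines. This is a minimal illustration only, with hypothetical names (`should_process`, `Context`) that are not from Apple's filing: if the user's gaze is on the digital assistant object, the speech input is processed; otherwise, contextual signals such as a second speaker's presence feed a determination of whether the utterance was directed at the assistant.

```python
# Hypothetical sketch of the gaze-plus-context decision described in the
# patent abstract. Names and structure are assumptions, not Apple's code.
from dataclasses import dataclass

@dataclass
class Context:
    second_speaker_active: bool   # speech detected from another person nearby
    addressed_to_assistant: bool  # verdict of a directedness classifier

def should_process(gaze_on_assistant: bool, context: Context) -> bool:
    """Return True if the digital assistant should handle the speech input."""
    if gaze_on_assistant:
        # Gaze directed at the assistant object: process the input directly.
        return True
    # Gaze directed elsewhere: fall back to environmental context, e.g.
    # whether the utterance appears aimed at the assistant rather than at
    # a second speaker the user is conversing with.
    return context.addressed_to_assistant

# Example: user looks away while talking to another person -> input ignored.
ctx = Context(second_speaker_active=True, addressed_to_assistant=False)
print(should_process(False, ctx))  # → False
print(should_process(True, ctx))   # → True
```

In this sketch, gaze acts as a fast path that bypasses the contextual check entirely, mirroring the abstract's two "in accordance with a determination" branches.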

Article provided with permission from AppleWorld.Today