Say what?

Problem being addressed

While intelligent voice assistants such as Amazon Alexa, Apple Siri, and Google Assistant are built to process user dialog and perform tasks such as media playback and online shopping, these tasks remain challenging due to the sheer number of ways a user can phrase a command.

A major component of any voice assistant is understanding the action its users request: given the transcription of an utterance, the assistant must identify the intent behind it and extract the entities (slots) that further refine the action to perform. The researchers propose a unified architecture that solves this semantic parsing task for both simple and complex queries, and that can also be adapted to handle queries containing slots with overlapping spans. It builds on a language model pretrained on a large amount of text with a next-word prediction objective, which yields strong representations for each word.
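To make the task concrete, here is a toy sketch of the kind of target a semantic parser must produce: an intent plus slot spans over the user's utterance. The bracketed `IN:`/`SL:` notation mirrors common task-oriented parsing datasets and is an assumption for illustration, not necessarily the exact format the authors use.

```python
# Toy illustration of a task-oriented semantic parse: an intent label
# plus slot spans over the words of the utterance.

def serialize_parse(utterance, intent, slots):
    """Render an utterance as a flat bracketed parse string.

    slots: list of (slot_name, start_word, end_word) tuples with
    inclusive word indices into the utterance.
    """
    words = utterance.split()
    parts = []
    i = 0
    for name, start, end in sorted(slots, key=lambda s: s[1]):
        parts.extend(words[i:start])                 # words before the slot
        parts.append(f"[SL:{name} " + " ".join(words[start:end + 1]) + "]")
        i = end + 1
    parts.extend(words[i:])                          # trailing words
    return f"[IN:{intent} " + " ".join(parts) + "]"

parse = serialize_parse(
    "play the weeknd on spotify",
    "PLAY_MUSIC",
    [("ARTIST", 1, 2), ("SERVICE", 4, 4)],
)
print(parse)
# [IN:PLAY_MUSIC play [SL:ARTIST the weeknd] on [SL:SERVICE spotify]]
```

A "complex" query in this sense is one whose parse nests intents inside slots (e.g. a reminder whose slot value is itself another request), which is what makes a single unified architecture attractive over separate intent classifiers and slot taggers.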

Advantages of this solution

Experiments show that the suggested model achieves state-of-the-art performance, with relative improvements in exact-match accuracy over previous systems.

Possible New Application of the Work

Healthcare Sector

The suggested architecture could also improve accuracy on queries in other domains, for example in healthcare, where the task is to extract a patient's diagnosis and related information from a clinician's notes.
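As a hypothetical sketch (not from the paper): if a parser trained on clinical text emitted the same kind of bracketed intent/slot output, the diagnosis and related entities could be read straight off the slots. The slot names below are illustrative assumptions.

```python
import re

def extract_slots(parse):
    """Pull (slot_name, value) pairs out of a flat bracketed parse string."""
    return dict(re.findall(r"\[SL:(\w+) ([^\[\]]+)\]", parse))

# Example output a clinical-domain parser might produce for a note:
parse = ("[IN:RECORD_DIAGNOSIS patient diagnosed with "
         "[SL:DIAGNOSIS type 2 diabetes] , started on "
         "[SL:MEDICATION metformin]]")
print(extract_slots(parse))
# {'DIAGNOSIS': 'type 2 diabetes', 'MEDICATION': 'metformin'}
```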

Authors of the original research described in this blitzcard: Subendhu Rongali (University of Massachusetts Amherst), Luca Soldaini (Amazon Alexa Search), Emilio Monti (Amazon Alexa), Wael Hamza (Amazon Alexa AI)


Source URL: #############
