Abstract
The proliferation of personal artificial intelligence
(AI)-assistant technologies with speech-based conversational AI
interfaces is driving exponential growth in the consumer
Internet of Things (IoT) market. As these technologies are
applied to keyword spotting (KWS), automatic speech recognition
(ASR), natural language processing (NLP), and text-to-speech
(TTS) applications, it is essential that they
provide uncompromising performance for context learning in
long sequenc