
Navdeep Jaitly

Researcher at Google

Publications: 132
Citations: 36,853

Navdeep Jaitly is an academic researcher from Google. The author has contributed to research in the topics of recurrent neural networks and artificial neural networks. The author has an h-index of 51, and has co-authored 124 publications receiving 30,634 citations. Previous affiliations of Navdeep Jaitly include the Environmental Molecular Sciences Laboratory and the University of Toronto.

Papers
Journal ArticleDOI

Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

TL;DR: This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
Journal Article

Deep Neural Networks for Acoustic Modeling in Speech Recognition

TL;DR: This paper provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using deep neural networks for acoustic modeling in speech recognition.
Proceedings ArticleDOI

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

TL;DR: This paper presents Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes speech utterances directly to characters, without pronunciation models, HMMs, or other components of traditional speech recognizers.
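The summary above describes a decoder that attends over encoded audio while emitting characters. A minimal sketch of that attention step, with illustrative shapes and variable names that are assumptions rather than the paper's exact model:

```python
import torch
import torch.nn.functional as F

# Hedged sketch of a listen-attend-spell style attention step: the character
# decoder ("speller") attends over encoded audio frames from the encoder
# ("listener"). Dimensions are illustrative only.
T, d = 50, 128                       # number of encoded frames, feature size
encoder_states = torch.randn(T, d)   # stand-in for listener outputs
decoder_state = torch.randn(d)       # stand-in for current speller state

scores = encoder_states @ decoder_state   # (T,) similarity of each frame
weights = F.softmax(scores, dim=0)        # attention distribution over frames
context = weights @ encoder_states        # (d,) context vector for decoding
```

The context vector would then be combined with the decoder state to predict the next character.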
Proceedings Article

Towards End-To-End Speech Recognition with Recurrent Neural Networks

TL;DR: This paper presents a speech recognition system that directly transcribes audio data to text, without requiring an intermediate phonetic representation. It is based on a combination of the deep bidirectional LSTM recurrent neural network architecture and the Connectionist Temporal Classification objective function.
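The combination described above, a bidirectional LSTM trained with the CTC objective, can be sketched in PyTorch. This is a toy illustration under assumed dimensions, not the paper's configuration:

```python
import torch
import torch.nn as nn

# Hedged sketch: a tiny bidirectional LSTM acoustic model trained with the
# CTC objective. All sizes are illustrative assumptions.
num_features, hidden, num_chars = 40, 64, 29   # 29 = characters + CTC blank

lstm = nn.LSTM(num_features, hidden, bidirectional=True, batch_first=True)
proj = nn.Linear(2 * hidden, num_chars)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(2, 100, num_features)            # 2 utterances, 100 frames each
targets = torch.randint(1, num_chars, (2, 20))   # character labels (0 = blank)

out, _ = lstm(x)
log_probs = proj(out).log_softmax(-1).transpose(0, 1)  # (T, N, C) for CTCLoss
loss = ctc(log_probs, targets,
           input_lengths=torch.full((2,), 100, dtype=torch.long),
           target_lengths=torch.full((2,), 20, dtype=torch.long))
loss.backward()
```

CTC removes the need for a frame-level alignment between audio and transcript, which is what lets the system skip the intermediate phonetic representation.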
Proceedings ArticleDOI

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

TL;DR: This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. It is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms.
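The mel-scale spectrogram that serves as the intermediate representation above is computed by applying triangular mel-spaced filters to a magnitude spectrum. A self-contained sketch in NumPy, with parameter values that are illustrative assumptions rather than the paper's configuration:

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale mapping from frequency in Hz.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=16000, n_fft=1024, n_mels=80):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

# Apply the filterbank to one frame of a synthetic signal.
signal = np.random.randn(16000)              # stand-in for 1 s of audio
frame = np.abs(np.fft.rfft(signal[:1024]))   # magnitude spectrum of one frame
mel_frame = mel_filterbank() @ frame         # 80 mel-band energies
```

In a Tacotron-2-style pipeline, sequences of such mel frames are the prediction target of the sequence-to-sequence network and the conditioning input of the WaveNet vocoder.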