Work details

Goal: convert Arabic text to speech. The target output is clear, listenable audio.

The system is a two-stage pipeline. The first stage is a recurrent sequence-to-sequence feature prediction network with attention, which takes text as input and outputs a sequence of mel spectrogram frames. A spectrogram represents how the frequency content of a sound changes over time. The second stage is WaveGlow, a flow-based vocoder that borrows ideas from WaveNet; it generates time-domain waveform samples conditioned on the predicted mel spectrogram frames, that is, it converts spectrograms into audio waves.
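
The data flow through the two stages can be sketched roughly as follows. This is only an illustration of the shapes passed between stages, not the project's code: both models are replaced by placeholder functions, and every name and constant here (`text_to_ids`, `predict_mel`, `vocode`, `N_MELS`, `HOP_LENGTH`, `SAMPLE_RATE`, `frames_per_char`) is an assumption made for the example.

```python
import math
import random

# Assumed settings for illustration only; the project's real values are not stated.
N_MELS = 80          # mel bands per spectrogram frame
HOP_LENGTH = 256     # audio samples covered by one frame
SAMPLE_RATE = 22050  # audio samples per second


def text_to_ids(text):
    """Map each character to an integer ID (here, simply its Unicode code point)."""
    return [ord(ch) for ch in text]


def predict_mel(char_ids, frames_per_char=5):
    """Placeholder for stage 1 (the seq2seq network with attention).

    Returns a list of frames, each holding N_MELS values. A trained model
    decides how many frames to emit; this stub uses a fixed count per character
    and fills the frames with random numbers.
    """
    rng = random.Random(0)
    n_frames = len(char_ids) * frames_per_char
    return [[rng.uniform(-4.0, 0.0) for _ in range(N_MELS)] for _ in range(n_frames)]


def vocode(mel):
    """Placeholder for stage 2 (WaveGlow).

    Emits HOP_LENGTH waveform samples per mel frame. A real vocoder conditions
    on the frame contents; this stub ignores them and writes a 220 Hz sine tone.
    """
    n_samples = len(mel) * HOP_LENGTH
    return [math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(n_samples)]


def synthesize(text):
    """Chain the stages: text -> character IDs -> mel frames -> waveform samples."""
    return vocode(predict_mel(text_to_ids(text)))


audio = synthesize("مرحبا")
print(len(audio), "samples,", round(len(audio) / SAMPLE_RATE, 3), "seconds")
```

The point of the sketch is the interface between the stages: stage 1 turns a character sequence into a variable-length list of fixed-width frames, and stage 2 expands each frame into a fixed number of waveform samples.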

Work card

Freelancer name: Ali E.