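Facebook AI Research recently open-sourced wav2letter, a simple and efficient end-to-end automatic speech recognition (ASR) system. wav2letter implements the architectures proposed in the papers Wav2Letter: an End-to-End ConvNet-based Speech Recognition System and Letter-Based Speech Recognition with Gated ConvNets.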
Papers
@article{collobert:2016,
  author  = {Ronan Collobert and Christian Puhrsch and Gabriel Synnaeve},
  title   = {Wav2Letter: an End-to-End ConvNet-based Speech Recognition System},
  journal = {CoRR},
  volume  = {abs/1609.03193},
  year    = {2016},
  url     = {http://arxiv.org/abs/1609.03193},
}
and
@article{liptchinsky:2017,
  author  = {Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
  title   = {Letter-Based Speech Recognition with Gated ConvNets},
  journal = {CoRR},
  volume  = {abs/1712.09444},
  year    = {2017},
  url     = {http://arxiv.org/abs/1712.09444},
}
If you use wav2letter or the associated pretrained models, you should cite one of these papers.
In addition, for anyone who wants to start transcribing speech right away, Facebook also provides models pretrained on the Librispeech dataset.