
Holger Schwenk
@schwenkholger
Full professor and senior research scientist at Meta AI Research
ID: 1078719052263104512
28-12-2018 18:27:18
26 Tweets
666 Followers
54 Following



The code and models to calculate multilingual sentence embeddings for 93 languages are now available, joint work with Mikel Artetxe.

WikiMatrix: large-scale bitext extraction from Wikipedia. 1620 language pairs in 85 languages, 135M parallel sentences, systematic evaluation on TED. With @VishravC, Shuo Sun, Hongyu and Paco Guzmán. Paper: arxiv.org/abs/1907.05791 Data: github.com/facebookresear…








We release SpeechMatrix, 418k hours of parallel speech in 136 language pairs mined from European Parliament recordings. We provide bilingual speech-to-speech baselines as well as multilingual training with mixture-of-experts: shorturl.at/amu28 With Holger Schwenk, Hongyu Gong, Benoît Sagot
