Data



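| Type | Language | Corpus | #sentences |
|---|---|---|---|
| | ES-PT | Devset | |
| | ES-PT | Test set | |
| Parallel | ES-PT | Europarl v9 | 1811977 |
| Parallel | ES-PT | News Commentary v14 | 48168 |
| Parallel | ES-PT | Wiki Titles v1 | 621296 |
| Parallel | ES-PT | JRC-Acquis | 1650126 |
| Monolingual | ES, PT | Europarl v9 | |
| Monolingual | ES, PT | News Commentary v14 | |
| Monolingual | ES | News Crawl 2007-2018 | |
| Monolingual | PT | News Crawl 2008-2018 | |
| | CS-PL | Devset | |
| | CS-PL | Test set | |
| Parallel | CS-PL | Europarl v9 | 631372 |
| Parallel | CS-PL | Wiki Titles v1 | 248645 |
| Parallel | CS-PL | JRC-Acquis | 1311362 |
| Monolingual | CS, PL | Europarl v9 | |
| Monolingual | CS, PL | News Commentary v14 | |
| Monolingual | CS | News Crawl 2007-2018 | |
| Monolingual | PL | News Crawl 2018 | |
| Parallel | HI-NE | Train and Devset, Test set, Additional Resources | ~65K |
| Monolingual | HI, NE | Any monolingual data may be used, provided it has already been published under an open-source license and can be shared. | |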
Designed and maintained by Santanu Pal and Marcos Zampieri