Liling Tan (@alvations)'s Twitter Profile
Liling Tan

@alvations

Code, geek, game

ID: 3240302326

Link: http://alvations.com
Joined: 07-05-2015 11:06:14

8.8K Tweets

1.1K Followers

740 Following

HPLT (@hplt_eu)

#HPLT v3.0 Dataset is OUT! 🚀

A massive leap in multilingual data quality & scale with:

📈 73% unique segments (up from 52%)
🌐 Better text extraction and langID
🔄 Global deduplication, cleaner corpus

Ideal for training better multilingual models

🔗 hplt-project.org/datasets/v3.0