Xiang Zhang
fancyzhx
AI & ML interests
None yet
Organizations
None yet
Video Datasets
Text Datasets
- TxT360: Trillion Extracted Text
  Running • 134 • Explore the TxT360 LLM pre-training dataset online
- CASIA-LM/ChineseWebText2.0
  Viewer • Updated • 2k • 3.58k • 30
- HPLT/HPLT2.0_cleaned
  Viewer • Updated • 9.03B • 41.2k • 43
- TrevorDohm/Pile_Tokenized
  Viewer • Updated • 134M • 29
Audio Datasets
Robotic Datasets
Image Datasets