EleutherAI
Dataset Stella Biderman 16/10/2023

Proof-Pile-2

A 55-billion-token dataset of mathematical and scientific documents, created for training the Llemma models.

Dataset Stella Biderman 10/10/2023

OpenWebMath

A 14.7B-token dataset of high-quality English mathematical text.



contact@eleuther.ai

Copyright EleutherAI 2023