Training LLMs
The overwhelming majority of EleutherAI’s resources have gone towards training LLMs. EleutherAI has trained and released several LLMs, along with the codebases used to train them. Several of these LLMs were the largest or most capable available at the time of release and have since been widely used in open-source research.
Libraries we currently recommend people use out-of-the-box include:
Mesh Transformer JAX, a lightweight TPU training framework developed by Ben Wang
GPT-NeoX, a PyTorch library built on Megatron-DeepSpeed that supports training models up to GPT-3 scale across multiple hosts within a single computing cluster
trlX, a PyTorch library for finetuning large language models with Reinforcement Learning from Human Feedback (RLHF); a usage sketch follows this list
RWKV, a PyTorch library for training RNNs with transformer-level language modeling performance.
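To give a flavor of the RLHF workflow trlX supports, here is a minimal sketch that optimizes a toy reward over model samples. The base model name ("gpt2") and the reward function are placeholders chosen for illustration, and the exact trlx.train signature may differ between trlX versions.

```python
# Minimal trlX-style RLHF sketch (illustrative; API details may vary by version).
import trlx

# Toy reward: score each generated sample by how often it mentions "open source".
def reward_fn(samples, **kwargs):
    return [float(sample.count("open source")) for sample in samples]

# Fine-tune a small placeholder base model against the reward defined above.
trainer = trlx.train("gpt2", reward_fn=reward_fn)
```

In practice the reward would come from a learned reward model or human preference data rather than a string-matching heuristic.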
Libraries
trlX: A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF).
RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters efficiently.
GPT-NeoX: A library for efficiently training large language models with tens of billions of parameters in a multi-machine distributed context. This library is currently maintained by EleutherAI.
Mesh Transformer JAX: A JAX- and TPU-based library developed by Ben Wang. The library was used to train GPT-J.
GPT-Neo: A library for training language models, written in Mesh TensorFlow. It was used to train the GPT-Neo models but has since been retired and is no longer maintained; we currently recommend the GPT-NeoX library for LLM training.
Models
Pythia: A suite of models designed to enable controlled scientific research on transparently trained LLMs (a loading example follows this list).
Polyglot-Ko: A series of Korean autoregressive language models made by the EleutherAI polyglot team. We have so far trained and released 1.3B, 3.8B, and 5.8B parameter models.
RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters efficiently.
GPT-NeoX-20B: An open-source English autoregressive language model trained on the Pile. At the time of its release, it was the largest publicly available language model in the world.
CARP: A CLIP-like model trained on (text, critique) pairs with the goal of learning the relationships between passages of text and natural language feedback on those passages.
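As a quick illustration of how the released models are typically consumed downstream, the sketch below loads a checkpoint through the Hugging Face transformers library and generates a short completion. The model id "EleutherAI/pythia-1.4b" is an assumed Hub location for one of the Pythia checkpoints, not something stated on this page.

```python
# Sketch: load a released EleutherAI checkpoint and sample a short completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-1.4b"  # assumed Hugging Face Hub id for a Pythia model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("EleutherAI trains and releases", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern applies to the other autoregressive checkpoints listed above, swapping in the appropriate model id.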
Papers
Other Links