Machine Learning Trends

Our ML Trends dashboard offers curated key figures, visualizations, and insights showcasing the significant growth and impact of artificial intelligence.

Last updated on Jun 07, 2024


Each figure below carries a confidence rating: Likely, Plausible, or Uncertain.

Training compute: 4.2x/year (Likely)
Training data: public human-generated text projected to be exhausted around 2028 (Plausible)
Computational performance: 1.35x/year (Likely)
Algorithmic improvements: 5.1%/year (Plausible)
Training costs: 2.5x/year (Likely)

Compute Trends

Deep Learning compute: 4.2x/year (Likely)
Pre-Deep Learning compute: 1.5x/year (Likely)

Training compute of frontier AI models grows by 4-5x per year

Our expanded AI model database shows that the compute used to train recent models grew by 4-5x per year from 2010 to May 2024. We find similar growth rates for frontier models, recent large language models, and models from leading companies.
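
As a quick sanity check, a 4.2x/year growth rate implies training compute doubles roughly every six months. A minimal sketch of that conversion, plus a naive extrapolation reusing the dashboard's 5e25 FLOP largest-run figure as a baseline (the three-year horizon is an arbitrary illustration, not a forecast):

    import math

    GROWTH_PER_YEAR = 4.2   # frontier training compute growth (x/year)
    BASELINE_FLOP = 5e25    # most compute used in a single run (card below)

    # A 4.2x/year growth rate implies a doubling time of ln(2)/ln(4.2) years.
    doubling_months = 12 * math.log(2) / math.log(GROWTH_PER_YEAR)
    print(f"doubling time: {doubling_months:.1f} months")  # ~5.8 months

    # Naive extrapolation of the largest training run, if the trend holds.
    for years in (1, 2, 3):
        print(f"+{years}y: {BASELINE_FLOP * GROWTH_PER_YEAR ** years:.1e} FLOP")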


Most compute used in a training run: 5e25 FLOP (Plausible)

Data Trends

Language training dataset size: 2.9x/year (Likely)
When will the largest training runs use all public human-generated text? 2028 (Plausible)

Will we run out of data? Limits of LLM scaling based on human-generated data

We estimate the stock of human-generated public text at around 300 trillion tokens. If current trends continue, language models will fully utilize this stock at some point between 2026 and 2032, or sooner if models are intensely overtrained.
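
A back-of-the-envelope version of this projection, assuming single-epoch training and taking the 18-trillion-token largest dataset and 2.9x/year growth from the cards in this section (it ignores overtraining and data quality filtering, which shift the date):

    import math

    STOCK_TOKENS = 300e12     # estimated public human-generated text (tokens)
    LARGEST_DATASET = 18e12   # largest LLM training dataset to date (tokens)
    GROWTH_PER_YEAR = 2.9     # growth in language training dataset size (x/year)

    # Years until the largest dataset matches the full stock, if trends hold.
    years = math.log(STOCK_TOKENS / LARGEST_DATASET) / math.log(GROWTH_PER_YEAR)
    print(f"stock reached in ~{years:.1f} years (around {2024 + years:.0f})")
    # -> ~2.6 years, i.e. around 2027, inside the 2026-2032 window above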


Largest training dataset used to train an LLM: 18 trillion tokens (Uncertain)
Stock of data on the internet: 510 trillion tokens (Plausible)

Hardware Trends

Computational performance: 1.35x/year (Likely)
Lower-precision number formats: 8x performance gain (Plausible)
Memory capacity: 1.2x/year (Likely)
Memory bandwidth: 1.18x/year (Likely)

Trends in Machine Learning Hardware

FLOP/s performance across 47 ML hardware accelerators doubled every 2.3 years. Switching from FP32 to tensor-FP16 yielded a further ~10x performance increase. Memory capacity and memory bandwidth each doubled every four years.
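
The per-year multipliers in the cards above are restatements of these doubling times; a small helper makes the conversion explicit:

    def annual_growth(doubling_years: float) -> float:
        """Convert a doubling time into an equivalent x/year growth factor."""
        return 2 ** (1 / doubling_years)

    print(f"{annual_growth(2.3):.2f}x/year")  # FLOP/s performance -> ~1.35x
    print(f"{annual_growth(4.0):.2f}x/year")  # memory -> ~1.19x, matching the
                                              # 1.18-1.2x/year cards above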


Highest-performing GPU in tensor-FP16: 2.25e15 FLOP/s (Likely)
Highest-performing GPU in INT8: 4.5e15 OP/s (Likely)

Algorithmic Progress

Compute-efficiency in language models: 5.1%/year (Plausible)
Compute-efficiency in computer vision models: 331.1%/year (Plausible)
Contribution of algorithmic innovation: 35% (Plausible)

Algorithmic Progress in Language Models

Performance gains in language models outpace what we would expect from increased computing resources alone, with algorithmic progress contributing at a pace equivalent to doubling the available compute every 5 to 14 months.
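
To make the 5-to-14-month range concrete, here is a short sketch converting a doubling time into an effective-compute multiplier per year (the endpoints are the interval quoted above, not point estimates):

    def yearly_gain(doubling_months: float) -> float:
        """Effective-compute multiplier per year implied by a doubling time."""
        return 2 ** (12 / doubling_months)

    for months in (5, 14):
        print(f"doubling every {months} months -> {yearly_gain(months):.2f}x/year")
    # 5 months  -> ~5.28x/year
    # 14 months -> ~1.81x/year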


Chinchilla scaling laws: 20 tokens per parameter (Plausible)

Investment Trends

Training costs: 2.5x/year (Likely)
Hardware acquisition costs: 297.6x (Likely)

How Much Does It Cost to Train Frontier AI Models?

The cost of training frontier AI models has grown by 2-3x per year for the past eight years, putting the largest models on track to cost over a billion dollars by 2027.
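
A naive extrapolation consistent with that claim, seeding from the $130 million most-expensive-model figure below and the 2.5x/year growth rate (treating 2024 as the base year is an assumption of this sketch):

    COST_2024_USD = 130e6    # most expensive model to date (card below)
    GROWTH_PER_YEAR = 2.5    # training cost growth (x/year)

    # Compound the cost forward until it crosses one billion dollars.
    cost, year = COST_2024_USD, 2024
    while cost < 1e9:
        cost *= GROWTH_PER_YEAR
        year += 1
    print(f"crosses $1B around {year}: ~${cost / 1e6:.0f}M")  # 2027, ~$2,031M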


Most expensive AI model: $130 million (Uncertain)
Hardware acquisition cost for the most expensive AI model: $670 million (Uncertain)

Biological Models

Training compute: 8.7x/year (Likely)
Key DNA sequence database: 8.3x/year (Likely)

Biological Sequence Models in the Context of the AI Directives

The expanded Epoch database now includes biological sequence models, revealing both rapid growth in the compute used to train them and potential regulatory gaps in the White House's Executive Order on AI.
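
One way to make the growth concrete: a sketch of how soon such models would cross a compute threshold, assuming the 8.7x figure is an annual rate (1e26 FLOP, the Executive Order's general-purpose reporting threshold, is used purely as an illustrative target):

    import math

    CURRENT_FLOP = 6.2e23    # most compute-intensive biological sequence model
    GROWTH_PER_YEAR = 8.7    # training compute growth for these models (x/year)

    def years_to_reach(threshold_flop: float) -> float:
        """Years until the frontier hits a threshold, if the trend holds."""
        return math.log(threshold_flop / CURRENT_FLOP) / math.log(GROWTH_PER_YEAR)

    print(f"~{years_to_reach(1e26):.1f} years to 1e26 FLOP")  # ~2.3 years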


Most compute-intensive biological sequence model: 6.2e23 FLOP (Likely)
Protein sequence data: ~7 billion entries (Uncertain)

Acknowledgements

We thank Tom Davidson, Lukas Finnveden, Charlie Giattino, Zach Stein-Perlman, Misha Yagudin, Jai Vipra, Patrick Levermore, Carl Shulman, Ben Bucknall and Daniel Kokotajlo for their feedback.

Several people have contributed to the design and maintenance of this dashboard, including Jaime Sevilla, Pablo Villalobos, Anson Ho, Tamay Besiroglu, Ege Erdil, Ben Cottier, Matthew Barnett, David Owen, Robi Rahman, Lennart Heim, Marius Hobbhahn, David Atkinson, Keith Wynroe, Christopher Phenicie, Nicole Maug, Aleksandar Kostovic, Alex Haase, Robert Sandler, Edu Roldan and Andrew Lucas.

Citation

Cite this work as

Epoch AI (2023), "Key Trends and Figures in Machine Learning". Published online at epochai.org. Retrieved from: 'https://epochai.org/trends' [online resource]

BibTeX citation

@misc{epoch2023aitrends,
  title = {Key Trends and Figures in Machine Learning},
  author = {{Epoch AI}},
  year = {2023},
  url = {https://epochai.org/trends},
  note = {Accessed: }
}

If you spot an error or would like to provide feedback, please reach out at .