AI Trends
Our Trends section features key numbers and data visualizations in AI, drawn from related Epoch reports and other sources, that showcase the change and growth in AI over time.
Last updated on Nov 22, 2023
Training compute: 4.2x/year (Likely)
Training data: high-quality text stock projected to run out around 2024 (Plausible)
Most parameters in a dense model: 540 billion (Uncertain)
Computational performance: 1.35x/year (Likely)
Algorithmic improvements: 2.5x/year (Plausible)
Training costs: 3.1x/year (Likely)
Compute Trends
Deep Learning compute: 4.2x/year (Likely)
Pre-Deep Learning compute: 1.5x/year (Likely)
Large-scale vs. regular-scale models: ~100x more training compute (Uncertain)
Training Compute of Milestone Machine Learning Systems Over Time
- We compile the largest known dataset of milestone Machine Learning models to date.
- Training compute grew by 0.2 OOM/year up until the Deep Learning revolution around 2010, after which growth rates increased to 0.6 OOM/year (see the conversion sketch below).
- We also find a new trend of “large-scale” models that emerged in 2016, trained with 2-3 OOMs more compute than other systems in the same period.
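To connect the OOM/year rates in this summary with the multiplicative factors shown in the cards, here is a minimal Python sketch of the conversion. The headline figures come from Epoch's fitted trends, so small rounding differences (e.g. 10^0.6 ≈ 4.0x/year vs. the reported 4.2x/year) are expected.

```python
import math

def oom_to_factor(oom_per_year: float) -> float:
    """Multiplicative growth per year implied by an OOM/year rate: 10 ** r."""
    return 10 ** oom_per_year

def doubling_time(oom_per_year: float) -> float:
    """Years required for a 2x increase at the given OOM/year rate."""
    return math.log10(2) / oom_per_year

for era, rate in [("pre-2010", 0.2), ("deep learning era", 0.6)]:
    print(f"{era}: ~{oom_to_factor(rate):.1f}x/year, "
          f"doubling every ~{doubling_time(rate):.1f} years")
# pre-2010: ~1.6x/year, doubling every ~1.5 years
# deep learning era: ~4.0x/year, doubling every ~0.5 years
```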
Most compute-intensive training run: 2e25 FLOP (Plausible)
Data Trends
Language training dataset size: 2.2x/year (Likely)
When might we run out of high-quality text? 2024 (Plausible)
When might we run out of text altogether? 2040 (Uncertain)
Will We Run Out of ML Data? Evidence From Projecting Dataset Size Trends
- The available stock of text and image data grew by 0.14 OOM/year between 1990 and 2018, but has since slowed to 0.03 OOM/year.
- At current rates of data production, our projections suggest that we will run out of high-quality text, low-quality text, and images by 2024, 2040 and 2046 respectively.
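The exhaustion dates come from the report's detailed projections of data production and dataset growth, but the underlying logic is an exponential crossover between data usage and data stock. The sketch below is purely illustrative: the function and all inputs are hypothetical placeholders, not the report's model or figures.

```python
import math

def years_until_exhaustion(stock: float, usage: float,
                           stock_oom_per_year: float,
                           usage_oom_per_year: float) -> float:
    """Solve usage * 10**(g_u * t) = stock * 10**(g_s * t) for t (in years).

    Returns infinity if usage is not growing faster than the stock.
    """
    if usage_oom_per_year <= stock_oom_per_year:
        return math.inf
    return math.log10(stock / usage) / (usage_oom_per_year - stock_oom_per_year)

# Hypothetical placeholder inputs, chosen only to show the mechanics.
print(f"~{years_until_exhaustion(stock=1e15, usage=1e12,"
      f""[0:0]}{years_until_exhaustion(1e15, 1e12, 0.05, 0.35):.1f} years")
# ~10.0 years until crossover under these made-up numbers
```

Under these made-up inputs the crossover is about ten years out; the report's actual projections also model how quickly dataset sizes can realistically keep growing, which is why its dates differ from any naive crossover calculation.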
Largest training dataset: 1.87 trillion words (Uncertain)
Stock of data on the internet: 100 trillion words (Plausible)
Model Size Trends
Parameter count: 2.8x/year (Plausible)
The parameter gap: 20B to 70B parameters (Uncertain)
Machine Learning Model Sizes and the Parameter Gap
- Between the 1950s and 2018, model sizes grew at a rate of 0.1 OOM/year, but this rate accelerated dramatically after 2018.
- This is partly due to a statistically significant absence of milestone models with between 20 billion and 70 billion parameters, which we call the “parameter gap.”
Largest model trained end-to-end: 540 billion parameters (Uncertain)
Hardware Trends
Computational performance: 1.35x/year (Likely)
Lower-precision number formats: ~8x performance gain over FP32 (Plausible)
Memory capacity: 1.2x/year (Likely)
Memory bandwidth: 1.18x/year (Likely)
Trends in Machine Learning Hardware
- The use of alternative number formats accounts for roughly a 10x performance improvement over FP32 computational performance.
- Computational performance [FLOP/s] is doubling every 2.3 years for both ML and general GPUs; computational price-performance [FLOP/$] is doubling every 2.1 years for ML GPUs and 2.5 years for general GPUs; and energy efficiency [FLOP/s per Watt] is doubling every 3.0 years for ML GPUs and 2.7 years for general GPUs.
- Memory capacity and memory bandwidth are doubling every ~4 years. This slower rate of improvement suggests that memory capacity and bandwidth are a likely bottleneck for scaling GPU clusters.
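The doubling times above map onto the annual growth factors in the hardware cards via 2^(1/doubling time). A minimal sketch of that conversion:

```python
def annual_factor(doubling_time_years: float) -> float:
    """Annual growth factor implied by a doubling time: 2 ** (1 / T)."""
    return 2 ** (1 / doubling_time_years)

for metric, years in [("FLOP/s (ML and general GPUs)", 2.3),
                      ("FLOP/$ (ML GPUs)", 2.1),
                      ("FLOP/s per Watt (ML GPUs)", 3.0),
                      ("memory capacity and bandwidth", 4.0)]:
    print(f"{metric}: ~{annual_factor(years):.2f}x/year")
# 2.3-year doubling -> ~1.35x/year, matching the computational performance card;
# ~4-year doubling -> ~1.19x/year, in line with the 1.2x and 1.18x memory cards.
```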
Highest performing GPU in Tensor-FP16: ~9.9e14 FLOP/s (Likely)
Highest performing GPU in INT8: ~1.98e15 OP/s (Likely)
Algorithmic Progress
Compute-efficiency in computer vision: 2.5x/year (Plausible)
Data-efficiency in computer vision: 1.3x/year (Very uncertain)
Revisiting Algorithmic Progress
- Algorithmic progress explains roughly 45% of performance improvements in image classification, and most of this occurs through improving compute-efficiency.
- The amount of compute needed to achieve state-of-the-art performance in image classification on ImageNet declined at a rate of 0.4 OOM/year in the period between 2012 and 2022, faster than prior estimates suggested.
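A 0.4 OOM/year decline in required compute corresponds to the 2.5x/year compute-efficiency figure above, and to a halving time of roughly nine months. A minimal sketch of the arithmetic:

```python
import math

decline_oom_per_year = 0.4  # drop in compute needed for fixed ImageNet performance

efficiency_gain = 10 ** decline_oom_per_year                      # ~2.5x/year
halving_time_months = 12 * math.log10(2) / decline_oom_per_year   # ~9 months

print(f"compute-efficiency gain: ~{efficiency_gain:.1f}x/year")
print(f"required compute halves every ~{halving_time_months:.0f} months")
```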
Chinchilla scaling laws: 20 tokens per parameter (Plausible)
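As a worked example of this rule of thumb: compute-optimal training uses roughly 20 tokens per parameter, and total training compute can be estimated with the common C ≈ 6·N·D approximation (a standard estimate, not specific to this dashboard). The 70-billion-parameter model size below is chosen only for illustration.

```python
def chinchilla_optimal(params: float) -> tuple[float, float]:
    """Rough compute-optimal token count (~20 per parameter) and an
    approximate training-compute budget via the common C ~ 6 * N * D estimate."""
    tokens = 20 * params
    flop = 6 * params * tokens
    return tokens, flop

# Hypothetical 70-billion-parameter dense model, for illustration only.
tokens, flop = chinchilla_optimal(70e9)
print(f"~{tokens:.2e} tokens, ~{flop:.1e} FLOP")
# ~1.40e+12 tokens, ~5.9e+23 FLOP
```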
Investment Trends
Training costs: 3.1x/year (Likely)
Trends in the Dollar Training Cost of Machine Learning Systems
- The dollar cost for the final training run of milestone ML systems increased at a rate of 0.5 OOM/year between 2009 and 2022.
- Since September 2015, the cost for “large-scale” systems (systems that used a relatively large amount of compute) has grown more slowly, at a rate of 0.2 OOM/year.
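A 0.5 OOM/year growth rate corresponds to an annual factor of about 3.2x (the 3.1x/year headline comes from the underlying fit, so the rounding differs slightly); compounding it over 2009 to 2022 gives a sense of the cumulative increase. A minimal sketch:

```python
rate_all = 0.5          # OOM/year, all milestone systems, 2009-2022
rate_large_scale = 0.2  # OOM/year, "large-scale" systems since Sep 2015

print(f"annual factor (all systems): ~{10 ** rate_all:.1f}x/year")          # ~3.2x
print(f"annual factor (large-scale): ~{10 ** rate_large_scale:.1f}x/year")  # ~1.6x

# Compounding 0.5 OOM/year over the 13 years from 2009 to 2022 gives ~6.5 OOM,
# i.e. final training-run costs grew by a factor of roughly 3 million.
print(f"cumulative growth 2009-2022: ~{10 ** (rate_all * 13):.0e}x")
```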
Most expensive training run: $50 million (Plausible)
Acknowledgements
We thank Tom Davidson, Lukas Finnveden, Charlie Giattino, Zach Stein-Perlman, Misha Yagudin, Robi Rahman, Jai Vipra, Patrick Levermore, Carl Shulman, Ben Bucknall and Daniel Kokotajlo for their feedback.
Several people have contributed to the design and maintenance of this dashboard, including Jaime Sevilla, Pablo Villalobos, Anson Ho, Tamay Besiroglu, Ege Erdil, Ben Cottier, Matthew Barnett, David Owen, Robi Rahman, Lennart Heim, Marius Hobbhahn, David Atkinson, Keith Wynroe, Christopher Phenicie, Alex Haase and Edu Roldan.
Citation
Cite this work as
Epoch (2023), "Key trends and figures in Machine Learning". Published online at epochai.org. Retrieved from: 'https://epochai.org/trends' [online resource]
BibTeX citation
@misc{epoch2023aitrends,
  title = "Key trends and figures in Machine Learning",
  author = {Epoch},
  year = 2023,
  url = {https://epochai.org/trends},
  note = "Accessed: "
}