understanding of the
future of AI
Our approach
Epoch is a multidisciplinary research institute investigating the trajectory of Artificial Intelligence (AI). We scrutinize the driving forces behind AI and forecast its ramifications for the economy and society.
We emphasize making our research accessible through our reports, models and visualizations to help ground the discussion of AI on a solid empirical footing. Our goal is to create a healthy scientific environment, where claims about AI are discussed with the rigor they merit.
Our research covers the following areas:
Trends in Machine Learning
We conduct in-depth analyses of compute, data, and investment trends to solidify our understanding of AI's trajectory.
Visit Trends page
Economics of AI automation
We build models to understand the economic drivers and impacts of AI automation.
Open takeoff model playground
Algorithmic progress
We investigate how innovations in AI are allowing us to build more capable models with fewer resources.
See publications
Data in Machine Learning
We research the challenges and solutions related to data bottlenecks that AI labs may encounter.
See publications
Highlighted research

Compute Trends Across Three Eras of Machine Learning
We compile a dataset of the training compute for over 120 Machine Learning models, highlighting novel trends and insights into the development of AI since 1952, and what to expect going forward.

Revisiting Algorithmic Progress
We use a dataset of over a hundred computer vision models from the last decade to investigate how better algorithms and architectures have enabled researchers to use compute and data more efficiently. We find that every 9 months, the introduction of better algorithms contributes the equivalent of a doubling of compute budgets.

Will We Run Out of ML Data? Evidence From Projecting Dataset Size Trends
Based on our previous analysis of trends in dataset size, we project the growth of dataset size in the language and vision domains. We explore the limits of this trend by estimating the total stock of available unlabeled data over the next decades.