ForecastBench
A dynamic, continuously updated benchmark for evaluating LLM forecasting capabilities. More at www.forecastbench.org.
Datasets
Leaderboards and datasets are updated nightly and available at github.com/forecastingresearch/forecastbench-datasets.
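If you want to explore the data programmatically, the sketch below reads a leaderboard file from a local clone of the datasets repo. It assumes pandas is installed; the directory layout and file name are illustrative guesses, so check the repo for the actual paths.

# Minimal sketch: load a leaderboard from a local clone of
# forecastbench-datasets. The path below is an illustrative
# assumption, not a guaranteed file name.
import pandas as pd

leaderboard = pd.read_csv(
    "forecastbench-datasets/leaderboards/csv/leaderboard_overall.csv"
)
print(leaderboard.head())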
Participate in the benchmark
Instructions for submitting your model to the benchmark can be found here: How-to-submit-to-ForecastBench.
Wiki
Dig into the details of ForecastBench on the wiki.
Citation
@inproceedings{karger2025forecastbench,
  title={ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities},
  author={Ezra Karger and Houtan Bastani and Chen Yueh-Han and Zachary Jacobs and Danny Halawi and Fred Zhang and Philip E. Tetlock},
  year={2025},
  booktitle={International Conference on Learning Representations (ICLR)},
  url={https://iclr.cc/virtual/2025/poster/28507}
}
Getting started for devs
Local setup
- Clone the repo and enter it:
  git clone --recurse-submodules <repo-url>.git
  cd forecastbench
- Copy variables.example.mk to variables.mk and set the values accordingly:
  cp variables.example.mk variables.mk
- Set up your Python virtual environment:
  make setup-python-env
  source .venv/bin/activate
Run GCP Cloud Functions locally
cd directory/containing/cloud/function
eval $(cat path/to/variables.mk | xargs) python main.py
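For context, eval $(cat path/to/variables.mk | xargs) joins the KEY=value lines of variables.mk into a single string and evaluates it as an environment-variable prefix for python main.py, so the function sees those values in its environment (this assumes variables.mk uses plain KEY=value assignments). A hypothetical main.py could read them like this; CLOUD_PROJECT is an illustrative variable name, not necessarily one the repo defines:

# Hypothetical sketch: a cloud function entry point reading its
# configuration from environment variables exported by the eval above.
# CLOUD_PROJECT is an illustrative name, not taken from the repo.
import os

def main():
    project = os.environ.get("CLOUD_PROJECT")
    if project is None:
        raise SystemExit("CLOUD_PROJECT not set; did you export variables.mk?")
    print(f"Running locally against GCP project: {project}")

if __name__ == "__main__":
    main()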
Contributions
Before creating a pull request:
- run make lint and fix any errors and warnings
- ensure code has been deployed to Google Cloud Platform and tested (only for our devs; for others, we're happy you're contributing and we'll test this on our end)
- fork the repo
- reference the issue number (if one exists) in the commit message
- push to the fork on a branch other than main
- create a pull request