ForecastBench
A dynamic, continuously updated benchmark for evaluating LLM forecasting capabilities. More at www.forecastbench.org.
Datasets
Leaderboards and datasets are updated nightly and available at github.com/forecastingresearch/forecastbench-datasets.
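If you want to explore the data programmatically, the sketch below reads a leaderboard file from a local clone of the datasets repo. It assumes pandas is installed; the directory layout and file name are illustrative guesses, so check the repo for the actual paths.

# Minimal sketch: load a leaderboard from a local clone of
# forecastbench-datasets. The path below is an illustrative
# assumption, not a guaranteed file name.
import pandas as pd

leaderboard = pd.read_csv(
    "forecastbench-datasets/leaderboards/csv/leaderboard_overall.csv"
)
print(leaderboard.head())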
Participate in the benchmark
Instructions for submitting your model to the benchmark can be found here: How-to-submit-to-ForecastBench.
Wiki
Dig into the details of ForecastBench on the wiki.
Citation
@inproceedings{karger2025forecastbench,
  title={ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities},
  author={Ezra Karger and Houtan Bastani and Chen Yueh-Han and Zachary Jacobs and Danny Halawi and Fred Zhang and Philip E. Tetlock},
  year={2025},
  booktitle={International Conference on Learning Representations (ICLR)},
  url={https://iclr.cc/virtual/2025/poster/28507}
}
Getting started for devs
Local setup
- Clone the repo and enter it:
  git clone --recurse-submodules <repo-url>.git
  cd forecastbench
- Copy variables.example.mk to variables.mk and set the values accordingly:
  cp variables.example.mk variables.mk
- Set up your Python virtual environment:
  make setup-python-env
  source .venv/bin/activate
Run GCP Cloud Functions locally
cd directory/containing/cloud/function
eval $(cat path/to/variables.mk | xargs) python main.py
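For context, eval $(cat path/to/variables.mk | xargs) joins the KEY=value lines of variables.mk into a single string and evaluates it as an environment-variable prefix for python main.py, so the function sees those values in its environment (this assumes variables.mk uses plain KEY=value assignments). A hypothetical main.py could read them like this; CLOUD_PROJECT is an illustrative variable name, not necessarily one the repo defines:

# Hypothetical sketch: a cloud function entry point reading its
# configuration from environment variables exported by the eval above.
# CLOUD_PROJECT is an illustrative name, not taken from the repo.
import os

def main():
    project = os.environ.get("CLOUD_PROJECT")
    if project is None:
        raise SystemExit("CLOUD_PROJECT not set; did you export variables.mk?")
    print(f"Running locally against GCP project: {project}")

if __name__ == "__main__":
    main()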
Contributions
Before creating a pull request:
- run make lint and fix any errors and warnings
- ensure code has been deployed to Google Cloud Platform and tested (only for our devs; for others, we're happy you're contributing and we'll test this on our end)
- fork the repo
- reference the issue number (if one exists) in the commit message
- push to the fork on a branch other than main
- create a pull request