
An AI-capabilities bet with JY Koh



I’m making a bet with my officemate JY about whether a paper written entirely by a machine learning model will make it past peer review. I’m skeptical; he’s optimistic.

The bet

Before 12:00am GMT on July 1st, 2024, a paper written entirely by a language and/or image model will be accepted to a peer-reviewed, A-tier conference in computer science or a Q1 journal in mathematics, statistics, or computer science.

Predictor: JY Koh, Challenger: Ben Chugg.

Stakes: The loser has to buy bubble tea for our PhD cohort and their significant others.

The paper may be formatted by humans, but all content (including any figures) must be generated by the model. A conference counts as A-tier according to ERA standards, and a journal counts as Q1 according to SJR. See here for conference rankings, and here for journal rankings.

If a paper is accepted, then revealed to be model-generated and consequently retracted, the bet still resolves positively. In other words, the paper need not be published, only accepted.

There is a chance that a model-generated paper is accepted but the event goes unacknowledged. If we subsequently learn of such a paper and JY has already bought everyone bubble tea, then Ben will buy everyone bubble tea and donate $100 USD to GiveWell.

Some commentary

I’ve written elsewhere that peer review doesn’t work, so relying on it as the gatekeeper between model- and human-generated papers may seem odd. But despite its flaws, making it past peer review will require many pages of coherent narrative supporting some (hopefully) novel contribution. I don’t think current models have this capacity, nor do I think the current deep learning paradigm is capable of the kind of reasoning required to make a significant scientific contribution. But I could be wrong, and what’s better than being wrong in public and held to account by tapioca balls?

I suspect that if I lose, it will be due to a survey article, or to an idea that doesn’t hold much water when examined thoroughly but was enough to make it past peer review. Even this would be quite the achievement, however, so I’m happy to bet against it.
