🧩 MinuteCryptic Benchmark

How well do LLMs solve cryptic crossword clues? · reads Mesocosm run export JSON

Leaderboard

# | Model | Score | Solve rate | Guesses | Hints / par | Eps
No data loaded yet.
Score = mean episode reward (0–1). Solve rate = fraction of episodes fully solved. “Hints / par” compares the model’s average hint count to the clue’s par (the human-expected hint count). Eps = number of episodes. Click a row to inspect episodes.
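
As a rough illustration of how these columns could be derived, here is a minimal Python sketch. It assumes each episode is a dict with the hypothetical fields `model`, `reward`, `solved`, `guesses`, `hints`, and `par`. The actual Mesocosm run export schema is not shown on this page and may differ. Sorting by score and reporting average hints alongside average par are also assumptions about how the table is ranked and displayed.

```python
from collections import defaultdict
from statistics import mean


def leaderboard(episodes):
    """Aggregate per-model metrics from episode dicts (hypothetical schema)."""
    # Group episodes by model name.
    by_model = defaultdict(list)
    for ep in episodes:
        by_model[ep["model"]].append(ep)

    rows = []
    for model, eps in by_model.items():
        rows.append({
            "model": model,
            # Score: mean episode reward, expected to lie in [0, 1].
            "score": mean(ep["reward"] for ep in eps),
            # Solve rate: fraction of episodes that were fully solved.
            "solve_rate": sum(1 for ep in eps if ep["solved"]) / len(eps),
            "guesses": mean(ep["guesses"] for ep in eps),
            # Hints / par: average hints used next to average par.
            "hints": mean(ep["hints"] for ep in eps),
            "par": mean(ep["par"] for ep in eps),
            "eps": len(eps),
        })

    # Assumed ranking: highest score first.
    rows.sort(key=lambda r: r["score"], reverse=True)
    return rows


# Made-up sample data, for illustration only.
sample = [
    {"model": "a", "reward": 1.0, "solved": True, "guesses": 1, "hints": 0, "par": 1},
    {"model": "a", "reward": 0.5, "solved": False, "guesses": 3, "hints": 2, "par": 1},
    {"model": "b", "reward": 0.0, "solved": False, "guesses": 5, "hints": 3, "par": 2},
]
rows = leaderboard(sample)
```

With this sample, model `a` ranks first with a score of 0.75 and a solve rate of 0.5 over two episodes, using one hint on average against a par of one.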

Episodes

Select a model to see its episode replays.