Master AI Reasoning
5 papers that explain how AI systems learn to reason. The progression from "think step by step" to RL-trained reasoning models.
Chain-of-Thought Prompting elevates reasoning in LLMs, outperforming finetuned GPT-3 on complex math tasks.
Why this paper
"Think step by step" — three words that dramatically improved reasoning. The foundational paper that started it all.
Xuezhi Wang et al.
Self-consistency improves chain-of-thought reasoning in language models by up to 17.9% absolute accuracy on benchmarks like GSM8K.
Why this paper
Sample many reasoning paths for the same question, then take the majority-vote answer. Simple, but measurably more accurate than a single CoT chain.
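The voting step above is genuinely this small. A sketch, assuming the final answers have already been parsed out of each sampled chain (the sampling itself is a temperature-based model call, omitted here):

```python
from collections import Counter

def self_consistent_answer(final_answers: list[str]) -> str:
    """Majority vote over answers parsed from independently sampled reasoning chains."""
    return Counter(final_answers).most_common(1)[0][0]

# Five sampled chains might disagree; the vote picks the modal answer.
votes = ["11", "11", "12", "11", "9"]
```

The accuracy gain comes from marginalizing out individual faulty reasoning paths: many wrong chains diverge to different wrong answers, while correct chains converge on the same one.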
Tree of Thoughts enhances language models by enabling strategic, multi-path reasoning for complex problem solving.
Why this paper
Branching reasoning paths instead of a single chain — a likely precursor to the search-style thinking in models like o1.
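The branching search can be sketched as breadth-first search with a beam. In the paper, `propose` and `score` are both LLM calls; here they are stand-ins that solve a toy task (build a digit string summing to a target) so the control flow is runnable:

```python
# Tree-of-Thoughts-style BFS sketch. `propose` and `score` stand in for LLM
# calls; the toy task and beam width are illustrative assumptions.
TARGET = 9

def propose(thought: str) -> list[str]:
    # In ToT, an LLM proposes candidate next reasoning steps.
    return [thought + d for d in "123"]

def score(thought: str) -> float:
    # In ToT, an LLM rates how promising a partial solution is.
    total = sum(int(c) for c in thought)
    return -abs(TARGET - total)

def tot_bfs(depth: int = 4, beam: int = 2) -> str:
    """Expand every frontier thought, keep only the `beam` most promising."""
    frontier = [""]
    for _ in range(depth):
        candidates = [c for t in frontier for c in propose(t)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]
```

The key difference from plain CoT: weak partial thoughts are pruned mid-search instead of being committed to, which is what makes backtracking-style problem solving possible.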
Shunyu Yao et al.
ReAct interleaves reasoning with acting in LLMs, letting models call external tools mid-task and ground their answers in real observations.
Why this paper
Interleaving reasoning with tool actions — how every modern AI agent actually works.
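That interleaved loop is the whole pattern: the model emits a thought, then an action; the runtime executes the action and feeds the observation back. A sketch with a toy calculator tool and a scripted stand-in for the model (the `Thought:`/`Action:`/`Final Answer:` labels follow the paper's style; parsing details are our assumptions):

```python
# Minimal ReAct-style agent loop. The tool and the scripted "model" are toys.
def calculator(expr: str) -> str:
    return str(eval(expr))  # toy tool; never eval untrusted input

TOOLS = {"calculator": calculator}

def react(llm, question: str, max_steps: int = 5) -> str:
    """Alternate model steps with tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # model emits one Thought/Action/Answer line
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            tool, arg = step.removeprefix("Action:").strip().split(" ", 1)
            transcript += f"Observation: {TOOLS[tool](arg)}\n"  # feed result back
    return ""

# Demo with a scripted "model": thought -> tool call -> final answer.
_script = iter([
    "Thought: I should use the calculator.",
    "Action: calculator 2*21",
    "Final Answer: 42",
])
answer = react(lambda transcript: next(_script), "What is 2*21?")
```

A real agent replaces the lambda with a model call on the growing transcript; everything else, including the observation feedback, stays the same.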
DeepSeek-R1 uses RL to supercharge reasoning in LLMs, rivaling OpenAI's o1 — and its R1-Zero variant used no supervised fine-tuning at all.
Why this paper
DeepSeek trained R1-Zero to reason purely through RL, without any supervised reasoning examples — step-by-step thinking emerged from reward alone. Game-changing.
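What makes this trainable without supervision is that the reward is computed by rules, not by a labeled dataset: the paper describes an accuracy reward on a verifiable final answer plus a format reward for keeping reasoning inside think tags. A sketch — the `\boxed{}` extraction, the tag names, and the weights are illustrative assumptions, not the exact implementation:

```python
import re

# Rule-based reward sketch in the spirit of R1-Zero's training signal:
# correctness of a verifiable answer + adherence to the reasoning format.
# Weights and parsing conventions are illustrative assumptions.
def reward(completion: str, gold_answer: str) -> float:
    fmt_ok = bool(re.search(r"<think>.*</think>", completion, re.DOTALL))
    m = re.search(r"\\boxed\{([^}]*)\}", completion)
    correct = m is not None and m.group(1).strip() == gold_answer
    return 1.0 * correct + 0.5 * fmt_ok
```

Because the reward is a deterministic check rather than a learned judge, it cannot be flattered or gamed by plausible-sounding wrong answers — which is what lets long reasoning chains emerge from RL alone.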
Unlock the full analysis for each paper
Deep-dive articles, expert annotations, PM action plans, and interactive experiments — all for $6/mo.