The Future of AI — A Landscape of Expert Opinion

The smartest people in artificial intelligence agree on remarkably little beyond a single sentence: this matters enormously. Past that point, the consensus dissolves. Ask when machines match human intelligence and you get answers ranging from "next year" to "not on this path at all." Ask whether the trajectory bends toward a handful of god-like labs or a commodity sold by the gallon, and equally serious people point in opposite directions. The honest map of this field is not a forecast. It is a map of disagreement between people who have each thought about it longer and harder than almost anyone alive.

This report walks the major cruxes — the load-bearing questions where the experts split — and tries to give the strongest version of each side rather than flattening them into a fake middle. It is organized to be read by someone building in this space, so it ends where it should: with what actually follows for the work.

The deepest crux. The single disagreement that organizes most of the others is takeoff dynamics: how fast and how suddenly capability compounds once AI starts improving AI. A continuous-but-fast camp (Christiano, Cotra, most lab leaders) sits between a discontinuous "FOOM" camp (Yudkowsky) and a large skeptical wing (Marcus, LeCun, Acemoglu, Cowen) that argues the whole trajectory is overhyped and bottlenecked by physics, institutions, and the limits of today's architectures.

~2030

Where lab leaders cluster for AGI. Hassabis: 50% by 2030. Economists like Cowen see a ~20-year horizon instead.

80–100×

Price gap between open-weight frontier-minus-one models and closed frontier APIs — and closing fast.

<1% → 99%

The range of expert p(doom) — from LeCun to Yampolskiy. Lab CEOs sit at 10–25% and keep building.

01 · The Engine

Recursive self-improvement and the question of where the loop breaks

Aschenbrenner Christiano Cotra Yudkowsky LeCun Cowen Marcus

The core idea is a loop: each generation of AI compresses the design cycle of the next — not just the next model, but the hardware it runs on. The most aggressive case is Leopold Aschenbrenner's Situational Awareness (2024): training compute and algorithmic efficiency each climb roughly half an order of magnitude per year, and once AI can do AI research, hundreds of millions of automated researchers could fold a decade of progress into a single year. Aschenbrenner put his money where his thesis is — his AGI-themed fund reportedly grew from a few hundred million to several billion in about a year, making him the market's most visible embodiment of the bull case.
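
The compounding behind that claim is easy to check by hand. The sketch below assumes, as the "effective compute" framing does, that the two growth rates multiply (so their orders of magnitude add); the function name and constants are illustrative, not taken from Situational Awareness itself.

```python
# Each driver grows roughly half an order of magnitude (OOM) per year, per the text.
COMPUTE_OOM_PER_YEAR = 0.5
ALGO_OOM_PER_YEAR = 0.5


def effective_compute_multiplier(years: float) -> float:
    """Combined multiplier on effective compute after `years`.

    Assumption: hardware and algorithmic gains multiply, so their OOMs add.
    """
    total_ooms = (COMPUTE_OOM_PER_YEAR + ALGO_OOM_PER_YEAR) * years
    return 10 ** total_ooms


for years in (1, 2, 4):
    multiplier = effective_compute_multiplier(years)
    print(f"{years} yr: ~{multiplier:,.0f}x effective compute")
```

Under these assumptions the combined rate is about one order of magnitude per year, which is why small changes to either input move the forecast so much.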

But a loop is only as fast as its slowest step. And the moment you trace the loop into the physical world, the binding constraint stops being cognition and becomes fabrication, power, and ops. A model can design a better chip in an afternoon; you still wait years for a fab to make it and for a grid to power it. This is the single most important structural picture in the whole debate:

The Recursive Loop — and where it meets the physical world

The top and right edges of the loop are cognitive and compress with intelligence. The bottom and left edges are physical and resist compression. The bull and skeptic camps are really arguing about which edge is binding.

Synthesis of Aschenbrenner (Situational Awareness), Epoch AI compute/energy analysis, and IEA/WEF grid-interconnection estimates. The "novelty bottleneck" critique adds a fifth gate: genuine architectural innovation may be inherently serial and resist parallelization across many AI instances.
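
The "slowest step" argument can be made concrete with a toy cycle-time model. Everything numeric here is a placeholder chosen for illustration (the stage names and durations are hypothetical, not drawn from the cited sources), and the stages are assumed to run strictly one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    years: float      # baseline duration (hypothetical)
    cognitive: bool   # True if more intelligence can compress this stage


# Illustrative loop: two cognitive stages, two physical ones.
LOOP = [
    Stage("research and model design", 1.0, True),
    Stage("chip design", 1.0, True),
    Stage("fabrication", 2.0, False),
    Stage("power and datacenter buildout", 3.0, False),
]


def cycle_years(stages, cognitive_speedup: float) -> float:
    """Total time for one trip around the loop, assuming serial stages."""
    total = 0.0
    for stage in stages:
        if stage.cognitive:
            total += stage.years / cognitive_speedup
        else:
            total += stage.years  # physical stages do not compress
    return total


for speedup in (1, 10, 1000):
    print(f"{speedup}x cognition: {cycle_years(LOOP, speedup):.2f} years per cycle")
```

With these made-up durations, a thousandfold cognitive speedup shortens the cycle from seven years to just over five: the physical stages set a floor. The bull case amounts to arguing that the physical durations themselves shrink; the skeptic case is that they do not, or not soon.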

The classic theoretical fight is Eliezer Yudkowsky versus Paul Christiano on takeoff speed. Christiano's continuous view holds that before there is AI great at self-improvement, there will be AI that is mediocre at it — so the curve bends sharply but smoothly, and even his "slow" takeoff is like the industrial revolution compressed roughly a hundredfold. Yudkowsky's discontinuous view says a system running a million times faster than us simply will not wait a year and a half to build its next hardware generation. The crux is not clock-speed; it is locality and discontinuity — whether the jump happens inside one system, suddenly, or across the whole economy, gradually.

"Before there is AI that is great at self-improvement there will be AI that is mediocre at it." — Paul Christiano, paraphrasing his continuous-takeoff thesis

Where does each serious thinker actually stand? The spectrum below is the most useful way to hold them in your head at once — and it doubles as a map of the whole field, because takeoff speed correlates tightly with almost every other view a person holds.

The Takeoff-Speed Spectrum

From discontinuous "FOOM" on the left to "slow, or simply not on this path" on the right. A person's position here predicts most of their other beliefs.

Positions synthesized from the Yudkowsky–Christiano takeoff debate (2021), Situational Awareness, Hassabis & Amodei interviews (2024–2026), Cowen's writing on diffusion and cost disease, and LeCun/Marcus's architectural critiques.

02 · The Fuel

Scaling laws, the data wall, and the pivot to "thinking" compute

Sutskever Hinton Hooker Epoch AI

For five years the dominant religion was simple: more compute and more data, reliably, make models better. Ilya Sutskever — who arguably authored that era — declared its end. At NeurIPS in late 2024 he framed data as the fossil fuel of AI and said bluntly that we have "achieved peak data and there will be no more." In his 2025 telling, the field has rotated back to its roots:

"From 2012 to 2020 was the age of research. From 2020 to 2025 was the age of scaling. Now it's back to the age of research again — just with big computers." — Ilya Sutskever, founder of Safe Superintelligence

Sutskever's Three Eras

The strong scaling hypothesis — that 100× more compute alone transforms everything — is the thing he now explicitly rejects.

Sutskever, NeurIPS 2024 and Dwarkesh Patel interview, November 2025. Sara Hooker's 2026 essay "On the Slow Death of Scaling" documents smaller models closing the gap with larger ones through better training — evidence the binding lever is now method, not raw size.

"Is scaling dead?" turns out to be the wrong question. The real one is which scaling. Pretraining returns are bending toward log-linear, but a new axis opened: test-time compute — letting a model "think" longer at inference. That is why labs shipped reasoning models (the o-series, GPT-5) rather than a brute-scaled GPT-class model; they lacked the compute to do both at once. Google's Gemini 3 in late 2025 revived the "scaling still holds" case after a year of obituaries. The fairest read: pretraining is maturing, reasoning is the live frontier, and that frontier itself already shows saturation on some benchmarks while still climbing on others.

The data wall — and why "model collapse" is more slogan than threat

Epoch AI's foundational projection puts the effective stock of public human text near 300 trillion tokens and estimates that, on current trends, models will have consumed it sometime between roughly 2026 and 2032 — median around 2028. (Their earlier 2024 estimate slipped once carefully filtered web data and multi-epoch training proved more durable than expected.)

This is where the "model collapse" alarm enters — the 2024 Nature result that training recursively on synthetic output degrades successive generations into gibberish. But the rebuttal literature is decisive on the realistic case: the catastrophic version assumes you delete the real data each round. Keep accumulating real plus synthetic, and the drift is bounded. The working consensus, in Ethan Mollick's phrasing: mix, don't replace. Synthetic data is viable fuel — not infinite, but not poison either. Fei-Fei Li adds that vast differentiated and private data remains untapped entirely.
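
A toy simulation shows why "replace" and "accumulate" behave so differently. This is not the setup of the Nature paper or of the rebuttals: the "model" is just a one-dimensional Gaussian fitted to its training data, and the sample sizes, generation count, and seed are arbitrary assumptions.

```python
import math
import random


def fit(samples):
    """Fit a 1-D Gaussian: return (mean, sample standard deviation)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(var)


def simulate(mode, generations=300, real_n=100, synth_n=10, seed=0):
    """Train each generation on data produced by the previous one.

    mode="replace":    each generation sees only the latest synthetic batch.
    mode="accumulate": synthetic batches are added to the real data.
    Returns the fitted standard deviation after the last generation.
    """
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(real_n)]  # "real" data, std 1
    for _ in range(generations):
        mean, std = fit(data)
        synthetic = [rng.gauss(mean, std) for _ in range(synth_n)]
        if mode == "replace":
            data = synthetic
        else:
            data.extend(synthetic)
    return fit(data)[1]


print("replace   :", simulate("replace"))
print("accumulate:", simulate("accumulate"))
```

In the replace regime the fitted spread tends to shrink generation after generation until the distribution has lost almost all of its diversity; in the accumulate regime the real data keeps anchoring the fit and the spread stays close to where it started. Real language models are far more complicated, so treat this only as intuition for "mix, don't replace."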

Crux to watch: whether reasoning / test-time compute keeps yielding log-linear returns across labs, or visibly plateaus. A broad plateau would be the strongest evidence yet for the skeptics, and would shift advantage toward whoever locked in compute and power early.
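
One way to make "log-linear or plateau" operational is to track the score gained per tenfold increase in compute. The sketch below is a minimal version of that check; the data points, the threshold, and the function names are all hypothetical, and a real analysis would need error bars and more than one benchmark.

```python
import math


def gains_per_10x(points):
    """points: list of (compute, score) pairs in ascending compute order.

    Returns the score gain per 10x of compute between consecutive points.
    Under log-linear returns these gains are roughly constant.
    """
    gains = []
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        gains.append((s1 - s0) / math.log10(c1 / c0))
    return gains


def looks_plateaued(points, threshold=0.5):
    """Flag a plateau if the latest gain is under `threshold` times the
    average of the earlier gains. Needs at least three points."""
    gains = gains_per_10x(points)
    if len(gains) < 2:
        raise ValueError("need at least three (compute, score) points")
    earlier = sum(gains[:-1]) / len(gains[:-1])
    return gains[-1] < threshold * earlier


# Made-up benchmark scores at 1x, 10x, 100x, and 1000x test-time compute.
still_log_linear = [(1, 40.0), (10, 46.0), (100, 52.0), (1000, 58.0)]
flattening = [(1, 40.0), (10, 46.0), (100, 52.0), (1000, 53.0)]

print(looks_plateaued(still_log_linear))
print(looks_plateaued(flattening))
```

The point of the exercise is the unit: gains should be read per order of magnitude of compute, not per dollar or per release, or a healthy log-linear trend will look like a slowdown.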

03 · The Shape

Is the transformer enough — and why "how smart" is the wrong axis

LeCun Hassabis Mollick Chollet

The decoder-only transformer dominates everything at the frontier, and the live question is whether a new architectural unlock is required. Yann LeCun is the loudest dissenter — he left Meta in late 2025 to found a world-models startup and calls today's LLMs a "dead end" on the road to human-level intelligence. His charge: a language model is a vast web of statistical correlations with no internal world model, no persistent memory, no planning, no grounding in physical reality. His line lands because it's vivid:

"Using an LLM to understand the real world is like teaching someone to drive by just talking." — Yann LeCun, on the limits of token prediction

His alternative, JEPA, predicts in an abstract representation space and deliberately throws away unpredictable detail. The fair caveat, noted by his critics: his reliability degrades with distance from his own lab, and the JEPA pitch doubles as a startup's founding pitch. Demis Hassabis sits in the calmer middle — large models are a necessary but probably insufficient component, and true generality likely needs an AlphaZero-style planning layer bolted atop a multimodal world model. Notably, he uses the very same term the builders have adopted for what today's systems actually feel like: jagged intelligence.

This is the frame that matters most for anyone building. Intelligence is not a single dial that models slowly turn up. It is a profile with wild peaks and stubborn troughs — superhuman and subhuman at the same time, with no clean pattern to where the boundary falls.

Jagged Intelligence — capability relative to a human expert

The same model is far above the line on some tasks and below it on others. A workflow is only as strong as its weakest necessary step — which is why this shape beats any benchmark average.

Profile illustrative; quantified anchors from Dell'Acqua et al., "Navigating the Jagged Technological Frontier" (Harvard Business School / BCG, 2023): on in-frontier tasks, consultants with GPT-4 produced ~40% higher-quality work; on one out-of-frontier task, AI users were 19 percentage points less likely to reach the correct answer.

"If you want to understand where AI is headed, don't watch the benchmarks. Watch the bottlenecks." — Ethan Mollick, on the jagged frontier

A persistent source of the jaggedness is mundane and important: today's models form no new permanent memory and don't learn from the task in front of them. Jaggedness also caps what can be automated end to end: superhuman differential diagnosis is worthless for replacing a radiologist if the model's image reading is subhuman. The economic consequences flow from which parts get automated, not from any average.
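
The weakest-step point is simple arithmetic. In the sketch below the per-step reliabilities are invented for illustration, and the steps are assumed to be independent and all required, so the chance the whole workflow succeeds is their product.

```python
import math

# Hypothetical per-step reliability for an automated radiology workflow.
steps = {
    "differential diagnosis": 0.99,
    "report drafting": 0.97,
    "image reading": 0.60,
}

average = sum(steps.values()) / len(steps)
weakest_name = min(steps, key=steps.get)
end_to_end = math.prod(steps.values())  # all steps must succeed

print(f"benchmark-style average : {average:.2f}")
print(f"weakest step            : {weakest_name} ({steps[weakest_name]:.2f})")
print(f"end-to-end success rate : {end_to_end:.2f}")
```

The average looks respectable, yet the end-to-end rate sits below even the weakest step. That gap is what a headline benchmark score hides and what a product team actually ships.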

04 · The Economy

Commoditization vs. centralization — and the value that escapes both

Nadella Huang Acemoglu Cowen Epoch AI

Two stories about where the value goes are both true, and they pull against each other. This is the tension at the heart of any builder's strategy.

The commoditization case

Intelligence-per-dollar is collapsing. Open-weight models (DeepSeek, Llama, Mistral) keep reaching "frontier-minus-one" and giving it away. Most digital work — summarizing, extracting, routing, drafting — does not need the absolute frontier. It's a sword to cut an apple.

If that holds, frontier labs are selling a luxury good for a narrow band of hard tasks, while the commodity layer races to near-zero margin.

The centralization case

The frontier costs more capital, talent, and compute every cycle. Hyperscalers spent ~$400B on AI capex in 2025 and budgeted near $700B for 2026. OpenAI and Anthropic carry valuations approaching a trillion dollars.

If that holds, a handful of labs and their cloud patrons concentrate the most strategically important capability ever built, and rent edge to whoever can pay for asymmetry.

The "sword to cut an apple" question (do we need frontier intelligence for most work?) is increasingly answered "no," and the price evidence is stark. DeepSeek's open-weight V4, previewed in April 2026, posted frontier-grade coding scores at roughly one-thirtieth to one-hundredth the price of the closed leaders:

The Price of Intelligence Is Collapsing

Cost per million output tokens, log scale. Open weights at frontier-minus-one have opened an 80–100× price gap on comparable tasks.

Indicative API pricing, 2026. DeepSeek V4 reached ~80% on SWE-bench Verified at a fraction of closed-model cost, achieved partly via a sparse mixture-of-experts (1.6T parameters, ~49B active per token) and aggressive KV-cache compression — also a deliberate route around chip export controls. API prices have fallen ~97% since GPT-3.

What reconciles the two stories is the Jevons paradox. When Satya Nadella watched DeepSeek's cheap model briefly erase $600B of Nvidia's value, his response was counterintuitive: "Jevons paradox strikes again!" Cheaper intelligence doesn't shrink spending — it explodes total demand until intelligence becomes a commodity we can't get enough of. Jensen Huang makes the same bet in different words. So "models get cheaper" is secretly an argument for the compute crunch, not against it — which is why the commodity layer and the trillion-dollar capex race can both be real at once. The likely equilibrium: a competitive commodity-intelligence layer underneath a small number of labs renting frontier edge for the narrow tasks where being six months ahead is worth a fortune. Trading is the perfect example — in markets, relative edge is everything.
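
Whether Jevons holds comes down to one number: the price elasticity of demand for intelligence. The sketch below assumes a constant-elasticity demand curve, which is a textbook simplification rather than anything Nadella or Huang specified; the function name is illustrative.

```python
def spend_ratio(price_drop: float, elasticity: float) -> float:
    """Ratio of new total spending to old after price falls by `price_drop`x.

    Assumes constant-elasticity demand: quantity is proportional to
    price ** (-elasticity). Spending is price times quantity, so it scales
    as price ** (1 - elasticity), i.e. price_drop ** (elasticity - 1).
    """
    return price_drop ** (elasticity - 1.0)


for elasticity in (0.5, 1.0, 1.5):
    ratio = spend_ratio(10.0, elasticity)
    print(f"elasticity {elasticity}: total spend changes by {ratio:.2f}x")
```

With elasticity above 1, a tenfold price cut raises total spending; below 1, it shrinks it. The capex race is a bet that demand for intelligence is on the elastic side of that line, which is why the true elasticity is one of the unresolved cruxes.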

How big is the prize? Economists disagree by an order of magnitude

Here the spread between serious estimates is almost comic. Nobel laureate Daron Acemoglu models AI as a modest productivity tweak; Goldman Sachs models it as a multi-trillion-dollar reordering. They are looking at the same technology.

The 10-Year GDP-Impact Chasm

Cumulative US/global GDP uplift attributed to AI — same technology, estimates an order of magnitude apart.

Acemoglu, "The Simple Macroeconomics of AI" (NBER 2024): ~0.66% TFP gain over 10 years, because only ~20% of tasks are AI-exposed and few are profitably automatable near-term. Goldman Sachs Research (2023): up to 7% global GDP, ~$7T, with 300M jobs exposed. A Dec 2025 field experiment measured real productivity premiums near 81% — roughly triple Acemoglu's 27% input — hinting his estimate may be conservative.
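
The low estimate is easier to evaluate once its arithmetic is laid out. The sketch below is a back-of-envelope reconstruction, not Acemoglu's actual model: the 20% exposure and 27% savings figures come from the note above, while the 23% "profitably automatable" share and the 0.53 labor-share adjustment are recalled approximations that should be checked against the paper.

```python
def tfp_gain(exposed_share, profitable_share, labor_cost_savings,
             labor_share_adjustment):
    """Aggregate productivity gain as (share of tasks affected) times
    (average cost savings on those tasks)."""
    affected_tasks = exposed_share * profitable_share
    cost_savings = labor_cost_savings * labor_share_adjustment
    return affected_tasks * cost_savings


# Inputs: 0.20 and 0.27 are quoted in the text; 0.23 and 0.53 are assumptions.
baseline = tfp_gain(0.20, 0.23, 0.27, 0.53)
with_field_result = tfp_gain(0.20, 0.23, 0.81, 0.53)

print(f"baseline inputs       : {baseline:.2%} over ten years")
print(f"81% savings input     : {with_field_result:.2%} over ten years")
```

Under these inputs the baseline lands near 0.66%, and swapping in the 81% field-experiment premium roughly triples it to about 2%. Even then the result is nowhere near 7%, which suggests the chasm comes mainly from how many tasks each side believes will be affected, not from how much is saved per task.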

Tyler Cowen offers the most carefully reasoned slow-but-real view: a roughly 20-year horizon, because explosive growth runs into cost disease, the historically slow diffusion of innovations, and the inconvenient fact that nearly half of GDP — government, healthcare, education — adjusts at a glacial pace. His reframing of the labor question is the one to remember:

"AI will not bring mass unemployment, but it will change most jobs … the #1 bottleneck to AI progress is humans." — Tyler Cowen, economist

A sobering counterweight to the hype came from METR's 2025 randomized trial: experienced open-source developers using early-2025 AI tools took 19% longer than without them — while believing they'd been 20% faster. One of the AI 2027 authors cited this exact study as a reason he lengthened his own timelines. Perceived speed and real throughput are not the same number.

05 · The State

Nationalization, the lab–cloud merger, and energy as the real gate

Aschenbrenner Gladstone AI Wildeford IEA / WEF

Aschenbrenner's prediction is blunt: superintelligence is a national-security asset, no startup can hold it, and so a government "Manhattan Project for AI" arrives by the late 2020s. The more rigorous framework calls outright nationalization "reductive and misleading" — given an industry with over a trillion dollars of private funding and tens of thousands of participants, what's far more likely is soft nationalization: a widening spectrum of policy levers (security requirements, contracts, export controls, compute oversight) that blur the line between regulation and de facto control. Forecasters put a literal secret government lab in the low single-digit percentages; a government-led consortium is more plausible — and the single biggest trigger for any of it is a perceived Chinese military-technological threat.

Meanwhile the corporate structure question — do labs become the most valuable companies, get acquired by hyperscalers, or absorb them? — is being answered in real time by an entanglement so deep the categories are dissolving:

The Lab–Hyperscaler Entanglement

The frontier labs and their cloud patrons are now financially fused. In Q1 2026, a large share of Google's and Amazon's "AI profits" were paper gains on their Anthropic stakes — not operating income.

Figures from 2025–2026 funding rounds and filings. Anthropic became the most valuable AI startup (~$965B, May 2026), surpassing OpenAI's private mark. Analysts increasingly treat OpenAI as "a proxy for the AI trade," and worry about hyperscaler over-reliance on a single private company. These numbers move fast.

Energy is the constraint hiding under "compute"

Step back and the binding constraint on the whole edifice is increasingly not chips or even cognition; it is power. The IEA and WEF flag grid interconnection as the strategic chokepoint: connecting a new facility to the grid can take 4–10 years, versus 2–3 years to build the data center itself. That is why the hyperscalers are signing nuclear deals, including Microsoft (a Three Mile Island restart) and Google and Amazon (small modular reactors), though most of that power arrives only in the mid-2030s. In the meantime, "bring your own power" (onsite gas, off-grid energy islands) is the stopgap, and tokens-per-watt has quietly become the operative metric. Data centers in space "solve" energy in theory but only move the bottleneck to ops, which stays unsolved until robotics is. In the near term, power and ops are at least as binding as compute or intelligence.

06 · The Upside

What AI might actually accelerate — and what it can't

Amodei Hassabis

The most ambitious concrete vision is Dario Amodei's Machines of Loving Grace: a "country of geniuses in a datacenter" that could compress 50–100 years of biological progress into 5–10, with AI acting as a virtual biologist that designs and directs experiments rather than just crunching data. Software engineering is already being accelerated; biomedicine, post-AlphaFold, is the next plausible frontier. But Amodei himself names the limits — the speed of the physical world, the need for data, the irreducible complexity of some problems, lab throughput, and regulation. And tellingly, he concedes AI will not advance democracy and peace the way it advances health, and "seems likely to enable much better propaganda and surveillance." The acceleration is real but uneven — jagged, again, at civilizational scale.

07 · The Risk

Why credentialed experts put p(doom) anywhere from <1% to ~99%

Hinton Bengio Christiano Yudkowsky LeCun Ng

Existential-risk views are genuinely bimodal, and a 2025 survey found the split tracks one question: do you see AI as a controllable tool (low p(doom)) or an uncontrollable agent (high p(doom))? The same evidence supports both readings, which is exactly why the range is so absurdly wide:

The p(doom) Spectrum

Self-stated probability that advanced AI leads to human extinction or equivalent catastrophe. These are estimates by people who have thought about it for years — and they span the entire interval.

Self-reported and survey figures, 2022–2025. Hinton and Bengio went public in 2023 and have not recanted; the 2023 Center for AI Safety statement ranked extinction risk alongside pandemics and nuclear war. Skeptics like Andrew Ng and Richard Sutton argue the fear is amplified by incumbents to lobby against open source. A Dec 2025 survey of safety figures found them undeterred despite the "GPT-5 letdown."

The skeptics are as credentialed as the worriers, and their objection is epistemic: extinction arguments, Ng says, ultimately reduce to "it could happen." LeCun's challenge is the sharpest — before urgently figuring out how to control superintelligence, we need a hint of a design for a system smarter than a house cat. Both things can be true: the worriers can be right that the risk deserves serious resourcing, and the skeptics right that confident doom is unearned. The disagreement itself — not any tidy consensus — is the honest description of where we are.

08 · The Clock

When? Every serious prediction, on one axis

Timelines have stretched even as they stayed aggressive. Lab leaders cluster near 2030; economists push out to a 20-year horizon; skeptics decline the premise. Note the most telling motion of all — the AI 2027 authors pushing their own median from 2027 into the early 2030s, a live demonstration of how fast these forecasts revise.

AGI / Transformative-AI Timelines

Synthesized from public statements 2024–2026: Amodei (Machines of Loving Grace), Aschenbrenner (Situational Awareness), Hassabis (interviews), the AI 2027 scenario authors' revised median, Sutskever (SSI framing), Cowen's diffusion writing, and Marcus/LeCun's architectural critiques. Point estimates are approximate.

09 · So What

What follows, if you're building in applied AI

The landscape above isn't abstract — it converts into a handful of concrete stances for anyone shipping product on top of these models.

Build at the commodity layer; capture value in orchestration. With open weights 30–100× cheaper than closed frontier APIs, hard-wiring to one provider is now a liability. Route the cheapest capable model per task and escalate to the frontier only for genuinely hard reasoning — this alone tends to cut costs 60–80%. Revisit only if a closed model opens a durable, >20-point capability gap on your core task that open weights can't close within ~6 months.
Design around jaggedness, not average capability. Map your workflow's tasks to where AI is superhuman vs. subhuman; gate human review wherever an error carries real downstream cost; run agents in shadow mode before granting autonomy. The single weakest necessary step — memory, reliability, one stubborn subskill — decides product viability far more than headline benchmarks.
Treat energy/compute as a real constraint, but bet on Jevons. Falling per-token cost should expand your addressable use cases — products that were uneconomic at $1,000/run become viable at $20. Model true cost-per-token at realistic 25–60% utilization, never at vendor "10×" headline claims.
Hedge timeline uncertainty deliberately. Architect to benefit if reasoning/agentic models keep improving, but don't pre-build for capabilities that don't exist yet, and don't bet the company on a plateau either. Remember METR: measured throughput beats felt productivity, which ran 39 points ahead of reality.
Watch three leading indicators that would change this whole picture: (a) whether test-time scaling keeps returning log-linear gains or visibly plateaus across labs; (b) whether power/grid interconnection becomes the public gate on frontier scale, a real slowdown signal that favors incumbents; (c) regulatory moves toward soft nationalization, especially export-control or compute-access rules that could restrict open weights, which would invalidate the first recommendation above.
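
The routing advice in the first recommendation can be sketched in a few lines. The model names, prices, difficulty tiers, and task mix below are all placeholders, and a real router would decide capability from evaluations on your own tasks rather than from a hand-assigned tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float
    max_difficulty: int  # hardest task tier (1 to 3) it handles acceptably


# Hypothetical catalogue; none of these are real product names or prices.
CATALOGUE = [
    Model("open-small", 0.30, 1),
    Model("open-frontier-minus-one", 1.00, 2),
    Model("closed-frontier", 30.00, 3),
]


def route(difficulty: int) -> Model:
    """Pick the cheapest model rated capable of this difficulty tier."""
    capable = [m for m in CATALOGUE if m.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no model handles difficulty {difficulty}")
    return min(capable, key=lambda m: m.usd_per_million_tokens)


def blended_cost(task_mix: dict) -> float:
    """task_mix maps difficulty tier to its share of tokens (shares sum to 1)."""
    return sum(share * route(tier).usd_per_million_tokens
               for tier, share in task_mix.items())


mix = {1: 0.6, 2: 0.3, 3: 0.1}  # assumed workload: mostly easy tasks
routed = blended_cost(mix)
all_frontier = route(3).usd_per_million_tokens
print(f"routed: ${routed:.2f}/M tokens vs all-frontier: ${all_frontier:.2f}/M")
print(f"savings: {1 - routed / all_frontier:.0%}")
```

The savings figure depends entirely on the assumed mix and prices, so measure your own distribution of task difficulty before trusting any headline percentage. Keeping the catalogue as data rather than code is also what makes swapping providers cheap.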

10 · Read This Carefully

Caveats — because confidence here is mostly unearned

These are forecasts, not facts. Nearly every timeline claim is explicitly speculative; several authors call them guesses. The AI 2027 median already slid by years — proof of how fast this revises.
Source incentives cut every direction. Lab leaders have reasons to project both radical upside and inevitability; the most cited bull literally runs a fund betting on his thesis. Prominent skeptics have reputational stakes in a "wall" being real. Doom-skeptics note that the safety movement's funding is highly concentrated. Weight the arguments, not the authority.
The numbers are snapshots. Valuations, model prices, capex, and chip specs here are 2025–2026 figures and will be stale quickly. Several "up to $X billion" figures are commitments, not deployed capital.
Vendor and benchmark claims need independent validation at realistic utilization. A model that's superhuman on a benchmark can still fail the one subtask your product depends on.
The deepest cruxes are genuinely unresolved — takeoff speed, whether the transformer is an AGI-complete paradigm, the true elasticity of AI demand, and p(doom). This report maps the disagreement rather than faking a consensus. That disagreement is the most accurate description of the present moment.
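
The utilization caveat is worth one worked calculation, because it changes self-hosting economics more than most vendor comparisons do. The hourly price and throughput below are invented round numbers, and the function name is illustrative.

```python
def usd_per_million_tokens(gpu_usd_per_hour: float,
                           peak_tokens_per_second: float,
                           utilization: float) -> float:
    """Effective serving cost when hardware is busy only part of the time."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * 3600 * utilization
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


# Hypothetical accelerator: $4.00 per hour, 2,000 tokens/s at peak.
for utilization in (1.0, 0.60, 0.25):
    cost = usd_per_million_tokens(4.00, 2000, utilization)
    print(f"{utilization:.0%} utilization: ${cost:.2f} per million tokens")
```

Cost scales with the inverse of utilization, so a deployment running at 25% pays four times the headline per-token figure. Any comparison between self-hosted open weights and a closed API should be made at the utilization you can actually sustain.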

Sources & Further Reading

Leopold Aschenbrenner — Situational Awareness: The Decade Ahead (2024)
Dario Amodei — Machines of Loving Grace (2024)
Yudkowsky vs. Christiano — the takeoff-speed debate (2021; LessWrong)
Ilya Sutskever — NeurIPS 2024 talk; Dwarkesh Patel interview (Nov 2025)
Epoch AI — Villalobos et al., "Will we run out of data?" (data-wall projection)
Shumailov et al. — "Model collapse" (Nature, 2024); Gerstgrasser/Schaeffer rebuttals
Yann LeCun — JEPA / world-models talks; AMI launch (2025)
Dell'Acqua et al. — "Navigating the Jagged Technological Frontier" (HBS/BCG, 2023)
Ethan Mollick — "The Shape of AI: Jaggedness, Bottlenecks and Salients"
Daron Acemoglu — "The Simple Macroeconomics of AI" (NBER, 2024)
Goldman Sachs Research — generative-AI GDP impact (Briggs & Kodnani, 2023)
Tyler Cowen — writing on AI, diffusion, and cost disease (Marginal Revolution; Fortune)
METR — RCT on AI and developer productivity (2025)
Cheng & Katzke — "Soft Nationalization" framework; Gladstone AI superintelligence report
Hinton, Bengio, Russell et al. — "Managing extreme AI risks" (Science, 2024); CAIS statement (2023)
Dwarkesh Patel Podcast — deep technical interviews with most figures above