AI Infra Framework
The Economics of a Gigawatt
Factory math on whether the AI buildout pencils — bubble or productive capital?
18 June 2026 · YK Research
Contents
The Question
Everyone is balling on datacenters. Hundreds of billions of capex, financed against revenue that mostly doesn’t exist yet. The bear word is “bubble.” The bull word is “factory.” Both sides wave their hands. So I built the factory math: one 1GW datacenter full of GB200-class GPUs. What does it cost, what can it earn renting GPUs, and how much end-user AI revenue does the lab on top need to make it pencil? If paid useful AI work grows faster than GPU supply, it’s a factory. If it doesn’t, it’s a bubble. The math decides, not the vibes.
Short version: the base case clears, and it doesn’t need scarcity pricing to work. A gigawatt earns ~43% operating margins renting GPUs at today’s prices, and the lab on top needs ~$31B/year of inference revenue per gigawatt to justify the compute. That’s a large number, but it’s a number agentic workflows can plausibly hit. The thesis breaks on one variable, named at the end.
Three Layers
1. Build cost
2. GPU rental
3. Inference
Layer 2 is the infrastructure owner’s P&L. Layer 3 is whether the customer renting that infrastructure can actually pay for it. The bubble lives or dies in layer 3.
Build cost: ~$38–41B upfront, ~$7B/year to run
A 1GW site holds 480,000 GB200-class GPUs. The servers are more than half the bill, and they depreciate in five years — the expensive part of an AI datacenter is the part that goes obsolete fastest.
| Component | $B | Dep. life |
|---|---|---|
| Facility + land + substations | 11.8 | 14 yr |
| AI servers (the GPUs) | 21.0 | 5 yr |
| Networking + other IT | 4.9 | 5 yr |
| Behind-the-meter power (optional) | 3.0 | 14 yr |
| Total upfront capex | 40.7 | ~$38B ex-BTM |
| Annual cost | $B/yr | Math |
|---|---|---|
| Facility depreciation | 1.06 | (11.8 + 3.0) / 14 |
| IT depreciation | 5.18 | (21.0 + 4.9) / 5 |
| Opex (mostly electricity) | 0.90 | — |
| Annualized cost | 7.14 | sum |
So a 1GW datacenter must clear about $7.1B/year before financing and tax to be worth building. IT depreciation alone is $5.2B — three-quarters of the running cost is just the GPUs wasting away.
Where this differs from ARK — and why it matters
ARK’s published frame uses ~$8.5B/year. This sheet gets $7.14B by stretching facility depreciation to 14 years and isolating opex. That $1.4B gap is not cosmetic: on the same base-case revenue, it’s the difference between a 43% margin and a 32% margin. If you think GPUs obsolete faster than 5 years, or financing belongs in the number, use ARK’s $8.5B and haircut every margin below by ~10 points.
Renting GPUs earns ~43% margins at today’s price
Rental revenue is mechanical: GPUs × hours/year × $/GPU-hour × utilization. At 90% utilization and $3.30/GPU-hour (roughly today’s market): 480,000 × 8,760 × $3.30 × 0.90 = $12.49B/year.
Base case — $3.30/GPU-hr, 90% util
Scarcity case — $11/GPU-hr
The honest read sits near the base case: at normalized prices, a gigawatt still throws off ~40% operating margins. That alone says the infrastructure layer is not obviously a bubble. The question is whether the customer can pay.
The lab needs ~$31B/year of inference revenue per GW
When OpenAI or Anthropic rents that compute, the rental bill becomes their cost of goods. Run inference at a 60% gross margin and compute is 40% of revenue, so inference revenue = compute cost / 40%.
| Scenario | Compute cost | Required end-user revenue / GW |
|---|---|---|
| Base ($3.30/GPU-hr, 90% util) | $12.49B | $31.2B/year |
| Base, full utilization | $13.9B | $34.7B/year |
| Scarcity ($11/GPU-hr) | $46.3B | $115.6B/year |
So the bull case has a price tag: ~$31B of paid inference revenue per gigawatt per year, rising to ~$116B if you build against scarcity pricing. The hyperscalers are talking about tens of gigawatts. The base-case $31B/GW is the right order of magnitude for what frontier labs already project; the scarcity case at $116B is where it requires the whole stack to monetize like a luxury good.
The crux: paid useful work per watt, not tokens
Token prices are collapsing — maybe 50% a year. That sounds bearish. It isn’t, on its own, and this is the most important idea in the model. Revenue per gigawatt has four terms: power × tokens-per-watt × price-per-token × utilization. Token price is one term, and it’s falling while throughput per watt rises faster. This hardware generation moves throughput from 900k tokens/sec/MW (H100) to 2.8M (B200) — a 3.1x gain.
Falling prices are Jevons, not deflation — cheaper tokens get used for more, higher-value work. The grid below shows where it breaks. Net revenue index by throughput gain (rows) and price decline (columns); below 1.0, revenue per gigawatt shrinks.
| Throughput ↓ / Price decline → | 0% | 25% | 50% | 75% |
|---|---|---|---|---|
| 1x | 1.00 | 0.75 | 0.50 | 0.25 |
| 2x | 2.00 | 1.50 | 1.00 | 0.50 |
| 3x | 3.00 | 2.25 | 1.50 | 0.75 |
| 4x | 4.00 | 3.00 | 2.00 | 1.00 |
| 5x | 5.00 | 3.75 | 2.50 | 1.25 |
Read the diagonal. A 3x throughput gain survives a 50% price cut (1.50, highlighted) but a 75% cut takes it to 0.75. The whole thesis rides on the hardware cadence staying ahead of the price collapse. The KPI is not token volume. It’s paid useful work per watt.
Bull case
- The datacenter is productive capital, not a cost center.
- Hardware keeps lifting tokens per watt (3x+ per generation); software lifts utilization and cuts inference cost.
- Smarter models unlock higher-value tasks, so willingness-to-pay per token rises even as headline price falls.
- Agentic workflows turn compute into paid work across coding, research, support, ops, finance, legal, and sales.
- Operators with power, chips, financing, and customers earn cloud-like or better returns. The ~$31B/GW bar is a floor, not a stretch.
Bear / invalidation case
- Lots of GPUs, not enough paying customers. The $3.30 price was always temporary scarcity.
- Supply floods in, utilization falls below 90%, and the 43% margin compresses toward 32% (or worse) under ARK’s cost.
- Token prices collapse faster than useful demand grows, dragging the net revenue index under 1.0.
- GPUs obsolete in 3 years, not 5 — IT depreciation jumps from $5.2B toward $8.6B.
- Too much compute burns on training, free users, and R&D; the “100% monetized” inference math never holds.
What would change my mind — the kill levels
| Trigger | Severity | Prob. | Impact on Thesis | Mitigant |
|---|---|---|---|---|
| GPU rental sustains below ~$2/GPU-hr | HIGH | 30% | Rental business toward breakeven on $7B annual cost | Multi-year contracts lock today's $3.30 price |
| Throughput/watt cadence slows below ~2x/gen while prices keep halving | HIGH | 25% | Net revenue index drops under 1.0 — revenue/GW shrinks | B200 already delivers 3.1x over H100 |
| GPU useful life proves to be 3 years, not 5 | MEDIUM | 35% | IT depreciation $5.2B → $8.6B; base margin ~43% → ~15% | Older GPUs cascade to cheaper inference, not scrap |
| Monetized-inference share stuck near 50% | MEDIUM | 40% | Required gross compute roughly doubles per paid watt | Agentic, paid-by-task workflows raise monetized share |
Bullish if AI compute becomes scarce, productive capital. Bearish if it becomes overbuilt commodity capacity. Right now the base-case math says factory, not bubble — but it’s a factory whose entire edge is one race: throughput per watt rising faster than price per token falls. Watch that ratio. Everything else is noise.