Thanks for the analysis, Bob. How did you come up with "But hyperscaler capex efficiency has collapsed 75% (11K → 2.5K compute per dollar)"? Everything I'm reading about dollar-per-token costs shows significant deflation when adjusted for quality.
Thanks Nn, hope you enjoyed & great question! The 75% decline is not from model-training token economics. It comes directly from the “New Compute / Magnificent 7 CapEx” series on Slide 19 (sourced from MacroMicro, with whom I now have a data partnership).
They define “compute efficiency” as:
New US compute capacity added each half-year (measured via TOP500 LINPACK performance) ÷ Mag 7 CapEx over the same period.
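Written out explicitly (my notation, not MacroMicro's):

\[
\text{Compute efficiency}_t \;=\; \frac{\Delta\,\text{US TOP500 LINPACK compute}_t \ \text{(TFlop/s)}}{\text{Mag 7 CapEx}_t \ (\$)}, \qquad t = \text{half-year period}
\]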
MacroMicro’s methodology:
• Compute input - TOP500 supercomputer rankings (max LINPACK). They sum the performance of US-based systems and convert it to incremental half-year compute additions (TFlop/s).
• CapEx input - Reported capex from the seven major hyperscalers.
• Efficiency - New compute added per dollar of hyperscaler capex.
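To make the mechanics concrete, here is a minimal sketch of that ratio in Python. The period labels, sample figures, and units (TFlop/s, $B) are illustrative assumptions, not MacroMicro's actual data or code:

```python
# Minimal illustrative sketch of the MacroMicro-style ratio.
# All series names, numbers, and half-year labels are hypothetical placeholders.

# Cumulative US TOP500 LINPACK compute (TFlop/s) and Mag 7 capex ($B) per half-year.
top500_us_total = {"2023H1": 2.0e9, "2023H2": 2.9e9, "2024H1": 3.3e9, "2024H2": 3.5e9}
mag7_capex_bn = {"2023H1": 80.0, "2023H2": 95.0, "2024H1": 110.0, "2024H2": 130.0}

periods = list(top500_us_total)  # insertion order = chronological
for prev, curr in zip(periods, periods[1:]):
    new_compute = top500_us_total[curr] - top500_us_total[prev]  # net-new TFlop/s this half-year
    efficiency = new_compute / mag7_capex_bn[curr]               # TFlop/s added per $B of capex
    print(f"{curr}: {efficiency:,.0f} TFlop/s of new compute per $B of capex")
```

Whatever scaling MacroMicro applies to get their published "compute units per $," the shape of the calculation is the same: incremental compute in the numerator, same-period capex in the denominator.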
Using that metric:
• Peak efficiency during the early AI buildout was ~11K (compute units per $ of capex).
• The latest half-year reading dropped to ~2.5K.
• That’s a ~75% deterioration in how much new compute output is being generated per dollar of capex.
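As a quick sanity check on the headline figure, using the two readings above:

\[
1 - \frac{2{,}500}{11{,}000} \approx 0.77
\]

i.e. roughly a three-quarters decline, which is the ~75% figure cited on the slide.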
This is distinct from “dollar per token” or model-training cost curves, which incorporate model architecture and training-run efficiency. The MacroMicro series measures physical compute added to US infrastructure relative to capex, not the cost of inference or training tokens.
This is why I frame it as a leading indicator of diminishing returns on hyperscaler AI capex. If each capex dollar is producing less net-new compute, the capex cycle typically peaks 2–4 quarters later, and capex peaks have historically led turning points in NVIDIA demand.
Happy to dig deeper if useful.