Epistrophy Week Ahead

The Week Of September 1, 2025

September arrives with conference season in full swing. Broadcom (AVGO: NASDAQ) and Zscaler (ZS: NASDAQ) report earnings this week, offering early signs of where enterprise spending may head into the fall. Last week was about Nvidia (NVDA: NASDAQ), a company now central not just to markets but to geopolitics. This week, attention turns to whether the broader ecosystem—semiconductors, cloud security, infrastructure—keeps pace with Nvidia’s momentum.

Have you checked out our website? Visit the growing repository of past notes and our searchable research database at epistrophy.beehiiv.com.

As always, I’m focused on three things:
1) technology-driven change;
2) the latest in innovation and startup trends; and
3) stock fraud.

Companies Discussed

| Ticker | Name | Market Cap ($B) | Price |
| --- | --- | --- | --- |
| NVDA | NVIDIA | $4,283.72 | $174.18 |
| AMD | Advanced Micro Devices | $261.65 | $162.63 |
| INTC | Intel | $106.24 | $24.35 |
| GOOG | Alphabet | $2,528.31 | $213.53 |
| — | Positron | [private] | — |
| AMZN | Amazon.com | $2,416.11 | $229.00 |
| — | Groq | [private] | — |

In This Note:

“Rules of Inference” by Mel Bochner, 1974
Source: MoMA

NVIDIA’s Ferrari Problem

Does everyone need a sports car in the AI garage? 

NVIDIA’s earnings this week were another spectacle. The company posted record revenue and raised full-year guidance—but what really stood out in Wednesday’s Q2 FY2026 call wasn’t the topline growth; it was how much of the section on inference sounded like defense.

CEO Jensen Huang said, “Blackwell’s rack-scale NVLink and CUDA full-stack architecture address this by redefining the economics of inference.” For clarity: Blackwell NVL72 is NVIDIA’s rack-scale domain that connects 72 Blackwell GPUs and 36 Grace CPUs via NVLink, forming a tightly coupled, high-bandwidth “super-GPU.” CFO Colette Kress rounded out the pitch: invest three million dollars in one GB200 rack, she claimed, and you could generate thirty million dollars in token revenue—a tenfold return.
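
Kress's pitch is easy to sanity-check with back-of-envelope arithmetic. A quick sketch; the serving price used below is a hypothetical assumption for illustration, not a figure from the call:

```python
# Sanity-checking the CFO's pitch: a $3M GB200 rack generating $30M in
# token revenue is a 10x gross return before power, cooling, and opex.

RACK_COST = 3_000_000        # USD, per Kress's figure
TOKEN_REVENUE = 30_000_000   # USD, claimed token revenue per rack

gross_multiple = TOKEN_REVENUE / RACK_COST
print(f"Gross return multiple: {gross_multiple:.0f}x")

# How many tokens does $30M imply at a hypothetical serving price?
PRICE_PER_MILLION_TOKENS = 2.00  # USD; illustrative assumption only
implied_tokens = TOKEN_REVENUE / PRICE_PER_MILLION_TOKENS * 1_000_000
print(f"Implied tokens served: {implied_tokens:.2e}")
```

The interesting question is not the 10x multiple itself but the volume it implies: at any plausible per-token price, the claim assumes sustained, near-saturated demand for each rack.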

Notably, nobody asked whether Blackwell was overbuilt for inference. NVIDIA seemed to preempt the question.

Training vs. Inference — Follow the Money

This matters because inference already leads the compute and energy budgets—and the gap is growing.

  • Google reports that roughly 60 percent of its ML energy use is inference, not training.

  • AWS estimates that 80 to 90 percent of its cloud ML demand is inference.

  • Meta’s internal footprint shows a similar split.

Meanwhile, the Stanford AI Index Report 2025 (released in April) confirms that the cost to query a GPT-3.5-class model dropped more than 280-fold between 2022 and 2024, pushing consumption even higher. On this week’s conference call, Huang himself acknowledged that “AI inference token generation has surged tenfold in just one year, and as AI agents become mainstream, the demand for AI computing will accelerate.”

Inference is not just the tail anymore—it’s the dog.
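
Those two figures compound: a quick sketch of the annualized cost decline implied by the AI Index number, assuming (for illustration only) a smooth two-year decay:

```python
# The ">280-fold" drop in GPT-3.5-class query cost between 2022 and
# 2024 implies a steep annualized decline if spread evenly.

total_drop = 280   # fold reduction over the period, per the report
years = 2

annual_factor = total_drop ** (1 / years)
print(f"Annualized cost reduction: ~{annual_factor:.1f}x per year")

# Jevons-style intuition: each ~17x annual drop in per-query cost
# invites more usage, which is consistent with Huang's observation
# that token generation rose tenfold in a year.
```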

NVIDIA’s approach to inference is quite different from its competitors’.

The Ferrari vs. the Workhorse

Blackwell NVL72 is a liquid-cooled, rack-scale system whose 72 Blackwell GPUs and 36 Grace CPUs are interconnected by NVIDIA’s largest NVLink network to date, acting as a single giant GPU for extreme AI and high-performance computing workloads. It’s an engineering beast: Ferrari-class compute, brilliant for training and extreme-scale inference runs.

But most inference workloads are bursty, millisecond-latency requests—far better suited to workhorses: smaller, cheaper accelerators running quantized models.
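
To see why quantization matters for the workhorse tier, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization (illustrative only; production schemes use per-channel scales, lower bit-widths, and calibration data):

```python
import numpy as np

# Quantize a float32 weight matrix to int8: 4x less memory to move,
# at the cost of a small, bounded reconstruction error.

rng = np.random.default_rng(0)
weights = rng.standard_normal((4096, 4096)).astype(np.float32)

scale = np.abs(weights).max() / 127.0          # map value range to int8
q_weights = np.round(weights / scale).astype(np.int8)
dequantized = q_weights.astype(np.float32) * scale

savings = weights.nbytes / q_weights.nbytes    # 4 bytes -> 1 byte
max_err = np.abs(weights - dequantized).max()  # bounded by ~scale/2
print(f"Memory reduction: {savings:.0f}x")
print(f"Max reconstruction error: {max_err:.4f}")
```

Since inference is usually memory-bandwidth-bound, moving 4x fewer bytes per weight translates almost directly into cheaper, faster serving on smaller hardware.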

Rivals are noticing. AMD’s MI300 (AMD: NASDAQ) and Intel’s Gaudi (INTC: NASDAQ) are positioned as inference-friendly alternatives. Hyperscalers are rolling their own: Google’s TPU v5e, tuned for large-scale inference inside Google Cloud (GOOG: NASDAQ), and Amazon’s Inferentia2, designed to lower the total cost of ownership for transformer workloads across AWS.

There are also private innovation plays. Positron AI (full disclosure: I’m an early investor in the company) claims its Atlas accelerator delivers 280 tokens per second per user for Llama 3.1 8B using just 2,000 watts—versus NVIDIA’s 180 tokens per second at 5,900 watts, making the Atlas roughly three times more power- and cost-efficient. Another is Groq, a well-funded startup whose Language Processing Unit chips are purpose-built for inference. They promise ultrafast, LLM-native performance with an inference-first ASIC design.
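
Taking Positron’s quoted numbers at face value, the tokens-per-watt arithmetic is easy to check. Note that the “roughly three times” headline presumably blends power and capital cost, since power efficiency alone works out higher:

```python
# Raw tokens-per-watt from the figures as quoted (company claims,
# not independently verified).

atlas_tps, atlas_watts = 280, 2_000      # Positron's claimed Atlas figures
nvidia_tps, nvidia_watts = 180, 5_900    # comparison figures they cite

atlas_tpw = atlas_tps / atlas_watts
nvidia_tpw = nvidia_tps / nvidia_watts
ratio = atlas_tpw / nvidia_tpw
print(f"Atlas:  {atlas_tpw:.3f} tokens/sec/W")
print(f"NVIDIA: {nvidia_tpw:.3f} tokens/sec/W")
print(f"Power-efficiency ratio: {ratio:.1f}x")
```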

Inference Hardware: Ferrari vs. Workhorse

| Platform | Architecture Focus | Strengths | Weaknesses |
| --- | --- | --- | --- |
| NVIDIA Blackwell (NVL72) | Rack-scale GPU system, 72 GPUs + 36 CPUs | Peak throughput; frontier training and extreme inference; mature CUDA ecosystem | High power and capital costs; inefficient for small-batch inference |
| AMD MI300 | GPU/CPU hybrid with high memory bandwidth | Lower cost; solid inference performance; ROCm support | Smaller ecosystem; weaker training performance |
| Intel Gaudi | Custom inference accelerator | Low cost per inference; Ethernet scale-out; efficient | Limited adoption; weaker peak performance |
| Google TPU v5e | Cloud-native inference ASIC | Efficient serving at scale; strong Google Cloud stack | Proprietary to GCP |
| AWS Inferentia2 | AWS inference chip | Cost-effective on AWS; transformer-optimized | AWS-only; narrow developer ecosystem |
| Positron AI (Atlas) | Startup inference accelerator | Higher tokens per watt vs. H200; inference-first design; FPGA prototyping accelerates time-to-market | Early stage; ecosystem still forming |
| Groq LPU | Inference-first ASIC | Optimized for LLM inference; high speed | Private; limited ecosystem |

Why NVIDIA Sounds Defensive

That is why the call’s tone matters. Huang and Kress didn’t just market Blackwell—they defended it. They answered a question most investors haven’t fully asked: is Blackwell too much for inference? By reframing inference economics, they tried to make the Ferrari seem reasonable for everyday driving.

But the world is full of commutes—not track days.

Reasoning and agentic AI increase test-time token consumption, translating into more per-request compute—but that also intensifies the need for tokens-per-dollar efficiency, not just raw rack-scale horsepower. Cloud AI spending trends show inference budgets now exceed training by two to three times at major hyperscalers, with momentum only increasing in 2025. That is not a forecast; it is already visible in infrastructure accounting.
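
To make “tokens per dollar” concrete, here is a sketch of amortized serving cost. Every input below is a hypothetical assumption chosen for illustration, not a quote from any vendor:

```python
# Amortized hardware + power cost per 1M output tokens for a serving
# deployment. All inputs are illustrative assumptions.

def cost_per_million_tokens(capex, lifetime_years, watts,
                            usd_per_kwh, tokens_per_sec,
                            utilization=0.5):
    """Capex plus lifetime electricity, divided by lifetime tokens."""
    seconds = lifetime_years * 365 * 24 * 3600
    tokens = tokens_per_sec * utilization * seconds
    power_cost = watts / 1000 * usd_per_kwh * seconds / 3600
    return (capex + power_cost) / tokens * 1_000_000

# Hypothetical "Ferrari" rack vs. hypothetical "workhorse" accelerator:
ferrari = cost_per_million_tokens(3_000_000, 4, 120_000, 0.08, 50_000)
workhorse = cost_per_million_tokens(25_000, 4, 2_000, 0.08, 1_000)
print(f"Rack:      ${ferrari:.2f} per 1M tokens")
print(f"Workhorse: ${workhorse:.2f} per 1M tokens")
```

The point of the sketch is not the specific outputs but the structure: throughput, utilization, and power dominate the denominator, so a cheaper chip that stays busy can undercut a faster one that idles.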

If the world bifurcates—Ferraris for frontier labs, workhorses for the rest—frontier systems like Blackwell will stay essential. But most customers will prioritize efficiency and economics.

Tweet O’ The Week

Epistrophy In The News

On NewsNation, I explained why Nvidia matters more than any other company right now—and why its relationship with China is at the center of U.S. technology policy. Washington wants to sell cutting-edge tools but not arm its rivals; Beijing wants access without dependence. Nvidia sits at the hinge of that uneasy trade-off, its chips both powering America’s AI boom and tempting adversaries.

📆 Calendar of Epistrophy Events

| Ticker | Name | Market Cap | Expected Date | Type |
| --- | --- | --- | --- | --- |
| 🎉 | Labor Day | — | Sep 1 | Market Holiday |
| ZS | Zscaler | $42 B | Sep 2 | Earnings |
| CSP | Construction Spending | — | Sep 2 | Economic Event |
| HPE | Hewlett Packard Enterprise | $29 B | Sep 3 | Earnings |
| AI | C3.ai | $2 B | Sep 3 | Earnings |
| CRM | Salesforce | $241 B | Sep 3 | Earnings |
| HUBS | INBOUND | $25 B | Sep 3 | Conference |
| HUBS | HubSpot 2025 Analyst Day | $25 B | Sep 3 | Analyst Meeting |
| AVGO | Broadcom | $1,361 B | Sep 4 | Earnings |
| DOCU | Docusign | $15 B | Sep 4 | Earnings |
| U3 | Unemployment Rate | — | Sep 5 | Economic Event |
| RBRK | Rubrik | $16 B | Sep 9 | Earnings |
| SNPS | Synopsys | $110 B | Sep 9 | Earnings |
| AAPL | Hardware Launch | $3,414 B | Sep 9 | Launch Event |
| ADBE | Adobe | $152 B | Sep 11 | Earnings |
| BOX | BoxWorks | $5 B | Sep 11 | Conference |
| PPI | Producer Price Index | — | Sep 12 | Economic Event |
| UMCSENT | U. of Mich. Consumer Sentiment | — | Sep 12 | Economic Event |
| INTC | Intel Innovation | $106 B | Sep 15 | Conference |
| CPI | Consumer Price Index | — | Sep 16 | Economic Event |
| AMZN | Amazon Accelerate 2025 | $2,416.1 B | Sep 16 | Conference |
| FOMC | Federal Open Market Committee Meeting | — | Sep 17 | Economic Event |
| SNAP | Snap Partner Summit | $11.9 B | Sep 17 | Conference |
| — | TikTok Ban | — | Sep 17 | — |
| AMZN | AWS Summit LA | $2,416.1 B | Sep 17 | Conference |
| ZM | Zoomtopia | $24.4 B | Sep 17 | Conference |
| NHC | New Residential Construction | — | Sep 18 | Economic Event |
| INTU | Intuit Annual Investor Day | $184.7 B | Sep 18 | Investor Meeting |

Availability This Week

I’ll be in San Francisco, attending HubSpot Analyst Day and meeting with companies and investors.

Written reports are available to clients, with video summaries on YouTube, and of course our popular summaries of the summaries on Instagram, TikTok, and YouTube Shorts.

Are these notes helpful to you? Suggestions? I’d love to discuss them further and, as always, comments, questions, and ideas are appreciated.

The information contained here is provided for informational purposes only and should not be construed as legal, financial, or professional advice. While we strive to ensure the accuracy and reliability of the information presented, we make no warranties or representations as to its completeness or accuracy.

This note and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error, please notify the sender immediately and delete this email from your system. Any unauthorized use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. We do not endorse or guarantee the content herein and have no obligation to update or correct any information that may be found to be inaccurate or incomplete. The full context and additional information may be necessary for a complete understanding of this communication, which may be known only to the intended recipient.

This is not a recommendation or solicitation to buy or sell securities. Any investment decisions should be made in consultation with a qualified financial advisor and based on your own research and judgment.

We may retain and archive copies of written communications, including emails, indefinitely. This may include this note and any replies to it. By reading and acting upon the contents of this email, you acknowledge and agree to the terms outlined in this disclaimer. If you do not agree with these terms, please notify the sender immediately and delete this note.
