Epistrophy Week Ahead

The Week Of September 1, 2025

September arrives with conference season in full swing. Broadcom (AVGO: NASDAQ) and Zscaler (ZS: NASDAQ) report earnings this week, offering early signs of where enterprise spending may head into the fall. Last week was about Nvidia (NVDA: NASDAQ), a company now central not just to markets but to geopolitics. This week, attention turns to whether the broader ecosystem—semiconductors, cloud security, infrastructure—keeps pace with Nvidia’s momentum.

Have you checked out our website? Visit the growing repository of past notes and our searchable research database at epistrophy.beehiiv.com.

As always, I’m focused on three things:
1) technology-driven change;
2) the latest in innovation and startup trends; and
3) stock fraud.

Companies Discussed

| Ticker | Name | Market Cap ($B) | Price |
| --- | --- | --- | --- |
| NVDA | NVIDIA | $4,283.72 | $174.18 |
| AMD | Advanced Micro Devices | $261.65 | $162.63 |
| INTC | Intel | $106.24 | $24.35 |
| GOOG | Alphabet | $2,528.31 | $213.53 |
| — | Positron | [private] | — |
| AMZN | Amazon.com | $2,416.11 | $229.00 |
| — | Groq | [private] | — |

In This Note:

“Rules of Inference” by Mel Bochner, 1974
Source: MoMA

NVIDIA’s Ferrari Problem

Does everyone need a sports car in the AI garage? 

NVIDIA’s earnings this week were another spectacle. The company posted record revenue and raised full-year guidance—but what really stood out in Wednesday’s Q2 FY2026 call wasn’t the topline growth; it was how much of the section on inference sounded like defense.

CEO Jensen Huang said, “Blackwell’s rack-scale NVLink and CUDA full-stack architecture address this by redefining the economics of inference.” For clarity: Blackwell NVL72 is NVIDIA’s rack-scale domain that connects 72 Blackwell GPUs and 36 Grace CPUs via NVLink, forming a tightly coupled, high-bandwidth “super-GPU.” CFO Colette Kress rounded out the pitch: invest three million dollars in one GB200 rack, she claimed, and you could generate thirty million dollars in token revenue—a tenfold return.
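
Kress's pitch is easy to sanity-check with back-of-envelope arithmetic. A quick sketch; the serving price used below is a hypothetical assumption for illustration, not a figure from the call:

```python
# Sanity-checking the CFO's pitch: a $3M GB200 rack generating $30M in
# token revenue is a 10x gross return before power, cooling, and opex.

RACK_COST = 3_000_000        # USD, per Kress's figure
TOKEN_REVENUE = 30_000_000   # USD, claimed token revenue per rack

gross_multiple = TOKEN_REVENUE / RACK_COST
print(f"Gross return multiple: {gross_multiple:.0f}x")

# How many tokens does $30M imply at a hypothetical serving price?
PRICE_PER_MILLION_TOKENS = 2.00  # USD; illustrative assumption only
implied_tokens = TOKEN_REVENUE / PRICE_PER_MILLION_TOKENS * 1_000_000
print(f"Implied tokens served: {implied_tokens:.2e}")
```

The interesting question is not the 10x multiple itself but the volume it implies: at any plausible per-token price, the claim assumes sustained, near-saturated demand for each rack.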

Notably, nobody asked whether Blackwell was overbuilt for inference. NVIDIA seemed to preempt the question.

Training vs. Inference — Follow the Money

This matters because inference already leads the compute and energy budgets—and the gap is growing.

  • Google reports that roughly 60 percent of its ML energy use is inference, not training.

  • AWS estimates that 80 to 90 percent of its cloud ML demand is inference.

  • Meta’s internal footprint shows a similar split.

Meanwhile, the Stanford AI Index Report 2025 (released in April) confirms that the cost to query a GPT-3.5-class model dropped more than 280-fold between 2022 and 2024, pushing consumption even higher. On this week’s conference call, Huang himself acknowledged that “AI inference token generation has surged tenfold in just one year, and as AI agents become mainstream, the demand for AI computing will accelerate.”

Inference is not just the tail anymore—it’s the dog.
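
Those two figures compound: a quick sketch of the annualized cost decline implied by the AI Index number, assuming (for illustration only) a smooth two-year decay:

```python
# The ">280-fold" drop in GPT-3.5-class query cost between 2022 and
# 2024 implies a steep annualized decline if spread evenly.

total_drop = 280   # fold reduction over the period, per the report
years = 2

annual_factor = total_drop ** (1 / years)
print(f"Annualized cost reduction: ~{annual_factor:.1f}x per year")

# Jevons-style intuition: each ~17x annual drop in per-query cost
# invites more usage, which is consistent with Huang's observation
# that token generation rose tenfold in a year.
```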

NVIDIA’s approach to inference is quite different from its competitors’.

The Ferrari vs. the Workhorse

Blackwell NVL72 is a liquid-cooled, rack-scale system whose 72 Blackwell GPUs and 36 Grace CPUs are interconnected by NVIDIA’s largest NVLink network to date, acting as a single giant GPU for extreme AI and high-performance computing workloads. It’s an engineering beast: Ferrari-class compute, brilliant for training and extreme-scale inference runs.

But most inference workloads are bursty, millisecond-latency requests—far better suited to workhorses: smaller, cheaper accelerators running quantized models.
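
To see why quantization matters for the workhorse tier, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization (illustrative only; production schemes use per-channel scales, lower bit-widths, and calibration data):

```python
import numpy as np

# Quantize a float32 weight matrix to int8: 4x less memory to move,
# at the cost of a small, bounded reconstruction error.

rng = np.random.default_rng(0)
weights = rng.standard_normal((4096, 4096)).astype(np.float32)

scale = np.abs(weights).max() / 127.0          # map value range to int8
q_weights = np.round(weights / scale).astype(np.int8)
dequantized = q_weights.astype(np.float32) * scale

savings = weights.nbytes / q_weights.nbytes    # 4 bytes -> 1 byte
max_err = np.abs(weights - dequantized).max()  # bounded by ~scale/2
print(f"Memory reduction: {savings:.0f}x")
print(f"Max reconstruction error: {max_err:.4f}")
```

Since inference is usually memory-bandwidth-bound, moving 4x fewer bytes per weight translates almost directly into cheaper, faster serving on smaller hardware.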

Rivals are noticing. AMD’s MI300 (AMD: NASDAQ) and Intel’s Gaudi (INTC: NASDAQ) are positioned as inference-friendly alternatives. Hyperscalers are rolling their own: Google’s TPU v5e, tuned for large-scale inference inside Google Cloud (GOOG: NASDAQ), and Amazon’s Inferentia2, designed to lower the total cost of ownership for transformer workloads across AWS.

There are also private innovation plays. Positron AI (full disclosure: I’m an early investor in the company) claims its Atlas accelerator delivers 280 tokens per second per user for Llama 3.1 8B using just 2,000 watts—versus NVIDIA’s 180 tokens per second at 5,900 watts, making the Atlas roughly three times more power- and cost-efficient. Another is Groq, a well-funded startup whose Language Processing Unit chips are purpose-built for inference. They promise ultrafast, LLM-native performance with an inference-first ASIC design.
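
Taking Positron’s quoted numbers at face value, the tokens-per-watt arithmetic is easy to check. Note that the “roughly three times” headline presumably blends power and capital cost, since power efficiency alone works out higher:

```python
# Raw tokens-per-watt from the figures as quoted (company claims,
# not independently verified).

atlas_tps, atlas_watts = 280, 2_000      # Positron's claimed Atlas figures
nvidia_tps, nvidia_watts = 180, 5_900    # comparison figures they cite

atlas_tpw = atlas_tps / atlas_watts
nvidia_tpw = nvidia_tps / nvidia_watts
ratio = atlas_tpw / nvidia_tpw
print(f"Atlas:  {atlas_tpw:.3f} tokens/sec/W")
print(f"NVIDIA: {nvidia_tpw:.3f} tokens/sec/W")
print(f"Power-efficiency ratio: {ratio:.1f}x")
```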

Inference Hardware: Ferrari vs. Workhorse

| Platform | Architecture Focus | Strengths | Weaknesses |
| --- | --- | --- | --- |
| NVIDIA Blackwell (NVL72) | Rack-scale GPU system, 72 GPUs + 36 CPUs | Peak throughput; frontier training and extreme inference; mature CUDA ecosystem | High power and capital costs; inefficient for small-batch inference |
| AMD MI300 | GPU/CPU hybrid with high memory bandwidth | Lower cost; solid inference performance; ROCm support | Smaller ecosystem; weaker training performance |
| Intel Gaudi | Custom inference accelerator | Low cost per inference; Ethernet scale-out; efficient | Limited adoption; weaker peak performance |
| Google TPU v5e | Cloud-native inference ASIC | Efficient serving at scale; strong Google Cloud stack | Proprietary to GCP |
| AWS Inferentia2 | AWS inference chip | Cost-effective on AWS; transformer-optimized | AWS-only; narrow developer ecosystem |
| Positron AI (Atlas) | Startup inference accelerator | Higher tokens per watt vs. H200; inference-first design; FPGA prototyping accelerates time-to-market | Early stage; ecosystem still forming |
| Groq LPU | Inference-first ASIC | Optimized for LLM inference; high speed | Private; limited ecosystem |

Why NVIDIA Sounds Defensive

That is why the call’s tone matters. Huang and Kress didn’t just market Blackwell—they defended it. They answered a question most investors haven’t fully asked: is Blackwell too much for inference? By reframing inference economics, they tried to make the Ferrari seem reasonable for everyday driving.

But the world is full of commutes—not track days.

Reasoning and agentic AI increase test-time token consumption, translating into more per-request compute—but that also intensifies the need for tokens-per-dollar efficiency, not just raw rack-scale horsepower. Cloud AI spending trends show inference budgets now exceed training by two to three times at major hyperscalers, with momentum only increasing in 2025. That is not a forecast; it is already visible in infrastructure accounting.
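
To make “tokens per dollar” concrete, here is a sketch of amortized serving cost. Every input below is a hypothetical assumption chosen for illustration, not a quote from any vendor:

```python
# Amortized hardware + power cost per 1M output tokens for a serving
# deployment. All inputs are illustrative assumptions.

def cost_per_million_tokens(capex, lifetime_years, watts,
                            usd_per_kwh, tokens_per_sec,
                            utilization=0.5):
    """Capex plus lifetime electricity, divided by lifetime tokens."""
    seconds = lifetime_years * 365 * 24 * 3600
    tokens = tokens_per_sec * utilization * seconds
    power_cost = watts / 1000 * usd_per_kwh * seconds / 3600
    return (capex + power_cost) / tokens * 1_000_000

# Hypothetical "Ferrari" rack vs. hypothetical "workhorse" accelerator:
ferrari = cost_per_million_tokens(3_000_000, 4, 120_000, 0.08, 50_000)
workhorse = cost_per_million_tokens(25_000, 4, 2_000, 0.08, 1_000)
print(f"Rack:      ${ferrari:.2f} per 1M tokens")
print(f"Workhorse: ${workhorse:.2f} per 1M tokens")
```

The point of the sketch is not the specific outputs but the structure: throughput, utilization, and power dominate the denominator, so a cheaper chip that stays busy can undercut a faster one that idles.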

If the world bifurcates—Ferraris for frontier labs, workhorses for the rest—frontier systems like Blackwell will stay essential. But most customers will prioritize efficiency and economics.

Tweet O’ The Week

Epistrophy In The News

On NewsNation, I explained why Nvidia matters more than any other company right now—and why its relationship with China is at the center of U.S. technology policy. Washington wants to sell cutting-edge tools but not arm its rivals; Beijing wants access without dependence. Nvidia sits at the hinge of that uneasy trade-off, its chips both powering America’s AI boom and tempting adversaries.

📆 Calendar of Epistrophy Events

| Ticker | Name | Market Cap | Expected Date | Type |
| --- | --- | --- | --- | --- |
| 🎉 | Labor Day | — | Sep 1 | Market Holiday |
| ZS | Zscaler | $42 B | Sep 2 | Earnings |
| CSP | Construction Spending | — | Sep 2 | Economic Event |
| HPE | Hewlett Packard Enterprise | $29 B | Sep 3 | Earnings |
| AI | C3.ai | $2 B | Sep 3 | Earnings |
| CRM | Salesforce | $241 B | Sep 3 | Earnings |
| HUBS | INBOUND | $25 B | Sep 3 | Conference |
| HUBS | HubSpot 2025 Analyst Day | $25 B | Sep 3 | Analyst Meeting |
| AVGO | Broadcom | $1,361 B | Sep 4 | Earnings |
| DOCU | Docusign | $15 B | Sep 4 | Earnings |
| U3 | Unemployment Rate | — | Sep 5 | Economic Event |
| RBRK | Rubrik | $16 B | Sep 9 | Earnings |
| SNPS | Synopsys | $110 B | Sep 9 | Earnings |
| AAPL | Hardware Launch | $3,414 B | Sep 9 | Launch Event |
| ADBE | Adobe | $152 B | Sep 11 | Earnings |
| BOX | BoxWorks | $5 B | Sep 11 | Conference |
| PPI | Producer Price Index | — | Sep 12 | Economic Event |
| UMCSENT | U. of Mich. Consumer Sentiment | — | Sep 12 | Economic Event |
| INTC | Intel Innovation | $106 B | Sep 15 | Conference |
| CPI | Consumer Price Index | — | Sep 16 | Economic Event |
| AMZN | Amazon Accelerate 2025 | $2,416.1 B | Sep 16 | Conference |
| FOMC | Federal Open Market Committee Meeting | — | Sep 17 | Economic Event |
| SNAP | Snap Partner Summit | $11.9 B | Sep 17 | Conference |
| — | TikTok Ban | — | Sep 17 | — |
| AMZN | AWS Summit LA | $2,416.1 B | Sep 17 | Conference |
| ZM | Zoomtopia | $24.4 B | Sep 17 | Conference |
| NHC | New Residential Construction | — | Sep 18 | Economic Event |
| INTU | Intuit Annual Investor Day | $184.7 B | Sep 18 | Investor Meeting |

Availability This Week

I’ll be in San Francisco, attending HubSpot Analyst Day and meeting with companies and investors.

Written reports are available to clients, with video summaries on YouTube, and of course our popular summaries of the summaries on Instagram, TikTok, and YouTube Shorts.

Are these notes helpful to you? Suggestions? I’d love to discuss them further and, as always, comments, questions, and ideas are appreciated.

The information contained here is provided for informational purposes only and should not be construed as legal, financial, or professional advice. While we strive to ensure the accuracy and reliability of the information presented, we make no warranties or representations as to its completeness or accuracy.

This note and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error, please notify the sender immediately and delete this email from your system. Any unauthorized use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. We do not endorse or guarantee the content herein and have no obligation to update or correct any information that may be found to be inaccurate or incomplete. The full context and additional information may be necessary for a complete understanding of this communication, which may be known only to the intended recipient.

This is not a recommendation or solicitation to buy or sell securities. Any investment decisions should be made in consultation with a qualified financial advisor and based on your own research and judgment.

We may retain and archive copies of written communications, including emails, indefinitely. This may include this note and any replies to it. By reading and acting upon the contents of this email, you acknowledge and agree to the terms outlined in this disclaimer. If you do not agree with these terms, please notify the sender immediately and delete this note.
