Saturday, July 4, 2026
BCN.

Anthropic opens early talks with Samsung to build its own inference chip

The talks are exploratory, with no design, no prototype, and no timeline. The things worth watching are Samsung's 2nm yields and the unverified efficiency numbers driving the whole custom-silicon race.

Janet Torvalds

July 4, 2026

Anthropic has opened preliminary talks with Samsung Electronics about manufacturing a custom AI chip, according to The Information, whose report was picked up by Bloomberg and TechCrunch on July 2. Before the narrative gets ahead of the facts, it is worth stating what this is: a conversation, not silicon.

The talks are exploratory. Three people familiar with them told The Information that Anthropic has not decided what the chip would be optimized for, how powerful it should be, or how it would slot into a server rack. No prototype has been built. There is no manufacturing timeline. Anthropic's public line is that Amazon's Trainium, Google's TPUs, and Nvidia's GPUs stay central to how it scales compute, and it declined to comment on custom silicon beyond that.

So why does a set of undefined talks matter? Because of who Anthropic hired and what problem the chip would solve.

What we actually know

  • The talks were reported July 2, 2026, and are at the specification stage: process, power budget, and cluster layout are still being defined.
  • In early June, before the report surfaced, Anthropic hired Clive Chan, the second engineer to join OpenAI's dedicated chip team. He spent two and a half years on the Broadcom-designed inference accelerator OpenAI unveiled as "Jalapeño" on June 24.
  • Samsung was a strategic infrastructure partner in Anthropic's $65 billion Series H round in May 2026, alongside SK Hynix and Micron.
  • The Samsung discussions are not exclusive. Anthropic is also in conversations with Microsoft and the UK startup Fractile. Google is separately talking to Samsung about future TPU production.

The Chan hire is the tell. You do not recruit the person who helped stand up a rival's accelerator program unless you intend to build one yourself.

Why Samsung, specifically

Three chipmakers joined Anthropic's Series H as strategic infrastructure partners. Only one of them runs a foundry. SK Hynix and Micron make memory. Samsung makes memory too, but its foundry division also fabricates other companies' chip designs in its own plants. That is the capability Anthropic would need, and it is the reason Samsung is a plausible manufacturing partner while the other two are not.

The discussions reportedly center on Samsung's SF2 process, its 2nm-class node, plus its advanced packaging. SF2 uses gate-all-around (GAA) nanosheet transistors, a structural change from the FinFET transistors that dominated the previous decade. GAA wraps the gate around the channel on all four sides instead of three, which tightens electrical control over the current. Samsung's own framing for the generation is roughly 15 percent better performance at the same power, or lower power at the same performance, against the prior node. Samsung began volume SF2 production in late 2025.

Packaging matters as much as the transistor. A modern AI accelerator is rarely one die. It stitches together logic, memory interfaces, and networking silicon in a single package using 2.5D and 3D stacking, and the reported talks cover the packaging architecture, not just the process node.

Inference is the prize, not training

This is where the engineering logic gets specific, and where a custom chip either earns its cost or does not.

Training and inference are different problems. Training a frontier model means sustaining enormous throughput across huge datasets, and that is where Nvidia's GPUs remain dominant with no credible near-term replacement at scale. Inference is the other job: serving Claude's responses to users, over and over, at low latency and low cost per query.

When your dominant workload is a single class of operations, transformer serving, a purpose-built ASIC can strip out the general-purpose machinery a GPU carries and spend every transistor on that one task. That is the entire economic case for custom inference silicon. It is also why Chan's background, on an inference accelerator, points at what Anthropic would build first.

The number that deserves scrutiny

The figure driving this whole race is the claim that custom inference chips cut cost by about half. On Jalapeño, Broadcom's CEO told Bloomberg early testing showed roughly 50 percent cost savings versus typical AI GPUs for inference.

Treat that as what it is: a vendor's number, from early testing, with no published methodology and no independent measurement at production scale. Fifty percent against which GPU, at what utilization, counting what in the cost? The direction is believable and the mechanism is real. The specific figure is a marketing input until someone shows the workload and the math. Anthropic would be chasing the same kind of savings on Claude's traffic, and it would be chasing an unverified target.
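Those open questions can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not anyone's published methodology: the function names, the hourly costs, and the throughput figures are all hypothetical, chosen only to show how much the headline saving moves when utilization changes.

```python
def cost_per_million_tokens(hourly_cost: float,
                            peak_tokens_per_sec: float,
                            utilization: float) -> float:
    """Serving cost per one million tokens for a single accelerator.

    hourly_cost: all-in cost of running the part for one hour (what goes
        into this number is exactly the "counting what?" question).
    peak_tokens_per_sec: throughput when the part is fully busy.
    utilization: fraction of peak throughput actually achieved, in (0, 1].
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_sec * utilization * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


def savings(baseline_cost: float, candidate_cost: float) -> float:
    """Fractional saving of the candidate relative to the baseline."""
    return 1.0 - candidate_cost / baseline_cost


# Hypothetical inputs for illustration only; none come from the reporting.
GPU_HOURLY, GPU_PEAK = 4.00, 10_000.0    # assumed GPU cost/hour and tokens/sec
ASIC_HOURLY, ASIC_PEAK = 2.50, 12_000.0  # assumed custom-chip cost/hour and tokens/sec

gpu = cost_per_million_tokens(GPU_HOURLY, GPU_PEAK, utilization=0.6)

# Same comparison, two different utilization assumptions for the custom chip.
for asic_util in (0.6, 0.3):
    asic = cost_per_million_tokens(ASIC_HOURLY, ASIC_PEAK, utilization=asic_util)
    print(f"custom chip at {asic_util:.0%} utilization: "
          f"saving vs GPU = {savings(gpu, asic):.1%}")
```

With these made-up inputs, the saving is about 48 percent when both parts run at the same utilization and turns slightly negative when the custom chip runs at half the GPU's utilization. The hardware is identical in both cases. Only the assumptions changed, which is why a savings figure without its methodology says little.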

Samsung's yield problem does not go away

The reason this is talks and not a signed deal runs partly through Samsung's fabs. Yield is the share of chips on a wafer that pass testing, and it sets the per-chip cost and how reliably supply shows up.

Samsung's first-generation SF2 was reported at yields around 50 to 60 percent through much of 2025, below the 70 to 80 percent range analysts treat as economically healthy for high volume. TSMC's competing N2 process has reportedly run in the 65 to 80 percent range in volume production with anchor customers. Samsung's SF2P, the performance-tuned second iteration, is reportedly near 70 percent as of early 2026. Whether that holds at commercial volume is unproven. For a customer that cannot tolerate chips arriving late or at unpredictable cost, that uncertainty is a real line item, not a footnote.
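The link between yield and per-chip cost is simple arithmetic: the wafer costs the same whether its dies pass or fail, so the whole bill is spread over the good ones. A minimal Python sketch, assuming a hypothetical wafer price and die count (neither figure is from the reporting, and real cost models also include packaging, test, and memory):

```python
def cost_per_good_die(wafer_cost: float,
                      dies_per_wafer: int,
                      yield_fraction: float) -> float:
    """Wafer cost divided across only the dies that pass testing."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    good_dies = dies_per_wafer * yield_fraction
    return wafer_cost / good_dies


# Hypothetical inputs for illustration only.
WAFER_COST = 20_000.0  # assumed cost per processed wafer, USD
DIES_PER_WAFER = 60    # assumed candidate dies per wafer for a large accelerator die

# Yield levels spanning the ranges reported in the article.
for y in (0.50, 0.60, 0.70, 0.80):
    cost = cost_per_good_die(WAFER_COST, DIES_PER_WAFER, y)
    print(f"yield {y:.0%}: ${cost:,.0f} per good die")
```

Because cost per good die scales with one over yield, the ratio between two yield levels does not depend on the assumed wafer price: moving from 50 to 70 percent yield cuts the cost per good die by about 29 percent under any wafer cost.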

The context around it

Nvidia still holds roughly 74 percent of the AI chip market by The Information's estimate, a share that has, if anything, grown as training demand outpaced every alternative. Custom chips from Anthropic, OpenAI, Google, and Amazon are not displacing Nvidia. They are carving out the inference slice where the economics of a specialized part are most favorable.

There is also an opening. Samsung had been building a custom inference chip for OpenAI before talks between the two companies stalled in early June over what Korean outlets described as strategic differences. If Samsung points that freed-up 2nm attention at Anthropic, both sides gain: Anthropic gets a foundry with fresh, directly relevant AI-chip experience, and Samsung gets a marquee logic customer at a moment when it is trying to close the gap with TSMC.

None of that is a chip yet. It is a hire, a process node, and a cost curve that custom silicon might bend. The thing to watch is not the announcement. It is whether Samsung's yields and the real, measured efficiency numbers ever justify the move from conversation to tape-out.
