Jensen Huang says 'every edge device will become autonomous' — Nvidia maps one computing pattern from the cloud to robotics
(Image credit: Getty Images / Anadolu)
When not being spotted at night markets or meeting crowds of adoring fans, the hardware industry’s biggest celebrity, Nvidia CEO Jensen Huang, spent most of his time at Computex this week making the case that computing as we know it is collapsing into one repeatable pattern built for AI agents; a blueprint that now runs across the cloud, the PC, the car, and the robot.
"There's a new computing pattern," the Nvidia CEO told reporters at a press gaggle the day after his GTC Taipei keynote, describing an agent architecture he calls a harness that orchestrates reasoning, memory, and tool use the same way whether it sits in a data center or a laptop.
He tied that claim to every product Nvidia detailed at the show, from the Vera data-center CPU now in full production to RTX Spark, its first Windows PC platform, shipping in laptops this fall.
One pattern, every machine
Huang told the room that he repeats the same keynote structure on purpose. "Every time I give you a keynote, it's like Top Gun 17, and it's exactly the same architecture," he said, "because I want you to know that the future of computing is this." The pattern begins with training and inference in the cloud and pushes outward to everything else: "Every edge device will become autonomous. Every edge device will have agentic systems."
He ran that blueprint through self-driving cars, humanoid robots, Nokia base stations, and imaging satellites, casting each as the same agent profile on different hardware. Curiously, the self-driving car got quite a bit of airtime, with Huang describing Nvidia's Alpamayo driving stack as a system that reasons in language rather than reacting to images, one that could read a "skill file" and watch a tutorial video to operate unfamiliar machinery the way a person would. "That's how autonomous vehicles are going to work in the future," he said. "It's essentially that agentic computing pattern with a physical AI model."
A CPU that generates tokens, not cores
Vera, on the data center side, is an 88-core Arm processor that Nvidia is now in full production with, pitching it as a chip built for agents rather than human users. "We built Vera for agents to use," Huang said. "Until six months ago, there were no agents, so that's the definition of a $0 billion market."
A hyperscale CPU piles on cores because humans lease them by the hundred, where an agent, Huang argued, "doesn't want to rent the CPU core, the agent wants to generate tokens." That pushed Nvidia toward single-thread speed and memory bandwidth over core count, and Huang claimed Vera offers the largest step up in single-threaded performance he has seen "in 25 years." His reasoning ties back to latency: "Humans are more patient than agents. Agents, they're working at nanosecond scale, not second scale."
Nvidia claims 1.8 times faster task completion than x86 and a 1.5 times instructions-per-clock gain over its Grace predecessor, with a 256-chip liquid-cooled Vera rack it says reaches six times the throughput of a conventional CPU rack. The chip ships on the back of nearly 2.5 million Grace units sold, and Anthropic, OpenAI, xAI, ByteDance, CoreWeave, and Oracle are named as early customers. CFO Colette Kress told investors on Nvidia's latest earnings call that the company sees "nearly $20 billion in total CPU revenue this year".
Phoronix's first public Vera benchmarks in May measured it roughly 10% ahead of AMD's 64-core EPYC 9575F and about 55% ahead of Intel's 128-core Xeon 6980P across selected Linux workloads. Nvidia ran those tests on pre-production silicon at its own headquarters, limited them to workloads it considers relevant, and, by Phoronix's account, switched off CPU power and frequency monitoring for the session.
Reinventing the PC after 40 years
As for RTX Spark, Huang says that it’s the first real rethink of the PC in four decades. "We have an opportunity after 40 years to go reinvent it for the age of AI," he said, predicting the machine shifts "from your PC being a tool to now really your PC being your system." He pushed even further: "Your laptop is going to be your R2-D2."
The top RTX Spark part, internally N1X, pairs a 20-core Arm CPU built by MediaTek (10 Cortex-X925 performance cores and 10 Cortex-A725 efficiency cores) with a Blackwell GPU carrying 6,144 CUDA cores, up to 128GB of LPDDR5X unified memory, and a 600 GB/s NVLink-C2C link, all on TSMC's 3nm node. Huang justified these specs with the same impatience he applied to Vera, arguing that an agent driving the machine won’t wait, so the software it touches, from Adobe to Blender, "cannot be slow."
The platform is launching in a market that Qualcomm had effectively dominated until its Windows on Arm exclusivity with Microsoft lapsed. Fall 2026 laptops are confirmed from Microsoft, Dell, HP, ASUS, Lenovo, and MSI, with Acer and Gigabyte to follow, and Nvidia says anti-cheat engines, including Easy Anti-Cheat and Denuvo, run natively on the chip. Asked why Nvidia would enter a low-margin business it has steered clear of for years, Huang said, "We don't really have to choose. The real question is, can we make a contribution?"
Vera's 88 cores are Nvidia's own custom Olympus design, its first ground-up server core since the Denver and Carmel projects, while RTX Spark's 20 cores are Arm's off-the-shelf Cortex reference designs licensed through MediaTek, one of them already a generation old. Huang's "same pattern everywhere" runs, at the silicon level, on two different CPUs.
When asked whether the Olympus cores would come to Windows PCs, Huang declined to commit. "Our preference is to use off-the-shelf cores whenever we can, because Arm also builds good cores," he said, adding that Olympus was pushed toward single-thread speed in a way standard many-core Arm parts weren’t: "We wanted to push single-threaded performance as far as we could push it." The first PC chip using Nvidia's own cores isn’t expected until 2028. Meanwhile, Morgan Stanley estimates Vera at around $5,000 per socket inside a vertically integrated rack.
What about memory?
DRAM contract prices have climbed sharply through 2026 as makers divert wafers to high-bandwidth memory, and Nvidia remains short of supply even as it locks in capacity, by Huang's own account: "We have enough supply for very robust growth. However, we are supply constrained."
"One of the best ways to improve memory use is to use extremely, extremely low precision," Huang said, pointing to NVFP4, Nvidia's 4-bit floating-point format that scales between four, eight, 16, and 32 bits and roughly doubles the parameters that fit in a given memory pool, the trick that lets RTX Spark hold larger models in its 128GB. He paired it with neural texture compression that cuts game texture memory by up to eight times in Nvidia's demos. At SK hynix's booth during the show, Huang signed an HBM4E wafer with the words "Please Make More."
Luke James is a freelance writer and journalist. Although his background is in legal, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
What's Your Reaction?
Like
0
Dislike
0
Love
0
Funny
0
Wow
0
Sad
0
Angry
0
Comments (0)