Jensen Huang – Will Nvidia’s moat persist?
This is a professionally copyedited transcript of Jensen Huang’s conversation with Dwarkesh Patel. It has been edited for readability and lightly formatted while preserving the full substance and structure of the discussion.
Full video: https://www.youtube.com/watch?v=Hrbq66XqtCo
In this conversation, Jensen Huang discusses Nvidia’s supply-chain position, why he thinks accelerated computing is harder to commoditize than people assume, why he still sees programmability and ecosystem depth as the real moat behind CUDA, why Nvidia chooses not to become a hyperscaler, and why he thinks US chip policy toward China risks conceding developers, standards, and long-run stack leverage.
Episode Guide
0:00 Is Nvidia’s biggest moat its grip on scarce supply chains?
16:25 Will TPUs break Nvidia’s hold on AI compute?
41:06 Why doesn’t Nvidia become a hyperscaler?
57:36 Should we be selling AI chips to China?
1:35:06 Why doesn’t Nvidia make multiple different chip architectures?
Transcript
00:00-00:32
Dwarkesh Patel: We've seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software. There's a potentially naive way of thinking about things, which is: look, Nvidia sends a GDSII file to TSMC. TSMC builds the logic dies and the switches, then packages them with the HBM that SK Hynix, Micron, and Samsung make. Then it sends it to an ODM in Taiwan where they assemble the racks. Nvidia is fundamentally making software that other people are manufacturing. If software gets commoditized, does Nvidia get commoditized?
00:32-00:59
Jensen Huang: In the end, something has to transform electrons into tokens. The transformation of electrons to tokens—and making those tokens more valuable over time—is hard to completely commoditize.
00:59-01:38
Jensen Huang: The transformation from electrons to tokens is such an incredible journey. Making that token is like making one molecule more valuable than another. The amount of artistry, engineering, science, and invention that goes into making that token valuable—obviously, we're watching it happen in real time. The transformation, the manufacturing, and all of the science involved is far from deeply understood, and the journey is far from over.
01:38-02:16
Jensen Huang: I doubt that it will happen. We're going to make it more efficient, of course. The way that you framed the question is my mental model of our company. The input is electrons, the output is tokens, and in the middle is Nvidia. Our job is to do as much as necessary and as little as possible to enable that transformation to be done with incredible capabilities. What I mean by "as little as possible" is that whatever I don't need to do, I partner with somebody and make it part of my ecosystem.
02:16-02:46
Jensen Huang: If you look at Nvidia today, we probably have the largest ecosystem of partners, both in the upstream and downstream supply chain—all of the computer companies, application developers, and model makers. AI is a five-layer cake, if you will. We have ecosystems across all five layers. We try to do as little as possible, but the part that we have to do, as it turns out, is insanely hard.
02:46-03:22
Jensen Huang: I don't think that gets commoditized. In fact, I also don't think the enterprise software companies or the tool makers will be either. Most software companies today are tool makers. Some of them are not; some are workflow codification systems. But for a lot of companies, they're tool makers. For example, Excel is a tool, PowerPoint is a tool, Cadence makes tools, and Synopsys makes tools. I actually see the opposite of what most people see.
03:22-04:14
Jensen Huang: I think the number of agents is going to grow exponentially, and the number of tool users is going to grow exponentially. It's very likely that the number of instances of all these tools is going to skyrocket. It’s very likely that the number of instances of Synopsys Design Compiler will skyrocket, along with the number of agents using floor planners, layout tools, and design rule checkers. Today we're limited by the number of engineers. Tomorrow, those engineers are going to be supported by a bunch of agents. We're going to be exploring the design space like you've never seen before, and we're going to use the tools that we use today. I think tool use is going to cause software companies to skyrocket.
04:14-04:26
Jensen Huang: The reason why it hasn't happened yet is because the agents aren't good enough at using their tools yet. Either these companies are going to build the agents themselves, or agents are going to get good enough to be able to use those tools. I think it's going to be a combination of both.
04:26-05:01
Dwarkesh Patel: I think in your latest filings, you had almost $100 billion in purchase commitments with foundries, memory, and packaging. SemiAnalysis has reported that you will have $250 billion of these kinds of purchase commitments. One interpretation is that Nvidia's moat is really that you've locked up many years of these scarce components. Somebody else might have an accelerator, but can they actually get the memory to build it? Can they actually get the logic to build it? Is this really Nvidia's big moat for the next few years?
05:01-05:33
Jensen Huang: It's one of the things that we can do that is hard for someone else to do. We've made enormous commitments upstream. Some of it is explicit, like these commitments that you mentioned. Some of it is implicit. For example, a lot of the upstream investments are made by our supply chain because I said to the CEOs, "Let me tell you how big this industry is going to be, let me explain to you why, let me reason through it with you, and let me show you what I see."
05:33-06:11
Jensen Huang: As a result of that process of informing, inspiring, and aligning with CEOs of all different industries upstream, they're willing to make the investments. Why are they willing to make the investments for me and not someone else? The reason is that they know I have the capacity to buy their supply and sell it through my downstream. The fact is that Nvidia's downstream supply chain and our downstream demand are so large that they're willing to make the investment upstream.
06:11-06:48
Jensen Huang: If you look at GTC, people marvel at the scale of it and the people who attend. It's a full 360 degrees: the entire universe of AI all in one place. They're all in one place because they need to see each other. I bring them together so that the downstream can see the upstream, the upstream can see the downstream, and all of them can see the advances in AI.
06:48-07:22
Jensen Huang: Very importantly, they can all meet the AI natives—all the AI startups being built—and see firsthand all the amazing things happening so they can see for themselves what I tell them. I spend a lot of my time informing, directly or indirectly, our supply chain, partners, and ecosystem about the opportunity in front of us. Some people always say, "Jensen, in most keynotes, it's one announcement after another." With our keynotes, there’s always a part of it that's a little torturous in the sense that it almost comes across like education.
07:22-07:47
Jensen Huang: In fact, that's exactly what's on my mind. I need to make sure the entire supply chain, upstream and downstream, understands what is coming at us, why it's coming, when it's coming, and how big it's going to be. I want them to be able to reason about it systematically, just like I do.
07:47-08:32
Jensen Huang: Regarding the moat as you describe it, we're able to build for a future. If our next several years are a trillion dollars in scale, we have the supply chain to do it. Without our reach and the velocity of our business... Just as there's cash flow, there's supply chain flow and churn. Nobody is going to build a supply chain for an architecture if the business churn is low. Our ability to sustain the scale is only because our downstream demand is so great. And they see it, they hear about it, and they see it all coming. That allows us to do the things we're able to do at the scale we do them.
08:32-09:09
Dwarkesh Patel: I do want to understand more concretely whether the upstream can keep up. For many years now, you guys have been doubling revenue year over year. You've been more than tripling the amount of flops you're providing to the world year over year. Doubling at this scale now is really incredible. But then you look at logic. You're the biggest customer on TSMC's N3 node, and you're one of the biggest on N2. AI as a whole this year is going to be 60% of N3. It's going to be 86% next year, according to SemiAnalysis. How do you double if you're already the majority? And how do you do that year over year?
09:09-09:25
Dwarkesh Patel: Are we in a regime now where the growth rate in AI compute has to slow because of the upstream? Do you see a way to get around this? How do we build twice as many fabs year over year, ultimately?
09:25-09:52
Jensen Huang: At some level, the instantaneous demand is greater than the supply upstream and downstream in the world. At any instant, we could be limited by the number of plumbers—which actually happens.
09:52-10:05
Dwarkesh Patel: The plumbers should be invited to next year's GTC. By the way, that's a great idea.
10:05-10:30
Jensen Huang: But that's a good condition. You want an industry where the instantaneous demand is greater than the total supply. The opposite is obviously less good. If we're too far apart—if one particular component is too far away—the industry swarms it. For example, notice people aren't talking very much about CoWoS anymore. The reason for that is because for two years we swarmed the living daylights out of it. We doubled, doubled, and doubled again. Now I think we're in fairly good shape.
10:30-11:01
Jensen Huang: TSMC now knows that CoWoS supply has to keep up with the rest of the logic and memory demand. They're scaling CoWoS and future packaging technologies at the same level as they scale logic. This is terrific, because for a long time, CoWoS and HBM memory were rather specialty items. But they're not specialties anymore. People now realize they're mainstream computing technology. Of course, we're now much more able to influence a larger scope of our supply chain.
11:01-11:56
Jensen Huang: At the beginning of the AI revolution, all the things that I say now, I was saying five years ago. Some people believed in it and invested in it—for example, Sanjay and the Micron team. I still remember the meeting really well where I was clear about exactly what was going to happen, why it was going to happen, and the predictions of today. They really doubled down on it. We partnered with them across LPDDR and HBM memories, and they really invested in it. It obviously has been tremendous for the company. Some people came a little bit later, but now they're all here.
11:56-12:36
Jensen Huang: Each one of these bottlenecks gets a great deal of attention. Now we're prefetching the bottlenecks years in advance. For example, the investments that we've made with Lumentum, Coherent, and the silicon photonics ecosystem over the last several years really reshaped the supply chain. We built up an entire supply chain around TSMC. We partnered with them on COUPE, invented a whole bunch of technology, and licensed those patents to the supply chain to keep it nice and open.
12:36-13:00
Jensen Huang: We're preparing the supply chain through the invention of new technologies, new workflows, and new testing equipment like double-sided probing, investing in companies, and helping them scale up their capacity. You can see that we're trying to shape the ecosystem so that the supply chain is ready to support the scale.
13:00-13:04
Dwarkesh Patel: It seems like some bottlenecks are easier than others. Scaling up CoWoS versus scaling up—
13:04-13:14
Jensen Huang: I went to the hardest one, by the way.
13:14-13:26
Dwarkesh Patel: Which is?
13:26-13:43
Jensen Huang: Plumbers. Plumbers and electricians. This is one of the concerns that I have about the doomers describing the end of work and the killing of jobs. If we discourage people from being software engineers, we're going to run out of software engineers. The same prediction happened ten years ago. Some of the doomers were telling people, "Whatever you do, don't be a radiologist."
13:43-13:58
Jensen Huang: You might hear some of those videos still on the web saying radiology is going to be the first career to go and the world is not going to need any more radiologists. Guess what we're short of? Radiologists.
13:58-14:17
Dwarkesh Patel: Going back to this point about how some things you can scale and other things... How do you actually manufacture twice the amount of logic a year? Ultimately, memory and logic are bottlenecked by EUV. How do you get to twice as many EUV machines year over year?
14:17-14:36
Jensen Huang: None of that is impossible to scale quickly. All of that is easy to do within two or three years. You just need a demand signal. Once you can build one, you can build ten, and once you can build ten, you can build a million. These things are not hard to replicate.
14:36-14:46
Dwarkesh Patel: How far down the supply chain do you go? Do you go to ASML and say, "Hey, if I look out three years from now, for Nvidia to be generating two trillion a year in revenue, we need way more EUV machines"?
14:46-15:04
Jensen Huang: Some of them I have to talk to directly, some indirectly. If I can convince TSMC, ASML will be convinced. We have to think about the critical pinch points. But if TSMC is convinced, you'll have plenty of EUV machines in a few years.
15:04-15:36
Jensen Huang: My point is that none of the bottlenecks last longer than a couple of years—two or three years, tops. Meanwhile, we're improving computing efficiency by 10x or 20x, and in the case of Hopper to Blackwell, 30x to 50x. We're coming up with new algorithms because CUDA is so flexible. We're developing all kinds of new techniques so that we drive efficiency in addition to increasing capacity. None of those things worry me.
15:36-16:13
Jensen Huang: It's the stuff that's downstream from us. Energy policies that prevent energy from... You can't create an industry without energy. You can't create a whole new manufacturing industry without energy. We want to reindustrialize the United States. We want to bring back chip manufacturing, computer manufacturing, and packaging. We want to build new things like EVs and robots. We want to build AI factories. You can't build any of these things without energy, and those things take a long time. More chip capacity? That's a 2-3 year problem. More CoWoS capacity? 2-3 year problem.
16:13-16:23
Dwarkesh Patel: Interesting. I feel like I have guests tell me the exact opposite thing sometimes. In this case, I just don't have the technical knowledge to adjudicate.
16:23-16:32
Jensen Huang: The beautiful thing is you're talking to the expert.
16:32-16:47
Dwarkesh Patel: True. I want to ask about your competitors. If you look at the TPU, arguably two out of the top three models in the world, Claude and Gemini, were trained on TPUs. What does that mean for Nvidia going forward?
16:47-17:22
Jensen Huang: We build a very different thing. What Nvidia built is accelerated computing, not just a tensor processing unit. Accelerated computing is used for all kinds of things: molecular dynamics, quantum chromodynamics, data processing, data frames, structured data, and unstructured data. It's also used for fluid dynamics and particle physics. In addition, we use it for AI. Accelerated computing is much more diverse.
17:22-17:53
Jensen Huang: Although AI is the conversation today and is obviously very important and impactful, computing is much broader than that. Nvidia has reinvented the way computing is done, moving from general-purpose computing to accelerated computing. Our market reach is far greater than any TPU or ASIC can possibly have. If you look at our position, we're the only company that accelerates applications of all kinds. We have a gigantic ecosystem, so all kinds of frameworks and algorithms run on Nvidia.
17:53-18:31
Jensen Huang: Because our computers are designed to be operated by other people, anyone who's an operator can buy our systems. With most of these home-built systems, you have to be your own operator because they were never designed to be flexible enough for others to operate. Because anybody can operate our systems, we're in every cloud, including Google, Amazon, Azure, and OCI. If you want to operate it to rent, you better have a large ecosystem of customers in many industries to be the offtakers.
18:31-19:21
Jensen Huang: If you want to operate it for yourself, we obviously have the ability to help you do that, like we did for Elon with xAI. And because we can enable operators in any company and any industry, you could use it to build a supercomputer for scientific research and drug discovery at Eli Lilly. We can help them operate their own supercomputer and use it for the entire diversity of drug discovery and biological sciences that we accelerate. There are just a whole bunch of applications that we can address that you can't do with TPUs.
19:21-19:48
Jensen Huang: Nvidia built CUDA to be a fantastic tensor processing unit as well, but it also handles every life cycle of data processing, computing, AI, and so on. Our market opportunity is just a lot larger, and our reach is a lot greater. Because we support every application in the world now, you can build Nvidia systems anywhere and know that there will be customers for it. It's a very different thing.
19:48-20:19
Dwarkesh Patel: This is going to be a long question. You have spectacular revenue, and you're not making $60 billion a quarter from pharma and quantum. You're making it because AI is an unprecedented technology that is growing unprecedentedly fast. The question then is what is best for AI specifically. I'm not in the details, but I talk to my AI researcher friends and they say, "Look, when I use a TPU, it's this big systolic array that's perfect for doing matrix multiplies, whereas a GPU is very flexible. It's great when you have lots of branching or irregular memory access."
20:19-21:01
Dwarkesh Patel: But what is AI? It's just these very predictable matrix multiplies again and again and again. You don't have to give up any die area for warp schedulers or switches between threads and memory banks. And the TPU is really optimized for the bulk of this growth in revenue and use case for compute that is coming online right now. I wonder how you react to that.
21:01-21:38
Jensen Huang: Matrix multiplies are an important part of AI, but they're not the only part. If you want to come up with a new attention mechanism, disaggregate in a different way, or invent a whole new type of architecture altogether—like a hybrid SSM—you want an architecture that's generally programmable. If you want to create a model that fuses diffusion and autoregressive techniques, you want an architecture that’s just generally programmable.
21:38-22:04
Jensen Huang: We run everything you can imagine. That's the advantage. It allows for the invention of new algorithms a lot more easily because it's a programmable system. The ability to invent new algorithms is really what makes AI advance so quickly. TPUs, like anything else, are impacted by Moore's Law, which we know is increasing by about 25% per year.
22:04-22:53
Jensen Huang: The only way to really get 10x or 100x leaps is to fundamentally change the algorithm and how it's computed every single year. That's Nvidia's fundamental advantage. The only reason we were able to make Blackwell 50x better than Hopper... When I first announced Blackwell was going to be 35x more energy efficient than Hopper, nobody believed it. Then Dylan wrote an article saying I sandbagged, and it's actually fifty times. You can't reasonably do that with just Moore's Law.
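Editor's note: the compounding arithmetic behind this point can be checked with a short sketch. It takes the roughly 25% per year figure quoted above at face value and asks how many years of that rate alone would be needed to reach a given speedup. The function name `years_to_reach` is illustrative and does not come from the conversation.

```python
import math


def years_to_reach(target_speedup: float, annual_gain: float = 0.25) -> float:
    """Years of compounding at `annual_gain` needed to reach `target_speedup`.

    Solves (1 + annual_gain) ** years == target_speedup for years.
    The 25% default is the per-year figure quoted in the conversation.
    """
    if target_speedup <= 0 or annual_gain <= 0:
        raise ValueError("target_speedup and annual_gain must be positive")
    return math.log(target_speedup) / math.log(1.0 + annual_gain)


for target in (10, 35, 50):
    years = years_to_reach(target)
    print(f"{target}x takes about {years:.1f} years at 25% per year")
```

Under that single assumption, a 10x gain takes roughly a decade and a 50x gain roughly 17 to 18 years, which is consistent with the claim that a jump of that size within one generation has to come from algorithm and system changes rather than process scaling alone.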
22:53-23:36
Jensen Huang: The way we solve that problem is with new models, like MoEs, that are parallelized, disaggregated, and distributed across a computing system. Without the ability to really get down and come up with new kernels with CUDA, it's really hard to do. It's the combination of the programmability of our architecture and the fact that Nvidia is an extreme co-design company. We can even offload some of the computation into the fabric itself, like NVLink, or into the network with Spectrum-X.
23:36-23:54
Jensen Huang: We could effect change across the processors, the system, the fabric, the libraries, and the algorithm simultaneously. Without CUDA to do that, I wouldn't even know where to start.
23:54-24:09
Dwarkesh Patel: My sponsor Crusoe was among the first clouds to offer NVIDIA’s Blackwell and Blackwell Ultra platforms. And they just announced their NVIDIA Vera Rubin deployment scheduled for later this year. But access to state-of-the-art hardware is only part of the story. For example, most inference engines already do KV caching for a single user's forward passes.
24:09-24:27
Dwarkesh Patel: But Crusoe does it across users and GPUs. So if a thousand agents are running on the same system prompt, Crusoe only has to compute the KV cache once for it to become available to every single GPU in the cluster. This is especially important as systems get more agentic and require much longer prefixes in order to use tools and access files.
24:27-24:50
Dwarkesh Patel: In a recent benchmark, Crusoe was able to deliver up to 10x faster time-to-first-token and up to 5x better throughput than vLLM. This is just one among many reasons that you should run your inference workload with Crusoe. And if you need GPUs for training, you don't need to switch clouds; Crusoe's got you covered there too. Go to crusoe.ai/dwarkesh to learn more.
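Editor's note: the general idea of sharing a KV cache across requests with the same prompt prefix can be shown with a toy sketch. This is not Crusoe's or vLLM's implementation. The class `SharedPrefixCache`, the `_compute_kv` stand-in, and the token IDs are all hypothetical, and a real system would store tensors, match partial prefixes, evict entries, and distribute them across GPUs.

```python
from typing import Dict, List, Tuple


class SharedPrefixCache:
    """Toy cross-request cache: one stored entry per distinct prompt prefix."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], List[int]] = {}
        self.compute_calls = 0  # how many times the expensive step actually ran

    def _compute_kv(self, prefix: Tuple[int, ...]) -> List[int]:
        # Hypothetical stand-in for the real prefill pass that would build
        # key/value tensors; here it just derives a dummy value per token.
        self.compute_calls += 1
        return [token * 2 for token in prefix]

    def get(self, prefix_tokens: List[int]) -> List[int]:
        key = tuple(prefix_tokens)
        if key not in self._store:      # first request with this prefix
            self._store[key] = self._compute_kv(key)
        return self._store[key]         # later requests reuse the entry


cache = SharedPrefixCache()
system_prompt = [101, 7, 42, 9]         # made-up token IDs
for _ in range(1000):                   # 1,000 agents, same system prompt
    cache.get(system_prompt)
print(cache.compute_calls)
```

In this sketch the expensive stand-in runs once for all 1,000 requests, which is the saving the passage describes for agents that share a long system prompt.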
24:50-25:15
Dwarkesh Patel: This gets at an interesting question about Nvidia's clientele. 60% of your revenue is coming from these big five hyperscalers. In a different era with different customers—let's say professors running experiments—they need CUDA. They can't use another accelerator. They just needed to run PyTorch with CUDA and have everything optimized. But these hyperscalers have the resources to write their own kernels.
25:15-25:38
Dwarkesh Patel: In fact, they have to in order to get that last 5% of performance they need for their specific architecture. Anthropic and Google are mostly running their own accelerators or running TPUs and Trainium. But even OpenAI, using GPUs, has Triton because they need their own kernels.
25:38-26:04
Dwarkesh Patel: Down to CUDA C++, instead of using cuBLAS and NCCL, they've got their own stack which compiles to other accelerators as well. If most of your customers can and do make replacements for CUDA, to what extent is CUDA really the thing that is going to make frontier AI happen on Nvidia?
26:04-26:23
Jensen Huang: CUDA is a rich ecosystem. If you want to build on any computer first, building on CUDA first is incredibly smart. Because the ecosystem is so rich, we support every framework. If you want to create custom kernels... For example, we contribute enormously to Triton.
26:23-26:56
Jensen Huang: So the back end of Triton has huge amounts of Nvidia technology. We're delighted to help every framework become as great as it can be. There are lots and lots of frameworks. There's Triton, vLLM, SGLang, and more. Now there's a whole bunch of new reinforcement learning frameworks coming out, like verl and NeMo RL. With post-training and reinforcement learning, that entire area is just exploding.
26:56-27:24
Jensen Huang: So if you want to build on an architecture, building on CUDA makes the most sense because you know the ecosystem is great. You know that if something happens, it's more likely in your code and not in the mountain of code underneath. Don't forget the amount of code you're dealing with when building these systems. When something doesn't work, was it you or was it the computer?
27:24-27:52
Jensen Huang: You would like it to always be you and to be able to trust the computer. Obviously, we still have lots of bugs ourselves, but our system is so well wrung out that you can at least build on top of the foundation. That's number one: the richness, programmability, and capability of the ecosystem. The second thing is, if you're a developer building anything at all, the single most important thing you want is an install base.
27:52-28:21
Jensen Huang: You want the software you write to run on a whole bunch of other computers. You're not building software just for yourself. You're building it for your fleet or everybody else's fleet because you're a framework builder. Nvidia's CUDA ecosystem is ultimately its great treasure. We have several hundred million GPUs out there now. Every cloud has it. It goes back to the A10, A100, H100, H200, the L series, the P series. There’s a whole bunch of them.
28:21-28:46
Jensen Huang: They're in all kinds of sizes and shapes. If you're a robotics company, you want that CUDA stack to actually run in the robot itself. We're literally everywhere. The install base means that once you develop the software or the model, it's going to be useful everywhere. That is just incredibly valuable. Lastly, the fact that we're in every single cloud makes us genuinely unique.
28:46-29:17
Jensen Huang: If you're an AI company or developer, you're not exactly sure which cloud service provider you're going to partner with or where you'd like to run it. We run everywhere, including on-prem for you if you like. The combination of the richness of the ecosystem, the expansiveness of the install base, and the versatility of where we are makes CUDA invaluable.
29:17-29:59
Dwarkesh Patel: That makes a lot of sense. I guess the thing I'm curious about is whether those advantages matter a lot to your main customers. There are many people for whom they might matter. But the kind of customer who can actually build their own software stack makes up most of your revenue. That's especially true if we go to a world where AI is getting very good at things that have tight verification loops you can do RL on. How do you write a kernel that does attention or MLP most efficiently across a scale-up? That's a very verifiable sort of feedback loop. Can all the hyperscalers write these custom kernels for themselves? Nvidia still has great price performance, so they might still prefer to use Nvidia. But then does it just become a question of who is offering the best specs?
30:09-30:21
Dwarkesh Patel: Historically, Nvidia has maintained the best margins in all of AI across hardware and software (over 70%) because of the CUDA moat. The question is: can you sustain those margins if most of your customers can actually afford to build their own solutions instead of relying on the CUDA moat?
30:21-30:35
Jensen Huang: The number of engineers we have assigned to these AI labs is insane; we are working directly with them to optimize their stacks.
30:35-30:54
Jensen Huang: The reason for that is simple: nobody knows our architecture better than we do. These architectures are not as general-purpose as a CPU. A CPU is like a Cadillac—it’s a nice cruiser. It never goes too fast, and everyone can drive it well. It has cruise control, and everything is easy.
31:03-31:18
Jensen Huang: In many ways, Nvidia’s GPUs and accelerators are like F1 racers. I imagine everyone can drive one at a hundred miles an hour, but it takes significant expertise to push it to the limit.
31:24-31:34
Jensen Huang: We use a massive amount of AI to create our kernels. I’m confident we’ll still be needed for quite some time. Our expertise often helps our AI lab partners get another 2x performance out of their stack quite easily.
31:44-32:01
Jensen Huang: It’s not unusual for a model to speed up by 50%, 2x, or even 3x by the time we’re done optimizing their stack or a particular kernel. That’s a huge number, especially considering the installed base of the fleet they have: all the Hoppers and Blackwells.
32:09-32:24
Jensen Huang: When you increase performance by a factor of two, you effectively double the revenue. That translates directly to the bottom line. Nvidia’s computing stack offers the best performance per TCO (Total Cost of Ownership) in the world, bar none.
32:24-32:38
Jensen Huang: Nobody can demonstrate to me that any single platform in the world today has a better performance-to-TCO ratio. Not one company. In fact, look at the benchmarks out there. Dylan’s InferenceMAX is available for everyone to use, and yet...
32:46-33:04
Jensen Huang: TPU won't show up. Trainium won't show up. I encourage them to use InferenceMAX to demonstrate their "incredible" inference costs. It’s really hard, and nobody wants to show up. Take MLPerf—I would welcome Trainium to demonstrate the 40% advantage they claim all the time.
33:04-33:18
Jensen Huang: I would love to see them demonstrate the cost advantage of TPUs. It makes no sense to me. On first principles, it makes zero sense. I believe the reason we’re so successful is simply because our TCO is so great.
33:27-33:36
Jensen Huang: Secondly, you mentioned that 60% of our revenue comes from the top five customers, but most of that business is external. For example, most of Nvidia's presence in AWS is for external customers, not internal use.
33:46-33:54
Jensen Huang: The same goes for Azure and OCI—our customers there are external. They favor us because our reach is so vast. We can bring them all the great customers in the world because those customers are already built on Nvidia.
34:01-34:16
Jensen Huang: The reason all these companies are built on Nvidia is our versatility and reach. The flywheel is really the installed base, the programmability of our architecture, the richness of our ecosystem, and the sheer number of AI companies—tens of thousands of them now.
34:26-34:32
Jensen Huang: If you were an AI startup, which architecture would you choose? You’d choose the one that’s most abundant, the one with the largest installed base, and the one with the richest ecosystem. That’s us.
34:41-34:49
Jensen Huang: That’s the flywheel. Between our performance-per-dollar—which gives them the lowest cost per token—and our performance-per-watt, we are the highest in the world.
34:54-35:07
Jensen Huang: If one of our partners builds a one-gigawatt data center, that center needs to deliver the maximum amount of revenue and tokens. You want to generate as many tokens as possible to maximize revenue for that facility.
35:13-35:20
Jensen Huang: We have the highest tokens-per-watt architecture in the world. Lastly, if your goal is to rent out infrastructure, we have the most customers in the world. That’s why the flywheel works.
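Editor's note: the power-limited reasoning above can be written out as a small worked example. Sustained "tokens per watt" is treated here as tokens per joule, since a watt is one joule per second. The function name, the efficiency values, and the price are placeholders chosen for illustration, not figures from the conversation or from any vendor.

```python
def annual_token_revenue(
    power_watts: float,
    tokens_per_joule: float,
    price_per_million_tokens: float,
    utilization: float = 1.0,
) -> float:
    """Revenue per year for a facility whose output is capped by its power budget.

    Tokens per second = watts * tokens per joule, scaled by utilization.
    """
    seconds_per_year = 365 * 24 * 3600
    tokens_per_year = power_watts * tokens_per_joule * utilization * seconds_per_year
    return tokens_per_year / 1_000_000 * price_per_million_tokens


ONE_GIGAWATT = 1e9
# Both efficiency values and the price below are placeholders, not measurements.
baseline = annual_token_revenue(ONE_GIGAWATT, tokens_per_joule=1.0, price_per_million_tokens=2.0)
doubled = annual_token_revenue(ONE_GIGAWATT, tokens_per_joule=2.0, price_per_million_tokens=2.0)
print(doubled / baseline)
```

With the power budget and the token price held fixed, revenue scales linearly with tokens per joule, so doubling efficiency doubles revenue in this simplified model. It ignores capital cost, cooling overhead, and any change in price as supply grows.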
35:24-35:38
Dwarkesh Patel: Interesting. I guess the question comes down to the actual market structure. Even if there are tens of thousands of AI companies, we could have a world where they have roughly equal shares of compute.
35:43-35:51
Dwarkesh Patel: But through these five hyperscalers, the people actually using the compute are Anthropic, OpenAI, and the big foundation labs. They have the ability and the resources to make different accelerators work.
35:58-36:08
Jensen Huang: No, I think your premise is wrong. But let me ask you a slightly different question—actually, come back and let me correct your premise later. It’s too important to the future of AI, science, and the industry.
36:16-36:29
Dwarkesh Patel: Let me just finish the question, and then we can address it together. If everything you’re saying about price, performance, and performance-per-watt is true, why did Anthropic just announce a multi-gigawatt deal with Broadcom and Google for TPUs?
36:47-36:57
Dwarkesh Patel: For Google, the TPU is the majority of their compute. If I look at these big AI companies, it seems like there was a point where it was all Nvidia, and now it’s not. How do you square your claims with the fact that they are choosing other accelerators?
37:01-37:09
Jensen Huang: Anthropic is a unique instance, not a trend. Without Anthropic, why would there be any TPU growth at all? It’s 100% Anthropic. The same goes for Trainium.
37:17-37:27
Jensen Huang: I think that’s fairly well understood. It’s not that there’s an abundance of ASIC opportunities; there’s only one Anthropic.
37:33-37:45
Dwarkesh Patel: But OpenAI has deals with AMD, and they’re building their own "Titan" accelerator.
Jensen Huang: Yes, but I think we can all acknowledge they are still vastly Nvidia-based. We’re going to continue to do a lot of work together.
37:45-37:55
Jensen Huang: I’m not offended by people trying other things. If they don’t try other options, how would they know how good ours is? Sometimes you need a reminder.
38:02-38:09
Jensen Huang: We have to continuously earn our position. There are always big claims. Look at the number of ASICs that have been canceled. Just because you’re building an ASIC doesn’t mean it’s better than Nvidia.
38:13-38:26
Jensen Huang: It’s not easy to build something better than Nvidia. Because of our scale and velocity, we’re the only company in the world cranking out big leaps every single year.
38:34-38:44
Dwarkesh Patel: I guess their logic is: "It doesn't need to be better; it just needs to be no more than 70% worse," because they’re paying you 70% margins.
Jensen Huang: Don’t forget, even in ASICs, margins are quite high.
38:44-38:51
Jensen Huang: If Nvidia’s margin is 70% and ASIC margins are 65%, what are you really saving? You have to pay somebody. From what I can tell, ASIC margins are incredibly good, and those companies are quite proud of them.
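Editor's note: The margin exchange above can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes both suppliers have the same unit build cost and that gross margin means (price - cost) / price. The function names are hypothetical and do not come from the conversation.

```python
def price_for_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price that yields `gross_margin` on a given unit cost."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


def buyer_saving(margin_a: float, margin_b: float, unit_cost: float = 1.0) -> float:
    """Fraction a buyer saves by paying margin_b instead of margin_a.

    Assumption: both suppliers build the chip for the same unit cost.
    """
    price_a = price_for_margin(unit_cost, margin_a)
    price_b = price_for_margin(unit_cost, margin_b)
    return 1.0 - price_b / price_a


def break_even_relative_performance(supplier_margin: float) -> float:
    """Relative performance at which a chip bought at cost matches the
    supplier's performance-per-dollar (design costs are ignored)."""
    return 1.0 - supplier_margin


if __name__ == "__main__":
    print(f"Saving, 70% vs 65% margin: {buyer_saving(0.70, 0.65):.1%}")
    print(f"Break-even performance at cost: {break_even_relative_performance(0.70):.0%}")
```

Under these assumptions, moving from a 70% margin supplier to a 65% margin supplier saves the buyer about one seventh of the price, while a chip obtained at cost breaks even at 30% of the performance, which is the "no more than 70% worse" threshold in the question. Design, software, and integration costs are left out of the sketch.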
39:06-39:20
Jensen Huang: You asked why this is happening. A long time ago, we simply didn’t have the ability to do what was needed. At the time, I didn’t deeply internalize how difficult it would be to build a foundation AI lab like OpenAI or Anthropic.
39:29-39:37
Jensen Huang: I didn't realize they needed huge investments from their suppliers themselves. We weren’t in a position to make a multi-billion dollar investment in Anthropic so they could use our compute.
39:47-39:56
Jensen Huang: But Google and AWS were. They put in huge investments early on so that Anthropic, in return, would use their compute. We just weren’t in a position to do that then.
40:05-40:13
Jensen Huang: My mistake was not realizing they had no other options. A VC would never put $5 to $10 billion into an AI lab with only the hope of it becoming an Anthropic.
40:20-40:33
Jensen Huang: That was my miss. But even if I had understood it, I don’t think we could have done it at the time. I won’t make that same mistake again. I’m delighted to invest in OpenAI and help them scale; I believe it’s essential.
40:47-40:54
Jensen Huang: When Anthropic came to us later, I was delighted to be an investor and help them scale. We just weren't able to do it initially. If I could rewind—and if Nvidia had been as big then as we are now—I would have been happy to do it.
41:06-41:22
Dwarkesh Patel: This is interesting. For years, Nvidia has been the company making the most money in AI. Now you’re investing it. It’s been reported you’ve put significant capital into OpenAI and Anthropic.
41:28-41:39
Dwarkesh Patel: Their valuations have increased and will likely continue to do so. Years ago, when they were worth a tenth of their current value, you already had the cash to provide the compute and make those deals.
41:47-41:53
Dwarkesh Patel: There’s a world where Nvidia either becomes a foundation lab itself or makes these deals much sooner. Why didn't you do it earlier?
42:00-42:12
Jensen Huang: We did it as soon as we could. If I could have, I would have done it even earlier. When Anthropic needed us, we just weren’t in a position to do it. It wasn’t in our sensibility.
42:23-42:36
Dwarkesh Patel: Was it a cash issue?
Jensen Huang: Yes, the level of investment. We hadn't invested much outside the company at that time. We didn't realize we needed to. I always thought they could just raise from VCs like every other company.
42:42-42:56
Jensen Huang: But what OpenAI and Anthropic wanted to do couldn't be done through VCs. I recognize that now, but I didn't know it then. That’s their genius—they realized they had to structure things differently.
43:07-43:17
Jensen Huang: Even though we initially missed out on Anthropic, I’m happy they exist. Their existence is great for the world.
43:17-43:34
Dwarkesh Patel: You’re still making a ton of money, quarter after quarter. So the question is: now that you have all this capital, what should Nvidia be doing with it?
43:39-43:53
Dwarkesh Patel: There’s a middleman ecosystem popping up to convert CapEx into OpEx for these labs so they can rent compute. The chips are expensive, but they generate value over their lifetime as models improve.
44:03-44:13
Dwarkesh Patel: Nvidia has the money for the CapEx. You’re reportedly backstopping CoreWeave with billions. Why doesn’t Nvidia just become a cloud provider or a hyperscaler itself and rent this compute out?
44:13-44:24
Jensen Huang: This is a core philosophy of the company: we should do as much as needed, but as little as possible.
44:31-44:45
Jensen Huang: If we don't build our computing platform, I genuinely believe it doesn't get done. If we didn't build NVLink, the full stack, and the ecosystem—if we hadn't dedicated 20 years to CUDA while losing money—nobody else would have.
44:52-45:09
Jensen Huang: A decade and a half ago, we pushed into domain-specific libraries because we realized if we didn't create them—for ray tracing, image generation, or data processing—nobody would. I am certain of that.
45:19-45:35
Jensen Huang: We created cuLitho for computational lithography. If we hadn't, nobody would have. Accelerated computing wouldn't have advanced this way. So, we should dedicate our company wholeheartedly to that.
45:46-45:58
Jensen Huang: However, the world already has plenty of clouds. If I didn't build a cloud, someone else would. Following the philosophy of doing "as little as possible" where others can step in, we choose not to be a cloud provider.
46:04-46:19
Jensen Huang: In the case of "neoclouds" like CoreWeave, Nscale, or Nebius—if we didn't support them, they wouldn't exist. Now they are doing fantastically.
46:25-46:41
Jensen Huang: We invest in our ecosystem because I want it to thrive. I want our architecture and AI to connect with as many industries and countries as possible, so the planet can be built on an American tech stack.
46:56-47:08
Jensen Huang: There are many amazing foundation model companies, and we try to invest in all of them. We don't pick winners; we support everyone. It’s imperative to our business, but we also go out of our way to remain neutral.
47:25-47:35
Dwarkesh Patel: Why do you go out of your way not to pick winners?
Jensen Huang: First, it’s not our job. Second, when Nvidia started, there were 60 3D graphics companies. We are the only one that survived.
47:42-47:53
Jensen Huang: If you had looked at that list back then, Nvidia would have been at the top of the "likely to fail" list. Our graphics architecture was precisely wrong—not just a little wrong, but impossible for developers to support.
48:07-48:22
Jensen Huang: We reasoned from first principles but ended up with the wrong solution. Everyone would have counted us out, yet here we are. I have enough humility to recognize that. Don't pick winners—either let them take care of themselves or take care of all of them.
48:31-48:45
Dwarkesh Patel: I’m confused. You said you don't prioritize neoclouds just to prop them up, but then you said they wouldn't exist without Nvidia. How are those two things compatible?
48:57-49:08
Jensen Huang: They have to want to exist first. They come to us with a business plan, expertise, and passion. They need their own capabilities. But if they need investment to get off the ground, we’ll be there.
49:19-49:30
Jensen Huang: To your question about the financing business: the answer is no. There are people whose job is financing, and we’d rather work with them than be a financier ourselves.
49:37-49:48
Jensen Huang: Our goal is to keep our business model simple and support the ecosystem. When a company like OpenAI needs a $30 billion investment before their IPO, and we believe in them, we step in.
49:58-50:14
Jensen Huang: They are an extraordinary company, and the world needs them to exist. We’ll do those investments because they need us to, but we aren't trying to do as much as possible—we’re trying to do as little as possible.
51:13-51:25
Dwarkesh Patel: For years, we’ve lived with a shortage of GPUs, and that shortage is growing as models improve. Nvidia is known for divvying up scarce allocation.
51:31-51:41
Dwarkesh Patel: You don't just give them to the highest bidder; you ensure these neoclouds like CoreWeave or Lambda get their share. Why is that good for Nvidia? And do you agree with that characterization of "fracturing" the market?
51:49-52:07
Jensen Huang: No, that premise is wrong. We are very mindful about these things. First of all, if you don't place a purchase order (PO), all the talk in the world doesn't matter.
52:12-52:30
Jensen Huang: We work hard with everyone on forecasting because data centers take a long time to build. We align demand and supply through that process. But in the end, you still have to place an order.
52:37-52:49
Jensen Huang: If you don't place your order, what can I do? It’s generally first-in, first-out. Beyond that, if your data center isn't ready or you're missing components to stand it up, we might serve another customer first.
53:04-53:16
Jensen Huang: That’s just maximizing the throughput of our own factory. Aside from those adjustments, it’s first-in, first-out. You have to place a PO.
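Editor's note: The allocation rule described above (a purchase order is required, orders are served first-in, first-out, and a site that is not ready can be passed over) can be sketched as a simple queue. This is a toy illustration under stated assumptions, not a description of Nvidia's actual systems. The `PurchaseOrder` fields and the `allocate` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PurchaseOrder:
    customer: str     # hypothetical customer label
    units: int        # units on the purchase order
    site_ready: bool  # can the data center take delivery right now?


def allocate(
    queue: List[PurchaseOrder], supply: int
) -> Tuple[List[Tuple[str, int]], List[PurchaseOrder]]:
    """Serve orders in arrival order, skipping sites that are not ready.

    Returns (shipments, still_waiting). Skipped or partially filled
    orders keep their original position in the queue. Price plays no
    part in the ordering.
    """
    shipments: List[Tuple[str, int]] = []
    still_waiting: List[PurchaseOrder] = []
    for order in queue:
        if not order.site_ready or supply == 0:
            still_waiting.append(order)
            continue
        shipped = min(order.units, supply)
        supply -= shipped
        shipments.append((order.customer, shipped))
        if shipped < order.units:
            # Keep the unfilled remainder in the same queue position.
            remainder = PurchaseOrder(order.customer, order.units - shipped, True)
            still_waiting.append(remainder)
    return shipments, still_waiting


if __name__ == "__main__":
    queue = [
        PurchaseOrder("A", 100, True),
        PurchaseOrder("B", 50, False),  # ordered earlier than C, but not ready
        PurchaseOrder("C", 80, True),
    ]
    shipped, waiting = allocate(queue, supply=150)
    print(shipped)
    print(waiting)
```

In this sketch, customer B keeps its place in line while C is served first, which mirrors the "maximizing the throughput of our own factory" adjustment described above.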
53:27-53:44
Jensen Huang: There are stories about this—like the article saying Larry Ellison and Elon Musk begged me for GPUs over dinner. That never happened. We had a wonderful dinner, but they didn't beg. They just had to place an order.
53:55-54:09
Dwarkesh Patel: So it’s a queue based on POs and data center readiness. But it still sounds like the highest bidder doesn't just jump the line. Why not just sell to the highest bidder?
54:17-54:31
Jensen Huang: Because it’s bad business practice. You set your price, and people decide whether to buy. I know others in the chip industry change prices when demand spikes, but we don't.
54:39-54:57
Jensen Huang: That’s never been our practice. I prefer to be dependable—to be the foundation of the industry. If I quote you a price, that’s the price. If demand goes through the roof, so be it.
55:05-55:14
Dwarkesh Patel: Is that why you have such a productive relationship with TSMC?
Jensen Huang: Yes, we’ve been in business with them for nearly 30 years. We don’t even have a formal legal contract.
55:23-55:37
Jensen Huang: There’s always some "rough justice"—sometimes I get a better deal, sometimes they do. But the relationship is built on complete trust and dependability.
55:37-55:57
Jensen Huang: You can count on Nvidia. This year, Vera Rubin is incredible. Next year, Vera Rubin Ultra. The year after, Feynman. Every single year, we deliver. You won't find another ASIC team where you can bet your entire business that they will be there for you every year.
56:08-56:20
Jensen Huang: We ensure your token costs decrease by an order of magnitude every year. You can count on it like a clock. I can say the same about TSMC.
56:35-56:49
Jensen Huang: Whether you want to buy $1 billion of compute, $10 million, or just one graphics card, it’s no problem. If you want to order $100 billion for an AI factory, we are the only company that can handle that today.
57:12-57:29
Jensen Huang: Being the foundation of the world’s AI industry has taken decades of commitment and dedication. The stability and consistency of our company are paramount.
57:38-57:58
Dwarkesh Patel: I want to ask about China. I’m undecided on whether selling chips there is good, but I’ll play devil’s advocate. When Dario Amodei was on, he supported export controls, so I asked him why we can't have geniuses in both countries.
57:58-58:13
Dwarkesh Patel: Since you’re on the other side, I’ll ask the opposite. Anthropic recently mentioned a model, "Mythos," which they won't release publicly because of its cyber-offensive capabilities. It found thousands of high-severity vulnerabilities in major operating systems.
58:32-58:44
Dwarkesh Patel: If the Chinese government had the chips to train and run models like that, isn't that a threat to American national security?
58:44-59:01
Jensen Huang: First, Mythos was trained on fairly mundane capacity by an extraordinary company. That level of compute is abundantly available in China.
59:08-59:23
Jensen Huang: Chips already exist in China; they manufacture over 60% of the world's mainstream chips. They also have some of the world's greatest computer scientists.
59:28-59:39
Jensen Huang: Most AI researchers in Western labs are Chinese; they represent 50% of the world's AI research talent.
59:39-59:55
Jensen Huang: Given they have the energy, the chips, and the researchers, the question is: what is the best way to create a safe world? Is it by victimizing them and turning them into an enemy?
01:00:08-01:00:16
Jensen Huang: That likely isn't the best answer. They are an adversary. We want the United States to win.
01:00:16-01:00:23
Jensen Huang: But I think having a dialogue—specifically a research dialogue—is probably the safest thing to do.
01:00:23-01:00:35
Jensen Huang: This is an area that is glaringly missing because of our current attitude toward China as an adversary. It is essential that our AI researchers and their AI researchers are actually talking.
01:00:35-01:00:49
Jensen Huang: It is essential that we try to agree on what not to use AI for. With respect to finding bugs in software, of course, that's what AI is supposed to do.
01:00:49-01:01:03
Jensen Huang: Is it going to find bugs in a lot of software? Of course. There are lots and lots of bugs. There are lots of bugs in the AI software itself.
01:01:03-01:01:20
Jensen Huang: That's what AI is designed for, and I'm delighted that AI has reached a level where it can help us be so much more productive. One thing that is underemphasized is the richness of the ecosystem around cybersecurity.
01:01:20-01:01:34
Jensen Huang: There is a whole ecosystem of AI startups focused on AI security, privacy, and safety. They are trying to create a future for us where you have one incredible AI agent surrounded by thousands of other AI agents keeping it safe and secure.
01:01:34-01:02:02
Jensen Huang: That future is surely going to happen. The idea that you're going to have an AI agent running around with nobody watching over it is kind of insane. We know very well that this ecosystem needs to thrive.
01:02:02-01:02:11
Jensen Huang: It turns out this ecosystem needs open source. It needs open models and open stacks so that all these AI researchers and computer scientists can build AI systems that are formidable enough to keep AI safe.
01:02:11-01:02:22
Jensen Huang: So, one of the things we need to ensure is that we keep the open-source ecosystem vibrant. That can't be ignored. A lot of that innovation is coming out of China.
01:02:22-01:02:37
Jensen Huang: We ought not to suffocate that. With respect to China, of course we want the United States to have as much computing power as possible.
01:02:37-01:03:00
Jensen Huang: We're limited by energy, but we've got a lot of people working on that. We must not let energy become a bottleneck for our country.
01:03:00-01:03:14
Jensen Huang: But what we also want is to ensure that all the AI developers in the world are developing on the American tech stack and making their contributions and advancements—especially when it's open source—available to the American ecosystem.
01:03:14-01:03:28
Jensen Huang: It would be extremely foolish to create two separate ecosystems: an open-source ecosystem that only runs on a foreign tech stack, and a closed ecosystem that runs on the American tech stack.
01:03:28-01:03:38
Jensen Huang: I think that would be a horrible outcome for the United States.
01:03:38-01:03:44
Dwarkesh Patel: Since there are a lot of points there, let me just triage the response.
01:03:44-01:03:55
Dwarkesh Patel: I think the concern, going back to the difference in "flops" for hacking, is that while they have compute, some estimates suggest that because they are stuck at 7nm—and don't have EUV machines due to export controls—the amount of flops they can actually produce is about one-tenth of what the US has.
01:03:55-01:04:07
Dwarkesh Patel: So, could they eventually train a model like Mythos? Yes. But because we have more flops, American labs are able to reach these levels of capability first.
01:04:07-01:04:22
Dwarkesh Patel: Because Anthropic got there first, they can say, "Okay, we're going to hold onto this for a month while American companies get access to patch their vulnerabilities, and then we'll release it."
01:04:22-01:04:31
Dwarkesh Patel: Furthermore, even if they train a model like this, the ability to deploy it at scale matters. A cyber hacker is much more dangerous if they have a million agents versus just a thousand.
01:04:31-01:04:39
Dwarkesh Patel: So, inference compute really matters. The fact that they have so many talented AI researchers is actually what makes it scary, because what makes those researchers more productive? It’s compute.
01:04:39-01:04:48
Dwarkesh Patel: If you talk to any AI lab in America, they say the primary bottleneck is compute.
01:04:48-01:04:54
Dwarkesh Patel: There are quotes from the DeepSeek founder and Qwen leadership saying the exact same thing: they are bottlenecked on compute.
01:04:54-01:05:07
Dwarkesh Patel: So, isn't it better that American companies, because they have more compute, reach Mythos-level capabilities first and prepare our society before China can get there?
01:05:07-01:05:17
Jensen Huang: We should always be first and we should always have more. But for the outcome you described to be true, you have to take it to the extreme. They would have to have no compute at all.
01:05:17-01:05:26
Jensen Huang: If they have some compute, the question is: how much is needed? The amount of compute they have in China is already enormous.
01:05:26-01:05:34
Jensen Huang: You're talking about the second-largest computing market in the world. If they want to aggregate their compute, they have plenty to work with.
01:05:34-01:05:44
Dwarkesh Patel: But is that true? People do these estimates and say SMIC is actually behind on process nodes.
01:05:44-01:05:52
Jensen Huang: I'm about to tell you. The amount of energy they have is incredible. Isn't that right?
01:05:52-01:05:58
Jensen Huang: AI is a parallel computing problem. Since energy is essentially free for them, why can't they just put four or ten times as many chips together?
01:05:58-01:06:11
Jensen Huang: They have so much energy. They have data centers sitting completely empty but fully powered. You know they have "ghost cities"—well, they have ghost data centers too.
01:06:11-01:06:20
Jensen Huang: They have massive infrastructure capacity. If they wanted to, they could just gang up more chips, even if they are 7nm. Their capacity for building chips is one of the largest in the world.
01:06:20-01:06:30
Jensen Huang: The semiconductor industry knows they monopolize mainstream chips. They have over-capacity.
01:06:30-01:06:37
Jensen Huang: So the idea that China won't be able to have AI chips is complete nonsense.
01:06:37-01:06:45
Jensen Huang: Now, of course, if you ask me whether the United States would be further ahead if the rest of the world had no compute at all, sure.
01:06:45-01:06:51
Jensen Huang: But that's just not a realistic scenario. They already have plenty of compute.
01:06:51-01:06:59
Jensen Huang: The threshold they need for the concerns you're worried about—they've already reached that and beyond.
01:06:59-01:07:10
Jensen Huang: I think what you might be missing is that AI is like a five-layer cake, and the lowest layer is energy. When you have an abundance of energy, it can make up for a lack of cutting-edge chips.
01:07:10-01:07:17
Jensen Huang: Conversely, if you have an abundance of chips, it makes up for energy. For example, the United States is scarce on energy.
01:07:17-01:07:28
Jensen Huang: That is why Nvidia has to keep advancing our architecture and doing this extreme co-design. Because energy is limited, our throughput-per-watt has to be off the charts.
01:07:28-01:07:41
Jensen Huang: But if your energy supply is abundant and free, why do you care about performance-per-watt? You just use more. You can use older chips to get the job done.
01:07:41-01:07:51
Jensen Huang: 7nm chips are essentially the Hopper generation. And I have to tell you, today's models are largely trained on the Hopper generation.
01:07:51-01:08:01
Jensen Huang: So 7nm chips are plenty good. Their abundance of energy is their advantage.
01:08:01-01:08:12
Dwarkesh Patel: But then there's the question of whether they can actually manufacture enough of those chips.
01:08:12-01:08:18
Jensen Huang: But they do. What's the evidence? Huawei just had the largest single year in the history of their company.
01:08:18-01:08:27
Dwarkesh Patel: How many chips did they ship?
01:08:27-01:08:35
Jensen Huang: A ton. Millions. Millions of chips is way more than Anthropic has.
01:08:35-01:08:42
Dwarkesh Patel: There's a question of how much logic SMIC can ship, and a question of memory.
01:08:42-01:08:51
Jensen Huang: I'm telling you: they have plenty of logic and plenty of HBM2 memory.
01:08:51-01:09:02
Dwarkesh Patel: Right, but as you know, the bottleneck in training and inference is often bandwidth. If you're using HBM2 versus the newest standards, there could be an order of magnitude difference in memory bandwidth, which is huge.
01:09:02-01:09:02
Jensen Huang: Huawei is a networking company.
01:09:02-01:09:10
Dwarkesh Patel: But that doesn't change the fact that you need EUV for the most advanced HBM.
01:09:10-01:09:19
Jensen Huang: Not true. Not at all. You can gang them together, just like we do with NVL72. They've already demonstrated silicon photonics, connecting all this compute into one giant supercomputer.
01:09:19-01:09:26
Jensen Huang: Your premise is just wrong. The fact of the matter is, their AI development is going just fine.
01:09:26-01:09:33
Jensen Huang: The best AI researchers in the world, because they are limited in compute, come up with extremely smart algorithms.
01:09:33-01:09:39
Jensen Huang: Remember, I said Moore's Law advances about 25% per year.
01:09:39-01:09:45
Jensen Huang: However, through great computer science, we can still improve algorithm performance by 10x.
01:09:45-01:09:52
Jensen Huang: What I'm saying is that great computer science is where the real leverage is.
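Editor's note: The comparison between a roughly 25% annual hardware gain and a 10x algorithmic improvement is a compounding calculation. The sketch below uses only the two figures quoted above. The function names are hypothetical, and steady compounding is an assumption.

```python
import math


def years_to_match(target_multiple: float, annual_gain: float) -> float:
    """Years of steady compounding at `annual_gain` needed to reach
    `target_multiple` (for example 10.0 for a 10x improvement)."""
    if target_multiple <= 0 or annual_gain <= 0:
        raise ValueError("target_multiple and annual_gain must be positive")
    return math.log(target_multiple) / math.log(1.0 + annual_gain)


def compounded_gain(annual_gain: float, years: float) -> float:
    """Total improvement after `years` of compounding at `annual_gain`."""
    return (1.0 + annual_gain) ** years


if __name__ == "__main__":
    print(f"Years at 25%/yr to reach 10x: {years_to_match(10.0, 0.25):.1f}")
    print(f"Gain after 5 years at 25%/yr: {compounded_gain(0.25, 5):.2f}x")
```

At 25% per year it takes a little over ten years of hardware scaling to match a single 10x algorithmic gain, which is the sense in which the algorithmic side carries the leverage in this argument.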
01:09:52-01:09:58
Jensen Huang: There is no question that Mixture of Experts (MoE) was a great invention. There's no question that advanced attention mechanisms reduce the amount of compute needed.
01:09:58-01:10:13
Jensen Huang: We have to acknowledge that most advances in AI came from algorithmic breakthroughs, not just raw hardware.
01:10:13-01:10:19
Jensen Huang: Now, if most advances come from algorithms and programming, tell me that their army of AI researchers isn't a fundamental advantage.
01:10:19-01:10:31
Jensen Huang: We see it. DeepSeek is not an inconsequential advance. The day that DeepSeek comes out on Huawei hardware first—that is a horrible outcome for our nation.
01:10:31-01:10:40
Dwarkesh Patel: Why is that? Currently, a model like DeepSeek can run on any accelerator if it's open source. Why would that change in the future?
01:10:40-01:10:48
Jensen Huang: Suppose it doesn't. Suppose it's optimized specifically for Huawei's architecture. That would put ours at a disadvantage.
01:10:48-01:10:58
Jensen Huang: You described a situation that I perceive to be good news: a company developed an AI model that runs best on the American tech stack. I saw that as a win.
01:10:58-01:11:06
Jensen Huang: You set it up as if it were bad news. I'll give you the real bad news: it's when AI models around the world are developed to run best on non-American hardware.
01:11:06-01:11:15
Jensen Huang: That is bad news for us.
01:11:15-01:11:27
Dwarkesh Patel: I guess I just don't see the evidence of disparities that would prevent you from switching accelerators. American labs are running their models across all clouds and different accelerators.
01:11:27-01:11:33
Jensen Huang: I am the evidence. Take a model optimized for Nvidia and try to run it on something else.
01:11:33-01:11:41
Dwarkesh Patel: But American labs do that.
01:11:41-01:11:51
Jensen Huang: And they don't run better. Nvidia's success is perfect evidence. The fact that AI models are created on our stack and run best on our stack—how is that hard to understand?
01:11:51-01:11:58
Dwarkesh Patel: Anthropic's models run on GPUs, Trainium, and TPUs. A lot of work goes into making that happen.
01:11:58-01:12:07
Jensen Huang: But look at the Global South or the Middle East. If, out of the box, all AI models run best on someone else's tech stack, you'd have to be making a ridiculous claim to say that's good for the US.
01:12:07-01:12:18
Dwarkesh Patel: I guess I don't understand the argument. Say Chinese companies get to the next "Mythos" first. They find all the security vulnerabilities in American software first.
01:12:18-01:12:24
Dwarkesh Patel: But they do it on Nvidia hardware and ship it to the Global South. How is that good?
01:12:24-01:12:33
Jensen Huang: It's not good. So let's not let it happen. Why do you think it's perfectly fungible—that if you didn't ship them compute, it would be exactly replaced by Huawei?
01:12:33-01:12:42
Dwarkesh Patel: They are behind, right? They have worse chips than you.
01:12:42-01:12:49
Jensen Huang: There's evidence right now. Their chip industry is gigantic.
01:12:49-01:12:56
Dwarkesh Patel: You can look at the flops or bandwidth comparisons between the H200 and the Huawei 910C. It's like half to a third.
01:12:56-01:13:05
Jensen Huang: So they use twice as many.
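Editor's note: The "use twice as many" reply is a scaling claim that can be written down directly. The sketch below assumes perfectly linear scaling across chips, which ignores interconnect and software overheads. The per-chip power ratio is a hypothetical parameter and not a figure from the conversation.

```python
import math
from fractions import Fraction


def chips_needed(reference_chips: int, relative_performance: Fraction) -> int:
    """Chips required to match `reference_chips` reference accelerators
    when each chip delivers `relative_performance` of one reference chip.

    Assumption: throughput scales linearly with chip count.
    """
    if relative_performance <= 0:
        raise ValueError("relative_performance must be positive")
    return math.ceil(Fraction(reference_chips) / relative_performance)


def relative_energy(
    relative_performance: Fraction, relative_power_per_chip: Fraction
) -> Fraction:
    """Fleet power draw relative to the reference fleet at equal throughput.

    `relative_power_per_chip` is a hypothetical input (1 means each chip
    draws the same power as a reference chip).
    """
    return relative_power_per_chip / relative_performance


if __name__ == "__main__":
    for perf in (Fraction(1, 2), Fraction(1, 3)):
        count = chips_needed(1000, perf)
        energy = relative_energy(perf, Fraction(1))
        print(f"perf {perf}: {count} chips, {energy}x the power")
```

With chips at one half to one third of the reference performance, matching a fleet takes two to three times as many chips and, at equal per-chip power, two to three times the energy. That is why the argument leans so heavily on energy being abundant.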
01:13:05-01:13:14
Dwarkesh Patel: It seems your argument is they have all this energy ready to go and they need to fill it with chips. They're good at manufacturing, and eventually, they might out-manufacture everyone. But there are these few critical years.
01:13:14-01:13:14
Jensen Huang: What critical years are you talking about?
01:13:14-01:13:22
Dwarkesh Patel: These next few years. We've got models coming that will be able to perform major cyberattacks.
01:13:22-01:13:32
Jensen Huang: In that case, if the next few years are critical, we have to ensure that all the world's AI models are built on the American tech stack during this time.
01:13:32-01:13:44
Dwarkesh Patel: If they're built on the American tech stack, how does that prevent them from launching Mythos-equivalent cyberattacks if they have advanced capabilities? There's no guarantee either way.
01:13:44-01:13:54
Jensen Huang: But if you have it early, we can prepare for it. Listen, why are you causing one layer of the AI industry to lose an entire market just to benefit another layer?
01:13:54-01:14:05
Jensen Huang: There are five layers, and every single one has to succeed. The layer that needs to succeed most is actually AI applications.
01:14:05-01:14:15
Jensen Huang: Why are you so fixated on that one AI model or that one company?
01:14:15-01:14:23
Dwarkesh Patel: Because those models enable incredibly offensive capabilities, and you need compute to run them. The energy, the chips, and the ecosystem of researchers make it possible.
01:14:23-01:14:31
Dwarkesh Patel: A few months ago, Jane Street spent about 20,000 GPU hours training backdoors into three different language models. They challenged my audience to find the trigger phrases.
01:14:31-01:14:41
Dwarkesh Patel: I just caught up with Ricson, who designed the puzzle, about the solutions. He noted that if you have a base model and a backdoor model, you can linearly interpolate the weights to adjust the strength of the backdoor.
01:14:41-01:14:49
Dwarkesh Patel: You can even extrapolate it to make the backdoor stronger. In some cases, if you make it strong enough, the model will just regurgitate the response phrase it was supposed to hide.
01:14:49-01:14:57
Dwarkesh Patel: If you keep amplifying the difference between the base and backdoored versions, it should eventually spit out the trigger phrase.
01:14:57-01:15:05
Dwarkesh Patel: But this technique only worked on two out of the three models. Even Ricson isn't sure why it failed on the third. Verifying that a model only does what you think it does is a major open question in AI security.
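Editor's note: The weight interpolation and extrapolation described above can be sketched in a few lines. This is a minimal illustration using plain Python lists as stand-ins for model parameters. The names `blend_weights` and `alpha`, and the toy values, are hypothetical and are not taken from the Jane Street puzzle.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def blend_weights(base: Weights, backdoored: Weights, alpha: float) -> Weights:
    """Return base + alpha * (backdoored - base) for every parameter.

    alpha = 0 gives the base model and alpha = 1 the backdoored model.
    Values between 0 and 1 interpolate (a weaker backdoor), and values
    above 1 extrapolate (the difference is amplified).
    """
    if base.keys() != backdoored.keys():
        raise ValueError("models must have the same parameter names")
    blended: Weights = {}
    for name, base_values in base.items():
        backdoor_values = backdoored[name]
        if len(base_values) != len(backdoor_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        blended[name] = [
            b + alpha * (d - b) for b, d in zip(base_values, backdoor_values)
        ]
    return blended


if __name__ == "__main__":
    base = {"layer.weight": [0.0, 1.0]}        # toy values
    backdoored = {"layer.weight": [0.5, 1.0]}  # toy values
    for alpha in (0.0, 0.5, 1.0, 2.0):
        print(alpha, blend_weights(base, backdoored, alpha))
```

Only the first toy weight differs between the two models, so it is the only one that moves as `alpha` grows. Whether amplifying the difference actually surfaces a trigger phrase is an empirical question, and as noted above it did not work on every model.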
01:15:05-01:15:15
Dwarkesh Patel: If this kind of problem excites you, Jane Street is hiring researchers and engineers. Go to janestreet.com/dwarkesh to learn more.
01:15:15-01:15:24
Dwarkesh Patel: Okay, stepping back: China is able to build enough 7nm capacity. But they are stuck there while you move on to 3nm, 2nm, and then 1.6nm with Feynman.
01:15:24-01:15:34
Dwarkesh Patel: So while you're on 1.6nm, they'll still be on 7nm, and they have to produce enough to make up the shortfall.
01:15:34-01:15:43
Dwarkesh Patel: They have so much energy that the more chips you give them, the more compute they have. Ultimately, they are getting more compute, which is an input to training and inference.
01:15:43-01:15:53
Jensen Huang: Listen, I think you're speaking in absolutes. I agree the United States ought to be ahead. The amount of compute in the US is 100x more than anywhere else in the world.
01:15:53-01:16:00
Jensen Huang: The US is ahead. Nvidia builds the most advanced technologies. We make sure US labs are the first to hear about it and have the first chance to buy it.
01:16:00-01:16:09
Jensen Huang: If they don't have enough money, we even invest in them. We want to do everything we can to ensure the United States stays ahead. Do you agree with that point?
01:16:09-01:16:17
Dwarkesh Patel: We're doing everything we can. But how is shipping chips to China keeping the US ahead if they are bottlenecked on compute?
01:16:17-01:16:26
Jensen Huang: No, no. We have Vera Rubin for the United States.
01:16:26-01:16:33
Jensen Huang: Now, am I in the United States? Do you consider me part of the United States?
01:16:33-01:16:40
Dwarkesh Patel: Yes, Nvidia is a US company.
01:16:40-01:16:48
Jensen Huang: Okay. So why don't we come up with a regulation that's more balanced, so Nvidia can win around the world instead of giving up the global market?
01:16:48-01:16:56
Jensen Huang: Why would you want the United States to give up the world? The chip industry is part of the American ecosystem and technology leadership.
01:16:56-01:17:08
Jensen Huang: It's part of AI leadership. Why does your philosophy lead to the US giving up a vast part of the world market?
01:17:08-01:17:16
Dwarkesh Patel: I guess the claim here is—Dario [Amodei] had a quote where he said it's like Boeing bragging that they're selling North Korea nukes, but it's okay because the missile casings are made by Boeing.
01:17:16-01:17:24
Dwarkesh Patel: The idea is that you're fundamentally giving them this capability.
01:17:24-01:17:34
Jensen Huang: Comparing AI to anything you just mentioned is lunacy.
01:17:34-01:17:41
Dwarkesh Patel: But AI is similar to enriched uranium, right? It has positive uses and negative uses. We still don't want to send enriched uranium to other countries.
01:17:41-01:17:48
Jensen Huang: Who's sending enriched—?
01:17:48-01:17:59
Dwarkesh Patel: The analogy is that enriched uranium is like compute.
01:17:59-01:18:08
Jensen Huang: It's a lousy, illogical analogy.
Dwarkesh Patel: If that compute can run a model that performs zero-day exploits against American software, how is that not a weapon?
01:18:08-01:18:16
Jensen Huang: First of all, the way to solve that is to have dialogues with researchers in China and other countries to ensure people don't use technology that way. That dialogue has to happen.
01:18:16-01:18:24
Jensen Huang: Number two, we need to ensure the US is ahead—that Blackwell and Vera Rubin are available here in abundance.
01:18:24-01:18:31
Jensen Huang: Our results show it: we have mountains of compute and amazing researchers. We ought to stay ahead.
01:18:31-01:18:45
Jensen Huang: However, we also have to recognize that AI is a five-layer cake. The industry matters across every layer, and we want the US to win at every single one, including the chip layer.
01:18:45-01:18:54
Jensen Huang: Conceding the entire market will not allow the US to win the technology race long-term in the computing stack. That is just a fact.
01:18:54-01:19:04
Dwarkesh Patel: I guess the crux is: how does selling them chips now help us win in the long term?
01:19:04-01:19:13
Dwarkesh Patel: Tesla sold high-quality EVs to China for a long time. iPhones are sold there. That didn't cause a "lock-in" for us; China still made their own versions and now they're dominating.
01:19:13-01:19:23
Jensen Huang: When we started today, you acknowledged Nvidia's position is different. You used words like "moat."
01:19:23-01:19:30
Jensen Huang: The single most important thing to our company is the richness of our ecosystem, which is about developers. 50% of AI developers are in China. The US should not give that up.
01:19:30-01:19:40
Dwarkesh Patel: But we have plenty of Nvidia developers in the US, and that doesn't prevent American labs from using other accelerators. They're already doing that, which is fine.
01:19:40-01:19:53
Dwarkesh Patel: I don't see why it would be different in China. If you sell them Nvidia chips, they can still use TPUs or other things, just like Google does.
01:19:53-01:20:03
Jensen Huang: We have to keep innovating. As you know, our market share is growing, not shrinking. The premise that we're going to lose that market anyway even if we compete...
01:20:03-01:20:12
Jensen Huang: You're not talking to someone who woke up a loser. That "loser attitude" makes no sense to me.
01:20:12-01:20:20
Jensen Huang: We are not a car. You can buy one car brand today and another tomorrow easily. Computing isn't like that.
01:20:20-01:20:26
Jensen Huang: There's a reason the x86 deal exists. There's a reason ARM is so sticky. These ecosystems are hard to replace.
01:20:26-01:20:33
Jensen Huang: It costs an enormous amount of time and energy, and most people don't want to do it.
01:20:33-01:20:42
Jensen Huang: So it's our job to nurture that ecosystem and advance the technology so we can compete. Conceding a marketplace based on your premise makes no sense because I don't think the US is a loser.
01:20:42-01:20:49
Jensen Huang: Our industry is not a loser. That mindset is foreign to me.
01:20:49-01:20:59
Dwarkesh Patel: Okay, I'll move on. I just want to make sure—
01:20:59-01:21:04
Jensen Huang: You don't have to move on. I'm enjoying this.
01:21:04-01:21:10
Dwarkesh Patel: Okay, great. I appreciate that. I think the crux—and thanks for walking in circles with me, it helps clarify things—is that your argument starts from an extreme.
01:21:10-01:21:21
Dwarkesh Patel: The idea that if we give them any compute at all in this narrow moment, we lose everything.
01:21:21-01:21:30
Jensen Huang: Those extremes are childish.
01:21:30-01:21:37
Dwarkesh Patel: Let me make my own argument. The idea isn't that there's some magic threshold of compute. It's that any marginal compute is helpful.
01:21:37-01:21:46
Dwarkesh Patel: If you have more compute, you can train a better model. I want you to acknowledge that while marginal sales are good for the American tech industry—
01:21:46-01:21:52
Jensen Huang: I actually don't—
01:21:52-01:22:00
Dwarkesh Patel: If the AI models running on those chips have offensive cyber capabilities, it's not a nuclear weapon, but it enables a weapon of a kind.
01:22:00-01:22:09
Jensen Huang: By that logic, you might as well say that about microprocessors, DRAM, or even electricity.
01:22:09-01:22:17
Dwarkesh Patel: But we do have export controls on the technology relevant to making advanced DRAM and other chip-making equipment.
01:22:17-01:22:26
Dwarkesh Patel: We sell a lot of DRAM and CPUs to China, and I think that's right. But is AI different?
01:22:26-01:22:35
Dwarkesh Patel: If they can find zero-days in software, don't we want to minimize China's ability to get there first and deploy it widely?
01:22:35-01:22:42
Jensen Huang: We want the United States to be ahead. We can control that.
01:22:42-01:22:50
Dwarkesh Patel: How do we control that if the chips are already there being used to train those models?
01:22:50-01:22:59
Jensen Huang: We have tons of compute and researchers. We're racing as fast as we can.
Dwarkesh Patel: Again, we have more nuclear weapons than anyone, but we don't send enriched uranium elsewhere.
01:30:07-01:31:34
Jensen Huang: Mythos is important, sure. That's fantastic. But in a few years' time, I'm making a prediction: when we want the American tech stack and American technology to be diffused around the world—to India, the Middle East, Africa, and Southeast Asia—our country will want to export it. We will want to export our technology and our standards. On that day, I want you and me to have this same conversation again.
Jensen Huang: I will remind you of today's conversation and how your policy and what you imagined literally caused the United States to concede the second-largest market in the world for no good reason at all. We shouldn't concede it. If we lose it, we lose it, but why concede it? Now, nobody is advocating for an "all or nothing" approach, meaning we ship everything to China at all times. Nobody is suggesting that. We should always have the best technology here first, but we should also try to compete and win around the world. Both of those things can happen simultaneously. It requires a certain amount of nuance and maturity instead of dealing in absolutes. The world just isn't made of absolutes.
01:31:34-01:32:17
Dwarkesh Patel: Okay, the argument hinges on this: they've built models optimized for the best chips they can make. In a few years, those chips get exported around the world, and that sets the standard. Because of EUV export controls, as we discussed, you're going to move on to 1.6nm, while they're still going to be on 7nm even a few years from now. It may make sense domestically for them to say, "Hey, we've got so much energy and we can manufacture at scale, so we'll keep using 7nm." But for exports, their 7nm chips have to be competitive against your 1.6nm chips. Their models would have to be so highly optimized for 7nm that it's better to run them there than on your 1.6nm hardware.
01:32:17-01:33:58
Jensen Huang: Can we just look at the facts? Is Blackwell's lithography 50 times more advanced than Hopper's? Is it 50 times? Not even close. I’ve said it over and over again: Moore's Law is dead. Between Hopper and Blackwell, the transistors themselves improved by maybe 75% over three years. Yet, Blackwell is 50 times more powerful than Hopper. My point is that architecture matters. Computer science matters. Semiconductor physics matters too, but computer science is key.
Jensen Huang: The impact of AI largely comes from the computing stack, which is why CUDA is so effective and so beloved. It's an ecosystem and a computing architecture that allows for immense flexibility. If you wanted to change an architecture completely—to create something like Mixture of Experts (MoE), diffusion models, or something disaggregated—you could do it easily. The fact of the matter is that AI is as much about the stack above as it is about the architecture below. To the extent that we have software stacks optimized for our ecosystem, it is obviously an advantage. We started today talking about how rich Nvidia's ecosystem is. People always prefer programming for CUDA first—including researchers in China.
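Editor's note: the arithmetic behind the Hopper-to-Blackwell comparison can be made explicit. The sketch below takes the two figures quoted in the conversation (roughly 50x overall, and transistors about 75% better) and computes what is left over for everything else. It assumes the gains multiply, which is a simplification for illustration and not Nvidia's own methodology; the function name `residual_gain` is hypothetical.

```python
def residual_gain(total_speedup: float, transistor_gain: float) -> float:
    """Return the factor left for architecture, numerics, software, and so on.

    Assumes total_speedup = transistor_gain * residual, which is a
    simplifying assumption made only for this illustration.
    """
    if transistor_gain <= 0:
        raise ValueError("transistor_gain must be positive")
    return total_speedup / transistor_gain


# Figures as quoted in the conversation, not independently measured.
total = 50.0        # "Blackwell is 50 times more powerful than Hopper"
transistors = 1.75  # "improved by maybe 75%" is a 1.75x factor

residual = residual_gain(total, transistors)
print(f"Factor not explained by transistors: about {residual:.1f}x")
```

Under that assumption, most of the quoted improvement comes from sources other than the process node, which is the point being made in the answer.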
01:33:58-01:35:06
Jensen Huang: But if we are forced to leave China, it’s a policy mistake. Obviously, it has a backlash that has turned out badly for the United States. It enabled and accelerated their internal chip industry. It forced their entire AI ecosystem to focus on internal architectures. It's not too late, but it has already happened. In the future, you’ll see they aren't stuck at 7nm; they are good at manufacturing and will continue to advance. Now, is there a 10x difference between 5nm and 7nm? The answer is no. Architecture matters. Networking matters—that’s why Nvidia bought Mellanox. Energy matters. All of that stuff matters. It’s not as simplistic as you’re trying to make it.
01:35:06-01:35:41
Dwarkesh Patel: We can move on from China, but that raises an interesting question. We were discussing bottlenecks at TSMC and in memory supply. If we're in a world where you already control the majority of N3 capacity—and eventually N2—do you see a scenario where you go back to N7 to use spare capacity at an older process node? Could you say, "The demand for AI is so great and leading-edge capacity isn't meeting it, so we're going to make a Hopper or Ampere chip using everything we know about modern numerics and the improvements you described"? Do you see that happening before 2030?
01:35:41-01:36:42
Jensen Huang: It’s not necessary. The reason is that with every generation, the architecture is about more than just transistor scaling. You’re doing so much engineering in packaging, stacking, numerics, and system architecture. When you run out of capacity, trying to go back to an older node requires a level of R&D that no one could afford. We can afford to lean forward; I don't think we could afford to go backward. Now, if the world simply said—let's do the thought experiment—"Listen, we're never going to have more capacity ever again," would I go back and use 7nm? In a heartbeat. Of course I would.
01:36:42-01:37:04
Dwarkesh Patel: One question from someone I was talking to is: why doesn't Nvidia run multiple different chip projects simultaneously with totally different architectures? You could do something like a Cerebras-style wafer-scale chip, a Dojo-style massive package, or even one without CUDA. You have the resources and the talent to do these in parallel. Why put all your eggs in one basket, given how unpredictable the future of AI and architectures might be?
01:37:04-01:38:00
Jensen Huang: Oh, we could. It’s just that we don’t have a better idea. We could do all of those things, but they aren't better. We simulate them all in our simulators, and they are provably worse, so we wouldn't do it. We are working on exactly the projects we want to work on. If the workload were to change dramatically—and I don't mean the algorithms, I mean the actual workload, which depends on the shape of the market—we might decide to add other accelerators. For example, we recently added Groq, and we’re going to fold that into our CUDA ecosystem.
01:38:00-01:39:21
Jensen Huang: We’re doing that now because the value of tokens has become so high that you can have different pricing tiers. A couple of years ago, tokens were either free or very cheap. But now, customers want different types of answers. Because customers—like our software engineers—can make so much money from them, I would pay for more responsive tokens to make them even more productive. That market has only recently emerged. We now have the ability to segment the same model based on response time. That’s why we decided to expand the Pareto frontier and create an inference segment with faster response times, even if it has lower throughput. Until now, higher throughput was always considered better. We think there could be a world with very high-value tokens where the premium price makes up for the lower factory throughput. That’s why we did it. Otherwise, from an architectural perspective, if I had more money, I would just put more behind Nvidia’s current architecture.
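Editor's note: the trade-off described here, a faster tier with lower throughput paid for by a premium price, comes down to simple break-even arithmetic. The sketch below assumes revenue per hour is throughput times price per token. All numbers and function names are hypothetical and chosen only to make the arithmetic easy to follow; they are not Nvidia or market figures.

```python
def revenue_per_hour(tokens_per_second: float, price_per_million: float) -> float:
    """Revenue in dollars per hour for a given throughput and token price."""
    return tokens_per_second * 3600 / 1e6 * price_per_million


def breakeven_price_multiple(throughput_ratio: float) -> float:
    """Price multiple at which a slower-throughput tier matches baseline revenue.

    throughput_ratio is fast-tier throughput divided by baseline throughput.
    """
    if not 0 < throughput_ratio <= 1:
        raise ValueError("throughput_ratio must be in (0, 1]")
    return 1.0 / throughput_ratio


# Hypothetical numbers, for illustration only.
baseline_tps, baseline_price = 10_000.0, 2.0  # tokens/s, dollars per million tokens
fast_tps = 2_500.0                            # low-latency tier, lower throughput

multiple = breakeven_price_multiple(fast_tps / baseline_tps)
fast_price = baseline_price * multiple

baseline_revenue = revenue_per_hour(baseline_tps, baseline_price)
fast_revenue = revenue_per_hour(fast_tps, fast_price)
print(f"Break-even premium: {multiple:.1f}x")
print(f"Baseline: ${baseline_revenue:.2f}/h, fast tier: ${fast_revenue:.2f}/h")
```

In this toy model, any premium above the break-even multiple makes the lower-throughput tier the more profitable use of the same hardware.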
01:39:21-01:39:55
Dwarkesh Patel: The idea of extremely premium tokens and the segmentation of the inference market is very interesting. Alright, final question: suppose the deep learning revolution hadn't happened. What would Nvidia be doing? Obviously gaming, but given your trajectory—
01:39:55-01:41:57
Jensen Huang: Accelerated computing—the same thing we’ve been doing all along. The premise of our company is that while general-purpose computing is good for many things, it isn't ideal for a lot of heavy computation. We combined a GPU architecture and CUDA with a CPU to accelerate those workloads. Different kernels of code or algorithms can be offloaded to our GPU, speeding up an application by 100x or 200x. You can use that in engineering, science, physics, data processing, computer graphics, and image generation.
Jensen Huang: Even if AI didn't exist today, Nvidia would be very, very large. The reason is fundamental: the ability for general-purpose computing to continue scaling has largely run its course. The way forward is through domain-specific acceleration. We started with computer graphics, but there are many other domains—particle physics, fluids, structured data processing—all of which benefit from CUDA. Our mission was to bring accelerated computing to the world and advance applications that general-purpose CPUs can't handle, helping to achieve breakthroughs in various scientific fields. Early applications included molecular dynamics, seismic processing for energy discovery, and image processing. If there were no AI, I would be very sad, but the advances we made in computing democratized deep learning. We made it possible for any researcher, scientist, or student anywhere to access a PC or a GeForce card and do amazing science.
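Editor's note: the offloading idea in this answer (moving heavy kernels to the GPU while the rest stays on the CPU) is often reasoned about with Amdahl's law, a standard textbook model that the speaker does not invoke. The sketch below is a minimal illustration under that model; the fractions and the 200x kernel speed-up are hypothetical inputs, and `amdahl_speedup` is a name introduced here.

```python
def amdahl_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Whole-application speed-up under Amdahl's law.

    offloaded_fraction: share of original runtime spent in offloaded kernels.
    kernel_speedup: how much faster those kernels run on the accelerator.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be in [0, 1]")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - offloaded_fraction
    return 1.0 / (serial_part + offloaded_fraction / kernel_speedup)


# Hypothetical inputs: a 200x kernel speed-up at different offload fractions.
for fraction in (0.90, 0.99, 0.999):
    speedup = amdahl_speedup(fraction, 200.0)
    print(f"{fraction:.1%} offloaded -> {speedup:.1f}x overall")
```

The model suggests that application-level gains in the 100x to 200x range require nearly all of the runtime to sit in accelerated kernels, which is consistent with the emphasis on heavy computation in the answer.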
01:41:57-01:43:00
Jensen Huang: That fundamental promise hasn't changed one bit. If you watch the beginning of GTC, none of it is about AI. We showcase computational lithography, quantum chemistry, and data processing—all unrelated to AI, yet still very important. I know AI is exciting, but there are many people doing vital work that isn't AI-related, and tensors aren't the only way to compute. We want to help everybody.
01:43:00-01:43:05
Dwarkesh Patel: Jensen, thank you so much.
01:43:05-01:43:07
Jensen Huang: You're welcome. I enjoyed it.
01:43:07-01:43:08
Dwarkesh Patel: Me too.