The light that burns twice as bright burns half as long.
The rise and fall of the Nvidia H100 economy
- and you have burned so very, very brightly, H100. (Inspired by Blade Runner)
For me, the quote from Blade Runner embodies the first year of the H100 economy. It was a genuinely insane product introduction and adoption that came out of nowhere unless you were directly involved in the launch, and I am sure even those involved were surprised.
“I've seen things... seen things you little people wouldn't believe. Attack ships on fire off the shoulder of Orion bright as magnesium... I rode on the back decks of a blinker and watched C-beams glitter in the dark near the Tannhäuser Gate. All those moments... they'll be gone.”
Roy Batty, Blade Runner
Less than ten months after the first Nvidia shock result, the H100, the chip behind the AI revolution, is preparing to retire and leave the scene to Blackwell. In this post, I will dive deeper into the H100 economy and investigate what the transition to Blackwell means for Nvidia and the company’s results.
This post is based on research available on SemiBizIntel.com. We have taken a deep dive into Nvidia's data centre business and supply chain to understand the situation and bottlenecks in the next quarter’s business. While we do not reveal everything in the report, for obvious reasons, this post should still be able to provide insights into Nvidia’s data centre business—it's time to dive into the H100 economy.
The most successful semiconductor system ever launched
Apple introduced the iPhone in 2007, and it became the most successful semiconductor-based product ever launched. Its revolutionary integration of advanced semiconductors, particularly the custom-designed A-series processors, transformed it into a powerful handheld computing device. The iPhone generated just short of $2 billion in revenue in its first year of production and will have to pass the title to the new heavyweight champion, the H100. There is still some distance before Nvidia’s AI platforms reach the revenue of the iPhone, which currently averages $55B a quarter.
In its first year of production, the H100 generated more than $45B in revenue at a record gross profit. In its first quarter alone, H100 sales exceeded $4.5B.
In Q2, the H100 represented 90% of the data centre platform business, whereas its predecessor, the A100 (Ampere), and the H20 for the Chinese market accounted for the remainder of the platform revenue.
If the H100 were a country and its revenue were GDP, it would rank among the top 100 economies in the world, alongside Cambodia, Paraguay and Latvia.
If the H100 were a standalone company, it would have the 2nd highest semiconductor revenue based on the last four quarters. Only Intel would have higher revenue.
The H100 is an incredibly profitable product. From a gross profit perspective, it generates more gross profit than any other semiconductor company.
Interestingly, after only four quarters, Nvidia has chosen to outcompete its own product, the most successful semiconductor-based product ever. This is the highest level of product cannibalisation ever executed.
Many years ago, Intel's co-founder and then CEO, Andy Grove, famously said that only the paranoid survive. Intel was at its peak, and he warned about complacency. Nvidia has been listening while Intel forgot.
“Success breeds complacency. Complacency breeds failure. Only the paranoid survive.”
Andy Grove
Blackwell has arrived
In the latest Nvidia conference call, Jensen Huang confirmed that Blackwell is now in production. When asked how much revenue to expect, he answered: “We will see a lot of Blackwell revenue this year.”
There could be many reasons for Nvidia to pursue the product cannibalisation strategy:
Keep ahead of critical competitors
Lower cost and increase profits
Increase value to customers - Price/Performance
Remove production bottlenecks
To better understand what Nvidia is trying to achieve, we have researched these questions and calculated the cost structure of Blackwell and H100.
Moore or Accelerated
The feud between Intel and Nvidia (or what Intel hopes is a feud) has erupted at the highest level. Jensen Huang suggests that Moore’s law has been overtaken by Accelerated Computing, and Pat Gelsinger retorts that Moore’s law is not dead.
No matter who is right, it is interesting that Nvidia chose the same TSMC 4nm node for Blackwell that it used for the H100. Is Blackwell only half a generational step? The R100 processor announced at Computex is guaranteed to be manufactured at a smaller node, likely at 2nm.
Blackwell improves performance by combining two reticle-limited dies (the maximum die dimensions semiconductor lithography equipment can handle) to create a double-sized chip.
As two Blackwell chips are combined with one Grace CPU to form the GB200 Superchip, it is difficult to compare its cost/performance structure to that of the H100 SXM board, which contains only one H100. You can find our cost calculations here.
In presentations of Blackwell's performance, the architecture looks like a quantum leap, but our calculations point more towards the half-a-generation step.
As we only have outsider information, our calculation carries a high degree of uncertainty. The cost of the GB200 Superchip is not fixed but varies with the components and their yields, which could prompt Nvidia to raise prices as the cost goes up. Also, Nvidia’s system price is not publicly available and could vary according to supply and demand.
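To illustrate why component yields make the cost estimate so variable, here is a minimal sketch of a yield-adjusted cost roll-up. The component names, costs and yields are hypothetical placeholders, not figures from our research or from Nvidia, and the model makes the simplifying assumption that a failed component only wastes its own cost (in practice a packaging failure can scrap good dies and memory too).

```python
def yielded_cost(unit_cost: float, yield_rate: float) -> float:
    """Cost per good unit when only a fraction `yield_rate` of units pass."""
    if not 0 < yield_rate <= 1:
        raise ValueError("yield_rate must be in (0, 1]")
    return unit_cost / yield_rate


def system_cost(components: dict[str, tuple[float, float]]) -> float:
    """Sum yield-adjusted costs; components map name -> (unit_cost, yield_rate)."""
    return sum(yielded_cost(cost, y) for cost, y in components.values())


# Hypothetical placeholder values, chosen only to show the mechanics.
example = {
    "logic": (1000.0, 0.8),
    "memory": (2000.0, 1.0),
    "substrate_and_packaging": (500.0, 0.5),
}

baseline = system_cost(example)

# Same components, but with the substrate/packaging yield improved to 0.8.
improved = system_cost({**example, "substrate_and_packaging": (500.0, 0.8)})

print(f"baseline cost: {baseline:.0f}, with better packaging yield: {improved:.0f}")
```

Even in this toy version, a change in a single yield assumption moves the total noticeably, which is why an outsider estimate should be read as a range rather than a point value.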
Based on the H100's success, half a generational step is likely sufficient to keep competitors in check. When Intel and AMD are ready to bite into the H100's market share, it could be gone.
The architecture cost structure
The two systems are quite different, as are the cost structures, as seen below. With GB200, the cost of logic and memory increases, making the GB200-based system more sensitive to memory pricing.
This also means that Nvidia cannot be considered a semiconductor components company in the traditional sense. It is becoming a systems company.
This represents a challenge for the semiconductor industry in calculating market growth. The World Semiconductor Trade Statistics (WSTS) has traditionally been the trusted source of market size and growth information by collecting component sales statistics from its member companies. Companies like Apple that make semiconductors for their products are not included as they do not sell components on the free market.
For non-member companies, WSTS has to make a judgement about their business.
As Nvidia transitioned to a systems company, soon followed by Intel and AMD, the traditional way of calculating market growth broke down as none of the three companies were members of WSTS.
The transformation of Nvidia’s business started with Ampere but is now accelerating, as can be seen below:
The market growth numbers can no longer be trusted, as Nvidia's revenue represents 61% of semiconductors, of which a significant proportion is memory that was already reported as sold on the market when Nvidia bought it.
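The double-counting problem can be shown with a small sketch. All numbers are hypothetical and the function name is our own; it is not how WSTS or any other body computes market size. It only shows how adding system revenue on top of component sales counts the bought-in memory twice.

```python
def market_size(component_sales: float,
                system_sales: float,
                bought_in_components: float,
                deduplicate: bool) -> float:
    """Total market size from component sales plus system sales.

    `bought_in_components` is the part of system sales that consists of
    components (for example memory) already reported in `component_sales`.
    """
    if bought_in_components > system_sales:
        raise ValueError("bought-in components cannot exceed system sales")
    total = component_sales + system_sales
    if deduplicate:
        total -= bought_in_components
    return total


# Hypothetical figures: 100 of component sales (including 10 of memory sold
# to a systems vendor) and 50 of system sales that embed that memory.
naive = market_size(100.0, 50.0, 10.0, deduplicate=False)
adjusted = market_size(100.0, 50.0, 10.0, deduplicate=True)

print(f"naive total: {naive:.0f}, adjusted total: {adjusted:.0f}")
```

The gap between the two totals grows with the memory content of the systems, which is why a shift towards memory-heavy systems distorts reported growth.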
This makes the already complex semiconductor market even more complicated, but complexity is our friend.
Production bottlenecks
There are three main bottlenecks in the H100 and Blackwell systems:
ABF Substrate
CoWoS Capacity
High Bandwidth Memory
Nvidia uses an ABF (Ajinomoto Build-up Film) substrate for the GB200. This type of substrate allows for the high-density routing and advanced packaging necessary for the chip’s performance. The substrate for the GB200 chip has more than 20 layers in order to support the high-density interconnects and high-speed data pathways required by the GPU.
The build-up process involves stacking multiple layers of conductive and insulating materials to create a high-density interconnect substrate that can handle the sophisticated demands of the GB200.
There are only a couple of companies capable of delivering the film for this technology, and only a few that can master the technology, as can be seen below.
The very large substrate needed for Blackwell is likely to be a significant manufacturing bottleneck. The yield, already low for the H100, is likely to become a challenge.
The CoWoS situation is likely to have improved significantly over the last period, and Nvidia's share of total TSMC capacity is on the decline.
High Bandwidth Memory
The major memory manufacturers failed to invest during the last downcycle and are now busy moving capacity away from traditional DRAM to the more lucrative HBM DRAM.
This will eventually lead to a shortage of standard DRAM and, with that, price increases for HBM as well.
HBM requires twice the manufacturing capacity per bit of traditional DRAM and trades at a multiple of the basic DRAM price.
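A rough sketch of why moving capacity to HBM squeezes standard DRAM supply follows. It takes the 2x capacity-per-bit ratio from the text as its only real input; the total capacity and the HBM share are hypothetical, and capacity is measured in arbitrary units where one unit produces one unit of standard DRAM bits.

```python
def bit_output(capacity: float,
               hbm_share: float,
               capacity_per_bit_ratio: float = 2.0) -> tuple[float, float]:
    """Return (standard DRAM bits, HBM bits) for a given capacity split.

    `hbm_share` is the fraction of capacity moved to HBM, and
    `capacity_per_bit_ratio` is how much more capacity one HBM bit needs
    than one standard DRAM bit (2.0 per the text above).
    """
    if not 0 <= hbm_share <= 1:
        raise ValueError("hbm_share must be between 0 and 1")
    hbm_capacity = capacity * hbm_share
    dram_bits = capacity - hbm_capacity
    hbm_bits = hbm_capacity / capacity_per_bit_ratio
    return dram_bits, hbm_bits


# Hypothetical split: 20% of 100 capacity units moved to HBM.
dram_bits, hbm_bits = bit_output(100.0, 0.2)
print(f"standard DRAM bits: {dram_bits:.0f}, HBM bits: {hbm_bits:.0f}, "
      f"total: {dram_bits + hbm_bits:.0f} (vs 100 with no HBM)")
```

With no new capacity added, every unit shifted to HBM removes a full unit of standard DRAM bits but adds only half a unit of HBM bits, so total bit supply falls.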
While pricing is on the rise, so is the capacity of HBM DRAM, as can be seen below.
It does not take much work to find Nvidia’s supplier of DRAM. Conditions should be similar for the main memory suppliers, so growth rates should also be similar. However, one manufacturer stands out.
Compared to Q1-23 revenue, SK Hynix has outgrown its competitors significantly, likely as a result of being the only supplier of HBM to Nvidia at the moment. Nvidia has confirmed that it needs more engineering done before using Samsung HBM, which supports this argument.
While prices might increase, we believe there is sufficient HBM for the Blackwell introduction, even though memory content is higher from a cost and revenue perspective.
The age of the H100 economy is coming to an end. Next quarter Blackwell will start to take over.
This post has given a peek into our research. Should Nvidia be vital to your future business, you should consider acquiring our Nvidia research:
The following topics are covered:
H100: Price and cost analysis of the H100
GB200: Cost analysis of the GB200 Superchip and price projection
Supply Bottlenecks: Capacity situation and limitations
Competitive Situation, Current and Future
Nvidia's next quarter result based on the upgrade to Blackwell
Nvidia Data Center Revenue by AI Platform
Nvidia Customer Split: Last two quarters by Type
Included: 10 Excel Tables, 42 Slides, 29 Charts and illustrations
SK Hynix of course gets the majority of the sales, but is Micron not a second source HBM supplier for Nvidia? My understanding is that Samsung is the only one that hasn’t yet been able to take some share there