
Is AI About to Hit a 'Memory Wall'?

The semiconductor industry has officially entered a new "supercycle," and the evidence is in the numbers. Recent third-quarter earnings reports from memory chip giants Samsung and SK Hynix show their inventories have fallen to record lows, causing prices for both specialized and general-purpose memory to surge. The driving force behind this boom is the AI revolution's insatiable demand for a specialized component called High-Bandwidth Memory (HBM).

But this boom is also a warning shot. The very success of HBM is exposing a deeper problem. As AI models grow exponentially larger, they are on a collision course with a fundamental physical limit that threatens to choke off progress: the "memory wall."

What is High-Bandwidth Memory (HBM) and why is it causing a supercycle?

HBM is a type of ultra-fast DRAM built from vertically stacked memory dies and placed directly next to a powerful processor such as an Nvidia GPU. It acts as the GPU's immediate, short-term memory, feeding it the data it needs to perform complex AI calculations at lightning speed.

The demand for HBM has become so intense that memory makers are converting their standard production lines to make it. This has created a supply shortage of both HBM for the high-end AI market and the general-purpose memory used in everything else, causing prices across the board to surge.

Why isn't HBM enough to solve the problem?

While HBM is incredibly fast, it has two major limitations: it is expensive, and it offers relatively little capacity. That wasn't a problem for the first generation of AI models, but today's frontier models are monstrously large, and their weights and working data must be stored close to the processor for "inference," the process of using a trained model to answer a question or generate an image.

There simply isn't enough HBM capacity to hold all this data, and it's too expensive to even try. The AI industry is hitting a "memory wall": the GPUs are getting faster, but they are being starved for data because the memory can't keep up. As the "father of HBM," Professor Kim Jung-Ho, recently stated, the balance of power in AI is shifting "from GPU to memory."
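To see why faster GPUs alone don't help, consider a rough back-of-envelope sketch in Python. Every figure below (model size, HBM capacity, bandwidth, peak compute) is an illustrative assumption rather than a vendor specification, but the imbalance it exposes is the memory wall in miniature: streaming a large model's weights takes far longer than the arithmetic performed on them.

```python
# Back-of-envelope: why large-model inference tends to be memory-bound.
# All numbers are illustrative assumptions, not vendor specifications.

model_params = 70e9        # a 70-billion-parameter model (assumed)
bytes_per_param = 2        # 16-bit weights
weight_bytes = model_params * bytes_per_param   # ~140 GB of weights

hbm_capacity_bytes = 80e9  # assumed HBM on one GPU: 80 GB
hbm_bandwidth = 3.35e12    # assumed HBM bandwidth: ~3.35 TB/s
peak_flops = 1e15          # assumed peak compute: ~1 PFLOP/s

# Generating one token (batch size 1) touches every weight roughly once,
# so the memory system must stream all ~140 GB per token.
t_memory = weight_bytes / hbm_bandwidth        # time to stream the weights
t_compute = 2 * model_params / peak_flops      # ~2 FLOPs per weight per token

print(f"Weights need {weight_bytes / hbm_capacity_bytes:.1f}x one GPU's HBM")
print(f"Memory-limited time per token:  {t_memory * 1e3:.1f} ms")
print(f"Compute-limited time per token: {t_compute * 1e3:.2f} ms")
# ~42 ms waiting on memory vs. ~0.14 ms of arithmetic: the GPU starves.
```

The numbers are made up, but the shape of the problem is real: capacity and bandwidth, not raw compute, set the ceiling.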

What comes after HBM?

The industry is now racing to develop the next piece of the puzzle: High-Bandwidth Flash (HBF). This new technology uses the same vertical stacking concept as HBM, but it is built with NAND flash—the same type of memory found in solid-state drives—instead of DRAM.

HBF comes with a crucial trade-off: it is slower than HBM, but it can provide more than ten times the capacity at a much lower cost. This has led to a new architectural vision for AI servers, best described as an "intelligent library" (sketched in code after the list below):

  • HBM will act as the fast-access "bookshelf" right next to the GPU, holding the data needed for immediate calculations.
  • HBF will serve as the vast "underground library," a massive, lower-cost storage tier that holds the AI model's deep knowledge and continuously feeds the HBM bookshelf.
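To make the "intelligent library" concrete, here is a minimal, purely illustrative Python sketch of the two-tier idea: a small fast tier (standing in for HBM) that keeps the hottest model shards resident, backed by a large, slower tier (standing in for HBF). The class, capacities, and access pattern are hypothetical; real systems would manage this in hardware and driver software rather than Python.

```python
from collections import OrderedDict

class TieredWeightStore:
    """Toy two-tier store: a small fast tier (HBM-like) in front of a large
    slow tier (HBF-like). Purely hypothetical; capacities are in 'shards'."""

    def __init__(self, fast_capacity, backing_store):
        self.fast_capacity = fast_capacity
        self.backing = backing_store       # the big, slow tier: {shard_id: data}
        self.fast = OrderedDict()          # LRU cache playing the role of HBM
        self.hits = self.misses = 0

    def fetch(self, shard_id):
        if shard_id in self.fast:          # already on the fast "bookshelf"
            self.fast.move_to_end(shard_id)
            self.hits += 1
            return self.fast[shard_id]
        self.misses += 1                   # pull it up from the "library"
        data = self.backing[shard_id]
        self.fast[shard_id] = data
        if len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)  # evict the least-recently-used shard
        return data

# 100 shards live in the big slow tier; only 8 fit in the fast tier.
store = TieredWeightStore(fast_capacity=8,
                          backing_store={i: f"weights-{i}" for i in range(100)})
hot = [0, 1, 2, 3]                         # shards reused on every step
for cold in range(4, 100):                 # each cold shard is streamed once...
    for h in hot:                          # ...while hot shards stay resident
        store.fetch(h)
    store.fetch(cold)
print(f"fast-tier hits: {store.hits}, slow-tier fetches: {store.misses}")
```

The design question the industry now faces is the one this toy exposes: how big the fast tier must be, and how quickly the slow tier can refill it, so the GPU never has to wait.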

Industry leaders are already moving fast. SK Hynix and SanDisk have announced a partnership to standardize HBF technology, with samples expected in 2026. Samsung is reportedly in the early stages of designing its own HBF products.

The emergence of HBF signals a complete rethinking of computer architecture for the AI era. The future isn't about simply making GPUs faster; it's about building a complex, multi-tiered memory system to keep those GPUs fed. The current memory supercycle is just the opening act. The real battle for the future of AI hardware will be won by the companies that can master this new, intricate dance between computing and memory.

The Reference Shelf

  • Samsung, SK Hynix Inventory Drop Signals Memory Supercycle (Chosun)
  • SK hynix, Samsung, and SanDisk Bet on HBF — The Next Battleground in Memory Sector (TrendForce)