- What to source instead, today
- HBM is glued to a CoWoS interposer, not sold as a chip
- The CoWoS bottleneck through 2026
- Who has allocation locked through 2026
- What about used or decommissioned hyperscaler boards?
- The JESD238 standard and HBM3E qualification status
- Re-architecting your BOM when HBM isn’t your option
- FAQ
Why You Can’t Buy HBM3/HBM3E on the Open Market: An Honest 2026 Sourcing Guide
Last month I had three inquiries in one week from procurement teams asking for “HBM3 8-hi stack, 24GB, qty 500, lead time?” Two were from AI startups, one was from a Tier-2 server OEM that had lost its Nvidia allocation slot. I gave all three the same answer: HBM3 isn’t a part you buy. It’s a chip that arrives glued to an SoC inside a TSMC CoWoS package, and the entire 2026 supply was committed in pen-and-ink contracts before Q4 2025.
This is one of the most common HBM3/HBM3E sourcing misunderstandings I see in 2026 BOMs. Engineers reading press releases (“SK hynix completes HBM3E development”) assume there’s a DDR-style open market for a JEDEC-standard part. There isn’t. HBM is a deeply integrated technology (physically, commercially, contractually), and that has consequences for how you should plan an AI accelerator BOM if you’re not Nvidia, AMD, or a hyperscaler with a captive ASIC program.
I’m a Shenzhen-based independent sourcing specialist. I don’t have an HBM allocation. Neither does Digi-Key, Mouser, or any broker who tells you otherwise. What I can do is explain the supply chain honestly, point you at the parts that ARE sourceable, and help you understand which projects need to wait for CoWoS expansion in 2027 versus which ones can be re-architected around DDR5 or LPDDR5X for non-HBM inference workloads.
What to source instead, today
Before we dive into why HBM is locked, here’s the practical short version. If you came to this article via a “where to buy HBM3” search, these are your real options:
| Need | What you can actually get | Channel | Lead time |
|---|---|---|---|
| HBM3-bundled compute (training) | Nvidia H100, AMD MI300X modules | Allocation-controlled OEM channels | 26-52 weeks |
| HBM3E-bundled compute | Nvidia H200 / B100 / B200, AMD MI325X | Same, longer queues | 36+ weeks |
| Inference under 70B params | LPDDR5X-based accelerators (Tenstorrent, Apple Silicon clusters, Qualcomm Cloud AI) | Direct from vendor | 4-12 weeks |
| Memory bandwidth without HBM | DDR5 RDIMM / MRDIMM 8800 MT/s | Distribution channels | 3-5 days for major SKUs in stock; otherwise stock-dependent |
| Discrete HBM3 stacks for board-level integration | Not sourceable | N/A — see below | N/A |
| Discrete HBM3E stacks | Not sourceable | N/A | N/A |
The bottom two rows are the hard answer. Discrete HBM stacks exist as line items in TSMC’s advanced packaging flow, but they don’t exist as something a buyer can purchase, take delivery of, and assemble onto a custom interposer in a contract manufacturer’s facility. The reasons are partly physical and partly commercial. Both matter. For everything else, our LPDDR5 sourcing guide and supply-chain diversification framework cover the re-architecture options.
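To put the DDR5 row of the table in perspective, here is a minimal sketch of the peak-bandwidth arithmetic. It assumes a 64-bit (8-byte) data path per DDR5 channel, ignores ECC bits and real-world efficiency losses, and uses 1 TB/s as a round stand-in for "HBM-class" bandwidth. The function names are illustrative, not from any library.

```python
import math


def ddr5_channel_gbs(mt_per_s: int, data_bytes: int = 8) -> float:
    """Theoretical peak bandwidth of one DDR5 channel in GB/s.

    Assumption: 8 data bytes per transfer (64-bit channel, ECC excluded).
    MT/s * bytes = MB/s, so divide by 1000 to get GB/s.
    """
    return mt_per_s * data_bytes / 1000


def channels_needed(target_gbs: float, mt_per_s: int) -> int:
    """Smallest whole number of channels whose combined peak meets the target."""
    return math.ceil(target_gbs / ddr5_channel_gbs(mt_per_s))


per_channel = ddr5_channel_gbs(8800)
print(f"MRDIMM-8800 peak per channel: {per_channel:.1f} GB/s")   # 70.4 GB/s
print(f"Channels for 1 TB/s peak: {channels_needed(1000, 8800)}")  # 15
```

Fifteen fully populated channels is a multi-socket server problem, not a board-level one, which is why DDR5 is listed as a bandwidth compromise rather than an HBM replacement.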
HBM is glued to a CoWoS interposer, not sold as a chip
A standard DRAM is a single die in a BGA package. You read a datasheet, place an order, it ships in trays, you reflow it onto your PCB. HBM was never designed to work that way. The JEDEC JESD238A standard defines an 8-high or 12-high die stack connected by through-silicon vias (TSVs) at micro-bump pitches that have tightened across generations — broadly in the 40-55 µm range for HBM3/HBM3E, with HBM4 pushing toward hybrid bonding — and a 1024-bit-wide interface that runs at 6.4-9.6 Gbps per pin in HBM3/HBM3E.
You can’t reflow that onto FR4. The signal-integrity requirements demand a silicon interposer to bridge the HBM stack to the compute die. That interposer is the “Wafer” in TSMC’s CoWoS (Chip-on-Wafer-on-Substrate): the dies are bonded onto the interposer wafer, which is then mounted on the package substrate. The HBM and the SoC are bonded to the interposer at TSMC’s Advanced Packaging fabs in Taiwan. They emerge as one integrated module (Nvidia H100, AMD MI300X, Google TPU v5p), and that module is the unit of commerce.
SK hynix, Samsung, and Micron sell their HBM stacks directly to TSMC under long-term agreements tied to specific customer slots. They don’t have a side door for distributors. In practice the stack never leaves the packaging line as a discrete part, because there’s no market for that part: outside TSMC and a handful of other advanced-packaging houses, nobody has an interposer line qualified to bond it.
The CoWoS bottleneck through 2026
CoWoS capacity is the actual rate-limiter on global AI accelerator supply, not raw HBM die output. TrendForce’s memory market reporting tracked TSMC CoWoS capacity expanding from roughly 15K wafers/month in early 2024 to about 75K wafers/month by Q4 2025, with TSMC’s 2026 target around 130-140K wafers/month — a number reaffirmed across multiple 2025 earnings calls (see TSMC’s investor relations archive for the underlying quarterly transcripts).
Even at that level demand outstrips supply. Nvidia alone reportedly absorbs more than 60% of CoWoS-S allocation per TrendForce’s Q3 2025 advanced-packaging commentary, with AMD, Broadcom (custom ASIC), AWS Trainium, and Google TPU dividing most of the rest. Samsung’s I-Cube and Intel’s EMIB are alternative advanced packaging flows but neither is a drop-in substitute, and Samsung’s HBM3E qualification at Nvidia for 12-high stacks has been a moving target through 2025 — initially expected mid-2024, then late-2024, then partial 8-high qualification with 12-high still in negotiation as of early 2026 (see Reuters’ coverage of the Samsung-Nvidia qualification timeline for the back-and-forth).
Decision moment — engineer: If your project needs HBM-class bandwidth (>1 TB/s aggregate) and you’re not on a 2026 allocation list, your design either waits for 2027 CoWoS capacity OR pivots to LPDDR5X-based inference architectures. There is no third option that ships in volume this year.
Decision moment — buyer: “We can get you HBM3” from anyone other than Nvidia, AMD, or the captive-ASIC owners is either a misunderstanding (they mean HBM-bundled modules from gray channels, with all the verification problems that implies) or a scam. Treat it the way you’d treat someone offering you “loose F-22 parts.”
Who has allocation locked through 2026
This isn’t speculation. Public earnings calls and analyst transcripts let you trace the allocation chain:
- Nvidia: dominant CoWoS-S consumer for H100 / H200 / B100 / B200 / B300 production. Jensen Huang stated on multiple FY2026 earnings calls that Nvidia is “supply-constrained, not demand-constrained” and that TSMC packaging is the bottleneck.
- AMD: MI300X and MI325X allocation, growing toward MI350 in 2H 2026. On AMD’s Q2 2025 earnings call (5 August 2025), Lisa Su was explicit that AMD’s data-center GPU revenue is gated by CoWoS slots, not by demand, a framing she had already used on the Q1 2025 call (6 May 2025).
- Broadcom: Google TPU v5/v6 and Meta MTIA v2 manufactured under Broadcom’s custom-silicon program. On Broadcom’s Q4 FY2025 earnings call (11 December 2025), Hock Tan guided AI semiconductor revenue toward the low-teens-billions range for FY2025 with a much larger FY2027 SAM, all of it CoWoS-bound.
- AWS Trainium2 / Inferentia roadmap: Annapurna Labs designs, manufactured at TSMC with HBM3E. Trainium2 is the current named, shipping product; the next-generation Inferentia is on AWS’s public roadmap but has not been formally branded as a successor SKU. Allocation is locked through dedicated AWS-TSMC framework agreements.
- Intel Gaudi 3: HBM2E currently, with Gaudi 4 roadmap targeting HBM3E. Intel’s volume is meaningful but a smaller slice of total demand.
What’s missing from this list: anyone you can place a purchase order with. Allocation runs from memory vendor → TSMC → top-tier customer; the distribution channel isn’t part of the flow. This pattern matches the one I described in our hard-to-find components guide — when supply is rationed by contract rather than market clearing, the broker market for the underlying component is essentially empty.
What about used or decommissioned hyperscaler boards?
This question comes up often enough that I’ll address it directly. Yes, there is a secondary market in retired AI training boards. Hyperscalers refresh fleets every 2-3 years and older Nvidia A100, V100, and increasingly H100 boards do enter resale channels, often through asset-recovery firms in the US and EU.
I’m describing this neutrally because it’s a real phenomenon, but I want to be clear: Cosolvic does not source from this channel. Provenance verification is non-trivial, the boards typically come without BMC firmware or with locked vBIOS, and the warranty/RMA path is whatever the asset-recovery firm is willing to back. For a research lab buying one or two H100 SXM modules to bench-test, this path exists. For a production deployment, it’s an operational risk buyers should weigh against the wait time on direct allocation.
The JESD238 standard and HBM3E qualification status
For engineers writing specs that reference HBM3 or HBM3E by name: the JESD238A spec is the document of record. HBM3 supports up to 819 GB/s per stack at 6.4 Gbps/pin; HBM3E targets 9.2-9.6 Gbps/pin and roughly 1.2 TB/s per stack.
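The per-stack figures above follow directly from the interface width and the pin rate. A quick sketch of that arithmetic, assuming the 1024-bit interface described earlier in this article (the function name is illustrative):

```python
def stack_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Peak bandwidth of one HBM stack in GB/s.

    pin rate (Gbit/s per pin) * number of data pins / 8 bits per byte.
    """
    return pin_rate_gbps * bus_width_bits / 8


for name, rate in [("HBM3", 6.4), ("HBM3E low end", 9.2), ("HBM3E high end", 9.6)]:
    print(f"{name}: {stack_bandwidth_gbs(rate):.1f} GB/s per stack")
# HBM3: 819.2 GB/s per stack
# HBM3E low end: 1177.6 GB/s per stack
# HBM3E high end: 1228.8 GB/s per stack
```

These are theoretical peaks per stack; an accelerator's headline bandwidth is this number multiplied by the stack count on the package.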
SK hynix announced HBM3E sampling in 2023 and is the volume leader. Samsung’s HBM3E 8-high qualified at Nvidia during 2024; 12-high qualification has been the subject of multiple TrendForce updates through 2025, with full qualification reportedly closer in late 2025 / early 2026. Micron joined HBM3E volume production in 2024 and has — per Micron’s investor relations commentary across its FY2025 earnings cycle — stated that HBM is sold out through 2025 with the majority of CY2026 capacity already allocated.
The “sold out” framing is what you should remember. When the three vendors that produce a component all describe themselves as sold out for the next year, there is no buy-side market. There’s only an allocation queue.
Decision moment — engineer: If your bandwidth requirement is <1 TB/s aggregate and your model fits in <70B parameters, re-architect on LPDDR5X now and ship in 2026. In our experience, the 6-12 month penalty for waiting on HBM allocation is usually worse than the bandwidth penalty for using DDR5 or LPDDR5X — but the math depends on your batch-size profile, so run it before committing.
Decision moment — buyer: Don’t sign a 2026 contract that assumes HBM-bundled accelerators unless you have written allocation confirmation from Nvidia, AMD, or the OEM’s allocation team. “We expect to receive” is not the same as “we have allocation.”
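For the engineer-side math mentioned above, a rough starting point is the common memory-bound approximation for LLM decoding: each decoding step streams the full set of weights from memory once, and that one sweep produces one token for every sequence in the batch. The sketch below is a first-order estimate under that assumption only. It ignores KV-cache traffic, and the optional `compute_cap` parameter is a hypothetical stand-in for whatever compute ceiling your accelerator has. The 500 GB/s LPDDR5X figure is an illustrative placeholder, not a vendor spec.

```python
from typing import Optional


def decode_tokens_per_s(
    bandwidth_gbs: float,
    params_billion: float,
    bytes_per_param: float,
    batch: int = 1,
    compute_cap: Optional[float] = None,
) -> float:
    """First-order aggregate decode throughput for a memory-bound model.

    Assumptions: weights are read once per decoding step, KV-cache traffic
    is ignored, and throughput scales with batch size until it reaches
    compute_cap (tokens/s), if one is given.
    """
    weight_gb = params_billion * bytes_per_param   # e.g. 70B at 1 byte = 70 GB
    sweeps_per_s = bandwidth_gbs / weight_gb       # full weight reads per second
    tokens = sweeps_per_s * batch
    if compute_cap is not None:
        tokens = min(tokens, compute_cap)
    return tokens


# Illustrative comparison for a 70B model quantized to 1 byte per parameter.
for label, bw in [("LPDDR5X system (assumed 500 GB/s)", 500.0),
                  ("single HBM3E stack (1228.8 GB/s)", 1228.8)]:
    for batch in (1, 8):
        rate = decode_tokens_per_s(bw, 70, 1.0, batch=batch)
        print(f"{label}, batch {batch}: {rate:.1f} tokens/s")
```

If the LPDDR5X number at your real batch size clears your latency and throughput targets, the bandwidth penalty is likely smaller than the schedule penalty of waiting on allocation. If it does not, that is the signal to look at cloud capacity instead.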
Re-architecting your BOM when HBM isn’t your option
For most teams reading this, the realistic question isn’t “how do I get HBM3” but “how do I deliver an AI workload without it.” That’s solvable for a meaningful subset of designs:
- Inference under 70B parameters often runs well on LPDDR5X with batch-size-aware scheduling. Apple Silicon, Qualcomm Cloud AI, and Tenstorrent’s Blackhole demonstrate this path. Our LPDDR5 sourcing guide covers what’s actually buyable in 2026 volumes.
- Edge inference at the box level rarely needs HBM. The constraint is usually NAND or NOR flash for model weights and DDR5 for activations — both areas covered in our NOR flash 2026 outlook.
- Training under ~13B parameters can fit on multi-GPU clusters using older HBM2/HBM2E parts (V100, A100), which have longer secondary-market history and clearer warranty paths.
- Training >70B parameters genuinely needs HBM3-class bandwidth. If you’re not on an allocation, you’re either renting cloud capacity or waiting until CoWoS expansion lands in late 2026 or 2027.
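The list above can be written down as a small triage function, which is handy when screening many BOM requests at once. This is a sketch of the rules exactly as stated in the list, nothing more: the names `Workload` and `suggest_path` are hypothetical, and cases the list does not address (for example training between 13B and 70B parameters) are deliberately returned as uncovered rather than guessed.

```python
from enum import Enum


class Workload(Enum):
    INFERENCE = "inference"
    EDGE_INFERENCE = "edge_inference"
    TRAINING = "training"


def suggest_path(workload: Workload, params_billion: float) -> str:
    """Map a workload to the re-architecture path named in the list above."""
    if workload is Workload.EDGE_INFERENCE:
        return "no HBM needed: flash for weights, DDR5 for activations"
    if workload is Workload.INFERENCE and params_billion < 70:
        return "LPDDR5X-based accelerator"
    if workload is Workload.TRAINING and params_billion < 13:
        return "multi-GPU cluster on older HBM2/HBM2E parts"
    if workload is Workload.TRAINING and params_billion > 70:
        return "HBM3-class required: rent cloud capacity or wait for CoWoS expansion"
    # Anything else falls outside the list; do not guess.
    return "not covered by the list: evaluate case by case"


print(suggest_path(Workload.INFERENCE, 34))
print(suggest_path(Workload.TRAINING, 30))
```

The parameter thresholds are rules of thumb from this article, so treat the output as a first filter before doing the bandwidth math for a specific design.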
The hardest message I deliver is to AI startups that raised on the assumption they’d own their training silicon. The honest answer is to use cloud capacity through 2026 and revisit the captive-silicon question when CoWoS supply normalizes — or, if the workload is well-suited, to evaluate non-HBM accelerators that solve a narrower problem with parts you can actually buy. Our zero-stock alternatives guide covers the framework.
FAQ
Q: Can I buy HBM3 directly from SK hynix or Micron with a purchase order?
A: No, not as a discrete part for board-level integration. Both vendors sell HBM exclusively through long-term agreements with packaging partners (primarily TSMC) tied to specific end-customer programs. There is no distributor channel and no spot market. Sample requests for engineering evaluation occasionally happen for very large potential customers but don’t translate into volume buys.
Q: What’s the difference between HBM3 and HBM3E in sourcing terms?
A: From a buyer’s perspective, none. Both are equally unavailable on the open market. HBM3E ships in higher-end 2024-2026 accelerators (H200, B100/B200, MI325X) and HBM3 in slightly older parts (H100, MI300X, MI300A). The bandwidth gap is meaningful (at 9.2-9.6 Gbps/pin versus 6.4 Gbps/pin, HBM3E is roughly 45-50% faster per stack), but the supply-chain access path is identical.
Q: When will CoWoS capacity catch up to demand?
A: TrendForce and TSMC commentary through 2025 pointed to mid-to-late 2026 as the period when capacity expansion starts meaningfully outpacing demand growth, with 2027 the first year where CoWoS supply might not be the binding constraint. This is the consensus analyst view, not a guarantee — Nvidia, AMD, and the captive-ASIC programs continue to expand orders.
Q: Are there HBM alternatives in development?
A: Yes. Samsung is pushing 3D-stacked CXL memory, several vendors (Eliyan, Marvell, others) are pitching custom interposer solutions, and HBM4 is on the JEDEC roadmap with first samples expected in 2026 from SK hynix and Micron. None of these change the 2026 sourcing reality. They’re 2027-2028 stories.
Q: I see HBM3 stacks listed on Alibaba and AliExpress — are those real?
A: Almost certainly mislabeled, repackaged from scrapped boards, or counterfeit. There is no legitimate channel that puts loose HBM stacks into the Shenzhen open market. Listings we’ve examined have ranged from genuine HBM2 stacks de-soldered from old boards to outright scams with fake markings. None of them are a path to a working production design.
The honest synthesis is that HBM3 and HBM3E are not buyer-accessible components in 2026, and pretending otherwise wastes engineering time. If your roadmap depends on HBM-class memory bandwidth and you’re not already on an Nvidia, AMD, or hyperscaler allocation slot, the next concrete step is to review your workload requirements against what LPDDR5X-based architectures can deliver, and to plan a 2027 re-evaluation when CoWoS capacity expansion starts catching up to demand.
Have a memory-intensive AI accelerator BOM you’re trying to source? Send us your BOM through our request-a-quote page. We’ll tell you within four hours which lines we have authentic stock for, what’s available within 3-5 days, and which ones (like HBM3 stacks for board-level integration) genuinely require a different approach.