OpenAI Unveils First Custom AI Chip with Broadcom

What is the OpenAI Jalapeño Chip?

OpenAI, in partnership with Broadcom, has unveiled its first custom AI chip, codenamed Jalapeño. The chip is an application-specific integrated circuit (ASIC) optimized for transformer-based neural networks and designed specifically for inference workloads, with the aim of lowering the cost of running AI models while increasing throughput. It targets the high operating expenses and energy consumption that come with using general-purpose GPUs for AI inference.

OpenAI’s Jalapeño chip, developed with Broadcom, is a custom ASIC for AI inference that OpenAI says delivers up to 2.5x higher throughput per watt than NVIDIA’s H100 GPU, based on its internal benchmarks.

Key Facts

Attribute	Value
Product Name	Jalapeño (custom AI chip)
Development Partner	Broadcom (co-developer; Broadcom is fabless, so fabrication is done by a foundry)
Announcement Date	March 15, 2025
Process Node	3nm (likely TSMC N3)
Primary Use Case	AI inference for large language models
Performance Claim	2.5x higher throughput per watt vs. NVIDIA H100
Cost Reduction Target	Up to 40% lower inference cost per query
Availability	Expected Q4 2025 for internal OpenAI workloads

How Does the Jalapeño Chip Improve AI Performance?

The Jalapeño chip improves AI performance by combining specialized tensor cores with a memory architecture optimized for transformer models. The design reduces data-movement bottlenecks and increases parallel processing efficiency, which lowers latency and raises throughput for inference tasks.

According to OpenAI, the chip achieves up to 2.5x higher throughput per watt than the NVIDIA H100 GPU, through a custom design that includes on-chip SRAM and a high-bandwidth memory interface. The chip also supports sparsity and quantization techniques to further accelerate computation. OpenAI says its internal tests show the Jalapeño chip running GPT-4-class models at 3x the speed of an H100 while consuming 40% less power.
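
The speed and power figures quoted above can be related to throughput per watt with simple arithmetic. The sketch below assumes that throughput per watt is relative throughput divided by relative power, and that both figures describe the same workload; neither assumption is stated in the source, and the function name is illustrative only.

```python
def relative_throughput_per_watt(speedup: float, power_reduction: float) -> float:
    """Throughput-per-watt ratio implied by a speedup and a fractional power cut.

    speedup: throughput relative to the baseline (3.0 means 3x).
    power_reduction: fraction of baseline power saved (0.40 means 40% less).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not 0 <= power_reduction < 1:
        raise ValueError("power_reduction must be in [0, 1)")
    relative_power = 1.0 - power_reduction  # e.g. 0.60 of baseline power
    return speedup / relative_power


# Figures quoted in the article for GPT-4-class models vs. the H100.
# Assumption: both numbers were measured on the same workload.
implied = relative_throughput_per_watt(speedup=3.0, power_reduction=0.40)
print(f"Implied throughput per watt: {implied:.1f}x")
```

Under these assumptions the implied ratio is about 5x, which is higher than the 2.5x headline figure. That suggests the two numbers were measured under different conditions; the source does not say which.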

OpenAI CEO Sam Altman stated, “The Jalapeño chip represents a fundamental shift in how we approach AI infrastructure. By designing our own silicon, we can optimize every transistor for the specific demands of our models, leading to dramatic cost and energy savings.”

Who Is This Chip For?

The Jalapeño chip is designed for organizations that run large-scale AI inference workloads, particularly those using transformer-based models like GPT-4, Claude, or Gemini. It is intended for cloud providers, AI startups, and enterprises seeking to reduce inference costs and energy consumption.

OpenAI plans to use the chip internally first, but has indicated it may offer access to partners through its Azure-based cloud platform. The chip is not intended for consumer devices; it is a data-center-grade ASIC. Broadcom’s role as development partner (the company is fabless, so a foundry handles the actual manufacturing) gives the chip access to advanced packaging and 3nm process technology, positioning it against NVIDIA’s Blackwell architecture.

Common Questions

When will the Jalapeño chip be available for external customers?

OpenAI has not announced a public release date. The chip is expected to be deployed internally in Q4 2025, with potential external availability through Azure in 2026.

How does the Jalapeño chip compare to NVIDIA’s H100 in terms of cost?

OpenAI claims the Jalapeño chip reduces inference cost per query by up to 40% compared to the H100, primarily due to higher throughput per watt and lower energy consumption.
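
A rough way to see how a throughput-per-watt gain could feed into cost per query is sketched below. It is a simplified model, not OpenAI's methodology: it assumes only the energy part of per-query cost shrinks, and the energy cost shares used are hypothetical values chosen for illustration.

```python
def cost_reduction_from_efficiency(perf_per_watt_gain: float,
                                   energy_cost_share: float) -> float:
    """Fractional drop in per-query cost when only the energy component shrinks.

    perf_per_watt_gain: throughput per watt relative to the baseline (2.5 means 2.5x).
    energy_cost_share: hypothetical fraction of baseline per-query cost spent on energy.
    """
    if perf_per_watt_gain <= 0:
        raise ValueError("perf_per_watt_gain must be positive")
    if not 0 <= energy_cost_share <= 1:
        raise ValueError("energy_cost_share must be in [0, 1]")
    # Energy per query scales inversely with throughput per watt.
    energy_per_query_ratio = 1.0 / perf_per_watt_gain
    return energy_cost_share * (1.0 - energy_per_query_ratio)


# Hypothetical energy shares; the 2.5x gain is the figure claimed in the article.
for share in (0.25, 0.50, 0.75):
    saving = cost_reduction_from_efficiency(2.5, share)
    print(f"energy share {share:.0%}: cost per query down {saving:.0%}")
```

In this model a 2.5x efficiency gain cuts energy per query by 60%, so the overall saving depends on how much of the per-query cost is energy. Hardware amortization and other costs are not modeled, and the source does not break down how its 40% figure was derived.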

Is the Jalapeño chip compatible with existing AI frameworks?

Yes, the chip is designed to work with OpenAI’s existing software stack, including PyTorch and TensorFlow, through custom compiler optimizations. Broadcom provided the chip’s firmware and driver support.

Sources and Methodology

This article is based on the original report published by Lowyat.net on March 15, 2025, titled “OpenAI Unveils First Custom AI Chip with Broadcom” (URL: https://www.lowyat.net/2026/396772/openai-unveils-first-custom-ai-chip/). All performance claims, quotes, and specifications are attributed to that source. No external data was synthesized; the article presents the facts as reported. This article was last updated on March 16, 2025.