Baidu Pushes Into AI Chip Development as Large Models Drive Demand for Supernodes


TMTPOST -- Baidu Inc (BIDU.O) is intensifying its investment in in-house chip development as the artificial intelligence industry grapples with an uneven value chain that heavily favors hardware over applications.

Speaking at the Baidu World Conference, founder Robin Li described the AI industry’s structure as “extremely unhealthy and unsustainable,” noting that while chips capture the bulk of revenue, it is applications that generate actual value.

“To capture ten or even a hundred times more value at the model or application layer, companies must regain control over the chip layer,” Li said.

Baidu is not alone in this approach. Global tech giants such as Amazon, Microsoft, Google, and OpenAI, as well as domestic firms including Alibaba, Huawei, and Tencent, have all embarked on in-house chip strategies to counter restrictions imposed by Nvidia Corp (NVDA.O) and other suppliers.

Baidu’s Kunlun Chip team, founded in 2011, initially focused on computational acceleration using FPGAs for early AI applications like AlexNet and speech recognition models. With the rise of large-scale recommendation systems, Baidu began developing its own custom chips through the Kunlun project.

In 2021, Kunlun Chip was spun off from Baidu Group as Kunlunxin to focus on next-generation AI hardware optimized for large models. Products such as the P800 have become central to Baidu's large model training and inference operations.

At this year’s conference, Shen Dou, president of Baidu Intelligent Cloud Business Group, introduced two new AI chips — the Kunlun M100 and M300 — alongside plans for “supernodes” designed to connect hundreds or thousands of GPU cards into high-performance clusters.

The advent of Transformer-based models has standardized AI architectures, creating clearer targets for chip developers. Standardization allows the entire supply chain to optimize costs and performance, creating a virtuous cycle where better chips drive more advanced applications, which in turn increase demand for compute power.

However, the rapid expansion of model sizes, sometimes reaching trillions of parameters, has dramatically increased the demand for computing resources, energy, and infrastructure. This has created unprecedented challenges for chip design, particularly around efficiency and scale.
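The memory pressure alone is easy to see with back-of-the-envelope arithmetic (the card capacity below is a hypothetical figure for illustration, not a Kunlun specification): a trillion-parameter model in BF16 needs roughly 2 TB just to hold its weights, before activations, optimizer state, or KV caches.

```python
import math

def min_cards_for_weights(n_params: int, bytes_per_param: int,
                          card_mem_gib: float) -> int:
    """Minimum accelerator cards needed just to hold the model weights."""
    total_gib = n_params * bytes_per_param / 2**30
    return math.ceil(total_gib / card_mem_gib)

# A 1-trillion-parameter model in BF16 (2 bytes/param) on hypothetical
# 64 GiB cards already needs dozens of cards for the weights alone.
print(min_cards_for_weights(10**12, 2, 64))  # → 30
```

Everything beyond the weights (gradients, optimizer state, activation memory) multiplies this further, which is why single-server deployments stop being an option at this scale.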

Reducing computational precision, from BF16 down to FP8 or FP4, lets manufacturers raise throughput substantially by trading away precision that many workloads do not need. At the same time, chip architectures must evolve in step with changes in model structures to sustain that efficiency.
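The trade-off can be illustrated with a toy rounding model (illustrative only; it mimics mantissa width and ignores exponent range and the exact FP8/FP4 encodings any particular chip uses). BF16 keeps 7 explicit mantissa bits, FP8 (E4M3) keeps 3, and FP4 (E2M1) just 1, so rounding error grows as precision drops:

```python
import math

def round_to_mantissa(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest value with the given explicit mantissa width."""
    if x == 0.0:
        return 0.0
    e = math.floor(math.log2(abs(x)))   # exponent of x
    step = 2.0 ** (e - mantissa_bits)   # spacing of representable values
    return round(x / step) * step

x = 1.7
for name, bits in [("BF16", 7), ("FP8-E4M3", 3), ("FP4-E2M1", 1)]:
    q = round_to_mantissa(x, bits)
    print(f"{name}: {q:.6f} (rel. error {abs(q - x) / x:.4f})")
```

The relative error climbs from a fraction of a percent at BF16 to over ten percent at FP4, which is tolerable for many inference workloads but forces co-design between numerics, model structure, and hardware.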

Baidu is now focused on integrating individual chips into large-scale systems known as supernodes. These configurations link dozens or hundreds of GPU cards within a single server, dramatically reducing costs compared with standalone deployments.

Scaling these systems introduces new engineering challenges. A cluster of a few thousand GPUs can remain productive at roughly 98% availability, but as deployments grow to tens of thousands of cards, even minor disruptions can cascade into system-wide failures. Verifying accuracy at such scales often requires months of costly testing.
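A rough probabilistic sketch shows why scale amplifies failures (the availability figure below is an assumption for illustration, not a measured number): if a synchronous training job stalls whenever any one card is unhealthy, the chance that every card is up falls exponentially with cluster size.

```python
def p_all_healthy(per_card_availability: float, n_cards: int) -> float:
    """Probability every card in a synchronous job is healthy at once,
    assuming independent failures."""
    return per_card_availability ** n_cards

# With a hypothetical 99.99% per-card availability, the whole-cluster
# picture degrades sharply as the card count grows:
for n in (2_000, 10_000, 30_000):
    print(f"{n:>6} cards: {p_all_healthy(0.9999, n):.1%} chance all healthy")
```

Under these assumed numbers, a 2,000-card job is usually healthy while a 30,000-card job almost never is, which is why fault tolerance and fast recovery become first-class design problems at supernode scale.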

“AI computing is no longer just stacking GPUs,” Shen said. “It has entered a new era of engineering and scientific exploration.”

Kunlunxin has now produced three generations of chips. The first focused on internal Baidu data centers, the second targeted enterprise customers, and the third generation is optimized for the demands of large AI models.

Most of Baidu’s inference tasks for large models are now handled by Kunlunxin P800 clusters. With over 10,000 GPUs deployed across multiple clusters, the company says it can train increasingly complex multimodal models efficiently.

The newly announced M100 is designed for large-scale inference scenarios and optimized for MoE (Mixture of Experts) models. It is expected to launch in early 2026. The M300, slated for 2027, will support both inference and ultra-large-scale training, targeting multimodal AI workloads.
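MoE models route each token to a small subset of specialist subnetworks, so most of the model's parameters sit idle for any given token, which is what makes them attractive targets for inference-oriented silicon. A toy top-k gating sketch (illustrative pure Python, not Kunlunxin's implementation; real systems batch this on the accelerator):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_weights, k=2):
    """Route one token through the top-k experts and mix their outputs."""
    # Gate: score every expert, but only run the top-k of them.
    scores = [sum(w * t for w, t in zip(row, token)) for row in gate_weights]
    topk = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    gates = softmax([scores[i] for i in topk])  # renormalize over chosen experts
    out = [0.0] * len(token)
    for g, i in zip(gates, topk):
        y = experts[i](token)                   # only k experts do any compute
        out = [o + g * yi for o, yi in zip(out, y)]
    return out

# Four toy "experts", each just scaling the token:
experts = [lambda t, s=s: [s * x for x in t] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.0], [0.9, 0.0], [0.0, 0.2], [0.0, 0.8]]
print(moe_forward([1.0, 0.0], experts, gate_weights, k=2))
```

Because only k of the experts run per token, compute per token stays flat as the expert count (and total parameter count) grows, shifting the hardware bottleneck toward memory capacity and the interconnect that shuffles tokens between experts.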

The Kunlunxin software stack is compatible with mainstream deep learning frameworks and with CUDA-based software, allowing customers in the telecom, energy, finance, and internet sectors to integrate the chips into their operations. Reported clients include China Merchants Bank, China Southern Power Grid, China Iron & Steel Research Institute, China Oil & Gas Pipeline Network, Geely Auto, and leading Chinese internet firms, with deployments ranging from dozens to tens of thousands of GPUs.

Baidu first launched 32-card and 64-card P800 supernodes in April 2025. The Tianchi 256 integrates 256 P800 cards into a single node, quadrupling interconnect bandwidth and improving performance by more than 50%. Tianchi 512 doubles this card count and bandwidth, enabling training of trillion-parameter models.

Future supernodes, including 1,000-card and 4,000-card configurations, will leverage the newly launched M-series chips, starting in the second half of 2027. Shen said Kunlunxin plans to release new products annually over the next five years.

“While the power of a single chip is the foundation, large model training and inference require multiple chips working in close coordination,” Shen said. “Supernodes enable dozens or even hundreds of chips to operate like a single superchip, maximizing communication efficiency.”
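The communication-efficiency claim has a well-known basis: in a ring all-reduce, the standard pattern for synchronizing gradients, the data each chip must move stays nearly constant as the ring grows, so wall-clock cost is dominated by per-link bandwidth, exactly what a supernode's faster interconnect improves. A minimal sketch of the generic volume formula (textbook analysis, not Baidu's protocol):

```python
def ring_allreduce_volume(payload_bytes: float, n_chips: int) -> float:
    """Bytes each chip sends in a ring all-reduce of one gradient payload:
    (n-1)/n of the payload in reduce-scatter, plus the same in all-gather."""
    return 2 * (n_chips - 1) / n_chips * payload_bytes

# Per-chip traffic barely grows from 64 to 512 chips; what changes the
# wall-clock time is the link bandwidth the supernode provides.
for n in (64, 256, 512):
    print(f"{n} chips: {ring_allreduce_volume(1.0, n):.4f} x payload per chip")
```

Per-chip volume asymptotes to twice the payload regardless of ring size, so doubling interconnect bandwidth roughly halves synchronization time, which is why Tianchi-style nodes emphasize bandwidth over raw card count alone.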

Baidu’s efforts reflect a broader trend of AI companies moving to control the hardware that underpins next-generation models. By combining chip development, system engineering, and software optimization, firms hope to reduce dependency on foreign suppliers, increase efficiency, and capture more value from AI applications.

As AI models grow in size and complexity, companies that can integrate hardware, software, and large-scale systems are likely to maintain a competitive advantage.

This article was published on TMTPost with the authorization of its author, zhangxinyue, and was edited by TMTPost. When republishing, please credit the source and author and link to the original.