The success of DeepSeek is a testament to the power of engineering innovation in AI development, said Gao Wen, an academician at the Chinese Academy of Engineering and director of the Pengcheng Laboratory, in an interview with China Central Television (CCTV) aired on Sunday.
DeepSeek-R1, a large language model developed by Hangzhou DeepSeek Technology, has garnered global attention for performance comparable to top-tier international models, achieved at roughly one-thirtieth the development cost of similar products.
The DeepSeek series of models has been made available on open-source platforms, where domestic developers can test and verify them with support from the Pengcheng Laboratory led by Gao, who is also a deputy to the National People's Congress, China's national legislature.
The success of DeepSeek-R1 is partly attributed to the robust infrastructure and technological advancements in China's computing power network. In 2022, the first phase of the "China Computing Power Network," known as the "Intelligent Computing Network," was officially launched.
This network connects and manages more than 20 computing power centers of different types across different locations, with its aggregate computing power gradually increasing to 5 exaflops, or five quintillion calculations per second.
One of its key computing power hub nodes is the "Pengcheng Cloud Brain II," the Artificial Intelligence (AI) computing platform of the Pengcheng Laboratory.
It began operation in 2020 with a peak computing power of one exaflop, or one quintillion calculations per second. This represents a tenfold increase over its predecessor, "Pengcheng Cloud Brain I," which could perform 100 petaflops, or 100 quadrillion calculations per second. The upgrade was completed in just one year.
Gao said the growth was driven mainly by the heavy computing demands of language models.
"When developing 'Pengcheng Cloud Brain I,' the focus was on discriminative AI, which is primarily used for image recognition tasks, such as identifying individuals in photos. This type of AI typically requires less computing power, requiring just 100 Peta. As we've calculated, language models require higher computing and storage capabilities due to the vast availability of language data. As a result, the computing power for language processing needs to be 10 times greater than that for image processing," he said.
"Pengcheng Cloud Brain II" has achieved remarkable success in global high-performance computing benchmarks. It has clinched the top spot nine times in a row in the IO500 overall ranking, which measures data throughput capabilities of high-performance platforms. It has also topped the international AI computing power performance AIPerf500 ranking for four consecutive sessions.
Based on "Pengcheng Cloud Brain II," the Pengcheng Laboratory has built an AI training platform capable of handling ultra-large-scale AI models with hundreds of billions of parameters. "Pengcheng Mind" is one such ultra-large-scale natural language processing model trained and operated on "Pengcheng Cloud Brain II."
Reflecting on DeepSeek's success, Gao emphasized the importance of engineering optimization in AI development. By focusing on efficient training and deployment strategies, DeepSeek has set a new benchmark for large language models.
"Actually, this is where DeepSeek's ingenuity lies. The technology behind ‘Pengcheng Mind’ and ChatGPT is exactly the same. There is a model called the attention mechanism. For example, when a computer processes an article, it forgets the beginning by the time it reaches the end. However, GPT is a transformer. It invented the attention mechanism, or attention model, allowing it to focus on relevant information while filtering out the unnecessary, or zeroing in on what matters most and disregarding the trivial," he said.
"In terms of engineering, DeepSeek has done something that no one else can do. What is its technical approach? It's called a Mixture of Experts (MoE) system. DeepSeek has done this by training it in specific domains with specific expressions, so the training cost is not that high. It has a total of 256 expert models, but you don't need to load all 256 when using it; you can get by with just eight at most. This means the cost of using it is very low, and the training time can be saved. I believe DeepSeek is not an innovation in theory, but more in engineering," he added.
