
DeepSeek Fast-Tracks Launch Of R2 AI Model

With its large A100 cluster, High-Flyer and DeepSeek have attracted top research talent in China.
DeepSeek logo is seen in this illustration taken January 27, 2025. REUTERS/Dado Ruvic/Illustration

Chinese startup DeepSeek, which caused a $1 trillion sell-off in global equities with its AI reasoning model, is now fast-tracking the launch of R2, its successor to the R1 model.

DeepSeek plans to accelerate the release of its R2 model, originally set for May, aiming for improved coding and multilingual reasoning capabilities. The updated timeline has not been previously disclosed.

Rivals are still digesting the implications of R1, which was built with less-powerful Nvidia chips but is competitive with models developed at a cost of hundreds of billions of dollars by U.S. tech giants.

“The launch of DeepSeek’s R2 model could be a pivotal moment in the AI industry,” said Vijayasimha Alilughatta, chief operating officer of Indian tech services provider Zensar. DeepSeek’s success at creating cost-effective AI models “would likely spur companies worldwide to accelerate their own efforts … breaking the stranglehold of the few dominant players in the field,” he said.

DeepSeek’s R2 model release is likely to concern the U.S. government, as it could strengthen China’s AI leadership. The company, founded by billionaire Liang Wenfeng of the High-Flyer hedge fund, has been quietly gaining momentum, with multiple Chinese companies integrating DeepSeek models. Liang, known for his low profile, has not spoken to the media since July 2024.

According to research by Reuters, DeepSeek documents told the story of a company that functioned more like a research lab than a for-profit enterprise and was unencumbered by the hierarchical traditions of China’s high-pressure tech industry, even as it became responsible for what many investors see as the latest breakthrough in AI.

Computing Power

DeepSeek’s success with its low-cost AI model is built on High-Flyer’s decade-long investment in AI research and computing power.

High-Flyer, an early pioneer in AI trading, reinvested 70% of its revenue into AI research, spending 1.2 billion yuan on two supercomputing AI clusters in 2020-2021, including the Fire-Flyer II with 10,000 Nvidia A100 chips.

This significant investment, made before DeepSeek’s creation, attracted the attention of Chinese securities regulators.

“Regulators wanted to know why they need so many chips?” the person said. “How they were going to use it? What kind of impact would that have on the market?”

Chinese authorities chose not to intervene, and High-Flyer’s Fire-Flyer II AI cluster, equipped with Nvidia A100 chips, was already operational by the time the U.S. banned A100 exports to China in 2022.

DeepSeek now enjoys support from Beijing but has been instructed not to engage with the media without approval. Authorities are concerned that too much publicity could attract unwanted attention.

With its large A100 cluster, High-Flyer and DeepSeek have attracted top research talent in China.

Cost-Effective AI Architecture

Some Western AI entrepreneurs, like Scale AI CEO Alexandr Wang, have claimed that DeepSeek had as many as 50,000 higher-end Nvidia chips that are banned for export to China. He has not produced evidence for the allegation or responded to Reuters’ requests to provide proof.



Two former employees attributed the company’s success to Liang’s focus on more cost-effective AI architecture.

DeepSeek used cost-efficient techniques like Mixture-of-Experts (MoE) and multi-head latent attention (MLA) to achieve competitive AI model performance at a fraction of the cost of competitors.

MoE activates only the relevant areas of a model, while MLA processes multiple aspects of information simultaneously. DeepSeek’s models were 20 to 40 times cheaper than OpenAI’s equivalents, and the success of its R1 and V3 models prompted rivals like OpenAI and Google to cut prices and adjust strategies.
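To make the MoE idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration of the general technique only, not DeepSeek's implementation: the function names (`moe_forward`, `make_expert`), the toy experts, the gate weights, and the choice of `top_k=2` are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input vector x to the top_k highest-scoring experts only."""
    # Router: one score per expert (dot product of x with that expert's gate vector).
    scores = [sum(w * xi for w, xi in zip(gate, x)) for gate in gate_weights]
    # Keep only the top_k experts; the others are never evaluated, which saves compute.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    # Output is the weighted sum of the chosen experts' outputs.
    output = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        output = [o + w * yi for o, yi in zip(output, y)]
    return output, chosen

def make_expert(scale):
    """Hypothetical toy expert: multiplies every input element by a constant."""
    return lambda x: [scale * xi for xi in x]

# Four toy experts and hypothetical gate vectors (illustrative values only).
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("output:", output)
```

In this toy setup only two of the four experts run for the given input, which is the source of the cost saving the technique is known for; a production model would use learned gate weights and neural-network experts.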

Adnan Masood of U.S. tech services provider UST told Reuters that his laboratory had run benchmarks that found R1 often used three times as many tokens, or units of data processed by the AI model, for reasoning as OpenAI’s scaled-down model.
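The arithmetic behind that caveat can be sketched as follows. This is a rough back-of-the-envelope calculation, and it rests on a loud assumption the article does not confirm: that the "20 to 40 times cheaper" price figure and the "three times as many tokens" figure refer to the same pair of models. The function name `effective_cost_ratio` is hypothetical.

```python
def effective_cost_ratio(price_ratio, token_ratio):
    """How many times cheaper a task is once extra token usage is accounted for.

    price_ratio: how many times cheaper each token is (assumed: 20 to 40).
    token_ratio: how many times more tokens are used (assumed: 3).
    """
    return price_ratio / token_ratio

low = effective_cost_ratio(20, 3)
high = effective_cost_ratio(40, 3)
print(f"Effective advantage: roughly {low:.1f}x to {high:.1f}x cheaper per task")
```

Under those assumptions, heavier token use would narrow the per-task price advantage but not eliminate it.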

State Embrace

Before R1 gained global attention, DeepSeek gained favor with Beijing. In January, Liang met with Chinese Premier Li Qiang as the AI sector’s representative, ahead of leaders from more prominent firms.

The success of DeepSeek’s cost-effective models has strengthened Beijing’s belief in China’s ability to out-innovate the U.S., with Chinese companies and government bodies quickly adopting DeepSeek’s models.

At least 13 Chinese city governments and 10 state-owned energy companies say they have deployed DeepSeek into their systems, while tech giants Lenovo, Baidu and Tencent – owner of China’s largest social media app WeChat – have integrated DeepSeek’s models into their products.

Chinese leader Xi Jinping and Li “have signalled they endorse DeepSeek,” said Alfred Wu, an expert on Chinese policymaking at Singapore’s Lee Kuan Yew School of Public Policy. “Now everyone just endorses it.”

The Chinese embrace comes as governments from South Korea to Italy remove DeepSeek from national app stores, citing privacy concerns.

“If DeepSeek becomes the go-to AI model across Chinese state entities, Western regulators might see this as another reason to escalate restrictions on AI chips or software collaborations,” said Stephen Wu, an AI expert and founder of hedge fund Carthage Capital.

Further limits on advanced AI chips are a challenge that Liang has acknowledged.

“Our problem has never been funding,” he told Waves in July. “It’s the embargo on high-end chips.”

(With inputs from Reuters)