DeepSeek says it could earn 5 times more than what it spends: What does it mean

By HT News Desk | Edited by Ashley Paul
Mar 01, 2025 04:34 PM IST

The company compared its inferencing costs with its sales over a 24-hour period on February 28.

DeepSeek, the Chinese technology company that launched its revolutionary R1 model in January, has revealed that its “theoretical” profit margin could be over five times its costs. This is one of the few times that something close to the actual expenses of developing and running an AI model has been released to the public.

DeepSeek, a company which seemingly aims to be transparent about its operations, is steadily revealing more and more information. (Reuters)

The new startup, which made waves and triggered a stock rout in the global technology industry with its innovative and inexpensive approach to building AI models, said that comparing the inferencing costs of its V3 and R1 models with their sales over a 24-hour period on February 28 put its profit margin at 545%.

Inferencing refers to running a trained AI model to respond to requests in real time; its cost covers the computing power, electricity, data storage and other resources needed to do so.

However, DeepSeek said only a small number of its services are monetised and it offers discounts during off-peak hours, so its actual revenues are significantly lower. Nor do the costs factor in all the R&D and training expenses for building its models, it stated on GitHub.
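
To make the 545% figure concrete, here is a minimal sketch of the arithmetic, assuming the margin is expressed as profit divided by cost. The function names and the cost of 100 units are illustrative placeholders, not figures from DeepSeek.

```python
def revenue_multiple(margin_pct: float) -> float:
    """Revenue as a multiple of cost, given profit margin as a percentage of cost."""
    return 1 + margin_pct / 100


def cost_profit_margin(revenue: float, cost: float) -> float:
    """Profit margin as a percentage of cost: (revenue - cost) / cost * 100."""
    return (revenue - cost) / cost * 100


# Hypothetical cost of 100 units, used only to illustrate the ratio.
example_cost = 100.0
example_revenue = example_cost * revenue_multiple(545)

print(f"Revenue multiple: {revenue_multiple(545):.2f}x cost")
print(f"Margin check: {cost_profit_margin(example_revenue, example_cost):.0f}%")
```

Under that assumption, a 545% margin means theoretical revenue of about 6.45 times the inferencing cost, which leaves a profit of a little over five times the cost.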

Companies from OpenAI to Anthropic are experimenting with various revenue models, from subscriptions to usage-based charges to licensing fees, as they race to build ever more sophisticated AI products. But investors are questioning these business models and their return on investment, opening a debate on the feasibility of reaching profitability anytime soon.

While rolling out the hypothetical profit margins that DeepSeek estimates it might achieve, the company also noted that its online service recorded 73,700 input and 14,800 output tokens per second per H800 node.

The 20-month-old startup also gave an overview of its operations, including how it optimized computing power by balancing load, that is, managing traffic so that work is evenly distributed between multiple servers and data centers.
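
As a rough illustration of the general idea of load balancing, the sketch below sends each incoming request to whichever server currently has the least work. This is a generic least-loaded strategy, not DeepSeek's actual method; the function name, server names and request costs are all hypothetical.

```python
import heapq


def assign_requests(request_costs, server_names):
    """Assign each request to the server with the smallest current load."""
    # Min-heap of (current_load, server_name); the least-loaded server is on top.
    heap = [(0, name) for name in server_names]
    heapq.heapify(heap)
    assignment = {name: [] for name in server_names}

    for cost in request_costs:
        load, name = heapq.heappop(heap)   # pick the least-loaded server
        assignment[name].append(cost)      # give it this request
        heapq.heappush(heap, (load + cost, name))  # record its new load

    return assignment


# Hypothetical workload: four requests of differing cost, two servers.
result = assign_requests([5, 3, 2, 4], ["server-a", "server-b"])
for server, costs in result.items():
    print(server, costs, "total load:", sum(costs))
```

Real systems balance on far more signals than a single cost number, but the principle is the same: avoid leaving one server overloaded while another sits idle.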

