Recently, Chinese artificial intelligence (AI) firm DeepSeek introduced its latest large language model, R1, sending shockwaves through the tech industry. R1 wasn’t just on par with the best AI models available; it was built at a fraction of the usual cost and released for free. The financial world reacted swiftly, with United States (US) stocks shedding roughly $1 trillion in value in a single trading day within a week of R1’s release.

The implications of DeepSeek’s move extended far beyond these financial tremors. By openly sharing the details of how R1 and its predecessor, V3, were developed and making these models freely accessible, DeepSeek shattered a long-held industry belief that reasoning-based AI models were extraordinarily difficult and expensive to create. This revelation had an immediate impact, triggering a rapid response from major AI competitors.

The reaction from competitors and the rapid shifts in the industry raise the question: what exactly did DeepSeek do to cause such a massive upheaval, and is the hype surrounding R1 justified? Answering that requires a closer look at how large language models are built.

Training these models involves two primary phases: pre-training and post-training. In pre-training, the model learns to generate text by analysing vast amounts of publicly available documents (essentially much of the public internet) and processing them repeatedly. This results in a base model that possesses extensive knowledge but lacks task-specific refinements. The process is computationally intensive and accounts for the largest share of the cost of AI development.

The model then undergoes post-training to refine its capabilities. One key component of this is supervised fine-tuning (SFT), where human trainers curate question-answer pairs and teach the model to respond accurately. An additional part of post-training, pioneered by US-based OpenAI, is reinforcement learning from human feedback (RLHF), where human reviewers score AI-generated responses. These scores are used to further train the model so that it produces answers that earn ever higher scores. While effective, these methods are costly and time-consuming, as they require substantial human labour.

While pre-training is still needed, DeepSeek’s breakthrough came from largely eliminating the need for human feedback in post-training. Instead of relying on human-generated scores, DeepSeek implemented an automated reinforcement-learning approach in which a computer, not a person, scores the model’s responses. In other words, automated checks stand in for the human reviewers, and for the separate scoring model that is usually trained on their judgments. This significantly cuts costs and training time while maintaining (relatively) high accuracy. The trade-off is that while computers are highly effective at scoring answers to objective questions, such as those involving math or programming, they struggle with subjective or open-ended queries. This explains why R1 excels at mathematical reasoning and code generation, making it highly relevant to markets like India, where a surge in AI-driven code generation could bankrupt our outsourcers.
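To make the idea of a computer scoring answers concrete, here is a minimal sketch in Python. It is an illustration, not DeepSeek’s actual code: the function names, the "Answer:" marker and the exact-match rule are all assumptions chosen for simplicity. It shows why objective questions are easy to grade automatically: there is a single reference answer to compare against.

```python
import re
from typing import List, Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' convention is an assumption made for this sketch.
    """
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None


def score_response(response: str, reference: str) -> float:
    """Score 1.0 if the stated final answer matches the reference, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0  # no clearly stated answer, so nothing to verify
    return 1.0 if answer == reference.strip() else 0.0


def rank_candidates(candidates: List[str], reference: str) -> List[str]:
    """Order candidate responses from highest to lowest automated score."""
    return sorted(candidates, key=lambda c: score_response(c, reference), reverse=True)


if __name__ == "__main__":
    candidates = [
        "I think it is about 40.",
        "6 * 7 = 42\nAnswer: 42",
        "6 * 7 = 41\nAnswer: 41",
    ]
    for text in candidates:
        print(score_response(text, "42"), repr(text))
```

No such reference answer exists for a question like "Is this poem moving?", which is where a purely automated scorer of this kind runs out of road.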

But it also presents a challenge for the AI industry, whose engineers will now need to focus on improving these models’ handling of subjective reasoning. This doesn’t come easily; context and subjectivity are hard to convert into computer code. R1’s lack of contextual reasoning has already drawn ridicule from users who have fed it riddles that most people can answer intuitively but the model cannot.

Importantly, on the hardware side, DeepSeek has also found ways to get more out of older computing infrastructure, allowing it to train high-quality AI models without relying on cutting-edge graphics processing units (GPUs). Most AI engineers use Nvidia’s CUDA software to maximise the performance of AI chips, but DeepSeek took a different approach for parts of its code, programming in an assembly-like language that communicates more directly with the hardware. This requires deep engineering expertise and painstaking effort, but it extracts significantly more efficiency and performance from existing hardware, thereby reducing reliance on newer AI chips, whose sale to China has been embargoed by the US government.

In the wake of R1’s release, Chinese and US companies scrambled to react. Tech giant Alibaba quickly introduced an updated version of its language model, Qwen, while the Allen Institute for AI (AI2), a leading non-profit research organisation in the US, released an upgraded version of its model, Tulu. OpenAI’s co-founder and CEO, Sam Altman, acknowledged R1’s capabilities, calling it impressive while claiming that OpenAI would deliver much better models. Sure enough, within days OpenAI released a new reasoning model, o3-mini, and also launched ChatGPT Gov, a version of its chatbot tailored for US government agencies. The latter move indirectly addressed concerns about DeepSeek’s Chinese origins, amid speculation that the Chinese-developed model could be used to harvest user data or censor content in line with Beijing’s regulations.

But despite DeepSeek’s claims that it spent less than $6 million to train V3, its advancements didn’t come from thin air. The company’s success builds upon an existing foundation of AI research and technological developments. Dario Amodei, Anthropic’s CEO, who is not a disinterested observer, claims that DeepSeek has access to around $1 billion worth of GPUs, based on reports that the company used approximately 50,000 Nvidia H100 GPUs for training. This suggests that while DeepSeek’s approach may be cheaper and less dependent on newer chips, it still requires substantial computing resources.
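A quick back-of-the-envelope check, sketched in Python, shows how far apart these two figures are. It uses only the numbers quoted above; the variable names are illustrative, and treating "less than $6 million" as exactly $6 million is a simplifying assumption, so the resulting ratio is a lower bound. Note that the two figures measure different things (hardware owned versus the cost of one training run), so the comparison is indicative only.

```python
# Figures as quoted in the article; none are independently verified here.
reported_gpu_count = 50_000                   # GPUs DeepSeek reportedly used
estimated_fleet_value_usd = 1_000_000_000     # Amodei's rough estimate of their worth
claimed_v3_training_cost_usd = 6_000_000      # upper bound of DeepSeek's stated V3 cost

# Implied average value of one GPU under the two reported figures.
implied_value_per_gpu_usd = estimated_fleet_value_usd / reported_gpu_count

# How many times larger the estimated hardware base is than the claimed training bill.
fleet_to_training_ratio = estimated_fleet_value_usd / claimed_v3_training_cost_usd

print(f"Implied value per GPU: ${implied_value_per_gpu_usd:,.0f}")
print(f"Hardware estimate vs training claim: about {fleet_to_training_ratio:.0f}x")
```

On these numbers, the estimate works out to about $20,000 per GPU, and the hardware base is worth well over a hundred times the headline training cost.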

The timing of R1’s release was particularly interesting. Within days of its launch, a series of competing reasoning models were released or updated, including OpenAI’s o3-mini and Google DeepMind’s Gemini 2.0 Flash Thinking. The sudden wave of releases suggests that major AI firms had already developed reasoning models but, for reasons of their own, were holding them back until DeepSeek forced their hand. If reasoning models are easier to build than previously believed, then AI is poised to advance much faster than anticipated. We could soon see an explosion of freely available, high-performance AI models that are vastly more capable than anything currently on the market. The barriers to entry for cutting-edge AI are falling rapidly, and the next wave of AI innovation is likely to reshape industries, economies, and global technological dominance at an unprecedented pace.

Siddharth Pai is a technology consultant and venture capitalist. The views expressed are personal.