DeepSeek's Geopolitical Impacts
China’s technological progress is far more vulnerable to a “friendly” US than to a “hostile” one
Lenin has often been quoted (possibly mis-quoted) as saying “There are decades where nothing happens; and there are weeks when decades happen.” Regardless of the accuracy of the attribution, this week will clearly go down in history in the latter category. Earlier this week, DeepSeek Artificial Intelligence Co., Ltd. - a subsidiary of the Chinese hedge fund “High-Flyer Quant” - released the latest version of its Large Language Model (LLM) - DeepSeek R1.
Ever since ChatGPT was popularized among consumers, prominent tech companies across the world have been working on their own LLMs - be it Meta’s Llama, xAI’s Grok, Anthropic’s Claude, or 01.AI’s Yi. What makes DeepSeek’s model tower above the aforementioned competitors is that it reportedly achieves comparable or superior performance across benchmarked categories while spending a fraction of the time & money required by the next best competitor. For reference, DeepSeek reportedly spent roughly $6M USD to train its model, using about 2.8M GPU-hours on 2000+ Nvidia H800 GPUs (an export-compliant variant of Nvidia’s H100 with reduced chip-to-chip interconnect bandwidth). It achieved this feat in less than 2 months. This is less than 10% of the cost of the next cheapest model - Llama 3 (at least $70M spent), & less than 6% of the GPU-hours spent by the next fastest non-Chinese competitor - OpenAI’s GPT-4 (approximately 50-60M GPU-hours spent), despite the latter’s access to Nvidia GPUs that are not subject to export restrictions. Better yet, many of DeepSeek’s features & functionalities are open-source under an “MIT license” - meaning that anyone can copy, modify, & distribute the associated software & documentation free of charge, provided that the license notice is retained.
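The percentages quoted above can be checked with simple arithmetic. The sketch below is only a sanity check on the figures as reported in this article, not an independent estimate: the variable names are illustrative, the GPU count of 2000 is an assumption standing in for the "2000+" lower bound, & the implied hourly rate is derived purely from the quoted cost & GPU-hours.

```python
# Sanity check of the training-cost comparison, using only the figures
# reported in the article. None of these inputs are verified here.

deepseek_cost_usd = 6e6       # "roughly $6M USD"
deepseek_gpu_hours = 2.8e6    # "about 2.8M GPU hours"
deepseek_gpu_count = 2000     # ASSUMPTION: lower bound of "2000+" GPUs
llama3_cost_usd = 70e6        # "at least $70M spent"
gpt4_gpu_hours_low = 50e6     # "approximately 50-60M GPU-hours"
gpt4_gpu_hours_high = 60e6


def percent_of(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Cost relative to Llama 3 (claimed: less than 10%).
cost_share_pct = percent_of(deepseek_cost_usd, llama3_cost_usd)

# GPU-hours relative to GPT-4 (claimed: less than 6%).
gpu_share_pct_max = percent_of(deepseek_gpu_hours, gpt4_gpu_hours_low)
gpu_share_pct_min = percent_of(deepseek_gpu_hours, gpt4_gpu_hours_high)

# Wall-clock time if all GPUs run in parallel (claimed: under 2 months).
wall_clock_days = deepseek_gpu_hours / deepseek_gpu_count / 24

# Hourly rate implied by the quoted cost and GPU-hours.
implied_usd_per_gpu_hour = deepseek_cost_usd / deepseek_gpu_hours

if __name__ == "__main__":
    print(f"Cost vs Llama 3:      {cost_share_pct:.1f}%")
    print(f"GPU-hours vs GPT-4:   {gpu_share_pct_min:.1f}%-{gpu_share_pct_max:.1f}%")
    print(f"Wall-clock time:      {wall_clock_days:.0f} days")
    print(f"Implied $/GPU-hour:   {implied_usd_per_gpu_hour:.2f}")
```

With these inputs the quoted claims are internally consistent: the cost share comes out under 10%, the GPU-hour share under 6%, & the run fits within roughly two months only if all of the GPUs are kept busy for the entire period.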
This is a revolutionary milestone in the still nascent LLM industry, & there are a few obvious strategic implications of this event:
1. US semiconductor sanctions against China have decisively failed. Ever since the Trump regime’s first export ban against ZTE in April 2018 (& later vs Huawei in May 2019), the US has imposed ever stricter bans on semiconductor exports to China. These sanctions prohibit not only sales of advanced semiconductor end-products to China, but also sales of semiconductor manufacturing equipment, so as to prevent China from being able to access & build the latest semiconductors, & therefore keep China behind the US in terms of accessing the latest advances in AI. These nearly 7 years of ever stricter sanctions not only compelled Chinese enterprises to increase self-reliance across the entire semiconductor value-chain (which would be a first for any country), but also pushed them to use their limited computing power far more efficiently than their US counterparts, so as to get outsized results - as demonstrated by DeepSeek’s latest achievement. While the original DeepSeek model was trained using US-designed Nvidia H800s, it is plausible that subsequent models can use domestically produced counterparts such as Huawei’s “Ascend 910C”. While the Ascend series does not have access to the most advanced manufacturing processes at TSMC, it is a good-enough platform to run the DeepSeek R1 model at scale. In fact, DIY enthusiasts have already demonstrated that the basic open-source DeepSeek software can run on low-end computers such as the Raspberry Pi (albeit only as small distilled variants rather than the full 671-billion-parameter model), with power consumption as little as that of an ordinary smartphone.
2. Valuations of US tech giants must be revised sharply downwards. As recently as last year, it was assumed that any company wanting to build an LLM needed hundreds of millions of dollars in sophisticated hardware (that only a few companies such as Nvidia can provide), & tens of millions of GPU-hours. This meant that only the richest tech companies in the world - Google, Meta, Microsoft, etc. - could afford to build, maintain, & offer the services of an LLM. Consequently, the profits associated with LLM services would be concentrated in the hands of a few companies that would command multi-trillion dollar valuations (e.g. Nvidia). The release of DeepSeek R1 shattered this assumption. It has demonstrated that a startup with less than 10 million USD can build & train a model, using older hardware that is well behind the leading edge. Therefore, small companies can profitably offer services at pennies on the dollar, given the low financial barrier to entry. Consequently, the profits forecasted by the US tech oligopoly (& therefore its overall company valuations) must now be revised downwards significantly, with potentially perilous consequences for US financial markets.
3. The global south can now enjoy the fruits of generative AI. The most transformative impact of DeepSeek is not directly related to China or the US, but rather to the rest of the world (particularly the global south). Now that everyone in the world has access to a top-performing, open-source LLM that has relatively minimal hardware requirements, the financial & hardware barrier to entry that kept the global south out of the AI game has all but been eliminated. Moreover, no country can any longer withhold advanced AI technology from any other country, big or small, over geopolitical differences. The new bottlenecks to the application of AI are now education & imagination. That said, even education is becoming less & less of a barrier to AI, since DeepSeek users have already demonstrated the ability to develop software (including AI software) without manually writing a single line of code. DeepSeek’s free, open-source LLM will unleash the imaginative & innovative abilities of over 6 billion people in the global south.
DeepSeek’s accomplishment is undoubtedly a great boost to China in the Sino-US technology race. Its benefits go well beyond simply mitigating the impact of US semiconductor export prohibitions; its bigger potential value-add comes from 2 other sources:
1. Expanded semiconductor export opportunities. DeepSeek made it possible to run a scalable, high-performing LLM on relatively affordable but performance-constrained hardware platforms. Consequently, the available market for small-scale enterprise & government AI infrastructure with targeted use cases is greatly expanded in global south markets. As the world’s leading manufacturer of legacy semiconductors, China is in the ideal position to sell relatively low-end AI chips & backend infrastructure - or cloud-based services built on them - to developing countries that previously could not afford to deploy or use high-performance computing infrastructure for AI use cases.
2. Expanded mind share in the AI developer ecosystem. As DeepSeek becomes the LLM of choice for app developers, researchers, & enthusiasts from developed & developing countries alike, its rapid adoption will lead to faster improvements, more available services, accelerated innovation, & broader community support, making DeepSeek an even more attractive alternative for a larger number of people in the future. The fact that it is mostly open-source makes it nearly impossible for any government to restrict or prohibit the use & proliferation of these improvements, thus making it far more resistant to geopolitical upheaval.
Despite the numerous upsides for China, there are also significant uncontrollable risks that could be triggered as a result of this accomplishment. First & foremost on this author’s mind is the possibility that DeepSeek may prompt the US to loosen semiconductor export controls, upon witnessing the relative ineffectiveness of such measures. Such a move may have the detrimental effect of luring Chinese enterprises back to a state of dependency on higher-performing US technology, thus shifting revenue & R&D dollars away from local Chinese upstarts in the ICT value chain. Contrary to popular belief, the sustainability of China’s technological progress is far more vulnerable to a “friendly” US than to a “hostile” one. Another possible, perhaps inevitable, side effect is that DeepSeek’s accomplishment adds to a litany of other recent “Sputnik moments” - be it the “Great American RedNote Migration”, the test flights of two sixth-generation fighter platforms, or EAST’s recent record of sustaining a fusion plasma for over 1,000 seconds - that might galvanize the American public & elites alike to make a more coordinated, whole-of-society effort to maintain a technology lead over the PRC. Unfortunately for China, there are no practical means available to mitigate either of these risks.
In sum, the release of DeepSeek R1 marks a pivotal moment in the evolution of AI & its geopolitical ramifications. By achieving state-of-the-art performance at a fraction of the cost & time required by its competitors, DeepSeek has not only demonstrated China’s growing technological prowess but also reshaped the global AI landscape. The failure of US semiconductor sanctions to stifle Chinese innovation, the potential devaluation of US tech giants, & the democratization of AI for the global south are just the beginning of the transformative changes ushered in by this breakthrough. As DeepSeek’s open-source model proliferates, it will empower billions of people worldwide, accelerate global innovation, & challenge the existing technological & economic order. In this new era, the winners will be those who can harness the power of AI to address humanity’s greatest challenges - regardless of their geographic or economic starting point.
Yay! China's open source movement is implementing 'Community with a shared future' 👏🏻
All the hype is preventing new users from signing up on deepseek's website and app. I hope this is temporary.