Just as we were getting bored of the constant headlines about AI and its impact, DeepSeek has, it seems, come from nowhere to transform what we thought we knew about AI. It is reported to have cost less than $6 million to train; by comparison, Sam Altman has said GPT-4 cost more than $100 million to develop.

Founded in 2021 by Liang Wenfeng, a maths geek turned hedge fund manager, DeepSeek at first sight offers an AI model that looks much like more established offerings such as OpenAI’s ChatGPT or Google’s Gemini. Behind those outward appearances, however, the model’s origins and performance have some celebrating it as a breakthrough, while others suggest it can’t be true.

Muj Choudhury, CEO and co-founder of RocketPhone.ai, an AI-powered conversation processing platform, says of DeepSeek’s emergence and its broader implications for the industry, especially within Europe: “DeepSeek’s rise challenges the conventional AI development narrative. While much of the focus has been on the US-China tech rivalry, DeepSeek’s success proves that AI innovation shouldn’t solely be dictated by access to supercomputers or Silicon Valley funding.”
Innovation born of necessity
As a Chinese company, DeepSeek has very limited access to US technology, and in a processor-intensive industry like AI, many would have given up before they even started.

Wenfeng, however, appears to have relished the challenge. As well as finding loopholes to get hold of around 2,000 lower-powered Nvidia GPUs, he used old-fashioned human intelligence, hiring recent graduates and creating a culture that promoted creative problem-solving. Unable to rely on brute-force processing, the team developed the more efficient methods that enabled DeepSeek to be built so cheaply.

One part was ensuring the code was optimised. With significant investment and the latest processors in abundance, developers at Western AI companies can afford plenty of headroom in their coding. DeepSeek, with less powerful hardware, was forced to optimise its code ruthlessly, helping to ensure every second of processing time was used as efficiently as possible.

DeepSeek also takes a different architectural approach that makes the model more efficient. It uses what is known as a Mixture of Experts (MoE) structure, already found in some other models. Rather than engaging the entire network for every request, MoE compartmentalises subsets of the model behind gatekeepers, and a query uses only the subsets it needs. The same applies during training: each token trains only a small, relevant portion of the model, helping to minimise cost (a toy sketch of this routing idea appears at the end of this section).

The MoE structure might seem limiting, but in practice it reflects how people actually work: if we want to learn about a subject, history, for example, we don’t cross-reference every section of the library or the internet. The approach allows DeepSeek to boast 671 billion parameters without a correspondingly huge compute overhead, because only a fraction of them is active for any one query.

The training technique also improved efficiency. Much like teaching humans, it sought to train on verifiable information, rewarding correct answers and learning from mistakes. That meant training concentrated on tasks with definite, checkable answers, such as maths or coding, rather than on less defined ones (a sketch of such a checkable reward also follows below). This may be reflected in the AI today: some users have noted that although DeepSeek rivals others for the quality of its answers, it somehow lacks their character and polish.

Finally, DeepSeek kept costs down by using open-source software where possible. The DeepSeek model is itself available as an open-source release, unlike most of its better-known competitors, meaning those who want to test it out can do so on their own device, without even needing an internet connection.

Agur Jogi, CTO of Pipedrive, opines: “Big Tech is reacting to DeepSeek’s launch, and while the full capabilities of this new Open Source platform are still being understood, it pushes competitors to be more efficient with their emerging tech investments. For users, DeepSeek provides attractive scope for large-scale automation, with its free availability and capacity to create chatbots rivalling other models, though concerns about data protection and information freedom are an issue. It shows that there may come a time for ‘everyday’ AI and more expensive, advanced AI, coexisting for different use cases and types of users. For providers, the best competition strategy is to innovate, improving UX and functionality.”
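To make the routing idea concrete, here is a minimal toy sketch of MoE-style gating in Python. Every name and size in it, the expert count, the dimensions, the gating matrix, is invented for illustration; it shows the general technique, not DeepSeek’s actual code.

```python
# Toy sketch of Mixture-of-Experts routing. All names and sizes here
# are invented for illustration; this shows the general technique,
# not DeepSeek's actual implementation.
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, DIM = 8, 2, 16                  # toy sizes only
gate_w = rng.standard_normal((DIM, N_EXPERTS))    # the "gatekeeper"
experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]

def moe_forward(x):
    """Route input x to its top-k experts; the rest stay idle."""
    scores = x @ gate_w                       # score every expert...
    top = np.argsort(scores)[-TOP_K:]         # ...but select only k
    w = np.exp(scores[top])
    w /= w.sum()                              # softmax over the chosen few
    # Only the selected experts run, so compute (and, during training,
    # gradient work) scales with k, not with the total expert count.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

out = moe_forward(rng.standard_normal(DIM))
print(out.shape)  # (16,): produced by just 2 of the 8 experts
```

The same selectivity holds during training, which is why per-token compute can stay low even as the total parameter count grows into the hundreds of billions.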
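The appeal of training on verifiable tasks is that the reward signal can be computed mechanically. Below is a minimal sketch, assuming a simple numeric-answer maths task; the function and its scoring logic are illustrative, not DeepSeek’s training pipeline.

```python
# Illustrative "verifiable reward" check for a numeric maths task.
# A sketch of the idea only, not DeepSeek's pipeline.
def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """Reward 1.0 only when the final answer is provably correct."""
    try:
        return 1.0 if float(model_answer.strip()) == float(ground_truth) else 0.0
    except ValueError:
        return 0.0  # unparseable answers earn nothing

print(verifiable_reward("42", "42"))           # 1.0: rewarded
print(verifiable_reward("41", "42"))           # 0.0: the model learns from the miss
print(verifiable_reward("it depends", "42"))   # 0.0: fuzzy output scores nothing
```

A prompt like “write a moving poem” admits no such checker, which is one plausible reason users find the model strong on maths and code but lighter on character and polish.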
As cheap to run as to create?
AI is not a cheap technology. Early in the year, there were stories about the high costs of OpenAI’s latest models, with some reports even suggesting four-figure monthly subscriptions for access. Meanwhile, the new Trump administration has announced a $500 billion investment in data centres for AI.

DeepSeek’s efficient design, however, could make all of that old news. One experimenter claims to have run DeepSeek on a Raspberry Pi, a cheap single-board computer, achieving up to 200 tokens per second, nearly twice the speed of the free ChatGPT many will be familiar with (a simple way to measure this yourself is sketched below).

Even if real-world performance were a fraction of that, it suggests DeepSeek will need significantly less computing power, and cause less environmental harm, than existing models.
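For readers who want to sanity-check throughput claims like this on their own hardware, the measurement itself is trivial. In the sketch below, `generate` stands in for whatever local inference call you use; the stand-in function merely simulates emitting 200 tokens in one second.

```python
# Back-of-envelope throughput measurement. `generate` is a placeholder
# for your local inference call; the stand-in below only simulates one.
import time

def tokens_per_second(generate, prompt: str) -> float:
    start = time.perf_counter()
    tokens = generate(prompt)                 # returns a list of tokens
    return len(tokens) / (time.perf_counter() - start)

def fake_generate(prompt):
    time.sleep(1.0)          # pretend inference takes one second
    return ["tok"] * 200     # and emits 200 tokens in that time

print(f"{tokens_per_second(fake_generate, 'hello'):.0f} tok/s")  # ~200
```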
New rules in the AI playground
Much of the discussion around AI had settled into a consensus, with politicians and tech leaders rushing to embrace the technology. Few, however, seem to have appreciated just how quickly the tech sector can move.

It is still early, but DeepSeek’s approach may well prove transformational, opening AI development to new and smaller players. It certainly proves that the orthodoxy can be challenged, and it may motivate those with innovative ideas, but without the deep pockets of Meta, Google, or OpenAI, to test them out.

DeepSeek plans to continue developing its model. But whatever happens to the company, it has shown that innovative approaches can sometimes be more than a match for even the best-resourced competitors. 2025 has already been transformed by DeepSeek’s launch; it’s impossible to predict what AI will look like by the end of the year.
Luke Alvarez, Managing General Partner at Hiro Capital says, “What DeepSeek has shown is a particularly stark demonstration of what’s been true for a while in AI, which is that algorithmic efficiency driven by software and mathematical innovation is moving much faster than Moore’s Law or even compute cluster size in driving improvements. Broadly speaking this should be an excellent driver for UK and European AI funding – we can’t compete with the USA on brute force compute and depth of capital (except at DeepMind!) but we can compete on innovation.”