DeepSeek Causes Deep Stir in LLM World
Last year’s headlines were dominated by Nvidia and OpenAI. ChatGPT has steadily evolved from answering simple questions and writing essays to delivering reasonable answers to questions that require multi-step reasoning. Nvidia has become the latest Wall Street darling, again. Nvidia’s stock price jumped an amazing 804% across 2023 and 2024 (StatMuse 2025), throwing stock indices out of balance and attracting a new rush of investment money. Nvidia’s chips power the vast majority of the servers training AI models, including OpenAI’s, Anthropic’s, and Google’s.
Now, an AI startup out of China has caused a disturbance in Wall Street’s party. DeepSeek-R1 was released last week and caused market-wide ripple effects. Nvidia’s stock reportedly lost nearly $600 billion in value (Kopack 2025) on news of R1’s release and early capabilities. Why all the hubbub?
Who is DeepSeek?
DeepSeek is a Chinese AI startup building large language models (LLMs). The company was originally founded by mathematically inclined stock traders whose goal centered on running a hedge fund that used AI models to place stock trades. When that business didn’t take off, the group turned their math knowledge toward general AI models, specifically LLMs like ChatGPT.
Their first model made a minor splash in 2024. DeepSeek-V2 performed near ChatGPT levels at a low price. Since then, DeepSeek has continuously made improvements, eventually releasing DeepSeek-R1.
What’s So Special About DeepSeek-R1?
You might be asking yourself, “There’s already Gemini, Claude, and ChatGPT, why is DeepSeek-R1 so amazing?”
Well, there are a number of good reasons for that, but let’s focus on three.
DeepSeek-R1’s architects did something the biggest and baddest of the AI companies couldn’t do: they claim they trained their model for peanuts. Allegedly, DeepSeek spent $6 million to train the model (Kopack 2025), a fraction of the tens of millions of dollars OpenAI and others spend on people, servers, and electricity (Kopack 2025). Following the historical trend of falling computing costs, this could be a precursor to the super-democratization of model design.
DeepSeek’s model also had an ‘a-ha’ moment that wasn’t entirely expected. Reinforcement learning from human feedback (RLHF) is a technique in which human beings are involved in AI model training. Essentially, when an AI model gets something wrong, a human in the loop informs the model it’s wrong. Think of model training like learning multiplication: if you got all of your times tables right, you were rewarded with a shiny blue star. An AI model gets rewarded when it answers questions correctly. Because humans are expensive, model designers take shortcuts, such as supervised fine-tuning, where the model learns from curated example answers, and a learned reward model, a smaller model that mimics human judgment. Instead of a real-life human, this smaller model tells the model being trained that it’s incorrect.
Instead of using a learned model to supply the reward signal, DeepSeek-R1’s designers used a rule-based approach for the reinforcement learning cycle (Guo 2025). The rules let the model compute its own accuracy while learning: it is rewarded when it correctly answers questions according to the rules, which gives it the ability to ‘reason’ within them. A model can take different approaches and still arrive at a correct solution, and it can also figure out when it got something wrong. Models don’t get rewarded with blue stars, but they are given a score and are rewarded when that score surpasses a positive threshold.
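The appeal of a rule-based reward is that it’s just ordinary code, no neural network required. Here is a minimal sketch of what such a reward could look like for a math task. The `\boxed{...}` answer format, the `<think>` tags, and the specific point values are illustrative assumptions on my part, not DeepSeek’s actual rules (Guo 2025 describes theirs).

```python
import re


def accuracy_reward(completion: str, ground_truth: str) -> float:
    """Reward 1.0 if the final answer, wrapped in \\boxed{...}, matches
    the known solution (hypothetical format for this sketch)."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == ground_truth else 0.0


def format_reward(completion: str) -> float:
    """Give partial credit for showing reasoning inside <think> tags
    (again, an assumed convention for illustration)."""
    has_tags = "<think>" in completion and "</think>" in completion
    return 0.5 if has_tags else 0.0


def total_reward(completion: str, ground_truth: str) -> float:
    # No human and no learned reward model in the loop:
    # the score is computed entirely by rules.
    return accuracy_reward(completion, ground_truth) + format_reward(completion)
```

A completion like `<think>2 + 2 = 4</think> The answer is \boxed{4}` would score 1.5 against ground truth `4`, while a wrong or unformatted answer scores less; the training loop then rewards the model when its score clears a threshold.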
In DeepSeek-R1’s case, the model changed its mind mid-solution and came up with another idea that led to a correct answer (Guo 2025). This kind of training allowed the model to learn in fewer training hours. DeepSeek-V3, the base model R1 is built on, took 2.8 million GPU hours to train, compared to the 30.8 million hours Meta’s Llama 3.1 needed for similar capabilities (DeepSeek-AI 2025).
The third thing? They open sourced it. The whole thing. Not just a trained model and weights: they released the model, the weights, the code, the whole kit and caboodle. At the time of this writing, DeepSeek-R1 had more than 6,200 forks on GitHub (DeepSeek-AI 2025).
What Does This Mean?
For the big US AI companies, I’m not sure. I’ll wager a dollar that the existing restrictions on foreign technology in products sold to the US government will also apply to DeepSeek-R1. However, for smaller companies and startups using LLM technology to build products and services beyond co-pilots, the full release of DeepSeek-R1 is an amazing development. The basement hobbyist may not get their hands on 1,000 Nvidia GPUs to train a derivative model in two months, but they could test ideas on a smaller edition and spend nine months training a specialized model.
I’m hopeful DeepSeek’s innovation will continue the trend of decreasing time and cost in computing. Smaller, cheaper, and highly performing models could usher in new eras in robotics, automation, and assistive technologies, areas that still have room to benefit from LLM technology.
References:
Guo, Yang, Zhang, et al. “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.” arXiv.org, 22 Jan 2025. Accessed 30 Jan 2025. https://arxiv.org/abs/2501.12948
“DeepSeek.” Wikipedia: The Free Encyclopedia, Wikimedia Foundation, updated 31 Jan 2025. https://en.wikipedia.org/wiki/DeepSeek
DeepSeek-AI, Aixin, Bei, et al. “DeepSeek-V3 Technical Report.” GitHub repository. https://github.com/deepseek-ai/DeepSeek-V3
StatMuse. “Nvidia Stock Price in 2023 to 2024.” StatMuse Money, 2025.
Kopack, Steve. “Nvidia Loses Nearly $600 Billion in Market Value After Chinese AI Startup Bursts Onto Scene.” NBC News, 27 Jan 2025. Accessed 30 Jan 2025.