China’s DeepSeek Sparks Global AI Race


ColdFusion

Go to https://brilliant.org/coldfusion for 30-day free trial 20% off! Accusations of theft by Ope...

Video Transcript:

Shares in the advanced computer chip... One of the wor... If the Chinese company...

Hi, welcome to another episode of ColdFusion. Look at these stock market charts from the 28th of January 2025. What you're looking at is a bloodbath, a bloodbath in the US stock market of over $1 trillion. And the cause? The release of the DeepSeek R1 AI model from China. The Chinese model is as capable as the best US models, but it's free to use, open source, more efficient and, most shocking of all, it reportedly cost less than 3% of ChatGPT o1 to develop.

Just 2 years ago on this channel, we were talking about an AI arms race between companies. Today, that's evolved into an AI race between countries. In the one corner, we have the United States. They have a long history of technological dominance. But then on the other side, we have China, a country with a very different ideology and motives. In this race to dominance, it's not about weapons, but it's about developing systems that are designed to think: artificial intelligence.

This race is reminiscent of the Cold War. Some have even dubbed these events as, quote, "the Sputnik moment of AI". The White House says that they're looking into, quote, "national security implications" of China's DeepSeek AI platform. And to top it all off, OpenAI has accused DeepSeek of stealing its IP to train their own model. It's all heating up. With the United States pouring in half a trillion dollars into the Stargate AI project, the global race is on, and this ongoing battle could be one of the biggest stories in tech this year. As artificial intelligence becomes a matter of national security, the technology would be forced to move faster than it is today. What a crazy time to be alive.

But before we get ahead of ourselves, what is really going on here? How did a company from nowhere do all of this? Is this all just part of the AI hype cycle, or is this the real deal? It seems like the whole world is playing catch-up since the release, so let's try and make sense of it all. You are watching ColdFusion
TV.

Historically, when technology meets a national security threat from an ideological opponent, we get inventions like the computer and jet aircraft, from the competition of World War II for example. But this time around, the United States was completely unchallenged in the field of AI, for the most part. But that all changed on January 20th, 2025 with the release of R1. DeepSeek R1, which is free, has performance reportedly on par with OpenAI's $200-a-month model, and this is performance in the context of tasks such as language, reasoning, mathematics and coding. The free model also beats out Anthropic's Claude Sonnet and Google's Gemini. But what many people may not know is that DeepSeek does things a little bit differently to the current state-of-the-art models. It's in part why it's so efficient, but we'll cover these details later in the episode.

Because there's no competition for that level of AI performance for free, users have been flocking to it, with DeepSeek becoming number one in Apple's App Store. But here are the stats of why people's jaws are dropping. The AI was built in 2 months and reportedly cost less than $5.6 million to build. The AI company Anthropic says that $100 million to $1 billion is the general amount needed to develop an AI system from scratch, and to that end, Meta plans to spend $65 billion on AI. So creating something that performs this well with just $5.6 million is groundbreaking. But all may not be as it seems. More on that later.

...seeing much more. Um, I think there are two very important things that people need to know about what's happening with DeepSeek and the way it's being interpreted on Wall Street. The first is, it doesn't matter if it's a Chinese government psyop or not. The technological innovation of having an LLM train itself through reinforcement learning is impressive. The cost efficiency of doing inference with only [unclear] billion parameters rather than 700 billion parameters is impressive. The possibility of being able to do more model training and inferencing with less usage of power and less chips is impressive. It doesn't mean, though, that chip demand is at risk. What I think it means is you're more likely to see an acceleration of AI everywhere, all over the economy.

DeepSeek R1 being open source means that its code is freely available for whoever wants to use it and for whatever they want to use it for. Users can modify it as they please, all for free. This is totally the opposite approach of OpenAI, which is pretty ironic. This is all horrific news for US AI companies, because it means that suddenly their costs are all out of balance. DeepSeek, with its 671 billion parameters, can run locally on a stack of M4 Mac Pros. In contrast, investors and companies have poured billions of dollars into American AI servers. After the shock of this release, now it looks like US companies have been spending too much money, using too much energy and charging too much for the services that they've been providing. Maybe in the future it's not going to be so much the models that would make the most money, but the applications that run on top of them. Has this all been a massive mistake from US investors? No one knows for sure, and that's why the markets are selling off. One bright spot for US companies, though, is that users of AI systems may not feel comfortable in giving their data directly to China, especially in corporate settings.

In order to
compete, Sam Altman, CEO of the ChatGPT maker OpenAI, has announced that their o3-mini model will now be given away for free. As for Mark Zuckerberg and Meta, they're internally panicking. But it's not just the Americans. Over in China, the effect is the same. Other Chinese tech giants such as ByteDance (the maker of TikTok), Alibaba and Tencent have freaked out and had to cut the prices of their AI models to compete. And despite the low price charged by DeepSeek, it remains profitable while its rivals lose money.

Interestingly, OpenAI told the Financial Times that they have evidence that DeepSeek was using the output from ChatGPT to train its own model. In fact, last year they blocked OpenAI API accounts that they believe belong to DeepSeek, suspecting theft. The US government's official stance is that it is possible that IP theft has occurred. It should also be noted that it seems like Chinese AI developers are still managing to get their hands on top-of-the-line Nvidia graphics cards despite US sanctions. But that begs the question: who are DeepSeek, and how did DeepSeek seemingly overnight build this thing?

[Music]

For a company responsible for one of the biggest red days in the US stock market, not a lot is known about the founder and the team behind DeepSeek, but the story is interesting so far. DeepSeek founder Liang Wenfeng isn't from the typical tech world. He actually has a background in finance and co-founded a hedge fund called High-Flyer. His company used AI to predict market trends and help make investment decisions, and he was very successful at that, and his fund now manages $8 billion. But after his initial success, he wanted more. His next goal was to build, quote, "human-level AI". In 2021, he started buying thousands of Nvidia GPUs as part of his, quote, "AI side project". This was right before the Biden administration began limiting US export of AI hardware to China. Liang then spun off his AI side project into another company, and that company was DeepSeek, and the R1 is
their latest model. But honestly, the more I've been reading up on the Liang story, the more interesting it gets, so let me know in the comment section if you want to see a dedicated episode on the DeepSeek founder.

So DeepSeek R1 was trained with reinforcement learning. That means there weren't any humans who helped it learn. And the method that DeepSeek uses for their model architecture is different to most of the other players. It's a technique called mixture of experts. Sky News explains it well, quote: "Where OpenAI's latest model, GPT-4, attempts to be Einstein, Shakespeare and Picasso rolled into one, DeepSeek's is more like a university broken up into expert departments. This allows the AI to decide what kind of query it's being asked and then send it to a particular part of the digital brain to be dealt with. This lets the other parts remain switched off, saving time, energy and, most importantly, the need for computing power." The YouTube channel Computerphile explains further.

So maybe you ask it a very specific maths question. What mixture of experts will do is have trained a specific part of this network, a much smaller part, to solve that problem for you. And so you basically have the early stages will route the question to different parts of the network and then only activate a small part of it, let's say 30 billion parameters, which is a huge, huge saving. So this sort of shaded area here will activate, and then that will produce your answer. You can develop systems using agents like this, where you have one that's trained to do this and one that's trained to do this, and you just ask the right one, right? Suppose I want to train a network to write my emails for me. Maybe it's very good at that. And then I train a different network to solve a different problem, and I just ask the right one, as opposed to hoping that one model can do it. So that's much more efficient.

To add to the efficiency is a process called distillation: basically, using larger models to train smaller models in targeted
domains. The result is equivalent performance with significantly less computing power, and this was the big shock for AI developers and financial markets.

Making chain-of-thought reasoning completely open and visible was an interesting choice. OpenAI basically does the opposite.

...does is essentially write down a step-by-step process of solving the problem, and slowly solve it, and then write down the answer. You tend to get much better at solving problems that require multiple steps. If you want to just... what is, why is the sky blue? It will just regurgitate that pretty easily from text it's learned on the internet. But if you're asking, like, problem-solving skills, it's hard to do in one shot, so you kind of take a little bit of time to just, you know, to just work through it. Now, OpenAI pioneered this chain of thought, um, but they don't tell you how they do it, because it's all closed, and so it's not open AI at all, right, in some sense. So essentially you see a kind of précis, summary version of the chain of thought, but it's not the actual internal monologue, which is essentially a trade secret. What R1 is doing is it's doing a chain of thought which is similar to o1, but it's fully public. They've released all the models, they've released all the code, you can talk to it, you can see the entire monologue, and they've also trained it with massively more limited data.

So, as mentioned earlier, things may not be as they seem. That cost figure of $5.6 million to create the model may not be complete. In fact, in a paper released by DeepSeek themselves, they mentioned that that $5.6 million figure includes only the official training of DeepSeek V3 and does not include the cost of prior research and experiments on architectures, algorithms or data. That does put a question mark on all the headlines we've been seeing that this thing was built for under $6 million. But whatever the real figure is, it's likely to be much less than what US companies have been spending.

In the latest news, DeepSeek has also dropped an open image model, and at this rate a video model will probably soon follow, and it might even rival OpenAI's Sora or Google's anticipated Veo 2. In terms of search interest right now, DeepSeek now outpaces ChatGPT, and it became one of the most downloaded apps on the App Store. And then towards the end of January, things absolutely blew up and went wild. China, during Chinese New Year, went crazy. First, Alibaba comes out with Qwen 2.5 Max. It's a very capable AI that could one-shot this code animation. Just asking a computer to code an animation, and then it goes out and does it, is so intuitive that I think kids of the future will believe that this is how coding always worked. Alibaba's Qwen 2.5 Max outperforms DeepSeek and even GPT-4o in some tasks. And then there's Kimi k1.5, released around the same day. It's also a great performer, is multimodal and can browse the web in real time.

[Music]

Okay, before you all rush out to sign up to DeepSeek, please be aware of something. It collects data such as chat history, any text or audio inputs, uploaded files, keystroke patterns, basically anything you input into the model. Now, OpenAI does similar things, but the difference is that with DeepSeek, your data goes straight to servers in the People's Republic of China. So I guess the question is: do you want to be spied on by the US, or do you want to be spied on by China? I can't tell you what to do, but that's just a heads up. But in terms of privacy, there is a bright side. [Being open source] does mean that DeepSeek can run locally on a machine without an internet connection, for complete privacy. Here's the YouTube channel SomeOrdinaryGamers running it locally.

...code things for you. So for instance, I can ask DeepSeek, like, write me, uh, code for a simple login web page. So at this moment in time, it'll think. It'll be like, all right, the user is asking for code to create a simple login page. So first it's going to structure HTML, then it's going to style, then it's going to validate, and then here it is, it's actually writing me the HTML code. So we're sitting in a world where, like, I feel so scared for, like, junior coders these days, because, goddamn, AI is really coming for some of the jobs that people least expected to lose first. So again, it writes this actual login page, and of course once it's done, it'll actually also provide you a preview in this chatbox software, so you can see it for yourself before you actually throw it into, you know, production or testing or whatever. So right here, I'm just going to hit that preview button and, boom, there it is. It actually...

And as I was making this video, DeepSeek, at the start of the week, had to, quote, temporarily limit user registrations due to large-scale malicious
attacks. This was also a warning to many, as it seems like the program may not be as ready as it seems.

So what does Sam Altman think? He's only directly referenced the company once, saying: "DeepSeek's R1 is an impressive model, particularly around what they're able to deliver for the price. We will obviously deliver much better models, and it's also legit invigorating to have a new competitor. We will pull up some releases." We'll see what's around the corner for OpenAI, but the joke is AI took ChatGPT's job.

But in all seriousness, I don't think that this is over. I believe that this is just the beginning of major competition. What we're seeing here is the technological version of Thucydides's trap. Basically, it states that when a rising power challenges an existing power, conflict arises. In an interview with Waves, republished in The China Academy back in mid-2024, DeepSeek's founder Liang made his ambitions clear. He said, quote: "For years, Chinese companies have been accustomed to leveraging technological innovations developed somewhere else and monetizing them through applications, but this isn't sustainable. This time, our goal isn't quick profits but advancing the technological frontier to drive ecosystem growth. Why is Silicon Valley so innovative? Because they dare to try. When ChatGPT debuted, China lacked confidence in frontier research. From investors to major tech firms, many felt the gap was too wide and focused instead on applications. But innovation requires confidence, and young people tend to have more of it." End quote.

With such a mindset, DeepSeek may force AI innovation forward, and China could be at the forefront of the global AI race. Competitors around the world will be forced to reduce their costs and rethink how they're creating AI models. Efficiency will be the aim of the game. We don't know how it will play out, but we do know that we'll be having some rapid advancements in the coming years. If we do remain positive, we could see breakthroughs in medical science, material science,
mathematics and even theoretical physics. In the long term, we could make products for cheaper, make them longer lasting and produce them more efficiently. But on the flip side, what about nefarious uses and bad actors geopolitically? Also, what happens to all of the humans through this transition as AI rapidly improves? That's for the future to decide, and I have done a video on that topic years ago, before AI blew up, so you can check it out after this one. But as usual in all of this, let's just keep a close eye and see where this goes.

Anyway, that's about it from me, and that is where we are with DeepSeek R1: how it works so efficiently and the absolute shock that it's caused around the world. Although a lot of people may find consumer AI annoying these days, there's no getting around it. It's here to stay and improving with each week. It's going to be an important part of everyday life soon.

But how does AI work anyway? Well, now there's a fun and easy way to learn about that and many other STEM subjects with today's sponsor, Brilliant. Brilliant's course on artificial neural networks is perfect for that. I've used it to brush up on some background context when I was making AI episodes. Each lesson on Brilliant allows you to play with concepts, a method proven to be six times more effective than watching lecture videos. Plus, all content on Brilliant is crafted by teachers, researchers and professionals from MIT, Caltech, Duke, Microsoft, Google and more. Learn at your own pace, to brush up on a project for work or just for your own self-development and curiosity. To try everything Brilliant has to offer for a full 30 days, visit the URL brilliant.