Okay, so DeepSeek has been making waves, and that has led to a lot of misinformation, noise, and hype. So today we're going to bust the top 10 myths surrounding DeepSeek and go through the actual implications for everyday AI users. A massive shout-out to Ben Thompson for his DeepSeek deep dive; he wrote a very comprehensive article that helped me distill this YouTube video. And if you didn't get that second joke, this video is definitely for you. Let's get started.
Myth number one: DeepSeek built their model for just $5.6 million. That's very inaccurate, because the $5.6 million only covers the final training run and excludes critical costs like infrastructure. For example, they reportedly have 50,000 Hopper GPUs from Nvidia, worth around a billion dollars, which shows their true investment scale is much larger. The analogy I would use: saying DeepSeek built their model for $5.6 million is like saying the latest iPhone only costs $500 to manufacture. While that might be the final assembly cost, it doesn't include other cost contributors like research and development, to name just one.
Myth number two: they must have broken the rules to do this. Funnily enough, the main reason DeepSeek is getting so much attention is because of how they innovated within the constraints of the export controls. Because they had to use the less powerful H800 GPUs, they had to come up with creative ways to optimize their model architecture, and most analysts agree that had DeepSeek been able to use the more powerful H100s, they would have done things differently and potentially come up with an even more powerful model.
There is some confusion around this, so I made this very oversimplified table. The H100s are more powerful, and they were not allowed to be sold to Chinese companies. The H800s are the nerfed versions of the H100s and were allowed. And because both H100s and H800s are classified as Hopper-generation GPUs, DeepSeek having 50,000 Hopper GPUs makes sense. The analogy I would give here is like how Samsung legally uses slightly slower processors in some regions due to licensing agreements: they didn't break any rules, they just optimized their design around the constraints.
Myth number three: DeepSeek has beaten OpenAI. First, there is a huge difference between optimizing for performance and optimizing for efficiency. It's like we can spend two hours at the gym to maximize muscle gain, a.k.a. optimizing for performance, or we can spend 45 minutes to achieve 80% of those gains, a.k.a. optimizing for efficiency. So while DeepSeek's reasoning model R1 matches OpenAI's reasoning model o1 in performance, OpenAI has already demonstrated o3, which is more powerful than o1 and R1. The oversimplified answer here is that DeepSeek leads in efficiency but not overall capability.
And as a testament to how fast the AI industry is moving, OpenAI just released o3-mini and made it available to free users, and we can even use o3-mini with search; arguably this might not have happened without the pressure brought on by DeepSeek. The analogy here: if you had a lot of money, you could pay $1,000 or $2,000 for the latest flagship iPhone, right? But let's say a Chinese company comes along and produces a smartphone that gives you 90% of the flagship performance for a fraction of the cost, say $200. Not that that has ever happened. And although that $200 smartphone might be extremely popular, it would not be true to say the more cost-effective smartphone beats the iPhone in overall capability.
Myth number four: DeepSeek's models are directly comparable to all other AI models. Not really; we need to make apples-to-apples comparisons. To better explain the differences, I made this super fancy table. DeepSeek's base model is V3, and that's comparable to ChatGPT's GPT-4o. DeepSeek's reasoning model is R1, and what's crazy about that is we were actually able to use search along with the R1 reasoning model, because up until now ChatGPT's equivalent, o1, did not have that additional search function. However, as of the making of this video, ChatGPT released o3-mini, and we can use search with this new reasoning model.
Forgive me for the silly analogy here, but it's like comparing a sports car to an SUV. They're both vehicles, but you buy them for completely different reasons: you buy one of them when you're facing a midlife crisis, and you buy the other when you're trying to overcompensate for something. I'll let you decide which one's which, but the takeaway here is that we need to make apples-to-apples comparisons, not apples-to-oranges.
Myth number five: DeepSeek R1's visible chain of thought is a technical breakthrough. Just to make sure we're all on the same page: when using DeepSeek R1, we can see the chain of thought, the thinking process, whereas previously we couldn't with ChatGPT's o1. But the fact that we can see R1's reasoning process is actually just a UI choice and not a technical innovation, meaning both R1 and OpenAI's o1 have similar reasoning capabilities. The key difference is that DeepSeek chooses to show us, the users, the model's thinking process, and although this transparency has proven very popular with users, it's related to the presentation and not the capability.
It's like two chefs making the exact same dish, one in a closed kitchen and one at a demonstration counter: the process and result are the same, but watching the chef work adds to the experience.
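To make the "UI choice" point concrete, here's a minimal sketch. Open R1 checkpoints, as commonly served through local tools, tend to wrap the reasoning in <think> tags before the final answer; treat that tag format and the sample output as assumptions. The only difference between "visible" and "hidden" chain of thought is whether the app prints that part.

```python
import re

# Hypothetical raw output from an R1-style model; the <think> wrapper is an assumed format.
raw_output = "<think>The user wants 12 * 9. 12 * 9 = 108.</think>The answer is 108."

# Separate the reasoning from the final answer.
match = re.match(r"<think>(.*?)</think>(.*)", raw_output, flags=re.DOTALL)
reasoning, answer = (match.group(1), match.group(2)) if match else ("", raw_output)

SHOW_REASONING = True  # purely a presentation decision; the model's capability is identical either way
if SHOW_REASONING:
    print("Thinking:", reasoning.strip())
print("Answer:", answer.strip())
```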
Myth number six: DeepSeek built everything from scratch. I can finally explain the joke from earlier. DeepSeek allegedly used a process called model distillation, and I say allegedly because this is very hard to prove. We're not going to dive into the technical details here, but basically distillation means DeepSeek took ChatGPT's outputs and trained their models on those outputs. It's not illegal, and a lot of AI companies are doing it, but it clearly breaks OpenAI's terms of service.
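To make "took the outputs and trained on them" concrete, here's a minimal sketch of what API-based distillation looks like in general. This is not DeepSeek's actual pipeline (that hasn't been published); the teacher model name, prompts, and file name are illustrative, and the resulting file would then feed an ordinary fine-tuning run for a smaller student model.

```python
import json
from openai import OpenAI  # any "teacher" model with an API works; gpt-4o is just illustrative

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
prompts = [
    "Explain overfitting in one paragraph.",
    "Write a Python function that reverses a string.",
]

# Step 1: collect the teacher model's answers to a batch of prompts.
with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        reply = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        answer = reply.choices[0].message.content
        # Step 2: store (prompt, teacher answer) pairs as training examples;
        # fine-tuning a student model on this file is the actual distillation step.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```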
Here's a bit of trivia I found interesting: although Microsoft and OpenAI are investigating DeepSeek right now, Microsoft has in the meantime already added R1 to its cloud offerings. I just find that pretty funny. The analogy here: if a phone manufacturer wanted to replicate the exact way the iPhone processes its images, instead of copying or stealing the code, they could just analyze thousands or millions of iPhone photos to teach their own system how to match that style and quality.
Myth number seven: using DeepSeek is automatically unsafe. It really depends how you define unsafe. If you use the native DeepSeek app, either through web or mobile, your data is sent to and stored in China. If that's not cool with you, there are two workarounds. First, you can use platforms like Perplexity or Venice AI to access DeepSeek's models while keeping data in the US; pick your poison. I should also note that in the upcoming days, weeks, and months, we can expect more and more platforms to add DeepSeek's models as an option (for example, Cursor just added them today), and the reason they're all doing this is that DeepSeek's models are extremely cheap. Second, if you want to be fully private, you can run DeepSeek's models locally on your desktop or laptop through applications like Ollama or LM Studio, as in the sketch below.
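If you're curious what the fully local route looks like, here's a minimal sketch using the ollama Python package. It assumes you've installed the Ollama app, pulled a model with `ollama pull deepseek-r1:7b` (the exact tag and size are up to you and your hardware), and run `pip install ollama`.

```python
import ollama  # requires the local Ollama daemon to be running

# Nothing here leaves your machine: the prompt and the response stay local.
response = ollama.chat(
    model="deepseek-r1:7b",  # assumed tag; pick whatever size your RAM/GPU can handle
    messages=[{"role": "user", "content": "Give me three ideas for a weekend project."}],
)
print(response["message"]["content"])
```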
Myth number eight: this kills Nvidia's business. Not really. A lot of tech analysts, and the CEO of Microsoft, believe in the Jevons Paradox, which in this context says that more efficient AI like DeepSeek is likely to increase overall demand for AI solutions. Basically, as the cost of something decreases, usage increases, so in the long run there might be even more demand for Nvidia's chips; the back-of-the-envelope numbers below show how that can play out.
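Here's a toy back-of-the-envelope version of the Jevons argument; every number is made up purely to illustrate the mechanism.

```python
# Hypothetical numbers: AI gets 10x cheaper per unit, but cheapness unlocks 20x more usage.
old_price = 10.0   # dollars per million tokens (made up)
new_price = 1.0    # 10x cheaper (made up)

old_usage = 1_000    # million tokens per month (made up)
new_usage = 20_000   # usage grows because new use cases become economical (made up)

print("old total spend:", old_price * old_usage)  # 10,000
print("new total spend:", new_price * new_usage)  # 20,000 -> total demand went up, not down
```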
It's like when we got cheaper smartphones: it actually increased demand for premium phone processors like Qualcomm chips, because more people entered the market. Myth number nine: this is terrible for US tech companies. Unfortunately, or fortunately, depending on how you feel about these companies, this is actually a huge win for some of them in the long run. For example, Amazon: OpenAI has GPT, Google has Gemini, Anthropic has Claude, Meta has Llama, but Amazon has always struggled to build its own leading model. Now that doesn't matter as much, because they can serve these high-quality open-source models like DeepSeek's at lower costs for their customers.
For Apple, they can leverage their Apple Silicon chip advantages for something called edge inference; I'll explain this in a bit. Meta is arguably the biggest winner of them all, because every aspect of Meta's business, for example their advertising business, benefits from AI, and cheaper inference means they can monetize their products more effectively. For those of you who don't know what inference means: basically, it's when AI takes what it learned in the training phase and applies it to a new situation. For example, when we study, we train ourselves on textbooks and then apply what we learned to the test questions in front of us, questions we've never seen before. That's inference; the tiny sketch below shows the same train-then-apply idea in code.
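For a toy picture of the difference, here's a tiny sketch where "training" fits a simple line to example data and "inference" applies that learned line to an input the model has never seen; the numbers are made up.

```python
# Training: learn a rule (here, a straight line y = w*x + b) from known examples.
xs = [1.0, 2.0, 3.0, 4.0]   # "textbook" questions
ys = [2.1, 3.9, 6.2, 7.9]   # "textbook" answers (roughly y = 2x)

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

# Inference: apply what was learned to a brand-new input (the "exam question").
new_x = 10.0
print(f"prediction for x = {new_x}: {w * new_x + b:.2f}")
```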
And this is my favorite analogy by far: just as cheaper smartphones and cheaper, faster internet enabled companies like Uber and Instagram to exist in the first place, cheaper AI could enable new products and services, and US tech companies, for better or worse, are in the best position to capitalize on this opportunity. Myth number ten: this is China's Sputnik moment in AI.
Sputnik was a moment in history where the USSR demonstrated capabilities that the US, back then, actually did not have, and obviously the USSR kept their methods a secret. But if we look at what DeepSeek has accomplished: they published their methods openly, they achieved efficiency improvements that were already expected, and they demonstrated innovation within existing technological frameworks. So instead of comparing this moment to Sputnik, a lot of industry experts say it's much closer to Google's 2004 moment, when they publicly showed how to build more efficient infrastructure. Back then Google didn't have to do this, but they showed everyone you didn't need expensive mainframes to build a supercomputer cluster, and here DeepSeek is showing us we don't necessarily need the most powerful chips to achieve competitive results.
All right, moving on to what this actually means for all of us. Number one: we can now get access to advanced AI features without paying. Users who haven't paid for any AI tool now have access to two powerful reasoning models: DeepSeek's R1 and ChatGPT's o3-mini. And as a reminder, reasoning models are great at complex math problems, programming challenges, and step-by-step reasoning tasks.
Implication number two: if privacy is a concern, protect your data. I touched on this earlier, but if you are concerned about how DeepSeek handles your data, you have two options. Number one, instead of using the native web and mobile apps, you can use Perplexity, Venice AI, Cursor, and other platforms that will for sure start integrating DeepSeek into their offerings. Or number two, if you don't want your data stored anywhere except locally, you can run DeepSeek's models through LM Studio or Ollama. The caveat here is that you probably can't run the more powerful models, since you'll be hardware constrained.
Implication number three: make smart choices about switching. I'm going to apologize in advance for ranting a little bit here, but we should not switch or change the way we do things just because something is trending. For example, let's say you're using the Todoist app for task management, and someone tells you, "Hey, you should switch to TickTick, another to-do app, because they just announced this AI feature." No, that's not how this works. You've invested time and effort to set up Todoist in a way that you like, so unless TickTick's AI feature makes a meaningful difference to your workflow, you're better off staying put and not paying the switching tax. It's also likely that Todoist will build that AI feature down the road.
So yeah, only switch to and use DeepSeek if it provides clear advantages for your specific needs. If you're a developer and you want to minimize your costs, go for it. If you're an everyday user, you already pay for ChatGPT, and you care about where your data is stored, then no.
To be clear, this has been an incredible achievement by the DeepSeek team, and although the implications for US tech companies and the stock market can be debated, there is no question that this has been a massive win for us, the users. Case in point: I genuinely don't think OpenAI would have made o3-mini available for free had it not been for DeepSeek. That being said, I don't recommend jumping on the hype train without fully understanding the implications, because news cycles like this one will happen again and again.
I know this video is quite different from my usual videos, so let me know what you think. No fancy graphics, but hopefully the concepts were still somewhat clearly explained. See you in the next video; in the meantime, have a great one.