DeepSeek Just CRUSHED Big Tech AGAIN With JANUS PRO - New SHOCKING AI Model!


AI Revolution

DeepSeek has released Janus Pro, a groundbreaking multimodal AI model that challenges industry giant...

Video Transcript:

So DeepSeek has been grabbing headlines for a couple of reasons. First, they just released a new multimodal AI model family called Janus Pro. This model supposedly beats OpenAI's DALL-E 3 and some other big names like PixArt-Alpha and Emu3-Gen on benchmarks like GenEval and DPG-Bench. If you know anything about AI benchmarks, you know those are important for measuring how good models are at tasks like image generation and image understanding. And here's the kicker: the biggest version of Janus Pro, known as Janus Pro 7B, seems to outperform those well-known models, at

least according to DeepSeek's own internal tests. Now, just days before releasing Janus Pro, DeepSeek made headlines with their R1 language model. That's the one that got people worked up because it apparently matched o1's performance, but get this, while costing only around $5 or $6 million to develop. Compare that to the billions that big AI labs in Silicon Valley are spending. It's no wonder the entire industry started wondering: are we all overpaying for AI development? Could the next big breakthroughs be coming from smaller outfits with fresh ideas on how to train these systems?

Since DeepSeek is based in Hangzhou, China, you can imagine the political and economic angles here. There's talk about how US export controls on advanced chips, particularly from Nvidia, are meant to slow down Chinese AI progress. Yet DeepSeek claims they used Nvidia's H800 chips for training, which are technically less powerful than the super high-end chips blocked by the US, but they still achieved o1-like results. That's a huge statement that calls into question all these big, expensive strategies used by American labs. Let's pause to address a little fiasco that happened recently. DeepSeek apparently got

hit by a cyberattack right as everyone started flocking to their AI assistant app. This was around the same time it reached the top of the Apple App Store's free applications list in the United States. So you can picture this scenario: a new AI releases, quickly becomes super popular, the website crashes, then they announce a temporary limit on registrations due to a hacking attempt. It's dramatic, right? The hype is real, but clearly it's also attracting some unwanted attention. Okay, back to the Janus Pro model. DeepSeek touts it as a unified transformer architecture that can

do all sorts of things: image generation up to 768 x 768 resolution, image analysis, and text-based tasks. That's a big deal because a lot of AI models specialize in just one thing, like text generation or image generation. Janus Pro is going for that all-in-one approach. It's similar to how GPT-4 can look at images, like GPT-4 Vision, but here we've got an entirely open-source approach. DeepSeek put the model's code and weights up on Hugging Face for anyone to download right away. That's in stark contrast to companies like OpenAI that keep everything

behind closed doors and proprietary APIs. Now, how good is Janus Pro really? The model is offered in different sizes, from 1 billion parameters all the way up to 7 billion. The 7B version is the flagship and apparently the one that competes with DALL-E 3. The user community has tested it in a bunch of ways. For example, people tried analyzing images with Janus Pro to see how accurately it could describe objects, relationships between them, and any implied meanings. It did well at describing straightforward things like the position of objects or their appearance, but it kind of

fell short when deeper reasoning was required. Think about an illustration meant to convey a metaphor: Janus Pro basically gave a literal description but didn't interpret the symbolic message. GPT-4 Vision, on the other hand, was able to pick up the deeper meaning. On the image generation side, Janus Pro can produce decent images but might struggle in certain areas, like overall sharpness or artistic flair, compared to specialized state-of-the-art image models that are constantly being fine-tuned by huge communities, like Stable Diffusion's various fine-tunes. Janus Pro's advantage seems to be versatility more than absolutely top-tier visuals. One

interesting test: someone prompted Janus Pro and standard SDXL for a cute baby fox in an autumn scene, and Janus Pro did nail the baby part better, but SDXL gave a crisper, more detailed image. So it's a trade-off: Janus Pro was more faithful to the prompt, while SDXL was better at generating a polished look. A crucial point is that DeepSeek's approach to open-sourcing the model means the community can possibly fine-tune it to higher quality. We've seen that happen with other open-source models. People out there can tinker, apply specialized datasets, improve the code,

and basically push the model to new heights. DeepSeek's official space on Hugging Face is apparently not active yet, so some individuals have set up their own spaces to let others test Janus Pro 7B. Just be careful that you're actually using the 7B version and not some smaller 1.5B build mislabeled as 7B, because that's definitely going to lead to disappointment. Now let's discuss the meltdown that happened in the stock market. As soon as DeepSeek's success story became headline news, especially the part about building a GPT-4-like system at a fraction of the cost, tech stocks

took a hit. Nvidia's shares reportedly plummeted, causing a huge dip in market value, like $600 billion in a single day, which is enormous. The logic here is that if you don't need the most cutting-edge chips to train a top-tier AI model, maybe Nvidia's unstoppable growth path is not so unstoppable after all. People started questioning whether the AI investment arms race is misguided if a Chinese startup can replicate results at a tenth of the usual cost. Sam Altman, CEO of OpenAI, chimed in on social media, basically saying he's impressed by DeepSeek's

achievements but that OpenAI plans to respond with even better models. He also said that, if anything, they're going to invest more in computing resources. In other words, OpenAI is not backing down from its big-spending approach. Remember, they have partnerships with Microsoft, who's poured billions into OpenAI's ecosystem, plus they're planning massive data center expansions. Another interesting twist came from the White House. President Trump, yes, that President Trump, commented that the release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser

focused on competing to win. He talked about unleashing American tech companies to ensure the US continues to dominate the field. That's a pretty big statement, especially amid the ongoing debates about restricting chip exports to China. But then you have this Chinese company just sidestepping those restrictions by using whatever resources are still available to them. That's definitely fueling the conversation about how well these export controls are really working. DeepSeek's background is also kind of mysterious. They've only been around since 2023 and are headquartered in Hangzhou. Big Chinese players like Baidu released large language models

a while ago, but no Chinese model up until now has caught the attention of the US tech community quite like DeepSeek. Some critics worry about possible security risks. The question arises: could DeepSeek be closely tied to the Chinese government in ways that compromise user data or lead to censorship? Some folks have reported that DeepSeek's AI assistant won't answer questions about the Chinese government or President Xi Jinping. That's led to speculation about how open or free it actually is when it comes to certain topics. Nevertheless, in just a couple of weeks, DeepSeek's

AI assistant soared to the top of Apple's App Store in the US, surpassing even ChatGPT as the top-rated free application. Think about how wild that is: they basically grabbed the AI chatbot crown in a matter of days. The surge in popularity caused such high demand that it triggered serious outages on DeepSeek's site. They had to fix issues with their application programming interface and user logins, reportedly their longest downtime in about 90 days. Although sudden viral apps often experience these short-term meltdowns, it definitely shows there's real market interest. Something else that's shaking investors: the

assumption that you need billions of dollars and thousands of the absolute best Nvidia chips to train competitive AI might be wrong. At least, that's what DeepSeek is suggesting. [unintelligible] OpenAI is not the only one in the crosshairs. Anthropic, Meta, Google, Amazon: they've all allocated massive budgets for AI R&D and infrastructure. Microsoft, Meta, Alphabet, Amazon, and Oracle alone are poised to spend around $310 billion in 2025, part of which is for AI data centers. Meanwhile, OpenAI teased

an eye-popping plan to spend up to $500 billion to build out a global network of data centers. But if DeepSeek can do it for under $10 million, maybe these spending sprees are overkill. Now, let's be fair: some people doubt DeepSeek's numbers. The company said they only spent about $5.6 million on training their V3 model, but that's just the final training pass. That might not reflect all the prior experiments and data curation that went into it. Still, even if the total cost was two, three, or five times that, it might still be significantly

lower than what we hear from American tech giants. And that's a big question: how is that even possible? DeepSeek cites new training techniques, such as methods that let the model focus only on the most relevant sections of data at a given time, thereby saving a lot of computing resources. They also say they used open-source projects from Alibaba and Meta as a springboard, fine-tuning them to create their final product. Not everyone is thrilled that they essentially piggybacked on open-source frameworks from the West, but that's how open source works: once the code is out there,

any capable group can adapt it. Within Meta, there's apparently some frustration. People are like, "We have thousands of the world's best researchers and a ton of money, so how did we get beat to a big breakthrough by a smaller company with fewer resources?" Mark Zuckerberg has been a proponent of open source, releasing models like Llama, and ironically, Llama might have helped DeepSeek get to where they are faster. We also have folks like UC Berkeley's Stuart Russell saying this arms race to reach artificial general intelligence is more dangerous than, say, the Space Race because we're

pushing towards potentially superintelligent systems we don't fully control. Even some AI company CEOs themselves have hinted at existential risks, so it's definitely raising eyebrows that a relatively unknown player is accelerating the timeline. At the end of the day, DeepSeek is now a name everyone's talking about. They're giving us Janus Pro for multimodal tasks (image generation, image analysis, and text-based conversation), plus their R1 model competing with GPT-4 on reasoning. It's all open source, so who knows what improvements the community will add. Meanwhile, the bigger question is: will this force Big Tech to change

course and focus on more efficient, cost-effective techniques? OpenAI's Sam Altman basically said, "We'll release better stuff soon, don't worry," while also doubling down on the importance of enormous computing resources. Meta is hinting that you still need all that muscle if you're going to serve billions of people worldwide, so maybe it's not just about training the model but also about deploying it at a massive scale. But one thing's for sure: the genie is out of the bottle. DeepSeek's sudden rise is a reality check for all the billions poured into AI by West Coast

giants. Maybe smaller, agile teams can keep pace if they're clever with their methods. And it's not just hype: this has real impact on stock prices, investment trends, and even how governments think about export controls. If a Chinese lab can produce GPT-level models without top-of-the-line chips, it changes the entire conversation around AI dominance. So that's the overview. Pretty wild times in the AI world, right? We've got this fresh open-source approach from DeepSeek that's apparently setting new efficiency standards. We have the giant US tech companies recalibrating, or at least taking notice. We have the

stock market reacting, governments stepping in, and the open-source community possibly uniting around these new models. I'm curious what you all think: is DeepSeek's success sustainable, or is this just a flash in the pan? Will we see more small teams out-innovating the big players with a fraction of the budgets, or will the OpenAIs and Metas of the world always have the upper hand with their colossal resources? Anyway, that's it for this deep dive. If you found it informative, be sure to give it a like and subscribe for more AI updates. Let me

know in the comments which angle of the story intrigues you the most. Thanks for hanging out, and I'll catch you in the next one.