DeepSeek Is About to SHOCK THE WORLD With R2 That’s 40X More Efficient Than OpenAI's AI

AI Revolution

DeepSeek is accelerating the release of its R2 AI model, which is reportedly 40 times more efficient...

Video Transcript:

DeepSeek is racing to drop its next AI model ahead of schedule, undercutting OpenAI by up to 40 times in cost and shaking up the entire industry. Meanwhile, Alibaba's new video AI is already outperforming OpenAI's Sora, and Western governments are starting to panic. And as OpenAI rolls out new research tools and voice features, one big question looms: just how persuasive should AI really be?

All right, first off, DeepSeek has been making headlines in a major way. If you remember, they launched their R1 model in January, and it basically caught everyone off guard. R1 was touted as a powerful AI reasoning model, and it was trained at a fraction of the cost that bigger companies like OpenAI reportedly invest in their own models. Some people, including Google, were skeptical of those claims. Google even called DeepSeek's statements exaggerated, and OpenAI suspected that DeepSeek might have used distillation from ChatGPT's infrastructure. But either way, the model was out there, and big names like Microsoft added R1 to Azure AI Foundry and GitHub, and Amazon Web Services also featured it in their model catalog.

Now, the big story is that DeepSeek wants to release their successor, R2, earlier than they initially planned. They had said R2 might launch in early May, but new reports suggest they're trying to get it out even sooner. So, sadly, we don't have an exact date, but the rumor is that it could be anytime before May. The upcoming R2 is supposed to have improved coding capabilities and be able to reason more effectively in languages beyond English. That's a huge deal, because if you think about it, a lot of advanced language models tend to revolve around English, so having robust multilingual support could position DeepSeek as a serious global contender.

Why is DeepSeek pushing this timeline? Well, GPT-4.5 is still weeks away, and GPT-5 might not arrive for months, so if R2 hits the market soon, DeepSeek could once again shake things up in the AI ecosystem. And they've already proven that they can undercut OpenAI by a wide margin in terms of pricing. According to analysts at Bernstein, DeepSeek's pricing can be 20 to 40 times cheaper than what OpenAI charges for comparable performance. That cost-saving aspect has drawn in not just small enterprises but also major players wanting to integrate R1 into their offerings.

Now, to really understand what makes DeepSeek tick, you need to know a bit about its founder, Liang Wenfeng. He's been described as super introverted and low-key, but he became a billionaire thanks to his quantitative hedge fund, High-Flyer. The vibe is that he runs DeepSeek more like a research lab than a classic for-profit startup. He even pays employees top-tier salaries: some senior data scientists earn 1.5 million yuan annually, when rival quant funds usually cap around 800,000 yuan. He's also known for having a flatter corporate structure, which is pretty different from the typical Chinese tech giant model of 9:00 a.m. to 9:00 p.m., 6 days a week. Instead, people report working normal 8-hour days in a pretty collaborative, hands-on environment.

High-Flyer, the hedge fund behind all this, actually poured a ton of money into AI research way before R1 made headlines. They spent around 1.5 billion yuan on two supercomputing AI clusters in 2020 and 2021. One of these clusters, Fire-Flyer 2, consists of about 10,000 Nvidia A100 chips. This was before the US banned the export of those chips to China, so by the time that ban rolled around, High-Flyer was already set. That gave them a big advantage.

The key to DeepSeek's cost-effectiveness is its use of techniques like mixture of experts (MoE) and multi-head latent attention (MLA). In MoE, you basically divide the model into specialized expert components, so it doesn't have to tap the entire model for every single query. Meanwhile, MLA lets the model attend to different parts of an input simultaneously while compressing what it keeps in its attention cache, so it picks out the most important details more efficiently. As a result, DeepSeek claims it can achieve performance on par with bigger, more expensive models without breaking the bank.

Chinese authorities, interestingly, are fully behind DeepSeek. We're seeing municipal governments, energy companies, and big corporations like Lenovo, BYD, and Tencent all integrate DeepSeek into their products. The government is even telling DeepSeek to keep a low profile in international media. Meanwhile, some governments abroad, like South Korea and Italy, have restricted or removed DeepSeek-based apps over privacy concerns. There are also broader fears that advanced AI models could be used for social engineering or misinformation campaigns, so it's not surprising that scrutiny is intensifying in certain regions.

But it's not just DeepSeek making moves. Alibaba recently announced their open-source video foundation model, Wan 2.1, which is reportedly outperforming OpenAI's Sora on certain benchmarks. Alibaba's new offering includes multiple submodels optimized for text-to-video, image-to-video, video editing, text-to-image, and even video-to-audio. They have Wan2.1-I2V-14B and Wan2.1-T2V-14B, which can both generate videos at 480p and 720p, plus a smaller T2V-1.3B model that can run on consumer-grade GPUs like an RTX 4090. According to Alibaba, Wan 2.1 can handle complicated motions and realistic physics simulations, and it's posted some excellent metrics on the VBench leaderboard. Part of the secret sauce is a novel 3D causal VAE architecture with a feature cache mechanism for speed, plus a flow matching framework within a diffusion transformer structure. In short, they threw a lot of advanced techniques into the pipeline, training on about 1.