So, Microsoft is making some big moves with its AI strategy: reshuffling its internal structure and releasing an impressive new open-source model on Hugging Face. We'll break down the key announcements from Satya Nadella and the team, dive into the new CoreAI Platform and Tools division, and cover the news about Microsoft's Phi-4 model going fully open source. There's a lot to unpack, so let's get into it.

First, let's talk about the reorganization. If you've followed Microsoft for a while, you know they've been pivoting toward AI for the past few years, but in early 2025,
Satya Nadella, who's the chairman and CEO of Microsoft, spelled it out bluntly: we're entering the next innings of this AI platform shift, and everything's about to move at lightning speed, like 30 years of tech change condensed into just three. So he's basically saying that in 2025, every single layer of the application stack is going to be impacted by AI. It'll be akin to someone inventing a new graphical user interface plus new internet servers plus cloud-native databases, all rolled out almost at once.

So, to keep up with this paradigm shift, Microsoft's forming a brand
new engineering group called CoreAI Platform and Tools. This group blends together several crucial parts of the company: their Developer Division, which handles Visual Studio, GitHub, and so on; the AI Platform team (think Azure AI Foundry); and certain teams from the Office of the CTO, including folks working on AI supercomputer projects, AI agentic runtimes, and something called Engineering Thrive. Bringing all these teams under one roof means Microsoft can build an end-to-end Copilot and AI stack that both Microsoft itself and external customers can use. Essentially, we're talking about the creation of agentic apps: apps
with memory, entitlements, and what they call action space, all powered by big AI models.

Jay Parikh, formerly an engineering chief over at Meta, is the new executive vice president in charge of this CoreAI Platform and Tools division. You might recall that Microsoft hired him back in October, and this is his first big splash since joining. He'll have people like Eric Boyd, head of AI platform; Jason Taylor, deputy CTO of AI infrastructure; Julia Liuson, the leader for Microsoft's Developer Division; and Tim Bozarth, leading developer infrastructure, all reporting to him. This shift really signals that the entire
developer division, which used to revolve around traditional coding tools, is now pivoting firmly into AI. In fact, part of their mission is to power GitHub Copilot, which is already a big product that helps developers with AI-assisted coding. By merging GitHub Copilot's team with the main AI platform teams, Microsoft wants a tighter feedback loop: essentially, if something's working in Copilot, they can funnel that insight back into the AI stack for other applications. Satya Nadella also emphasized that Azure is going to become the infrastructure for AI, and then everything else, like Azure AI Foundry, GitHub, and VS
Code, will be built on top of that. Microsoft's broader vision is an AI-first app stack that changes how every SaaS application is developed. We're seeing new runtimes for agent orchestration, reimagined management layers, and next-gen user interfaces that revolve around AI and chat interaction. Developer productivity is going to look totally different, and that's why they've chosen to unify all these divisions under one big AI umbrella.

Now, shifting gears a bit: Microsoft also made waves by open-sourcing a powerful language model called Phi-4. It first appeared last month, but back then it was only
available through Microsoft's Azure AI Foundry platform. Now, however, they've dropped the entire model, weights and all, onto Hugging Face under the MIT license. That's a really big deal if you follow open-source AI, because most large-scale models out there either aren't fully open or come with restrictive licenses. Phi-4, on the other hand, is fully out there, and it's apparently smashing a bunch of benchmarks, even outperforming some bigger-name models in areas like advanced math and coding.

So what's special about Phi-4? For starters, it's a 14-billion-parameter model, much smaller than those massive 70B
or 100B giants, that was trained on 9.8 trillion tokens of both curated and synthetic data. Microsoft reports top-tier results on tests like MATH, MGSM, HumanEval, and other advanced reasoning challenges. They even say Phi-4 sometimes beats GPT-4o, a variant of GPT-4, on graduate-level STEM tasks. One factor that likely helps is Phi-4's specialized math training to handle step-by-step problems more reliably. They also expanded its context window to 16,000 tokens, so it can handle longer text inputs like extensive code files or research articles.

Phi-4 is a dense decoder-only transformer, but Microsoft put tons
of effort into cleaning and generating its training data. They used synthetic data generation plus curated organic data, and finished it off with advanced post-training methods: supervised fine-tuning (SFT), direct preference optimization (DPO), and something called pivotal token search. Essentially, they identify the specific tokens that make or break a solution, then teach the model to pick the right ones at those critical moments.

Another major concern is hallucination, when the model fabricates facts. To tackle this, Microsoft built in a refusal mechanism: if the model isn't sure about an answer, it'll hold back rather than guess. They tested
this on a dataset called SimpleQA (think obscure trivia), and the new approach leads to more "I'm not sure" responses instead of random misinformation.

Comparisons to other big-name models (think Google, Meta, Anthropic) are inevitable. Yes, 14B parameters is smaller, but Microsoft says Phi-4 can hold its own or even outperform larger alternatives in certain niches like coding and higher-level math. The size difference also means less overhead, which is great if you don't have massive GPU farms. And because Phi-4 is fully open-sourced under MIT, you can adapt or fine-tune it without license worries. In their technical
report, Microsoft notes it excelled on fresh math competitions like the November 2024 AMC, to confirm it's not just memorizing old data. They also tested coding tasks such as HumanEval+. If you're a developer, a fast, relatively compact model that still knocks out tough problems is pretty appealing. Still, it's not perfect: Phi-4 can stumble on extremely detailed instructions or less common facts. Microsoft also acknowledges the persistent risks of bias or harmful outputs, even after repeated red-teaming tests. No generative AI system is foolproof, after all.

This release ties into Microsoft's broader AI push.
Under the new CoreAI Platform and Tools group led by Jay Parikh, Microsoft is unifying its developer tools and AI infrastructure. They're aiming for an end-to-end Copilot and AI stack, and Phi-4 is part of that strategy: smaller-but-mighty models that are fully open. Satya Nadella wants Microsoft's internal boundaries to vanish in the eyes of users, so everything is integrated under a "One Microsoft" banner. The idea is to make agentic AI apps as routine as opening Visual Studio once was.

So that's the story: a big reorg at Microsoft, all about AI, plus this major
open-source release of Phi-4, a 14B-parameter model you can now freely explore on Hugging Face. If you do check it out, just be mindful of its limitations, especially for high-stakes use. But hey, that caution applies to any AI tool these days.

What do you think? Will combining dev and AI teams at Microsoft supercharge innovation? Is open-sourcing top-tier models the best path for AI? Let me know your thoughts, and if you enjoyed this overview, feel free to share it with anyone who loves AI. Thanks for watching, and I'll see you
in the next one.