AGI (Gets Close), Humans: ‘Who Gets to Own It?’

AI Explained
A 'frontier reasoning model' from just 1000 examples (s1). A $100B Musk bid for power. Gemini 2, Ran...
Video Transcript:
The world may be waking up to the fact that intelligence will be automated sooner than anyone could have imagined a few years ago, but it is still sleeping when it comes to who gets the spoils. Just today the Vice President of America said that AI will never replace workers and will only boost productivity. Then again, Sam Altman, CEO of OpenAI, wrote just yesterday that he could see labor losing its power to capital, and RAND, the famous think tank, put out a paper just the other day saying the world isn't ready for the "job losses and societal unrest" it thinks might accompany a more general artificial intelligence.

But even if labor does lose, capital can't decide who gets the money: just today Musk and co. challenged Sam Altman and Microsoft for control of OpenAI itself. And of course there are always papers like this one from Stanford, suggesting that the reasoning enhancements needed to bring a model to frontier capability are achievable for just $20, which makes me think you guys can afford AGI after all. Meanwhile Dario Amodei, CEO of Anthropic, makers of Claude, says that time is running out to control the AGI itself. I just think that when the day inevitably comes that we must confront the full automation of intelligence, I hope we are a little more unified, let's say, than we are now.

There is too much to cover, as always, so let's cut it down to the seven most interesting developments, using the Altman essay as the jumping-off point for each. First off, he gives his fifth, or maybe fifteenth, different definition of AGI, but this time it's that we mean it to be a system that can tackle increasingly complex
problems at human level in many fields. Well, under that definition we are getting awfully close. Take coding, where we heard in December that the o3 model was the 175th highest-ranked coder on Codeforces by Elo. Now, that might not mean much to many people, but just yesterday in Japan, Altman said they now have, internally, the 50th highest-scoring competitor. We're clearly well beyond imitation learning: these systems (o1, o3, o4) are not copying those top 50 competitors; they are trying things out themselves and teaching themselves, through reinforcement learning, what works. We are not capped at the human level, and that applies to way more than just code.

I've been using Deep Research from OpenAI on the Pro tier this week to at least suggest diagnoses for a relative, and a doctor I know said it found things she wouldn't have thought of. Of course it does hallucinate fairly frequently, but it also thinks of things you might not have thought of. And remember, this is o3 searching maybe 20 sources; what about o5 searching 500? You might say: well, knowing stuff is cool, but white-collar workers actually take actions on their computers. Well, Karina Nguyen from OpenAI had this to say on tasks: "They're saturating all the benchmarks, and post-training itself is not hitting the wall. Basically, we went from the raw data of pre-training to an infinite amount of tasks that you can teach the model in the post-training world via reinforcement learning: any task, for example how to search the web, how to use the computer, how to write well, all sorts of skills that you're trying to teach the model. That's why we're saying there's no data wall or whatever, because there will be an infinite amount of tasks, and that's how the model becomes extremely superintelligent. We are actually getting saturated on all the benchmarks, so I think the bottleneck is actually in evaluations."

And there's a reason I can believe that, even though their current Operator system (only available on Pro for $200 a month) is quite janky: it's because tasks like buying something online or filling out a spreadsheet are mostly verifiable. And whenever you hear "verifiable" or "checkable", think: ready to be absolutely eaten by reinforcement learning, just like domains such as code, where you can see the impact of enhanced reinforcement learning from o1-preview to o3.
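To make that "verifiable means ready for reinforcement learning" point concrete, here is a minimal toy sketch; the task, the names, and the update rule are all my own illustration, nothing from the video. Because the reward can be checked programmatically, the loop needs no human labels: it just reinforces whatever the verifier accepts.

```python
import random

# Toy "policy": a distribution over candidate answers to the question 6*7.
# In real post-training this would be a language model; here it's a table.
policy = {"41": 1.0, "42": 1.0, "43": 1.0}  # unnormalized preferences

def verifier(answer: str) -> float:
    """Verifiable reward: correctness can be checked programmatically."""
    return 1.0 if answer == "42" else 0.0

def sample(rng: random.Random) -> str:
    answers, weights = zip(*policy.items())
    return rng.choices(answers, weights=weights)[0]

rng = random.Random(0)
for _ in range(200):
    a = sample(rng)
    # Reinforce answers the verifier accepts; wrong answers get no update.
    policy[a] += 0.5 * verifier(a)

best = max(policy, key=policy.get)
```

The point of the sketch is the shape of the loop, not the update rule: as long as the checker is reliable, the signal scales with compute, not with labeled data.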
Next is the investment that must go in to make all of this happen, and Sam Altman had this to say later in the essay: "the scaling laws that predict intelligence improvements have been accurate over many orders of magnitude; give or take, the intelligence of an AI model roughly equals the log of the resources used to train and run it". So think of that as 10x-ing the resources you put in to get one incremental step forward in intelligence. That doesn't sound super impressive until you read the third point, and I agree with this point: the socioeconomic value of linearly increasing intelligence is super-exponential. In short, if someone could somehow double the intelligence of o3, it wouldn't be worth merely four times more to me, and I think to many people it would be worth way, way more than that. It would be super-exponential. He goes on: "a consequence of this is that we see no reason for the exponentially increasing investment to stop in the near future". In other words: if AI will always pay you back tenfold for what you invest in it, why ever stop investing?
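A rough formalization of those two claims, with symbols of my choosing rather than the essay's: write I for intelligence, R for resources, and V for socioeconomic value.

```latex
I \approx k \log_{10} R
\quad\Longrightarrow\quad
\text{each step } \Delta I = k \text{ costs } R \to 10R,
\qquad
V(I) = e^{f(I)} \text{ with } f(I)/I \to \infty \;\; (\text{super-exponential}).
```

So the cost of each increment is a constant multiple of resources, while the value per increment keeps accelerating, which is exactly the essay's argument for never stopping investment.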
Many forget this, but less than two years ago Altman himself said that his grand idea is that OpenAI will capture much of the world's wealth through the creation of AGI and then redistribute it to the people: "we're talking figures like not just $100 billion, but $1 trillion, or even $100 trillion". That's coming from him. He only adds that if AGI does create all that wealth, he's not sure how the company will redistribute it. To give you a sense of scale: as you head towards $100 trillion, you're talking about the scale of the entire labor force of the planet.

And that, of course, brings us to others who don't want him to have that control, or maybe want that control for themselves. As you may have heard, Elon Musk has bid almost $100 billion for OpenAI, or at least it's a bid for the nonprofit which currently controls OpenAI. To save you reading half a dozen reports: essentially, it looks like Sam Altman and OpenAI have valued that nonprofit's stake in OpenAI at around $40 billion, which leaves plenty of equity left over for Microsoft and OpenAI itself, including its employees. However, if Musk and others have valued that stake at $100 billion, then it might be very difficult in court for Altman and co. to say it's worth only $40 billion. So even if they reject Musk's offer, as it seems they have done, it forces them to potentially dilute the stake owned by Microsoft and the employees. Altman said to the employees at OpenAI that these are "just tactics to try and weaken us, because we're making great progress". The nonprofit behind OpenAI could also reject the offer because it thinks that AGI wouldn't be safe in the hands of Musk.

At this point I just can't resist a quick plug for a mini-documentary I released on my Patreon just yesterday. It covers the origin stories of DeepMind, OpenAI, and Anthropic, the tussle with Musk, and how the founding vision of each of those AGI labs went awry. This time, by the way, I used a professional video editor, and the early reviews seem to be good. All the shenanigans that are
going on with the nonprofit at OpenAI seem worthy of an entire video on their own, so for now I'm going to move on to the next point. Altman predicted that with the advent of AGI, the price of many goods will eventually fall dramatically. It seems like one way to assuage people who lose their jobs or see their wages drop: well, at least your TV is cheaper. But he did say the price of luxury goods and land may rise even more dramatically. Now, I don't know what you think, but I live in London, and the price of land is already pretty dramatic, so who knows what it will be after AGI. Just on that luxury-goods point, I think Altman might have one particular luxury good in mind: yesterday in London, Altman was asked about their hardware device, designed in part by Jony Ive, formerly of Apple, and he said "it's incredible, it really is, I'm proud of it", and it's just a year away. Yes, by the way, I did apply to be at that event, but you had to have certain org IDs, which I didn't.

One thing that might not be a luxury is smaller language models. In leaked audio from that same event he apparently said: "well, one idea would be we put out o3 and then open-source o3-mini; we put out o4 and open-source o4-mini". He added: "this is not a decision, but directionally you could imagine us saying this". Take all of that for what it is worth.

The next jumping-off point comes in the first sentence of the essay, actually, which is that the mission of OpenAI is to ensure that AGI benefits all of humanity. Not that they make AGI: that they make an AGI that benefits all of humanity. Now, originally, when they were founded, which I covered in the documentary, the charter was that they make AGI that benefits all of humanity "unencumbered by the need for a financial return". That last bit's gone, but we still have "benefits all of humanity"; not most of humanity, by the way, all of humanity. I really don't know how they are going to achieve that when they themselves admit that the vast majority of human labor might soon become redundant. Even if they somehow got a benevolent policy implemented in the US to make sure that everyone was looked after, how could you ensure that for other nations? After watching Yoshua Bengio, one of the godfathers of AI (I'll show you the clip in a second), I did have this thought: it seems to me that if a nation got to AGI or superintelligence one month, three months, six months before another one, it's not most likely that they would use that advantage to just wipe out other nations; I think more likely they would wipe out the economies of other nations.
The US might automate the economy of, say, China, or China the US, and then take that wealth and distribute it amongst its own people. And Yoshua Bengio thinks that might even apply at the level of companies: "I can see, from the declarations that are made, and from what, you know, logically these people would do, that the people who control these systems, like say OpenAI potentially, they're not going to continue just selling access to their AI. They're going to give access to a lower-grade AI, they're going to keep the really powerful ones for themselves, and they're going to build companies to compete with the non-AI systems that exist, and they're going to basically wipe out the economies of all the other countries which don't have these superintelligent systems. So, you know, you wrote that it's not existential, but I think it is existential for countries who don't build up to this kind of level of AI. And it's an emergency, because it's going to take at least several years, even with a coalition of the willing, to bridge that."

And just very quickly, since he mentioned competitor companies, I can't help but mention Gemini 2 Pro and Flash, new models from Google DeepMind. There's also, of course, Gemini Thinking, which replicates the kind of reasoning traces of, say, o3-mini or DeepSeek R1. Now, straight off, the benchmark results of these models are decent but not stratospheric; for the most part we're not talking o3 or DeepSeek R1 levels. On SimpleBench we're rate-limited, but it seems like the scores of both the thinking mode and Gemini 2 Pro will gravitate around the same level as "Gemini Experimental 1206". But I will say this, and I know it's kind of niche: Gemini is amazing at quickly reading vast amounts of PDFs and other files. No, its transcription accuracy for audio (and I've tested it) isn't going to be at the level of, say, AssemblyAI; and no, its coding is no o3, and its "deep research" button is no Deep Research. But the Gemini series are great at extracting text from files, and
they are incredibly cheap, so I'm quite impressed. And I do suspect, as ChatGPT just recently overtook Twitter to become the sixth most visited site and slowly starts closing in on Google, that Google will invest more and more to ensure that Gemini 3 is state-of-the-art.

Next, Altman wrote about a likely path that he sees of "AI being used by authoritarian governments to control their population through mass surveillance and loss of autonomy". That remark brings me to the RAND paper, which for some reason I read in full, because they're worried not just by mass surveillance by authoritarian dictatorships but by other threats to national security: "wonder weapons"; "systemic shifts in power" (we kind of talked about that earlier, with, say, China automating the economy of the US); "non-experts empowered to develop weapons of mass destruction"; "artificial entities with agency" (think o6 kind of coming alive); and "instability". This is RAND, again, which has been around for over 75 years and is not known for dramatic statements. Again, though, I would ask: if the US does mount a "large national effort" to ensure it obtains a decisive AI-enabled wonder weapon before China, say three months before, six months before, then what are you really going to use it for? To then disable the tech sector of China? For me the real admission comes towards the end of the paper, where they say the US "is not well positioned to realize the ambitious economic benefits of AGI without widespread unemployment and accompanying societal unrest". And I still remember the days, just around two years ago, when Altman used to say in interviews stuff like: if AGI produces the kind of inequality he thinks it will, people won't take it anymore.

Let's now, though, get to some signs that AGI might not even be controlled by countries, or even companies. For less than $50 worth of compute time (not counting research time, of course; apparently it was around $20, affordable for all of you guys), Stanford produced s1. Now, yes, of course they did utilize an open-weight base model, Qwen 2.5 32B Instruct, but the headline is: with just 1,000 questions' worth of data, they could bring that tiny model to being competitive with o1. This is in science (GPQA) and competition-level mathematics. The key methodology was: whenever the model wanted to stop, they forced it to continue by adding "wait", literally the token "wait", multiple times to the model's generation when it tried to end. Imagine you're sitting an exam, and every time you think you've come to an answer and are ready to write it down, a voice in your head says "wait". That's kind of what happened, until the student, or you, had taken a set amount of time on the problem. Appropriately enough, this is called test-time scaling: scaling up the number of tokens spent to answer each question.
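That forcing loop can be sketched in a few lines. Here `generate` is a stub that always tries to stop, purely so the control flow runs end to end; the real method suppresses an end-of-thinking token in an actual LLM's sampler, and all names here are my own illustration.

```python
# Toy sketch of s1-style "budget forcing": when the model tries to end its
# reasoning, strip the end marker and append "Wait," to force more thought.

def generate(text: str) -> str:
    """Stand-in for one model call: appends reasoning, then tries to stop."""
    return text + " ...reasoning... </think>"

def budget_force(question: str, wait_rounds: int = 2) -> str:
    trace = question
    for _ in range(wait_rounds):
        trace = generate(trace)
        if trace.endswith("</think>"):
            # The model tried to end its thinking: remove the end marker
            # and append "Wait," so it reflects further instead.
            trace = trace[: -len("</think>")] + "Wait,"
    return generate(trace)  # final pass is allowed to finish

result = budget_force("Q: is 0.999... equal to 1?")
```

The whole intervention lives in the sampler: the model's weights are untouched; only its stopping behavior is overridden a set number of times.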
I've reviewed the questions in the MATH-500 benchmark, by the way, and they are tough; at least the hard ones, the level-five ones, are. So to get 95% on that is impressive, and likewise, of course, to get beyond 60% on GPQA Diamond, which roughly matches the level of PhDs in those domains. To recap: this is an off-the-shelf open-weights model trained with just a thousand questions and reasoning traces.

There were some famed professors on this Stanford team, and their goal, by the way, was to replicate this chart on the right, which came in September from OpenAI. Now, we kind of already knew that the more pre-training you do, and the more post-training with reinforcement learning you do, the better the performance will be. But what about the time taken to actually answer questions, test-time compute? That's the chart they wanted to replicate. Going back to the s1 paper, they say that despite the large number of o1 replication attempts, none have openly replicated a clear test-time scaling behavior; and look how they have done so.

I'm going to simplify their approach a little, because it's the finding I'm more interested in, but essentially they sourced 59,000 tough questions: physics olympiads, astronomy, competition-level mathematics, and AGIEval (I remember covering that paper almost two years ago on this channel). They got Gemini Thinking, the one that outputs thinking tokens like DeepSeek R1 does, to generate reasoning traces and answers for each of those 59,000 examples. Now, they could have just trained on all of those examples, but that did not offer substantial gains over picking just a thousand of them. Just a thousand examples in, say, your domain, to get a small model to be a true reasoner; then, of course, you get it to think for a while with that "wait" trick.

How did they filter down from 59,000 examples to 1,000? First, decontaminate: you don't want any questions that you're then going to use to test the model, of course. Remove examples that rely on images not found in the question, for example, and other formatting stuff. But more interestingly: difficulty and diversity (the kind of diversity that even JD Vance would get behind). On difficulty, they got smaller models to try those questions, and if those smaller models got a question right, they excluded it: it must be too easy. On diversity, they wanted to cover as many topics as possible from mathematics and science; they ended up with around 20 questions from each of 50 different domains. They then fine-tuned that base model on those thousand examples, with the reasoning traces from Gemini. And if you're wondering about DeepSeek R1: that was fine-tuned on 800,000 examples; you can see that in the chart on the right. Again, though, it wasn't just about fine-tuning.
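That selection recipe can be sketched roughly as follows. The field names, function names, and structure here are my own guesses for illustration, not the paper's actual code; the three filters and the per-domain cap are the part taken from the description above.

```python
import random

def select_examples(pool, test_questions, solved_by_small_model,
                    per_domain=20, seed=0):
    """Sketch of an s1-style curation pipeline.
    pool: list of dicts with 'question', 'domain', 'needs_image' keys."""
    rng = random.Random(seed)
    kept = [
        ex for ex in pool
        if ex["question"] not in test_questions          # decontamination
        and not ex["needs_image"]                        # formatting filter
        and ex["question"] not in solved_by_small_model  # too easy: drop
    ]
    # Diversity: cap how many examples any one domain contributes.
    by_domain = {}
    for ex in kept:
        by_domain.setdefault(ex["domain"], []).append(ex)
    selected = []
    for domain in sorted(by_domain):
        exs = by_domain[domain]
        rng.shuffle(exs)
        selected.extend(exs[:per_domain])
    return selected
```

The difficulty filter is the clever bit: using a weaker model as a cheap "too easy" detector means the final thousand examples are exactly the ones that require real reasoning.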
Each time the model would try to stop, they said "wait", sometimes two, four, or six times, to keep boosting performance. Basically, it forces the model to check its own output and see if it can improve it. Notice that "wait" is fairly neutral: you're not telling the model it's wrong, you're saying, wait, maybe we need to check that. They also tried scaling up majority voting, or self-consistency, and it didn't quite have the same slope. Suffice to say, though, if anyone watching is in any confusion: getting these kinds of scores on GPQA (Google-Proof Question and Answer) and competition-level mathematics is insane, incredibly impressive.
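For reference, the majority-voting (self-consistency) baseline they compared against can be sketched like this; the toy sampler is mine, purely so the sketch runs without a real model.

```python
from collections import Counter

def self_consistency(sample_answer, question, k=5):
    """Majority voting: draw k independent answers from a stochastic
    model and return the most common one. `sample_answer` stands in
    for one model call; here it's a toy, not a real API."""
    answers = [sample_answer(question) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]

# Toy sampler that is right 3 times out of 5 (deterministic for the demo).
_samples = iter(["42", "41", "42", "43", "42"])
majority = self_consistency(lambda q: next(_samples), "6*7?", k=5)
```

Both approaches spend more test-time compute per question; the difference is that voting spends it on parallel independent attempts, while budget forcing spends it on one longer, self-correcting chain of thought.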
Of course, if you took this same model and tested it in a different domain, it would likely perform relatively poorly. Also, a side note: when they say "open data", they mean those thousand examples that they fine-tune the base model on. The actual base model doesn't have open data, so it's not truly open data, in that we don't know everything that went into the base model: everything that Qwen 2.5 32B was trained on. Interestingly, they would have gone further, but the context window of the underlying language model constrained them, and Karpathy, in his excellent ChatGPT video this week, talked about how it's an open research question how to suitably extend the context window at the frontier. It's a three-and-a-half-hour video, but it's a definite recommend from me. Actually, speaking of Karpathy, his reaction to this very paper was: "cute idea; reminds me of the 'let's think step by step' trick". That's where you told the model to think step by step, so it spent more tokens to reason before giving you an answer; here, by saying "wait", we're forcing the model to think for longer. Both lean, he said, on the language prior to steer the thoughts.

And speaking of spending your time well by watching a Karpathy video, I would argue you can spend your money pretty well by researching which, say, charity to give to through GiveWell. They are the sponsors of this video, but I've actually been using them for, I think, 13 years. They have an incredibly rigorous methodology, backed by 60,000-plus hours of research each year, on which charities save the most lives. Essentially, the one I've gone for, for actually all of those 13 years, is the Against Malaria Foundation, which I think started in the UK. Anyway, do check out GiveWell; the links are in the description, and you can even put in where you first heard of them, so obviously you could put, say, "AI Explained".

But alas, we are drawing to the end, so I've got one more point from the Altman essay that I wanted to get to. In previous essays he's talked about the value of labor going to zero; now he just talks about the balance of power between capital and labor getting "messed up", but interestingly he adds that this "may require early intervention". Now, OpenAI have funded studies into UBI, with, let's say, mixed results, so it's interesting that he doesn't specifically advocate for universal basic income; he just talks about early intervention, then talks about compute budgets and being open to strange-sounding ideas. But I would say: if AGI is coming in two to five years, then that "early intervention" would have to happen, say, now. I must confess, though, at this stage, that I feel like we desperately need preparation for what's coming, but it's quite hard to say specifically what I'm advocating the preparation be.

Then we get renewed calls, just today, from the CEO of Anthropic, Dario Amodei, about how AI will become "a country of geniuses in a data center", possibly by 2026 or 2027, and almost certainly no later than 2030. He said that governments are not doing enough to hold the big AI labs to account and measure risks, and that at the next international summit (there was one just this week) "we should not repeat this missed opportunity; these issues should be at the top of the agenda. The advance of AI presents major new global challenges. We must move faster and with greater clarity to confront them."

I mean, I'm convinced, and I think many of you are, that change is coming very rapidly, and sooner than the vast majority of people on the planet think. The question for me, that I'll have to reflect on, is: well, what are we going to do about it? Let me know what you think in the comments. But above all, thank you so much for watching to the end, and have a wonderful day.