Gödel's Theorem to Gödel AI: The Blueprint for Self-Learning Machines

Discover AI
Gödel's Mind: How AI Agents Emerged from a Logical Paradox. The Gödel Agent, a new AI research paper...
Video Transcript:
hello Community, today we're going to talk about a beautiful idea that Kurt Gödel had in the year 1931, when he wrote about formal systems, and now, almost 100 years later, we will use some of the ideas of Kurt Gödel and build a Gödel AI agent, which is a self-referential and recursively self-improving AI agent. Now this is beautiful. So here we have him, Kurt Gödel, in 1926, imagine, 100 years ago, and now we are implementing his ideas in artificial intelligence. So we go from his two incompleteness theorems to Gödel's machine and then to Gödel's agent. Let's start. As I told you, if you would like to read about Kurt Gödel and his work on incompleteness: it revealed an unsettling truth about formal systems, namely that any sufficiently complex system contains truths that are unprovable within its own rules. But there was another layer of intrigue, because Gödel's theorem formalized, or utilized if you want, the concept of self-reference in mathematical systems, where a system can refer to itself in ways that create paradoxical situations or reveal the limitations of standard logical analysis. Now if you want to read his original publication, unfortunately, as you see here, it was published on the 1st of December 1931 in the Monatshefte für Mathematik und Physik, and you have to pay 40 bucks to get access, because it is behind a paywall by Springer. But if you're interested, I found this: the Stanford Encyclopedia of Philosophy entry on Gödel's incompleteness theorems, in a revised form of 2020, really nice; if you're starting out, this is a good place to start. But now let's talk about the further development, let's talk about Gödel's machine.
Now, decades after Gödel, we have a scientist named Jürgen Schmidhuber, and Jürgen asked himself: hey, what if a machine, like a formal system, could refer to itself, could modify its own code? Could it improve not just by external programming by humans, but by rewriting its own logic, step by step? Could it become truly autonomous in its learning? And so this was the idea of Gödel's machine. It happened in 2003, and if you want to read the original publication: "Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements". If you read this paper, you see it is about building a theoretical machine that could learn how to learn. So we're not at all talking about superhuman intelligence here; we are just asking: can we build a system that can learn by itself, and how fast can that system learn? So we start very simple: imagine a simple machine. At first its task is to solve problems, like navigating a maze or performing some calculations, so it starts with an initial program, a set of instructions for how to approach this task. But this machine has a unique feature: it contains a proof searcher. This is a mechanism that allows it to look at its own code and ask itself, if you want, a profound question: what if I changed this part of myself, would it make me better, would it improve my own performance? And this is without any interaction by humans. Now the main idea of Gödel's machine was that the machine follows a rule: it will only modify itself if it can prove, with absolute mathematical certainty, that a modification will improve its ability to perform its task, meaning it achieves its goal faster, cheaper, whatever. So the complete machine behavior, its complete thinking, its way of analyzing, is grounded in mathematical logic, and only when this Gödel machine finds a valid mathematical proof that a self-modification will lead to a better outcome does it start, theoretically, to rewrite its own program. And this was the state of the technology in about 2003. This self-referencing capability, inspired by Gödel's original two incompleteness theorems, is what made the Gödel machine of 2003 so revolutionary as a theoretical construct. It was no longer a machine that just solved problems; it was a machine that learned how to improve itself, recursively, step by step, rewriting its own program whenever it could prove that doing so was beneficial. The Gödel machine had no master and no central intelligence other than pure mathematical logic. However, there was a major critical fact that limited the Gödel machine in 2003: the need for a formal logical mathematical proof. So this was really close to Gödel's original idea from 1931.
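To make this proof-gated loop concrete, here is a minimal, purely illustrative Python sketch. The function names (`propose_rewrite`, `proof_searcher`) are my own placeholders and not Schmidhuber's formalism, and the proof searcher is deliberately left as a stub, because that systematic search over formal proofs is exactly the part that is intractable in practice:

```python
# Minimal illustrative sketch (my assumption, not Schmidhuber's formalism):
# the Goedel machine rewrites itself ONLY when a formal proof guarantees
# that the rewrite improves expected utility.

def propose_rewrite(program: str) -> str:
    """Hypothetical enumerator of candidate self-modifications."""
    return program + "\n# candidate tweak"

def proof_searcher(old: str, new: str):
    """Placeholder for the proof searcher: in the real construction this
    systematically searches for a formal proof that `new` yields higher
    expected utility than `old`. Intractable in realistic environments,
    so this stub never finds one."""
    return None

def goedel_machine(initial_program: str, steps: int = 10) -> str:
    program = initial_program
    for _ in range(steps):
        candidate = propose_rewrite(program)
        proof = proof_searcher(program, candidate)
        if proof is not None:   # self-modify only with a proof in hand
            program = candidate
    return program              # without proofs, the program never changes

print(goedel_machine("def solve(maze): ..."))
```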
But you know what, we are now in 2024, and now we use both ideas and build a Gödel AI agent, because we now have a technology that was not available at the time. So when this absolutely strict formal mathematical logic meets AI knowledge, we can build something. AI researchers began to imagine an agent that combined the rigor of Gödel's machine with the creative power of modern AI models, LLMs, vision language models: a system capable of recursive self-improvement, but no longer limited by formal mathematical proofs. So we relax this condition that we have to have a mathematical proof; we just use the knowledge and the reasoning skills of LLMs to guide the self-modification. Imagine a machine that can not only understand human language but can write its own code and modify its own behavior dynamically. Understand human language: any LLM. Write its own code: code LLMs. Modify its behavior dynamically: now this is the new idea. So the Gödel agent begins with an initial program, a starting point, much like the original Gödel machine, but it now has an additional tool: it uses an LLM to generate new strategies and to gain logical insights; it can rewrite its own logic, its conditions, its code, and thereby improve its problem-solving ability. You see, a simple idea. So at the heart, if you want, of Gödel's AI agent is now, very simply, what we know: an LLM, and the LLM allows the agent to think abstractly, to reason about complex tasks, and even reflect on its own improvements. So you see, we weaken the condition that we have to have a pure abstract mathematical proof that there is absolutely an improvement in the performance of this machine, and now we say: hey, we have AI, we have LLMs, let the LLMs do the thinking and the optimization for this, all by themselves.
Easy steps: the agent begins with a basic approach to solve a problem, let's say we optimize a mathematical function or navigate a complex environment. We have a feedback loop, because the machine, this new agent, must interact with its environment to learn. So it interacts with its environment and receives feedback based on how well it performs, and it uses this feedback to evaluate the performance of the agent, and therefore the performance of its own code base. Then we have the self-modification (oops, the S was missing on the slide): the agent realizes that its current strategy could be improved, so it queries the LLM, asking for suggestions on how to modify its own code. And if we have a code LLM, or whatever, Cursor, then you know such an AI system can generate beautiful, optimized code. So this coder LLM generates a better solution, a more powerful solution, a new piece of code that can replace the current piece of code. So what we have is a dynamic code update: this AI agent tests its own code in its environment, and if the new code performs better, it is incorporated into the agent's behavior; if not, the agent tries again, continuously refining its approach. You see: learning how to learn, by trial and error, in interaction with the environment. A minimal sketch of this loop is shown below.
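Here is a hedged, minimal sketch of that trial-and-error loop; `llm_propose_policy` is a hypothetical stand-in for a real LLM call, and none of these names come from the paper:

```python
# Sketch of a Goedel-agent style loop (my names, not the paper's code):
# the formal proof requirement is replaced by empirical feedback.

import random

def evaluate(policy, tasks):
    """Utility: average score of the current policy over environment tasks."""
    return sum(policy(t) for t in tasks) / len(tasks)

def llm_propose_policy(current_policy):
    """Hypothetical stand-in for an LLM call that reads the agent's code
    plus feedback and proposes a rewritten policy."""
    tweak = random.uniform(-0.1, 0.2)               # toy 'code rewrite'
    return lambda t, p=current_policy: p(t) + tweak

def goedel_agent(initial_policy, tasks, iterations=50):
    policy = initial_policy
    best = evaluate(policy, tasks)
    for _ in range(iterations):
        candidate = llm_propose_policy(policy)      # ask the 'LLM' for a rewrite
        score = evaluate(candidate, tasks)          # test it in the environment
        if score > best:                            # keep it only if it helps;
            policy, best = candidate, score         # otherwise discard it
    return policy, best

random.seed(42)
print(goedel_agent(lambda t: 0.0, tasks=[1, 2, 3])[1])  # utility climbs above 0
```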
Now there's an interesting idea: we can simply go beyond human-coded limits. The Gödel agent represents a kind of new idea, because no longer are AI systems limited to the boundaries set by human code; these AI agents can now create their own path, guided only by the knowledge encoded in the LLM, whatever the system learned from the complete internet, and then they rely on the feedback from their interaction with the real world. Give them enough time, give them enough iterations, and theoretically such a system could achieve a level of intelligence and adaptability that could surpass anything we previously thought possible for a machine. They would no longer just solve the problems set before them; they would autonomously, maybe, identify new problems to solve and new ways to improve themselves. And you see, the Gödel agent is not just another AI agent; it is a kind of self-evolving system, a system that grows, learns, and learns how to improve itself beyond the limits of its initial coding, and this is what makes it so interesting. So we have a new publication: "Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement", October 6th, from the University of California Santa Barbara, Peking University, and the University of Arizona. A beautiful study, please have a look at it; they introduce the Gödel agent and give you some beautiful insights. They also show you the algorithm: this is the pseudo code for the recursive self-improvement of the Gödel agent.
Now, I got some questions last time when I showed you pseudo code, saying: I don't get it, what can I do to understand it? Well, easy, look: I just take this image of the pseudo code, put it into GPT-4 Omni, and say, explain the code to me. And those are the screenshots from my ChatGPT: the system explains what happens in line one, line two, line three, line four, line five, line six. You see, there you get all the information about how this system works. But we already did the explanation together, so if you want, I just leave this for you to read in detail, no problem at all. But you know, help yourself, use AI to understand this pseudo-code visualization of the complete inner workings of a system. Great, now let's do something together. What is the most complex idea in the Gödel machine concept? It is, if you think about it, the reliance on formal proof-based self-improvement.
This means, as I told you, the Gödel machine only modifies itself if it can mathematically prove that a modification will result in an improvement with respect to a predefined utility function. And this utility function is so important, because here you define what improvement is, how you measure it, and in which direction it moves your dynamic system's development. The Gödel machine, as I told you, uses a proof searcher to find valid mathematical proofs. However, in the complex real world, finding such mathematical proofs can be computationally intractable: things like optimizing strategies or behavior in a dynamic environment do not easily lend themselves to formal proofs, because the space of potential modifications is so vast, so huge, that the impact of a modification may be hard to predict purely mathematically. And of course, not all beneficial modifications can be proven within a formal mathematical system; if you have a heuristic-based improvement that works well empirically, it may be very hard to prove formally. So this limits the Gödel machine's flexibility, as it can only make changes that can be formally verified; the Gödel machine is extremely slow and computationally extremely expensive. A tiny example of what such a predefined utility function could look like is sketched below.
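Since everything hinges on the utility function, here is a tiny, invented example of what such a predefined utility could look like for a maze-solving task; the scoring rule is purely illustrative:

```python
# Hypothetical utility function for a maze task (invented for illustration):
# it defines what "improvement" means and in which direction the system moves.

def utility(steps_taken: int, reached_goal: bool, max_steps: int = 1000) -> float:
    """Higher is better: solving the maze dominates, fewer steps break ties."""
    if not reached_goal:
        return 0.0
    return 1.0 + (max_steps - steps_taken) / max_steps  # reward speed

# A self-modification counts as an improvement only if it raises this score.
print(utility(steps_taken=120, reached_goal=True))   # 1.88
print(utility(steps_taken=400, reached_goal=True))   # 1.6
```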
But hey, how was this solved in the Gödel agent? Easy: we are now 20 years after the Gödel machine, we have AI, we have the simulated intelligence of AI machines, so we replace the formal mathematical proof with empirical feedback. This simply means that instead of relying on strict mathematical proof, the Gödel agent leverages empirical feedback from its environment. This is a fundamental shift; it is no longer mathematically precise. Instead of proving that a modification will improve its utility, the Gödel agent makes changes and tests them in practice, in the real world, using a simple trial-and-feedback loop, and adapts if successful. This means the agent evaluates modifications in real time against real-world performance improvements, rather than waiting for a formal proof like in the Gödel machine; therefore it is much more agile and responsive in dynamic, uncertain environments. So you see, if we feed a complex AI agent or multi-agent system into this machinery, my goodness, it gets so much more interesting. So what do we do? Easy: large language models as our cognitive engines, as always. In place of a formal proof, the Gödel agent uses LLMs to generate potential modifications and solutions. The LLM helps the agent reason about complex problems, suggest improvements, and rewrite its own code dynamically.
So you see, if you have, let's say, Cursor, or whatever you have from Microsoft, just to write or rewrite code: this system is doing more or less the same, but on a dynamic, iterative level, and it doesn't need any human interaction anymore, because it is triggered only by its own interaction with the environment. Now, the LLMs provide a probabilistic reasoning framework that, while not strictly mathematical, is based on patterns learned from the vast data set of the internet. This allows the agent to generate informed modifications, test them out, and adapt based on the outcome from the feedback loop. These feedback loops are so important: the Gödel agent adopts a recursive improvement loop, where it modifies its behavior based on feedback from the environment. It continuously evaluates its performance, and if a modification improves its utility function, it is retained; if not, the agent reverts or tries a different strategy. We do not have a human in the loop anymore. This continuous iterative process is much faster and much more scalable than a proof-based mathematical approach, because it relies on real-time feedback from the system rather than exhaustive, purely mathematical proofs. You see where we're going with this. Another beautiful feature to think about: we now have a particular form of error handling, because we have rollbacks in the system. One of the practical key innovations in the Gödel agent is its ability to roll back, or undo, unsuccessful modifications. This simply ensures that the agent can experiment with different strategies, try them out, without the risk of permanently harmful changes to its own architecture, its own weight structure, its own weight tensors.
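The rollback idea can be captured with a simple snapshot-and-restore pattern; this is my illustration of the concept, not the paper's actual implementation:

```python
# Sketch of the rollback safety net (my illustration of the concept):
# snapshot the agent state before a self-modification, restore on regression.

import copy

class SelfModifyingAgent:
    def __init__(self, code: dict, score: float):
        self.code = code            # stands in for the agent's own logic
        self.score = score

    def try_modification(self, modify, evaluate):
        snapshot = copy.deepcopy(self.code)   # save state before the change
        modify(self.code)                     # apply the self-modification
        new_score = evaluate(self.code)
        if new_score > self.score:
            self.score = new_score            # keep the beneficial change
        else:
            self.code = snapshot              # rollback: undo a harmful change

agent = SelfModifyingAgent({"strategy": "hill_climbing"}, score=0.6)
agent.try_modification(
    modify=lambda code: code.update(strategy="simulated_annealing"),
    evaluate=lambda code: 0.9 if code["strategy"] == "simulated_annealing" else 0.6,
)
print(agent.code, agent.score)   # kept: {'strategy': 'simulated_annealing'} 0.9
```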
This nice error handling mechanism allows the agent to recover from suboptimal self-modifications, something that the original Gödel machine did not account for. Okay, this was a little bit of the theory; now let's do a practical example. What do you think about this: I asked ChatGPT-4 Omni, so you can follow along if you want, to optimize a complex nonlinear function. Let's say we have an initial setup, this is our function, and the Gödel agent is now tasked with finding, in the simplest possible case, as a simple demo, the maximum of this function.
Step one, initial agent behavior: at the outset, the Gödel agent may start with a simple method like random sampling or hill climbing, the easiest methodology you can think of. So it randomly chooses points, evaluates the function at those points, and then makes small changes to look for an improvement. After a few iterations, the agent might get stuck in a local maximum, because its method, hill climbing, cannot easily escape such traps on the manifold; a minimal sketch of exactly this failure is shown below.
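Here is a small runnable sketch of that first strategy on an invented multimodal function: started near the wrong bump, hill climbing settles on the local maximum and never finds the global one.

```python
# Toy hill climbing on an invented multimodal function: it gets stuck
# on whichever local maximum is nearest to the starting point.

import math

def f(x):
    # global maximum near x = 3, smaller local maximum near x = -2 (invented)
    return math.exp(-(x - 3) ** 2) + 0.5 * math.exp(-(x + 2) ** 2)

def hill_climb(x, step=0.1, iters=1000):
    for _ in range(iters):
        best = max([x - step, x, x + step], key=f)   # look one step each way
        if best == x:                                # no neighbor is better:
            return x                                 # stuck (maybe only locally)
        x = best
    return x

print(round(hill_climb(-4.0), 2))  # ~ -2.0: trapped in the local maximum
print(round(hill_climb(1.5), 2))   # ~ 3.0: happens to reach the global maximum
```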
At this particular point, the Gödel agent would now recognize that its performance, measured by the utility function (how close it is to the global maximum), is not optimal; it's suboptimal. And now step two: recursive self-improvement, a self-modification of the code and of the methodology that the agent uses. The agent analyzes its behavior, self-inspection: it inspects its current code and sees that it is using a hill climbing algorithm, which might not be suitable for this problem because it simply gets stuck in local maxima. So after the self-inspection it starts with a self-modification: the agent dynamically modifies its optimization strategy, and since the agent has this, let's say, code LLM at its core, it simply generates a new optimization algorithm, like when you ask, I don't know, o1-preview, or you ask your Cursor: come up with a new optimization algorithm for this particular job, like optimizing this function. And now we have a new optimization technique, like simulated annealing or genetic algorithms, which are better suited for escaping our local maximum. Therefore the system now rewrites its logic as follows: we switch from hill climbing to simulated annealing, which allows probabilistic downhill moves to escape the local maxima in favor of exploring a larger solution space. All of this is already available to us, even in ChatGPT-4 Omni: it does know how to code, and it has a variety of different methodologies for escaping local traps. So we use this artificial, simulated intelligence to come up with new solutions, to rewrite the agent's own code base. And then step three, new strategy implementation: the Gödel agent runs the newly generated simulated annealing algorithm, and yes, you know how this works; a generic sketch follows below.
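On the same invented function as before, a generic textbook simulated annealing (not code from the paper) shows how probabilistic downhill moves let the search escape the local maximum:

```python
# Generic simulated annealing (textbook version, not from the paper):
# occasionally accept worse points so the search can escape local maxima.

import math, random

def f(x):  # same invented multimodal function as before
    return math.exp(-(x - 3) ** 2) + 0.5 * math.exp(-(x + 2) ** 2)

def simulated_annealing(x, temp=2.0, cooling=0.995, iters=5000):
    best = x
    for _ in range(iters):
        candidate = x + random.gauss(0, 0.5)
        delta = f(candidate) - f(x)
        # accept improvements always; accept downhill moves with a
        # probability that shrinks as the temperature cools
        if delta > 0 or random.random() < math.exp(delta / temp):
            x = candidate
        if f(x) > f(best):
            best = x
        temp *= cooling
    return best

random.seed(0)
print(round(simulated_annealing(-4.0), 2))  # typically ~3.0: escapes the trap
```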
cycles of optimization the G agent could perform additional analysis it might now decide that combining multiple strategy let's say a hybrid form of a simulated and kneeling in a gradient descent methodology as it tried out all those methods beforehand he had a utility function that gives it the performance number back and it says hey if I try here hybrid solution maybe I get even better results and it simply tries it out the system itself without any human interaction so goodle agent generates and modify its own code once again to introduce this hybrid approach so you
see we have now a recursive Improvement it continues goes on until goodle agent decided here reaches an optimal solution and decides hey this is here the performance that I'm looking for so outcome flexibility goodle agent start with a simple method and recognize it when it's suboptimal it modifies its own approach dynamically without external Guidance just on the internal knowledge the parametric knowledge of a code based llm shifting to a more effective strategy whenever it fails hybrid methodology or whatever and it can continuously evolve iterate over increasingly refined optimization techniques so this gives us a very
So this gives us a very simple illustration of the core principle of Gödel's AI agent: recursive self-improvement. It recognizes, it realizes, its own limitations; it has a function to check its performance, seeing that it is suboptimal; it dynamically modifies its logic based on the parametric knowledge of our LLM; it searches the full design space of potential solutions. This is nothing else than asking a code LLM internally: hey, do you have other solutions for me? You learned from the complete internet, the complete GitHub repositories; show me other potential solutions for my particular problem. And then, having multiple solutions to choose from, it converges toward an optimal approach. Great, so we now have a first idea; let's do a summary, and the summary starts with the performance benchmark data. From the original study, I show you the agent names: we have chain of thought, self-refine, quality-diversity, role assignment, meta agent search (not the company, but a kind of super agent search), and then we have the Gödel agent. You see, the baselines are at about 60%, F1 score, just F1, and with the Gödel agent we reach close to 90%. There is something about a self-optimizing, self-referencing, self-improving AI system that looks really, really interesting: a system that can override its own code base and optimize itself based on its interaction with the real world. Now, five points I want to show you. This Gödel agent framework explicitly aims to eliminate any human priors, allowing our Gödel agents to self-modify and explore the full space of possible designs. We do rely on the knowledge and on the power of code-based LLMs, but the Gödel agent overcomes the constraints imposed by our fixed, human-encoded routines, because we use AI.
So this approach could theoretically lead to breakthroughs where solutions are entirely novel, not bound to the existing domain knowledge, and free from the limitations of human bias, because a human might say: hey, I always do it this way, and I never saw any other alternative path. Monkey patching: yeah, the Gödel agent is shown to change its behavior and its own code base without restarting or rebooting the system, and this is great if you're thinking about robotics. Whenever we have a robot interacting with the real world, performing a task, and it notices that, to, I don't know, open the fridge, it has a wrong algorithm, it just asks its internal parametric LLM: hey, provide me with a better methodology to open the door of the fridge. And if it can do this without restarting the complete system, it really comes closer to an autonomous system that makes real-time decisions that are really helpful. You get immediate feedback from the environment, and you have an interplay of coding, of mathematics, of causal reasoning, maybe even scientific reasoning; a minimal illustration of monkey patching follows below.
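In Python, monkey patching simply means rebinding a method on a live class or object at runtime; the robot/fridge scenario below is invented for illustration:

```python
# Monkey patching: swap out a method on a running object without any
# restart. The robot/fridge scenario is invented for illustration.

class Robot:
    def open_fridge(self):
        return "pull handle at wrong angle"   # flawed initial strategy

robot = Robot()
print(robot.open_fridge())                    # old behavior, system running

def improved_open_fridge(self):
    """New strategy, e.g. generated by the agent's internal LLM."""
    return "press release button, then pull handle"

# Rebind the method on the live class: no reboot, behavior changes in place.
Robot.open_fridge = improved_open_fridge
print(robot.open_fridge())                    # new behavior, same process
```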
And the authors of the study show that this methodology consistently outperforms manually written, human-designed agents across all the different tasks. What is so beautiful is the recursive self-improvement, as we just had a look at it: inspired by the Gödel machine and based on Gödel's incompleteness insights, this new Gödel agent is a practical realization of a system that can iteratively improve itself without any predefined limits; it just needs time. Imagine you have a petri dish and you do, I don't know, a biochemical experiment, and you leave this petri dish alone for, I don't know, 1,000 years, but you know that whatever is in this petri dish has a kind of recursive self-improvement, an evolutionary code imprinted in its system, in its genes, or in its code base. This recursive nature allows the agent to progressively enhance its ability to optimize, and the deeper implication is that over time such an agent could reach levels of capability that surpass current models that are constrained and constructed by human design and human thinking. Yeah, there's maybe a way to some new paradigms here in agent design, and especially in multi-agent interaction. So rather than being a static AI agent, this Gödel agent can design better versions of itself over time, learning strategies for solving problems without human intervention. It can therefore accelerate the development of more autonomous systems, complete systems with multiple agents, and it provides a new way of thinking about agents: they do not rely on fixed learning algorithms that we as humans program into the system; instead, the systems can now invent and refine their own code base.
There's one point you have to be careful about, and this is so important: the choice of the initial policy, the initial starting point of the agent's reasoning and behavior. This is one of the most important factors here. If you have this autonomous development, a well-chosen initial policy can speed up the convergence toward an optimal solution of the system, while if you are unlucky and you start from a poor starting policy, this may lead to slower learning, or maybe you never find the optimal solution at all. So although the Gödel agent is highly flexible, the initial condition is so important, because it will significantly influence the performance of the complete system, especially in the early phases of optimization. And if you do not want to spend too much money, if you don't want to let the system run for days and days and days: the more intelligently you choose your starting point, the closer it already is to optimal performance for your specific task, and the better the system will optimize itself.
So here we go: at its core, the Gödel agent is a self-referential system, meaning it can inspect, analyze, and modify its own code. And this property of self-reference, think about Gödel's incompleteness here, is so important: it is key to its recursive self-improvement mechanism, where the agent inspects its current state and strategy, modifies and rewrites its own code, including the logic responsible for the self-modification itself, and reapplies this process recursively, continually updating and optimizing itself. It is like a child that grows up into a man or a woman, except this is now done extremely accelerated in these new AI systems of the future. So if you take only one sentence from the whole video, remember: a Gödel agent is a self-referential system with recursive self-improvement. Great. Now, all of this was just the introduction, because here is the real publication that I want to show you; this is about the core of the system, a recursively self-improving code generation, as researched by Stanford University, Microsoft Research, and OpenAI. But this particular paper we will look at in my next video, because it would be too much for the beginning. The idea is again simple: why do we have a human in the loop at all when we do coding AI, for what do we need the human? And they come more or less to the conclusion that those systems can have an autonomous development, without a human in the loop. And how do we build those systems? Yeah, this is for my next video. So, all of this is modeled on Gödel's incompleteness theory, to further indicate that any sufficiently complex formal system contains statements that can self-reference, and, by analogy, systems that can modify themselves. If you studied mathematics, you know this is a massive simplification that I'm making here for this video.
But we cannot go for hours and hours into the mathematics of incompleteness, so please allow me to make this simplification to show you the idea behind all of this. Translated to an agent: it can improve not only its decisions but also the very algorithms it uses to make those decisions. Great. And now to a really dangerous wording: the Gödel agent achieves some form of self-awareness, if you want to call it that, by inspecting its runtime memory. This particular agent has access to and analyzes its own variables, its own code functions, and its own classes, which make up both the environment and its own internal structure, its own architecture, its own reasoning structure. It introspects, retrieves its current code, and is able to modify its own code dynamically during execution; a minimal illustration of such introspection is sketched below.
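In Python, this kind of runtime self-inspection is directly available through the standard `inspect` module; here is a minimal sketch of the retrieve-your-own-source step (the self-modification step itself is omitted):

```python
# Minimal sketch of runtime self-inspection with Python's standard
# `inspect` module: a function retrieving its own source code.
# (Run this from a file; getsource needs source on disk, not a bare REPL.)

import inspect

def introspect_me():
    """Return the source code of this very function at runtime."""
    return inspect.getsource(introspect_me)

print(introspect_me())   # the agent 'looks in the mirror' at its own code
```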
Now, some researchers may claim that this is a kind of theoretical self-awareness of a system. Well, I'm not sure I would go with the term self-awareness, but let's face it: if in the morning I look in the mirror and see myself, I think: my goodness, I have to shave, I have to wash my face. So you know, I have to optimize myself, and I am kind of self-aware of myself. Now imagine it is this AI system that looks into my mirror: it does not see me, but it can see its own code base; it has access to its own variables, to its own functions that define how the system is able to think, to analyze, to compute data, to do logical reasoning. If it has the power to change its internal structure, its internal code base, are we then also able to say that this AI has a kind of self-awareness? I don't know yet, but I think it's an interesting thought, so if you have some ideas, hey, why not leave a comment. So there we have it: we started in 1931 with a pure mathematical proof by Gödel, went almost 100 years into the future to its implementation in today's AI agent and multi-agent environments, and we built something like a Gödel agent that is self-referential and has recursive self-improvement functionality. I hope you enjoyed it, I hope there was a little bit of new information, and I hope there were some thoughts where you said: hmm, that sounds interesting, let's have a look at the next video. It would be great to see you in my next video.