Autogen Full Beginner Course

Tyler AI
Welcome to a Full AutoGen Beginner Course! Whether you know everything there is to AI agents or are a ...
Video Transcript:
hey, and welcome to my AutoGen course. AutoGen is a multi-agent framework, and if you have no idea what I just said, that's okay, because I explain everything from the beginning. But if you do know, or you've worked with AI agents or another multi-agent framework before, then maybe some of this will be a refresher, and hopefully I can teach you something new. We have a lot of topics to cover, a lot of code, and a few projects as well. A couple of the projects I ask you to do, so you can pause, try to complete the project, and then when you unpause I'll show you what I did. By the time you're done with this course, you'll know what AutoGen is and be able to code your own multi-agent workflow. We've got a lot to do, so let's get started.

Well, what is AutoGen? It's an open-source multi-agent framework that has almost 25,000 stars on GitHub, averages 40,000 downloads a week, and was recently cited in Andrew Ng's slides at Sequoia Capital's AI Ascent. It offers many possibilities, even more than I can list here, such as function calling, group agent chats, teaching your agents, multimodal integration, and more. But first, if this is your first time hearing about a multi-agent framework, or you just need a recap, let's quickly look at something you probably do know: ChatGPT. If you went to ChatGPT, you would ask it to write something. Here I'm just asking it to write a Python function that reverses a string; pretty simple, pretty straightforward. This is what's called my prompt. It then gives me a response, or what's called a completion, with some code, and it would normally give you a description of how the code works as well.
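For reference, the kind of completion you'd get back might look like this (my own minimal illustration, not the exact code from the video):

```python
def reverse_string(s: str) -> str:
    """Return the input string reversed."""
    return s[::-1]  # slicing with a step of -1 walks the string backwards

print(reverse_string("AutoGen"))  # prints "neGotuA"
```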
This is like having a single AI agent and talking to it, asking for help. If that represents a single agent, then in a multi-agent framework we can have more than one of these agents talk to each other to solve problems. Here we can see examples such as a three-way chat, or even a hierarchical chat pattern, where one AI agent talks in turn to other agents to solve a problem. In this simple two-way conversation we have a user agent talking to an assistant agent, where the user agent can represent you or me. We would ask the AI assistant agent to plot a chart of Meta and Tesla stock price changes year to date. The assistant comes up with some code, the user agent tries executing it and maybe runs into some errors, and if there are any, it gives the error back to the assistant to fix, and they keep going through this process until the problem is solved or the result is to your liking. We will start off performing a two-way chat and coding it so you understand how this works before we get into multiple agents. We're going to create a user agent and an AI assistant agent, give a configuration to the AI assistant agent (meaning we will connect to OpenAI's API), and then run it to see the output. Later on in this tutorial we will see how to use local models, which means that if your hardware can handle it, you can download models and run them for free without paying for OpenAI's API.
We're going to create a conda environment, though you're more than welcome to create a virtual environment instead. We're going to install pyautogen, and if you're asking why we aren't installing a package called autogen: that's simply because the name autogen was already taken on PyPI by something completely different, so they had to name the package pyautogen. We're going to create the configuration to connect to the model, create the agents, and then finally start the chat. Let's get started.

Okay, the first thing: if you don't have an IDE, which stands for integrated development environment (which basically means we're going to be running software to help us create software), I would choose PyCharm, especially if you're kind of new to this and not used to creating environments; it's going to make that easier for us. You can pay for the professional version, but I try not to pay for anything as much as possible, so instead I'll have a link in the description where you can download the PyCharm Community Edition, the free version. Just click the download for your machine and then open it up when you're ready.
Okay, once it's open, you probably won't have any projects if this is your first time using it, but either way you'll see a list like the one I have here. Go and click Create New Project. I'm going to name this autogen-beginner-tutorial. You can choose your environment here, either Conda or a virtual environment, whichever one you want; I'm going to choose Conda, and I'm going to choose a Python version. As of this tutorial I'm just going to use 3.11. You don't need to check the option to create a main.py script, because we'll be creating all the Python files ourselves anyway. Then just go ahead and click Create, and it will go through the process of creating the environment for us.

All right, here is our screen. I'm going to go and create a new directory under this project, so right-click, New, Directory, and we'll name it 01-two-way-chat. I'll also have this project in the description, so if you get lost at any point you can just go and retrieve the files from there. Now right-click on that new directory, choose Python File, and we'll name it main.py. Okay, now we're ready to code our first AutoGen application. Before we do, we need to install the Python package like I mentioned: in PyCharm, open up the terminal on the lower left-hand side and run pip install pyautogen. Once that's done, just minimize the terminal window and we can begin.
The first thing we need to do is import autogen. As we go, I'm going to show you multiple ways to do different things. First we're going to create a main function, so we write def main: and get ready to include all of our agents. Now the first thing I do, and that you'll do, for pretty much every AutoGen application is create a config list. We say config_list = autogen.config_list_from_json, and what this function does is retrieve a separate configuration outside of this main Python file and bring it in here for us to use; we'll create that file in just a second. We pass env_or_file="OAI_CONFIG_LIST.json", and what this does is bring in all the properties from a JSON file so we can use them for the configuration of our AI assistant and connect to the model. Right-click on the two-way-chat directory, click File, and type in OAI_CONFIG_LIST.json. I'm just going to copy the contents in here, but it's an array of JSON objects from which we retrieve the model and the API key for OpenAI. The model we'll be using is gpt-3.5-turbo, and you need an API key, so let me show you how to get that key.
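As a reference, the config file's contents can look like this (a minimal sketch; the placeholder key is obviously not a real one):

```json
[
  {
    "model": "gpt-3.5-turbo",
    "api_key": "sk-REPLACE-WITH-YOUR-KEY"
  }
]
```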
Go to platform.openai.com. At the top right, choose Log in; if you haven't created an account, go ahead and sign up, it's free. Once you've logged in, go to Settings on the left-hand side, then Billing. As you can see, I had already put a few dollars in here before, so I have a credit balance to use with the GPT family of large language models. For this course I would say just add a couple of dollars, because we're going to be using the gpt-3.5-turbo model for most of it (until we get to local model usage), and that model is really cheap to use. Once you've added a couple of dollars, go to API Keys on the left-hand side and create a new secret key; I'll call it beginner-tutorial. Copy it and simply paste it right here in the JSON file.

The next thing to do is create our two agents. We'll say assistant = autogen.AssistantAgent. We need a couple of parameters here: name, which we'll just set to "assistant", and an llm_config property, which is important for any assistant agent. For other projects I'll create a separate llm_config to put here, but for now we open braces and add a config_list property, giving it the config list we just created above. Now the assistant knows to connect to the gpt-3.5-turbo model on OpenAI with our API key. Then we need our user proxy agent: user_proxy = autogen.UserProxyAgent. It needs a name, which we can set to "user", and a code_execution_config. Any code that the user proxy executes, which it gets from the assistant agent, can be saved to a directory in our project, so inside curly braces we set work_dir (the working directory) to "coding". Another important property is use_docker, which we'll set to False.
If you have Docker, you know how to use it, and you have it running, you can set this to True (it's True by default). However, if you don't have Docker or you're not really sure what it is, make sure you set this to False or it will not run. Okay, that covers almost everything except for starting the chat. We'll come down here and say user_proxy.initiate_chat; we want to initiate the chat with the assistant, and now we need to tell the AI agent to do something, so we say message="Plot a chart of META and TESLA stock price change". There's one more property on the user proxy agent that I want to show you: after name (it doesn't really matter where, but after name), type in human_input_mode. This lets us choose whether to be a part of the chat with the AI assistant, or to say we never want to be a part of it and just let the user proxy agent and the assistant talk to each other, execute the code, and handle it themselves. That's what we're going to do, so type "NEVER" in quotes, then a comma. Finally, we add the if __name__ == "__main__": guard to run the main method. Okay, we're finished with that, so let's run it.
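Assembled, the finished main.py looks roughly like this (a minimal sketch of what we just built, assuming pyautogen 0.2 and the OAI_CONFIG_LIST.json from above):

```python
import autogen


def main():
    # Pull the model name and API key from the JSON file we created.
    config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST.json")

    # The AI assistant: connects to the model via llm_config.
    assistant = autogen.AssistantAgent(
        name="assistant",
        llm_config={"config_list": config_list},
    )

    # The user proxy: executes the code the assistant writes.
    user_proxy = autogen.UserProxyAgent(
        name="user",
        human_input_mode="NEVER",  # let the agents talk without us
        code_execution_config={"work_dir": "coding", "use_docker": False},
    )

    user_proxy.initiate_chat(
        assistant,
        message="Plot a chart of META and TESLA stock price change.",
    )


if __name__ == "__main__":
    main()
```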
You can just click the green arrow here, or run it from the top. Okay, the user initiated the chat with the assistant right here: we asked it to plot a chart of META and TESLA stock price change. The assistant came back to the user and said, here's a step-by-step guide, let's start with historical stock price data. The triple backticks with "python" mean this is code it's going to run, and when you see a hash followed by a file name, that's the file name it should save it under in your directory. So it created all this code, and now it goes back to the user, and the user tries to run it. As you can see it did, and it got an exit code of 1, because we don't have the pandas library. So the assistant creates a shell script to install the libraries we need; the user then executed that and started installing all of these libraries for us. Then the assistant said, okay, now that you have the required libraries you should be able to run this, and finally we terminate. On your end it may show the graph automatically; for me it just ran without displaying it, and that's fine, because you can get different iterations of the code, and yours might be different than mine. If we minimize this, you can see over here in the directory it created the coding working directory and the stock price chart Python file, with the code the AI assistant wrote saved inside. Awesome, we just had our first AutoGen application work.

All right, now what if we did want to interact with the AI assistant, and maybe change something at some point? In order to do that, look at the user proxy agent's human_input_mode, where we put NEVER: there are two other options, ALWAYS and TERMINATE.
If we say ALWAYS, this means that every time the AI assistant finishes something, we can either say, okay, that's good, but maybe add this or remove that, and press Enter for it to do something else. Or we can say TERMINATE, which means that whenever the agents are about to terminate, meaning they all think they're done, we get a chance before the actual termination to say, hey, you know what, I want you to try this instead. Let's test that now: just change it to TERMINATE and run this again. Okay, and look at that, I didn't even have to make an API call; it went ahead and created a graph for us from the same cached code. If I exit this, now when the assistant says terminate, we have the chance to give feedback to the assistant. Well, how about this time, instead of Meta, I want NVIDIA and Tesla to be the two stocks we compare. So I type "instead compare NVIDIA and Tesla" and press Enter, and we as the user send the AI assistant that message.
Now the assistant modifies things and executes the new code, and as you can see, it changed from Meta to NVIDIA versus Tesla stock price change. Once we're happy with that, we can just type exit and we're done. But instead of TERMINATE, what if we say ALWAYS and run this again? Now, the user told the assistant to plot a chart of Meta and Tesla, the assistant came back with the same code and told us what it was going to do, but we are already able to give feedback. What this means is that every time the user gives a task to the assistant agent, and the assistant agent returns some code or whatever it produces, we're able to provide feedback or intervene in the conversation. So every time the assistant does something, we as the user are allowed to say, hey, actually, instead of Meta do NVIDIA, or Apple, or whatever you want; it doesn't have to be stock price change, it could be a percentage.
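To recap, all of this behavior hangs off a single parameter on the user proxy (a sketch; the three values are the ones the video demonstrates):

```python
user_proxy = autogen.UserProxyAgent(
    name="user",
    # "NEVER":     agents run unattended
    # "TERMINATE": we give feedback right before the chat would end
    # "ALWAYS":    we can give feedback after every assistant turn
    human_input_mode="TERMINATE",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)
```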
This is how we can be a part of the chat if we wish to. Great, but this is basically doing what you would do with ChatGPT: just a one-to-one chat interaction. Now let's take a look at group chat. This means we can have multiple AI assistant agents talk to each other, each with their own task or job to execute. Let's say we want to plan things out and give more direction to each agent: we can have a planner agent, an engineer, a critic for the code, and a separate agent to execute the code to test it. This time we will just create more agents, create a group chat and a manager for the group chat, then initiate the chat and let all the agents come up with a plan to solve the problem. Let's code this.

All right, because this is a lot of code, I'm just going to paste it in and go through it. There are some new things here, so I'll make sure I explain those so that you understand them, and this will all be in my GitHub if you want to follow along.
The first thing we need to do is import autogen and dotenv. dotenv is a library that lets us take a .env file with some properties and load them into the config list for AutoGen. In the previous example we created a JSON file; in this one we're creating a .env file, which is just another way to do it. To do that we have to call load_dotenv first so that it can find the .env file and get the properties from there. Then we say config_list = autogen.config_list_from_dotenv, which is an AutoGen function: we point it at the .env file and, for the gpt-3.5-turbo model, tell it to look for the OPENAI_API_KEY in that file. Right-click on the group-chat directory, go to New, type in .env, and create the file. When you open it up, type OPENAI_API_KEY= and put your API key there, and now we can connect to the gpt-3.5-turbo model with that key. We have an llm_config with a few properties that are new here.
The first one is cache_seed. When we ran the two-way chat, you saw that there's a .cache folder. That means if we ran the main Python file again and didn't change anything, we'd get the exact same result, because it cached it for us. If you open up the cache directory there's a directory called 41, and 41 is the default cache seed it always uses if you don't give it a seed number; inside there's a cache database that stores all the information about the whole inference you had, from the prompt to all the results, and you can look through it. If you want a different outcome, you can change the seed to a different number each time, or you can set it to None and you'll always get a different outcome. The temperature is set to 0 because we want it to do exactly what we ask. Then we have the config_list property, which is the one llm_config actually needs, and then we have a timeout, which is in seconds, so 120 is just 2 minutes. Then we have our user proxy agent, which we created before, with a system message and a name; the code_execution_config means that when it goes to execute code, it should save that code into a "code" directory, so here in group-chat there should be another directory with the Python file in it once we run this. I set the human_input_mode to TERMINATE because I don't want to intervene until the end, and I want to know if it works.
Then we have four assistant agents: the engineer, the scientist, the planner, and a critic. They all have their role, and they know their role from their system message, so they'll follow a description or plan of what I want them to do. For instance, for the engineer: I want you to follow the approved plan (presumably from the planner), make sure you save the code to disk, and you write Python and shell code. The system messages give a bit more direction to the assistant agents. Finally, something else new is the group chat I talked about: you say autogen.GroupChat, give it an array of agents (all the agents we just created), and set a max_round of 12. Sometimes the agents can get into loops where they talk to each other but don't actually do anything; max_round makes sure that if that kind of infinite loop happens, it ends at some point no matter what. When we create the group chat we also need a GroupChatManager, which takes in that group chat and an llm_config. Finally we say user_proxy.initiate_chat with the manager and give it a message; this time we want to find papers on LLM applications from the last week and create a markdown table of the different domains.
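Assembled, the script looks something like this (a trimmed sketch; the system messages are shortened stand-ins for the fuller ones shown in the video):

```python
import autogen
from dotenv import load_dotenv

load_dotenv()  # make the .env file visible

# Map the model to the key name stored in .env
config_list = autogen.config_list_from_dotenv(
    dotenv_file_path=".env",
    model_api_key_map={"gpt-3.5-turbo": "OPENAI_API_KEY"},
)

llm_config = {
    "cache_seed": 41,        # change it (or use None) for a fresh outcome
    "temperature": 0,
    "config_list": config_list,
    "timeout": 120,          # seconds
}

user_proxy = autogen.UserProxyAgent(
    name="Admin",
    system_message="A human admin.",
    code_execution_config={"work_dir": "code", "use_docker": False},
    human_input_mode="TERMINATE",
)
engineer = autogen.AssistantAgent(
    name="Engineer",
    llm_config=llm_config,
    system_message="Follow the approved plan. Write Python/shell code and save it to disk.",
)
scientist = autogen.AssistantAgent(
    name="Scientist",
    llm_config=llm_config,
    system_message="Follow the approved plan. Categorize papers; you don't write code.",
)
planner = autogen.AssistantAgent(
    name="Planner",
    llm_config=llm_config,
    system_message="Suggest a plan and revise it based on feedback.",
)
critic = autogen.AssistantAgent(
    name="Critic",
    llm_config=llm_config,
    system_message="Double-check plans and code, and provide feedback.",
)

groupchat = autogen.GroupChat(
    agents=[user_proxy, engineer, scientist, planner, critic],
    messages=[],
    max_round=12,  # hard stop so the agents can't loop forever
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(
    manager,
    message="Find papers on LLM applications from the last week and "
            "create a markdown table of different domains.",
)
```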
Okay, now let's run this and see how it works: you can just right-click on the main Python file and select Run. So let's see what happened here. I asked it to find papers on LLM applications and create a markdown table. The planner went first in the group chat and created a plan for everybody; then the critic had some feedback on the plan, and then the engineer started creating the code. The filename comment at the top of the engineer's code normally means it will create that file in the working directory you provided. Then the critic takes over the plan, and the scientist also looks at it, and we get the same error in the code output: the value in question needs to be an integer, not a string. It tries to fix it, and finally it ends the chat. Now, I know what you're thinking: this didn't work at all, and you wouldn't be incorrect in saying that. This is probably because I had max_round set to 12; otherwise it would have continued trying to fix and revise the code, and because it didn't actually finish, it didn't save anything to the code directory. And that's okay; this is actually a good representation, because not everything works every time you want it to, it's just part of this. Maybe if you tried GPT-4 it would give you a correct output, and 3.5-turbo has given me the correct output for this many times too. If you want to run this again, note that it's already cached with the cache seed of 41, so try a different cache seed and see what it gives you.
All right, for the first project of this tutorial I want you to try something: I want you to try to create a snake game. The steps I would take are, besides making a new directory, create a snake.py file, create a new OAI_CONFIG_LIST.json file, create a user agent, and then a group chat. Make sure at least one of the agents is a coder, and maybe you want an agent playing the game, or a critic that says, hey, that is not a good game, do better. Then come up with the prompt to create the snake game. Go ahead and try this on your own; whenever you're done, press play again on this video and I'll show you what I did.

Well, this isn't exactly the snake game I asked for, but it's what I got: there's no apple or any red dot, and I just don't die. Perfect. Okay, I hope you figured something out, that you got something running, and that it actually gave you a snake game; after a couple of iterations it did for me, and as you just saw, my latest iteration was not snake. I don't know what that was, but it wasn't snake. What I did was create a new directory, 03-snake, with a main Python file and my OAI_CONFIG_LIST.json. In my main Python file I have the config list and the llm_config; you'll see I had to change the cache seed to 47 because I tried this a few times. I have a user proxy agent with a simple system message (which actually doesn't do much here), the human_input_mode set to NEVER because I just wanted to see what it would do, and then my code_execution_config. Then I had two other assistant agents. I had a coder, and if I were you, I would copy this part anywhere you want to make sure code gets saved, because as you've seen, it doesn't always save the code. The trick is a line in the system message telling the model: if you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. That will help ensure it actually saves the code.
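Something like this, appended to the coder's system message (a sketch; the wording mirrors the save-to-disk hint used in AutoGen's default assistant prompt):

```python
coder = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
    system_message=(
        "You are a skilled Python developer. "
        # The save-to-disk hint: the user proxy writes the block to this file.
        "If you want the user to save the code in a file before executing it, "
        "put # filename: <filename> inside the code block as the first line."
    ),
)
```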
Then I have a product manager just to help plan out creating the game. I created a group chat with the user proxy, the coder, and the product manager, created a GroupChatManager, and finally said user_proxy.initiate_chat with the manager, where the message is: I would like to create a snake game in Python; make sure the game ends when the player hits the side of the screen. After a different iteration I ended up getting an actual snake game. I can't play it, I suck, but that's a different issue. Either way, this took me more than one iteration to get working, and that's okay; you can also play around with different models to see what works best for you.

All right, so far we've talked about a couple of ways we can chat with agents. We have a two-way chat, where a user agent and an AI assistant talk to each other to produce an output, and we've had a group chat, where a user initiates a chat with a bunch of agents and they all talk to each other to perform a task or solve a problem.
Well, there are a couple of other ways that might make more sense depending on the needs of your project. With something called sequential chats, we can have agents work in a specific order that we want, meaning if we have three agents, we can initiate a chat with agent one, then with agent two, and finally with agent three. This works by essentially having an array of chats: a user agent will initiate a chat with the first assistant agent, and when they're done, the user agent will initiate a chat with the next, and so on. What we can do, though, is send context from each chat in the sequence to the next; this is called reflection_with_llm, and it summarizes the previous output and brings it to the next agent. Let's see how this is done with code.

Okay, back in our project, the first thing we're going to do is create a new directory called 04-sequential-chat, create a new Python file
called main, and create a new OAI_CONFIG_LIST.json. For the config list JSON, let's go ahead and put in our model and API key; you'll just switch in your own key. Now, for the main Python file I pasted some of the code here, but I'm also going to type some of it with you when we get to the sequential part. As always, we import autogen and create a config list, and I have something new here called filter_dict. Remember, the config file holds an array, so you can have more than one entry: I could copy this entry, add a comma, paste it, and make the second entry's model gpt-4. If I loaded this in the main Python file without the filter_dict, it would just grab the first entry, gpt-3.5-turbo, anyway. But if I wanted to use, say, GPT-4, I would have to put "gpt-4" in the filter, and it filters the array down to the entry, the model, API key, and any other properties, that you want to use. In that case it would look at the second entry, match the model gpt-4, and take its API key too. I just want to use gpt-3.5-turbo, so I'll keep it as is; this is just another way you can use the configuration. I have the llm_config with the config list and the timeout.
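A sketch of what that filtering looks like (assuming a config file with both models listed):

```python
import autogen

# OAI_CONFIG_LIST.json holds an array; filter_dict picks the entry we want.
config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST.json",
    filter_dict={"model": ["gpt-3.5-turbo"]},  # swap in ["gpt-4"] to use GPT-4
)

llm_config = {"config_list": config_list, "timeout": 120}
```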
Now I have three assistant agents. Assistant quote one has the system message: you are an assistant agent who just gives quotes; return TERMINATE when the task is done. I basically copied and pasted that for the second one, assistant quote two, who will give another quote. Then I have a third assistant agent that is going to create a new quote based on the others, which means I'll give it the context, the output from the first two, so it can create a brand new one. Another property here is max_consecutive_auto_reply, which I set to 1, meaning once they come up with the quote I don't need them to do anything else; this is just a simple task. Finally I have the user proxy agent, and I do have a new property here: is_termination_msg. You can give it a lambda that checks that the content is not None and that "TERMINATE" exists in the content, and if so, we terminate the chat. Okay, now for the sequential part. We're used to saying user_proxy.initiate_chat, giving it one of the assistants, for instance assistant quote two, and giving it a message; however, it's not quite like that here.
What we're going to do instead is say user_proxy.initiate_chats, with an extra s on the end, which is a different function. Inside it you create an array of chats. Each is a dictionary, and the first property is the recipient: the first assistant I want in the sequential order is assistant quote one. The message is the next property, and I want to say: give a quote from a famous author. The next property is clear_history, set to True; the next is silent, set to False. All silent means is that if it were set to True, you wouldn't really see the chat conversation, just the output; with False we get to see everything. Finally there is one called summary_method, and this is where we set reflection_with_llm. Again, there are two options: last_msg or reflection_with_llm. reflection_with_llm will take the whole chat conversation the user agent has with assistant quote one and summarize it; last_msg will just grab the last message in the chat. Now I'm just going to copy and paste this two more times, because we have a total of three agents. For the second one the recipient is assistant quote two, and I'll say: give another quote from a famous author. For the third recipient, assistant create new, I want it to create a brand new quote, so for the message we'll say: based on the previous quotes, come up with your own. And that's it, we're done. Also, PyCharm gives you these squiggly lines here; that's just saying you can reformat the file. Now the last thing we need to do is run it.
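The sequential part boils down to one call with a list of chat specs (a sketch, assuming the three assistants were created as just described):

```python
user_proxy.initiate_chats(
    [
        {
            "recipient": assistant_quote1,
            "message": "Give a quote from a famous author.",
            "clear_history": True,
            "silent": False,
            "summary_method": "reflection_with_llm",  # summarized for the next chat
        },
        {
            "recipient": assistant_quote2,
            "message": "Give another quote from a famous author.",
            "clear_history": True,
            "silent": False,
            "summary_method": "reflection_with_llm",
        },
        {
            "recipient": assistant_create_new,
            "message": "Based on the previous quotes, come up with your own.",
            "clear_history": True,
            "silent": False,
            "summary_method": "reflection_with_llm",
        },
    ]
)
```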
Okay, it's already doing a lot, so let's start from the top. We start a new chat: the user proxy talks to the first assistant, give a quote from a famous author. Assistant one returns "Don't cry because it's over, smile because it happened" from Dr. Seuss. Perfect, and then it terminates. Now that's done and we're starting a new chat: the user proxy starts a new chat with assistant two, saying give another quote from a famous author, but the context is: the quote provided is from Dr. Seuss and emphasizes the importance of cherishing what happened. Sometimes it gives you the exact quote; because I have it summarizing, this is just what it gives you. So now it gives another quote: the only limits for tomorrow are the doubts we have today. Okay, then it's done, and finally we start our third chat: based on the previous quotes, come up with your own, where the context takes the summary from the first two and adds it here. So now it creates its own: in the tapestry of life, cherish the threads of memories and weave them with threads of positivity. Perfect, and then we're done. This is how we can tell our agents to perform tasks sequentially, in a specific order.

The last way we dealt with chatting was sequential chats; now we're going to look at another way, called nested chat. What this means is that when an agent is done with a response, we can have a nested chat separate from the main chat chain. Let's look at this example: the user tells the writer to come up with a simple blog post about Meta.
The writer does, and sends it back to the user: a normal chat. But now we have a trigger on the writer, meaning that when the writer is done, we can start a new nested chat. For this example we have one nested agent, the critic agent. When the user talks to the writer, the writer sends a response back to the user agent; because our code registers a trigger on the writer to start a nested chat, the user will then take what the writer wrote, send it to the critic, and the critic will respond back to the user agent. We can keep this loop repeating until we're happy with the overall result. You might be asking, what's really the difference between this and just adding a critic to the group chat? Well, nested chats are a sequence of chats created by the receiver agent after receiving a message from the sender, and finished before the receiver agent replies. What this does is allow AutoGen agents to use other agents as their inner monologue to accomplish tasks; it kind of gives you an abstraction. Let's just look at how this is done in code.

All right, back in our IDE, click on the project and create a new directory; you can call this 05-nested-chats. Right-click that directory, create a new Python file and call it main, and we'll also create a new OAI_CONFIG_LIST.json file. This is going to be a little repetitive, I understand, since we've done it a few times already, so you could just go up to a previous directory and copy and paste that config list into this new one.
Okay, I just copy-pasted this code, so I'll run through it and show you at the end how we deal with nested chatting, because I said there's a trigger. We import autogen and create our config list based on the JSON file we just created, then create our llm_config given the config list. For our task we're just going to have a separate variable: write a concise but engaging blog post about Meta. We have the three agents we talked about: the writer agent with a name, llm_config, and a system message; a user proxy agent, which is typical; and the critic agent. We said the critic is what gets triggered once the writer responds. Then what I have here is a reflection_message function that takes four parameters; you'll know when it's triggered because I added a print statement that says "Reflecting..." so we can see it. It returns "reflect and provide critique on the following writing" plus the last message from the sender; that's what the [-1] index with the content gives us, the last message in the chat. Then we have user_proxy.register_nested_chats: we have to register the nested chat and give it a trigger, so it knows when to start. There's an array here, so you can have multiple recipients with multiple messages; right now we just need one. The recipient is the critic, the message is the function up here, the summary_method is last_msg, and max_turns is 1. Then we have the trigger, which is the writer. This is how we know when the critic starts: it happens whenever the writer has a response.
This will make more sense when we look at the output. Then we say user_proxy.initiate_chat: the recipient is the writer, the message is the task, and I give it max_turns of 2 so we can see the user proxy talk to the writer and then the critic. Let's go ahead and run main.py: just right-click main and choose Run.
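Here's the shape of the registration and kickoff (a condensed sketch of the pattern; the system messages are abbreviated stand-ins):

```python
import autogen

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST.json")
llm_config = {"config_list": config_list}

task = "Write a concise but engaging blog post about Meta."

writer = autogen.AssistantAgent(
    name="Writer",
    llm_config=llm_config,
    system_message="You are a professional writer. Polish your blog posts.",
)
critic = autogen.AssistantAgent(
    name="Critic",
    llm_config=llm_config,
    system_message="Critique the writing and suggest concrete improvements.",
)
user_proxy = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config=False,
)


def reflection_message(recipient, messages, sender, config):
    print("Reflecting...")
    # Hand the writer's last message to the critic.
    return ("Reflect and provide critique on the following writing:\n\n"
            f"{recipient.chat_messages_for_summary(sender)[-1]['content']}")


# The nested chat fires every time the writer replies to the user proxy.
user_proxy.register_nested_chats(
    [
        {
            "recipient": critic,
            "message": reflection_message,
            "summary_method": "last_msg",
            "max_turns": 1,
        }
    ],
    trigger=writer,
)

user_proxy.initiate_chat(recipient=writer, message=task, max_turns=2)
```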
Okay, the user tells the writer: write a concise but engaging blog post about Meta. The writer responds back to the user with a blog post and a title about Meta, and then we start the reflection. This means we're starting a new chat: the user isn't going to respond to the writer, the user is going to respond to the critic, because the trigger was the writer. So the user talks to the critic: reflect and provide critique on the following writing, which is just the last response from the writer. The critic critiques it and sends it back to the user, and the user goes back to the writer: the writing presents such-and-such, here are five things, and in conclusion, to write effectively about Meta and the metaverse, incorporate these other things. The writer takes the critique, modifies what it originally wrote, and sends that back to the user. So again: the user talked to the writer, the writer replied to the user, and because the writer's response triggered the nested chat, the user then started talking to the critic. If we had a sequential nested chat, the user would talk to the critic and all the other nested agents before getting back to the writer. I know this was a bit of a different concept, and I already have another video about how sequential nested chats work, so give this a try, see what you can do with it, and let me know how it works for you.

Today I'm going to show you AutoGen logging. You might be asking how this can help you and what it can be used for: we can take a look at a performance analysis of our LLM calls.
We can see how long the chat took to complete, how many tokens we used, and what the cost was (unless you're using open-source local LLMs), and we can simply log the responses. How it works: we start the log before we initiate the chat, and after it's over we simply call stop; AutoGen then stores this performance data in a database for us. Let's see how it looks in code. The first thing we do is call autogen.runtime_logging.start, and for the configuration we just give a dbname; I give it logs.db. Then we have our initiate-chat logic, and afterwards we call autogen.runtime_logging.stop. Once that's done, it creates a logs.db for me if I haven't run this already, and inserts the rows of data. Then we can retrieve this data and see the performance analysis of our calls.
One way to do that is DB Browser for SQLite, free software where all you need to do is open your database and you can see all the tables and all the rows of data in each table. Here, for instance, a cost of 0.0 is from when I used gpt-3.5-turbo; it's not exactly free, just a very minimal fraction of a cent, and the rows in the two-to-three-cent range are from when I used GPT-4, so you can see the cost increase. Another way is to install the pandas library, and then we can retrieve the data however we want and make it look nice in the terminal. I think this would be particularly useful for open-source local LLMs, so we can decide which one was quickest to finish the task, which used the fewest tokens, and which gave us the better responses. Well, let's get into the code and look at some examples.

All right, the first thing we need to do is create two files: a main Python file and the OAI_CONFIG_LIST.json file to hold our model and API key. You could also use this file to store the base URL if you're using open-source local LLMs.
Over in the config list JSON file I simply have the model I'm going to use, and you would insert your API key here; if you're using a base URL with something like LM Studio, you could also paste that here. Back in our main Python file, the first thing we need to do is set up our imports, which means we also need to install them: open the terminal and run pip install pyautogen pandas. Now we create the llm_config for the assistant agent; we really only need the config_list property here, which calls autogen.config_list_from_json with the OAI_CONFIG_LIST.json file we created earlier, and that gets the model and API key we stored there for the LLM call later. Now we can start our logging: we call autogen.runtime_logging.start and, for the configuration, like I showed in the slide, give it a dbname of logs.db. Over here in my autogen-logging directory, where this file lives, I don't have a logs.db yet; it will be created for us automatically. For this simple example I just have an assistant agent, to which I pass the name and the llm_config we created above. For the user proxy agent we give it a name; we don't want it to execute any code, because in this case we're just going to ask a question; the human_input_mode is NEVER because I'm not going to talk back and forth with the LLM, whatever happens is what we'll use; and then we just have a simple termination message. Then we say user_proxy.initiate_chat with the assistant, and the message is: what is the height of the Sears Tower? Only respond with the answer, then terminate. After initiate_chat is over, we simply call autogen.runtime_logging.stop to end the logging. Now, you could stop here and skip ahead to where I talk about DB Browser for SQLite, but I'm going to show you the code to use pandas so we can see it in the terminal.
First I have a function to get the logs. I pass in the database name, which by default is the logs.db we'll be creating. There are several tables in there, but the one we care about is chat_completions. I use sqlite3 to connect to the database, run a SELECT statement against that table, execute the query, fetch all the rows of data, gather the column names, zip them together, and build dictionaries out of them. Basically, this function gets all the rows of data from the chat_completions table in the database. On a side note, this will all be on my GitHub so you can use it as well.
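Here's a compact sketch of the whole flow, start/stop plus retrieval (the helper name get_log_data and its query are my own; the runtime_logging calls and the chat_completions table come from AutoGen):

```python
import sqlite3

import autogen
import pandas as pd

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST.json")

# Start logging before any chat; rows land in logs.db.
session_id = autogen.runtime_logging.start(config={"dbname": "logs.db"})
print("Logging session ID:", session_id)

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={"config_list": config_list},
)
user_proxy = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    code_execution_config=False,
)
user_proxy.initiate_chat(
    assistant,
    message="What is the height of the Sears Tower? "
            "Only respond with the answer, then TERMINATE.",
)

autogen.runtime_logging.stop()


def get_log_data(dbname: str = "logs.db"):
    """Fetch every row of the chat_completions table as a list of dicts."""
    with sqlite3.connect(dbname) as con:
        cur = con.execute("SELECT * FROM chat_completions")
        rows = cur.fetchall()
        cols = [c[0] for c in cur.description]
    return [dict(zip(cols, row)) for row in rows]


df = pd.DataFrame(get_log_data())
print(df.head())
```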
We're going to call that function, store the result in a log_data variable, and convert it to a DataFrame. We'll add a total_tokens column based on the response, pull the content (the chat) out of the request, and also pull the specific content out of the LLM's response. Now let's just run this to look at an example. It printed the logging session ID when we started the logging; we asked it the question about the height of the Sears Tower, and it responded that it's now known as the Willis Tower and is 1,450 feet tall. There's only one row of data in this database, and six columns total in the DataFrame: an ID (you can't quite see it here, that's what this 0 is), the request and response, which are pretty long so we don't see everything in the terminal, and at the end the end time and the total tokens. That's the round-trip token total the LLM call cost us. Pretty cool, but now let's look at the actual tables in DB Browser for SQLite. All you need to do is go to sqlitebrowser.org, go to the download section, and download the version for your machine. Once you've done that, run it, and when you come to this screen, click Open Database, and here's our logs.db. Let's open it up; I'll zoom in so this is a little easier to see (changing the font size makes it a bit wonky to look at). It has five tables; we're going to look at the chat_completions table, and now you can see all the columns it actually stores, which is more than what I showed in the pandas DataFrame. We have the session ID we started with, the one we printed out; the full request, and if you click on it, over on the right-hand side you can see the actual JSON we sent; and in the response you can see the content, the actual answer we got back from the LLM. You can see the total tokens was 515: the prompt itself was 488, and the completion, the actual response, was only 27. Then here is the cost, and is_cached lets you know whether the answer was already cached, so if I were to rerun this one more time, I would have another row of data with that value set to 1.

All right, let's switch gears a little and talk about something more fun to experiment with: a vision agent. What is this? It's basically a vision model, or image-to-text model, that can take an image and describe it.
For instance, if we have this image, which is just something I got online, a mitochondrion I believe, we can ask the model to describe it. The model we're going to use is the GPT-4 Vision model, and when we ask it to describe the image, we get this output. Let's code this with AutoGen and see how it works. All right, in your project create a new directory; let's call it 07-vision. Then create the main Python file, which I'm going to call vision this time, and create the OAI_CONFIG_LIST.json. The difference in the JSON file is that this time we're going to use the gpt-4-vision-preview model, which is what will describe images; then input your API key. Here is the code for the vision model, which again you can take from my GitHub. We import autogen, and we import a new agent called MultimodalConversableAgent. What I have here are six different images; most of these I got from Pixabay or just found on the internet, and I copied their image URLs so we can use them as variables in our messages. I have a config list that we get from the JSON file we just created, and I make sure to filter it for the vision-preview model, which also gets our API key. We instantiate the MultimodalConversableAgent, give it a name and the llm_config; the only difference here is that I set the temperature
to 0.5. Then I have a simple user proxy agent; we're not going to execute any code, so I set code_execution_config to False. For the first example I say user_proxy.initiate_chat with the image agent, same as always, but look at this message: I have an f in front of the triple quotes, which lets me put variables inside the quotation marks. I say, can you describe this image in detail, and I add an image tag, sort of like HTML, giving it the mitochondria image URL. Let's run this: right-click on vision.py and run it.
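The shape of that first example (a sketch; MultimodalConversableAgent lives in AutoGen's contrib package and needs the lmm extra, pip install "pyautogen[lmm]", and the image URL here is a stand-in):

```python
import autogen
from autogen.agentchat.contrib.multimodal_conversable_agent import (
    MultimodalConversableAgent,
)

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST.json",
    filter_dict={"model": ["gpt-4-vision-preview"]},
)

image_agent = MultimodalConversableAgent(
    name="image-explainer",
    llm_config={"config_list": config_list, "temperature": 0.5},
)
user_proxy = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    code_execution_config=False,
)

mitochondria_image = "https://example.com/mitochondria.png"  # stand-in URL

# The <img URL> tag inside the message is how the agent receives the image.
user_proxy.initiate_chat(
    image_agent,
    message=f"""Can you describe this image in detail?
<img {mitochondria_image}>""",
)
```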
As you can see, it says "can you describe this image in detail". You only see this image tag, but it's actually getting the HTTP URL of the image; it doesn't show up here, but in the background the agent knows what it is. It says this image features a stylized representation of a mitochondrion, often referred to as the powerhouse of the cell (probably the only thing I knew about it). Okay, great, that's a quick, simple example of how it works. Now let's do something a little more in depth. In the second example I initiate a chat like normal with the image agent and say, what is this a picture of, and describe everything in it, and I give it an image of a goldendoodle. Then I also send another message; this is new: you can say user_proxy.send to send another message to a recipient. I say, what dog is this a picture of (this one is an image of a corgi), and which of these dogs tends to bark more, this one or the previous dog image? It's going to answer all of these questions for us when we run this, so go ahead and right-click vision.py and click Run. We ask it the first question, about the image of the goldendoodle, and it answers that first. The image explainer says, hey, this is a picture of a charming apricot-colored puppy with a wavy, fluffy coat, likely a poodle or a poodle mix. Well, I know it's a goldendoodle. And there is the image of the goldendoodle; it kind of has a blue collar there, and even in the background there are a pair of black rubber boots (well, they're kind of boots, maybe). Either way, it did a pretty good job. Then it says, what is this dog a picture of, and this is actually the picture of the corgi, and we ask which of these dogs tends to bark more. It says the dog in the image is a Welsh Corgi, and regarding the tendency to bark, it's not possible to determine; I think it goes on to say it doesn't really know, but that poodles can maybe be more vocal. Either way, it answered two different questions in just one chat.
In the final example I say, what is this a picture of, describe everything in it, and this is just an image of Luigi, Yoshi, and Mario. Then again we use user_proxy.send: what game is displayed here (I'll show you what it is after it tells us), and among all these characters, which has sold the most games? Let's run this and see what it tells us. First off, let me show you: the first image is Luigi, Yoshi, and Mario, and the second one, about the Super Nintendo, is actually (if you can't quite see it) supposed to be Donkey Kong Country 2. So for the first one, it recognizes Luigi, Yoshi, and Mario. Then: what game is displayed here, among all these characters which has sold the most games, and can you give me some figures for all characters shown? It says the image displays the Super NES with a cartridge inserted, and the game is Donkey Kong Country 2, which is correct. It goes on to explain more, but regarding game sales, Mario is the character who has sold the most games, given that he's the main character of the Mario franchise; that makes sense. Then it gives some more figures: the Donkey Kong franchise has sold tens of millions of units, with the original Donkey Kong Country for SNES selling over 9 million alone. Awesome. This is what the vision agent can do: we can give it images and tell it how we want them described, and it does its best to tell us what the image is, and I think that's really powerful.

All right, we just learned about the vision agent, an image-to-text model, meaning we give it an image and it describes with text what that image is about. Now if we were to reverse that, we'd have text-to-image: we give a prompt as text to the model and it outputs an image file for us, similar to something like Midjourney if you've ever used that.
Well, the thing is, OpenAI's DALL·E model requires a paid ChatGPT subscription, which is about $20 a month, and if you don't have one, you'll get an error saying you basically don't have access to that model. Because I don't pay for that subscription, I've put the code in another directory alongside all these other projects; if you have a subscription, all you have to do is input your API key and whatever prompt you have, and the code will be there for you to use if you want it. However, later on in this course I'm going to show you how to do it for free using an inference server through Hugging Face. Let's go ahead and move on to the next topic.

So far we've been using OpenAI's API, which means we had to spend some amount of money, and we had to use their family of models.
Well, there is a bigger selection out there, and I'm going to show you a way to use an LLM locally using software called LM Studio. This means we will download a model and start a local server that allows us to connect that model to AutoGen, so we can use open-source LLMs such as Mistral, Gemma, and Llama. As of this video we're on version 0.2.20. How this code and connection will work: we download LM Studio, find a model in the software, download that model, load it into the software, and finally start a local server, all done through the software. One of the pros of LM Studio is that it makes this easy with a nice user interface, and then finally we connect it all by changing just a couple of the configuration properties in AutoGen. Let's download LM Studio, go over it really quickly, and then run it with AutoGen.

All right, the first thing we need to do is go to lmstudio.ai and download LM Studio for your machine; once you've done that, just run it, and this is the homepage you'll be greeted with. Now we need to download a model. You can choose any model you want; I've already downloaded ones that are small enough for my computer to handle, but you're more than welcome to scroll down on the homepage.
Here's Qwen 1.5, for example: I don't have it downloaded yet, so if I wanted it, I'd just click this download button, which opens a bar at the bottom where you can see the download starting. You can scroll down and choose any of these, or use the search bar at the top in the middle: search for something like Mistral, press Enter, and another page shows all the results on Hugging Face for Mistral models. On the left-hand side you can also go to AI Chat and simply play with your model: at the top, select a model to load, load it up, and once it's loaded you can start chatting with it down here. What we need, though, is on the left-hand side, this double arrow; that's the local server, so click it. At the top again it's the same: select the model to load and choose the one you want to use. I'm just going to use the Phi-2 model because it's small enough for my machine, so I click it and it starts the load process. When that's done, this Start Server button right here lights up green; all you need to do is click Start Server, and that's it, we can now use this model in our agent workflow. Okay, so what we're going to do is create a new directory: right-click on the project and name this one 09-lm-studio, then create a main Python file. I'm not going to use an OAI config list this time, because I'm just going to show you the new configuration you need directly in the main Python file.
Okay, it's going to look a little different, but I import autogen and create a main method, which we call at the very bottom. I have an llm_config, which I just named phi2; I gave it the config list, the cache_seed, which I set to None, and max_tokens. Setting cache_seed to None means it will not create a .cache directory under your project. Now, the config list here is a little different, because we're not using an OpenAI API key for the first time: for api_key we simply put in "lm-studio", for base_url we use the value we get from LM Studio, and the model name, which you also need, you get from LM Studio too. If we go back to LM Studio, like I already mentioned, this is where you get the localhost URL, and if you scroll down just a little you can get the model name here, and just copy and paste that.
Now I just have a simple ConversableAgent, which is the base agent class; I give it a name, the llm_config, and the system message that your name is Phil and you are a comedian. I also have a simple user proxy agent with a name, code_execution_config set to False, and human_input_mode set to NEVER. When you're using LM Studio, you need to set default_auto_reply on any user agent. If you don't have a default auto reply, it will end up failing, because it looks for the content in the chat history, and if that's empty, it just fails. To avoid that, you can put anything here; it doesn't matter what, it just has to be something. Then I say user_proxy.initiate_chat with Phil: tell me a joke. Let's see if it actually tells me a joke: right-click on main.py and click Run. When the user proxy initiates the chat with Phil to tell me a joke, back in LM Studio you can see it's actually streaming and accumulating all of the tokens for the joke, and when the streaming is done, it gives us the response back in our IDE. Okay, well, I am using the Phi-2 model, and it is a small model; even though it runs really well on my machine, it doesn't always give the best response. It's saying "this is a program for you... I have a joke to tell you today, you see, I am like...", so it's kind of giving me an outline for a joke, and that's okay. This was just an exercise to show you how you can download models onto your computer, run a server, and connect it with AutoGen. That means you are not paying anything for an OpenAI API key, you are not paying anything to run this, and that's something that I do like.
Now, this is going to be a little more advanced, but I know you can do this. So far we've let agents produce answers and execute code, but maybe we have user-defined functions that we want these agents to execute: for example, a specific currency exchange function, or a function that always saves the last result in the chat history to a file. And just to be clear about function calling: it really is just a Python function, that's all it is. For example, we could take the reverse-string function we got from ChatGPT at the very beginning of this tutorial, put it in the code, and have an agent specifically execute it.

So how is function calling going to work with AutoGen? In this example we're going to have a simple two-agent chat: a user agent and the coder, which is just an AI assistant. You'll see some new things here, like register_for_execution and register_for_llm, so let me explain. We're going to create a currency converter function — just another Python function — and we want these agents to execute it. The way we do that is with Python decorators around the function: we'll say @coder.register_for_llm and @user_agent.register_for_execution. What happens, and I'll show you when we execute all of the code, is that when the user asks the coder a question, the coder suggests a function — this currency converter — to the user agent; the user agent is the one that actually executes the function; and then the coder gets the response, sends it back to the user, and we get the output. I know that's a fair bit of information, but it might be easier to explain if we just go through the code, execute it, and look at the chat history.

Okay, back in the project, let's create a new directory as usual — how about 10-function-calling — and create a main Python file. Instead of creating another config-list JSON,
I'm just going to copy one from above and paste it here. Okay, let's actually code this one together, because it's going to be a little different from what we've done. First, the imports: we import autogen, plus Literal and Annotated, and I'll show you why we need those. Then we say config_list = autogen.config_list_from_json with env_or_file="OAI_CONFIG_LIST.json", because we created the JSON file, and a filter dict. You don't necessarily have to filter here, so you can omit it, but I make it a habit: we filter by model, and since it takes a list, we say ["gpt-3.5-turbo"]. Next is the llm_config: I give it the config_list, and I set the timeout to 120 seconds, or two minutes.

We only have two agents to create, a currency bot and the user proxy, so let's create the currency bot first: currency_bot = autogen.AssistantAgent with the name "currency_bot" and a system message I'll just paste in here, which basically says: for currency exchange tasks, only use the functions you've been provided with, and reply TERMINATE when the task is done. Because it's an assistant agent, we also need to provide the llm_config. Then the user proxy: user_proxy = autogen.UserProxyAgent with the name "user_proxy", an is_termination_msg I'll paste in here — just a lambda I'm creating so it knows when to terminate — human_input_mode set to NEVER, max_consecutive_auto_reply set to 5 (I'm showing you some of these properties so you get familiar with them; hitting that limit normally doesn't happen very often), and code_execution_config set to False.

Here's where we use Literal: we say CurrencySymbol = Literal, and in this simple example we have two different currency symbols, "USD" and "EUR" for the euro. Now, I know I said we'd create one function, and you could combine all of this into one, but I actually have two separate functions, one calling the other; let's just do it, and it will make sense when we're done. The first is the exchange rate: we define an exchange_rate function with two parameters, the base currency, typed as CurrencySymbol, and the quote currency, also a CurrencySymbol, and we want it to return a float. I'm just going to make up the rates — I know they aren't one-to-one with the real ones. If the base currency equals the quote currency, they're the same, so we return an exchange rate of 1.0; if the base currency is USD and the quote currency is EUR, we return 1 / 1.09; if the base currency is EUR and the quote currency is USD, we return 1 / 1.1; otherwise we raise a ValueError saying "Unknown currencies" with the base and quote currency variables. So this is just a helper function that will be called by the actual currency calculator.
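Written out, that helper looks like this; the rates are the made-up ones from the walkthrough, not real exchange rates:

```python
from typing import Literal

CurrencySymbol = Literal["USD", "EUR"]

def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    # Same currency on both sides: the rate is 1.
    if base_currency == quote_currency:
        return 1.0
    # Made-up rates, just for the example.
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.09
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1 / 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")
```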
Let's give this some space so you can see what I'm doing. All right, this is going to be the actual function we talked about in the PowerPoint; I'm going to create it first and then give it the decorators. So: def currency_calculator, with three parameters. Remember, we're converting currency, so we need a base amount, then the base currency and the quote currency. For function calling, you're pushed to give a defined type for each of these parameters, so for the base amount we say Annotated, giving it the type and a little description: "Amount of currency in base currency". We do the same for the base and quote currencies: base_currency is also Annotated, as a CurrencySymbol (either USD or EUR), described as the base currency, and it defaults to "USD" if we don't actually give it one; quote_currency is Annotated as well, another CurrencySymbol, described as the quote currency, with a default of "EUR". And we want the function to return a string for the LLM. Inside, quote_amount equals the exchange_rate function we created above — let me scroll up just a little — called with the base currency and the quote currency, multiplied by the base amount; that's the amount we want converted from the base currency into the quote currency. Then we simply return an f-string with the quote amount and the quote currency.

Now, we're not done, because we haven't added our decorators like I showed you in the PowerPoint. The first decorator names the user agent that will actually execute this function: @user_proxy.register_for_execution(). Then we say @currency_bot.register_for_llm(). We're almost done; the last thing is to initiate the chat, so you say user_proxy.initiate_chat — you're going to be pros at this by now. The user proxy obviously wants to initiate with the currency bot; I had called it "current bot" at first, sorry — if you want to refactor a name, you can press Shift+F6 in PyCharm — so now it's renamed to currency_bot. We initiate the chat with the currency bot, and the message is: how much is $1,234.56 USD to EUR? If you want, you can highlight this and reformat the file. And sorry, there is one more thing I forgot: you have to give a description for the LLM, so it knows something about this function. In the currency bot's register_for_llm you say description="currency exchange calculator"; if not, you will get an error. Okay, now let's try to run this.
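Here's a sketch of the whole thing, building on the CurrencySymbol and exchange_rate definitions above; the termination lambda is a typical one, since the video pastes it in without reading it out:

```python
from typing import Annotated

import autogen

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST.json",
    filter_dict={"model": ["gpt-3.5-turbo"]},
)
llm_config = {"config_list": config_list, "timeout": 120}

currency_bot = autogen.AssistantAgent(
    name="currency_bot",
    system_message="For currency exchange tasks, only use the functions you have "
                   "been provided with. Reply TERMINATE when the task is done.",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    # Assumed lambda: end the chat once the bot says TERMINATE.
    is_termination_msg=lambda msg: msg.get("content") is not None
        and msg["content"].rstrip().endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config=False,
)

# The currency bot suggests the call; the user proxy executes it.
# register_for_llm needs a description, or you'll get an error.
@user_proxy.register_for_execution()
@currency_bot.register_for_llm(description="currency exchange calculator")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{quote_amount} {quote_currency}"

user_proxy.initiate_chat(
    currency_bot,
    message="How much is $1,234.56 USD to EUR?",
)
```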
We want to know how much this is from USD to EUR. Now it's the currency bot's turn, and it has a suggested tool call named currency_calculator; the arguments are the base amount, the base currency USD, and the quote currency EUR. When we go to actually execute the function, the user proxy is the one executing it, and the response is the float value in euros. So the currency bot says: okay, this amount is equivalent to 1,326 in euros. Let's walk through that one more time. The user proxy gave a message to the currency bot; the currency bot said, in order to do this I have a suggested tool call, currency_calculator, and gave it the arguments; the user proxy actually executed it; and then the currency bot said, now that you have the function result, this is the currency exchange.

All right, we just went over function calling; let's take a slight intermission before we go into the next topic, which will be tools. At the end of the day, whether you say function calling or tools, these are both just user-defined Python functions, but as far as the AutoGen framework goes, there are some slight differences in how they're coded. At the end, when we run tests between function calling and tools in the AutoGen framework, you'll see a couple of the differences — with correct prompts, and with prompts that have nothing to do with the functions — and we'll see the results of that. You're doing great so far; let's get into the next video. Okay, so what are tools? Agents writing arbitrary code for us
is useful; however, controlling what code an agent writes can be challenging. This is where tools come in. Tools are just predefined Python functions that agents can use: instead of writing arbitrary code, agents can call tools to perform actions such as searching the web, reading files, or calling remote APIs, and you can control what actions an agent can perform. In this example we ask the assistant agent: what time is it in NYC? The assistant now has two different tools to choose from, get_time and get_weather, which again are just user-defined Python functions. Let's dive a little deeper into how this works. We still have a user agent asking the assistant agent what time it is in NYC. The assistant agent decides: do I need to use a tool? If the answer is no, the agent just asks the model the question and gets an output. If the answer is yes, the agent can choose either tool, or maybe both, to come up with the answer; in this case it will choose the get_time function, or tool, execute it, get a response, send that to the model, and then give us the output. So here's the plan: we'll run some tests, recapping the function-call test really quickly, then test the tools with different prompts, and I'll show you the differences through the AutoGen framework. Then we'll see what happens when we purposefully give prompts that aren't related to any of the function calls or tools, and you'll see the difference between them. Okay, let's go ahead and start coding.
Let's create a new directory under our project, 11-tools, and create a main Python file — actually, we can just call it tools. The first thing we need is our imports: we import autogen, Annotated again (we need it for tools just like we did for function calls), and datetime. Now we start creating our tools, and the first one is get_weather — remember, we had a get_weather and a get_time. It takes a parameter called location, a string, with a simple description of the parameter. I'm not going to actually call an API; this is just an example: if the location given is Florida, say it's hot in Florida; else if it's Maine, say it's cold in Maine; else say I don't know this place. Then we have get_time, which takes a time zone, just a string; we call datetime.now() from the library we imported, which is just a way to get the current time, and return it as a string. Those are the two tools we'll be using and registering with our assistant agent.
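Sketched out, the two tools look like this; the strings are the toy responses from the example:

```python
from datetime import datetime
from typing import Annotated

def get_weather(location: Annotated[str, "The location to get the weather for"]) -> str:
    # Toy example: no real weather API is called.
    if location == "Florida":
        return "It's hot in Florida!"
    elif location == "Maine":
        return "It's cold in Maine."
    else:
        return f"I don't know this place {location}."

def get_time(timezone: Annotated[str, "The timezone to get the time for"]) -> str:
    # Simplified: this returns the machine's local time; it does not
    # actually convert into the requested timezone.
    return f"The current time in your timezone ({timezone}) is {datetime.now()}."
```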
Let's go ahead and create our agents, beginning with the assistant agent. We give it a name, and the system message is "You are a helpful AI assistant. Return TERMINATE when the task is done" — that last part is actually a helpful portion of the system message. For the llm_config, I have all the information inline this time, so you don't have to create a separate file for this part of the project. For the model — I was just playing around with different models — we'll use gpt-3.5-turbo, and you put your API key here. Now the user agent: we give it the name "user", a termination-message check so it knows to look for the word TERMINATE from the assistant agent, and human_input_mode set to NEVER; we don't really want to intervene this time.

The next thing is to register the tools with the assistant agent. We say assistant.register_for_llm, similar to function calling, and give it a name; I just use the name of the function, get_weather, though you could choose something different. Then the description — the description matters, because it helps let the assistant know this is the correct tool for what we want, so a good description helps. Finally, in the parentheses you pass the actual Python function we created. We do the same thing for the get_time method. Then we register the tool functions with the user proxy agent, because the user proxy agent will execute these tools for us: user_proxy.register_for_execution, give it the name — the function name works fine — and pass the actual Python function, for get_weather and also get_time. The last thing is to initiate the chat: user_proxy.initiate_chat with the assistant, and the message will be the first one we did in the PowerPoint: what is the time in New York City?
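Roughly, the registration and kickoff look like this; note that register_for_llm and register_for_execution, called this way rather than as decorators, each return a function that you then pass the tool into. The code_execution_config line is my addition so no code executor gets spun up; the video doesn't dictate it:

```python
import autogen

assistant = autogen.AssistantAgent(
    name="assistant",
    system_message="You are a helpful AI assistant. Return TERMINATE when the task is done.",
    llm_config={"config_list": [{"model": "gpt-3.5-turbo", "api_key": "YOUR_API_KEY"}]},
)

user_proxy = autogen.UserProxyAgent(
    name="user",
    is_termination_msg=lambda msg: msg.get("content") is not None
        and "TERMINATE" in msg["content"],
    human_input_mode="NEVER",
    code_execution_config=False,  # assumption: tools only, no code executor
)

# The assistant learns about the tools (name + description)...
assistant.register_for_llm(name="get_weather", description="Get the weather for a location.")(get_weather)
assistant.register_for_llm(name="get_time", description="Get the current time for a timezone.")(get_time)

# ...and the user proxy is the one that actually executes them.
user_proxy.register_for_execution(name="get_weather")(get_weather)
user_proxy.register_for_execution(name="get_time")(get_time)

user_proxy.initiate_chat(assistant, message="What is the time in New York City?")
```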
Now let's go ahead and run this and see what we get. Okay, it ran pretty quickly. The prompt was "What is the time in New York City?", so the assistant said to the user: hey, I have a suggested tool we can use, get_time, and the parameter it passes is the time zone America/New_York, based on the city I gave it. The user agent then executes the tool, and here is the response: the current time in your time zone, which is America/New_York, is — well, I ran this late at night on the 24th — and it goes back to the assistant, which says the current time in New York City is 9:54 p.m. That's the response; we're done.

Okay, great, that worked, but now I want to give it two questions: what is the weather like in Florida, and, this time, what is the time in Paris? When we run this, it should use, or execute, both of the tools we gave it; let's see if it does. Okay, here's the prompt. The first suggested tool call was get_weather, with the location argument Florida; the user executed it and returned the result, so it's saying it's hot in Florida. Then the assistant said the suggested tool call this time is get_time, with the time zone Europe/Paris; the user executed it, and this is the time for Europe/Paris. The model then concatenates both, saying the weather in Florida is hot and the current time in Paris is — well, it's actually about the same as my time. I didn't write code to compute the exact time for the given time zone, so this is still my local time; that's okay, I don't live in Paris. Then we terminated, and we're done. Great: we have an example of it using a single tool, and now we have an example of it using
multiple tools. Okay, now, I said I would go back to function calls and run that test, so you have an idea of what that's doing; let's do that really quickly. If you remember, I had this simple example for the function call, the currency converter, so let me run it. We had the prompt; the currency bot — the assistant agent — gave the suggested tool call (you'll notice it still actually says "tool call" here), currency_calculator, with all the arguments; the user proxy executed it; the currency bot gave the result back to the user; and we finished the chat. So, as you can see, there isn't really any visible difference between function calling and tools, at least as far as the AutoGen framework is concerned.

However, now I want to give prompts that have nothing to do with the functions, and we'll do it first with function calling. This is a currency converter, so what if I say: can you give me the answer to 2 + 2? Something super simple — it's a currency converter, but this is just a basic math problem. Let's run it and see what it does. Look at what happens: it repeats itself a few times, but I gave it the prompt and it said, "I can help you with currency exchange tasks; let me know if you have any currency exchange calculations to perform," and it never actually gave me the answer to 2 + 2.

Now let's do the same thing with the tools. Let me just copy the exact prompt so there's no variance: can you give me the answer to 2 + 2? Run this one and see what happens. Okay, what happened here is that the AI assistant decided it didn't need to use tools — or that there were no suitable tools to answer this question — so it went directly to the model instead. The user went to the assistant, the assistant said the answer to 2 + 2 is 4, and that's it, we're done. So it still answered, and with tools that gives us the possibility to do something a little more specific while staying generalized: if you have something like, hey, I need to save this to a file, or save this to SQL, and you don't necessarily want the assistant to invent code against a database that already exists, you can say: here's a tool, and if I ask it to do something, it can do it. One more thing I wanted to point out, something I found while looking around the OpenAI API: for function calling, they've actually deprecated functions in favor of tools and tool_choice, so if you look up tools or tool_choice in their docs, that's what they prefer we use now.
So I hope this helped you understand what tools are. All right, this video isn't exactly AutoGen-related, but it's part of the next little project, where we'll be creating images. Like I said before, DALL·E 3 costs money: you have to pay the $20 per month for the ChatGPT subscription, and I don't want to do that. You probably don't want to either — and if you do, I already gave you the code. Instead, I want to introduce you to something called Hugging Face. If you haven't been here: right now they have over 620,000 models, and on the left-hand side all the different types of models are listed out for you — text-to-image, image-to-text, image-to-3D, text-to-3D, text-to-video, translation, text-to-speech, and many more. What we're going to focus on is inference servers. When you go to huggingface.co/models, choose text-to-image, and it filters to all the text-to-image models; the one we're going to use today is Stable Diffusion XL Base 1.0. And it's not only about the ChatGPT subscription: running something like a text-to-image model locally is taxing on your hardware, and if you don't have the hardware to run it, it's going to take a really long time. You'd be better off using a service like RunPod: rent a server, put all the Python code there, create the files, and run the text-to-image model there. However, I'm going to introduce you to another way. On the model page, choose Deploy at the top right, then choose Inference API (serverless). If you don't know what that is: inference is basically the whole round trip from the prompt to the completion. You can just send a request to them and they'll give you back the response — in this case, you send a text prompt in a POST request and they give you back an image — and they actually give you all the code you need to do it. So in this video we're simply going to copy, paste, and run it so you can see how it works. If you haven't already, you'll just need to sign up, but it is free. Then you just copy the code. We'll create a new directory, call it 12-text-to-image, create a new Python file, and paste everything in; it even gave me my Bearer token for authorization. The only thing I'm going to do here, after formatting the code, is add one line, image.save, with a file name, so the image saves locally. Now I'm simply going to run this.
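For reference, the generated snippet looks roughly like this; your own Bearer token goes in the header, and the image.save line at the end is the one I added:

```python
import io

import requests
from PIL import Image

API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-xl-base-1.0"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder: use your own token

def query(payload):
    # POST the text prompt; the serverless Inference API returns raw image bytes.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.content

image_bytes = query({"inputs": "Astronaut riding a horse"})

image = Image.open(io.BytesIO(image_bytes))
image.save("test.png")  # the one line I added, so the image is saved locally
```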
It finishes, I minimize the terminal, and here we are: we have a test.png, and here is our astronaut riding a horse. It is that simple. If I were to do this on my machine, it would take so long — if my machine didn't crash before it finished. This is what's called an inference server. Of course, this one was free; they do have dedicated servers that you might want to use for production, but the free tier was fine for us to use for development.

Okay, for this next project, here's what I want you to do: create a main Python file, create the config list like we have been, and create a user agent and an assistant agent. Then create a tool called create_image and register it correctly, like I showed in the video. The tool should create an image using the inference server on Hugging Face, like I showed you, and save it locally — and be creative. Go ahead and pause the video and try this out; whenever you're done, resume the video and I'll show you how I did it.
Okay, great job if you got it working — and even if you only got partway there, great job for trying. If you weren't able to, here's what I did. I have my API URL and the headers; that was the main piece. I created a tool called create_image that makes the request and gets the image bytes — this is pretty much exactly what we did before, nothing really different. I generate a random number for the file name, which I'll show you over here, so I can rerun this without changing the file name or overwriting anything. Then I open the image, save it, and finally return the message we were given. The tools tend to complain if you try not to return anything — you don't really need to, but it will warn — so I just have it return the message so the warning isn't there. Then I have the llm_config, an assistant agent, and the user proxy agent. Finally, for the assistant agent I registered the create_image function with register_for_llm, and for the user proxy agent I registered the same create_image function with register_for_execution. Then I had the user proxy initiate the chat: create an image of a professional football player.
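My version of that tool looks roughly like this; the surrounding agent setup follows the same pattern as the earlier tools example, and the random file-name scheme is just my quick hack:

```python
import io
import random

import requests
from PIL import Image
from typing import Annotated

API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-xl-base-1.0"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder: use your own token

def create_image(message: Annotated[str, "The prompt describing the image to create"]) -> str:
    # Same round trip as before: POST the prompt, get image bytes back.
    response = requests.post(API_URL, headers=headers, json={"inputs": message})
    image_bytes = response.content

    # Random file name so reruns don't overwrite the previous image.
    file_name = f"image_{random.randint(0, 100_000)}.png"
    Image.open(io.BytesIO(image_bytes)).save(file_name)

    # Return something, or the tool layer warns about a missing result.
    return message
```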
Now, I said football, and I even spelled it that way, knowing that internationally football means what we call soccer here in the States. What I got back is actually an American football player: his hand is kind of morphed into the ball, and I'm not sure what's going on with his mask up here, but it's not a bad image. So I hope you learned something from that and were able to get something working; just trying, and doing some repetition, will help you get used to how the multi-agent framework works.

Today we're going to use Reddit to create a newsletter. This will be done using the Reddit loader from LangChain, and then AutoGen will orchestrate the agents that create the newsletter. The first thing I want to show you is how to go to Reddit and create a simple app under your user,
so that we can get a client ID and a secret to use in the Reddit loader, which lets us pull whatever subreddit and posts we want. The URL will be in the description; you come here, and if you haven't done this before, there's a button to create a new app. Click it and fill in the information: I gave mine the name Tyler newsletter, you choose the radio button for "script", and you can just put the reddit.com URL in the URL field. Then click "I'm not a robot", and when you click Create App it will say the application was created and show your client ID here and your client secret here. These are what we'll use inside the Reddit loader, so just copy and paste them when we get there. Well, we might as well dive into the code. Again, like I said, you need your client ID and secret, so I'm pasting my client ID here and my client secret here.
From LangChain's document loaders we take the RedditPostsLoader. It takes a user_agent string — I think you can name this whatever you want, but I follow the convention they use, "extractor by" plus my Reddit username. You can give it categories here; I think they only allow controversial, hot, new, rising, and top, and I'm going to put new, since I just want to retrieve the newest posts from subreddits. Then for queries you can put a list of the subreddits you want, and you can also set the number of posts — I believe it's per subreddit — that you'd like to have. Then we say documents = loader.load(), and it loads all of that information into the documents. We also have the config list and the llm_config; I'll set it up so you can use Ollama, LM Studio, or OpenAI's API — that will all be in the JSON configuration.
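A sketch of that loader setup, with my ID, secret, and username swapped for placeholders and an example subreddit list; depending on your LangChain version, the import may come from langchain.document_loaders instead:

```python
from langchain_community.document_loaders import RedditPostsLoader

loader = RedditPostsLoader(
    client_id="YOUR_CLIENT_ID",          # from the Reddit app you created
    client_secret="YOUR_CLIENT_SECRET",  # from the Reddit app you created
    user_agent="extractor by u/YOUR_USERNAME",
    categories=["new"],                  # controversial / hot / new / rising / top
    mode="subreddit",                    # retrieve posts from subreddits
    search_queries=["OpenAI"],           # example: list of subreddits to pull from
    number_posts=3,                      # posts per subreddit
)

# Each document carries page_content plus metadata such as the post URL.
documents = loader.load()
```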
Then I have two agents. The first is the writer agent, which I instruct: don't change the information you're given, just parse the page_content from the Reddit posts, and no code will be written. If I gave it a generic writer prompt, it would take the information from the posts and try to morph it into something else, and I don't want that; I just want the information from the posts. Then we have a standard user proxy agent. Now we initiate the chat: it's just a simple user proxy initiating a chat with the writer. I want it to extract the page_content and the URL from each of the documents from above — in case you had more than one document, there will be lots of page contents across the documents — and then I give it some instructions to separate everything out: create a newsletter, with a catchy newsletter title here, and then format it in Markdown. It will just print this out for us; a sketch of this setup follows.
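Sketched out, it might look like this; the system message and newsletter prompt are paraphrased from what's on screen, the llm_config comes from your JSON configuration, and the metadata key for the post URL is an assumption you should check against your loader version:

```python
import autogen

writer = autogen.AssistantAgent(
    name="writer",
    system_message=(
        "Do not change the information you are given. "
        "Only parse the page_content from the Reddit posts. No code will be written."
    ),
    llm_config=llm_config,  # from your JSON config (OpenAI, Ollama, or LM Studio)
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
)

# Flatten the loaded documents into one prompt for the writer.
posts = "\n\n".join(
    f"{doc.page_content}\nURL: {doc.metadata.get('post_url', '')}"  # assumed metadata key
    for doc in documents
)

user_proxy.initiate_chat(
    writer,
    message=(
        "Create a newsletter from the Reddit posts below. Give it a catchy title, "
        "format it in Markdown, and add a read-more link for each post.\n\n" + posts
    ),
)
```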
It's simple, but it does give you a lot of information: you can search for any AI-related subreddit and get the latest posts from there. Let me delete the cache and rerun this. Okay, it finished. Here is the user chatting with the writer, and the writer talks back: okay, here's the content, parsed and formatted in Markdown, ready for a newsletter. It has a title, and I wanted three posts on OpenAI, so here are three different posts; I set it to the newest ones, so as of now this one was maybe 40 minutes ago. Let's click on one of these — number two, "Suggestion for updated laws of robotics" by Samuel. I open it up, and it is by him; this one was actually 28 minutes ago, and here is the full description. As a newsletter, it gives part of the post and then says "read more here", and you can click the link to read the rest of it. Here I've just copied and pasted the output into a Markdown preview, which looks nice, right? It lists each post with its author and creates a link for you to read more about it. I didn't even do a whole lot with the prompt — I just wrote a simple outline of what I wanted the newsletter to look like, and it created this for me, which is pretty cool.

All right, you just finished the course, and thank you for watching.
I hope you were able to learn something, especially if this was your first time working with a multi-agent framework or with AI agents in general. I'll have references to everything you need down in the description below. Now, especially with AI right now, things are changing at a very rapid pace, so depending on when you see this, if something doesn't work, please leave a comment saying, hey, what I'm trying isn't quite working. I try to update my GitHub repositories with all the new changes and updates that are happening almost every other week, but I can always miss something. Anyway, thank you for watching; please subscribe, like, and comment, and I'll see you in the next video.