Deploying Billions of AI Agents is Easier than You Think


Cole Medin

Mark Zuckerberg predicted in an interview recently that before we know it, there are going to be BIL...

Video Transcript:

Recently in an interview, Mark Zuckerberg, the founder of Meta, made a prediction that in the near future there are going to be billions of agents running out in the wild, literally more than the number of people walking on this Earth. It might seem like a bold claim, and that is what we're going to talk about today. But first, check out this clip:

"So I think we're going to live in a world where there are going to be hundreds of millions or billions of different AI agents, eventually probably more AI agents than there are people in the world. We just don't believe that there's going to be kind of one big AI, whether it's a product or a model that uses... we kind of fundamentally believe in having this broad diversity and different set of models, and that, you know, every business and people are just going to want a lot of their own stuff that they're going to make. And I think that that's kind of going to be interesting. It's going to be a lot of what makes this interesting."

And he isn't wrong. In fact, he is spot on, especially considering just how easy it is to deploy and scale AI agents in a very configurable way. Imagine a world in the future where businesses have a core set of AI agents to pick from, built by people like you, and have them deployed and customized to their needs instantly. I'm talking dozens or hundreds of agents for millions of different businesses. That is the billions of agents that Mark is talking about.

Now, I'm not going to give you some grandiose introduction where I deploy hundreds of agents at the same time. What I am going to show you in this video is how simple it is to take an AI agent and deploy it into the cloud in a scalable way. A lot of people, maybe yourself included, build AI agents on their own computer, but when it's time to bring the agent into the cloud they get intimidated, and it seems very difficult to actually take the agent into production and scale it for multiple businesses. But I'm going to show you that with the certain setup that I'm going to
walk you through, it's actually really easy. So this is going to be a super practical guide for you if you're looking to deploy your agent. My goal here is also to get your imagination going: taking this kind of process and seeing how far you can extend it, so that you can partake in building the billions of agents that are going to take over the world, probably in the near term. And it all revolves around my favorite tool for deploying code, which is Docker. If you don't love Docker by the end of this video, I am not doing my job right.

What we're deploying today is the GitHub agent that we've been working on as part of the miniseries covering my full process for building AI agents, because this video is a part of that series. This agent can consume entire repositories for code Q&A, and now we are bringing it to the cloud. The best part is we're going to be doing it completely for free to get started. Super exciting stuff, so let's go ahead and dive right into it.

All right, so let's set the stage here a little bit, and then we'll dive into getting things set up with Docker. I'll show you how easy it is to deploy this to the cloud, and I'll even deploy the front end that we built in the last video in the series to the cloud as well.

So this is our GitHub agent right here that we've been building with Pydantic AI, which is my favorite agent framework right now. We have these tools defined for it: one to get the metadata for a GitHub repository, one to get the file structure of it, and then of course our tool to get the contents of specific files. Using all these tools together, that is how our agent is able to analyze code and answer questions that we have about a repository that we give it the URL to.

Also in this series, we created an API endpoint for our agent with FastAPI. That way we have an endpoint for our agent that we can hit with a front end, like I showed early in the series how to use Agent Zero, which
is my platform I built for you on the Live Agent Studio to instantly have a front end for any agent that you run locally. We also showed in the last video in the series how to create your own custom front end. We did that with Lovable and bolt.diy.

So we've got our endpoint here. This is all nice, and we have everything ready to deploy in the cloud as an API endpoint, but this is where we get to the difficult part. Everything's running locally and it's running great. I've got it open in my terminal right here, and I can even go to Agent Zero and test it out. I'll paste in this prompt that I have ready for it, asking it to describe the bolt.diy GitHub repository. It's hitting our agent, things are working well, it's calling the tools, and it got the answer for us. This is working absolutely phenomenally, but it's all running locally.

At some point I want to actually deploy it to the cloud, and that's what we're going to be doing right now. But there's an issue here. We have to make sure that the machine that we are deploying this API

to has the right version of Python and that it has the right packages installed. There are things that we have to do before we can just go ahead and run our endpoint script right here to host our agent.

So at this point you might be thinking: what the heck, Zuckerberg, how can you have billions of agents out there when all these machines need to have everything configured exactly right for your agents? If a machine goes down, you've got to spin up another one and make sure that it has everything set up. And what if you want to run on a serverless architecture and it doesn't support your database driver by default, so there's no way to install that? How do you manage everything for your agent? It can seem kind of difficult to get all this set up when you have all these prerequisites for your agents, even simple things like Python and your packages.

If only there was a way to package everything together in a neat way, where you have a deterministic set of steps to get everything set up for your agent and then execute it at the end, so you have this perfect little isolated environment that you can put anywhere and it'll work on really any platform with any architecture. That is where Docker comes in.

Take a look at this. It's basically exactly what I just said we need. We have a set of steps right here that sets up everything in this isolated environment. You can kind of think of it as its own private little computer that has just what you need for the agent installed, and then it even runs the agent at the end right here with our FastAPI endpoint.

A huge bonus to setting this up, besides making it so you can deploy this really anywhere, is that you can use an AI IDE like Cursor or Windsurf to define all of the steps to create your agent in code. That way you don't have to create some external process to go into a server and run these commands one at a time. You can actually do this in code and have AI help you create it. I did not create this Dockerfile
myself. I just used Windsurf to make it, I'm going to be honest, and that's why I'm not going to go through some in-depth guide right now for how all these commands work. I do understand them, but right now the focus is just taking this Dockerfile I created with AI and using it to deploy our agent, because with this we basically have a little virtual machine running our agent that a lot of different platforms support. So that's what we're going to do right now: take this Dockerfile and go over to one of those platforms. I'll show you just how easy it is to deploy this and scale it, and I'll talk about how you could deploy it to other types of platforms as well.

All right, so the first thing that I do before I ever deploy a Docker container to the cloud is test it locally, and I'd highly recommend that you do the same. Just like with any other code that we create, we want to make sure that there's nothing wrong in our Dockerfile, especially because we're using AI to create it here. That's what I did and what I'd recommend doing, but we want to make sure it didn't hallucinate anything. The worst thing is when you have an error when you deploy in the cloud and you don't actually know if it's because of your cloud configuration or because there actually is an error in your code. So you want it running perfectly locally first, and then you deploy it into the cloud.

We're going to build and run the container using Docker Desktop, so just go to docker. 
io file, this updated readme, all the code for the agent: I'll have that in a GitHub repository, which I'll have linked in the description of this video so you can download this and follow along completely as well.

So I'm going to go ahead and take this command right here to build the container with Docker. It's just a very simple command to build. I'm in the directory that we're seeing in VS Code, which is why I have a dot for building in the current directory, so it'll find the Dockerfile in the current directory. For me it built instantly, just because I already have things cached. I built this when I was prepping the video. For you, when you run it the first time, it'll take a little bit because it's going to run all of these commands to get that perfect environment set up for exactly what your AI agent needs. The thing that'll take the most time is installing all of the Python packages for your agent.

Then, going back to the readme, we can now run this command to spin up our container. I'm going to copy this, go into my terminal, and paste it. We're exposing port 8001 because that is the port for our AI agent endpoint, and we're passing in the .env file because our agent needs the .env file just like when we run it outside of our container. I'm going to run this, and the output that we get from FastAPI looks just like what we usually see, but now it's all running within the container itself. You can go to Docker Desktop and check that out yourself too.

So I'm going to go back over to the Agent Zero front end and make sure that things are still working now that it's running within the container. I'm just going to send a little test message this time, because I just want a fast response. And there we go, it's working well. If I go back to my terminal, we got the output right here
from the Pydantic AI agent. Things are looking great, so we know that everything is working perfectly locally. Now we can take the same little isolated machine and deploy it to the cloud. Because it is this isolated environment that we set up exactly for our agent, we have absolute certainty that since it's working locally, it will also work no matter where we deploy it in the cloud. Whether we go serverless or deploy on a dedicated server, it doesn't matter, because it's all running the same Docker container. That is the beauty of Docker and what makes it so scalable and so easy to deploy. So let's go ahead and do that now.

There are a million different ways to deploy Docker containers in the cloud. You can go serverless with AWS or Google Cloud, or go to DigitalOcean and use their App Platform or their Droplets, like you might have done with n8n. There are so many options for you because of how flexible Docker containers are.

For this video specifically, I chose Render. I'm not sponsored by them in any way. This is just the one that I found the easiest for deploying Docker containers, as you'll see in a little bit. They also have a free tier, which makes it so easy for you to get started. You can use their free tier both for the Docker back end for the agent that I'm about to show you and for deploying the front end that we built in the last video in the series. I'll show you how to do that at the end as well. There's also great documentation for deploying Docker on Render that I'll have linked in the description. They just have great documentation in general, which is another reason that I picked them.

To actually get started deploying, you'll go to your dashboard, just dashboard. render.
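As a recap of the local test described above, here is a minimal Python sketch that assembles the two Docker commands: one to build the image from the Dockerfile in the current directory, and one to run it with the agent's port published and the .env file passed in. The image tag `github-agent` is a hypothetical name, and the port is assumed to be 8001 (the transcript's "80001" is not a valid port number). The exact commands in the video's readme may differ.

```python
import shlex


def docker_build_command(image_tag, context_dir="."):
    """Return the argv for building an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", image_tag, context_dir]


def docker_run_command(image_tag, port=8001, env_file=".env"):
    """Return the argv for running the image with the port published and env vars loaded."""
    return [
        "docker", "run",
        "-p", f"{port}:{port}",   # publish container port on the same host port
        "--env-file", env_file,   # load environment variables from a file
        image_tag,
    ]


if __name__ == "__main__":
    tag = "github-agent"  # hypothetical image name, not taken from the video
    for argv in (docker_build_command(tag), docker_run_command(tag)):
        # Print the commands rather than running them, so the sketch works without Docker.
        print(shlex.join(argv))
```

To actually execute a command instead of printing it, each list could be passed to `subprocess.run(argv, check=True)` on a machine where Docker is installed.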

png repo, so it does that intelligent determination for you. Very, very convenient. I'm going to select the main branch, which is there by default. You can pick the region that's closest to you or your customers and leave the root directory the same. For the instance type, we're just going with the free tier right now. Take note of these limitations, but it is a great place to get started.

Then for the environment variables, this is one of the most beautiful parts of Docker. You can use environment variables to determine how things are built in the Dockerfile, and this is also how you can get things configured differently to have your AI agent set up and customized for different clients or businesses or departments of your business, whatever it might be. So I'm going to be adding in my environment variables this way, just copying them in from my .env file.

Before I do that, I just want to speak a little bit to this, because it is one of the things with Docker that makes it so scalable. You can have environment variables here that you deploy for different instances. I can go on Render right here and have one instance for an agent for my sales department and then another instance for an agent for my marketing department, and they can have environment variables set to determine different things with the agent, to make it behave differently and be customized to the people that are actually using that agent. That's another part of what makes it so that these Docker agents can scale so well: you use environment variables to configure them, essentially.

So what I'm going to do here is pause the video, enter in all the environment variables that I have locally, and then come back once that is done.

All right, the environment variables are set. I'm not going to go over all these in detail because I cover that in other videos in the series, but the important thing is it was super easy to bring them in, and this is how you can configure so much different behavior with your agent if you are
using your environment variables well.

Then for the advanced section here, there is nothing that you have to change. We have already set up everything that we need. It's so easy with Docker because it's the Dockerfile that's really instructing how to get this instance set up for exactly what your agent needs, with all of those dependencies. So I'm going to go ahead and click on Deploy Web Service. Once you go into the view where it shows the build logs, you can watch everything in real time as it's pulling the code from GitHub and running your Dockerfile to build and serve your agent and the API endpoint for it. I'm going to pause and come back once it's done with this entire setup.

That took about 5 minutes, and now our AI agent endpoint is live on Render. Take a look at this. We got output that is very familiar to us: this is exactly what we saw in our terminal running locally. We also got this nice little message saying our service is live, and now we can go ahead and test this out. So I'm going to go back over to Agent Zero and test it, and then we'll get into deploying the front end on Render as well.

I'm going to copy this link right here, go back over to Agent Zero, and start a new conversation. Let me open up the settings. The only thing I have to change, because I set everything else exactly the same for Supabase and everything, is the URL. So I'm going to replace localhost with what we have from Render. We've got this special Render URL, and you can configure this to be your own domain. All the flexibility you would need, Render has got. Then it's going to be the same extension here: /api and then the Pydantic GitHub agent. I'll go ahead and click on Save, it'll refresh, and I'll have a new conversation.

Let me just go back and steal the same question here. I'll use the exact same question to make sure that things are working well. So we'll send this in right here, and then we'll go back over to
Render and make sure that it's actually talking to our agent. And there we go, we got a new request out to our agent. It took a little bit, so there's a bit of a delay there, but we got the response here. It got the repo info and ran that tool, and back over in Agent Zero we have a response. So this is working perfectly. It took like 5 minutes and we have the Docker container running up on Render already.

There are so many ways that you could take this forward: adding load balancing, deploying to a serverless architecture, getting more Render instances spun up to scale your agent horizontally, or going up in your instance size to scale it vertically. There are so many more things that I want to get into in other videos on my channel, but this is a fantastic starting point, just showing you with Docker how easy it is to deploy any AI agent you could possibly build.

All right, the last thing we want to do really quick is deploy the custom front end that we built in the last video in this series to Render as well. Just as a refresher, we've got this GitHub repository, and it's very simple. We just started with Lovable, built our app there, and then moved it into b.
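Returning to the environment-variable configuration discussed earlier, where one instance serves a sales department and another serves marketing, the idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the variable names `AGENT_DEPARTMENT`, `AGENT_MODEL`, and `AGENT_SYSTEM_PROMPT`, the `AgentSettings` class, and the default values are all hypothetical and are not the variables used by the GitHub agent in the video.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    """Per-instance configuration read from environment variables (hypothetical names)."""
    department: str
    model_name: str
    system_prompt: str


def load_settings(env=None):
    """Build settings from a mapping of environment variables (defaults to os.environ)."""
    env = os.environ if env is None else env
    department = env.get("AGENT_DEPARTMENT", "general")
    model_name = env.get("AGENT_MODEL", "default-model")  # placeholder, not a real model name
    system_prompt = env.get(
        "AGENT_SYSTEM_PROMPT",
        f"You are a helpful assistant for the {department} team.",
    )
    return AgentSettings(department, model_name, system_prompt)


if __name__ == "__main__":
    # Same code, two differently configured instances, as if deployed twice.
    sales = load_settings({"AGENT_DEPARTMENT": "sales"})
    marketing = load_settings({"AGENT_DEPARTMENT": "marketing"})
    print(sales.system_prompt)
    print(marketing.system_prompt)
```

The same container image can then be deployed several times, with each instance given its own values for these variables in the hosting platform's environment settings, so no code changes are needed per department or client.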