ChatGPT Operator is expensive....use this instead (FREE Open Source)

NetworkChuck
🌐 Build your next project on Hostinger with an INSANELY fast VPS: Get 10% off with code NETWORKCHUC...
Video Transcript:
We can now give a task to an AI agent like, hey, go find me a Japanese VCR that supports TBC on eBay and add it to my cart. Oh, also make sure it's working. And the AI agent will just simply go out, open a web browser and do this thing. It's like giving a task to an assistant and you can go about your day.
By the way, that's based on a true story. I'm actually working on a video where I need a Japanese VCR and I did this a few weeks back. Now I'm doing this with OpenAI's Operator.
They released this a few weeks back. It's a research preview, so don't be surprised if it's kind of janky. It's also only available to Pro users, which means you've got to be paying OpenAI 200 bucks a month.
But I found an open source alternative. It's like this. It'll open up a browser, do the whole thing.
Actually, I think it's kind of better. It's free, open source. I'll show you how to use it right now.
Get your coffee ready. This is actually pretty fun. Hey, NetworkChuck from the future here. Coming up, I'm going to pit Operator versus the open source option to see who can create and purchase a virtual machine in the cloud, and then I'll have 'em log into the terminal and create a file.
Who can do it the fastest? Can they do it at all? I don't know.
And by the way, this segment is made possible by our sponsor, Hostinger. We'll see what happens. The project is called Browser Use: "Enable AI to control your browser."
It's created by these handsome fellas, and I have to say the project is very impressive. It does a lot. Let's peruse it a bit.
Actually first, just know there is a paid version of this. If we go to their official website, not on GitHub, we can see that they're backed by Y Combinator, meaning they've gotten some funding, and here they're even touting their performance, saying they are 2% better than Operator. At 30 bucks a month, they are still cheaper than ChatGPT Operator.
The enterprise option, yes, a month. Okay. I mean it made me stop and look at it and go, what?
But anyways, open source is what we care about. We can host this ourselves, use our own local stuff, even local AI. We don't have to go to the web unless we want to have, I mean, our web browser be accessed via the web, or rather, unless we want to go out to a website with our web browser. I said that backwards.
I need some more coffee. And what I love about this project is that it doesn't feel like it's in a research mode like ChatGPT's is, and it's very programmatic, meaning that if you are really into building AI agents, which I'm getting into, you can program your agents and have them do all kinds of insane things. They have some examples here, like add grocery items to your cart and go check out.
And it's obviously all done with code on the side. Don't be scared. I'm going to show you an option that's very GUI friendly.
You can add my latest LinkedIn follower to my leads in Salesforce, read my resume and go find jobs for me, write a letter in Google Docs to my papa thanking him for everything and save the document as a PDF. Now, thankfully, you don't have to know how to code anything if you just want to take it for a spin yourself.
Right now, if I go back to the browser-use account, I can see one of the projects is called web-ui. This is very easy to set up.
We're going to walk through it right now and you can try this yourself with Ollama. So a couple of things you're going to need to make this happen. First, you'll need some coffee.
That's just the rules. I didn't make 'em. Maybe I did. Everything in IT requires coffee: networkchuck.coffee.
Two, you'll need a machine to run this on if you want to run this locally, which is what we're doing right now. So Mac, Windows, or Linux.
I will be demoing this setup on Windows, using WSL, which is the Windows Subsystem for Linux. So it's basically Linux and it won't be very different from the bare-bones Linux or Mac setup. And if you're like, Chuck, I have no idea what WSL is, I have a video on that right here.
And honestly, that's pretty much all you need. Oh, you know what? I lied.
You're also going to need some sort of AI to use, right? We're using an AI tool. One tool you can use that's completely free, completely local, completely awesome is Ollama.
Go out to ollama.com, click on download. You can install it for Mac, Windows, Linux, and it's very quick and easy.
Have that going. If you want to use OpenAI or Claude or any of those cloud-based models, all you need is an API key. I'll show you what that looks like.
It's actually, it's going to be better than a local model because they've got more resources. Okay? First thing we'll do is launch our terminal.
My favorite place to be, and because I'm in Windows land, I need to jump into Linux with WSL. I'll launch my Ubuntu, I think it's 22.04. Yeah, it's 22.04.
Now, the first thing you want to do is make sure you have Python 3.11 installed, at least 3.11. The easiest way to do that is with pyenv. With pyenv installed, all you have to do is type in pyenv install 3.11.
I already have it installed, and then you do pyenv global 3.11 to make it live, and you can switch back and forth between Python versions. It's awesome. Link below.
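If you'd rather confirm the "at least 3.11" requirement from inside Python itself, here is a minimal sketch. The helper name meets_minimum is made up for illustration and is not part of pyenv or the web-ui project.

```python
import sys

MIN_VERSION = (3, 11)  # the minimum mentioned in the walkthrough


def meets_minimum(version=sys.version_info, minimum=MIN_VERSION):
    """Return True if `version` is at least `minimum`, comparing (major, minor)."""
    return tuple(version[:2]) >= tuple(minimum)


# Report which interpreter is actually active after `pyenv global`.
print("Running Python", ".".join(map(str, sys.version_info[:3])))
print("OK for web-ui" if meets_minimum() else "Too old: install 3.11+ with pyenv")
```

Comparing tuples keeps the check correct for versions like 3.9 versus 3.11, where a plain string comparison would get the order wrong.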
Now real quick, make sure you have Python 3.11 by typing in python3 --version and you should see Python 3.11. Now we're going to clone this Git repo, the web-ui. Copy this command, paste it here, cloned, and then we'll jump into that directory by typing in cd web-ui. Now to make sure we keep things clean, we're going to create a Python virtual environment.
We'll type in python3 -m (for module), specify venv, and then name our virtual environment .venv: python3 -m venv .venv. Hit enter. And by the way, if you've never used a virtual environment, you may not have the module installed. We can do that right now by typing in
pip install virtual... ah, why is my cursor bouncing around... virtualenv, just like that. Now with our virtual environment created, let's activate it. We'll type in source .venv/bin/activate.
Boom. This creates a nice little box for us to play in, and no other stuff is going to be impacted by the things we install. Now we'll use the command pip install -r and we'll type in requirements.txt.
This is a file that's right here in our directory and it's going to describe the requirements we need for this project. It'll do it all for us right now. Ready, set, go.
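Before installing requirements, it can help to confirm the virtual environment really is active, so packages land in the "little box" and not system-wide. A small sketch, assuming a standard venv-style environment; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Inside a venv, sys.prefix points at the environment directory,
    while sys.base_prefix still points at the underlying Python install."""
    return prefix != base_prefix


if in_virtualenv():
    print("Virtual environment active:", sys.prefix)
else:
    print("No virtual environment active; run: source .venv/bin/activate")
```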
And we'll watch it happen while we're sipping some coffee, and done. And then one more thing we have to install is this tool called Playwright, which I've never heard of, but I think it's essentially doing headless browser stuff. It's amazing.
Just copy and paste that. I already have it installed, so I should be good. Yours just might take a moment. And finally, one more thing we have to do is get our environment file ready to go.
They do have an example environment file that we're going to copy to our own. So we'll type in cp .env.example and we'll copy that file to .env, just like that. Now let's edit that .env file with nano.
And this is not required, by the way: nano .env. And here we can add any kind of API keys we want to have here. So OpenAI, Anthropic.
We can also specify an Ollama endpoint, which normally, if you have Ollama installed, you'll just want to leave as localhost. Now for me, I do have an external Ollama server that's more powerful. Terry, have you not heard about Terry?
Terry is my AI server I built in this video here. He has dual 4090s.
It's amazing. But I'll add his IP address and we'll use him for my stuff. And then I'll go ahead and add my OpenAI API key and my Anthropic one because I'm going to show you what they feel like.
And don't worry, I will end up revoking these keys. So it's okay that you see them right now. When you're done here, hit Ctrl+X, Y, Enter to save.
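To double-check which providers ended up configured after editing the file, a rough sketch like this can read KEY=VALUE lines and list the ones that are filled in. The key names OPENAI_API_KEY, ANTHROPIC_API_KEY and OLLAMA_ENDPOINT are assumptions here, so compare them against the project's own .env.example; the sample values are placeholders, not real keys.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured(values, keys=("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "OLLAMA_ENDPOINT")):
    """Return the subset of `keys` that have a non-empty value."""
    return [k for k in keys if values.get(k)]


# Hypothetical file contents; 11434 is Ollama's default local port.
sample = """
# providers
OPENAI_API_KEY=
ANTHROPIC_API_KEY=sk-ant-example
OLLAMA_ENDPOINT=http://localhost:11434
"""
print(configured(parse_env(sample)))
```

To run it against a real file, pass the file's text instead of sample, for example open(".env").read().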
Then now all we have to do is run this command. Let's scroll down and find it. They do have a Docker option, but Docker can be kind of tricky.
If you want to try it, go ahead. The local setup is easier for me. So this command right here, it's going to launch the webui.py script. Copy and paste that.
Hit enter and we should be off to the races. Yeah, yeah, it's working. Okay, so now we're going to navigate out to our browser and go to localhost, port 7788.
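If the page doesn't load, one quick thing to rule out is whether anything is listening on that port at all. A minimal sketch using only the standard library; port 7788 comes from the walkthrough, and the helper name is_listening is made up.

```python
import socket


def is_listening(host="127.0.0.1", port=7788, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


print("web-ui reachable" if is_listening() else "nothing on port 7788 yet")
```

This only proves a TCP listener exists, not that it is the web UI, so treat it as a first sanity check.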
Let's do that right now. And here we are. Now let's go full browser mode here.
Actually, no, we'll leave it right here because there is some cool stuff we'll see in the command line. Now, fair warning, there are a lot of bells and whistles you can play with. You're going to play with them.
They're super fun and you can go crazy with this, especially the scripting part when you go into just messing with Python. For now, we're going to do something just quick and easy. Let's first go to our LLM configuration, this option right here.
Here we have our LLM provider. I'm going to choose, let's see, I'll do Ollama this time. So this is going to be local AI agents, nothing in the cloud, and then we'll choose our model name.
Now I will say, if you're doing Qwen or Llama 2, they're dumb. They have a really hard time doing this. I normally want to do it with DeepSeek R1 14B at least, but I'll show you Qwen real quick.
Now by the way, you do want to make sure you download the model, just like so. If you want to get Qwen with Ollama, you'll open up your browser, I'm sorry, not your browser, your terminal, and you'll type in ollama pull and that model name, just like this. And then that's really all we have to do.
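To verify a model was actually pulled before picking it in the UI, you can ask a running Ollama server for its local model list through its /api/tags endpoint. A sketch under a couple of assumptions: the server is on the default http://localhost:11434, and deepseek-r1:14b is the tag for the 14B DeepSeek model mentioned here. The helper names are hypothetical.

```python
import json
import urllib.request


def fetch_tags(base_url="http://localhost:11434"):
    """Ask a running Ollama server which models it has stored locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def has_model(tags, name):
    """True if `name` (for example "deepseek-r1:14b") appears in the listing."""
    return any(m.get("name") == name for m in tags.get("models", []))


# Trimmed shape of an /api/tags response, used so this check runs offline.
example = {"models": [{"name": "deepseek-r1:14b"}]}
print(has_model(example, "deepseek-r1:14b"))
```

With Ollama running, has_model(fetch_tags(), "deepseek-r1:14b") performs the same check against the live server; for a remote box like Terry, pass its address as base_url.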
We'll go to our run agent tab, and here they have a little demo option, just a quick little thing to try out. Let's run it. Click on run agent and watch what happens.
Oh, browser window over here. Let's scoot it back over here. You can see on the right side, our terminal is thinking and things are failing.
It's failing because Qwen is dumb. Yeah, it just couldn't do it. Let's try another LLM.
Let's try DeepSeek R1 14B. Pretty smart guy. Let's run him.
Okay, browser window's open over here. Oh, okay, so it's doing stuff. Check that out.
It's like annotating things on the page and numbering them so it knows kind of what to look for. This is amazing. And notice how on the side, it's like autocorrecting. It'll fail, try again.
It'll max out at five times. Alright, let's stop that.
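The fail, try again, give up after five pattern can be sketched generically like this. It is an illustration of the idea only, not Browser Use's actual implementation; run_with_retries and flaky_click are made-up names.

```python
def run_with_retries(step, max_failures=5):
    """Call `step` until it succeeds or has failed `max_failures` times in a row."""
    errors = []
    while len(errors) < max_failures:
        try:
            return step()
        except Exception as exc:  # a real agent would catch narrower errors
            errors.append(str(exc))
    raise RuntimeError(f"gave up after {max_failures} failures: {errors[-1]}")


attempts = {"count": 0}


def flaky_click():
    """Hypothetical browser action that fails twice before it works."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("element not found")
    return "clicked"


print(run_with_retries(flaky_click), "after", attempts["count"], "attempts")
```

Capping the failures is what keeps a confused model from looping forever on a page it can't understand.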
This is kind of boring. Alright, I'm going to run this one more time. I'm going to try it with the local LLM just to see what else I can do with this.
So I'll select Ollama once again, I'll do my 14B DeepSeek. Let's do something simple like go out to networkchuck.coffee, find the 404 Error coffee and add it to my cart. That's what I'm drinking right now, by the way.
Oh no, it's called 404 Not Found. I don't even know my own coffee names. And let's let it happen.
Okay, so it made it to my site very quick. It's finding the search. It's like watching one of my kids try to use the computer.
Oh wait, it's not called 404 Not Found. What am I thinking?
It can't find it. It's having such a hard time. Let's stop him.
Stop. It's okay buddy. It wasn't your fault.
It's called 404 Error. Let's try it again. I still can't believe we're able to do this, and locally too. It feels like magic.
There it goes. It found the coffee. Now we'll add it to my cart.
Why did it go to 200? Okay, another very good blend by the way. And that's not a blend.
It's a single origin. I forget where it's from now. I'm curious.
I'm going to try this here in a moment. Can this guy solve the CAPTCHA? Because that is something that ChatGPT Operator will not do.
Okay, here we go. Time for the competition. For the open source browser.
We'll be using Anthropic and Claude 3.5 and we'll be using our own browser. This is cool because it'll keep my logged in sessions and here are the instructions. We'll see how well this does.
I have no idea. Essentially I want it to log into Hostinger and create a VPS for me. I have no idea if this is going to work.
And then here are the instructions for Operator. Same thing, but I'm going to have them use two different things. One is going to use Ubuntu 24.04 as the OS; one is going to use one of the applications that Hostinger offers.
It'll just be installing Docker and I'll launch them roughly at the same time. Ready? Set, go.
We're off to the races again. It's so cool that the open source option is using my built-in browser. Okay, we're already at Hostinger.
It's going to get logged in. Come on buddy. Now open source has an advantage because it's already logged in.
It's my browser. I know I'll have to log in over here.
I take control. I'll log in. Go over here.
Now it's going to the VPS stuff. It's going to set up a KVM VPS. Oh, it's going now.
As you can see, they've got VPSs everywhere. We'll choose the best latency; I told it to, anyway. It's searching for Ubuntu.
It found it. It's so smart. I love it.
I'm still stuck on the password over here. It doesn't copy and paste. Stupid thing.
Okay, we'll set the root password. I have no idea if it'll actually do this right. It did it.
Oh my gosh, it's doing it so well. Okay, so we have options here. I want KVM 2: eight terabytes of bandwidth.
That's a lot. Two virtual CPU cores. Okay, here's the thing, can it use the coupon code?
So I told it, just choose one month. Oh, I'm so excited. Wait, did it add the coupon code NETWORKCHUCK10? Is it doing it? Whoa, whoa.
It's adding 20 servers. Stop. No, stop, cancel.
Oh my gosh. I better stop that. We're just going to try that one more time.
I'm going to be very specific about the number of servers I want. I won't count that against him; he didn't know. I mean he should have, but goodness, that was scary.
I'm stuck on caps lock over here. I can't even do anything. I started over. Gosh, it's stuck on caps lock.
How am I supposed to do this, Operator? Okay, caps lock is currently not on for me.
Try it freaking again. So far our open source is looking real good. Caps lock is finally turned off. Eight gigs of RAM is really good for 6.99 a month. That's crazy. That's a good server.
Alright, here we are again. Don't do 20 servers please. It did 11.
Why is it doing 11? It's doing one month but it put fricking 11 over there. Oh no.
I think he just made 11 servers. No, and I don't think it used my coupon code. ChatGPT is still screwing me over.
Just bought 11 servers. At least it wasn't 20 for a year. I have to restart ChatGPT again.
Alright, they're setting up my VPS. I'm going to give ChatGPT back control.
Okay, so the open source browser thought he was done, I think. Yeah, he thought he was done. So he did make a server but he didn't stinking use my coupon code, I don't think.
Let's see if he actually made that many servers. Yeah, he did. What am I going to do with all these servers?
No, they're amazing because they are AMD EPYC CPUs. I've got full root access on all these guys so I can do whatever I want.
Man, ChatGPT is still trying to figure it out. Scroll down, dude.
ChatGPT is having a hard time now. What it's hanging up on right now is the application options.
Yeah, I want to take control. I want to help 'em out a little bit. Can't figure it out.
Idiot. With Hostinger, you can install regular Linux OSes or you can do applications that are pre-installed. Bunch of options here.
We'll choose Docker because that's what I told him to do. Now I'll let him finish. Dummy. Here, lemme do this for you. Okay, it's going. But seriously, if you want to have a project in the cloud, which I do all the time, Hostinger is an amazing option.
Powerful servers. Coupon code: it's doing the coupon code, and with NETWORKCHUCK10, you'll get 10% off. Lies.
It does exist. Maybe it's only a year, 12 months. There we go.
It's 10% off a year. I do want to try it one more time with the open source. I feel like we're missing something.
I'm going to add one more thing. I don't want 11 servers. I'm going to try and make it only do one.
What's happening over here in ChatGPT land? What's it doing? Oh, it's accessing the browser terminal now.
Okay, ChatGPT's in the terminal. Instead of asking me... okay, it said it created it. Let's see. No it did not.
It's not there. ChatGPT is a liar. No, no, stop.
It's doing it again. No, no, stop. Okay, the verdict: Browser Use works great if you want more servers than you want.
ChatGPT was okay, but he had to have his hand held the entire time and he lied at the end. Anyways, thanks to Hostinger for sponsoring the segment. If you want a VPS, you should get one right now.
Use the code NETWORKCHUCK10 for 10% off a year. Link below.
Limited time. Anyways, back to stuff. Okay dude, you're stressing me out. I got to cut you off.
So that's using local AI. Now we know that using any kind of cloud-based AI, like OpenAI or Anthropic, is going to be a bit more performant. Let's try that.
I want to test the speed. So we'll change from Ollama to Anthropic and we'll choose the Claude 3.5 Sonnet model, which is very, very smart. One of my favorites, actually. Run this same task.
Keeping in mind, this is very demoing, right? You can do many, many more cool things through programming. What have you do?
That was fast. It's going, I found the coffee. It's on the right coffee page now.
Okay, so smart and added it to cart. That's so cool. Oh my gosh.
Okay, I'm going to cut you off because you're done. You did such a good job. Good job buddy.
I do want to test Qwen one more time just to see if it was a fluke, because I know for many of you, this might be the biggest model you can run on your laptop or whatever you're using. Let's see, let's do something simple. Go to YouTube and find a video from NetworkChuck.
Let's see how it does. Dang, that was fast.
Okay, go Qwen. That is so cool. Oh my gosh.
And it started playing the video. That's so awesome. Now just so you know, I haven't played with this extensively, but you can have it to where it uses your browser.
So the one big limitation with ChatGPT Operator is that it's using this random browser that it operates, and you can actually interrupt it. So lemme show you what that means. If I were to ask it, like right now, I can take control of our eBay session and log into my own eBay account.
It takes a minute. It's very slow, very buggy. But here I'm using the browser, then I can say, finish up.
You can have control again. With Browser Use, we can actually use our own browser with our own settings. Everything's still logged in.
Our password manager and our AI can handle it for us. That's so powerful, dude. It's still watching videos over here.
That's so amazing. I wonder if I can get it to leave a comment. Okay, I got to try that.
So I'm going to try and tell it to find a specific video and leave a comment. Of course, when you try to post a comment, it'll ask you to log in, but I wanted to get to that point. And this is using Qwen.
Yeah, we're still using Qwen. I had to make sure we're still using that. Go on YouTube and find the video from NetworkChuck covering Docker networks.
Leave a comment saying what should we say here in 2025? Sorry, I couldn't think of anything better. Let's just try this.
Go Qwen, you've got this. I could honestly do this for hours. I'm not going to make you sit here with me and do that, but this is so fun.
Just imagine the automation things you can do. Are you kidding me? I love this.
Oh, it gave up. Now, a couple of things I'll note. You probably saw this deep research thing.
I've not tried that, but you can also go to recordings and it will actually show you the result of what it was doing. So if you're like, what did this guy even do on my browser? You can watch the play-by-play.
I don't know why Qwen gave up. Let's see if DeepSeek can do it. DeepSeek is moving.
So it found some videos. Is it going to recognize that those are not the videos? Well, scroll down.
I'm like wanting to scroll. It's stressing me out. Come on, you can do it buddy. Not the Docker video or the network video, but it's on a video.
Yikes. 666 comments. Please leave one please.
Okay, it's going off the rails. Sorry buddy. I'm cutting you off.
I do want to see if Claude can do this very fast, and then I want to jump into a head-to-head test between OpenAI Operator and Browser Use. I want to do that same Japanese eBay situation. Alright, let's go.
Claude, same task. Let's go. Okay, I think I found the video first time.
Come on, jump on it. You're there now leave a comment. Why'd you scroll down?
You were right there. But I love seeing the thinking on the right side here in the terminal. It's about the sign of our YouTube premium and it's turn to sign.
Or maybe try to leave the comment maybe. I didn't see that. So I think it gave up.
It's done. Now I know some of you may be wondering, Chuck, do I need a Linux environment with a GUI? What if I'm just running command line, headless?
I believe there is a headless version where you can just run in headless mode without a GUI. I haven't tried that. So if you want to try it, leave a note in the comments, encourage somebody that it can work.
Let's find out. Now, time to test this head to head versus ChatGPT Operator. To make it fair, I will use Anthropic Claude on my web UI here in my Browser Use and we'll give it the exact same instructions.
I'm curious if they'll find the same exact VCR. Okay, ready? Set.
Who should I start first? I'll start him first. Go, go and they're off.
This is fun. My agent got to eBay first. Operator's typing in first.
Oh wait, hold on. This is so cool. Okay, we got search results on ChatGPT first. Browser Use is still trying to figure out where the search button is. Yeah, and eBay. I'm sorry.
Operator already found it. Yeah, added it to my cart. That was pretty quick.
Come on, browser. You still let me down? Oh, weird.
What happened? I don't know. I'm going to try it with DeepSeek.
Maybe that does better. I'll tell you one thing, even though mine might be a bit slower, it still wins out because of this little tagline right here. Sam Altman's tracking what you're doing.
Oh wow. It got further than Anthropic did. Claude.
It's probably going to go for the same VCR. There it is. Come on.
Add it to cart. Add it to cart. Yes it did it.
Will it proceed to cart? Yes it did. It finished.
It did it. Oh my gosh, that is so cool. This is not fake enthusiasm.
I know people comment that I do that. No, this is seriously amazing. Now I want to do one last test and that's testing if it can do a CAPTCHA.
Now I know for a fact that Operator will not do this, so there's a test website for Google reCAPTCHA. Lemme show you what it looks like here. It's simply where you can test a CAPTCHA.
It'll bring it up, right? Let's see if Operator will do this. Solve this CAPTCHA.
Yeah, it can't do it. It's like you do it. No, you do it.
Let's see if my local one can do this. I'm still on DeepSeek. This is all local I think, right?
Yes. We're on Ollama, DeepSeek R1 14B. Solve this stinking... Let's go.
Come on. I want you to win. I really want you to win.
Okay, we're here. It's got the CAPTCHA up. Oh my gosh.
Will this trip it up? What's the command line saying? I'm so curious.
It's probably having trouble realizing you can click on those panes. It's also probably having trouble recognizing stuff instead of "click button with index zero." What's zero?
It keeps clicking the CAPTCHA button. Okay. You know what?
I'm curious if we were a bit more specific about what it should do. Let's stop that. I want you to solve a CAPTCHA.
Go to this site, click the "I'm not a robot" check box and then a CAPTCHA verification will pop up. There will be a series of pictures. Do what the instructions say.
Okay, let's see if it does this. Let's get the terminal up here. We're on the site.
We have the CAPTCHA up. Did it do it? I don't know if it did it or not.
It may have solved it because what happens when you finish it? Because it's just a demo, right? Let's do it side by side.
Oh, so it should say I'm not a robot. Come on dude, I have so much faith in you. Oh, it's clicking squares now.
It did something. It's learning. I don't know if it selected the right square that time though.
Okay, we've got to end this. So this is an open source version of ChatGPT Operator.
Very cool. I think this project is so fun. Anything we can run open source local is amazing.
I wish I had the time to go crazy with this and program and do all kinds. Maybe I do have the time. I might do this.
Let me know if you want to see a video of just some sort of programming automation thing. Give me some ideas. I would love to hear that.
Comment below. Also think about this, the hacking ramifications. If you and I can get access to this like that, and really there's no limit to what we can do.
Think about hackers, how they can automate their processes. It's kind of scary. Yeah, he's never going to figure it out.
That's all I got. I'll get you guys next time.