This Browser Agent Automates ANYTHING (N8N Skyvern)
Ben AI
👇 Join my community and get the template of this video, all my other templates, tech help & more:
h...
Video Transcript:
Hi there. In this video I'll show you how to build a no-code AI agent that can browse the web completely autonomously: fill out contact forms, solve CAPTCHAs, apply to jobs using your resume, fill out government forms, and even buy products for you, all by just sending a quick voice message on WhatsApp.

Now, I know everyone's talking about OpenAI's Operator agent launch. It definitely looks impressive and has lots of potential, but there are a few downsides to it right now. First, it only works well with the websites they trained their agents on, like Airbnb and OpenTable. Second, it can't work autonomously yet; it needs constant confirmation, so it's not very practical for most use cases. And lastly, we can't use it through the API yet.

The system I'll show you today works completely autonomously and can be used to automate high-volume tasks like lead generation through contact forms, manual form filling, mass job applying, and much more. I've built this system without code using n8n and Skyvern, which is the browsing agent. In this video I'll give you a demo and a full step-by-step breakdown, and the template will be available in my community.

If you don't know me yet, I'm Ben. I've been building and selling AI agents since 2023. I also run a community with over a thousand founders, professionals, and AI builders, and if your business is looking to adopt AI, you can also book us in for a free call in the description below.

I'll first show you a demo, then give you an overview of the setup in n8n and in Skyvern, and then, at the end of this video, a more detailed step-by-step breakdown if you want to build or replicate something like this yourself.

Now, just as a disclaimer: we are still early with browser agents. First of all, they're generally quite slow. Second, they're quite expensive at the moment, but
as we can all see, these language models get cheaper almost by the week. And third, they're not 100% reliable just yet. But there are already businesses using browser agents for specific use cases, and I think we'll see a lot of interesting use cases in the coming months and years, especially because with these browser agents we can actually start automating dynamic workflows on software and websites that do not have APIs.

At least in the short to mid term, I think it's going to be really important that we put a lot of guardrails on these agents, and that's something Skyvern is trying to do, which I'm going to show you in a second. What I also think is going to be very interesting is combining traditional workflow automation with browser agent APIs: we use normal APIs and logic-based automations, and add in browser agent APIs for the part of the process where we want to automate a workflow on a website that doesn't have an API. That's the sort of system I tried to create here today.

So let me show you an example. We can start with the job applying. This agent can first find jobs for me and then actually apply to them by filling out the job form. So I can tell my agent something like this: "Hi there, please help me find UI/UX designer jobs and help me apply to them."

First, my agent is going to ask me for some more information to be able to apply to these jobs: "Okay, I can help with that. To start, could you please provide me with a link to your resume? Ideally it should be a Google Drive link, and it should be a PDF."

Okay, so I'll share my really old CV. It's going to use that CV to upload it if they ask for it, but also to extract data to be able to fill out the form.

"Thanks. Just to confirm, is your name and phone number included in the resume?" I answer yes, but just to confirm,
it's B SP.

"Thanks, Ben. Do you have any links to your portfolio or social media profiles, such as LinkedIn or Behance, that you'd like to include with your application?" Yes: I give it my LinkedIn and, as my portfolio, my website. Again, it's going to use those whenever they ask for them.

"Okay, great. And are you legally authorized to work in any specific countries without needing sponsorship?" "Yeah, I'm legally allowed to work in the US, in Europe, and in Brazil."

"Understood. What are your compensation expectations?" "I'm looking for a job which pays more than 50,000 USD annually."

"How many jobs would you like to apply to?" Now, these browser agents take a while, so for demo purposes I'll keep it short: "For now, only apply to three jobs, please."

"Okay, Ben. Just to confirm, you'd like me to apply to three UI/UX designer jobs, with your resume link, including your name, phone number, LinkedIn profile, and portfolio link?" "Yes, that is all correct. Please go ahead and apply to jobs."

So now my agent is going to call the browser agent and fill it in with all the information I gave it, to first find jobs and then apply to them. So it started. Again, this literally takes three or four minutes per job, so in the meantime I'll show you the other use cases, and at the end, when the runs finish, we can check the recordings of what they did.

I can also use this agent to do outreach, because it can fill out contact forms on websites. So for example I could say: "Please help me do outreach for my lead generation agency to plumbers in Miami." Again, it will need some extra information to start: "Okay, I can help with that. To start, what's your full name?" Again, it's going to use that in the contact form
along with my email address and phone number.

"Now, what kind of message or request would you like to send? What should I include?" "Please help me write a message. Basically, we offer lead gen services through cold email, specialized for plumbers." Very random example, but okay.

"Here's a draft message: 'I'm reaching out from [your agency]. We specialize in generating high-quality leads for plumbers through targeted cold email campaigns. I noticed you're based in Miami and I believe our services ... Would you be open to a quick chat?' How does that sound?"

"That sounds great. The name of my agency is Ben AI." "Okay, I've updated the message. Now that I have your full name, would you like me to proceed with reaching out to plumbers?" "Yes, please proceed and reach out to plumbers for me."

Again, it will first find plumbers, of course, and then send everything to the browser agent to fill out the contact forms.

All right, now let's do one more use case, because it can also buy stuff for you on a website like Amazon. I'm not going to provide my credit card or my login details right now, so for the demo we'll just do an add-to-cart. So I can say something like: "I want to buy a gift for my mom for her birthday. Can you help me find a gift?"

"What kind of gift, specifically?" "My mom likes cooking, so maybe a cooking book would be great." "A cookbook. Do you have any specific type of cuisine in mind?" "Yes, she likes cooking desserts, especially Italian desserts, so any top-rated Italian dessert cookbook would work." "Okay, Italian dessert cookbook it is. Do you have any preference for the author, or perhaps a specific price range?" "No specific author. Anything that's top rated and under $100 is good."

The agent confirms it will look for a top-rated Italian dessert cookbook under $100, and asks whether I want to purchase only one book or would like to
search for any other gift. No, you can do only that one; you can go ahead.

All right, so now it went ahead with adding it to the cart. Let me speed up the video, because again this browser agent takes a while, and then I'll show you the recordings of what it has done.

So let me check Skyvern. I'll explain Skyvern in a second, but you can see it's now buying the Italian cookbook, or at least adding it to the cart, because I didn't provide the credit card details. That one is still running, but it has finished the job applications, so we can check those here, because we get the recording as soon as a run is done.

Let me open it full screen. You can see it took three minutes. It first found a UI/UX designer job, and you can see it also fulfills the salary range. Here you see it first analyzing all the data it needs to fill in, and then it starts adding in the information: it types in my name, writes my email, then my LinkedIn profile. Here it's uploading my actual resume, and it takes the experience from my CV. For the salary expectation, it even filled it out per month. And you can see it applied. In this case there was no CAPTCHA, but it can handle those as well.

That was the first one. Now the second recording; this one was a bit faster. UX/UI, and the compensation is again good. Same thing: first analyzing the page, then filling out the information. It took the email from my CV, uploaded the resume, answered "Are you legally authorized to work in the US?" with yes, filled in my portfolio link, and now it's doing the LinkedIn URL. And yeah, applied. It's pretty crazy.

As you can see, this does cost 10 cents per application. But if you think about job applying, you could apply to 100 jobs for $10, which is quite a good deal. Now let me speed it up again so
we can see the other one. It just finished, and it failed, but that's of course because I didn't provide my login info and my credit card. So let's check what it did.

You can see there's an anti-scraping measure here, a CAPTCHA. It fills out the CAPTCHA, proceeds to the website, and analyzes the website again. This one takes really long, 8 minutes; I'm not sure what it does here. But it found "Authentic Italian Desserts: 75 Traditional Favorites Made Easy", checked the reviews, and added it to the cart. You can of course also give it your login details and credit card, and then it can actually buy stuff.

But you can see this one cost 60 cents, because you pay 10 cents per screen analyzed, and since it had to go through multiple screens, the cost goes up pretty quickly.

Now let's check the contact forms. We got the two plumbing companies. In the first one you see name, phone number, email, and here it fills out the message it came up with: "I'm reaching out from Ben AI. We specialize in generating high-quality leads for plumbers..." and it sends the message. Let's check the other one. It's hard to see, but it's filling out the black boxes here. Let's see if we get a confirmation. Yeah, you can see it filled out the message and sent it; the button changed color, so it was sent.

You get the idea. It's still early stages, but you can see the long-term potential of this, and the cool thing is this was all done through an API call. So let me quickly show you how this is set up, and then I'll also show you in detail how Skyvern works. I'll first give you a quick overview of how it's set up here in n8n and in Skyvern, and later in this video I'll show you step by step how it's built.

You can see here we have our AI agent, and of course I
connected it to WhatsApp and also WhatsApp voice. Our agent runs on Google Gemini 2.0, because it's free for now. We have the Postgres chat memory, and we have three tools: the job applying tool, the contact-us form tool, and the product purchasing tool. We've instructed our agent to gather all the necessary information before it starts using a tool.

For example, the job applying tool has a few input fields. The first one is the role, the type of job the user is looking for. Then we have the resume link and the additional information, the portfolio links, etc. Those are the input fields, and whenever the agent uses the tool, this workflow automation runs.

It's pretty straightforward. All we do in this LLM step is generate a Google search query to try to find relevant job links to apply to. Then we have the Google SERP API, which does the Google search for us and outputs URLs and results. Then we split out those URLs, and in this case we limit them; I put that in just for demo purposes, because I don't want it to go and do 20 different jobs, but you can take this out. Then we have a loop over items, and this is the API call to Skyvern, our browser agent. Our workflow automation sends all the relevant information over to the browser agent, which can then start applying to the job and filling out the form. It loops over each URL and sends all the URLs back to the agent at the end.

So, pretty straightforward. And of course you don't have to set this up inside an agent system. You could also just set up a Google Sheet, for example. Let's say you have 200
job links you want to apply to: you can put them in a Google Sheet, start the automation, and let your browser agent apply to 200 jobs.

That's why I think combining workflow automation like this with browser agents could become a very interesting use case. We could let the browser agent also find the jobs on Google first, get the URLs, and then apply, but that's extremely inefficient: we'd pay a lot of money because it has to go through all of these pages, it would take ages, and it's less reliable. By combining traditional workflow automation, with traditional APIs and logic, and only using browser agents for the part we can't do with traditional workflow automation, we get a very powerful combination, especially if we can train these browser agents on specific use cases, which is what Skyvern does. I'll get to Skyvern in a second.

Let me show you very quickly the other two tools. We have the product purchasing tool, which is a lot simpler. All we do is send a prompt over; the agent creates that prompt to execute the product purchasing tool. That's also why this one was more expensive: because we didn't send a specific link to our browser agent, on Amazon it first has to solve the CAPTCHA, then try to find the product, etc.

Then we have the last one, the contact-us form tool. Again we have some workflow inputs, which our agent asks us for: the name of the person, the email, the additional information, the message, and the information about the prospects we're trying to contact, which in this case was the plumbers. I can show you very quickly; here it's a similar setup
as the job applying tool. Again, this step generates the Google search query and uses the Google SERP API to do a Google search; the URLs are split out, and it loops over each URL and sends it to our browser agent. We have one extra step here, though. This node is not the browser agent, it's a scraper: we scrape the website first, and this code snippet then quickly checks the scraped page for input fields. If there are no input fields, our browser agent can't do anything there and we'd be wasting money. So if it doesn't find input fields, it goes this route and loops over the next URL. If it does find them, it sends the URL to our browser agent with all the information the browser agent needs to fill the form out correctly.

So let me show you Skyvern quickly. First of all, what is Skyvern? It's basically a web-browsing AI agent, and it's one of the state-of-the-art ones. They recently published a benchmark study of the best browser agents, and you can see that Skyvern 2.0, which is their new API, outperforms all the others, including Claude computer use. It hasn't been compared to the OpenAI Operator yet, but you can see they do a pretty good job.

I think the really interesting thing with Skyvern is, first of all, that it's available as an API, meaning we can add these browsing agents to our workflow automations or agents inside of n8n, Relevance AI, Make.com, etc. Second, they've trained these browsing agents on specific use cases that are actually useful for many businesses, like the ones I showed you: contact forms and jobs, of course. There's also invoices, so downloading invoices from portals, which I can imagine is an interesting use case for accountants and bigger companies, as it's a very manual job. You have purchasing, which becomes really interesting for, let's say, large procurement pipelines; a lot of companies need to literally buy hundreds or thousands of products, and that can now be automated with these browsing agents. And even government form filling, like taxes. So it's a great fit for really high-volume administrative tasks on platforms that don't have APIs; that's really what it comes down to. And because Skyvern trained its agents on these specific use cases, they tend to perform pretty well on them.

Lastly, Skyvern also lets us build our own workflows for these browser agents, so we can give the agents more guardrails for a specific use case or type of website. I think that's really what's necessary for browser agents to be reliable in production, at least in the short to mid term.

I can show you very quickly: inside the dashboard we have the option to create a workflow. Here is the job application workflow, and if I click on edit, you see we have a bit of a workflow builder inside Skyvern, where we can put more logic, guidelines, or guardrails into what this specific workflow should do. In this job application one, for example, when the API gets triggered, the agent always first parses the resume, and we can add in that
variable so it always takes that data from the resume. Then you have a for-each loop, so it loops over all the different job URLs you send to the API. You can even add in prompts to give it more guidelines on what its next step is: "Apply for a job. Terminate if the job is not available. Fill out all the fields as best as you can, including optional fields. Fill out public burden statements if you're asked. If you don't know the answer to an optional question, leave it blank." So we can give it more details about that specific use case, and of course we can adjust this for other use cases.

You can add even more things. There are navigation blocks to give it more guidelines through prompts, action blocks to take specific actions, extraction blocks to scrape or get data from websites and send it back, and validation blocks to check whether it's actually on the right track. And there are more options here: send email, PDF parser, etc. I think it's quite interesting to play around with.

There's also quite an interesting article if you want to learn more about how these browsing agents actually work. You can see they have a planner agent, a task or execution part, and a validator to make sure the tasks are being performed correctly; if not, it can adjust the task through the planner. I'll make sure to put the link in the description.

So let me dive a little deeper into the n8n automation. This is a bit more of a technical build, so if you're completely new to n8n, I do have some other tutorials on n8n, and I'll be coming out with lots more, which will be a bit simpler. In our community we also do a lot of beginner courses on all of these platforms, so if you're interested, check it out.

Now, first of all,
I've of course connected it to WhatsApp. If you don't know, you have to set up the official WhatsApp Business API to connect to WhatsApp. I explain the full process of setting that up at the end of one of my videos, which I'm going to link up here, so if you're interested, make sure to check that one out.

Then, if you want to connect the WhatsApp trigger, you'll need to add a new credential here, with a client ID and a client secret. Where do you find those? It's a little different than in Make.com, for example. If you've set up your WhatsApp Business account, go to your Meta account, go to the app settings, and then to Basic. There you'll find the app ID and the app secret, which are the two values you'll need here: the app ID is the client ID, and the app secret is the client secret. That's how you set up the trigger.

Then in this case I have a switch, which is basically a router. If the message is text, I route it one way, and if it's audio, I send it another way, because for audio we first have to download the file, which we again do with the WhatsApp Business module. This uses a different credential type: for this one we need the access token and the business account ID. I also show how to find and set these up in my other video, so make sure to check that out if you want to connect WhatsApp.

Then I download the voice memo. I just use a simple HTTP request with method GET, and add in the output of the downloaded file from the WhatsApp Business step. So we can't download it directly; we first have to get the file from the WhatsApp Cloud, and then we can download it in here. And then
we send it over to an OpenAI Whisper model to transcribe the recording, and that's how we get the message into our agent. For a text message, we just set a variable to pass it along as text; that's all this step does. Then we send it over to our agent.

Now let me show you the agent configuration. First of all, it's a Tools Agent, of course, because we're using tools in this setup. The source for the prompt is "Define below", and here we have the text, which is either the voice message or the text message we send over. Then we have a system prompt, which I can show you quickly: "You're an AI agent specifically built for handling three distinct tasks, and for each of the workflows there are three tools to be called. But before you call the tools, here are a few things to keep in mind." Basically, it has to ask the questions to make sure we get all the information necessary to use the tools; that's what this prompt does.

So for the job applying tool, we tell it what information we need from the user: the resume URL and some additional questions, like how many jobs would you like to apply to, etc. Again, if you want to look at this in detail, the template will be available in my community. Then we have the job applying tool, same thing, some extra information, and the contact-us form tool, same thing. Then there are tool instructions, so we give it a bit more context on which tool to use when, and some examples. If you want to know my prompt framework, I also have a prompting video on my YouTube channel.

Then of course we've connected it to a chat model, which in this case is Google Gemini, the 2.