FINALLY, this AI agent actually works!


AI Search

This new AI agent browser actually works! Do Browser tutorial & review. #ai #aitools #ainews #aiage...

Video Transcript:

Go to my Gmail and reply to my emails. Oh my goodness, it's actually clicking on each one and replying, and it even knows to check my Google Calendar to give him my availability! This is pretty insane.

All right, how about this: go on X and like and comment on the first five posts about machine learning. Oh my gosh, it's actually doing this! It's searching for machine learning posts, liking, and commenting on the first five posts.

Or let's try this: order me the nearest pizza from Uber Eats. Oh my gosh, it's actually going to Uber, selecting a restaurant and pizza for me! It's even proceeding to checkout and paying for the order.

This is pretty insane. All right, let's try something even trickier: find the emails of three colleges and send them a cold email about my website development services. It's scraping Wikipedia, going to the websites of each of these colleges, and actually extracting the contact emails.

Then it's proceeding to go to Gmail and actually writing a cold email to each of them. Oh my gosh, this can totally automate sales and outreach jobs, guys and gals! Finally, we have an AI agent that actually works.

This agent can autonomously order food, reply to your emails, post and engage on social media, and even do sales and outreach on your behalf. This can save you so much time and significantly improve your productivity. But before we jump into this agent, here's the backstory: in the past year or so, I've been trying to hunt down an AI agent that actually works.

There are a lot of things that I just want to get an AI to automate, like writing my newsletter, posting on my social media accounts, or helping me draft a YouTube script. However, I haven't come across any particular agent that can actually do these tasks well without running into errors. I've tried pretty much all of them, including AutoGPT, AutoGen, BabyAGI, Multi-Agent, Smith OS, and even Claude's recent computer use agent.

But for all of these, they were either really complicated to set up, and you needed to code everything yourself, or they would often get stuck in an endless loop and never proceed further down the workflow. So up to now, I haven't come across any AI agent that can actually carry out any real-world task very well. Now, a few weeks ago, I heard about this agent called Do Browser on Twitter (or X), and their demo reel, at least, was quite impressive.

So I reached out to the founder, asking if I could get early access to this as soon as it was ready. Then, fast forward to this week, I finally got access to this. So this tool is called Do Browser, and note that this video is not sponsored.

This is completely organic, and I could completely roast it if this tool sucks! In this video, I'm going to test this out on real-world tasks and show you how good or bad this AI agent is (at least for me). The results were quite shocking.

So right now, this is available as a Chrome extension. I'll link to this in the description below, and it's pretty simple: you don't need to clone any GitHub repo; you don't need to work with any code at all. All you need to do is click on "Add to Chrome."

So let's do that, and I'm going to click "Add extension." Afterwards, you should see that Do Browser has been added to Chrome. Now, to use Do Browser, all you need to do is click on the Do Browser icon at the top right of your Chrome window, and this should open up this chat interface.

If you don't see this icon, it's probably because you have not pinned it. So if that's the case, you just need to click on this Extensions icon and then click on Do Browser, and again, you should see this interface. Another way you can use Do Browser is to simply type "Do" in the search bar up here, and then once you press space, you are now using Do Browser.

So whatever prompt you do afterwards is actually prompting Do Browser to carry out the action in your Chrome window. Now, if this is your first time opening this, then you're going to see this message: "Please sign in to continue." Now, I've been given early access to this, so I can sign in with my email and use this, but I'm not sure, at least at the time of this recording, if you can do the same yet.

I'll keep you posted once they do release it to the general public. Anyways, let's click "Sign in," and then once I've signed in with my Gmail account, you should see this. All right, first, let's see if it can help me reply to all the outstanding emails in my inbox.

So for the prompt, I'm going to write: "Go to Gmail, reply to the most recent three emails, please automatically reply to the emails based on your own judgment. No need to ask me for feedback. If it asks for my availability, check my Google Calendar and answer him accordingly."

So let's press enter and see if it can figure this out. So first, it has decided to go to Gmail, and then it looks like it's opening the first email. This email is asking about AI tools for research.

It has decided to click reply, and now it's typing in a reply. In fact, this is happening really quickly. So it's already hitting send within like two seconds of opening this email, and now it's already proceeding to the second email.

You can see now it's actually opening Google Calendar to see my availability.

Now, as you can see, within like 2 seconds, it was able to open the next email, read it, write a reply, and hit send. So this is blazing fast.

In fact, this is happening so fast that I don't even have time to do a voice-over explaining all the steps. So, in fact, what I'm going to do is just play this, and note that this is at 1x speed; I'm not speeding this up. Then, after it has finished processing all three emails, I'm going to open up each email and show you how it replied.

But for now, note that this is too fast for me to do a voice-over explaining all the steps, 'cause after I explain one step, it's already moved on to like the next few steps. Okay, so it's done. Now note that at the bottom here it says, "I have successfully replied to the three most recent emails in your Gmail inbox. Let me know if there's anything else you'd like me to do."

So let's open up my inbox, and as you can see, before these were all unread. Also, this Google Calendar was not open.

So, let's click on this first one. For the first email, he wrote, "Hi AI Search, I was wondering if you have any recommendations on AI tools for research. I'd love to hear your feedback."

So the AI agent replied, "Hi Terren, thanks for reaching out. Here are some AI tools that are great for research: ChatGPT, excellent for brainstorming and generating ideas; Zotero, a powerful tool for managing research references; Semantic Scholar, helps in finding academic papers and related research; Notion AI, useful for organizing research notes and summarizing content. Let me know if you need further assistance."

Perfect! So it totally understood the email, it generated a pretty comprehensive reply, and it hit send. Let's go back and inspect the second email.

So, here's the second email: "Hi AI Search, reaching out to see if you're available to chat about our business. Let me know a good time to talk." And if you remember, when it was executing this, it actually opened up my Google Calendar to see when is a good time to talk.

Now note that I have two times blocked off here: one is from 5:00 to 7:00, and the other is from 9:00 to 11:00 a.m. So it knows to analyze my calendar and understand that, okay, I'm free after 11:00 a.m.

So that's exactly what it wrote: "Hi Terren, thanks for reaching out. I'm available to chat today after 11:00 a.m. Please let me know if that works for you or suggest another time. Looking forward to your response."

Perfect! And then finally, here is the third email. So in this email, Terren was asking, "Hi AI Search, I was thinking about our marketing strategy and I wanted to bounce some ideas off you. I think we could be doing more to engage with our target audience and I was thinking about ways to improve our social media presence. Here's my strategy: post two times a week on Instagram, TikTok, and X; get an assistant to like and comment on at least 100 posts per week on all platforms; discuss giveaway opportunities with sponsors to promote on our social media. Do you have any thoughts on this? I'd love to hear your ideas."

So here's how the AI responded: "Hi Terren, thanks for sharing your marketing strategy ideas. I think your plan is a great start. Here are a few additional suggestions to consider: collaborate with influencers in your niche to amplify your reach; use analytics tools to track engagement and adjust your strategy based on performance; experiment with different types of content, such as reels, stories, and live sessions to see what resonates with your audience; consider running targeted ads to reach a broader audience. Let me know your thoughts on these suggestions or if you'd like to discuss further." Perfect!

So not only did it understand the message and agree with what Terren was saying, it also added some further feedback. I mean, with this AI agent, pretty soon we won't actually need assistants managing our inbox; you can just get an AI to do this for way cheaper and way faster. Even with this AI right now, even though it's not perfect, this can help me increase my productivity and save so much time.

For all trivial emails, I can just put them into one folder and then get this AI to respond to all those emails in that folder. Now note that for more important emails that require your decision or that involve a lot of money or high stakes, then, of course, you wouldn't get this AI to reply to those emails. But for everything else, I mean, this AI can handle it pretty smoothly.
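The calendar check from the second email can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Do Browser works internally: the function name `free_after` is made up, and the sketch assumes both blocked-off slots (5:00 to 7:00 and 9:00 to 11:00) are morning slots on the same day.

```python
from datetime import time


def free_after(busy_blocks):
    """Return the end of the latest busy block, i.e. when the rest of the day is clear.

    busy_blocks is a list of (start, end) pairs of datetime.time values.
    Returns None when nothing is blocked off.
    """
    if not busy_blocks:
        return None
    return max(end for _start, end in busy_blocks)


# Assumed calendar from the video: 5:00-7:00 and 9:00-11:00, both taken as a.m.
busy = [(time(5, 0), time(7, 0)), (time(9, 0), time(11, 0))]

print(free_after(busy))  # 11:00:00
```

That matches the agent's reply of "available to chat today after 11:00 a.m."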

All right, next, let's see if it can help me order food from Uber. So, again, I'm going to open up the Do Browser chat interface, and then for the prompt, I'm going to type, "Order pizza from Uber Eats. Just pick the nearest one that's under $30. No need to ask me for feedback; use your own judgment."

So I'm going to press enter and see if it can pull this off. First, it's going to the Uber Eats site.

Note that I'm already logged into my Uber Eats account, and I've already added my address. Now it knows to search for pizza in the top search bar. Okay, so next, it's selecting one called Pizza Garden, which is 1.9 miles away.

Next, it has added this Neapolitan pizza, which is $18, so it is under $30, and now it's actually, I believe, going to check out, and it's actually continuing to the payment. Let's see if it'll click pay.

Yes, it does! All right, so just a warning here: if you've already added a payment method and you get this AI agent to order things on Uber or buy stuff on Amazon, it will actually proceed to buy things for you.

So, just a warning: I deliberately removed my payment options so that it could not proceed further.

But you can see it got as far as this: "Add a payment method," and then in the chat here, it says, "It seems the payment method needs to be added to proceed with the order." So, I need to manually add the payment; that's where it ended the workflow. But note that this 100% could help you order food from Uber Eats simply based on your criteria.

It can help you search for a restaurant, pick the correct items, and then even proceed to check out and pay for the order. Thanks to Thly for sponsoring this video. Thly is a powerful AI platform that automates phone calls using natural-sounding AI voice agents.

These agents can be set up in just 17 minutes, so you can quickly start taking calls without needing any staff or complicated systems. Thly is perfect for automating customer service, sales, and marketing. You can create your own voice agents to ensure that they sound just like your brand and meet what customers expect.

Whether it's answering simple questions or handling more complex issues, Thly's voice agents can do it all, making the customer experience smoother and more enjoyable. You can also integrate your AI voice engine into most CRMs like Salesforce, Zoho, and Zendesk. Thly also provides helpful tools to analyze how well the voice agents are performing.

Users can look at detailed reports and real-time data to see how customers are interacting with the agents. This information helps businesses understand what their customers want and how they can improve communication. Thly is a game-changer for businesses looking to improve their phone communication.

By combining smart AI technology with powerful analytics, it helps companies engage better with their customers while saving time and resources. Check out Thly today and get 50% off via the discount code in the description below. All right, let's test it on another task.

Let's see if it can like and comment on X (formerly Twitter) on my behalf. So, here's my prompt: "Go to X and find the most popular posts about machine learning, and like and comment on the first five posts." Let's click enter and see if it's able to do that.

All right, so the first thing it's doing is going to X, and now it's clicking on "Explore." Now it knows to search "machine learning" in the top search bar over here. Let's see what it's going to do next.

It has liked this first post already, and it has now proceeded to open this post. It has added a reply, and now it's moving on to the next post. It has liked this post and then clicked on reply, and it has added a reply.

It has clicked reply already. I mean, this is moving so fast; I barely have time to do a voice-over explaining what it's doing step by step. Already, it's at the third post; it has opened the pop-up for it to add a reply, and again, it's typing in its reply super fast.

Right now, it has found the next post, has liked it, opened the pop-up, and again, it only takes it like one second to add a comment and then click post. All right, so the fourth post is done, and now we are moving on to the fifth post. Again, it has liked this post and then clicked the reply window.

It has added a reply and clicked post. So, after that is done, note that in this chat window, it says, "The comment has been successfully posted on the fifth post. I have now liked and commented on the first five posts about machine learning as requested."

Let's actually scroll back up and see if it indeed liked and commented on the first five posts. Here's the first post: you can see indeed it has liked the post, and the agent added a comment to this post. If I go back, here's the second post: it has liked the second post, and indeed the agent also commented on this post.

If I go back, here is the third post: the agent has liked the third post, and if I click into here, note that I have also commented on this post. If I go back, here's the fourth post: again, I have liked the post and have also commented on this post. Finally, here is the fifth post; again, the agent has liked this post and commented on it.

How insane is that? Finally, we have an agent that can help me automatically like and comment on Twitter! All right, let's try something even trickier.

Let's say you're a web developer and you like to build websites; in fact, you're planning to start an agency to help colleges or universities build websites or improve their existing websites. Now, you could manually go find these colleges and reach out to them yourself, or what if you could get this AI agent to automate the entire process and outreach for you? So, let's try this out right now.

Again, I'm going to open up the Do Browser chat interface. For the prompt, here's what I'm going to type in: "Search for three colleges. Assuming you're targeting colleges in, I don't know, Canada, try to find the contact email on the site, then email them using my Gmail. Send them a cold message asking if they need help with building a website for their business or improving their existing website. Link to my portfolio: ibuildwebsite.com," which I just made up, obviously.

So let's press enter and see what it does.

All right, so it has decided to go to Google and search for colleges in Canada. You can see right now it's typing in "colleges in Canada," and then it has decided to go to the first result, which is a Wikipedia page listing all the colleges. Then it has decided to click on this one: Algonquin College, and it looks like it has found the website over there.

So now it's clicking on the website, and now it's searching for the email address. It is now looking for a "Contact Us" section; it has scrolled all the way down the page but could not find an email. Instead, it's going to type "contact email" in the search box to locate the email address, and finally, it has found an address for the President's office: president@algonquincollege.com.

So it's going to save this email, and then it's now going back to the Wikipedia page to select the second college. It has selected Cambrian College, and now it's looking for the main website.

It has clicked on the official website link, and it has found a "Copy Email" button for it to extract the email of Cambrian College. This is lightning fast, by the way! After copying the email for Cambrian College, it's now moving on to the third college on this Wikipedia page, which it has selected to be Canadore College.

Now, interestingly, after clicking on the main website link, it just goes back to Google for some reason. The nice thing is it has identified this error and is aware that this is not the website for Canadore College. So what it's doing instead is searching on Google for "Canadore College official website," and then it's clicking on the first result from Google to actually enter the official website.

It has found the contact email for this college, so now it has opened up my Gmail and is proceeding to write cold emails to each of these colleges. Right now, you can see it filling out the To field, the subject line, and the email body. Notice how fast this is.

In a matter of like 1 or 2 seconds, it has already filled out and sent the first email. Now it's proceeding to send the second email, and again, everything is just blazing fast. This is happening way too quickly for me to actually read out everything for you, so I'm just going to wait for this to finish and then open up each email to show you what it actually sent.

It has now sent two emails already; right now it's sending the final email to Canadore College, and again note how blazing fast this is. All right, so finally the chat has ended, and you can see in the bottom right here it says all three emails have been sent to the respective colleges. It's listing out all three colleges, and then: "Let me know if there's anything else you'd like me to assist with."

So if I go to my sent tab, you can see that indeed it has sent three emails. Let's inspect each one.

This one is to Algonquin College, and here's what it wrote: "I hope this email finds you well. My name is Sam. I specialize in building and improving websites for businesses and organizations. I came across your institution and wanted to offer my services to help enhance your online presence. You can view my portfolio at ibuildwebsite.com. If you're interested, I'd be happy to discuss how I can assist you further. Looking forward to hearing from you."

Perfect!

So here is a beautifully written cold email asking if they need help with building or improving websites. Now let's go back and click on the second email. It's pretty much the same as the first email, but this is sent to Cambrian College.

And then finally, the last email—again, this is pretty much the same, but it is sent to Canadore College. So how insane is that? I think this AI agent is going to change sales forever.

You no longer need to hire any sales reps to do prospecting or cold outreach for you; you can just get this AI to not only send out cold emails but also do the prospecting! It can find and gather all the emails of whatever target clients you're looking for. How insane is that?

Keep in mind I've tried a handful of other AI agents in the past, and none of them were able to do this so smoothly. This is a real game changer. All right, next let's test it on some research data collection tasks.

So again, I'm going to open up the Do Browser chat, and then for the prompt, let's try this: "1. Identify three recent research papers in machine learning in 2024. 2. Add these research papers to a new Google Doc with the following info: title, author, publication date, journal name, and summary."

There are multiple steps involved here: first, it has to identify three recent research papers in machine learning; then it needs to open a new Google Doc and generate all this information accordingly, and then add it to the Google Doc. So let's see if it can figure this out.

I'm going to press send, and first it’s going to Google. It's typing in "recent machine learning research papers 2024," then it's clicking on the first result and then on the first result from that. So right now it's scanning this page, and then it has extracted all this information from that paper.

Now it's going back to the results page of Papers with Code, and then it's clicking on the second result.

And again, it has gathered all the information from the second paper, and now it's proceeding to click on the third paper. After gathering all this information, it's moving on to Google Docs, where it has created a new document.

Now, it's proceeding to add all the information to this new document, and we are done. It says, "I have successfully added the details of the three recent machine learning research papers to the Google Doc." Now, there are some flaws with this.

First of all, it could not find the journal name from the site Papers with Code. This is accurate, but I would prefer that if it couldn't find the journal name, it looked somewhere else to fill in this information. Also, if you go back to this Papers with Code page, it seems to have skipped a paper.

It got the first one correct, "Demo," and then it got "Moon Cake," as you can see here: here's "Demo," here's "Moon Cake." But then, for some reason, it skipped over this paper, "Moss Former," and decided to extract the data from this one, "Auto Rag," instead. So, as you can see, this third paper is "Auto Rag," not "Moss Former."

There are some flaws in this demonstration, but it is a lot more robust than other agents that I've tested out. So that sums up my review and tutorial of "Do Browser." This is actually surprisingly good and much better than any other agent I've tested, which is why I've decided to do a full video on it.
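If you want to double-check an agent's output for the kind of skipped result seen in the research task, a small comparison like this is enough. It's a rough sketch: `skipped_items` is a made-up helper, the titles are written as they appear in this transcript, and it assumes "Auto Rag" was listed right after "Moss Former" on the page.

```python
def skipped_items(listing, extracted):
    """Return listing entries that sit before the last extracted item but were not extracted."""
    picked = set(extracted)
    # Position in the listing of the furthest item the agent actually picked.
    last_index = max(
        (listing.index(title) for title in extracted if title in listing),
        default=-1,
    )
    return [title for title in listing[: last_index + 1] if title not in picked]


# Assumed page order, based on what is described in the video.
listing = ["Demo", "Moon Cake", "Moss Former", "Auto Rag"]
# What the agent put in the Google Doc.
extracted = ["Demo", "Moon Cake", "Auto Rag"]

print(skipped_items(listing, extracted))  # ['Moss Former']
```

An empty list would mean the agent took the results in order without skipping anything.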

You simply need to prompt it with the task, and it will carry it out. There's no need to code up a team of agents yourself; there's no need to download or clone any repos from GitHub and hit a ton of errors. This is a really simple Chrome extension, and in this video, I tried to test it on some realistic use cases like social media engagement, doing sales and outreach, or doing data collection and research.

But I'm sure this agent could likely do a lot more than what I showcased. Anyways, let me know in the comments what you think. Would you be interested in using this?

And if so, what would you use it for? As always, I will be on the lookout for the top AI news and tools to share with you. So if you enjoyed this video, remember to like, share, subscribe, and stay tuned for more content.

Also, there's just so much happening in the world of AI every week; I can't possibly cover everything on my YouTube channel. So to really stay up to date with all that's going on in AI, be sure to subscribe to my free weekly newsletter; the link to that will be in the description below. Thanks for watching, and I'll see you in the next one.