Google is making a lot of serious moves in AI, and the recent launch of Gemini 2.0 just shocked the world. You know, at first, I was skeptical.
But honestly speaking, after trying it, this is impressive. So, in this video, I will share what makes Gemini 2 unique, its limitations, and some of the best ways to use it over ChatGPT or other AI tools like Perplexity. Let's go.
What makes Gemini 2 truly unique compared to other AIs is its multimodal capability. Unlike ChatGPT, which became multimodal along the way, Gemini has been multimodal since the beginning. So you will find the experience is more native and integrated.
Gemini 2 not only has enhanced multimodal capability to handle text, audio, and video much better at the same time, but the Multimodal Live API takes it to the next level. You can now interact with Gemini through real-time streaming, like video, webcam, or screen sharing.
With ChatGPT, for now, you can only do this through the mobile app, and it is a completely different experience. Second is the multilingual native audio output, which I find people seldom talk about, but this is definitely one of the biggest strengths of Gemini in natural language processing. Remember why everyone loves NotebookLM's audio?
The control over tone, style, accents, pacing, and emotions is all superior. Now Gemini 2 even supports generating native audio in multiple languages. Although these upgrades are not yet available in the standard Gemini Advanced, for now, you can test them through Google AI Studio until they are fully rolled out.
But keep in mind, everything you upload in the free version of AI Studio may be used for model training. So I recommend just playing with publicly available data and not sensitive data. Of course, Gemini 2 is not perfect, and it's still experimental under Gemini Advanced.
For now, I find the image generation is still limited, and Google has been very cautious in handling anything related to people. Also, even in AI Studio with less filtering, Gemini may run into errors in more sophisticated data analysis and may give false information in its real-time interactive mode. So for now, personally, I do not recommend using it for complex tasks involving large datasets.
Now, the best use cases. With its unique strength in language and multimodal capability, don't just ask it to do tasks like static landing page reviews or data analysis, which are better suited to text input. Here are my best use cases. The first one is user experience analysis.
Yes, Gemini 2 can see your screen, but what makes this powerful is asking it to analyze the entire process in real time, because this is something that can't be done so easily through text input, and it's super useful for identifying friction points and suggesting improvements to the user journey flow. Let's say I have this website and I will ask Gemini how to improve the checkout flow. As a good practice, I always turn on grounding so it can search for web data when needed.
Now click share your screen and select the right page. I'm going to walk through the entire checkout process on my website step by step. So as I navigate each page, please help to analyze what you see and provide specific suggestions on how to improve the user experience and optimize that particular step of the checkout flow.
Okay, I will analyze the checkout process as you navigate through the website and provide suggestions on how to improve the user experience and optimize each step. I'm ready to start when you are. Okay, so let's start with the homepage.
The brand is TravelPro, and there are navigation options across the top of the screen. So any suggestions to make the navigation or experience more smooth? Sure.
One thing to improve the navigation on the homepage is to make the Shop All dropdown more obvious, maybe by adding a small arrow icon next to it to indicate it's a menu. It could also be beneficial to make the navigation options across the top of the screen bigger, as they look a bit small. Would you like me to continue with this page?
So here is the category page. Any suggestions for this page to make the checkout process smoother? Okay, looking at this page, I have a few suggestions to improve the conversion process. First, the number of filter options on the left hand side is a little overwhelming.
There are seven different categories, many with several checkboxes inside. I would recommend consolidating some of the filter options into more general categories to make it less cluttered. Second, the product images could be a little larger.
Lastly, adding an add to cart button directly on the product tiles might also improve conversion rates by making the checkout process a little more streamlined. As you can see, Gemini can analyze while you're navigating. So the best way to use it is not to present just a static landing page or a static presentation file, but to walk through a whole process that involves multiple stages, like your checkout process, user onboarding flow, or lead generation flow.
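If you want to prepare for a multi-stage walkthrough like this ahead of time, one option is to write one prompt per step before you start sharing your screen. Here is a minimal sketch in Python. It assumes nothing about Gemini itself and only builds the text you would say or paste at each step. The step names beyond the homepage and category page, and the function names, are hypothetical.

```python
# Hypothetical step names; only the homepage and category page appear in the demo.
CHECKOUT_STEPS = ["homepage", "category page", "product page", "cart", "payment page"]


def build_step_prompt(step: str, goal: str = "the checkout flow") -> str:
    """Return a prompt asking for feedback on the one page currently on screen."""
    return (
        f"I am now on the {step}. Analyze what you see on this screen and give "
        f"specific suggestions to improve the user experience of {goal}."
    )


def build_walkthrough(steps: list[str], goal: str = "the checkout flow") -> list[str]:
    """Build one prompt per step, in the order you will navigate the pages."""
    return [build_step_prompt(step, goal) for step in steps]


if __name__ == "__main__":
    for number, prompt in enumerate(build_walkthrough(CHECKOUT_STEPS), start=1):
        print(f"{number}. {prompt}")
```

The same helper would work for an onboarding or lead generation flow by passing a different list of steps and a different goal.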
Note, for now, I find that Gemini can only give you advice on the page it is seeing right now and is not able to recall the whole process from start to finish. So as you navigate each step, pause and ask for immediate feedback before you move on. And now you might be wondering what kinds of prompts should I try first?
I found a great resource from HubSpot that gives you tons of real world ideas to inspire your own prompt writing for Gemini or any AI tools you're using. I've put it in the description for you to download for free. This is an extensive prompt library covering different aspects of marketing, from content, social media, SEO, to areas like brand analysis and even productivity hacks.
I find this is particularly useful when I explore new AI models because it inspires me with some new possibilities for what I can do with AI. For example, in the brand identity section, you can use AI to create ideas for your brand messaging matrix across different customer touch points, which is something I hadn't thought of. So these are good starting points to develop your own prompts.
I recommend you download this in the description below for free. And thank you HubSpot for sponsoring this video. The next one is interactive troubleshooting or learning. Beyond workflow optimization, with real-time screen sharing, it can become your troubleshooting guide.
Whenever you just need some quick help with common software or tools, it may be faster than using ChatGPT or searching on Perplexity. Let's say I'm doing keyword research using Google Keyword Planner, but I don't know how to use it. Then I can ask Gemini to guide me through the process using Stream Realtime.
So now click Enable Grounding to make sure Gemini can use Google Search to find the information. And then click on Share Your Screen to open up the Keyword Planner interface and start asking questions.
I'm sharing my screen showing Google Keyword Planner, but I don't know how to use the tool. Please guide me through how to do the keyword research process for the keyword bookcase. Okay.
Let's go with discover new keywords. Please click on that option. After clicking, you can then enter the keyword bookcase into the text field.
Please click on the get results button. After that, you will be able to see the keyword data. So tell me how to identify the best targeting keyword.
There are 2,000 keyword ideas available now. Okay. To identify the best targeting keywords,
you should look for keywords with a good balance of monthly searches and manageable competition. For example, bookshelf with ladder has 10K - 100K monthly searches with high competition, while glass bookcase has 1K - 10K monthly searches, also with high competition. Consider which keywords align with your business goals.
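The "good balance of monthly searches and manageable competition" rule can also be applied to an exported keyword list. The snippet below is a minimal sketch under some assumptions: the first two rows come from the demo, the third row is a made-up example, and the thresholds (at least 1,000 searches at the low end of the range, competition no higher than medium) are placeholders you would tune yourself.

```python
import re

MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000}
COMPETITION_RANK = {"low": 0, "medium": 1, "high": 2}

KEYWORDS = [
    {"keyword": "bookshelf with ladder", "searches": "10K - 100K", "competition": "High"},
    {"keyword": "glass bookcase", "searches": "1K - 10K", "competition": "High"},
    # Hypothetical row, not from the demo, added so the filter has something to keep.
    {"keyword": "white bookcase", "searches": "1K - 10K", "competition": "Medium"},
]


def parse_range(text: str) -> tuple[int, int]:
    """Parse a volume range such as '10K - 100K' into (low, high) integers."""
    parts = re.findall(r"(\d+(?:\.\d+)?)\s*([KM]?)", text.upper())
    if len(parts) != 2:
        raise ValueError(f"expected a range with two numbers, got {text!r}")
    low, high = (int(float(number) * MULTIPLIERS[suffix]) for number, suffix in parts)
    return low, high


def shortlist(keywords, min_searches=1_000, max_competition="medium"):
    """Keep keywords that pass both thresholds, highest search volume first."""
    limit = COMPETITION_RANK[max_competition]
    kept = [
        kw for kw in keywords
        if parse_range(kw["searches"])[0] >= min_searches
        and COMPETITION_RANK[kw["competition"].lower()] <= limit
    ]
    return sorted(kept, key=lambda kw: parse_range(kw["searches"])[0], reverse=True)


if __name__ == "__main__":
    for kw in shortlist(KEYWORDS):
        print(kw["keyword"], kw["searches"], kw["competition"])
```

With these thresholds, both keywords from the demo are filtered out because of their high competition, which matches the advice to weigh volume against competition rather than chase volume alone.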
Now let's say I've imported keyword data into Excel and I need Gemini to guide me through building a pivot table. Now you can drag other fields like Avg. monthly searches, competition, or any other field you're interested in to the values area. This will populate the pivot table with the data.
Which field would you like to explore next? Um, I have no idea. Which field do you recommend?
Given you're trying to identify the best target keyword, I'd recommend dragging competition into the values area to see how competitive each keyword is and maybe top of page bid to understand the potential ad cost. So let's focus on monthly searches and top of page bid. Okay, with just monthly searches and top of page bid in the values, you can now analyze the data for keywords with high monthly searches and low top of page bid, as those keywords are likely good targets.
Would you like me to explain how to sort this data? Yes, please. To sort your data, click on the drop down arrow next to row labels at the top left of your pivot table, which is the cell with a black triangle inside a gray box.
From that menu, select more sort options. This will open a new dialog box for custom... I apologize for that oversight. It seems I directed you to the wrong spot.
Instead of the drop down arrow for row labels, you should click on any cell in the sum of Avg. monthly searches column, then right click to bring up the sort options. Now choose sort largest to smallest in the sort menu. This will sort all of your keywords based on monthly search volume.
Yes, but now the column suddenly becomes the top of page bid instead of the monthly search volume. So this is a convenient way to use Gemini for learning. Ask questions quickly with the interactive experience.
But you can see Gemini can sometimes give you wrong information. For obvious questions, it works fine. But when it comes to more sophisticated data analysis, just like in this demo, it may mislead you sometimes. Still, I would say this is one of the best use cases because you can immediately jump in and ask questions.
And it's not just for software. Maybe you're reading a book and want to ask a question, or have it read out loud for you, just like an audiobook. So the ideas are endless here.
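Going back to the pivot table demo: since the sorting guidance went wrong there, one way to double check a result like that is to reproduce the sort outside Excel. This is a minimal sketch, assuming the export has been loaded as a list of rows. The keywords, numbers, field names, and thresholds are all made up for illustration.

```python
# Hypothetical rows standing in for the Keyword Planner export; the numbers are made up.
ROWS = [
    {"keyword": "keyword a", "avg_monthly_searches": 5_000, "top_of_page_bid": 2.50},
    {"keyword": "keyword b", "avg_monthly_searches": 20_000, "top_of_page_bid": 0.80},
    {"keyword": "keyword c", "avg_monthly_searches": 12_000, "top_of_page_bid": 3.10},
]


def sort_by_searches(rows):
    """Sort largest to smallest by average monthly searches."""
    return sorted(rows, key=lambda row: row["avg_monthly_searches"], reverse=True)


def good_targets(rows, min_searches, max_bid):
    """Return keywords with high monthly searches and a low top of page bid."""
    return [
        row["keyword"]
        for row in sort_by_searches(rows)
        if row["avg_monthly_searches"] >= min_searches
        and row["top_of_page_bid"] <= max_bid
    ]


if __name__ == "__main__":
    print([row["keyword"] for row in sort_by_searches(ROWS)])
    print(good_targets(ROWS, min_searches=10_000, max_bid=1.00))
```

If the order here disagrees with what you see in the pivot table, that is a sign the sort was applied to the wrong column, as happened in the demo.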
Next, dynamic content analysis. With the advanced multimodal capability in Gemini 2, it can not just hear but also watch and understand the context so much better. So it is useful for analyzing dynamic content, like video or content that involves motion, and giving actionable feedback. Let's say I have this YouTube video about SEO strategy and I want Gemini to help me analyze it.
Now upload the video and ask it to analyze both the visual content and audio, and provide suggestions on how to improve the video's storytelling, flow, pacing, audience retention, and overall quality. Also ask it to identify exact moments or time frames where viewers might lose interest. Now you can see this is amazing.
Gemini gives me really comprehensive feedback on what my video did well, like clear delivery and a good variety of visuals. So you can see Gemini can not only hear what I said in the video, but also see what is being shown in the video.
And then it proposed detailed suggestions on my storytelling, my on camera presence and energy, which is crazy. It just perfectly shows how Gemini 2.0 excels in understanding context, even for dynamic content.
So this is something ChatGPT can't do well for now. Gemini also follows my instructions to give me the exact time frames where I should pay more attention, and finally a summary. So this is really fantastic.
And this not only applies to YouTube videos, it also works great for analyzing your sales pitch, presentations, or onstage performance, using its advanced contextual understanding. So definitely try it. Next, step by step process documents.
As marketers, agencies, or business owners, you must have encountered a time when you needed to spend hours preparing process documents, or what we call SOPs, for internal team training or client onboarding. So what makes Gemini powerful in doing this is that it can understand the context by just watching, without even hearing anything. Let me show you.
So let's say I have this keyword research process and I need to create a process document for my client. So I record the whole process myself without speaking a word. Now I can upload it to Gemini and ask it to format this video into step by step instructions, which I can use in a training document.
So, wow, immediately you will see it will turn this video content into a series of instructions. And I would say they're all accurate. This is really amazing.
As you can tell, Gemini just recognized everything happening in the video perfectly. And of course, you still need to fact check it yourself, but imagine how much time you can save with this. You can streamline your internal processes so much faster.
So let's try another example. This is a cake recipe video. Again, there is no voiceover, just some on screen text.
Now upload it to Gemini and ask it to turn it into a series of instructions to be used on a website. So you can see, this is absolutely amazing. You can literally ask it to reverse the process and break down every step for you in a structured format.
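Once you have the steps back as text, getting them onto a website is a small formatting job. Here is a minimal sketch that turns a list of steps into an HTML numbered list. The recipe steps and the function name are hypothetical placeholders, not the actual output from the demo.

```python
from html import escape


def steps_to_html(title: str, steps: list[str]) -> str:
    """Render a title and an ordered list of steps as an HTML fragment."""
    items = "\n".join(f"  <li>{escape(step)}</li>" for step in steps)
    return f"<h2>{escape(title)}</h2>\n<ol>\n{items}\n</ol>"


if __name__ == "__main__":
    # Hypothetical steps; paste in the instructions you get back from Gemini.
    example_steps = ["Mix flour & sugar", "Bake", "Cool"]
    print(steps_to_html("Cake recipe", example_steps))
```

The `escape` call is there so characters like `&` or `<` in the generated instructions do not break the page markup.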
This is something even ChatGPT can't do for now. So this is definitely one of the best use cases you should try. Next, generate a natural voiceover.
Gemini's native audio output is super impressive. So I found this is a great use case: ask Gemini to generate a voiceover by just giving it the text, and immediately you'll have a professional voiceover that is ready to use. Super amazing.
So let's use the previous keyword research process example. So go to stream real time. Now make sure you select audio as the output and then in the prompt paste back the process instructions we just got from Gemini and ask it to say this set of instructions word by word in a professional manner.
Okay, I can do that for you. Here's a professional voiceover script for your training video. Following your provided instructions word by word.
Step one, access Keywords Explorer, navigate to the Ahrefs dashboard. Click on the Keywords Explorer option in the navigation bar at the top of the screen. You can hear that.
This is so impressive. And the best thing is now you can download the file. It's in high quality.
You can reuse it in a video or tutorial, or perhaps you have a product demo or training video. You can now just turn any written text into audio. I would say the only limitation now is that there is no direct conversion from text to an audio file. So you have to let the model finish saying it before you can download the file.
But still, this is a great way to use it, and you can instruct it to use different styles, tones, and pacing. So this is a use case that works best for standardized content, like tutorials and training. For podcasts, I still recommend using your own voice so that you can build a real audience.
Next, multilingual content adaptation. Translation is definitely a core strength of Google tech, and this is perfectly shown in Gemini 2.0.
So if you need to do marketing or business in different regions and want to localize content, Gemini is great at adapting the content for different languages while preserving the context and flow. Let's say I have this article about origami in Spanish, and then I can upload it to Gemini and ask it to turn it into a ready to use podcast episode script in English, with a high energy tone.
And then immediately we will get a ready to use script in English. And this demonstrates not just the strong translation capability of Gemini, but also the way it words the script to make it conversational, just like what we see in NotebookLM.
Now we're not done yet. So go to stream real time again. This time let's try another voice.
Then paste back the full script and the prompt and ask it to say this episode word by word in a high energy tone. No scissors, no glue, nothing else but your hands and your brain power. Seriously, it's pure paper magic!
And you know what? That's what sets it apart from papiroflexia. What impresses me is that you can hear how Gemini handles the dynamics in speech.
You can feel the energy and it's so natural. And not just English, you can paste the script and ask it to say it in a different language like Spanish. Hello everyone and welcome to the podcast.
Today we're going to dive into something super cool. Seriously, I'm blown away by the native audio output of Gemini, so definitely try it. Gemini 2.0 is a big leap forward in multimodal AI capabilities.
The use cases we explored today are just the beginning. If you want more inspiration, I share some more unique ways you can use Gemini 2 for marketing or business.
I share these in my community. You can find the link in the description to join. And before you go, also check out this video about using NotebookLM, which is now powered by Gemini 2, and combining it with Perplexity for your research workflow.
I will see you next time.