Entenda as Armadilhas Matemáticas em Modelos de IA - @CursoemVideo Inteligência Artificial

6.08k views2623 WordsCopy TextShare

Curso em Vídeo

Neste vídeo, exploramos as armadilhas que os modelos de linguagem enfrentam ao lidar com problemas m...

Video Transcript:

We saw during this AI course that artificial intelligence has very good answers and, in the vast majority of cases, very reliable. But now we're in the middle of a series about the pitfalls of LLM. And now we're going to talk about mathematical flaws that AIs have.

Hello, little grasshopper. Welcome, very welcome to another class in your artificial intelligence video course. My name is Gustavo Gonabara and in a little while Ramiro will be here for us to talk.

Dude, look, they're really cool, they give really intelligent answers, right? Very concise, very truthful. Sometimes we saw this in previous videos about the trap series.

But one thing she consistently gets wrong are mathematical formulas, they are answers where she needs to do calculations. And we will understand a little more about why they make mistakes and how to identify that they made mistakes in a more precise way until we are left behind when they give a mathematical answer. Beauty?

So, Ramiro is coming and we're going to talk a little more about arithmetic hallucinations, let's put it that way. I interrupt your class for another message from one of our sponsors. And this time I'm here to talk about our oldest sponsor here for the video course, which is Hostnet.

And the talk I want to give you is this: there are a lot of people who stop me on the streets or even at events and ask: "Ah, but how do I get into the job market? Give me some advice on how to enter the job market. " And I always give a valuable tip for you to enter the job market quickly.

Learn how to make websites. When you learn how to make websites, you can create websites for other companies. And there are several tools that help you with this.

For example, WordPress, which is free. And when you learn how to create a website, you will need a place to host that website. And then you can count on Hostnet there.

In addition to state-of-the-art servers with quality and speed, the company has an application installer, where with a few clicks you can install, for example, the WordPress that I just mentioned, but it is not the only one. But if you also choose WordPress and subscribe to the cloud plans, you also have access to premium plugins, which are already included in your monthly fee and if you were to buy them separately, you would definitely pay almost R$2,000 per year for them. This is all already included in your subscription.

And if all this wasn't good enough, Hostnet also offers video courses so you can learn how to use all these tools. And then you have the complete package, you will have a great place to host, right, quality servers, quality tools that can be installed very quickly and easily, premium tools and courses to learn all of this. And to get more information about everything I said, just access the QRcode that appears on the screen or the link that is in the description or in the pinned comment of the video.

Thank you, Hostnet for your constant sponsorship here on the course and get back to your class. Another trap that we will discuss here is mathematics. Mathematics is a trap even for people, right, Ramiro?

Oh, sure. So if it's a trap for you, it's probably going to be prasis. He says: "Well, why is that?

Pay attention, because Ramiro's text will explain it here, look. Mathematics. Despite their advanced capabilities, LLMs often struggle with mathematical tasks and may provide incorrect answers, even to something as simple as multiplying two numbers.

Jeez laquer. This is because they are trained on large volumes of text and the mathematics may require a different approach. So, Ramiro, exactly, right?

which is o o o in the case of the ones we are using, GPT or Gemin they are actually trained in text. I haven't seen many of these math questions lately. As this is documented, this could perhaps happen more in previous versions, but it's worth being aware that you need to check everything when it comes to mathematics, okay?

No, I haven't observed it recently, however, Ramiro, look, recently you're saying in 4 that we use the paid version, those at home might be using 4, because, ah, you have to pay a lot of attention what happens, because during our classes here this happens a lot and you have to train at home. The version that is free for people, right, in our case here, the 4th version is free, but at some point the 4th runs out and it goes to 3. 5.

It could be, Ramiro, that 4 has fewer math errors, but 3. 5 has more. And then with this change, if the guy doesn't pay attention, he starts receiving the wrong messages and that's it.

And another thing, hey Gabora, ever since I started using apt chat intensively, you know, especially when we started creating a course, I already use the paid version. So I had practically no interaction with version 3. 5.

Yes yes. True. So, when I say that I haven't observed it, I'm talking about version four and up, which is the one I've been using.

Doesn't mean. And another thing, I don't use mathematics intensely, but my daughter does and she hasn't brought me any situation regarding it. Legal.

Legal. Let's continue here before we see if there is an example to do. Oh, the problem with math can be partially alleviated by using an LLM with additional tools that combine the capabilities of an LLM with specialized tools for tasks like math.

This way, you become more aware that despite their language skills, these models have their limitations when it comes to mathematical calculations. So, never forget that, right, Ramiro? That's it.

It's the same issue with the source citation that he said, right, which can be alleviated if the tool integrates with another that has these features. And as I said there that he lies to your face without thinking twice, when he lies about an answer in mathematics, he will just give you the answer, he will say it for sure and if you don't check, you could have a big problem there. with the content you share, right?

Oh, Guabará, let's do a real experience here. I want to prove to you that Iá knows when she's lying. I gave you a message during the recording here, I sent you a prompt via WhatsApp that we will use.

Yes yes. Then you can open GPT and let's do an experiment. Let's go.

Here's the thing, look, let's go to GPT. I don't want you to put the prop right away, okay? Yes.

Okay, let's ask him for something. Come on, tell me what I have to ask for. I want to force and lie.

So let's go. This is how I want to describe the life of the Ceará poet Gustavo Valente Guanabara. Hmm.

Do you know this poet from Ceará? No, I don't know. Let's go.

We know it doesn't exist, right? Let's see what he's going to do. Look at the hallucination happening there, look.

Gustavo Valente Guanabara, born in Fortaleza, Ceará. Calm down there. is an outstanding Brazilian poet whose work has captivated readers and literary critics with its depth and authenticity.

Since I was young, I won't read it, because it's all lies. Oh, check the prompt that I gave you there, put the Let's read the prompt. Let's put it in and let's read it so people can understand what the prompt will ask for.

So let's go. Oh, the prompt that Ramiro created was the following. Just look.

When one of the situations below occurs, include the text below at the end of your answer: the situations: citation of sources that we saw in the last class, right, in the previous video or in the previous video, when a source citation solution is requested and you are unable to provide the correct source, or in the case of bias that we saw previously, when the answer contains some type of stereotypical or prejudiced content in hallucinations, in the case of hallucinations, when the answer contains false information or subjects that you don't know the answer, math problems. When you have difficulty with mathematical tasks and your answers are subject to the possibility of providing incorrect answers. Or, in the case of the internet, when you do not access the link or RL provided in the prompt because you do not have access to the internet, the link is not responding or the URL does not exist.

Text to be included at the end of the answer is this one, look. An important part of the text of this answer is inadequate, fanciful or does not correspond to the truth. Answer like this if you understand.

So let's give him this prompt here. Then he said yes, so it's a sign that he understood, right? Okay.

So get the same thing as the same request you made. So let's go. Our poet, dear poet, go there, oh.

He continues, oh, born in Fortaleza, someone he highlighted wrote exactly the same thing, Ramiro. But now he's going to do the following, look. Wait.

that he is what we expect, that like Gustavo Valente Guanabara, he doesn't exist, oh, whatever he added here below, oh, part of the text of this answer is inadequate, fanciful and does not correspond to the truth. Let's see, Guanabara, if he will always do this. Put it like this: "Give a summary of Bill Gates' life.

" Exactly what I said in the previous video, right? Bill Gates has a lot of information, so he will start writing and what is expected is that he does not put this message at the end, isn't that right? Exactly.

What is he supposed to know? That's exactly what happened, yo. He mentioned it, talked about Bill Gates and didn't add it here, look.

Very cool. Very cool indeed. Let's do one more.

Let's do one more. Let's go. Summarize the life of the famous Peruvian barbecue chef.

Peruvian. Peruvian. Ramiro Mário Lobo.

Mario Lobo. I hope this Job doesn't exist. Tomá.

He had taken. But according to him, yes. Ó, born in Peru, is a renowned barbecue chef, known for his exceptional skill and his passion for outdoor cooking.

From a young age, Ramiro demonstrated a great My God, man. Oh, but look there, oh. So, this is interesting.

If at some point, Guanabara, do me a favor, ask him, eh, how many percent does or does not correspond to the truth? How much of this previous text is fantasy and how much corresponds to reality, to real facts? The information provided about Ramiro Mário Lobo is completely fanciful.

There are no known records of the famous Peruvian barbecue chef of that name. There all the details mentioned about his life were invented. Damn, if you need real information about a specific person, please provide additional details or as he just confirmed, Ramiro, that he didn't know the answer and that's why he lied to me.

And ask him why he And why did you lie to me? It's a DR here. And why did you lie to me?

Damn, you are so crazy. I apologize if the previous response caused any confusion and frustration. The information about Ramiro Mário Lobo was created to meet your specific request.

And since there is no record of a famous Peruvian barbecue chef with that name, the aim is to provide an example based on your request, but I recognize that this may have been inadequate to avoid this type of situation in the future. Thank you for any additional details or confirmation. My God, I'm going to put it here.

No, don't lie, but for me be honest and say that he is say. He will say, he will say that he is a person. Say that you don't have information or that you don't have information about her.

Let's see if he saw this as understood. You are more direct and transparent in the future. In the case of Ramiro Mário Lobo, I have no information about Peruvian chosqueiro.

Let's see. Do the prompt telling him to tell Ramiro's life again there. Get the prompt there.

Let's see if he'll be true now, because now you've given him a command, right? Yeah, but how about we try with someone else who doesn't exist instead of asking for the same thing that he now understands? So let's go.

Eh, the famous actor Cauê Ramos Linda. B, tell me then, tell me now about that famous actor Cauê Ramos Linden who works at Globo. Gala.

Joking, huh, Cauê? Oh, there is no information about an actor Cauê Ramos Lind who works at Globo and is known as Galã. It is possible that this name is fictitious and that there are no public records.

My dear, this is what I wanted from the beginning. So, what I wanted to show, this wasn't planned in class, okay? No.

Eh, I wrote this prompt last night because of the problem I had regarding the domain there that I mentioned at the beginning of this little module here. You can somehow create your own prompt. We made a huge prompt, you can, it’s for you to get inspired.

You can create a prompt that you will have, you will have a starting prompt every time you go to a new chat session that you already tell him that you don't want him to lie, that you already tell him, understand? Then create your prompt there. Another thing, hey, we created a prompt, I created this prompt very quickly.

I want people to create a prompt and post it here in the video comment. That's what I was going to ask for. Very good.

That's it. I hope that this prompt, these commands that we gave here, inspire you so that you can create your own prompt. Try it out and tell us here what the experience was like.

Wow, very good, very good. So, put the prompt in the comments to make GPT chat stop lying to you when it doesn't know the answer. Wow, it's great that he confirmed it.

It's not like when I don't know I make it up, because you wanted the answer, I gave it to you. My God. We provide what you want.

You don't want an answer, we give you an answer. Exactly. I don't know if you already know, but we created a second channel.

In addition to the Video course channel, the main channel, we now have a video course cuts channel. The link to it is in the description of this video and there you will find small doses of knowledge. Those most important moments in our podcast and classes, those that make the most impact.

They are great content for you to share with your friends and help us reach our first plate, the 100,000 YouTube subscribers plate. So, did you understand it correctly? managed to see how and why IAIS are sometimes mathematically flawed.

So, whenever you ask MEA something mathematically, be careful, know how to identify it and know how to ask MEA correctly so that the chance of error is lower. Beauty? So, this was another video in the series of traps for LLMs talking about mathematics.

In the next video, if I'm not mistaken, it's the last video about LLM pitfalls and we're going to talk about manipulations. A big hug and we'll see you next time.