RAG Explained

IBM Technology

Get the interactive demo → https://ibm.biz/BdmPEb Learn about the technology → https://ibm.biz/BdmPE...

Video Transcript:

So imagine you're a journalist and you want to write an article on a specific topic. You have a pretty good general idea about this topic, but you'd like to do some more research, so you go to your local library. Now, this library has thousands of books on many different topics, but how do you know, as the journalist, which books are relevant to your topic? Well, you go to the librarian. The librarian is the expert on which books contain which information in the library. So our journalist queries the librarian to retrieve books on certain topics, and the librarian produces those books and provides them back to the journalist. Now, the librarian isn't the expert on writing the article, and the journalist isn't the expert on finding the most up-to-date and relevant information, but with the combination of the two, we can get the job done.

Luv, this sounds a lot like the process of RAG, or retrieval-augmented generation, where large language models call on vector databases to provide key sources of data and information to answer a question.

Hmm, I'm not seeing the connection. Can you help me understand a little bit better?

Sure. So we have a user. In your scenario it's that journalist, and they have a question. What types of questions would you want to ask? Maybe we can make this more of a business context.

Yeah, so let's say this is a business analyst, and let's say they want to ask: what was revenue in Q1 from customers in the Northeast region?

Right, so that's your prompt.

Okay, so a couple of questions on that user: does it have to be a person, or could it be something else too?

Yeah, so this doesn't necessarily have to be a user. It could be a bot, or it could even be another application. Take the question that we're talking about, what was our revenue in Q1 from the Northeast. The first part of that question, what was our revenue, is pretty easy for a general LLM to understand. But it's that second part, in Q1 from customers in the Northeast, that's not something that LLMs are trained on. It's very specific to our business and it changes over time, so we have to treat those separately.

So how do we manage that part of the request?

Exactly. You'll potentially need multiple different sources of data to answer a specific question, whether that's a PDF, another business application, or maybe some images. Whatever that question is, we need the appropriate data in order to provide the answer back.

What technology allows us to aggregate that data and use it for our LLM?

Yeah, so we can take this data and put it into what we call a vector database. A vector database stores a mathematical representation of structured and unstructured data, similar to what we might see in an array.

Gotcha. And these arrays are better suited to, or easier to understand for, machine learning or generative AI models, versus just that underlying unstructured data.

Exactly. We query our vector database and we get back an embedding that includes the relevant data for which we're prompting.

And then we include it back into the original prompt, right?

Yeah, exactly. That feeds back into the prompt, and once we're at this point, we move over to the other side of the equation, which is the large language model.

Gotcha. So that prompt, which now includes the vector embeddings, is fed into the large language model, which then produces the output with the answer to our original question, with sourced, up-to-date, and accurate data.

Exactly, and that's a crucial aspect of it. As new data comes into this vector database, or things are updated, back to your question around performance in Q1, those embeddings are updated. So when that question is asked a second time, we have more relevant data to provide back to the LLM, which then generates the output and the answer.

Okay, very cool. So Sean, this sounds a lot like my original analogy with the librarian and our journalist, right? The journalist trusts that the information in the library is accurate and correct. Now, one of the challenges that I see when I'm talking to enterprise customers is that they're concerned about deploying this kind of technology into customer-facing, business-critical applications. So if they're building applications taking customer orders or processing refunds, they're worried that these kinds of technologies can produce hallucinations or inaccurate results, or perpetuate some kind of bias. What are some things that can be done to help mitigate some of these concerns?

That brings up a great point, Luv. The data that comes in on this side, but also on this side, is incredibly important to the output that we get when we make that prompt and get that answer back. So it really is true: garbage in, garbage out. We need to make sure we have good data coming into the vector database, and we need to make sure that data is clean, governed, and managed properly.

Gotcha. So what I'm hearing is that things like governance and data management are of course crucial to the vector database, making sure that the actual information flowing into the model, such as the business results in the sample prompt we talked about, is governed and clean. But also, crucially, on the large language model side, we need to make sure that we're not using a large language model that takes a black-box approach: a model where you don't actually know what underlying data went into training it. You don't know if there's any intellectual property in there, you don't know if there are inaccuracies in there, and you don't know if there are pieces of data that will end up perpetuating bias in your output. So as a business that's trying to manage and uphold its brand reputation, it's absolutely critical to make sure that we're taking an approach that uses LLMs that are transparent in how they were trained, so that we can be 100% certain that there aren't any inaccuracies or data that's not supposed to be in there.

Right, yeah, exactly. It's incredibly important, especially as a brand, that we get the right answers. We've seen the impact, and especially back to our original question around what was our revenue in Q1, we don't want that to be impacted by the results of a question that prompts one of our LLMs.

Exactly, exactly. So, very powerful technology, but it makes me think back to the library. Our journalist and librarian both trust the data and the books that are in the library. We have to have that same kind of confidence when we're building out these types of generative AI use cases for business as well.

Exactly, Luv. So governance, AI, but also data and data management are incredibly important to this process. We need all three in order to get the best result.
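Editor's sketch: the flow described in the conversation (store data as vectors, retrieve the most relevant piece for a question, feed it back into the prompt, then hand that prompt to the LLM) can be illustrated with a small, self-contained Python example. Everything here is an assumption made for illustration and is not from the video: `embed` is a toy bag-of-words stand-in for a real embedding model, `ToyVectorStore` is a hypothetical in-memory stand-in for a vector database, the documents contain placeholder text rather than real figures, and the final LLM call is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.docs: dict[str, tuple[str, Counter]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Adding or updating a document recomputes its vector, so later
        # queries see the new data.
        self.docs[doc_id] = (text, embed(text))

    def query(self, question: str, k: int = 1) -> list[str]:
        # Rank stored documents by similarity to the question vector.
        q = embed(question)
        ranked = sorted(
            self.docs.values(), key=lambda d: cosine(q, d[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Augment the user's question with the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


store = ToyVectorStore()
# Placeholder business data; no real figures are implied.
store.upsert("q1-northeast", "Q1 revenue from customers in the Northeast region: placeholder figure A")
store.upsert("q1-west", "Q1 revenue from customers in the West region: placeholder figure B")
store.upsert("hr-policy", "Vacation policy for employees, updated yearly")

question = "What was revenue in Q1 from customers in the Northeast region?"
prompt = build_prompt(question, store.query(question, k=1))
print(prompt)  # this augmented prompt is what would be sent to the LLM

# When the underlying data changes, the stored vector is refreshed, so the
# same question asked a second time retrieves the updated text.
store.upsert("q1-northeast", "Q1 revenue from customers in the Northeast region: placeholder figure A (restated)")
updated_prompt = build_prompt(question, store.query(question, k=1))
```

In a real deployment the toy pieces would presumably be replaced by an actual embedding model, a vector database, and an LLM client. The governance point from the conversation applies at both ends: to what gets upserted into the store and to which model receives the prompt.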