How to Build a Self-Improving AI with Agentic RAG and Flowise

Leon van Zyl
Agentic RAG is a powerful approach for building AI solutions that are able to self-improve their res...
Video Transcript:
<b>Are you tired of your</b> <b>chatbot giving you wrong answers? </b> <b>Today, we'll be building a self-improving</b> <b>AI application using</b> <b>agentic RAG and Flowise. </b> <b>That's right, no more frustrating AI</b> <b>hallucinations or irrelevant responses.
</b> <b>In order to build this application, we</b> <b>will use sequential</b> <b>agents within Flowise. </b> <b>With sequential flows, we can add</b> <b>multiple agents to a</b> <b>single AI application. </b> <b>And since we want our</b> <b>application to correct itself,</b> <b>we will be using different agents for</b> <b>different functions.
</b> <b>But first, let's start with a simple demo</b> <b>of this application. </b> <b>As per usual, we will be using my Oak &</b> <b>Barrel restaurant as an example. </b> <b>Oak & Barrel is a fictitious restaurant</b> <b>that sells steaks and sushi.
</b> <b>So if a user asks a question like,</b> <b>"What are your current specials?"</b> <b>our assistant will be</b> <b>able to answer that question. </b> <b>And if we have a look at</b> <b>the conversation history,</b> <b>the retrieval tool reached</b> <b>out to our knowledge base</b> <b>to fetch information</b> <b>about the current specials.
</b> <b>We then have this</b> <b>conditional agent that checked</b> <b>if the retrieved documents are indeed</b> <b>related to the user's question. </b> <b>And if so, it instructs the process to</b> <b>generate a response. </b> <b>And then this final generate agent</b> <b>provides this response to the user.
</b> <b>Now let's ask something that's completely</b> <b>unrelated to the restaurant, </b> <b>like, "Do you sell crypto?"</b> <b>Let's send this.
</b> <b>And this time, if we have a look at the</b> <b>conversation history,</b> <b>the retrieval agent</b> <b>reached out to the tool,</b> <b>which did return some</b> <b>information from the knowledge base. </b> <b>And looking at this, there is</b> <b>no mention of crypto anywhere. </b> <b>Now this could lead to</b> <b>inaccurate answers or hallucinations.
</b> <b>So when we go further down the process,</b> <b>we can now see that the conditional agent</b> <b>determined that the documents fetched</b> <b>from the vector store</b> <b>are not relevant to the user's question,</b> <b>and is therefore calling the rewrite</b> <b>agent to try and rephrase the question. </b> <b>The rewrite agent then rephrases the</b> <b>question in an attempt</b> <b>to get a different output. </b> <b>Then finally, we do get</b> <b>a response saying that</b> <b>"Oak & Barrel does not</b> <b>sell cryptocurrency."
</b> <b>This might be an extreme example. </b> <b>Realistically, what would happen is a</b> <b>user would ask a question</b> <b>that is indeed related to the restaurant,</b> <b>but because of the way it's phrased,</b> <b>the documents returned from the vector</b> <b>store are not really related,</b> <b>in which case the rewrite agent will take</b> <b>the user's question,</b> <b>rephrase it differently,</b> <b>and hopefully a more suitable set of</b> <b>documents will be returned</b> <b>and injected into the context. </b> <b>Now let's go ahead and build this</b> <b>agentic RAG app together.
</b> <b>In the Flowise</b> <b>dashboard, go to Agent Flows,</b> <b>and if you don't see</b> <b>Agent Flows in the menu,</b> <b>then please make sure to update your</b> <b>Flowise instance to the latest version. </b> <b>And by the way, if you don't want to deal</b> <b>with installing or</b> <b>updating Flowise yourself,</b> <b>then check out their fully</b> <b>managed cloud service instead. </b> <b>The link is in the</b> <b>description of this video,</b> <b>and by using my link you will be</b> <b>supporting my channel.
</b> <b>Let's create our Agent</b> <b>Flow by clicking on Add New,</b> <b>Save this project,</b> <b>let's give it a name</b> <b>like "Self-improving RAG",</b> <b>and let's save this. </b> <b>Let's start by adding a</b> <b>new node to the canvas,</b> <b>and under Sequential</b> <b>Agents, let's add the start node. </b> <b>If you're new to sequential agents,</b> <b>then I highly recommend checking out my</b> <b>fundamentals video over here.
</b> <b>Let's also add a chat model,</b> <b>so let's go to Chat Models,</b> <b>and I'm going to use</b> <b>the ChatOpenAI node,</b> <b>but of course the</b> <b>choice of model is up to you. </b> <b>Just make sure the model is powerful</b> <b>enough to run agents and</b> <b>has tool-use capabilities. </b> <b>Under Model Name, let's select GPT-4o,</b> <b>I'll also select my credentials,</b> <b>and I'll set the</b> <b>temperature to a low value like 0.2. </b> <b>Now let's think about what this</b> <b>application needs to do. </b> <b>First, we want to call an agent that is</b> <b>able to retrieve information</b> <b>about a restaurant from a knowledge base.
</b> <b>After that, we will have agents that will</b> <b>score the response</b> <b>from the knowledge base,</b> <b>and then determine if the</b> <b>query should be rewritten,</b> <b>or if an answer can be generated. </b> <b>But let's first focus on the</b> <b>agent and our knowledge base. </b> <b>So let's start by adding</b> <b>a new node to the canvas,</b> <b>and let's add an agent node.
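</b>
The plan described above (retrieve, score, then either generate or rewrite and try again) can be sketched in plain Python. This is only an illustration of the control flow, not how Flowise implements it: the function names, the `max_rewrites` limit, and the placeholder document are all assumptions made for the sketch, and the "LLM" steps are simple stand-in functions.

```python
from typing import Callable, List


def agentic_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], str],
    generate: Callable[[str, List[str]], str],
    rewrite: Callable[[str], str],
    max_rewrites: int = 2,  # assumption: the video does not set a retry limit
) -> str:
    """Retrieve, grade, then either generate an answer or rewrite and retry."""
    current = question
    docs = retrieve(current)
    for _ in range(max_rewrites):
        if grade(current, docs) == "yes":
            break  # documents are relevant: move on to the generate step
        current = rewrite(current)  # rephrase the question and retrieve again
        docs = retrieve(current)
    return generate(current, docs)


# Hypothetical stand-ins for the vector store and the LLM calls.
PLACEHOLDER_DOCS = {"specials": "Placeholder text about the current specials."}


def demo_retrieve(query: str) -> List[str]:
    return [text for key, text in PLACEHOLDER_DOCS.items() if key in query.lower()]


def demo_grade(query: str, docs: List[str]) -> str:
    return "yes" if docs else "no"


def demo_rewrite(query: str) -> str:
    return query + " (specials)"


def demo_generate(query: str, docs: List[str]) -> str:
    return f"Answer based on {len(docs)} document(s)"


if __name__ == "__main__":
    print(agentic_rag("What deals do you have?", demo_retrieve,
                      demo_grade, demo_generate, demo_rewrite))
```

In this sketch the first retrieval finds nothing, the grader says "no", the question is rewritten once, and the second retrieval succeeds. The retry limit is there so the loop cannot run forever if no rewrite ever produces relevant documents.
<b>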
</b> <b>Let's call this agent</b> <b>something like Customer Support,</b> <b>then for the system prompt,</b> <b>let's enter something like,</b> <b>you are a customer support agent for a</b> <b>restaurant called Oak and Barrel,</b> <b>that always answers questions with the</b> <b>most relevant information</b> <b>using the tools at your disposal. </b> <b>And then we'll just specify that it's got</b> <b>this restaurant info tool available. </b> <b>Let's save this prompt,</b> <b>let's also connect our</b> <b>start node with this agent.
</b> <b>Let's also add this restaurant info tool,</b> <b>so under Add Nodes, let's go to Tools,</b> <b>and within Tools, let's</b> <b>add the Retriever tool. </b> <b>We can use the Retriever tool to fetch</b> <b>information from a vector store. </b> <b>So let's attach this</b> <b>Retriever tool to our agent,</b> <b>and for the tool description, let's enter</b> <b>Search and Return Documents</b> <b>Related to the Restaurant.
</b> <b>Great! </b> <b>Now our agent has a tool that it can call</b> <b>to fetch information</b> <b>about the restaurant. </b> <b>Let's continue working on this tool.
</b> <b>First, we need to attach a</b> <b>vector store to this tool,</b> <b>and within vector stores, let's add an</b> <b>in-memory vector store,</b> <b>and let's attach it to our tool. </b> <b>Now the vector store</b> <b>requires two inputs as well. </b> <b>Let's start with the Embeddings input.
</b> <b>So under Add Nodes,</b> <b>let's go to Embeddings,</b> <b>and because I'm using OpenAI,</b> <b>I'll simply add the</b> <b>OpenAI Embeddings node,</b> <b>and let's attach it to the vector store. </b> <b>And I'm also going to</b> <b>select my credentials,</b> <b>and that should be everything</b> <b>we need for the Embedding node. </b> <b>Now we can attach our knowledge base.
</b> <b>There are many different integrations</b> <b>that you can choose,</b> <b>but what I like to do lately is, if you</b> <b>go back to the dashboard,</b> <b>you can go to Document</b> <b>Stores, and within Document Stores,</b> <b>you can add a new document store, give it</b> <b>a name, and then click Add. </b> <b>And when you open the document store,</b> <b>you are now able to add multiple document</b> <b>loaders to the store. </b> <b>So as an example,</b> <b>I'll select a .docx file,</b> <b>I'll then upload the Oak & Barrel</b> <b>knowledge base, like so,</b> <b>then under Text Splitter,</b> <b>I'll simply select the Recursive</b> <b>Character Text Splitter</b> <b>to chunk this document,</b> <b>and I'll set a chunk</b> <b>size of 250 characters,</b> <b>and an overlap of 20. </b> <b>These sizes are completely up to you. </b> <b>So let's go ahead and preview the chunks,</b> <b>and we can see a list of all the chunks</b> <b>and the content that they contain.
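</b>
To get a feel for what a chunk size of 250 with an overlap of 20 means, here is a minimal sketch in Python. Note that this is a simplified fixed-window chunker, not the Recursive Character Text Splitter itself (which also tries to break on separators such as paragraphs and sentences); the function name `chunk_text` is made up for this illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 250, overlap: int = 20) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be between 0 and chunk_size - 1")

    step = chunk_size - overlap  # how far each new window starts after the last
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(600))  # 600 dummy characters
    for index, chunk in enumerate(chunk_text(sample)):
        print(index, len(chunk))
```

Each chunk starts 230 characters after the previous one, so the last 20 characters of one chunk are repeated at the start of the next. That overlap helps keep a sentence that straddles a boundary retrievable from at least one chunk.
<b>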
</b> <b>Let's click on Process,</b> <b>and that should be it. </b> <b>If we wanted to, we could add more</b> <b>document loaders, as many as we want,</b> <b>and if we ever wanted to remove</b> <b>information from the knowledge base,</b> <b>we can simply delete the entry over here. </b> <b>Let's go back to our Agent Flow, and</b> <b>let's add the document store.
</b> <b>Within the document store,</b> <b>we can also select the document store</b> <b>that we just created,</b> <b>and let's attach the document store to</b> <b>our in-memory vector store, like so. </b> <b>Let's test this flow out by</b> <b>adding an end node to the canvas. </b> <b>Let's go to Add Nodes,</b> <b>under Sequential Agents,</b> <b>let's grab the end node, and let's attach</b> <b>our agent to this end node, like so.
</b> <b>So now we have a very simple RAG chatbot</b> <b>that we can test out. </b> <b>Let's save this, let's click on Chat,</b> <b>and let's ask a question related to the</b> <b>restaurant, like, "Do you sell steaks?"</b> <b>And we can see that the</b> <b>customer support agent</b> <b>called the restaurant info tool and</b> <b>retrieved this context</b> <b>from the vector store,</b> <b>and we are getting</b> <b>the correct answer back.
</b> <b>So now that this is working,</b> <b>we now need to add a second agent that</b> <b>will be responsible for determining</b> <b>if the answer from the knowledge base is</b> <b>related to the user's question,</b> <b>and then conditionally call the rewrite</b> <b>agent or the generate agent. </b> <b>Now in order to</b> <b>conditionally call a process,</b> <b>we can go to Add Nodes,</b> <b>and under Sequential Agents,</b> <b>we have the choice between a condition</b> <b>node or a condition agent node. </b> <b>With a condition node, we can take a</b> <b>simple value, like a yes or no,</b> <b>and then branch the user</b> <b>off into different paths,</b> <b>or we could use the intelligence of an</b> <b>LLM to determine the path for us.
</b> <b>Let's select the condition agent node and</b> <b>add it to the canvas. </b> <b>I'm also going to break the connection</b> <b>between the end node and our first agent,</b> <b>and then let's pass the output from our</b> <b>first agent to this</b> <b>condition agent, like so. </b> <b>Let's give our condition agent a name,</b> <b>like, "Check if docs are relevant,"</b> <b>and let's simply attach the</b> <b>end output to our end node.
</b> <b>We will have a look at</b> <b>this next output in a minute. </b> <b>Now let's have a look</b> <b>at this condition agent. </b> <b>Let's go to Additional Parameters,</b> <b>and here we can see a default system</b> <b>prompt and a default human prompt.
</b> <b>None of these are</b> <b>related to our application,</b> <b>so I'm actually going to go</b> <b>ahead and clear these fields. </b> <b>For the system prompt, let's enter,</b> <b>"You are a grader assessing relevance of</b> <b>a retrieved document to a user question. </b> <b>Here is the retrieved context: {context}."
</b> <b>This value within curly braces is</b> <b>actually a</b> <b>placeholder for a dynamic value,</b> <b>and we will inject the result from the</b> <b>knowledge base into this</b> <b>placeholder in a minute. </b> <b>Here is the user's question,</b> <b>and again we have a variable for</b> <b>question, which we will</b> <b>link to the user's question. </b> <b>If the document contains keywords or</b> <b>semantic meaning related</b> <b>to the user's question,</b> <b>grade it as relevant.
</b> <b>Give a binary score of "yes" or "no"</b> <b>to indicate whether the document is</b> <b>relevant to the question. </b> <b>Remember, always use the extract tool to</b> <b>output only "yes" or "no."</b> <b>Your prompt doesn't have</b> <b>to be exactly like this,</b> <b>but just ensure that you inject the</b> <b>context and the user</b> <b>question into this prompt.
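</b>
As a rough picture of what happens to those placeholders, here is a small Python sketch that fills `{context}` and `{question}` into the grader prompt read out above. Flowise does this substitution for you through Format Prompt Values; the constant `GRADER_PROMPT`, the helper `build_grader_prompt`, and the sample inputs are assumptions for illustration only.

```python
# Grader prompt as dictated in the video, with the two placeholders.
GRADER_PROMPT = (
    "You are a grader assessing relevance of a retrieved document "
    "to a user question.\n"
    "Here is the retrieved context: {context}\n"
    "Here is the user's question: {question}\n"
    "If the document contains keywords or semantic meaning related to the "
    "user's question, grade it as relevant.\n"
    'Give a binary score of "yes" or "no" to indicate whether the document '
    "is relevant to the question.\n"
    'Remember, always use the extract tool to output only "yes" or "no".'
)


def build_grader_prompt(context: str, question: str) -> str:
    """Inject the retrieved context and the user's question into the prompt."""
    return GRADER_PROMPT.format(context=context, question=question)


if __name__ == "__main__":
    # Made-up inputs, just to show the filled-in prompt.
    print(build_grader_prompt(
        context="Placeholder text about the current specials.",
        question="What are your current specials?",
    ))
```

The important part is that both values reach the grader: without the context it has nothing to score, and without the question it has nothing to score against.
<b>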
</b> <b>Let's save this, and just</b> <b>to reinforce these rules,</b> <b>I'm also going to add a human prompt,</b> <b>which is very similar</b> <b>to the system prompt. </b> <b>Let's save this, and now</b> <b>let's hook up those variables. </b> <b>Let's click on "format prompt values,"</b> <b>and now we can see our placeholders for</b> <b>context and question.
</b> <b>For the context, we want to grab the</b> <b>output of the previous agent node,</b> <b>which is this sequential agent zero. </b> <b>If you were wondering how I knew what the</b> <b>agent's technical name was,</b> <b>I'll quickly show you. </b> <b>Let's close these pop-ups, and when I</b> <b>hover over this agent node,</b> <b>I can see this info button over here,</b> <b>and this will give me the</b> <b>unique name for this agent,</b> <b>which was sequential agent zero.
</b> <b>Great, let's go back to the condition</b> <b>node, let's go to prompt values,</b> <b>and now let's also assign the question. </b> <b>Let's click on "edit," and let's select</b> <b>the user's question. </b> <b>Let's close this</b> <b>pop-up, and then finally,</b> <b>let's set one value in</b> <b>this JSON structured output.
</b> <b>Let's add an item, let's call it "score,"</b> <b>let's make this an enum,</b> <b>and the possible values</b> <b>are either "yes" or "no,"</b> <b>and the description is "grading score."</b> <b>Just to explain what's happening here,</b> <b>in the system prompt, we're telling the</b> <b>agent that it needs to check</b> <b>if the context is</b> <b>relevant to the user's question</b> <b>and then output a value of "yes" or "no."</b> <b>What we have to do is store this "yes" or</b> <b>"no" value somewhere</b> <b>so that we can</b> <b>conditionally call different paths.
</b> <b>Otherwise, this is just simply a string</b> <b>output from the agent,</b> <b>which isn't really helpful in this case. </b> <b>So by defining this</b> <b>JSON structured output,</b> <b>the agent now has a variable that it can</b> <b>store the "yes" or "no" answer in. </b> <b>If we close this pop-up, we</b> <b>can now click on "condition,"</b> <b>and here we can set</b> <b>those different paths.
</b> <b>So I'll actually add two items,</b> <b>and I'm actually going to do this a</b> <b>little bit differently,</b> <b>and these two paths represent the</b> <b>generate path as well</b> <b>as the rewrite path. </b> <b>For the generate path, we want to check</b> <b>if some variable is</b> <b>equal to the value of "yes."</b> <b>The variable that we want to check is</b> <b>actually the variable that we created</b> <b>within this structured</b> <b>output, which we called "score."
</b> <b>Remember, the score value will contain</b> <b>the result from this agent execution. </b> <b>So within the condition on the variable,</b> <b>let's select flow.output,</b> <b>and let's replace this with the name of</b> <b>that variable, which we called score.
</b> <b>Then if score is equal to the value of</b> <b>"yes," then we'll</b> <b>follow this generate path. </b> <b>Or if flow.output.score is the value</b> <b>"no," then we will</b> <b>follow the rewrite path.
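</b>
The idea behind the structured output and the two conditions can be sketched like this in Python. It assumes the condition agent's result arrives as a JSON string with a single `score` field; the function name `route` and the exact JSON shape are assumptions for this illustration, not Flowise internals.

```python
import json

ALLOWED_SCORES = ("yes", "no")  # the enum values defined for "score"


def route(raw_output: str) -> str:
    """Pick the next path from the grader's structured output."""
    data = json.loads(raw_output)
    score = data.get("score")
    if score not in ALLOWED_SCORES:
        raise ValueError(f"unexpected score: {score!r}")
    # "yes" means the documents are relevant, so an answer can be generated;
    # "no" means the question should be rewritten and retried.
    return "generate" if score == "yes" else "rewrite"


if __name__ == "__main__":
    print(route('{"score": "yes"}'))
    print(route('{"score": "no"}'))
```

Restricting the score to an enum is what makes the branch reliable: the condition only ever has to compare against two known strings instead of interpreting free-form text from the model.
<b>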
</b> <b>Let's save this, and you will now notice</b> <b>that our condition node</b> <b>gives us a generate path</b> <b>as well as a rewrite path. </b> <b>So let's add the logic for</b> <b>the generate and rewrite nodes. </b> <b>Let's go to Add Nodes, and within</b> <b>Sequential Agents,</b> <b>we can either use an</b> <b>agent node or an LLM node. </b> <b>I'm actually going to use the LLM node,</b> <b>as we don't need any tool calling for</b> <b>these specific agents. </b> <b>Let's attach the generate output to this</b> <b>LLM node, and let's</b> <b>call this guy generate.
</b> <b>I'm going to copy this node, and let's</b> <b>attach the rewrite</b> <b>output to the LLM node,</b> <b>and let's call this node rewrite. </b> <b>Excellent. </b> <b>Let's also attach end</b> <b>nodes to these two agents.
</b> <b>So I'll just copy this</b> <b>end node, let's attach it,</b> <b>and let's add an end node</b> <b>to this LLM node as well. </b> <b>Great. </b> <b>For the generate node, let's click on</b> <b>additional parameters,</b> <b>and let's enter something like "respond</b> <b>with the word generate.
"</b> <b>Let's also go to the rewrite node, and</b> <b>let's say "respond</b> <b>with the word rewrite. "</b> <b>We will work on these two nodes in a</b> <b>minute, but let's</b> <b>first test this process out</b> <b>to ensure that our</b> <b>conditional agent is working. </b> <b>Let's go to chat, and let's ask something</b> <b>that's relevant to the restaurant,</b> <b>like "What are your specials?
"</b> <b>Right, so our customer support agent was</b> <b>able to fetch</b> <b>information about the specials</b> <b>from the knowledge base, and we can see</b> <b>that the condition</b> <b>agent correctly determined</b> <b>that the generate path should be</b> <b>followed, and</b> <b>therefore the generate agent,</b> <b>this guy over here,</b> <b>output the word generate. </b> <b>Let's ask a question that's not related</b> <b>to the restaurant,</b> <b>like "Do you sell crypto? "</b> <b>But now we run into a very strange issue.
</b> <b>What you'll notice is that our</b> <b>conditional agent decided to call the</b> <b>generate path as well, </b> <b>although we know that the</b> <b>context from the</b> <b>knowledge base is not relevant to the question. </b> <b>Now, why is that?
</b> <b>I intentionally made this mistake to</b> <b>demonstrate the difference</b> <b>between using the agent node</b> <b>and using an LLM</b> <b>node combined with a tool node. </b> <b>At the moment, we are grabbing the</b> <b>response from the agent</b> <b>node, so this response over here,</b> <b>and passing that to the</b> <b>condition agent node as context. </b> <b>And if you think about it, this text here</b> <b>is actually related</b> <b>to the user's question.
</b> <b>The user asked if the restaurant sells</b> <b>crypto, and this response is saying, "No, </b> <b>we do not sell crypto," so</b> <b>the two are definitely related. </b> <b>However, the response from the tool, in</b> <b>other words, the</b> <b>document from the vector store,</b> <b>is not related to the question at all.
</b> <b>So what we should be doing is comparing</b> <b>the response from the vector</b> <b>store to the user's question,</b> <b>and not the response from the agent, so</b> <b>this text over here. </b> <b>Now, how do we do that? </b> <b>First, let's delete the agent node.
</b> <b>Let's add two new nodes to the canvas. </b> <b>First, let's add an LLM node, and let's</b> <b>also add the tool node. </b> <b>Let's add the start node to our LLM node,</b> <b>and I'm also going to</b> <b>move all of these nodes up</b> <b>just to create a bit of space, like so.
</b> <b>Then let's put this tool node next to the</b> <b>LLM node, and there we go. </b> <b>Let's attach the LLM node to the tool</b> <b>node, and let's attach the</b> <b>tool node to the condition node,</b> <b>like so. </b> <b>So for the LLM node, let's call it</b> <b>customer support again,</b> <b>just like we did with the agent node.
</b> <b>In fact, it's going to be very similar to</b> <b>the agent node as well. </b> <b>When we click on additional parameters,</b> <b>we can go ahead and</b> <b>enter the exact same system</b> <b>prompt that we passed to</b> <b>the agent node earlier. </b> <b>And for the human prompt, let's simply</b> <b>add a placeholder for</b> <b>the user's question.
</b> <b>Let's click on format prompt values, and</b> <b>let's select question, like so. </b> <b>All right, let's move</b> <b>on to the tool node. </b> <b>Let's call the tool node retrieve, and</b> <b>then all we have to do</b> <b>is take our retriever tool</b> <b>and attach it to the tool node.
</b> <b>And I'm actually going to</b> <b>move these guys up, like so. </b> <b>Great. </b> <b>So now the tool node will reach out to</b> <b>the vector store and</b> <b>return the documents,</b> <b>and we can then take the documents and</b> <b>pass them into this</b> <b>condition node as context.
</b> <b>So within the condition node, let's go to</b> <b>additional parameters,</b> <b>let's go to format prompt values, and</b> <b>let's change the</b> <b>context from sequential agent</b> <b>to the sequential tool</b> <b>node's output instead. </b> <b>So now instead of</b> <b>receiving some text from an agent,</b> <b>we will now receive the documents</b> <b>themselves from the vector store,</b> <b>and then compare that</b> <b>to the user's question. </b> <b>Let's test this out.
</b> <b>So in the chat, let's try,</b> <b>"What are the current specials?" </b> <b>We can see the retriever tool fetched this</b> <b>information from the vector store,</b> <b>and the conditional agent decided to</b> <b>generate the response. </b> <b>Let's ask an invalid question.
</b> <b>Do you sell crypto? </b> <b>And of course, the retriever tool fetched</b> <b>information from the vector store. </b> <b>Those documents were</b> <b>passed to the conditional agent,</b> <b>which determined the document is not</b> <b>related to the user's question,</b> <b>and is now therefore</b> <b>triggering the rewrite path.
</b> <b>Great. </b> <b>That is exactly what we wanted. </b> <b>So hopefully, you now have a better</b> <b>understanding of when to use tool nodes,</b> <b>instead of agent nodes with tool calling.
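</b>
To make that difference concrete, here is a toy Python sketch. It replaces the LLM grader with a naive keyword-overlap check (an assumption made purely for illustration; the real grader is an LLM), and the two sample texts are made up. The point is only that the agent's reply echoes the question, while the raw retrieved document does not.

```python
import re

# Small stop-word list so that filler words do not count as a match.
STOPWORDS = {"do", "you", "we", "our", "the", "a", "an", "is", "are",
             "not", "no", "does", "and", "of", "to"}


def keywords(text: str) -> set:
    """Lower-case words in the text, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def toy_grade(question: str, context: str) -> str:
    """Return "yes" if the context shares any keyword with the question."""
    return "yes" if keywords(question) & keywords(context) else "no"


question = "Do you sell crypto?"
agent_answer = "No, we do not sell crypto."          # made-up agent reply
retrieved_doc = "Our menu includes steaks and sushi."  # made-up document

if __name__ == "__main__":
    print("grading the agent's answer:", toy_grade(question, agent_answer))
    print("grading the retrieved doc: ", toy_grade(question, retrieved_doc))
```

Grading the agent's reply always looks relevant because the reply is about the question by construction, which is why the condition agent should receive the tool node's documents instead.
<b>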
</b> <b>Now let's implement our generate node. </b> <b>This one's quite simple. </b> <b>Within additional</b> <b>parameters, change the system prompt to</b> <b>"You are an assistant for</b> <b>question answering tasks.
</b> <b>Use the following pieces of retrieved</b> <b>context to answer the question. </b> <b>If you don't know the answer, just say</b> <b>that you don't know. "</b> <b>And then we have placeholders for the</b> <b>user's question and the context.
</b> <b>Then let's also enter a human prompt:</b> <b>"Given the user question and context,</b> <b>answer the user's query." </b> <b>Let's click on format</b> <b>prompt values. </b> <b>For the question, let's select the</b> <b>question from the chat window.
</b> <b>And for the context, let's use the</b> <b>sequential tool node's output. </b> <b>Great. </b> <b>Let's close this, and that's everything</b> <b>we need to do for the generate node.
</b> <b>Let's go to our rewrite agent. </b> <b>Let's click on additional parameters. </b> <b>Let's change the system</b> <b>prompt to something like,</b> <b>"You are a helpful assistant that can</b> <b>transform the query to</b> <b>produce a better question.
"</b> <b>Let's save this. </b> <b>Then in the human prompt, let's enter,</b> <b>"Look at the input</b> <b>and try to reason about</b> <b>the underlying</b> <b>semantic intent or meaning. </b> <b>Here is the initial question.
</b> <b>Formulate an improved question. "</b> <b>Let's save this. </b> <b>Let's go to prompt values.
</b> <b>Let's set the question placeholder to the user's question. </b> <b>And that's all we have to</b> <b>do for this node as well. </b> <b>So what will happen with the rewrite node</b> <b>is that this LLM over here</b> <b>will try to rewrite the</b> <b>user's question in a different way,</b> <b>hoping we'll get a different answer.
</b> <b>Now, obviously, we don't want to stop the</b> <b>process at this point. </b> <b>So I'm going to delete this end node. </b> <b>So after this LLM has</b> <b>rewritten the question,</b> <b>we want to loop back to the customer</b> <b>support agent over here.
</b> <b>So let's simply copy the name of this</b> <b>node and paste it into</b> <b>the loop node, like so. </b> <b>And that's actually everything we need to</b> <b>do to build this application. </b> <b>Then in the chat, let's</b> <b>ask, "Do you sell crypto?"
</b> <b>So the retriever reached out to the</b> <b>vector store and</b> <b>retrieved this document over here,</b> <b>which is not related</b> <b>to the user's question. </b> <b>Therefore, the rewrite path was called,</b> <b>and the rewrite agent</b> <b>has rewritten the</b> <b>question to something like this,</b> <b>which produces a final output of "No,</b> <b>Oak & Barrel does not</b> <b>sell cryptocurrency."</b> <b>So this is extremely helpful if the</b> <b>user's question is indeed</b> <b>related to the restaurant,</b> <b>but is perhaps not</b> <b>phrased in the correct way.
</b> <b>If you liked this video,</b> <b>then please hit the like button</b> <b>and subscribe to my channel</b> <b>for more Flowise content. </b> <b>I'll see you in the next one. </b> <b>Bye bye.</b>