Analysing Chatflows using LangSmith - FlowiseAI Tutorial #7

9.78k views · 2,214 words
Leon van Zyl
#flowiseai #flowise #openai #langchain Observability in LLM applications is a critical component to...
Video Transcript:
In this series, we've had a look at creating quite a variety of chat flows using Flowise, and I think you will agree that the complexity is increasing quite a bit. One of the biggest challenges is keeping track of what these LLM applications are doing behind the scenes. And this will become an even bigger issue once we start moving into more complex applications like agents with tools. But thankfully, there is a solution for debugging and monitoring these LLM applications in detail, and that is called LangSmith.

If you're unfamiliar with LangSmith, it's a platform that was developed by LangChain. And if you weren't aware, Flowise actually uses LangChain as one of its underlying frameworks. What this product allows us to do is monitor each and every step within our application in detail.
And there are some other benefits as well. For instance, we can view the amount of tokens used by our application. You might have noticed that I was using LangSmith in a few of my videos, but it wasn't available to the public at that time, so there was no point in trying to demonstrate it. However, LangSmith launched to the public just a few days ago, so I highly recommend signing up for a LangSmith account. And if I go to pricing, you will notice that it's free to use as a developer. Of course, for more serious implementations, you can upgrade to one of these packages, but in this series we will be using the free developer account so that you can follow along. So go ahead and click on Get Started and sign up for your free account. After logging in, you should be presented with a dashboard like this, and at the moment we do not have any projects.
What we'll do is I'll actually show you the behavior of four different projects. Let's start with a very simple implementation, which is just a standard LLM Chain. This is nothing fancy, but I do want to use it to show you the basic functionality of LangSmith. So this is simply an LLM Chain node with a very basic prompt template that will accept a subject as input and then generate a joke. And for the model, I'm simply using the GPT-3.5 Turbo Instruct model.
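For reference, here is roughly what that same chain looks like in LangChain's Python SDK, which is what Flowise assembles behind the scenes. This is a minimal sketch; the exact prompt wording is an assumption, since the flow only tells us it takes a subject and produces a joke.

```python
# Minimal sketch of the equivalent LLM chain in LangChain (Python).
# The prompt text is an assumption based on the description above.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Tell me a joke about {subject}")
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run(subject="horse"))
```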
If we test this, we'll simply get a response back as per usual. And if we go over to LangSmith, we won't see anything within these projects. Now, in order to analyze this chain, we can simply go to Settings, and within Settings we have this Analyze Chat Flow option. Here we have a few options of different service providers; we are interested in LangSmith. Let's go to connect credentials and create our LangSmith credential by giving it a name like LangSmith API. Then let's get our API key from LangSmith by going to API Keys. Let's create a new key and copy it. We can close this pop-up, and note that I will be deleting this key after this recording, so please use your own key. Then, back in Flowise, we can simply paste in that key and hit Add.

Providing a project name is optional, but I do recommend it. Otherwise, all of these logs will be stored against the default namespace. And of course, you can specify a project for each and every chat flow, but I'll just create one project name for all of these flows called Flowise. We can then enable this analysis by setting this toggle to on. Let's save this change.
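Under the hood, this toggle corresponds to LangSmith's standard tracing environment variables. In a plain LangChain script you would enable the same tracing like this (a sketch; the key value is a placeholder):

```python
# Enabling LangSmith tracing in a plain LangChain script.
# Flowise's Analyze Chat Flow toggle configures the same thing for you.
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"   # turn tracing on
os.environ["LANGCHAIN_API_KEY"] = "ls__..."   # placeholder: your LangSmith API key
os.environ["LANGCHAIN_PROJECT"] = "Flowise"   # traces land in this project
```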
Now let's test this again. I'll just enter horse. And when we go back to LangSmith, we can go to Projects, and we will now see that Flowise project. When we open it, we can see the trace for the chat that we just executed. We can see that it was indeed an LLM chain that triggered this trace, and that our input was horse. We can see the date and time, and the time it took for this to execute. And something that most of you will find extremely valuable is the cost of this execution and the amount of tokens that were used. I know a lot of you have been asking me in the comments how we can keep track of token usage or cost in these Flowise chat flows, and this is exactly how it's done.
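If you ever want the same numbers from code rather than the LangSmith UI, LangChain's OpenAI callback reports them too. A small sketch, reusing the chain from the earlier example:

```python
# Token usage and cost from code, via LangChain's OpenAI callback.
# LangSmith shows the same figures per trace in the UI.
from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    chain.run(subject="horse")

print(cb.total_tokens, cb.prompt_tokens, cb.completion_tokens)
print(f"${cb.total_cost:.6f}")
```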
Now we can see additional information about this by clicking on that record. On the left-hand side, we can see all the different steps that were executed as part of this trace. Since this was a very simple application, there really only was one step, and that was the call to OpenAI. So we can see the content of that prompt template, we can see that the placeholder was indeed populated with horse, and we can then see the generation from the OpenAI model.

You can actually take this one step further by tweaking this prompt within LangSmith itself by clicking on Playground. Within Playground, you have the option of tweaking this prompt and then executing it, and you can also play with the different settings so that you can copy these back into your Flowise application. If you want to use this, you simply have to click on Secrets and then paste in your OpenAI API key. Now let's have a look at a few more examples, and you might also find this fascinating.
Let's have a look at this LLM chain with an output parser. Here we have a very simple LLM chain with a prompt template of "generate a comma-separated list of synonyms based on the following word", and we will grab the word from the chat box. For the model, we are using the GPT-3.5 Turbo Instruct model. And lastly, we also have a custom list output parser in this flow, which will take the output from the LLM and convert it into a JSON structure with a list. So before we run this, let's enable LangSmith by clicking on Settings and then Analyze Chat Flow. Within LangSmith, let's provide our LangSmith credentials, let's provide a project name like Flowise, and let's set this to on. Let's save this. Then let's run this in the chat and enter the word happy. And this returns this JSON list of values.
Now, what's very interesting in this example is that, based on this output, we actually have absolutely no idea how this list output parser works. But if we go to LangSmith and click on this trace, we can see that the word placeholder was passed into the model as happy. We can also see an additional property which we didn't specify in the prompt template: there's this placeholder called format instructions. What the output parser module did was inject this piece of text to say your response should be a list of items separated by a comma, along with an example of values. And that meant that the output of the model is now in this JSON list format. How awesome is that?
And there's a lot of</b> <b>information in here. But we</b> <b>can find some useful information</b> <b>like the model that was used during</b> <b>runtime, the temperature and the output</b> <b>parser that was used. </b> <b>Let's have a look at two more examples.
Let's have a look at this conversation chain. This is nothing too complex: it's simply a conversation chain with buffer memory assigned to it. For the model, I'm using the GPT-3.5 Turbo model, and I'm using a chat prompt template with a system message of "you are a helpful AI assistant". For the human message, we're simply grabbing the input from the chat window.
Let's enable LangSmith by clicking on Settings, then Analyze Chat Flow. Within LangSmith, let's select our credentials, enter the project name, and turn this on. Let's save this. Then, in the chat, let's enter something like "the passphrase is LangSmith". The reason I'm showing this example is to show you how memory works within these applications.
If we go back to LangSmith, we can see that the conversation chain is actually considered a runnable sequence. A runnable sequence is simply a LangChain concept, which we won't really get into in this video, but we can still see our input. And if we click on this, we can see a very simple list of steps at this level. The highest node simply shows us the input as well as the response that was returned. But we can also see that there are actually five hidden steps within this. So if we click on Most Relevant, we can click on Show All.

What we can see here is this runnable map step, which effectively receives our input as its input and then returns this output structure, which includes the chat history. This output is then passed along to the next step in the trace, which is the chat prompt template. And if you recall, this chat prompt template has a system message as well as a human message, which is a dynamic value. So if we have a look at the trace, we can see that the system message is indeed the hard-coded value that we provided, while for the human message, this input value is added to the human field. So far, this is very straightforward.
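That chat prompt template can be sketched with LangChain primitives like this (the system message is the one from the flow; the input variable name is an assumption):

```python
# Sketch of the chat prompt template seen in the trace: a hard-coded
# system message plus a dynamic human message.
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant."),
    ("human", "{input}"),
])

print(prompt.format_messages(input="The passphrase is LangSmith."))
```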
If we go to chat OpenAI, we can see the output from the chat prompt template being passed in as the input for the model, and we can see the output from the model. We can also see this output parser step, and that is simply because the conversation chain will inherently convert its output to a string. This is the final response that we saw in the chat window. But if we continue this conversation, and enter something like "what is the passphrase", our model is able to answer this question because of the memory.
If we go back to LangSmith, we can now see exactly how memory works. When we click on this latest trace, let's change this to Show All. And if we now click on runnable map, we will notice that this looks a little bit different to the first time we executed this chain. Of course, we do see our initial input, but the output now contains our input message as well as the chat history, which contains the previous human and AI messages. And we can go into these individual traces to see how this works. Firstly, this step simply accepts the human input and passes it back as a passthrough. Then the second step in the process is responsible for fetching the chat history and passing that back in its response. So now when we click on chat prompt template, we will notice that the input contains our message along with the chat history. And that affects the final structure of the chat prompt template, which not only includes the system message and the human message, but also the chat history. So if we go to the next step, which is the chat OpenAI call, we will see that the input now includes everything, including the history. And that is why the model was able to answer this question.
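The runnable map step in this trace can be approximated with LangChain's runnable primitives: one branch passes the input straight through while the other fetches the history. A sketch, with the memory lookup stubbed out:

```python
# Sketch of the runnable map: pass the input through unchanged and
# fetch the chat history in parallel.
from langchain_core.runnables import (
    RunnableLambda,
    RunnableParallel,
    RunnablePassthrough,
)

chat_history = []  # stand-in for the buffer memory store

runnable_map = RunnableParallel(
    input=RunnablePassthrough(),                           # echo the human input
    history=RunnableLambda(lambda _: list(chat_history)),  # fetch prior messages
)

print(runnable_map.invoke("What is the passphrase?"))
# -> {'input': 'What is the passphrase?', 'history': [...]}
```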
Now let's have a look at the final example, and that is this RAG chatbot that retrieves information from a Pinecone database in order to provide an answer. This is the exact same example that we used in the previous video, where we scraped the LangChain documentation so that the model could answer questions on LangChain Expression Language. So let's enable LangSmith by clicking on Settings, let's go to Analyze Chat Flow, let's load our credentials, give a project name, and turn this on. Let's save this. And in the chat, let's ask "what is LCEL?" As expected, we are receiving the correct response.

We can click on this latest trace and see that it is quite an extensive list of steps, and we won't go through all of it. But what we are typically interested in is which documents were used as part of the context, and we can get those by clicking on find docs. Within this output, we can see information about the documents that were used, along with their metadata. We can scroll through these and see all of those different documents, which can be very useful for troubleshooting situations where a problematic or outdated document is returning the wrong information. We could then go and delete that document from the knowledge base.
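You can pull the same document list from code by querying the retriever directly. A sketch, assuming a classic LangChain Pinecone setup; the index name and credentials below are placeholders:

```python
# Sketch: inspect the retrieved documents directly, outside the trace.
import pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

pinecone.init(api_key="YOUR_PINECONE_KEY", environment="YOUR_ENV")
vectorstore = Pinecone.from_existing_index("langchain-docs", OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

for doc in retriever.get_relevant_documents("What is LCEL?"):
    print(doc.metadata, doc.page_content[:80])
```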
I hope you found this video on using LangSmith in Flowise useful, and please let me know if you would like a video that goes over LangSmith in more detail. Also, please hit the like button and subscribe to my channel for more Flowise and LangSmith content.