HuggingFace is a platform that hosts thousands of AI models that we can integrate into our Flowwise applications for free. So if you're tired of paying for services like OpenAI and Anthropic, then this video might be for you. Before we proceed, I do need to issue a fair warning.
Although it can be a lot of fun playing with these models, it can also be exceptionally frustrating getting them to work correctly. I will give you practical advice in this video on how to improve the results from these models, but you might have a newfound respect for services like OpenAI and Anthropic after you punch a hole through your PC trying to get these models to behave correctly. That's not to say that these models don't have their place, though; I've used some of these models to perform simple tasks for free while leaving the more complex tasks to models like GPT and Claude.
So now that that's out of the way, let's first have a look at HuggingFace. We can access HuggingFace by going to huggingface.co, and we can then search for specific models, or we can click on the Models menu to get a list of all the available models.
And on the left-hand side, we can filter these models, so we have multimodal models, computer vision, natural language processing, and more. If we only want to look at text generation models, we can simply click on this Text Generation filter, and this will give us all the text generation models; we can see that there are nearly 70,000 of them at the time of this recording. We can see the specific details of a model by clicking on its link, for instance this Mistral model, which is extremely popular, and on this page we can see additional information about the model.
And most importantly, if we look on the right-hand side, we can see this section called "Inference API", along with text that says this model can be loaded on the Inference API, which means we are able to integrate with this model from tools like Flowwise. If you do not see this "Inference API" section, it means that integration with this model is not set up on HuggingFace and you might have to self-host that model, and that is not something we'll look at in this video, as I only want to focus on the free options.
So as long as you see this "Inference API" section, we are good to go. We can also test out the model by sending a message over here, like "What is the capital of South Africa?", and we can see the type of responses that we can expect from this model.
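(Side note: that test widget is just a friendly front end for the Inference API itself. If you'd rather script the same test, a minimal sketch in Python might look like this; the model id is just an example, and the access token is the one we create later in this video.)

```python
import os
import requests

# Example model id -- substitute whichever model you picked on HuggingFace.
MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.1"
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

# The access token is created under Settings > Access Tokens (shown later in the video).
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": "What is the capital of South Africa?"},
)
print(response.json())
```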
So if we're happy with these results, we can go ahead and implement this model in Flowwise. Back in Flowwise, I've created a new chat flow, so let's start by adding a new node. Let's go to "Chains", add an LLM Chain, and start by adding our LLM.
So under "Chat Models" and let's</b> <b>add the "Chat HuggingFace"</b> <b>node. And this node allows us</b> <b>to call those "Inference APIs". Let's</b> <b>connect this to our LLM chain and let's</b> <b>set up our HuggingFace</b> <b>credentials.
So under this dropdown, click on "Create New" and give it a name like "HuggingFace API". Now, for the HuggingFace API key, go back to HuggingFace, then create a new account or log into your account.
I'm already logged in, so after logging in, simply go to "Settings", then "Access Tokens", click on "New Token", and give your token a name. You can leave the type as "Read", then click on "Generate a token". Now go ahead and copy this token, paste it into the API key field in Flowwise, and click on "Add".
Now we need to specify the model that we'd like to use. To get the name, go back to your model's page on HuggingFace, simply click on the "Copy" button next to the model name, and paste that into the Model field. We will not look at the Endpoint field in this video, as that's for the endpoint that gets generated when you decide to self-host your models.
And basically, how that works is: let's say you go to a model that does not have this Inference API set up. If you can see this "Deploy" button, you can actually go and set up your own inference endpoint for that model. You simply click on this option and load your credit card details.
You can pretty much leave all of these settings at their default values, and you can make this a public API, as an example. Since I haven't loaded payment details, I'm not able to host this, but if you do load your details, you will be able to deploy the model, and at the end of the deployment you will receive a URL endpoint which you can copy and paste into this Endpoint field.
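(For what it's worth, a dedicated endpoint accepts the same kind of request as the serverless Inference API; you just post to your own URL instead of the shared one. A rough sketch, with a made-up endpoint URL:)

```python
import os
import requests

# Hypothetical URL -- you receive the real one once the deployment finishes.
ENDPOINT_URL = "https://my-endpoint.us-east-1.aws.endpoints.huggingface.cloud"

headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

response = requests.post(
    ENDPOINT_URL,
    headers=headers,
    json={"inputs": "Tell me a joke about dogs."},
)
print(response.json())
```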
But again, I'm only going to look at the free models in this video; some of you might find this useful, though. Right, all we have to do now is add our prompt template.
So under "Add Nodes", let's go to "Prompts" and add a Prompt Template node. Let's connect this prompt template to the chain and set the template. Let's do something like "tell me a joke about", and then add a variable in curly braces called "subject".
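(If you're curious what that curly-brace variable does under the hood: Flowwise is built on top of LangChain, so the node behaves much like a LangChain prompt template, where "{subject}" is filled in at run time. A quick sketch in Python, though Flowwise itself uses the JavaScript flavor of LangChain:)

```python
from langchain.prompts import PromptTemplate

# "{subject}" is a placeholder that gets substituted when the chain runs.
template = PromptTemplate.from_template("tell me a joke about {subject}")
print(template.format(subject="dog"))  # -> tell me a joke about dog
```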
Let's save this, save the chat flow, and test this in the chat. Let's expand the chat and try it out: let's enter something like "dog", and now you will notice a very strange-looking response coming back.
And it kind of looks like the model is generating more than one joke about dogs, and it has also added this random word at the start of the response. This is one of those things that can be extremely frustrating when working with these open-source models, but I am going to show you how to improve this.
Let's go back to the model page and see if we missed something. If we scroll down on this page, we can see this section here on the instruction format, and it's very important that you get this right. Most models, if not all, will have some sort of section in the documentation explaining how to prompt them.
So here we can see in this example that they start a prompt with this "<s>" token, and then they use these square brackets to pass an instruction to the model. So in between these brackets we have the first instruction, then we expect the model's answer, and then we can pass in a follow-up instruction. So let's see what happens if we simply copy this example and change the prompt template.
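(For reference, the instruction format shown on the model card looks roughly like this, where "<s>" and "</s>" are the model's start- and end-of-sequence tokens and "[INST]" ... "[/INST]" wraps each user instruction:)

```
<s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]
```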
Let's actually remove this placeholder text that was pasted in from the example, and let's replace "Instruction" with the text that we just deleted: tell me a joke about "subject".
Let's save this, save the chat flow, and see what happens now. Let's clear the chat and enter "dog" again, and we can see the response has improved, but it's still adding a whole bunch of funny words and characters towards the end.
So let's continue with these instructions and also add the "<s>" tokens to the prompt: before "[INST]", let's add the opening "<s>" token, and at the end, let's add the closing "</s>" token.
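(So, to recap, the template as built in this video ends up looking something like the line below; whether the trailing "</s>" strictly belongs there depends on the model, but it's what cleaned up the output here.)

```
<s>[INST] tell me a joke about {subject} [/INST]</s>
```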
Let's save this, save the flow, and see what we get now. Let's pass in "dog", and this time the response has greatly improved. I think this is one of those issues that a lot of you have been struggling with, and it really just comes down to getting the instructions on how to prompt these models from the documentation and then implementing those instructions in your prompt templates.
So I hope you found this video useful, and if you did, please hit the like button, subscribe to my channel, and let me know down in the comments which open-source models you like to use and which prompts you use to get the best out of those models. If you liked this video, then you might also like this other video where we run these open-source models locally using Ollama.