why you suck at prompt engineering (and how to fix it)

Liam Ottley
📚 My Free Resource Hub & Skool Community: https://bit.ly/3uRIRB3 (Check “Youtube Resources” tab for...
Video Transcript:
You probably suck at prompt engineering, and in this video I'm going to tell you why, how you can fix it, and how you can avoid being the guy in the middle of this midwit meme. That might seem a little off topic, but give me a second and I'll explain how it applies to the majority of people who are trying to do prompt engineering and build AI systems, and why it's probably holding you back: you're stuck in this midwit range. If you haven't seen this meme before, basically the low IQ people
and the high IQ people converge on the same solution, as you see here. We have the guy using Apple Notes on one side and the genius using Apple Notes on the other, and in the middle you have the midwit who's overcomplicating it and making it difficult and painful for himself. Then we have the same thing with Nescafé Classic on both sides, and in the middle the midwit struggling with all these different types of coffee and fancy methods. So how does this apply to prompt engineering? I know
you're asking, considering you clicked on a video about prompt engineering. It's actually a not-so-bell curve when it comes to prompt engineering. On the far left we have the stupid person who is just using ChatGPT and prompting it as they wish, kind of just throwing things in there. On the far right we have what we're trying to get you to after this video: a genius who has a toolkit of prompts and understands the science behind it. And in the middle we have probably you right now, which is,
I mean, no disrespect to these other YouTubers, because I've made videos on prompting myself, so I'm part of the problem here. But what these are all about is ChatGPT prompt templates: taking the thinking away from you and putting it in the hands of a template that they've created. I'm not going to shit on my own videos too much, because those videos were talking more conceptually as well, so I'd say I'm on that line. The content of this presentation, in this video, is
intended to take you from this plateau of someone trying to do prompt engineering without actually understanding the science behind it, which is what we're going to go into. The point of this video is to take you from someone who's on that plateau, as you can see here, and get you up to the genius: a very capable prompt engineer who's able to do great things with these language models. It's so important because your ability to prompt them and provide instructions to these models directly impacts your ability to
get value out of them. If there's this amazing new technology called LLMs and you're better at using them, you're going to go further in the AI space and further in life. Continuing on, you may be wondering: why this new style, why is the camera on a different side, why is everything so casual? That's because I've been, not wasting, but spending a lot of time on my videos for the past while, as you may have noticed.
Some of you are starting to think that I'm a YouTuber, and I've never really thought of myself as a YouTuber. I'm a businessman, and YouTube is how I get clients for my business. As much as I love making videos and teaching you guys everything, what I really like doing is working on my business, working with my team, building the cool software we're building through Agentive, the work at Morningside, and
also the cool stuff we do with my education community, teaching them how to start their own businesses like I did. So expect fewer fancy videos that require a lot of time and editing, and more videos like this one when I have something interesting to share. This video is coming out of me seeing so many people in my community not understanding this fundamental skill. It is so fundamental, but people have this misconception that they know how to do it, which I'm
going to absolutely destroy for you in this video, and then rebuild your skills as a prompt engineer. So I'm doing this because if I have something to talk to you about and I think it's important for you all, I'm going to share it. You may also be wondering why I do this at all: it's because I have a SaaS that helps agency owners build AI solutions for businesses. If I don't teach you guys how to do prompt engineering, you're never going to use my SaaS, so
I have to do this stuff so that I can succeed and make all the money with the SaaS that I want. You guys get a byproduct of me trying to build my SaaS, which is helping you learn these things. Anyway, enough antics: why you're probably bad at prompt engineering. We have conversational prompt engineering versus single-shot. Conversational is what everyone thinks prompt engineering is. They go onto ChatGPT and go, hey, I've got this cool prompt template, and they chuck it in there and they can get
some responses from it, and they're like, man, I'm so good at this. Then they switch off and think they're a prompt engineer and know how to do this stuff. This, of course, is human operated. There are follow-up prompts, so you can say, oh, could you please modify this a little bit, and because of these follow-up prompts it's very forgiving in terms of what you can say and how you can tweak it to get the right responses. This really is just good for personal use. If
you're working at a job and you want to streamline some of the workflow you do there, great. ChatGPT is incredible software and I use it all the time as well, so I'm not knocking it, but it is conversational prompting. On the other side is single-shot prompting, which is something we can actually bake into a system that can be automated and be part of an ongoing system or flow where an AI task is embedded in it. There are no
follow-up prompts, because in most cases there's no human involved. There's no room for error in that case. You can't have ChatGPT saying "hey, here is the answer" and then putting in the answer; it just needs to give you the answer every single time, or else the system's going to break. If we can prompt it into something that is reliable, we can have a very scalable system with AI built into it, which is ideal for these AI-assisted systems, and this is really how you can create value. So the
benefit of conversational prompting skills, which many of you will have, I'm sure, is that it might make you better at your job and might make your boss a bit more money because you're able to do more work, and maybe make you a bit more money in the process. But the benefit of these single-shot systems, where we can build an AI task to do a specific function reliably every single time, is that it will allow you to build AI systems worth potentially thousands of dollars apiece, as I've done and as many
people in my community have done as well. If you don't believe me, I don't care. Furthermore, on this point of why you should take prompt engineering seriously: Andrej Karpathy here says the hottest new programming language is English, and this is no dummy. He is a founding member of OpenAI and a leading AI researcher. What he means by saying the hottest new programming language is English is that being able to write instructions in English is going to allow you to, one, generate code if you want to, so you can translate from
English to code. That's one way of programming in English, technically. But another way is that if you can write effective prompts, you can replace the programming required for a massive program or script: you can write a prompt that effectively does all of the things that script would have done. So you can replace large blocks of code with a well-written prompt, which is really what I want you guys to focus on: I can have the abilities of a developer if I can write these prompts well, using
LLMs properly. Furthermore, there's also this guy, Liam Ottley. I've founded a couple of AI companies: I have my own AI agency, Morningside AI; I have my own AI education community, my AAA Accelerator; and I also have my AI SaaS called Agentive, which is really what my focus is on right now. I've got some pretty smart people working for me. I'm not the brains of the operation anymore; I hope I was at one point. But my CTO Spencer has like five or six years of NLP experience and he
does some really cool stuff for us, and a lot of what I'm sharing here, in terms of how you should be doing your prompt engineering, what I've learned, and what I now use, is from him. You might think I'm just some goofball who's been doing YouTube for 12 months, but I do have a team, and I've paid people who are a lot smarter than me to give me this knowledge, so now I'm giving it to you. Now I want you to remember this: a well-written prompt can replace hundreds of
lines of code. Going back to what I said before, I think it's my quote, but I'm just going to say someone said it, because someone must have. That's essentially what you can do if you write a well-written prompt. Now here's an example. There's a video that will have just gone out on my channel where I manage my finances with AI. I set up a system where my assistant can send, or I can send, screenshots like these through the system, and out comes the other side
a tracker for all my expenses within my Notion. It automatically extracts the transactions from the screenshots, categorizes them, and stores them in my expense database within Notion. This is the system here; you can pause and take a look. Basically, it took me 2 hours to write a very good prompt that can successfully categorize, format, and then pass the data over to Notion, and that's ended up saving 8 hours per month. Not the best example, but you get the idea.
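A minimal sketch of what a single-shot step like this could look like in code, assuming the screenshot text has already been extracted. The category list, the prompt wording, and the function names are all hypothetical (the video does not show the real prompt), and the actual model call and the Notion write are left out. The point it illustrates is that the reply has to be machine-readable every time, so anything that is not exactly the requested format is rejected.

```python
import json

# Hypothetical category list; the real one is not shown in the video.
CATEGORIES = {"groceries", "transport", "eating out", "other"}

# Hypothetical prompt wording, written once and reused for every screenshot.
PROMPT_TEMPLATE = """Extract every transaction from the text below.
Reply with a JSON array only, no commentary. Each item must have the keys
"merchant" (string), "amount" (number) and "category" (one of: {categories}).

Text:
{screenshot_text}"""


def build_prompt(screenshot_text: str) -> str:
    """Fill the fixed prompt with the one value that changes per run."""
    return PROMPT_TEMPLATE.format(
        categories=", ".join(sorted(CATEGORIES)),
        screenshot_text=screenshot_text,
    )


def parse_transactions(raw_reply: str) -> list[dict]:
    """Parse the model's reply, raising ValueError if it is not exactly the requested format."""
    try:
        items = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError("reply must be a JSON array")
    for item in items:
        if not isinstance(item, dict) or set(item) != {"merchant", "amount", "category"}:
            raise ValueError(f"unexpected item shape: {item!r}")
        if item["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {item['category']!r}")
    return items
```

A reply that starts with chatter such as "Here is the answer:" fails the JSON parse and raises, which is the kind of failure a human would shrug off in a chat window but that breaks an automated flow.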
If you write a good prompt, you can replace code. For me to build this expense system with code would have taken a whole lot longer and it would have been extremely messy, but with the AI I can just throw all the information at it, say, hey, look, this is what I want you to do with it, and out come the transactions ready to go into Notion. And no, we're still not ready to move forward, because you need to understand how much depends on getting this one skill right, a skill that many people
don't have right. They think they can do conversational prompt engineering and that's going to be enough for them to go in and build these systems. But take AI voice systems, which are all the rage right now (I've done a ton of videos on them that you can go watch on my channel): if you don't have good prompt engineering skills, you can't do AI voice systems. If you don't have good prompt engineering skills, you can't create AI agents like GPTs. If you don't have good prompt engineering skills, you can't
build AI tasks into AI automations on Zapier, Make, etc., and you can't build custom AI tools on Relevance AI, Stack AI, and these other platforms. So get this thing right and watch the rest of this video. This isn't a retention hook for your TikTok brain; I don't care if you watch the rest of it. But I'm telling you, if you don't take the time to actually soak in the information I'm about to give you and get good at this prompt engineering skill, you
are not going to make any money in AI, because everything depends on it. Finally, what I want to do is a little comparison of the two different types of people you can be. You can either watch this video and come out on the right side here, or you can continue to do whatever you think you're doing when you're prompt engineering and be like the guy on the left, the midwit. He has a handy bag of prompt templates. He gets stuck when something doesn't work because he doesn't understand what
the template's even doing. So then he uses a more expensive and smarter model, like moving from GPT-3.5 Turbo to GPT-4 Turbo, and he goes, oh yeah, well, now it works, because he gets the model to do the work instead of himself. By doing this he creates slower and more expensive systems, and therefore he struggles to create systems that are actually valuable for clients, because if it's costing them a lot and it's really slow, there's less value for the client, right? Then, number six, he gives up on trying to start
an AI business and get into this AI solutions space, and then, like some of you guys in the comments, he becomes an "AAA is a scam" goofball and blames it on the model and not his inability to learn how to write English. On the right we have the guy that you want to be. He has a toolkit of prompt components and methods based on research, which I'm going to take you through in this video. He approaches problems like an engineer. He skillfully applies these techniques. He achieves the desired performance with the fastest and
cheapest model available: he uses the cheapest model he can get and uses his skills to make it do what he needs it to do. Therefore he's able to create lightning-quick and affordable AI systems for clients that create actual value, because they're cheap and they're fast, and therefore he actually makes money, because these clients are like, wow, this thing is awesome. And number seven: this guy then finds other AI Chads like him who know how to do prompt engineering and are making money with AI, and he and his friends all get AI rich.
Yes, I'm selling the dream there, but that is what's possible if you can get this thing right, and that is what I and a bunch of the other guys I was just with are doing. It's happening whether you like it or not. So be like this guy, don't be like this guy. There you go. Now we get into the perfect prompt formula for building AI systems, which is the meat and potatoes of this video. Beware of the prompt formula: as I mentioned, you don't want
to be the guy who relies on the formula. While I am giving you a formula in this video, I've put it in asterisks and used capital letters so that you understand I'm kind of taking the piss out of formulas, because what I'm teaching you here is the science behind them, so that if you run into an issue you'll understand: hey, look, I can apply this technique to try and fix it. You'll actually be able to write good prompts forever if you understand the stuff
I'm going to teach and you actually absorb it. The components of this prompt are role, task, specifics, context, examples, and notes, and behind each of these components is a related scientific paper, some research that has been done, or some prompting technique that has been discovered and backed up with a research paper. You can see on screen here we have role prompting, chain-of-thought prompting, EmotionPrompt, few-shot prompting, and lost in the middle. All of these are going to be covered in the next section of this video, so let's jump into
it. Oh, before we do that: what each of these techniques gives you is an increase in accuracy or performance for your prompts, and I'm going to retention-hook you here with all these question marks, because over time we're going to reveal just how much of a performance improvement you can get. If you stack all of these up together, you get an increase in performance on your prompt. A lot of these are very easy to implement, but you're going to get a massive increase. I'm not going to tell you how much it is,
but it's a huge increase just by applying these simple techniques. We're going to be using an example for this video, which is an email classification system, and the AI task here in the middle is where we're going to be sending our prompt. In this case, someone comes onto a company's website and fills out a form; the form submission gets sent by email to the company, to the CEO or the ops guy, and he gets it and then normally
has to read through it, classify it, and take action from there. What we're going to be doing is imagining a system where there is an AI task, an AI node in Make.com or whatever you want to use, where the email comes in and is classified by our prompt into an opportunity, needs attention, or ignore label. So it's a super basic system I wanted to use as an example here. Let's get into it. We're going to be building up a prompt over time, showing how we can apply these
techniques to make this thing perform better. Starting off, we have the typical ChatGPT prompt. If you asked any midwit, well, not even a midwit, this guy's the stupid guy, if you asked any regular bottom-feeder ChatGPT user, they'd probably give you a prompt like "classify the following email into ignore, opportunity, or needs attention labels" and then they'd paste in the email, right? So this is our starting point. This is the typical ChatGPT prompt, and this is as far on the IQ scale to the
left as you can go. We're breaking this down by component, and we're starting off with the role. I know you TikTok brains are probably going to look at this and be like, ah, that's a lot of writing, but you can just pause the video. I'm not going to go over all of it, and I think some of you already know some of these components. Role prompting is something you've definitely done before, but I want to draw attention to the research results, marked with this little rocket ship to show that it's increasing
the accuracy. When you assign an advantageous role in your role prompting, by saying something like "you are an email classification expert trained to assist with this", it can increase the accuracy and performance of your prompts by 10.3%. Secondly, if you give complimentary descriptions of its abilities to further increase accuracy, you can get a 15 to 25% increase in total. This is as simple as, here's the example: you are a highly skilled and creative short-form content script writer (that is the role) with a knack for crafting engaging, informative,
and concise videos. So you add a role, then you give it key qualities like engaging, informative, and concise, and you basically hype it up and tell it, man, you're so amazing at this, this, and this. You need a role that is strong and advantageous to what it's doing: if you're solving a math problem, "you are an expert math teacher", and then you can add some of the key qualities after that. So, takeaways here: select a role that is advantageous for the specific task, e.g. math teacher
for math problems, and then enrich the role (I like that word, enrich) with additional words to highlight how good it is at that task. Super simple. That's role prompting. So this is what we're going to be doing to tie everything together in this video, which is a before and after. This was the low-IQ one, remember, so this is our starting point, and here we have what happens after we add in the role. You're going to need to pause this, because as this thing
gets bigger it's kind of hard for me to put the whole prompt on the screen. In the before and after, you have the role prompt here highlighted, well, lowlighted in black, so you can see what we've changed. We've still got the task, the bit from before, but it's now part of a longer prompt, and we have the role included as well: "You are an experienced email classification system that accurately categorizes emails based on their content and potential business impact." Great. So, task now.
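A rough sketch of that before and after in Python. The task and role wording follow the prompts described in the video as closely as the transcript allows (the tail of the role sentence is a best reading), while the function name, the `Email:` delimiter, and the blank-line layout are assumptions made for illustration.

```python
# "Before": the bare task a typical ChatGPT user would write.
BASELINE_TASK = (
    "Classify the following email into Ignore, Opportunity, "
    "or Needs Attention labels."
)

# "After": the same task preceded by an advantageous, enriched role.
ROLE = (
    "You are an experienced email classification system that accurately "
    "categorizes emails based on their content and potential business impact."
)


def build_prompt(email_content: str, with_role: bool = True) -> str:
    """Assemble the prompt; the email body is the only part that changes per run."""
    parts = []
    if with_role:
        parts.append(ROLE)
    parts.append(BASELINE_TASK)
    parts.append("Email:\n" + email_content)  # hypothetical delimiter
    return "\n\n".join(parts)


if __name__ == "__main__":
    sample = "Hi, we'd like a quote for an AI chatbot on our website."
    print(build_prompt(sample, with_role=False))
    print("-" * 40)
    print(build_prompt(sample))
```

Keeping the role as its own component means later pieces (specifics, context, examples) can be appended to the same list without rewriting the task.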
Going back there, that's pretty helpful: this is actually the task. The thing most people actually put into ChatGPT or into the prompt is the task itself. It's basically just telling it what it's going to do, usually starting with a verb: generate this, analyze this, write this. Be as descriptive as possible while also keeping it brief. An example here is: generate engaging and casual outreach messages for users looking to promote their services in the dental industry, especially focusing on the integration of AI tools to scale
businesses. Your messages should be direct. So it's telling it what it should do, using a verb; nothing too crazy here. But what I will mention is that because we're doing these single-shot systems, this is where we need to insert values. The prompt is written once and then we throw different inputs at it; in our case, the email content is the variable that we need to put in this place. In this example you see that I have the dental industry as the niche, and the pink one here, which is the
integration of AI tools, as the offer. This is from an earlier video that I've done. The task is where you can insert the variables that are going to be used throughout the system. If you go back a little bit, we have the email content variable, and you can see that it's already become part of the task: "classify the...", and here's the variable-based input that we want. Then we have the technique that's associated with the task component, and that is chain-of-thought prompting. This is something that's fairly
common now and pretty widely known. It involves telling the model to think step by step, either without further instructions or, better yet, by providing it with step-by-step instructions to work through each time, which is my preferred way of doing it. So here's the example. We take the script writer example again, and in this case we just give it a list of six points: hook the viewer in, briefly explain, provide one or two outstanding facts, describe... So we're giving it step-by-step instructions on how it should
perform the task. As for the research results of incorporating chain-of-thought prompting into your prompts: it's a 10% accuracy boost on simple problems, and I mean very simple problems, like solve 4 plus 2, but a 90% accuracy boost on complex multi-step problems, which is likely what many of you are going to be dealing with in the systems you're trying to build. A 90% accuracy boost is pretty insane, considering you only have to write up a little list of what it should do. Chain-of-thought
prompting is something you should really incorporate. Key takeaway here: the more complex the problem, the more dramatic the improvement from chain-of-thought prompting. So that's the task. If we go across now, you see that we've included a chain-of-thought component in the task. The old one, which was just the ChatGPT low-IQ person, is this, and we've added on the role prompt, and we've also added a section for how it should approach the task: a step-by-step chain-of-thought method that we've incorporated. Next we have
the specifics section, which is below the task and is really an addition to the task. To avoid bloating the task component, you can have important bullet points that reiterate instructions or add important notes regarding the execution of the task. Using the example of the outreach message generator prompt, specifics might be: each message should have an intro, body, and outro, with a tone that's informal; use placeholders like this. So it's a list of additional points that sit outside the core part
of the task. These bullet points are pretty handy when you're modifying the prompt: if you think it's not doing something correctly, you can easily add another bullet point. This is where I do most of my modification when I'm writing my prompts. The technique associated with specifics is called EmotionPrompt, and this refers to adding short phrases containing emotional stimuli to enhance the prompt's performance. So here's the research. Emotional stimuli can
be things like "this is very important to my career", "this task is vital to my career", and "I really value your thoughtful analysis". This continues on from role prompting a bit, because you're continuing to hype this thing up and say, look, I really appreciate how good you are at this, and you being part of this business and what we're doing is so important, and it has massive implications for me and my business and also for society as a whole. The more you can hype it up and tell it
that its task is so important the world is going to fall apart if it doesn't do this thing right, the better the performance you can get out of it. The research results here: adding emotional stimuli, which can be as short as these two little phrases, "this is very important to my career" and "this is vital to my career", increased performance by 8% on simple tasks and 115% on complex tasks compared to zero-shot prompts. So it's a huge increase on complex tasks, which is likely what you're going to be building
your prompts for anyway. It also enhanced the truthfulness and informativeness of LLM outputs by an average of 19% and 12% respectively. So not only are you getting an increase in accuracy, as in this thing getting the right output, but it's also more truthful and informative. Those are more fluffy things, but being more truthful and informative is probably a good thing, right? The ROI on adding a few of these words is ridiculous. There's no reason you shouldn't be throwing in a
couple of these emotional lines, like "this is very important" or "this is such a key thing in the business that you are part of". The key takeaway here: adding simple phrases like these can encourage the model to engage in more thorough and deliberate processing, which is especially beneficial for complex tasks that require more careful thought and analysis. So how does this actually fit into our prompt? We have it below the task section here, I can zoom in, and we have the specifics: this task is critical to the success of our
business; if the email contains blah blah blah. It's just a list of additional instructions, and we can throw the emotion prompt in there as well. So that's specifics. You can see it's coming together here. Then we jump into context. This is kind of self-explanatory, but giving the model a better idea of the environment it's operating in, and why, can help increase performance. It also gives us an opportunity to further instill the role prompting that we did at the start and the emotion prompting
that we've done in the specifics. An example from our email classification system could be: our company provides AI solutions to businesses across various industries (a bit of context about who the business is). We receive a high volume of emails from potential clients through our website contact form. Your role (role prompting again, reminding it of the role it has) in classifying these emails is essential (emotion prompt) for our sales team to prioritize their efforts and respond to inquiries in a timely manner. By accurately identifying... (emotion prompt again), etc. So you can
read the rest of that, but we're hitting it with a role prompt again, and we're giving it context on the system that it belongs to. Here are my general notes for context, which I've written here for myself: provide context on the business, including the types of customers, types of services, products, values, etc. Then provide context on the system that it is part of; as you can see here, we're saying this is part of our sales process and we get a lot of emails. And then you can provide a little
bit of context on the importance of the task and its impact on the business: you directly contribute to the growth and success of our company, therefore we greatly value your careful consideration and attention to classification. So we're just reiterating a lot of what we've done in the role and in the specifics section. Here's the before and after: we've added this context section down the bottom. Not rocket science. The examples section is also kind of self-explanatory: we want to give examples to the model on
how it should perform and how it should reply. These are usually referred to as input-output pairs, and this ties into the techniques of few-shot prompting and one-shot prompting. In this case we're going to be talking about few-shot prompting, because that's giving more than one example. I'll give you a little look into the research results here. All of these research results are attached to scientific papers that I've gone through and
found and put in here for you. If you want access to those research papers, I'll put them on a Figma or in the description so you can have a look at the papers themselves. I'm not pulling these out of my ass; they come from papers where people have actually studied these things. This graph here shows the effect of adding these input-output examples on the performance and accuracy of the prompt. Zero-shot prompting is on the far left, where we have
10% accuracy for the 175-billion-parameter version of GPT-3. As soon as you add one example, it jumps from 10% to around 45% accuracy, and then we get diminishing returns as we continue to increase up to here, which is 10 examples, so 10 input-output pairs. A Q and an A, one example of an input and an output, is one shot. With a one-shot prompt we got 45% accuracy, and as we got up to 10 we got
60%, and it kind of flattened off after that. The research result is that GPT-3 175B achieved an average 14.4% improvement over its zero-shot accuracy of 57.4% when using 32 examples per task. So that's way up here, using a lot of them, and it kind of crept its way up. For us, the key takeaway is that providing just a few examples, literally going from zero examples to one, massively increases performance compared to zero-shot prompting. Accuracy scales with the number of
examples but it shows diminishing returns most of the gains can be achieved between uh 10 to 32 well crafted examples and personally I go for like 3 to 5 I don't really want to be sitting there all day writing all these examples and the more examples you give the more tokens you're putting in the input of your prompt and therefore the more expensive it is every time every time you call that prompt so if it's part of this email classification system and we have 32 examples we're going to have 32 examples worth of context and
token usage in our Automation and that means every single time an email comes in it's going to be sending off huge amounts of tokens uh as part of the input and going to be charged on those import tokens as well so 10 to 32 is is a sweet spot according to this paper just do 3 to 5 it does a job enough um and at least in my experience and and the stuff that we do at morning side as well so a little bit more on examples I won't bore you too much here but this
is kind of the key part here that these guys doing these these uh these papers and doing the research they documented roughly predictable Trends and scaling and performance without using fine tuning so by giving examples you are kind of impr prompt fine-tuning these models uh and people talk about fine tuning and everyone thinks that you need to do it I personally for me and my development company we build these AI solutions for businesses and we've never had to use fine tuning because we're actually good at prpt engineering and there's only a very limited number of
use cases where fine-tuning actually gives you an advantage um and that's just from our experience so if you want to avoid doing the messy stuff of data collection and fine-tuning and all that crap uh just get good at prompting get get good at writing these examples and you can achieve roughly similar uh performance increases um as fine-tuning without fine-tuning so this graph here shows an interesting uh bit of data that I do want to share it's getting a little bit techy but uh this graph on the right here shows a
significant increase in performance from zero-shot which is the blue to few-shot completions so if you add in some examples you're going to jump up from I think it was 42 up to nearly 55 60 a big jump immediately just by adding a few examples but interestingly the gold labels here so these orange pillars these orange bars uh that refers to the tests done where the labels were correct so maybe if the email classification was um here's the email here's the classification and we gave it correct examples the performance increase within the study was shown
regardless of whether those labels were correct so this tells us something interesting that the LLM is not strictly learning new information so by giving it few-shot examples that have the correct labels it's not necessarily learning that information it's actually just learning from the format and structure uh and that helps to increase the accuracy of the outputs overall the accuracy of the label itself does not actually appear to matter too much uh on the on the overall performance so you can have incorrect labels and it's still going to perform just as well um
because you've given it some examples on how it should respond so long story short throwing in three to five examples is going to greatly increase the accuracy and the performance of your prompt um and it should also be thought of more as teaching it how to structure the output so this is very important if you're not getting the structure you want and it's throwing in a whole bunch of other rubbish like oh well this is the answer to the question if you just give it a few examples of how it should respond it's going to look
very closely at that and it's going to perform much better for you so think of it as fine-tuning of the style the tone and the length and the structure of the output um and I think this is something that a lot of people miss out on when they don't add these things in because it's it's so important if you just wanted it to give you one word and you kind of try to tell it in the task to just give one-word responses sure it might listen to it but if you give five examples of
input and then just a one word output like in our case opportunity or or needs attention or ignore these labels for our email classification system uh it's going to perform so much better so here's a before and after again we're getting a little bit small here so I'll allow you to pause this on screen as you wish but we've given it a couple examples you can see how I've done it here in this case it's email label um I usually tend to go for a q and a uh that's usually my go-to strategy or input
output um but that's that's basically how we do it we go example one uh we give the QA and then we give a space example two some you don't even need to put these on um you can just leave it as that and it sort of figures it out uh but that's that's few-shot prompting and examples and how we've compared them now getting on to the final bit stick with me because you are learning some very good stuff here uh the notes section is the final part and this is our last chance to remind
the LLM of key aspects of the task and add any final details or tweaks uh this is something that you'll end up using a lot as you're actually doing the prompt engineering workflow um in the list I usually end up having things like output formatting notes like you should put your output in X format or do not do X like if it's doing something as I do a test this is kind of where I'm iterating on the on the prompt so if I if it gives me an output and it's doing something way wrong
I'll just say at the bottom in the notes section say do not do X or you are not supposed to do this never include it in your output uh these kind of things are very easy to slap onto the notes section at the bottom um small tone tweaks reminders of key points from the task or specifics is really what I use the notes section for um and and as I say here it usually starts out quite skinny because if you do all of the prompt correctly you'll have well I've got nothing else to say in
the prompt oh I've got nothing else to say at this bottom section then you give it a spin you throw some inputs at it and it starts doing some wacky stuff and you come back and go oh well it just needs to be reminded of some things I've said earlier on and you start to add this list of things to the notes now don't let it become too long uh because it's going to start to sort of water it down you'll notice that it'll start forgetting earlier notes if you put too many notes in um but less is
more here and it's it's really just to tweak these outputs to to get the right right kind of responses without refactoring the whole thing and restructuring how you did the task and the specifics so it's just kind of a lazy way of tacking things on to just get it nudged towards where you want it to go um now we have the notes section and it's based off the lost in the middle effect which is from another scientific like research paper um and this lost in the middle effect is is most famous kind of for this
graph here uh which shows that language models perform best when relevant information is at the very beginning primacy I'm learning new stuff here as well or end recency of the input context so performance significantly worsens when the critical information is in the middle of a long context and this effect occurs even when the models are designed for long input sequences so yes GPT-4 32k back in the day was designed for 32,000 tokens but it didn't really listen to anything in the middle um luckily the models that we work with now um are much better
at retrieving information over large context um but you should still keep this in mind because it still seems to apply um and this is why the note section is at the end this little graph here basically shows you that uh when you place the information at the start the accuracy is higher and when you place it in the middle the accuracy is lower and when you place it at the end the accuracy is higher but not as high as the start so it really listens to the stuff at the start so the role prompt it
takes it very seriously and that's why we have our task up the top as well that's why we have the context in the middle because it's not as important so you see it's starting to knit together all this information you understand how all these different uh techniques knit together so the way that I've structured this prompt and the way my team have structured it I'm really just retelling you what we do at Morningside by adding these things all in together uh you see how it starts to fit together into a proper strategy and
not just throwing it over the wall and having some kind of prompt formula it's actually based off the science um and and I love to talk about science these days so uh that is lost in the middle I think I have a little more here the research results of course that you've been anxiously waiting for is that when a relevant document is at the beginning or the end of a context GPT-3.5 Turbo achieves around 75% accuracy on a QA task um an increase of 20 to 25% compared to when the document was placed
in the middle um so the key takeaways from this is instructions given at the start and the end of the prompt are listened to by the LLM far more than anything in the middle um for this reason the notes section is a handy place to append reminders uh for anything that happened in the task or the specifics that you notice it maybe isn't listening to and you need to reiterate um but be aware that increasing the context length alone does not ensure better performance still having less context or fluff will mean the remaining instructions are more
likely to be followed so while lost in the middle refers to okay where should we put where should we structure the prompt to include uh the right information to be listened to what's the most important thing in the prompt and where should we put it yes it does that but it also it also gives us information on how we should try to keep our prompt as short as possible because it's over longer context periods that these things start to get bad so the shorter you can keep the prompt in general it can listen to the whole
thing very very well but as soon as you've like really made it bloated um it's going to be losing some of that stuff in the middle so less is more um and having less less fluff is always going to make your your prompts perform better so here you can see in the notes section uh please provide the email classification label and only the label as your response so again reiterating the format we want the output to be in um do not include any personal information in your response if you're unsure uh err on the side of
caution and assign the needs attention label so little reminders as we've gone through and and we're tweaking this email classification prompt you will add those things in over time so getting back to this little diagram here we have the role prompting covered off you know how to use that technique is tell it a role and and tell it how good it is at that role chain of thought give it a list of things that it should do and how it should break down the the task emotion prompt tell it how good it is tell it
how important everything is that it's doing few-shot prompting give it examples so that it knows the kind of output format you want lost in the middle kind of tells you how to structure everything and where to put the right information and you can add on a couple little uh things at the bottom so that it really listens to them at the end and finally here we have markdown formatting man I'm talking at a mile here and I'm getting really hot anyway markdown formatting is kind of the final piece of this puzzle and ties it all together
and I learned this from my CTO Spencer he put me onto this technique and I use it all the time now so uh markdown formatting is a way that we can structure our prompts um for both our sake so that it's more readable cuz when you write these large prompts it can get a little bit and like there's a lot of stuff going on so for our sake it allows us to structure the prompt better but also it allows the LLM to understand the structure a little bit better as well while I don't have any
research to back that up uh my only data on why we should be doing this and why it may perform better is because you can see over here uh someone managed to extract out the system prompt from th 3 within ChatGPT and OpenAI themselves are actually using uh using this markdown formatting so you can see uh a pound symbol here and then tool so these are markdown headings as we're going to go into in a second but if OpenAI is using it um to train their systems and to to prompt
their own systems we should probably be using it as well which is kind of why we're doing it here so uh basically markdown gives us a few new tools to structure um you may notice if you're writing a prompt you've just got plain text you don't have any any method to to signal what a heading would look like or what bold would look like but markdown gives us uh those those techniques so we have headings uh heading one is the largest heading two is the second largest heading three is the third largest so you have
now different layers of headings so you can have like role task all these in the heading one so just H1 as a as a pound symbol and then and then a space and then whatever you want after it which you'll see in a second um but then if you have little subsets or subsections like examples header and then you want example one you can have example one as a heading three or a heading two so you have different layers of heading and importance uh you also have bolds italics underlines lists horizontal rules and more
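A minimal sketch of that heading layout in Python, for anyone who wants to see it written down. Nothing here is from the video itself: the helper name `build_prompt`, the section wording, and the sample emails are all made up for illustration. It assumes each main component gets an H1 (a single pound symbol and a space), each example gets an H3, and the notes go last.

```python
def build_prompt(role, task, specifics, context, examples, notes):
    """Assemble a prompt string with markdown headings.

    Hypothetical helper: every main component gets an H1 ("# "),
    every example gets an H3 ("### "), and notes are appended last.
    """
    parts = [
        f"# Role\n{role}",
        f"# Task\n{task}",
        f"# Specifics\n{specifics}",
        f"# Context\n{context}",
        "# Examples",
    ]
    # One H3 per example, written as a Q/A pair
    for i, (question, answer) in enumerate(examples, start=1):
        parts.append(f"### Example {i}\nQ: {question}\nA: {answer}")
    # Notes go at the very end of the prompt as a short bullet list
    parts.append("# Notes\n" + "\n".join(f"- {note}" for note in notes))
    return "\n\n".join(parts)


prompt = build_prompt(
    role="You are an experienced email classification assistant.",
    task="Classify the email below with exactly one label.",
    specifics="Labels: opportunity, needs attention, ignore.",
    context="The labels feed an inbox automation for a small business.",
    examples=[
        ("Hi, we'd like a quote for an AI chatbot.", "opportunity"),
        ("Weekly newsletter: 10 productivity tips", "ignore"),
    ],
    notes=["Respond with the label and only the label."],
)
print(prompt)
```

Printing the result gives one plain-text prompt with the components in the same order as the template, which is easy to skim and to edit one section at a time.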
so if you want to jump into the fancy stuff I'll teach you the basics here of markdown but you can also do these other things I'm not sure what the effectiveness is um of bolds and italics and stuff but I tend to just use the use the headings as a as a structure tool so key takeaways on markdown formatting is use these H1 tags single pound symbol uh to Mark each of the components for your prompt and then you can use the H2 or three tags or even bolds and stuff to sort of add add
additional additional structure to other parts of it so here's an example of how you should add it in heading one role heading one task specifics context and then within context I've added in here look you might want to break the context into subsections of okay let's use a heading two and go about the business about our system so you don't need to do that all the time but this is how you can start to use other types of headings like H2 or H3 tags to to split up uh some of the other subsections under each
of your main headings and then again examples we can have an example one as a as a heading three and give the examples and the notes so that's roughly it and you come in here and obviously you are a blah blah um generate blah blah blah you get what I'm doing you get what I'm saying and so what this all looks like when we tie it together um we now have our completed prompt which this is the before remember this is where we started this is the uh the the the stupid guy who does not have
a prompt this is what we started with and this is what we have after when we apply all of these techniques now this is a little bit overkill for an email classification system but what I want to show you is that this is how you would apply it to a simple task like this so we have the role that's wrapped in the H1 tag we have H1 tag here etc um and we have all of these different components role task specifics context examples and notes all integrating the uh techniques that we've been over in
this video and now stacking up all of the increases in accuracy that we get from these different techniques we can see that we don't know how much markdown formatting gives us uh but the total is potentially above 300% increase in accuracy then the final step here is we can add up all of the different increases and and the performance increases that we get from these techniques and we can can sum it up to a 300% or more increase in in performance so you can listen to me or you can just ignore it or you
can use these piece by piece wherever you think you need it um but considering emotion prompting is literally just a few words saying you're the best and this is really important to me and role prompting is like one or two lines and lost in the middle is really just more of a an understanding of where to put the right information in your prompt you've now got a toolkit and going back to this guy over here look at this guy he's got a toolkit he understands the science understands from research papers why these things work the
way they do and because he has this this deeper understanding of what makes llms do the right things that they want them to do he's better able to perform and as you can see he is on the upper end of the spectrum here so this is the guy that you should be now all you need to do is take these and apply it and you'll start to see and and connect them go okay okay so lost in the middle um that's not doing what I want maybe I need to change the stuff at the start
and the end okay uh it's giving me the wrong structure and style okay maybe maybe I give some more few-shot examples of how it should be responding and I I take my time and I write them carefully and I tell it the kind of style and structure of the response I want it's really not rocket science and people have already done the hard work by doing the the research to get these kind of results so um to wrap up this video I've given oh actually we have a considerations page here uh context length and
costs as I mentioned earlier for high volume tasks um like this example of email classification system uh I guess it's not too high volume but if this thing is doing like 50 50 100 reps a day it's really being put through the wringer and there's a lot of volume going through the task that you're building you need to focus on making that prompt as short and succinct as possible uh because every time you run it you are charged for the input and the output tokens so while you may only be outputting a label in this
case of just needs needs work new opportunity or needs attention or ignore you're also charged for the input tokens as well so all the prompt that you put in you're going to be charged for plus the inserted variables as well so you've got the prompt then you're inserting the email context you're getting all of that information and that over you're you're going to get charged on that so uh keep in mind that if you're doing a lot of volume try to use a a cheaper model as we're going into next but also keep the
prompt shorter as well the choice of model is important as well better prompt engineering and the skills that I've just taught you on this going back to this guy here he has better prompt engineering skills and can get better performance out of cheaper models this guy doesn't have the skills so he relies on the more expensive and slower models which are not good for the client um to get the performance that he needs because he doesn't have the skills to get it to do what he wants and that brings me back to this choice of
model point which is where possible you need to use your skills and use your advantage to bend the cheapest and fastest model to execute the task successfully so 3.5 Turbo is basically free like this thing OpenAI has made that so cheap and whatever whenever you're watching this video it might be different but the cheapest fastest model should be your go-to and if you can't get it working there then you can go up but you have the skills now um if it has high volume and requires fast responses this is when your skills will shine because
you can create prompts that do and perform um fast and cheap then we have the temperature and and other model settings if you're doing creative writing ideation etc then test higher levels so 0.5 to 1 uh but anything else if you're putting in systems like this where it's classification or AI is kind of doing a a a fixed piece of the of the puzzle uh you want it to be on zero just have that we're trying to fight against the inconsistency and and natural randomness of these models and in order to do that we need to uh set
that temperature to zero and that's going to make the system a lot more consistent uh so zero is what I typically use for basically anything apart from creative writing content uh script writing prompts and the other model settings like frequency penalty and top P are not needed in my experience just play around with the the temperature that's all you need to worry about what I'm going to jump to now is actually having a chat with my CTO Spencer um and he's going to share what we've done at Morningside on one of our
projects where we had to go from GPT-4 uh which was doing the job great and then the client wanted to change to GPT-3.5 Turbo to save money and then we had to kind of rebuild everything in order to get it working so uh we're going to jump to that and you get to hear from Spencer again a lot smarter than me and a lot of the stuff that I'm sharing actually came from what he's learned uh learned on the job and what he does at Morningside so everyone if you haven't met Spencer already
this is Spencer my CTO he's a lot smarter than I so I'm bringing him on to chip into this prompt engineering video just briefly because um a lot of the stuff that I've just told you about has actually come from his big brain here he's been sharing a lot of the the research papers particularly within our Slack across the company so we're on the same page so Spencer I wanted to bring you on here particularly because we've been working with a one of our biggest clients ever to date uh and I want to particularly focus on how I
was talking in this video about the prompt engineering skills allowing you to get more out of uh lesser and cheaper models um and how we've had to switch from a GPT-4 based SaaS that we built over to a GPT-3.5 Turbo and and the difficulties in transitioning that so if you just want to um give any notes on the on the presentation prior but also specifically on uh getting more out of these these lesser models really which is what I'm trying to teach people in this video yeah yeah definitely so um yeah it's an interesting
one I usually uh like to try and break things down so um when going through this path the key is that you obviously want to use the cheaper models first so 3.5 comes comes first to mind um in this case specifically for this client there's a lot of complex uh kind of information that they were synthesizing out of it so we made the decision to start off with GPT-4 um to to make sure that we were getting the responses that we wanted now once it kind of got closer to uh to release there we realized
that the the cost that was associated um with running these models is going to be prohibitive so we had to yeah kind of take that transition now and and gauge down to 3.5 so whenever I'm doing that specific task the key one that I'm looking at is yes prompt engineering one um and then two is scope reduction um GPT-4 is really good at a bunch of different things uh and and understanding kind of the hidden context that uh that's in the words that you're using uh 3.5 is is much less so so um you almost
want to break it down into smaller kind of component-sized chunks for the task um and then use those as kind of contributive to to get the same results as you would with 4 um so that was the steps that we're taking in this particular project another good tactic to use as well and and one that I would highly recommend is using GPT-4 first and then taking the input and output pairings as training data to fine-tune a 3.5 model as well um because we found that that's that's really helpful uh for getting your cost down
but keeping up that GPT-4 level quality yeah I've kind of just bashed fine-tuning earlier in this video because I say it's it's unnecessary in almost every case um so I mean using few-shot examples is essentially a way of of fine-tuning via prompting so if you just give few-shot examples of GPT-4 outputs or human-written outputs would that not do a lot in terms of getting more towards the outputs that you're looking for yeah 100% and you're completely right on that one fine-tuning for I would say a vast amount
of use cases isn't really necessary you can get I would say 90 even 95% of the way with uh with just good old-fashioned prompt engineering and and few-shot prompting here um with few-shot prompting there's an interesting paper that came out last year um and I can't remember the specific name of it but uh it talks about the decision boundary so there's an important uh kind of lesson to learn on that is that for the few-shot prompts that you're giving the important part is to give ones that are confusing to the
model itself so the ones that you notice that it's getting wrong consistently if you actually categorize those and take those in and take the one to five hardest examples that you get and then use those as the uh yeah as the examples in there you'll actually get a lot better results coming out of your model too well that's that's I'm learning something on this on this call in this video as well because uh I mean I'd always start in my few-shot examples have kind of like the most common ones you might chuck a
a curveball in there as well but I just kind of put the five three to five common ones um but knowing that we should try to figure out when it's stuffing up and then and put those in as examples is great so any other notes you have on on the content just taking a look at the presentation but the markdown formatting aspect um any of the other any other techniques I know emotion prompting was a new one for me so anything that you got there yeah uh markdown is one that we use extensively um I'm
a huge nerd so I I like writing in markdown anyways just because most of the the notebooks uh Jupyter notebooks if there's any other uh data nerds out there like myself um so it's it's rather um yeah consistent familiar for myself is is there any data or or papers that you've seen with the uh the markdown base because in the presentation just before I was like look I I can't find any research papers but I'm sure it's probably going on but uh it's more like if OpenAI is using it you'd be pretty stupid not to do
it and even just functionally for us as as writing these prompts it's so much more useful to at least have some kind of structure to it so purely on our side you'd use it regardless just to make it easier on your on your end yeah absolutely so I definitely remember reading I think at least a couple papers about structured uh structured inputs in markdown format and there's other ones as well that you can use um but even intuitively so when they're doing the fine-tuning or fine tuning in terms of uh uh reinforcement learning with human
feedback RLHF um what they're doing is they're actually providing markdown-based formatting and that's how they're structuring these prompts that they're giving to it in order to fine-tune it so intuitively of course if it's seen it more it's going to do better when it sees more of the same that it's been trained off um the cool part about using markdown as well is you get to actually use semantic information so if you're writing a Word document if you want to put bold in there if you want to put something in italics titles subtitles all these
things it makes it into a much more structured format and that nuance comes through on the other side to be able to uh yeah make better better prompts to to get better outputs the other one that uh I would suggest as well is there's like small little things so uh being very encouraging towards uh an LLM can help so uh I usually start off with you're a world class X and you know you are an absolute star doing this it seems a little bit ridiculous at the time that I'm giving this positive feedback to
a machine but uh very helpful um the other one's telling the model to take a deep breath and to think it through step by step before responding I'm 100% serious has been proven to actually increase the quality of your responses and that also doubles as a as a great one when your significant other is angry usually yeah yeah I would not suggest that as a as a I'll be honest followed quickly by calm down anyway it's good you mention that sorry that the the hyping the model up I talked about this
just earlier in the video is that look this emotion prompting thing where you can get I think 115% increase in your in your accuracy it's just by being like wow you well firstly on the role prompting being like wow you are like the best at this and then providing enriching it with additional words to to reinforce like how good it is at that task and then the other I think so um anyway back to what you said yeah I and it's actually funny as well persona-based uh thing so if you uh
not only tell it it's a world class X if you actually use names of specific people especially people who have written all over the internet or uh you know if you say you are Albert Einstein it will actually come out with higher quality outputs um that are very much in the style of writing of the the person that you're talking about I use it for programming personalities so Theo he he does the T3 stack um and I'll constantly say you're Theo show me how to refactor my code like Theo would and and that actually goes really really
well um and then the other kind of last one in here is on the positivity route but not using negative uh feedback so a lot of the time your your first impulse is going to be like stop doing this don't do this don't do that if you instead focus on do this or do that um the negative connotation uh words actually are associated with worse outcomes than positively framed ones yeah it's just interesting because then the in the research for this and I was trying to put together okay like negative prompting is this a real
thing it seems like the consensus is that it doesn't actually uh do much but I've anecdotally found the contrary which is uh if if it's doing something incorrectly I'll usually just put at the very bottom in the notes section just never do this in your output and it usually tends to work so I mean there's both sides there it works for me sometimes but it's probably something a lack of my skills as well um that I should be doing it further up but yeah there's some really good things I think if you guys can
as Spencer said that's another gem that I'll be I'll be incorporating into my prompting is giving it a name giving the role a name um and that's something obviously you just say you're an expert this this this um but if you have an example of a real person or that someone that the internet would have had information about um you can throw that in there as well yeah absolutely um yeah I think those are the the big top-line ones for me at least right yeah no that's really helpful again this is why I brought
Spencer on even I've I've learned something here um but yeah we can jump back to the video thank you Spencer thanks so much then so I hope that's drilled in the importance of prompt engineering and and being able to use these cheaper and faster models to achieve the outcomes that your clients want otherwise you're not going to make any money uh but going back to to this I just want to say look everything that I've just taught you here can be applied to all these different types of systems and what I want to leave you
off with at the end of this is examples of things so AI agents like GPTs are a good example of this um or building AI agents on my own platform my own software Agentive if you want to check it out we're only on waitlist at the moment so you can check that out in the description uh but Agentive allows you to build AI agents as does the GPT Builder on on the ChatGPT site but what we want to do if we're modifying this prompt formula for this use case
of AI agents is to modify it to include how to use the knowledge how to use the tools and your answer then you can provide examples of response styles and tone so you can pause that take a look see but here most important things to point out is that I've added in uh so you can see role task specifics and then tools so the tools here if you are adding custom tools into your uh into your GPTs or into your AI agents you can add a little section uh using the same kind of format right we
have a heading and say you have two tools to use one I like to include the knowledge base if I've added any knowledge to my AI agent I'll tell it use the knowledge base because it's actually that's how it's working they use it as a knowledge base tool they just don't really tell you that it's a it's a tool um so you instruct it knowledge base is one of the tools you have you can use it when you're answering AI business related questions and number two is a cosine similarity tool it could be another
tool that's calling Relevance or something uh but tell it how to use each of the tools that's involved and then examples of okay here's a question someone asks the agent here's how you should respond uh etc so not not rocket science you guys can use that uh but that's how I adapt this formula to do AI agent prompts and it works really well next is voice agents you need to modify the prompt formula to include a script outline if necessary uh so Synthflow Bland AI Air all these things that are popping off right
now: you can modify the same prompt template to do really good voice agents for you. So role, task, but in the task here we're giving it an outline of how it should talk and the steps involved. Then we have the specifics, then we have context about the business. This is an example for a restaurant; I'm just giving a bit of context on the restaurant there. Then we have examples of how it should respond to the most common questions. As I said before, you can also come in here and add
in a script section and add in a rough outline of how the script would go, but I've kind of included that in this section here, from a high level at least. So voice agents: same sort of thing, modify it to do the job. Then we have AI automations, which can be using Zapier, Make, or Airtable. Airtable now has AI, which is cool. But you can create powerful AI tasks in businesses that can be relied upon to handle thousands of operations a month. What we just
built in the email classifier is an example of an automation, so I don't really need to go over this, but here's another example. At the end here you can see something I sometimes like to throw in: after I've given examples, at the bottom I'll go "Q:" and then I'll put the constraint in, or in this case the variable, again, and then I'll leave the "A:" open with a space, and then it's just going to kind of autofill that. It's another technique you can use to get it to
only output the exact kind of output style that you want, so feel free to use that as you need. AI tools: you may not know what I mean by tools, but basically we can set up a bunch of inputs, say niche and offer, then we can insert those into a pre-written prompt, and then that's going to be able to connect to either GPTs, or you can build it on a landing page, and it can be used to speed up workflows. So there's so many different
ways you can use it. Here's an example; again, you can pause that. Here you can see I'm inserting the variables, we have lots of input-output pairs, and then I'm screaming at the end here because it wasn't doing what I wanted. So yeah, take those. I'll leave a link to this presentation; I think it'll be on my Skool community, so you just find this video. There'll be a resource for this in the YouTube tab, and you can find this video, pull
this up, and use this as you wish. So I want to bring you back to this. Here's a lollipop, because you get a lollipop for completing this course, and you're now a successful, genius-level... I'm not even sure what this guy's name is supposed to be, but he looks like a genius to me; he looks like a Jedi or something cool. So you're now this guy, and you didn't end up being stuck in this midwit territory. So here's your little lollipop, and I'm
proud of you for getting through this, because the skills that I just taught you, as I say, affect every different thing you're trying to sell in this AI space. If you don't have this nailed, you're not going to be able to build things and you're not going to create value for your clients. Even if you're kind of okay, if you can't get the cheaper model to do what you need it to do, then you're not going to be able to succeed long term. And I mean, you put
yourself up: if someone was offering the same AI service and you said, "Hey look, it's going to cost you this much a month and it's going to take 10 seconds to respond," and some other guy goes, "Okay, it's going to cost you a fraction of that and it's going to take a quarter of the time," who's going to win there? There's not much PvP going on in the space right now, because there's very few people selling AI solutions as agencies, so we're still very early to it, but over time
if you don't have these skills you're going to get wiped out by people who do. And yeah, keep in mind there's so much potential to be squeezed out of these prompts and out of these models if you just apply this technique, so every 300% increase. I'm going to be making a couple more of these style videos. If you did like this, if you like me being a lot more direct and just telling you outright, then let me know in the comments, because I much prefer doing these kind of videos, even though I'm
now getting super hot and sweaty and my cat's here. But I've liked making this; personally it's a lot more fun than my normal videos. But yeah, you get the idea. If you've enjoyed it, please let me know down below, and subscribe to the channel if you haven't already. I'm probably going to have a couple more videos like this on core things that I think you need to understand, because if you don't learn this then you can't use my SaaS and I can't make money. So I'm very selfishly teaching you this stuff so that
one day you can use my SaaS and I can sell my SaaS for hundreds of millions of dollars. So forgive me for being selfish, but you get to win along the way. But yeah, see you in the next one.
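A rough Python sketch of two ideas from the closing section: the agent prompt laid out as headed sections (role, task, specifics, tools, examples), and the trick of repeating the input after the examples as "Q:" and leaving "A:" open so the model fills in only the answer. The function name `build_prompt`, the section wording, and the example questions are all made up for illustration; they are not the prompts shown on the slides, and the description of what the cosine similarity tool does is an assumption.

```python
def build_prompt(sections, examples, query):
    """Assemble headed sections, few-shot Q/A pairs, and an open trailing 'A: '."""
    # One "# Heading" block per section, in the order given (role, task, ...).
    parts = [f"# {heading}\n{body.strip()}" for heading, body in sections.items()]

    # Input/output pairs showing the response style we want.
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    parts.append(f"# Examples\n{shots}")

    # Repeat the variable after the examples and leave the answer open,
    # so the model continues with just the answer.
    parts.append(f"Q: {query}\nA: ")
    return "\n\n".join(parts)


# Hypothetical section text, not the wording used in the video.
sections = {
    "Role": "You are a helpful assistant for an AI agency.",
    "Task": "Answer the user's question, using your tools when they help.",
    "Specifics": "Keep answers short and friendly.",
    "Tools": (
        "You have two tools to use:\n"
        "1. Knowledge base: use it when answering AI business related questions.\n"
        "2. Cosine similarity tool: use it to compare two pieces of text "
        "(assumed purpose)."
    ),
}

examples = [
    ("What does an AI agency sell?", "Custom AI solutions for businesses."),
    ("Do I need to code?", "Not always; many tools are no-code."),
]

prompt = build_prompt(sections, examples, "How do I find my first client?")
print(prompt)
```

The same function covers the "AI tools" case: the inputs (say niche and offer) are just strings dropped into the section text or the final `query` before the prompt is sent to whichever model you use.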