Prompt Management 101 - Full Guide for AI Engineers
Dave Ebbelaar
Want to get started with freelancing? Let me help: https://www.datalumina.com/data-freelancer
Video Transcript:
Prompt management 101, with the subtitle: boring but highly important concepts about how to manage prompts at scale without going crazy. This video is for you if you work with prompts professionally within your codebase and want a better understanding of how to properly do so, so that you can improve your large language model applications over time. In this video I'm going to share all of the different methods and strategies that you can use to manage your prompts, and I'm going to cover everything that we've seen from the companies that we work with at Datalumina.

Now, for those of you that are new here: my name is Dave Ebbelaar, and I'm the founder of Datalumina, a data and AI development company. We essentially help companies get started with data and AI, either by assisting their engineering teams or with completely done-for-you projects. Next to that, I also share all of my lessons here on YouTube, so if that's your thing, make sure to subscribe as well.

But now let's get into what we're going to cover in this video. We'll start with a high-level introduction to prompt management: what is it and why is it important. We'll cover some common mistakes, then the methods, and we'll also highlight the approach we use at Datalumina to serve the different clients and projects that we're currently working on. We'll conclude with some best practices. In this video we're not going to cover prompt engineering, the latest model updates, or frameworks to improve your applications. No, we're really going to talk about prompt management, which is a systematic approach to storing, versioning, and retrieving prompts in your LLM applications. Some key aspects of prompt management include version control, potentially decoupling your prompts from your code, monitoring and logging, optimizing prompts, and integrating prompts: all important concepts that let you systematically manage and improve your LLM application over time.
Why is this important? Why am I creating an entire presentation on this concept, and why should you watch it? Well, when it comes to building LLM apps, your prompts are often more important than your code. Let me highlight this with the following example, given the assumption that you don't break your code. On the left we have a very messy function with an okay prompt: there are some variables in there, and we give the model some context, some guidelines, maximum words, maximum length, etc. But it's very messy code, something you do not want in your codebase. On the other hand, in example two we have a very clean function but a bad prompt: it's just "write about product name". This simple example goes to show that when you're building applications, as long as it runs, you can refactor everything you want, but your end user or client is not going to notice anything about how your codebase is structured. You of course want to follow best practices for yourself and the team you're working with, but to be fair, the client or end user won't notice. Whereas an ever-so-slight change in your prompts, which could be as simple as changing one word, can have great effects on the outputs, the outcomes, and thus the utility of your LLM application. And that's why prompt management is really important.
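The two slide examples can be reconstructed as a small sketch. The prompt texts, function name, and variables below are assumptions for illustration, not the exact code from the video:

```python
# Example 2 from the slide: clean code, but a bad prompt -- the user feels this.
BAD_PROMPT = "Write about {product_name}"

# A prompt in the spirit of example 1: whatever the surrounding code looks like,
# this is the part that actually shapes the model's output.
GOOD_PROMPT = (
    "Write a product description for {product_name}. "
    "Audience: {audience}. Tone: {tone}. Keep it under {max_words} words."
)

def render(template: str, **variables) -> str:
    # Plain str.format is enough for this sketch; later levels use real templating.
    return template.format(**variables)
```

A one-word tweak to `GOOD_PROMPT` changes what users see; a full refactor of `render` changes nothing they can observe.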
So now let's first cover some common prompt-management mistakes, and I can say for myself that I've made all of these in the beginning.

First common mistake: prompts scattered across your project. Imagine this: you're building an application, and somewhere in a function within a file you have, let's say, an OpenAI API call with a prompt in it. Then in another file you have another prompt, and somewhere else, for a different use case based on some if/else logic, you have yet another prompt. Everything is scattered. Now when you or a team member comes in and wants to change something, you really have to dig: where is that prompt, and what is it affecting?

Then, limited version control. If you store all of your prompts within your codebase, so within your Python files or as separate .txt files for that matter, you have limited version control. You can manage and track that with Git, which for a lot of use cases might be enough (I'll get into that). But for some use cases, especially as your application gets more serious, gets more users, or the team grows, you might want an alternative to plain Git for managing and versioning your prompts. While Git keeps track of everything, you cannot very easily roll back a prompt change: you could dig through the commit messages to see where exactly you changed that prompt, but it's not like, with one click of a button, you have version one and version two of your prompt and, if it doesn't work, you go back to version one.

Also, lack of standardization. Not having a clear, thought-out structure or standardized way of managing prompts within your codebase and projects again leads to inconsistencies, making it harder for team members, and also yourself, to come back later and adjust and update them.

Number four: no evaluation framework. That's adjusting prompts based on just gut feeling, typically while looking at isolated examples. You're trying to solve a problem, one answer is slightly different than you expected, so you change the prompt, let's say in the playground. Oh, it works on this isolated example, let's push that to production — without knowing what the effects of that change are going to be at the global level of your codebase.
Then, prompt redundancy: having a lot of very similar prompts scattered across your codebase, each with different nuances or different cases, where it's going to be hard to keep track of them. Let's say you have two very similar prompts, and you tweak some things in prompt A because you ran into an issue, and you push that to production — forgetting that you also had prompt B, which was very similar but just had some different conditional logic in it, with no link between the two. That's prompt redundancy.

And then finally, common mistake number six: no metadata. Just a prompt in a .txt file, or in an f-string, or inline within your API call, without any context about who wrote it, what version it's on, what its utility is, or what model it should be used with. No metadata.

So these are some common mistakes that I see a lot of new AI engineers make, and like I've said, I've made all of them as well. You should also consider that, depending on where you are in your learning or development journey, the more serious your project gets, the more important it becomes to avoid these mistakes. If you're just starting out on, let's say, a portfolio project, this doesn't matter as much. But when you start to put things into an application that people are actually using, or you start working with a team, this becomes more and more important, and a must in order to make sure things don't get out of control — because otherwise, it will happen.
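Mistake number four can be made concrete with a toy sketch: instead of eyeballing one isolated output, score each prompt variant against a small fixed example set with an automatic check. Everything here is illustrative — `fake_model` is a deterministic stand-in for a real LLM call, and the length check stands in for whatever quality criteria your application actually has:

```python
def fake_model(prompt: str) -> str:
    # Deterministic stand-in for an LLM: pretend the model respects (or ignores)
    # the length guidance in the prompt. Purely for demonstrating the eval loop.
    if "under 10 words" in prompt:
        return "short answer"
    return "a much longer rambling answer " * 5

def evaluate(prompt_template: str, products: list[str]) -> float:
    # Fraction of example inputs whose output passes the automatic check,
    # so two prompt versions can be compared on the same set before shipping.
    ok = 0
    for product in products:
        output = fake_model(prompt_template.format(product_name=product))
        if len(output.split()) <= 10:
            ok += 1
    return ok / len(products)
```

Even a tiny loop like this turns "it looked better in the playground" into a number you can compare across prompt versions.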
So now let's cover some of the methods you can use for prompt management. We're going to cover five methods that we've encountered, and you could even view them as a kind of leveling system, where level one, inline prompts, is the easiest, most straightforward one, where everyone starts; as you progress, it gets more complicated, let's say, but you also get additional benefits. Each of these methods has trade-offs, which we'll get into; each has its own pros and cons. At a high level we have: inline prompts, centralized storage, structured centralized storage, external prompt-management tools, and finally custom database storage. We have separate slides for each of these methods, so you can determine which level you're at right now and decide what makes the most sense for you — because I don't recommend level five, custom database storage, for everyone; it really depends on the project.

So let's get into the first one: inline prompts. This is where everyone starts, right? Embedding prompts directly in the codebase, typically within the API calls or function parameters. Here you can see an example, generate product description: you have the prompt in there, and then you make the API call. Pros: it's simple and quick to implement, and there's direct context availability within the code, so a developer can come in, look at this generate-product-description function, and instantly see: okay, this is the prompt we're dealing with, this is what we're sending to the model. But then some cons: it's hard to maintain and update at scale, and it lacks centralized management, version control, and metadata. When you look at this prompt as is, it's very straightforward, but in production environments, especially as your application gets more complex, prompts are often not as simple as a quick one-liner. They usually grow to a couple of sentences or paragraphs to give more context on style, formatting, chain of thought, few-shot examples — this can get big quite quickly, potentially with more variables too. At that point you most likely want to move the prompt somewhere else, so that in the function you can focus on the function logic and decouple the prompt in some way. So, inline prompts: this is where everyone starts; it has some pros and cons, but it's typically not what you want to do.
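A level-one function might look like the sketch below. The function name and prompt text are assumptions in the spirit of the slide, and the actual model call is left as a comment so the sketch runs without credentials:

```python
def generate_product_description(product_name: str) -> str:
    # Level 1: the prompt is embedded right here, next to the (would-be) API call.
    prompt = (
        f"Write a short, friendly description of {product_name}. "
        "Keep it under 50 words and end with a call to action."
    )
    # A real implementation would now send the prompt to a model, e.g. via the
    # OpenAI client; that call is omitted here on purpose.
    return prompt
```

The upside is visible immediately: open the function and you see exactly what goes to the model. The downside only shows up later, when this prompt grows and a dozen siblings appear in other files.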
If you are at this level, one easy upgrade you can implement is switching to centralized storage, where you store prompts in a dedicated folder or file within the codebase, improving your organization but without enforcing a specific structure. What you could do, for example, is simply create a prompts folder within your project and put everything in there; you can even create subfolders for different pipelines, workflows, or categories. Having that centralized place means that whenever you have to update a prompt, you know where to go, and you have an overview of everything that's in there. Some of the pros: improved organization compared to inline prompts, and prompts are easier to find and update. And then of course some cons: there's still no enforced structure, which can lead to inconsistency — we have the files in one place, but nothing says how we should structure those files. Version-control capabilities are again limited, and there's a lack of metadata. It's a slight change from having prompts truly inline towards one centralized place where we manage them, but there are still some issues with this approach.
All right, then let's get to level three, structured centralized storage, where you organize your prompts in a central location, but now also with a predefined structure or template. For this you could use f-strings, or LangChain or LlamaIndex with a prompt template like the one you see here, or you could use Jinja as a templating engine, which is actually what we're using — I'm going to share in the next section why we use Jinja and why we prefer it over something like LangChain or f-strings. Or you could use something like Prompty, a framework by Microsoft; you can see it here in the left corner. It's pretty small, but definitely look into Prompty, look into prompty.ai, and see if it resonates with you. The main idea with level three is that we have a centralized place but also a predefined structure, whereas with level two we just have one place — you could put anything in those files, maybe text files, maybe Python files, whatever. Here we really get clear on that, and probably, for 80% of you watching, this is all you need: a proper structured, centralized storage system to manage your prompts. Like I said, in the next section on our approach I'll share some code examples with Jinja and our prompt manager that you can copy if you like, so make sure to stick around for that.

Now, quickly, the pros and cons. It enhances consistency and maintainability, allows for dynamic prompt generation because we have these templating files and systems, and facilitates easier collaboration among team members, because it's really clear: this is where we place the prompts, in this structure. These structures and templates can also be extended to include metadata. Then some cons: it requires initial setup and agreement on the template and structure, so you have to get clear on what you're using. It requires learning a templating language or system, which to be honest is very easy, but it takes more work than just dumping the prompts somewhere or keeping them inline. And versioning is still limited to just Git, which might be all you need — but like I've said, sometimes you need a little bit more, and that is where we get into level four.
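A structured template at this level might look like the following sketch; the template text and variable names are illustrative, not the exact ones from the video. In a real project the template would live in its own file rather than a Python string, which is exactly what the prompt-manager section later in the video shows:

```python
from jinja2 import Template

# A level-3 style template: named variables plus a conditional, defined once
# in a central place instead of being rebuilt ad hoc at every call site.
PRODUCT_DESCRIPTION = Template(
    "Write a product description for {{ product_name }}.\n"
    "Tone: {{ tone }}.\n"
    "{% if max_words %}Keep it under {{ max_words }} words.{% endif %}"
)

prompt = PRODUCT_DESCRIPTION.render(
    product_name="ErgoChair", tone="friendly", max_words=50
)
```

The conditional is the part f-strings can't give you cleanly: leave out `max_words` and the length instruction simply disappears from the rendered prompt.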
That is external prompt-management tools: you can manage your prompts using a third-party application such as LangSmith or Langfuse, and here we really decouple our prompts from our codebase. We take them out of the code, put them into a separate storage location, and then within our code we query the prompts whenever we need them. With this, we essentially turn it into a content management system that stores our prompts. The pros: we get advanced features like versioning, rollbacks, metadata, and analytics, depending on the platform you're using. I know that both LangSmith and Langfuse offer those types of capabilities; Langfuse is fully open source, while LangSmith, from LangChain, is closed source and easier to get started with. Langfuse also has a cloud version, by the way, but you can also fully self-host it, and that is what we're doing. It also provides a user-friendly interface for non-technical team members. You can just look at the prompts over here: this is the prompt for the event planner. You can see we have a version, we have some variables we have to put in, we even have a model config, and then within our code it's just a `langfuse.get_prompt` call and we get that event-planner prompt — but it's not within our codebase. Like I said, even non-technical stakeholders can come in here; maybe you're working with a copywriter, for example, or with a client.

The cons: it introduces extra overhead to manage this service, because it lives somewhere else, and it introduces a dependency on a third-party service, which can also result in a kind of vendor lock-in: if you build your entire codebase around having your prompts in that system, you're likely going to stick with it, and given how quickly everything is changing in the AI landscape, that can sometimes be a drawback. It can also bottleneck local development, because of that external dependency; sometimes, when you quickly want to test things and don't want to put a new prompt into the system, it can get messy. And it may simply be overkill for smaller projects. Here we're definitely talking about more serious applications that are most likely already running in production, with real users; that is really when you should consider upgrading to an external prompt-management tool, if you feel the need. Like I've said, for some smaller projects the earlier levels might be all you need, but this is where it gets a little more serious.

And then finally, next to using a pre-built external prompt-management tool, you can also build a custom one — build your own — because to be honest, it's not that challenging. You could just create a database where you store your prompts, with metadata and versioning in there, everything tailored to your specific needs. You could, for example, look at all the variables and data that tools like LangSmith or Langfuse are using, create a simple, let's say, Postgres database, put it in there, and then make your own API function to query those prompts.
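A minimal level-five sketch could look like this, using SQLite instead of Postgres so it's self-contained; the schema and function names are assumptions, not a production design:

```python
import json
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    # One row per (name, version): versioning and metadata live in the schema.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prompts (
               name     TEXT NOT NULL,
               version  INTEGER NOT NULL,
               template TEXT NOT NULL,
               metadata TEXT,
               PRIMARY KEY (name, version)
           )"""
    )

def save_prompt(conn, name, template, metadata=None):
    # Saving never overwrites: it appends the next version number.
    (current,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM prompts WHERE name = ?", (name,)
    ).fetchone()
    version = current + 1
    conn.execute(
        "INSERT INTO prompts VALUES (?, ?, ?, ?)",
        (name, version, template, json.dumps(metadata or {})),
    )
    return version

def get_prompt(conn, name, version=None):
    # Latest version by default; an explicit version gives one-call rollback.
    if version is None:
        row = conn.execute(
            "SELECT template FROM prompts WHERE name = ? ORDER BY version DESC LIMIT 1",
            (name,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT template FROM prompts WHERE name = ? AND version = ?",
            (name, version),
        ).fetchone()
    return row[0] if row else None
```

Rolling back is now `get_prompt(conn, "planner", version=1)` instead of archaeology through Git history.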
So this is also something you can do, and here you can potentially have easier integrations with other applications within your company or ecosystem. It's more flexible — you can do anything here — so it can be tailored specifically to your project needs, allow for dynamic retrieval and complex querying, and enable easier integration, like I've said. But it requires database setup and management, it's complex to implement and maintain, it can also bottleneck local development, and again, it may be overkill for smaller projects.

So those are the five prompt-management methods that we have identified. Here, finally, is the overview again. What you can do is consider where you're at right now, and whenever you feel like you're running into issues with your current approach to managing your prompts, look at what the next step looks like and improve from there.

But now let's cover the approach that we use at Datalumina, which is based around Jinja as a templating engine. Jinja is a powerful Python-based template engine that we've found ideal for managing prompts, offering flexibility and versatility. Here again I want to stress: for most projects, especially when working with small teams, a CMS or custom database setup might be overkill. In those cases we prefer working with Jinja and a predefined folder structure. I would say that for most of our projects right now, this is our approach; with one or two projects we are currently considering setting up a custom database, because things are getting more serious and we want even greater control over our prompts. But let's cover why we use Jinja and some of its benefits, and then I'll share some code examples and show you how this actually comes together within a project.

So, why do we use it? Well, it's centralized within the project: we can just create a dedicated folder, it lives in the project, and every developer can come in and instantly see: okay, these are the prompts.
Jinja also offers a unique file extension, and while this might sound very simple, it's a subtle nuance. What a lot of AI engineers do is either use f-strings, so Python files with f-strings in them, or use, for example, LangChain or LlamaIndex, which again means Python-based files with these prompt templates. But what you get, as you start working on an application, is a bunch of Python files that hold the prompts, with names very similar to the functions that use them — say a product_description_prompt.py next to a product_description.py that contains the actual function to call it — and this can get messy, because it's all Python files. With Jinja, we just know: the .jinja files are where our prompts live, and we can easily open those up.

We can also easily add metadata to these files using front matter; we can enforce input validation, which is a little trickier to do with f-strings; and we can use advanced logic, which is a really big benefit — I'll give some examples, but we can use control structures like if statements, loops, filters, macros, and even prompt inheritance, which reduces redundancy and, done well, can optimize your token usage with improved prompts. It's also highly flexible: when you use an external prompt-management tool, there are some design choices baked in, whereas with this you can basically do everything and tailor it specifically to your needs.

So now let's look at an example. We created a simple PromptManager class acting as the interface to render prompts from our template folder. This is the high-level overview of everything we're dealing with, but I'm going to switch to the code editor right now. Let's see, we've got everything going on over here. First of all, we have the PromptManager class, and if you want to copy it, this is your chance to do so. In this example project we have the app folder, and in there a prompts folder. The prompt-manager file that you're looking at right now also lives there, but then we also have the templates, and you can see that we can specify the template directory, so that when we use the PromptManager class it can just look in there and load the files. It's a very simple prompt manager: it doesn't need initialization, because we use static methods and a class method, so we don't have to initialize anything — we can just directly call get_prompt and get our prompts, and we can also get the template info from it. Two simple functions that we can use.
So now let's have a look at how that works. Let's come over here to the playground, over to the pipeline. In the prompts — let me zoom in a little for you so you can see what's going on — and let's start this up. You can see over here in the prompts we have this ticket_analysis.jinja file, and what we can now do within our project is say PromptManager.