welcome to AI Engineering. This lecture series will be on building AI products, and this first lecture will be an intro to building AI apps. I'm going to touch on foundation models, use cases, planning, and the AI stack. First of all, you should know that this lecture series is for developers building on top of foundation models, and that I'm assuming throughout these lectures that you're familiar with large language models and that you have an idea for a product. It doesn't need to be a final idea, but there should be something in your mind that you can toy with and consider as we walk through these topics. A foundation model is an API-accessible large language model trained for general text-generation tasks. GPT-4, Claude, and Llama are some examples, although there are many others. You can customize foundation models to fit your use case, and this is how you differentiate your product: the real value in building an AI-based product is finding a way to take a foundation model and integrate it into a system that differentiates you and gives you uniqueness.
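To make "API-accessible" concrete, here is a minimal sketch of the kind of JSON request these models accept. The field names follow the OpenAI-style chat format; the model name is illustrative, and other providers (Claude, hosted Llama) use similar but not identical request shapes.

```python
import json

# Illustrative OpenAI-style chat-completion request body.
payload = {
    "model": "gpt-4",  # swap for another provider's model name
    "messages": [
        {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."}
    ],
}

# This JSON body would be POSTed to the provider's endpoint with your API key.
body = json.dumps(payload)
print(body)
```

The point is that "building on a foundation model" starts as nothing more than sending structured text to someone else's API; everything that differentiates your product happens around that call.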
There are several ways to build on top of foundation models. The first I'll talk about is prompt engineering: writing instructions that tell the model how to behave. It's a very simple, primarily text-based process, and if you've ever used something like ChatGPT, then you've prompt-engineered many times. The difference with formal prompt engineering is that you set a specific set of instructions in the model's context that it will use to inform all of its responses. For example, for a chatbot you might write a prompt that tells it to talk like a cowboy, or to behave as though it's a math teacher, and that will inform the way it acts when users interact with it. A layer of complexity higher than prompt engineering is retrieval-augmented generation, or, as it's colloquially called, RAG. This involves connecting your model to information in a database to inform its responses. Following up on our cowboy-chatbot example: prompt engineering gives it the instructions to talk like a cowboy, but to improve the app you might add retrieval-augmented generation with a database full of historical information from the Wild West, so that when users ask a specific question, the model can retrieve that information from the database and use it to inform its responses. At the next layer of complexity up, you have fine-tuning, where you train a model on a dataset of new input-output pairs to specify how it should behave. It's important to note here that foundation models undergo a pre-training process in which they're trained on vast datasets of text, mainly from the internet. The difference with fine-tuning is that you train the model on much more specific and structured data, so it knows exactly how you want it to behave in the wild: you train it on what to expect as inputs and what you want it to give as outputs. This is a much more complex process than simply attaching a database for retrieval-augmented generation, and definitely much more complex than writing prompts in a prompt-engineering workflow, but you'll find that fine-tuning gives you the most control over exactly how your model will behave.
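The three layers just described can be sketched in miniature. This is a toy illustration, not a real library: the retriever is a keyword match standing in for a real vector database, and the fine-tuning record shows the kind of input-output pair such a process trains on.

```python
# 1. Prompt engineering: fixed instructions placed in the model's context.
SYSTEM_PROMPT = "You talk like a cowboy."

# 2. Retrieval-augmented generation: a toy keyword retriever standing in
#    for a real database of Wild West history.
DOCS = [
    "The Chisholm Trail was a major cattle-driving route in the 1860s.",
    "Dodge City was a famous frontier town in Kansas.",
]

def retrieve(query: str) -> list[str]:
    """Return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.lower().split())]

def build_messages(user_query: str) -> list[dict]:
    """Combine the fixed prompt, retrieved context, and user query."""
    context = "\n".join(retrieve(user_query))
    return [
        {"role": "system", "content": SYSTEM_PROMPT + "\nContext:\n" + context},
        {"role": "user", "content": user_query},
    ]

# 3. Fine-tuning: instead of instructing at runtime, you train on explicit
#    input-output pairs, commonly serialized one JSON object per line (JSONL).
finetune_record = {
    "messages": [
        {"role": "user", "content": "Tell me about the Chisholm Trail."},
        {"role": "assistant", "content": "Well, partner, that there trail ran cattle clear up from Texas."},
    ]
}
```

Notice the increasing commitment: the prompt is a string you can change any time, RAG adds a data dependency, and fine-tuning bakes the behavior into the model's weights.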
I want to talk about some use cases for AI. If you do have a product idea, consider which of these buckets it might fall under as I go through them; if you don't have an idea, consider which domain you'd most like to operate in. We have generation and transformation; interaction and decision support; and analysis and optimization. I won't list every example that falls into each bucket, but in generation and transformation you might create a code generator; in interaction and decision support you might create a chatbot or a recommendation engine; and in analysis and optimization you might create some sort of classifier or workflow-automation tool. Similarly, there are different buckets for how the AI will interact with humans: first, critical versus complementary; second, reactive versus proactive; and third, dynamic versus static. The way to think about critical versus complementary is that AI's role is critical if the app will fail without AI (self-driving is an obvious example), whereas complementary means AI is enhancing existing functionality. For reactive versus proactive, a chatbot only reacts to a user action, whereas automated AI traffic alerts anticipate the user's needs as a mapping device updates the user's geolocation. And for dynamic versus static, Face ID operates off a dynamic AI algorithm that needs to learn your face over time, so that, say, two years from now it can still recognize you because it has seen how your face has grown and changed; by contrast, Apple Photos' object-detection AI doesn't need
to update every single day; it can wait for a major iOS update that fine-tunes the underlying model to be better at recognizing objects in pictures. An important way to think about any product is its urgency: whether it's a need or a want, and what level of value it actually offers your target users. If you just want to explore what AI can do, or you want to build a fun new tool using your new AI skills, most of the time whatever you build will fall under a low-urgency category. That shouldn't necessarily discourage you from building something you feel strongly about, but you should recognize that it isn't built to fulfill an urgent need. A level above that, we have medium urgency: these products see an opportunity in the market and aim to capture it with AI. There are many, many use cases here, and if you've been approaching an idea with any sort of product mindset, chances are it falls under this category: you probably saw an opportunity that AI presented and created an idea to capture that latent opportunity. At the highest level, we have saving users from obsolescence. There's a lot of hype and narrative about the ways AI could make humans obsolete in the future, and the ways it's making humans obsolete in the present. If you're building a product that saves users from this obsolescence, it falls under high urgency, and of course, the higher the urgency, the more value the product adds and the higher the likelihood that you can experience a
return on your investment. These are things to consider, though not rigid barriers you need to set for yourself in deciding what you will and won't work on. As far as product defensibility goes, you have to understand that easy-to-build products are easy to replicate. A lot of people are entering the AI space because it lowers the barriers to entry, and that's a great thing; it's really fun to have an idea and build a demo fast. But if anybody could build your product in a week, then somebody else will, and your product won't last. So you need to think about what you can do to differentiate yourself. What is stopping another company from mimicking your prompt engineering or API wrapper? Will the infrastructure you build around GPT-4 still matter when we have GPT-7? Beyond thinking about how reproducible your product is, you have to think about how well it can last: if you're not doing much on top of the underlying LLM infrastructure, then each time a new version of the underlying model comes out, whatever work you've done to augment it will become more and more obsolete.
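One hedge against both worries is to keep model calls behind your own interface, so the provider-specific part of your product is one swappable line rather than something baked in everywhere. This is an illustrative sketch under that assumption; the class and function names are my own, not any library's.

```python
from typing import Callable

class ModelClient:
    """Routes all completions through one interface so the backing model
    (GPT-4 today, GPT-7 or an open-source model tomorrow) is swappable."""

    def __init__(self, complete: Callable[[str], str]):
        self._complete = complete  # any function mapping prompt -> completion

    def answer(self, user_query: str) -> str:
        # Your differentiating logic (prompts, retrieval, business rules)
        # lives here, independent of whichever model is plugged in.
        prompt = f"You are a helpful product assistant.\nUser: {user_query}\nAssistant:"
        return self._complete(prompt)

# Swapping providers is then a one-line change at construction time:
fake_backend = lambda prompt: "howdy, partner"  # stand-in for a real API call
client = ModelClient(fake_backend)
```

The abstraction itself isn't a moat, but it keeps the work you do on top of the model from being welded to any one version of it.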
So if you're trying to build a flash-in-the-pan product just to make a quick buck, maybe this isn't something you need to prioritize; but if you're trying to build something that will last two, five, ten years, you need to consider how your product will be different and how it will have lasting power and differentiability when we're working with GPT-7 instead of just GPT-4. All of that goes into the fact that you need to think about building moats around your product. If you're not familiar with the term from business, you can think of a moat surrounding a kingdom: some natural barrier to prevent intruders from breaking in. Three powerful moats, especially for AI products, are a technology moat, a data moat, and a distribution moat. That is: unique features that aren't easily reproducible, unique information that isn't easily accessible, and unique ways to get users that your competitors cannot replicate. You want to be the first or the best to bring AI to a niche market; if you cast your net too wide, you will be outcompeted. It's a great idea to introduce AI to web search, but Perplexity and ChatGPT are already doing that. It's also a great idea to introduce AI to coding, but Cursor is already doing that. You need to think of a niche market where AI is either not yet applied or not applied well, and devise a way to introduce it to that market such that you can win the market and beat out any further competition. I know it's easy, if you've spent a lot of time operating in the AI space or just thinking about it, to assume that every market is already being served, but that's far from true; and even if markets are being served, chances are they aren't being served optimally, and there's a lot of room to find pockets where you can step in, innovate, and monopolize. So think about how you can find a niche market and bring AI to it first, or better than the competition. A good way to approach this process is to set strong metrics to measure your product's effectiveness. It's not good enough to vaguely assume that I'm going to have the best
product design, or the best marketing, or that my artistic AI product is going to create the coolest generations. You need quantitative measures of your product's effectiveness, so you can know whether the things you're doing behind the scenes to move the levers programmatically are actually working and worth your time and investment. For success measures: you want to measure volume (obviously, how much of your product is moving); labor savings (what does using your product bring the consumer, compared with a different product or doing the work manually); user feedback (you need routes to collect it, and you need to be talking to your users about what they like and dislike about the product); and automation rate (how well does your AI automate a task that would be loathsome or tiresome for a human to do manually). Then we have usefulness thresholds, which are another thing you want to quantify: relevance of outputs (how good is your model at generating the correct or relevant generation for a given user query); cost per request (how much are you spending each time your model, or the models in your infrastructure, run a generation); interpretability (how easy is it to look into your architecture and know what is happening, where something is breaking down, or where something is having an outsized impact, so you know how to iterate and improve); and latency (how long does a user wait between input and output). If latency does break down for any reason, what can you do to improve it? Conversely, if your latency is very low, what might you inject into your architecture that introduces a little extra latency but has an outsized effect on the quality of the outputs? This is why it's important to be able to quantify and measure all of this. Now I'm setting out four milestones that you can think about in taking your idea to production. First, you
want to brainstorm, and especially to validate, an idea. It's very easy to come up with a bunch of cool and innovative ways to apply AI; I'm sure that if you're watching this, you have. But before you get gung-ho and just start building, it's very much worth your while to validate that idea by talking to your prospective customers: ask them what their actual pain points are, what they think could be improved in their workflow or their life via AI and via your idea, and then how much they'd actually be willing to pay for it. Once you've done that and it turns out you actually are onto something, you want to create a business plan. Think about how you will get a return on investment and turn a profit before you dive too deeply into building out and coding your project. It's very easy to get excited, especially after you've passed step one, but if you don't have a business plan, you might miss a key way you could be generating money, or you might start building down a specific route only to find that, yes, you've created something cool, but there's no real way to monetize it. After you have a business plan, you want to reassess feasibility. There's a good chance you get to this point and realize that the idea you brainstormed, validated, and planned for isn't actually going to make you any money, or isn't going to be worth your time, or is going to be very difficult to implement; so after these steps, reassess the feasibility of your idea. And fourth, adapt if necessary. This could mean going all the way back to step one and starting over at brainstorming, or going back to step two and creating a new business plan, or it might just be a higher-level ethos of adaptability that you carry throughout the process. The AI space evolves quickly, so you need to build flexible infrastructure and be model-agnostic. It's very easy to get married to using GPT-4, but for certain reasons that might not be the best option for you. As we'll see in later lectures, there's good reason to involve more than one model in your overall infrastructure; there's very good reason to involve lighter-weight models that might offer lower cost and lower latency; and there's also good reason to potentially use only open-source models, say if you're dealing with clients who want to ensure high security or high fidelity to laws and regulations. And that
brings me to the fact that AI regulation is subject to change, and you need to understand the legality of your product. This might not be top-of-mind for you right now if you have an idea you're really excited to implement, but if you're thinking months and years out, you want to make sure you're not violating any law, and you want to keep an eye on any laws that may emerge during the lifetime of your product, so that you aren't regulated out of the market or sued by a customer who bought your product believing their data would be secure, only for your database to expose all of their sensitive information. There are different gradations to this, but if you're building and scaling a product, you want to be able to say with confidence that it's legal and that you won't get sued or regulated out of the market. Now, the AI stack is similar to most software stacks: you have an application layer and an infrastructure layer. The differentiator here is that sitting between
those is a model layer. Getting into the application layer: there are different ways to tackle this, but I think a good sequence is to start by creating a prompt that instructs your model to behave in a unique way. This can be done with pretty low effort in a coding notebook, but you want to test out a bunch of prompts, iterate on them, and make sure that, through prompting alone, you can get a foundation model to behave as close as possible to precisely fitting your use case. After that, the next step in the application layer is to build an interface that makes it easy to interact with your model. Unless you're building an API, most of your users will not be coders, and you'll want some sort of interface that lets them easily log in, interact with your model, provide the kinds of inputs you expect in production, and get out the kinds of outputs you want your model to produce. You also want to write business logic to collect transactions and user data. Again, this lecture series is AI engineering with a focus on product engineering, so beyond having a well-behaved model and a pretty interface, you want a way to collect money from your users, to authorize users so they can log in to an account, and to let them personalize the way the product acts for them in real time. Lastly for the application layer, make sure you're gathering user feedback to refine your application. There are different ways of doing this: a simple thumbs-up/thumbs-down is one; collecting user inputs and running some analysis on them to better understand how users are interacting with your product, and whether they like it, can be very effective. You'll definitely want to consider privacy policies and make sure users know you're going to be analyzing their data if that's the route you go. Now, for the model layer, a sequence to think about would be to
first prepare and organize data for knowledge retrieval and/or fine-tuning. There are different ways to go about this that I'll get into in subsequent lectures, but for now, know that there's public data and there's private data. You'll have a much stronger hold on your market if the data you're using is not easily accessible, but the flip side is that proprietary, unique data is very hard to get. So for starting out with an MVP, my suggestion is to find whatever public data you have available that can tailor your model to your specific use case, and to decide whether you're going to go the RAG (retrieval-augmented generation) route or the fine-tuning route. Again, my suggestion is to start simple: start with prompt engineering, consider augmenting that with retrieval-augmented generation, and test your model's behavior significantly (I'd even recommend testing it with users, though whether you do is up to you); from there you can decide whether to fine-tune. After this step, you want to create an evaluation system to monitor model performance. Again, it's very easy to go down the rabbit hole of trusting your gut and qualitative impressions like "how good does this seem?", but if you quantify the metrics by which you measure your model's performance, you'll have a much easier time scaling your business, and you can be that much more certain whether the changes you make to your system are having a positive or negative effect. For example, with large language models, a simple change in the prompt engineering, the knowledge retrieval, or even the fine-tuning can have big impacts on behavior, so if you have an evaluation system in place before you start turning the knobs on your overall system too much, you can be sure you have a solid way to know whether the things you're doing matter. Lastly for the model layer, we have optimizing the quality, latency, and cost of model inference. As your system design gets more complex, there are easy ways to increase quality at the expense of latency or at increased cost; conversely, you can also really bring down latency
but that might have a very negative effect on quality. So don't think about minimizing or maximizing any one of these parameters in isolation; consider them all together and optimize the ratios of quality to latency to cost. Then, for the infrastructure layer, we start by selecting a cloud platform to host your application data and manage your compute resources. For a lot of solo developers this can be a daunting task, because they just want to play with AI models and build something cool, but maybe haven't worked with AWS or Azure or GCP before. I would strongly recommend looking into which cloud platform makes the most sense for your product before you get too deep into coding out the application or model layer, because a difference in platform can make a very big difference in, for example, what types of pipelines you'll build between your model and your application layer. So definitely think about where you're going to host all of your resources and code in production. Next, organize your database or databases to store and retrieve both your application-layer data and your model-layer data; these can be the same database or different databases, depending on what works for you. And lastly, create continuous integration/continuous delivery pipelines to iteratively edit and redeploy your software. If you've worked with CI/CD before, this is pretty straightforward, but it's definitely something you need to have solidly in place, so that when you're iterating, updating, and re-releasing edits to your model, your infrastructure, and your application layer, nothing's going to break
and things can be updated and edited seamlessly. For the AI application workflow, I've broken it down into some steps you might use in thinking about how to go from idea to final product. First, we start with a proof of concept. Develop some sort of tool or view that you can test with prospective users or with your internal team, and make sure you have a broad understanding of what the product will look like in a final version. Of course, what it actually looks like will likely change over time, but before you get into coding, before you get into anything too deep, make sure you've developed some sort of proof of concept and validated it with prospective users. Once you have that, you want to gather the necessary data for your use case. This doesn't strictly need to come after you've validated with users, but my recommendation is to do the proof of concept and the prompt engineering first, and then work out what data you want to gather to augment what you've already got going. If you've ever worked with data before, you know you can get really lost in the data-processing process: sometimes it's hard to find the right data, and sometimes it's hard to clean it. My recommendation is that if you have the vision of the proof of concept solidly outlined before you go searching for data, it will be much easier to avoid wasting your time. This might be different if you already have access to some proprietary data, but if you're going looking for public data, then before you waste time cleaning something only to find it isn't that relevant to your use case, have a strong foundation in your proof of concept. Third, we have the model. Again, we're working with foundation models, so up to a certain point there's probably not much you have to do to touch the underlying model, but if you find it makes sense for your use case, then you will want
to strongly consider fine-tuning: preparing data to fine-tune or post-train your model to fit your specific use case rigidly. Different use cases may or may not require this; in subsequent lectures we'll get into how to tell whether you need to fine-tune your model, but this is a general way to think about stepping through the phases of building out the application. And then we have the infrastructure. Once you have your proof of concept, your data, and your model, you want to integrate all of that into an infrastructure on something like AWS or Azure or what have you; there are options beyond the three major cloud platforms, and infrastructure isn't limited to cloud compute. It's also thinking about what database you're going to use, what languages and frameworks you're using, and how you're going to write code to bring all of these modules and components together. This becomes something of a cycle: once you're done with the first three phases, you might have to think about how the infrastructure is going to affect what data you need, or affect the model, or affect the product, and you have to start thinking about all of this as one whole system. That was everything I wanted to cover in this first introductory lecture. I hope you gained some insight into how to take an AI project from idea to final product. In the next lecture we'll talk about foundation models: I'll get into what the pre-training process entails, how to think about the post-training process, and how to think about sampling and hallucinations in large language models. Stay tuned for that.