Introduction to Large Language Models

Google Cloud
Check out how large language models (LLMs) and generative AI intersect to push the boundaries of pos...
Video Transcript:
[Music] How's it going? I'm M, and today I'm going to be talking about large language models. Don't know what those are? Me either. Just kidding, I actually know what I'm talking about: I'm a customer engineer here at Google Cloud, and today I'm going to teach you everything you need to know about LLMs. That's short for large language models. In this course, you're going to learn to define large language models, describe LLM use cases, explain prompt tuning, and describe Google's generative AI development tools. Let's get into it.

Large language models, or LLMs, are a subset of deep learning. To find out more about deep learning, check out our Introduction to Generative AI course video. LLMs and generative AI intersect, and they are both a part of deep learning. Another area of AI you may be hearing a lot about is generative AI. This is a type of artificial intelligence that can produce new content, including text, images, audio, and synthetic data.

All right, back to LLMs. So what are large language models? Large language models refer to large, general-purpose language models that can be pre-trained and then fine-tuned for specific purposes. What do pre-trained and fine-tuned mean? Great questions. Let's dive in.

Imagine training a dog. Often you train your dog basic commands such as sit, come, down, and stay. These commands are normally sufficient for everyday life and help your dog become a good canine citizen. Good boy! But if you need a special service dog, such as a police dog, a guide dog, or a hunting dog, you add special training, right? A similar idea applies to large language models. These models are trained for general purposes to solve common language problems, such as text classification, question answering, document summarization, and text generation, across industries.
The models can then be tailored to solve specific problems in different fields, such as retail, finance, and entertainment, using relatively small field data sets.

Now that you've got that down, let's further break the concept down into the three major features of large language models. We'll start with the word "large." Large indicates two things. First is the enormous size of the training data set, sometimes at the petabyte scale. Second, it refers to the parameter count. In machine learning, parameters (not to be confused with hyperparameters, the training settings you choose up front) are basically the memories and the knowledge the machine learned from model training. Parameters define the skill of a model in solving a problem, such as predicting text. So that's why we use the word "large."

What about "general purpose"? General purpose means the models are sufficient to solve common problems. Two reasons lead to this idea. First is the commonality of human language, regardless of the specific task. Second is resource restriction: only certain organizations have the capability to train such large language models, with huge data sets and a tremendous number of parameters. So how about letting them create fundamental language models for others to use?

This leaves us with our last terms, pre-trained and fine-tuned, which mean to pre-train a large language model for a general purpose with a large data set, and then fine-tune it for specific aims with a much smaller data set.
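To make that pre-train/fine-tune split concrete, here's a minimal sketch in Python. It assumes the Hugging Face transformers and datasets libraries, a generic pre-trained model, and a hypothetical small domain data set ("finance_reviews.csv" with "text" and "label" columns); none of these specifics come from the video, they're just one common way to do it.

```python
# Sketch: fine-tune a generic pre-trained model on a small domain data set.
# Library choice and file name are assumptions, not from the video.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"  # a generic, pre-trained model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Hypothetical small, domain-specific data set with "text" and "label" columns.
dataset = load_dataset("csv", data_files="finance_reviews.csv")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=3),
    train_dataset=dataset["train"],
)
trainer.train()  # adapts the general-purpose model to the specific domain
```

The point of the sketch: the expensive pre-training already happened; you only pay for the small, task-specific pass.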
So now that we've nailed down the definition of what large language models, LLMs, are, we can move on to describing LLM use cases. The benefits of using large language models are straightforward. First, a single model can be used for different tasks. This is a dream come true: these large language models, trained with petabytes of data and generating billions of parameters, are smart enough to solve different tasks, including language translation, sentence completion, text classification, question answering, and more.

Second, large language models require minimal field training data when you tailor them to solve your specific problem. Large language models obtain decent performance even with little domain training data. In other words, they can be used for few-shot or even zero-shot scenarios. In machine learning, few-shot refers to training a model with minimal data, and zero-shot implies that a model can recognize things that have not explicitly been taught in the training before.
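As an illustration of the difference (these exact prompts are mine, not from the video), zero-shot and few-shot usage differ only in how much example data goes into the prompt:

```python
# Zero-shot: ask for a task with no examples at all.
zero_shot_prompt = ("Classify the sentiment of this review as positive or "
                    "negative: 'The battery died after two days.'")

# Few-shot: a handful of worked examples precede the real input.
few_shot_prompt = """Review: 'Love this phone, the camera is amazing.' -> positive
Review: 'Screen cracked in the first week.' -> negative
Review: 'The battery died after two days.' ->"""
```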
Third, the performance of large language models continuously grows as you add more data and parameters.

Large language models are almost exclusively based on transformer models. Let me explain what that means. A transformer model consists of an encoder and a decoder. The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representations for a relevant task.
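Here's a toy sketch of that encoder-decoder flow, using PyTorch's built-in Transformer module purely for illustration; real LLMs are vastly larger, and the video doesn't prescribe any framework.

```python
# Minimal encoder-decoder illustration with PyTorch's nn.Transformer.
import torch
import torch.nn as nn

d_model = 32                      # embedding size (toy value)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 1, d_model)  # encoder input: a 10-token source sequence
tgt = torch.rand(7, 1, d_model)   # decoder input: a 7-token target sequence

# The encoder encodes `src`; the decoder attends to that encoding while
# producing a representation for each target position.
out = model(src, tgt)
print(out.shape)                  # torch.Size([7, 1, 32])
```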
We've come a long way, from traditional programming, to neural networks, to generative models. In traditional programming, we used to have to hard-code the rules for distinguishing a cat: type: animal; legs: four; ears: two; fur: yes; likes: yarn and catnip. In the wave of neural networks, we could give the network pictures of cats and dogs and ask "Is this a cat?", and it would predict "cat." What's really cool is that in the generative wave, we as users can generate our own content, whether it be text, images, audio, video, or more. For example, models like Gemini, Google's multimodal AI model, or LaMDA, Language Model for Dialogue Applications, ingest very, very large data from multiple sources across the internet and build foundation language models we can use simply by asking a question, whether typing it into a prompt or verbally talking into the prompt itself. So when you ask it "What's a cat?", it can give you everything it has learned about a cat.
Let's compare LLM development using pre-trained models with traditional ML development. First, with LLM development you don't need to be an expert, you don't need training examples, and there is no need to train a model. All you need to do is think about prompt design, which is the process of creating a prompt that is clear, concise, and informative. It is an important part of natural language processing, or NLP for short. In traditional machine learning, you need expertise, training examples, compute time, and hardware. That's a lot more requirements than LLM development.
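To show how little is needed on the LLM side, here's a short sketch using the Vertex AI Python SDK. The project ID, region, and model name are placeholders, and the SDK surface may have changed since this was written, so treat it as a sketch rather than a recipe.

```python
# "All you need is a prompt": calling a hosted foundation model via the
# Vertex AI Python SDK (pip install google-cloud-aiplatform).
# Project, region, and model name below are placeholders/assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel("gemini-1.0-pro")        # a hosted foundation model
response = model.generate_content("What's a cat?")
print(response.text)                             # no training step required
```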
Let's take a look at an example of a text generation use case to really drive the point home. Question answering, or QA, is a subfield of natural language processing that deals with the task of automatically answering questions posed in natural language. QA systems are typically trained on a large amount of text and code, and they are able to answer a wide range of questions, including factual, definitional, and opinion-based questions. The key here is that you used to need domain knowledge to develop these question-answering models. Let's make this clear with a real-world example: domain knowledge is required to develop a question-answering model for customer IT support, or healthcare, or supply chain. But using generative QA, the model generates free text directly based on the context, and there's no need for domain knowledge. Let me show you a few examples of how cool this is. Let's look at three questions given to Gemini, a large language model chatbot developed by Google AI.
Question one: this year's sales are $100,000 and expenses are $60,000; how much is net profit? Gemini first shares how net profit is calculated, then performs the calculation, and then provides the definition of net profit. Here's another question: inventory on hand is 6,000 units, and a new order requires 8,000 units; how many units do I need to fill to complete the order? Again, Gemini answers the question by performing the calculation. And for our last example: we have 1,000 sensors in 10 geographic regions; how many sensors do we have on average in each region? Gemini answers the question with an example of how to solve the problem, plus some additional context.
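For reference, the arithmetic behind those three answers is simple; the numbers come straight from the questions above.

```python
# The three calculations Gemini walks through in the examples above.
revenue, expenses = 100_000, 60_000
net_profit = revenue - expenses        # $40,000

on_hand, order_size = 6_000, 8_000
units_needed = order_size - on_hand    # 2,000 more units to fill the order

sensors, regions = 1_000, 10
avg_per_region = sensors / regions     # 100 sensors per region on average
```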
How cool is that? In each of our questions, a desired response was obtained. This is due to prompt design. Fancy! Prompt design and prompt engineering are two closely related concepts in natural language processing. Both involve the process of creating a prompt that is clear, concise, and informative, but there are some key differences between the two.

Prompt design is the process of creating a prompt that is tailored to the specific task the system is being asked to perform. For example, if the system is being asked to translate a text from English to French, the prompt should be written in English and should specify that the translation should be in French.

Prompt engineering is the process of creating a prompt that is designed to improve performance. This may involve using domain-specific knowledge, providing examples of the desired output, or using keywords that are known to be effective for the specific system.

In general, prompt design is a more general concept, while prompt engineering is a more specialized one. Prompt design is essential, while prompt engineering is only necessary for systems that require a high degree of accuracy or performance.
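To make the distinction concrete, here are two hypothetical prompts for the same translation task; these examples are mine, not from the video.

```python
# Prompt design: tailored to the task, clear and concise.
designed = ("Translate the following text from English to French: "
            "'Where is the train station?'")

# Prompt engineering: the same task, plus techniques aimed at better
# performance, such as examples of the desired output and domain framing.
engineered = """You are a professional English-to-French translator for travel phrases.
Example: 'Good morning' -> 'Bonjour'
Example: 'How much does this cost?' -> 'Combien ça coûte ?'
Translate: 'Where is the train station?' ->"""
```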
There are three kinds of large language models: generic language models, instruction-tuned models, and dialog-tuned models. Each needs prompting in a different way.

Let's start with generic language models. Generic language models predict the next word based on the language in the training data. Here's an example: given "the cat sat on", the next word should be "the", and you can see that "the" is the most likely next word. Think of this model type as autocomplete in search.

Next, we have instruction-tuned models. This type of model is trained to predict a response to the instruction given in the input. For example: "Summarize a text of X", "Generate a poem in the style of X", "Give me a list of keywords based on semantic similarity for X", or, as in this example, "Classify text into neutral, negative, or positive."

And finally, we have dialog-tuned models. This type of model is trained to have a dialogue by predicting the next response. Dialog-tuned models are a special case of instruction-tuned models, where requests are typically framed as questions to a chatbot. Dialogue tuning is expected to happen in the context of a longer back-and-forth conversation, and typically works better with natural, question-like phrasings.
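Here is one hypothetical prompt per model type, to show how the framing differs; the examples are mine.

```python
# Generic language model: give it text and it continues it.
generic_prompt = "The cat sat on"          # model completes: " the ..."

# Instruction-tuned model: state a task to perform.
instruction_prompt = ("Classify this text as neutral, negative, or positive: "
                      "'The service was slow but the food was great.'")

# Dialog-tuned model: ask a question as part of an ongoing conversation.
dialog_prompt = ("Hi! I run a small cafe. What are some low-cost ways to "
                 "speed up service at lunchtime?")
```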
Chain-of-thought reasoning is the observation that models are better at getting the right answer when they first output text that explains the reason for the answer. Let's look at the question: "Roger has five tennis balls. He buys two more cans of tennis balls. Each can has three tennis balls. How many tennis balls does he have now?" When this question is posed on its own, with no example response, the model is less likely to get the correct answer directly. However, when the prompt first walks through a similar question and its reasoning, by the time the second question is asked, the output is more likely to end with the correct answer.
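Here's a sketch of what such a chain-of-thought prompt might look like. The first worked example is a made-up exemplar of mine; the Roger question is the one from the video (correct answer: 5 + 2 × 3 = 11).

```python
# Few-shot chain-of-thought prompt: show the reasoning for one question,
# then ask the real one. The first Q/A pair is a hypothetical exemplar.
cot_prompt = """Q: A shelf holds 4 books. Someone adds 2 boxes of books, and each box holds 5 books. How many books are on the shelf now?
A: The shelf starts with 4 books. 2 boxes of 5 books is 10 books. 4 + 10 = 14. The answer is 14.

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A:"""
# A model is now more likely to reason step by step, e.g.:
# "Roger starts with 5. 2 cans of 3 balls is 6. 5 + 6 = 11. The answer is 11."
```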
But there is a catch. (There's always a catch.) A model that can do everything has practical limitations, but task-specific tuning can make LLMs more reliable. Vertex AI provides task-specific foundation models. Let's get into how you can tune with some real-world examples. Say you have a use case where you need to gather how your customers are feeling about your product or service: you can use a sentiment analysis task model. The same goes for vision tasks: if you need to perform occupancy analytics, there is a task-specific model for your use case.

Tuning a model enables you to customize the model's responses based on examples of the tasks that you want the model to perform. It is essentially the process of adapting a model to a new domain, or a set of custom use cases, by training the model on new data. For example, we may collect training data and tune the model specifically for the legal or medical domain. You can also further tune the model by fine-tuning, where you bring your own data set and retrain the model by tuning every weight in the LLM. This requires a big training job, plus hosting your own fine-tuned model.
Here's an example of a medical foundation model trained on healthcare data; its tasks include question answering, image analysis, finding similar patients, etc. Fine-tuning is expensive and not realistic in many cases. So, are there more efficient methods of tuning? Yes. Parameter-efficient tuning methods (PETM) are methods for tuning a large language model on your own custom data without duplicating the model. The base model itself is not altered; instead, a small number of add-on layers are tuned, and these can be swapped in and out at inference time.
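One widely used parameter-efficient technique is LoRA. Here's a short sketch using the Hugging Face peft library; the choice of method, library, and base model is mine, as the video doesn't name any of them.

```python
# Sketch of parameter-efficient tuning with LoRA via the `peft` library.
# Only small adapter layers are trained; the base model's weights stay frozen.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in base model
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"],
                    lora_dropout=0.05, task_type="CAUSAL_LM")

model = get_peft_model(base, config)
model.print_trainable_parameters()  # a tiny fraction of the total parameters
```

Because the base model is untouched, differently tuned adapter layers can be kept side by side and swapped in at inference time, which is exactly the property described above.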
I'm going to tell you about three other ways Google Cloud can help you get more out of your LLMs. The first is Vertex AI Studio. Vertex AI Studio lets you quickly explore and customize generative AI models that you can leverage in your applications on Google Cloud. Vertex AI Studio helps developers create and deploy generative AI models by providing a variety of tools and resources that make it easy to get started. For example, there is a library of pre-trained models, a tool for fine-tuning models, a tool for deploying models to production, and a community forum for developers to share ideas and collaborate.

Next, we have Vertex AI Agent Builder, which is particularly helpful for those of you who don't have much coding experience. You can build generative AI search and conversations for customers and employees with Vertex AI Agent Builder, formerly Vertex AI Search and Conversation. Build with little or no coding, and no prior machine learning experience. Vertex AI Agent Builder can help you create your own chatbots, digital assistants, custom search engines, knowledge bases, training applications, and more.
Gemini is a multimodal AI model. Unlike traditional language models, it's not limited to understanding text alone. It can analyze images, understand the nuances of audio, and even interpret programming code. This allows Gemini to perform complex tasks that were previously impossible for AI. Due to its advanced architecture, Gemini is incredibly adaptable and scalable, making it suitable for diverse applications. Model Garden is continuously updated to include new models.

See, I told you way back in the beginning of this video that I knew what I was talking about when it came to large language models, and now you do too. Thank you for watching our course, and make sure to check out our other videos if you want to learn more about how you can use AI. [Music]