Introduction to large language models

Google Cloud Tech
Enroll in this course on Google Cloud Skills Boost → https://goo.gle/3nXSmLs
Video Transcript:
Hello, and welcome to Introduction to Large Language Models. My name is John Ewald, and I'm a training developer here at Google Cloud. In this course, you learn to define large language models (LLMs), describe LLM use cases, explain prompt tuning, and describe Google's gen AI development tools.

Large language models, or LLMs, are a subset of deep learning. To find out more about deep learning, see our Introduction to Generative AI course video. LLMs and generative AI intersect, and they are both a part of deep learning. Another area of AI you may be hearing a lot about is generative AI. This is a type of artificial intelligence that can produce new content, including text, images, audio, and synthetic data.

So what are large language models? Large language models refer to large, general-purpose language models that can be pre-trained and then fine-tuned for specific purposes. What do pre-trained and fine-tuned mean? Imagine training a dog. You often train your dog basic commands such as sit, come, down, and stay. These commands are normally sufficient for everyday life and help your dog become a good canine citizen. However, if you need a special service dog, such as a police dog, a guide dog, or a hunting dog, you add special training. A similar idea applies to large language models. These models are trained for general purposes, to solve common language problems such as text classification, question answering, document summarization, and text generation, across industries. The models can then be tailored to solve specific problems in different fields, such as retail, finance, and entertainment, using relatively small field datasets.
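To make the pre-train/fine-tune pattern concrete, here is a minimal sketch using the open-source Hugging Face Transformers library. This library, the model name, and the two-label sentiment task are illustrative assumptions, not part of the course:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Start from a general-purpose pre-trained model (the "basic commands")...
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# ...then fine-tune it on a small, field-specific labeled dataset
# (the "special training"). Two toy retail reviews stand in for real data.
batch = tok(["great product", "terrible service"],
            return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])

loss = model(**batch, labels=labels).loss
loss.backward()  # one gradient step of an ordinary fine-tuning loop
```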
Let's further break down the concept into the three major features of large language models. Large indicates two meanings. First is the enormous size of the training dataset, sometimes at the petabyte scale. Second, it refers to the parameter count. Parameters are basically the memories and the knowledge the machine learned from model training; they define the skill of a model in solving a problem, such as predicting text. (Parameters should not be confused with hyperparameters, which are configuration values set before training begins.) General purpose means that the models are sufficient to solve common problems. Two reasons lead to this idea. First is the commonality of human language, regardless of the specific task. Second is resource restriction: only certain organizations have the capability to train such large language models with huge datasets and a tremendous number of parameters, so why not let them create fundamental language models for others to use? This leads to the last point: pre-trained and fine-tuned, meaning to pre-train a large language model for a general purpose with a large dataset, and then fine-tune it for specific aims with a much smaller dataset.

The benefits of using large language models are straightforward. First, a single model can be used for different tasks. This is a dream come true. These large language models, trained with petabytes of data and comprising billions of parameters, are smart enough to solve different tasks, including language translation, sentence completion, text classification, question answering, and more. Second, large language models require minimal field training data when you tailor them to solve your specific problem. Large language models obtain decent performance even with little domain training data; in other words, they can be used for few-shot or even zero-shot scenarios. In machine learning, few-shot refers to training a model with minimal data, and zero-shot implies that a model can recognize things that have not explicitly been taught during training.
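As a rough illustration of the difference, here is what zero-shot and few-shot prompting might look like. The review text and labels are invented, and `generate` is a hypothetical stand-in for any LLM completion call:

```python
# Zero-shot: the task is described, but no examples are given.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Few-shot: a handful of labeled examples precede the real input.
few_shot = (
    "Review: I love this phone. Sentiment: positive\n"
    "Review: The screen cracked within a week. Sentiment: negative\n"
    "Review: The battery died after two days. Sentiment:"
)

# answer = generate(few_shot)  # hypothetical LLM call
```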
Third, the performance of large language models continuously grows as you add more data and parameters. Let's take PaLM as an example. In April 2022, Google released PaLM, short for Pathways Language Model, a 540-billion-parameter model that achieves state-of-the-art performance across multiple language tasks. PaLM is a dense, decoder-only Transformer model with 540 billion parameters. It leverages the new Pathways system, which has enabled Google to efficiently train a single model across multiple TPU v4 Pods. Pathways is a new AI architecture that will handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world. The system enables PaLM to orchestrate distributed computation for accelerators.
We previously mentioned that PaLM is a Transformer model. A Transformer model consists of an encoder and a decoder. The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representation for a relevant task.
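As a sketch of that encoder-decoder structure, PyTorch ships a generic Transformer module. (Note that PaLM itself, as mentioned above, is decoder-only; this shows only the generic architecture, with made-up sizes.)

```python
import torch
import torch.nn as nn

# The encoder reads the input sequence; the decoder generates the
# output sequence from the encoded representation.
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # (input length, batch size, embedding dim)
tgt = torch.rand(20, 32, 512)  # (output length, batch size, embedding dim)
out = model(src, tgt)          # (20, 32, 512): one vector per output position
```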
We've come a long way, from traditional programming, to neural networks, to generative models. In traditional programming, we used to have to hard-code the rules for distinguishing a cat: type: animal; legs: 4; ears: 2; fur: yes; likes: yarn, catnip. In the wave of neural networks, we could give the network pictures of cats and dogs and ask, "Is this a cat?", and it would predict a cat. In the generative wave, we as users can generate our own content, whether it be text, images, audio, video, or other media. For example, models like PaLM, or LaMDA (Language Model for Dialogue Applications), ingest very, very large amounts of data from multiple sources across the internet and build foundation language models we can use simply by asking a question, whether typing it into a prompt or verbally talking into the prompt. So when you ask it, "What's a cat?", it can give you everything it has learned about a cat.
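For contrast, the hard-coded "traditional programming" approach described above might look something like this toy sketch, with the rules and feature names invented for illustration:

```python
def is_cat(animal: dict) -> bool:
    # Every distinguishing feature is a hand-written rule: brittle,
    # and recognizing a new animal means writing yet more rules.
    return bool(animal.get("type") == "animal"
                and animal.get("legs") == 4
                and animal.get("ears") == 2
                and animal.get("fur") is True
                and {"yarn", "catnip"} & set(animal.get("likes", [])))

print(is_cat({"type": "animal", "legs": 4, "ears": 2,
              "fur": True, "likes": ["yarn", "catnip"]}))  # True
```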
Let's compare LLM development using pre-trained models with traditional ML development. First, with LLM development, you don't need to be an expert, you don't need training examples, and there is no need to train a model. All you need to do is think about prompt design: the process of creating a prompt that is clear, concise, and informative. It is an important part of natural language processing. In traditional machine learning, by contrast, you need training examples to train a model, and you also need compute time and hardware.
Let's take a look at an example of a text generation use case. Question answering, or QA, is a subfield of natural language processing that deals with the task of automatically answering questions posed in natural language. QA systems are typically trained on a large amount of text and code, and they are able to answer a wide range of questions, including factual, definitional, and opinion-based questions. The key here is that you need domain knowledge to develop these question-answering models. For example, domain knowledge is required to develop a question-answering model for customer IT support, healthcare, or supply chain. Using generative QA, the model generates free text directly based on the context; there is no need for domain knowledge.

Let's look at three questions given to Bard, a large language model chatbot developed by Google AI. Question one: "This year's sales are one hundred thousand dollars. Expenses are sixty thousand dollars. How much is net profit?" Bard first shares how net profit is calculated, then performs the calculation, and then provides the definition of net profit. Here's another question: "Inventory on hand is six thousand units. A new order requires eight thousand units. How many units do I need to fulfill the order?" Again, Bard answers the question by performing the calculation. And in our last example: "We have 1,000 sensors in 10 geographic regions. How many sensors do we have on average in each region?" Bard answers the question with an example of how to solve the problem, plus some additional context. In each of our questions, a desired response was obtained. This is due to prompt design.
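The arithmetic behind those three answers is straightforward to verify:

```python
net_profit = 100_000 - 60_000  # sales minus expenses: 40,000 dollars
units_needed = 8_000 - 6_000   # order size minus inventory on hand: 2,000 units
avg_sensors = 1_000 / 10       # total sensors over regions: 100 per region

print(net_profit, units_needed, avg_sensors)
```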
Prompt design and prompt engineering are two closely related concepts in natural language processing. Both involve the process of creating a prompt that is clear, concise, and informative; however, there are some key differences between the two. Prompt design is the process of creating a prompt that is tailored to the specific task the system is being asked to perform. For example, if the system is being asked to translate a text from English to French, the prompt should be written in English and should specify that the translation should be in French. Prompt engineering is the process of creating a prompt that is designed to improve performance. This may involve using domain-specific knowledge, providing examples of the desired output, or using keywords that are known to be effective for the specific system. Prompt design is a more general concept, while prompt engineering is more specialized: prompt design is essential, while prompt engineering is only necessary for systems that require a high degree of accuracy or performance.
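As an illustrative sketch (the wording is invented, not from the course), a designed prompt states the task clearly, while an engineered prompt layers on examples and domain cues to improve performance:

```python
# Prompt design: clear, concise, task-specific.
designed = "Translate the following English text to French: 'Good morning.'"

# Prompt engineering: adds domain-specific framing and examples
# of the desired output to push accuracy higher.
engineered = (
    "You are a professional translator specializing in business French.\n"
    "English: Thank you for your order. -> French: Merci pour votre commande.\n"
    "English: Good morning. -> French:"
)
```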
There are three kinds of large language models: generic language models, instruction-tuned models, and dialogue-tuned models. Each needs prompting in a different way. Generic language models predict the next word (technically, a token) based on the language in the training data. For example, given "the cat sat on", the next word should be "the", and you can see that "the" is the most likely next word. Think of this type as autocomplete in search. An instruction-tuned model is trained to predict a response to the instructions given in the input: for example, "Summarize a text of X", "Generate a poem in the style of X", "Give me a list of keywords based on semantic similarity for X", or, in this example, "Classify the text into neutral, negative, or positive". A dialogue-tuned model is trained to have a dialogue by predicting the next response. Dialogue-tuned models are a special case of instruction-tuned models, where requests are typically framed as questions to a chatbot. Dialogue tuning is expected to take place in the context of a longer back-and-forth conversation, and typically works better with natural, question-like phrasings.
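A generic model's "most likely next word" comes from a probability distribution over its vocabulary. A toy softmax over invented scores for "the cat sat on" shows the idea:

```python
import math

# Invented logits for a few candidate next words after "the cat sat on".
logits = {"the": 4.2, "a": 2.9, "my": 1.5, "banana": -1.0}

# Softmax turns raw scores into probabilities that sum to 1.
total = sum(math.exp(v) for v in logits.values())
probs = {word: math.exp(v) / total for word, v in logits.items()}

print(max(probs, key=probs.get))  # "the", the highest-probability next word
```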
Chain-of-thought reasoning is the observation that models are better at getting the right answer when they first output text that explains the reason for the answer. Let's look at the question: "Roger has five tennis balls. He buys two more cans of tennis balls. Each can has three tennis balls. How many tennis balls does he have now?" When this question is posed on its own, the model is less likely to get the correct answer directly. However, when the prompt first walks through the reasoning for a similar question, the output is more likely to end with the correct answer.
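In practice, chain-of-thought prompting just means including worked reasoning in the prompt. Here is a sketch; the juggler exemplar is invented for illustration:

```python
# Direct prompt: asks for the answer with no reasoning shown.
direct = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\nA:"
)

# Chain-of-thought prompt: a worked example demonstrates the reasoning
# first, making the correct final answer (5 + 2 * 3 = 11) more likely.
chain_of_thought = (
    "Q: A juggler has 3 balls and buys 2 more. How many balls now?\n"
    "A: She starts with 3 balls and buys 2 more. 3 + 2 = 5. The answer is 5.\n"
    + direct
)
```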
A model that can do everything has practical limitations, and task-specific tuning can make LLMs more reliable. Vertex AI provides task-specific foundation models. Let's say you have a use case where you need to gather sentiment, that is, how your customers are feeling about your product or service. You can use the sentiment analysis model under the classification task. Similarly, if you need to perform occupancy analytics, there is a task-specific model for that use case. Tuning a model enables you to customize the model's response based on examples of the task you want the model to perform. It is essentially the process of adapting a model to a new domain, or to a set of custom use cases, by training the model on new data. For example, we may collect training data and tune the model specifically for the legal or medical domain. You can also go further with fine-tuning, where you bring your own dataset and retrain the model by tuning every weight in the LLM. This requires a big training job, and hosting your own fine-tuned model. One example is a medical foundation model trained on healthcare data, with tasks including question answering, image analysis, finding similar patients, and so forth. Fine-tuning is expensive and not realistic in many cases. So are there more efficient methods of tuning? Yes. Parameter-efficient tuning methods (PETM) are methods for tuning a large language model on your own custom data without duplicating the model. The base model itself is not altered; instead, a small number of add-on layers are tuned, and these can be swapped in and out at inference time.
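One common family of parameter-efficient methods uses small "adapter" layers; the course does not name a specific technique, so this is a minimal PyTorch sketch of the general idea:

```python
import torch
import torch.nn as nn

base = nn.Linear(768, 768)   # stands in for one frozen pre-trained layer
for p in base.parameters():
    p.requires_grad = False  # the base model itself is never altered

class Adapter(nn.Module):
    """Small add-on layer: the only weights that get tuned."""
    def __init__(self, dim: int = 768, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        # Residual add-on: the base output passes through unchanged,
        # plus a learned low-dimensional correction.
        return x + self.up(torch.relu(self.down(x)))

adapter = Adapter()  # different adapters can be swapped in at inference time
out = adapter(base(torch.randn(1, 768)))
```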
Generative AI Studio lets you quickly explore and customize generative AI models that you can leverage in your applications on Google Cloud. Generative AI Studio helps developers create and deploy generative AI models by providing a variety of tools and resources that make it easy to get started. For example, there's a library of pre-trained models, a tool for fine-tuning models, a tool for deploying models to production, and a community forum for developers to share ideas and collaborate.

Generative AI App Builder lets you create gen AI apps without having to write any code. Gen AI App Builder has a drag-and-drop interface that makes it easy to design and build apps, a visual editor that makes it easy to create and edit app content, a built-in search engine that allows users to search for information within the app, and a conversational AI engine that allows users to interact with the app using natural language. You can create your own chatbots, digital assistants, custom search engines, knowledge bases, training applications, and more.
The PaLM API lets you test and experiment with Google's large language models and gen AI tools. To make prototyping quick and more accessible, developers can integrate the PaLM API with MakerSuite and use it to access the API through a graphical user interface. The suite includes a number of different tools, such as a model-training tool, a model-deployment tool, and a model-monitoring tool. The model-training tool helps developers train ML models on their data using different algorithms; the model-deployment tool helps developers deploy ML models to production with a number of different deployment options; and the model-monitoring tool helps developers monitor the performance of their ML models in production, using a dashboard and a number of different metrics.
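At the time of this course, the PaLM API could be called from Python roughly like this. This sketch follows the early google.generativeai quickstart; the client library and model names have changed since, so treat the exact calls as assumptions:

```python
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # key obtained via MakerSuite

response = palm.generate_text(
    model="models/text-bison-001",  # PaLM text model name at the time
    prompt="Explain what a large language model is in one sentence.",
)
print(response.result)
```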
That's all for now. Thanks for watching this course, Introduction to Large Language Models.