GenAI Essentials – Full Course for Beginners

freeCodeCamp.org
Learn the essentials of working with AI in the cloud from @ExamProChannel. This comprehensive course...
Video Transcript:
hey, it's Andrew Brown, and welcome to the free Generative AI Essentials course. The purpose of this course is to give you comprehensive knowledge so that you can go and start building generative AI applications. In particular, this is the prerequisite course for my free GenAI Bootcamp, where we go out and build real projects. I hope you're as excited as I am, and that you get through as much of this content as you can, so you can go build projects with me. I will see you soon.
Hey everyone, it's Andrew Brown, and welcome to the start of our journey, asking the most important question first: what is the GenAI Essentials certification? It's a practical GenAI certification teaching you the fundamental concepts of ML, AI, and generative AI; all modalities of GenAI with a strong focus on LLMs; and programmatically working with GenAI workloads, both with cloud and local LLMs, in a cloud-vendor-agnostic way. There are other GenAI certifications out there: AWS has one, Azure has one, NVIDIA has one, and I've made courses for them all. They're okay, but they're very specific to those vendors, they leave out huge gaps, and they carry a lot of expectations about what you should already know before you start building with GenAI. The reason I made this GenAI Essentials course is so that you have broad, well-rounded knowledge, so you can be successful in any area of GenAI no matter the technical choice, and to me that is super important. I'm going to get my head out of the way here.
The course code for the ExamPro GenAI Essentials is EXP-GENAI-001. I do want to point out that this feels like a beta course at this point, and I'm hoping at some point to release a version 002 that includes everything I want to include. It's still really good compared to other certifications; it'll blow them out of the water. But the initial release will have some gaps, and they'll be addressed in minor updates as we receive feedback, so pay attention to updates and upcoming roadmaps for this content. This is not a one-and-done course; we'll have to come back and continuously update it, because GenAI is moving very quickly.

So who's this certification for? Consider the ExamPro GenAI Essentials if you are preparing the prerequisite knowledge for the free GenAI Bootcamp, so you can be successful in its completion and grading. In my bootcamps we build; we hit the ground running. My expectation is that you can code and have general familiarity with this stuff, and this course prepares you for that. If you don't take this course before the free GenAI Bootcamp, you're going to struggle a little. You can still do the bootcamp, but if you want to maximize how much you learn, please go through this course first; it will help you out.
Another reason to consider the ExamPro GenAI Essentials: you need broad, practical knowledge of GenAI solutions so you have the technical flexibility to move in any direction. Right now we don't know what direction GenAI is going. Do you use cloud services or local LLMs? Should you be fine-tuning, or using RAG? How much does it cost? How do you secure it? There are so many questions that need answers, and this course is my exploration. Think of it like a CTO exploring and evaluating on behalf of the company; I'm coming back to you with all that information so you can make informed decisions and build with GenAI confidently. That's what I hope to achieve throughout this course.

You also need to focus on implementation: delivering GenAI workloads that are both secure and on budget. At the end of the day we have to build things for people; that's why we're learning GenAI, to apply the knowledge. I generally focus on implementation and development because I have a developer background and I think building things is fun, so you're going to see a lot of hands-on work in this course. Whether you can do every hands-on is a different story. If you find the hands-on material is beyond your current knowledge, just watch the videos and absorb what you can. This is all about getting you exposure to the most knowledge in the shortest amount of time; if you cannot do everything, do not stress about it. Let's continue on.
We're looking at the GenAI roadmap. First there's this certification course, which gives you confidence in understanding and talking about GenAI, plus the tools to get started building. It's based on the ExamPro GenAI maturity learning model, which we'll look at in a separate video. The follow-up to this GenAI Essentials course is the project-based bootcamp, the free GenAI Bootcamp, which gives you proof that you can build GenAI workloads through hands-on projects; that's at genai.cloudprojectbootcamp.com. Register if you haven't yet; go register now. Even if you don't finish this GenAI Essentials course, you can absolutely register. This is an optional course to prepare you, so it's up to you, but I strongly recommend going through it if you can. If you only have time for the bootcamp, sign up for the bootcamp and be there.

How long does it take to study to pass this exam? Between 15 and 30 hours.
If you're experienced and hold other GenAI certifications, you'll have an easy time, but you'll still benefit, because there's a lot of material you won't have seen elsewhere. If you took the AWS AI Practitioner, you'll be really surprised how much we cover that you didn't see anywhere else; same with the Azure AI Engineer Associate. I couldn't cover those things in those courses because they were out of scope, but they're not out of scope here; we're covering all of GenAI in this course. So, roughly 15 hours if you're experienced, 30 hours if you're a beginner. What's the average study time? I can't tell you for this one because we don't have enough data yet, but this course is more lecture- and lab-heavy; practice exams won't be your main focus, since the exam itself is not difficult. It's just there to evaluate your knowledge.
That's why the certification step is optional. I do want to tell you that if you want to get certified and earn the badge, you do have to pay for it. I don't know the exact cost as of this video, but I know it's not high. If you're just gaining the knowledge to take the bootcamp, still do this course; if you want the extra badge to show off, you can pay for the certification. The recommended study time is one to two hours a day for 15 days.

What does it take to pass the exam? Watch the lectures and do the hands-on labs. Again, we work with so many different technologies that there will be areas where you might not have an account, or you might think, "I want to learn this, but I don't really like Azure, so I'll just watch the video." That's totally fine. There might also be things that need particular hardware, say a Windows, Mac, or Linux machine you don't have, or a certain amount of compute. That's fine too: just watch and do the best you can, but where you can do things, go do them. There are a lot of free solutions, and we try to focus on those as much as possible.

We usually suggest paid online practice exams, but for this course we have a free practice exam and there are no other providers, because this is an exam made by us. You'll have one shot at the free practice exam, and then you can pay to take the exam that gets you the certification. Sign up for the free practice exam at exampro.co under course code GENAI-001. For grading, the passing score is above 750 out of 1,000 points, so you need roughly 75% to pass.
Note that you can technically fail with exactly 75%; it depends on how many questions there are, because the points don't always split perfectly, so always aim above that score. For response types, there are 65 scored questions, and you can afford to get 16 scored questions wrong (so you need about 49 correct). There are no unscored questions. Other exams include unscored questions; I don't in mine, because I think it's stressful, and people would rather know exactly what they're getting. There's no penalty for wrong answers. The question formats are multiple choice, multiple answer, and case studies. Case studies can be worth more than a single point, so the distribution isn't exactly one point per question, and case studies are more complex, but there aren't many of them, so I wouldn't stress about it. For duration, you get two minutes per question: the exam time is 130 minutes, and the seat time is 160 minutes.
Seat time refers to the total time you should allocate for the exam. It includes time to review instructions, show the online proctor your workspace, read and accept the NDA, complete the exam, and provide feedback at the end. Where do you take the exam? Online, from the convenience of your own home; ExamPro delivers the exam through its own proctoring system. I don't think people know this, but we actually have our own proctor system. The question is what format it will show up in, because we have an external portal and one that's integrated, and I don't know which is set up for this. I might make an additional video, available only in the platform, explaining how to take the exam, so look out for that. It shouldn't be too difficult; it's all ExamPro. If you're on YouTube, go to ExamPro and sign up for free, and you can figure out from that portal where to take the exam.

"Proctored" means there's a supervisor, a person who monitors students during the examination. Our proctor system is a bit different: we may capture the information and audit it later, so it's not always a live proctor. It captures information, there's an audit period, and if everything checks out we release your results. We'll see how things go with that.

How long is this exam valid? Effectively forever, because of how it works. Other certifications say they're good for two to three years and require recertification. We don't have a partner network here, so there's no forced recertification, but if there's a major update you might want to consider recertifying, because there's new knowledge. The certification holds for as long as there are no new versions. Hopefully that's clear. We didn't do an exam guide breakdown; I'll do
that in a separate video, but we are going to look at the maturity model shortly.

Hey everyone, it's Andrew Brown, and I'm joined here by Rola. We are going to work our way through the GenAI roadmap. Earlier, in the introduction, I described this as a maturity model. Originally this was a very detailed flowchart, or roadmap, that Rola had produced along with Don, and I reorganized it into what I call a maturity model. The idea is that we can step through it and have a way of understanding where we might be in our GenAI journey. By no means is this an exhaustive list; there is so much in GenAI. But for those starting out trying to understand GenAI, I'm hoping this becomes a way to understand it, and once you have that core knowledge, you can be introduced to something more complex or robust. So what I want to do here is walk through this with Rola and confirm that we have a good scope of learning for the course. Not everything here is included in the course, just because of the strict time limit I had to produce it, but I still want everyone to see the scope of what I think people should learn. We'll start at the top.
Actually, we do have a lot of content in the course beyond the ML introduction, but I think that if you're going to learn generative AI, you probably need some understanding of machine learning. Rola, do you think you can skip it, or is some ML basics an absolute necessity?

"I think it depends on what you want to do. If you want to be just a practitioner who sets up a single system, maybe you can skip it. But if you want to stay in the field, you probably need some understanding of the bigger picture. With lots of things you can get away with not knowing for a while, but at some point you will have to learn it, and sometimes it's better to learn something earlier on than to push it back. I don't think a lot of these things are that hard, like vectors and embeddings. At first they seem very intimidating, like seeing the Transformer architecture and all those moving parts, but there's a lot of content out there that breaks them down into smaller parts. You don't have to be a mathematician to understand the basics of ML, just as you don't need to be a car mechanic to use a car. If you want to rent a car and use it once, that's fine, but if you're going to use it every day, when it breaks down you need to know where to look."

For those watching: we're both a bit tired, because yesterday was New Year's Eve and we were both up past midnight, so just understand that our energy levels are high in our minds. Anyway, after you learn some basics of machine learning, I think that's a good basis to come into generative AI.
Obviously a lot of people are interested in agents; that's what's talked about a lot these days, at least in what I hear when we talk about GenAI. There are some things like agentic workflows and agentic coding. It's a bit odd that I put them here, because these are very advanced topics, but I put them in early because I know people want to know about them and talk about them. It wasn't so much to dive into them in depth as to tell people: this is the angle you might want to head toward, if that makes sense. Responsible AI is maybe a higher priority for some folks; I'm not sure what else could go in here, and even though this part looks really small, the GenAI introduction is really more like this whole part here.

One thing that is really interesting is the modalities of generative AI. A lot of folks just focus on text generation and LLMs, but there's a whole slew of things: video generation, 3D generation, image generation, audio generation. Are there any more types of generation I'm missing, Rola?

"There's a bunch on protein and DNA, which can look like text but are specialized types of text. There's a lot to do with protein folding that has also used generative AI. So there are a few very niche topics, but the main ones are text, image, video, audio, and 3D; the niche ones have to do with health science or very specific fields."

Well, it's a bit tricky there, because video is just a series of images, and 3D files are just vertices in text files. But yes, I guess there are specialized ones, and beyond what you mentioned in the health sciences, I don't know of anything else we would consider a modality; these are the main ones.
One thing that's really hard to figure out: if you have video-to-video, or text-to-image, is the modality text or video, based on which one it has? The way I placed them is by output: if a model outputs that thing, I put it in that modality. But I'm not sure we have to think about inputs and outputs for these different types of models in such a clear-cut, binary way.

"No, and I think we put them there initially just to give people the idea that there is a lot beyond text generation; there's a whole slew of things happening. What usually happens is that if you're looking for something specific, you look at a model card, and it will tell you what the input and output of that model are. So it's enough to know that other modalities exist, in both input and output, and that you can find a model that suits the input/output combination you are interested in."
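To make that concrete, here is a small sketch of checking a model's declared task and tags programmatically, using the huggingface_hub client rather than reading the card by hand. This assumes the package is installed; the model id is just an illustrative example.

```python
from huggingface_hub import HfApi

# Ask the Hugging Face Hub what a model declares about itself.
api = HfApi()
info = api.model_info("openai/whisper-large-v3")  # example model id

print(info.pipeline_tag)  # e.g. "automatic-speech-recognition"
print(info.tags)          # tags often hint at input/output modalities
```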
output combination that you are interested in and I think that since the creation of this um uh this flowchart actually there's been an explosion of video generation tools that I can't even name them all so that's why I'm kind of hesitant to put uh put names here I mean whisper is still hanging around for audio generation but uh yeah for video there's everybody has one right now um and so I don't know it just kind of shows like how fast things are moving even in a few months like there'll be just an explosion of models
um in those areas but when you know the basis of these things but if you know video to video you know what video to video is right uh at least at a high level of what that means because underneath that could mean something completely else and it's moving very fast I think all of the all of the most of the things that we've seen in the public eye are about two years years old at the moment so whenever you read an article on gen or something notice the date because things get outdated very fast well
it's also really hard to figure out what to include practically because um as I'm doing stuff everyone's like oh use that did you hear of this constantly did you hear this did you hear that and I'm just opening these things and going well which is relevant and which is not but I think that um even if you do use a tool that might become dated I think that there are things that are always going to be in that realm of that kind of stuff and so it's not fully wasted information you just kind of have
to move as quick as you can and you're going to have to expect that uh things are going to be uh the floor is going to be moving uh moving around on you all the time yeah for text generation text to XX single turn multi- turn I have multimodel multimodal multimodel a lot of terms that seem very similar now we're into LM Basics which is still a generative AI a lot of stuff here we have a repeat of ethics and bias because we have resp responsible which have some overlap um and so you know looking
into LMS probably the most important thing is understanding I don't know like tokenization aony of appr prompt um I'm not sure what else there is there but yeah if you fundamentally understand I think like what a context window is uh and what it can do then you have a lot of power just with llms there um we have it we're like I'm supposed to have a section on prompt engineering down here and even this I didn't do all of the videos but I realized that as was making some of the content here I subconsciously was
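Here is a rough sketch of both ideas at once: the anatomy of a prompt (system instructions plus a user message) and counting tokens so you know how much of the context window the prompt consumes. This assumes the tiktoken package (OpenAI's tokenizer library) is installed; "cl100k_base" is one common encoding, not the only one.

```python
import tiktoken

# Anatomy of a prompt: role-tagged messages, not just a blob of text.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain what a context window is in one sentence."},
]

# Tokenization: the model sees tokens, not characters or words.
enc = tiktoken.get_encoding("cl100k_base")
total = sum(len(enc.encode(m["content"])) for m in messages)
print(f"~{total} tokens of the context window used by the prompt text")
```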
I'm supposed to have a section on prompt engineering down here, and I didn't do all of the videos, but I realized as I was making some of the content that I was subconsciously using prompt-engineering techniques I wasn't even aware of. I'm not sure how much value there is in learning every possible prompt-engineering technique, because there are walls and walls of possible ways to do things. But it's interesting that if you just have a creative idea of how to feed things into a prompt, you might implicitly be using them. Rola, how much do you keep on top of prompt-engineering techniques? Is it worth going and learning all 200 of them, or no?

"No, I don't. Things are going to keep changing. The main techniques that have shown promise are good to know, and those are usually a handful, which we can talk about in the course. I think understanding the anatomy of the prompt is really important; that's what allows you to understand what you can do and how it affects the model. There will be a lot of papers and new findings as they come, but I don't think they all move the needle in huge ways. You'll hear about the ones that do, and those are a handful; you don't need to know all 200."

In terms of AI-powered assistants, I feel that's one of the first entry points a lot of people have with generative AI, but when it came to these videos I didn't have much to say, because they're all kind of the same and there's not that much functionality built around them. Some AI-powered assistants do have bells and whistles: Claude has the idea of Projects, ChatGPT has that really cool voice feature, and Google Gemini can supposedly process larger PDFs in a special way. AI21 technically has an AI-powered assistant too. But some of these aren't intended to act as AI-powered assistants at all; they're just demonstrations. When you go to Meta or Mistral, I don't think you're supposed to use them as your day-to-day assistant; it's mostly to show you a means of doing that. Rola, with your level of knowledge, do you use AI-powered assistants, or do you roll your own, so to speak?

"I tend to like to do things my way, because I trust my own systems, but we do rely on Anthropic's Claude quite a bit, and my team uses Cursor a lot for coding and things like that. My main issue is probability: I would like to understand the model's confidence in any particular response, and right now it doesn't tell you; it just gives you an answer."
I never thought about that. When you're using models directly, would they give you a probability, or would you have to do some post-calculation to get that?

"In predictive AI, most models give you some sort of probability or confidence, some measure of how well the model thinks it did."

But with LLMs, would they also explicitly return that?

"Most LLMs do not. Because they're generating sequences word after word, they don't tell you how well the model thinks it did. But if you're doing a classification, for example, it would tell you."
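As a side note, some raw LLM APIs can expose per-token log-probabilities, which is about the closest thing to that confidence signal. A hedged sketch with the OpenAI SDK, assuming the openai package and an API key are set up; the model name is illustrative.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": "Is Paris the capital of France? Answer yes or no."}],
    logprobs=True,      # ask for per-token log-probabilities
    top_logprobs=3,     # and the 3 most likely alternatives per token
)

for tok in resp.choices[0].logprobs.content:
    # logprob is the natural log of the token's probability
    print(tok.token, tok.logprob)
```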
That's a good point, because when we talk about LLMs, everyone talks about needing evaluations. Is it that the nature of LLMs makes it hard to produce that number, so you need an additional step, whereas a classical machine learning model produces a single value and can give you the probability back at the same time?

"Kind of. Predictive machine learning was traditionally very task-specific: you trained a model on a very specific task, it learned that task, and it gave you some metric of its performance on that result. Generative AI, and foundational LLMs in particular, tend to be general purpose: they can do a lot of things, including things beyond what they were trained to do. That's where all these evaluation metrics come in. They tell you how a model benchmarks on coding, how it benchmarks on reasoning, how it benchmarks on accuracy, because these models can do much more than we trained them on. They're very, very large models, and emergent properties come with that."
I'll reiterate that AI-powered assistants are where a lot of people start learning, but when you use them you find there's no programmatic way to start implementing them in your own solutions. The purpose of this course is to show folks how to go beyond AI-powered assistants. Of course you can leverage this course to do the best you can with an assistant, but you really want to move beyond them. At this point in the roadmap I feel I should draw a line, because this is where the real story starts. This is where you get into workbenches and playgrounds. Ironically, I showed where the workbenches and playgrounds are in these various solutions, but I didn't really use them; I skipped right to code, because there's not much to talk about. They look a lot like assistants, but they have more knobs to turn, like top-k and temperature. Once you learn one, they all feel very similar, but some of them have interfaces to extended functionality: Cohere has a little box for the JSON response format, and in Anthropic's workbench, if you want to do tool use, you can do that in place.
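Those playground knobs map directly onto API parameters. A minimal sketch with the anthropic SDK, assuming an ANTHROPIC_API_KEY is set; the model name is illustrative.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    temperature=0.2,  # lower = more deterministic output
    top_k=40,         # sample only from the 40 most likely tokens
    messages=[{"role": "user", "content": "Give me three names for a bakery."}],
)
print(message.content[0].text)
```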
Some of them will also generate the code for you so you can start using it, though I can't say they always generate good code; some of them actually produce older code. I'm really surprised by how many AI services have a version-two API while their main service is still on version one. It almost feels like these companies can't keep up with their own changes, which is really surprising. Rola, what's your experience with documentation and how accurately it reflects the current versions of things?

"Usually it's behind. Part of it is the speed of things, but part of it is cost: the older model is likely smaller and likely cheaper, so some of it is cost."

Once you start working with things programmatically, at least in this way, even before you start writing code you might think about what kind of models you want to use, and there are all kinds of models out there.
There are foundational models; fine-tuned models (technically foundation models are fine-tuned into instruct models); instruct models; embedding models; open source models; edge-optimized models, which we could also call SLMs (small language models) or on-device models; that one in particular has a lot of names. And there are uncensored, or let's say unrestricted, models; "uncensored" doesn't sound great. Why do we have so many darn models?

"These are all different terms describing different metadata about the models. Open source models are models where the weights are open, plus the data, plus the code. Open-weight models are models where the weights are open but not the data and the code. Each term describes a very specific scenario. Edge-optimized models are usually smaller models that you could put on your phone or edge devices, as opposed to the larger models that need a server. Because there are so many different ways you can use these things, you have to have words to describe the different situations."

I think there are probably a lot more open-weight models than open source models, and there can be confusion between the two. I always think, "this is an open source model," but is it? I can download the weights and do things with it, but I don't really know how they trained it.

"Mistral and Meta tend to release open-weight models, but they're not open source models: the data and the code that generated the models are not there. OpenAI is a closed model, and Anthropic is a closed model. So there are different ways in which these things are released, allowing people different types of control, and that's where the naming comes in."

If you're watching, yes, I'm updating this diagram as we go; that's how it is, you collect information and find new ways of organizing it. One open source model I can always remember is IBM Granite, because they say their model is truly open source, and by that they mean you even know what data was sourced to train the model. It's really interesting, the different levels there. Next we have models-as-a-service, which I think is probably one of the easiest ways to start working with LLMs and some other modalities.
These providers give you a unified API to access multiple different models. Google, Amazon, Azure, and Alibaba all have one; actually, I haven't tried Alibaba yet, I just know they have it. I said I was going to go use Alibaba, and I never did. Are there any others out there? The only other one I can think of in the models-as-a-service space is Groq; that's G-R-O-Q for those listening, since there are a lot of "Groks" in the AI space and it gets a bit confusing. That one was amazing: it's super fast, and it's only for open source models, but they don't have a paid tier, so you can't really use it for production. They do have a business plan, but it's a bit hard to get in. Once we're done with models-as-a-service, it's time to get serious.
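One nice property of many model-as-a-service providers is that they expose an OpenAI-compatible API, so a single client can work across them by swapping the base URL. A hedged sketch against Groq, assuming you have a Groq API key; the endpoint and model name may change over time.

```python
from openai import OpenAI

# Same OpenAI client, pointed at a different provider's compatible endpoint.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="YOUR_GROQ_API_KEY",  # placeholder, not a real key
)
resp = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # an open-weight model Groq has served
    messages=[{"role": "user", "content": "Hello from a unified API!"}],
)
print(resp.choices[0].message.content)
```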
Now we're getting into very serious development. Some of the earlier items are cloud services, but this could involve actually downloading models to your computer, or using best-in-class third-party cloud services. This part isn't organized exactly the same way as the course, but now we're into dev tools and workflows. Something folks should know about is the GenAI lifecycle and LLMOps; these are concepts that sound a lot like the development lifecycle and DevOps in the cloud space, applied here. Then there are core developer tools people will always come across. I'm on Hugging Face all the time; it's like the GitHub of the AI/ML world. I was surprised that I used Ollama more than I thought I would. Do you ever touch Ollama, Rola, or is it not for you?

"I have, but I know people love it. It's open source, and it's a lot more open than a lot of other systems, so people like it. I haven't personally played too much with it."
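To give a sense of why Ollama comes up so much: it runs a local server with a tiny REST API on port 11434. A sketch, assuming Ollama is installed and you've already pulled a model (e.g. with `ollama pull llama3.2`); the model name is illustrative.

```python
import requests

# Ollama serves a local REST API; this is a single non-streaming completion.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])
```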
I don't even remember what "LMS" on this diagram is; it must have been something at one point, but now I don't know what it is, and I'm almost tempted to delete it. We've got llamafile, LangChain, LlamaIndex. LangChain and LlamaIndex were things I heard a lot about early on when I was learning generative AI, and once I started gaining knowledge I ended up not touching them at all; I don't think I even bothered to teach them. I think they're actually really good as learning tools, but the promise of what they were supposed to provide was an opinionated framework so you could rapidly prototype things, and because this space has moved so quickly, a lot of the documentation was out of date, or the implementations underneath didn't live up to the promise. Rola, do you use any orchestration services or frameworks for rapidly building out LLM solutions, agents, or workloads?

"Our team sometimes uses LangChain, and Gradio, to create very fast prototypes. But like you said, everything's catching up. We use AWS Bedrock a lot, and now you can do a lot of these things natively in Bedrock: workflows, model chaining, and so on. Those frameworks were some of the first tools that came out, but now a lot of platforms are catching up."
It's really surprising how robust Amazon Bedrock is becoming. On the Azure side they were leveraging OpenAI, but now they have their own generic offering like Amazon Bedrock. I'm not sure why I put GGML over here; GGML and GGUF are by the same creator. llama.cpp is something I just want to do more with, because it sounds so awesome and it comes up a lot, but I didn't get to spend too much time with it. And ONNX is something I really wish I'd had time to talk about more, because it came up a lot for compatibility of model weights. If I remember right, it's a file format, like GGUF but different. Rola, do you remember what ONNX is for in particular?

"This one is for interoperability: the idea of using the same formats, the same way you encode the models, so that you can exchange and reuse the same systems across different models and different providers. I know a lot about interoperability from healthcare, where everybody does what they want and it becomes a headache, and that's where the AI space is as well. Every provider releases in whatever encoding they want, with whatever tokenizer they want, and encodes on disk the way they want, and that makes it a bit hard to use similar systems across models and providers."
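A small sketch of what that interoperability looks like in practice: exporting a PyTorch model to the common .onnx format, which other runtimes can then load. This assumes torch is installed; the tiny linear layer is just a stand-in for a real model.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)            # stand-in for a real trained model
example_input = torch.randn(1, 4)  # an example input defines the graph shape

# Export to ONNX; the same file can then be served by ONNX Runtime
# on CPU, GPU, or edge devices, independent of PyTorch.
torch.onnx.export(model, example_input, "model.onnx")
```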
It's definitely an interesting topic; it would make life easier for everybody if we had some common ground. And see, I knew you knew it, because as soon as I opened it up, it took you two seconds to know what it was.

App prototyping: I come from a developer background, so I really enjoy AI code assistants and app prototyping. In this course we covered Streamlit and Gradio, and I think we did FastHTML, v0, and Lovable; there were actually a few other ones I covered as well. On the other side we have AI code assistants: Amazon Q Developer (I did cover that one), GitHub Copilot, Gemini Code Assist, Codeium, Cursor. Codeium is in Windsurf now, and I'd say I really like Windsurf, but again, things change every other day; these tools are neck and neck.
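To show how little code app prototyping takes, here's a hedged Gradio sketch wrapping a placeholder function; you'd swap the placeholder for a real model call.

```python
import gradio as gr

def echo_assistant(prompt: str) -> str:
    # Placeholder: call your LLM of choice here instead.
    return f"You asked: {prompt}"

# A complete web UI in one line; launch() starts a local server.
gr.Interface(fn=echo_assistant, inputs="text", outputs="text").launch()
```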
Having general knowledge about the tools, with conceptual understanding, rather than being focused on just one thing, makes your technical path a lot easier: you're never left behind, you're just constantly jumping ship to new tools. That's not a big deal, except when every tool costs $20 a month and you have to decide which one to unsubscribe from.

I put extra emphasis in this course on setting up the developer environment. I found that was one of the largest challenges of working with local models: understanding how to set up conda, installing Docker, pulling containers, setting up a Jupyter server, because that's a very common way of working. I was also really surprised by how many free options there are right now to learn. It is expensive to run models, and a lot of people think that makes it prohibitively expensive to learn GenAI, but I was blown away by how many free resources there are right now.
Every company is putting out their own Kool-Aid for you to drink, because they really want you to buy in, and I'm sure at some point this will change as the space matures. But right now is the best time to learn, because there are so many free resources it's unbelievable. Rola, do you agree? Do you see a lot of free learning opportunities right now?

"I do. There's a lot from all the big providers, AWS, GCP; Hugging Face has a lot of things; and there's a lot from different sources, small startups as well. There are always great resources; it's a matter of time and commitment."

Even on Hugging Face, they have all these projects running; you open one up and they have something called ZeroGPU, or "zero" something, which is some kind of distributed compute that lets you run things. This isn't exactly the one I meant to click on, but it's very interesting that there are so many of these running, and I wonder how they have the money for all this compute, but they do.
One part that's a little hard for me to cover is GenAI security; I'm not sure if that's because things are still emerging. We do have a few friends on the security side who are going to help us expand on this at some point, but I imagine it's rapidly changing to match the needs of this expanding market. Learning how to work with containers, I think, is very valuable; there are a few different solutions out there, not just the OpenAI-related ones, and NVIDIA has one called AI Workbench. I haven't seen much tooling around containers and k8s, and when I've talked to folks in the Kubernetes and cloud-native community, they all say, "I know containers really well, but I don't know anything about GenAI," and I'm like, but you know containers! Those who know how to work with containers might have an extra edge, if you will, for deploying things into production, and we definitely need people who know how to do that.
Serving models, I think, is also very useful. It's unbelievable how many ways you can serve a model: just as web servers exist, LLMs have servers, and I assume general ML models have servers too. There's a small list of them here; I don't think I covered vLLM, but there were a few others I did cover. Rola, do you have to deal with serving models often? What does serving models look like for you?

"When you put a model out for usage, we call it inference, or serving: you put it on a machine so people can ping that machine and get their answers. We tend to do a lot of things native to AWS, so you can do that through SageMaker and Bedrock quite nicely. If you want to go outside the cloud or use your own solutions, then yes, you handle it yourself. But the main point is to actually serve the model; that's how you make it public and usable."

I think all the major providers' ML platforms have their own inference engines, or whatever you want to call them, and I remember the one for SageMaker. These are really interesting as well. Ray stood out to me because it lets you do things distributed. I didn't realize that a lot of these things only serve on a single machine, and you have to do something like Ray to take vLLM and spread it across machines, which was really interesting.
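For flavor, a sketch of local serving and batch inference with vLLM. It assumes the vllm package is installed and a GPU is available; the model name is illustrative, and distributing across machines with Ray would be configured on top of this.

```python
from vllm import LLM, SamplingParams

# Load a model into vLLM's inference engine (single machine here).
llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Write a haiku about GPUs."], params)
print(outputs[0].outputs[0].text)
```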
Then there's model optimization; I have a few things listed here, and stuff keeps shifting around. There's another one I don't think I covered, called WebNN or something; I can't remember exactly, but it was a Microsoft project. That stuff changes all the time. One thing I thought might be really important is understanding the underlying hardware. We really don't know what direction GenAI is going, at least I don't; maybe Rola knows. What does it look like to build a workload, and what do you need to do that? There are different levels of models and inference, and in my mind I think of it as lightweight, medium, and heavy-duty inference and training. By that I mean: if I have a modern computer and it can run, say, a seven-billion-parameter model, that's a light workload, something feasible for me to run, or something that could end up on my phone one day. So I think: what does that do for the end business or customer, if they can now run this size of model locally? You have a one-time capital cost of a $1,000 or $2,000 machine, and now you can build a workload scoped only for your office. Is that a benefit to anybody?
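Some back-of-the-envelope math for that "light/medium/heavy" framing. These are weights-only estimates (real usage adds KV cache and runtime overhead), but they show why a quantized 7B model fits on a consumer GPU or a high-end laptop.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit")]:
    gb = weight_memory_gb(7, bits / 8)
    print(f"7B model at {label}: ~{gb:.1f} GB just for weights")

# Roughly: fp16 ~13 GB, int8 ~6.5 GB, 4-bit ~3.3 GB
```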
Because this stuff is becoming more and more possible to run on our own devices, I keep thinking we need to know a little more about it. Rola, do you have any thoughts on the hardware side?

"The hardware side is interesting because these models are so large that they push the limits of what we know about hardware. We're going to talk a little about model sizes and what that means for running them, and understanding how these models have pushed our need for custom-made chips is fairly interesting. Like you said, having edge-compatible or edge-optimized models can change applications in our world as we know it. We saw a lot of different phones this year whose main catch was that AI now runs on the phone. GenAI on the edge is predicted to be a very lucrative and big trend."

It will be very interesting to see what happens. The other thing is just understanding how many resources these things take, because when you think about how much it takes, you might ask: how is this sustainable, what's going to happen to some of these services out there, and how are they going to make a profit? And if I invest my time and effort, should I go with this, or with something more feasible that I can run at this size?
It's really hard to say. I was talking to one of my friends who is really excited for the bootcamp, specifically the hardware side, and they wanted to go spend a bunch of money building a home lab, because they don't want the rug pulled out from under them, so to speak: the price goes really high right when the service has become essential to their dailies. So they're trying to invest in that, and I thought that was an important inclusion; whether I did a good job on it is up to the folks watching.

Then we're getting into more advanced territory. Basically everything below this line (ONNX probably belongs below it too) is what I'd describe as advanced techniques. Though to be fair, if you're going to be deploying models, you're probably going to want evaluations and guardrails and things like that. This is probably where I've done the poorest job in this course, and maybe it should be its own course, because all of these topics are very involved, at least I think they are. Rola, do folks who are becoming engineers, or at least picking up generative AI skills for their companies, need to learn fine-tuning, as in actually be able to do it, or just know of it?
"I think knowing of it is always a good thing. It depends on the field you're in. If you're in a general field, these models are advancing fast enough that you might not have to do it on your own. But if you're in a niche field, where the model needs to be fine-tuned on your specific field of data, then you probably need to know how. For a general company with general use cases, just knowing it exists is probably enough at this point."
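For the "know of it" level, here is roughly what a parameter-efficient fine-tuning setup looks like: a hedged sketch with Hugging Face's peft library (LoRA), not one of the course's own labs. It assumes transformers and peft are installed; the model name and target modules are illustrative.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a base model, then attach small trainable LoRA adapters to it.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # which layers get adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

# Only a tiny fraction of the weights will actually be trained.
model.print_trainable_parameters()
```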
I think the only thing that might be missing from this roadmap is all the general data skills you need; maybe it needs its own data primer. I think I put some data content in the course, but I don't have a data-specific video on how you prepare data for ML models or LLMs. Rola, how important is data to your models? Does the data you put in matter?

"You're asking me how important data is for ML models? That's your question?"

Well, imagine you want to train a model to do something: are 20 lines of data good enough? How much time do you need to spend on data, compared to actually engineering LLM solutions?

"Let's start with regular ML, before LLMs. Data is the bread and butter of machine learning: the model learns from data, so the data has to be there, it has to be of good enough quality, and it has to be pretty consistent so the model can learn from it. For LLMs specifically, it's less of an issue for you; it's still an issue for the model, of course, because it's still learning, but the provider of that foundational model has done all the work. Unless you need to fine-tune it, or change anything about it, or adapt it to your field, you don't need to worry about it, because fine-tuning LLMs is more a privilege; not everybody can do it. How much data are these things trained on? In the words of Yann LeCun, one of the pioneers of deep learning and machine learning, it would take you and me 20,000 years to read.
That's how much data these things get to see before they become useful. So can you and I sit down and curate a dataset tomorrow? Maybe, if we commit a few months to it, but it's not for the faint of heart."

I kind of wonder, with datasets that large, how do they know what's good and bad data when they have to consume that much? Or does it just even out?

"No, that's the thing: we're seeing all sorts of issues, and we see biases. A lot of the data is scraped off the internet, and there are biases in it; we're seeing these models perpetuate social bias, because these days the datasets are not necessarily cleaned. It's also really important to know what datasets the models have been trained on. I was looking at one of the Google models, I can't remember which one, and something to the tune of 40% of the data was from social media. That's not good, right? Nobody curates social media data; there are biases in it, and there's no accuracy. That's the important difference between, for example, open source versus open-weight models: there's more transparency over the data the model has seen. If you have a chance to look at what percentage is social media data, what percentage is literature, what percentage is science, whatever the breakdown of the content the model has seen, it gives you a better understanding of what that model looks like. It's like looking at somebody and asking: did this person go to college, or did they spend ten years on social media? That could be a model, depending on what sources it saw."
That makes me think of a movie; for those who haven't seen it, you might want to: it's called Idiocracy. Rola, do you know it?

"My husband likes that movie."

There's a really good line in there I always forget; it's about Brawndo, "the Thirst Mutilator," the product in the movie. Anyway, it's a bit scary to think they're scraping social media data. But I just wanted to show people the roadmap, to give an idea of the boundaries and scope of generative AI. This stuff can be a lot more complicated, so I did my best to simplify it and break it down. In reality, if this were graphed properly, everything would be interconnected all over the place; I made it look like a step-by-step journey, whereas it's really all over the place. If anyone has ever played the video game Path of Exile, which has that famous skill tree, I feel it would look more like that: you click in, and it just goes everywhere. If we drew it like that, people would take one look and say, "I'm done." I think this is a fair compromise. Thank you, Rola, for helping us with the roadmap; for folks watching, Rola is going to show up in a few
other videos here. And yeah, that's about it. Ciao!

Hey, this is Andrew Brown. I just wanted to point out that there are some areas I wasn't able to fully record, partly because of time constraints, but also because there are other really good resources out there you can use to fill any gaps before the bootcamp starts, or to make sure you're successful if you're going for the GenAI Essentials certification. I've compiled a list, along with freeCodeCamp, at genai.cloudprojectbootcamp.com on the freeCodeCamp page, and we'll probably keep adding to this list of things you can use to fill in those gaps. For example, fine-tuning is something I didn't get to do a whole lot with; if you go over there, you can see there are resources on fine-tuning. Or Google's offering, which I didn't get to cover much, but there are tons of resources about it, and some of these
are my videos as well. So I'm just suggesting you take whatever you can to prepare, and that list is at genai.cloudprojectbootcamp.com on the freeCodeCamp page. So there you go.

Hey, this is Andrew Brown, and we are taking a look at the definition of artificial intelligence. We really want to put it against the terms machine learning, deep learning, and generative AI, so it's very clear what the differences are. Often people just say AI when they mean ML or deep learning, so understand that these terms are often used loosely. People will generally understand what you're trying to say, so it's not a big deal if you use them out of turn, but let's make sure we know what they are.

First, artificial intelligence, also known as AI: machines that perform jobs that mimic human behavior. The key is that they are human-like, doing tasks you'd expect a human to do. That is clearly a very broad definition, so you can see why a lot of things get attributed to being AI. Then you have machine learning. Machine learning, initialized as ML, is machines that get better at a task without explicit programming. Of course we have to code a machine learning model, but once we have that model and we pass things into it, it's able to complete its task with its very complex algorithms. You can also think of it as a special algorithm that performs a task, sparing you from
having to do the calculations or programming yourself. Then there's deep learning. When we think of a lot of AI, we're usually thinking of deep learning: machines that use an artificial neural network, inspired by the human brain, to solve complex problems. You've probably seen a graphic of interconnected nodes passing through layers; that's deep learning. A lot of people call that machine learning or AI, but specifically, it's deep learning. Then we have GenAI. Generative AI, which is partly a marketing term, is a specialized subset of AI that generates content such as images, video, text, and audio. I don't have it in the graphic on the left, because it's hard to say where it goes; it is a subset of AI, but technically GenAI often utilizes deep learning, because a lot of GenAI techniques, like large language models or vision models, use neural networks. So it is deep learning.

All right, I know we keep talking about what AI and GenAI are, but we're going to cover it again so it becomes clearer from different perspectives. What is artificial intelligence? AI is computer systems that perform tasks typically requiring human intelligence.
natural language recognizing speech and images and an ai's goal is to interpret analyze and respond to human actions it's there to simulate human intelligence in machines when we use the word simulate we're talking about mimics aspects resembles behaviors but what we're not talking about is emulation which is replicating exact processes and mechanisms as if you created literally a virtual human brain that's what emulation would be um so AI applications are vast and include areas such as expert systems natural language processing also known as NLP speech recognition robotics uh and more AI is using various Industries
for tasks such as uh we're talking about business to Consumer so think of a customer service chatbot if we're looking at e-commerce think of a recommendation system if we're talking about the Auto industry uh maybe we're looking at a atomous Vehicles if it's medical then medical diagnosis there's a lot of applications for AI but it's a broad application for all sorts of things now let's take a look at generative AI so generative AI uh often initialized as geni or or said as geni is a subset of AI that focuses on creating new content or data
Generative AI, often initialized as GenAI, is a subset of AI that focuses on creating new content or data that is novel and realistic. It can interpret or analyze data, but it also generates new data itself. The types of content produced would be text, images, music, speech, and other forms of media. It often involves advanced machine learning techniques: it could be using GANs, it could be using VAEs (variational autoencoders), and a lot of current LLMs use the Transformer architecture, so if you're using ChatGPT or Claude Sonnet or any of the popular ones, they're basically all Transformer architectures. GenAI has multiple modalities, and when we say modalities, think about your senses: touch, taste, hearing, smell. Modalities are the kinds of content, or senses, that a model has. So we have vision (realistic images and videos), text (generating human-like text), audio (composing music), and molecular, which is a more interesting one (drug discovery via genomic data). I want to make it clear again: large language models (LLMs) generate human-like text and are a subset of GenAI; it's just one modality of the many modalities, but it's often conflated with being AI or GenAI just because it's the most popular, most in-demand, and most developed right now. So make sure you understand that GenAI and AI are not all about large language models; that's just one modality, one application, of the broad sense of AI and GenAI. Now let's make a side-by-side comparison, and after this I'm sure you'll definitively know the difference between AI and GenAI. In terms of functionality, AI focuses on understanding and
decision making, whereas GenAI is about creating new and original outputs. For data handling, AI analyzes and makes decisions based on existing data; GenAI uses existing data to generate new and unseen outputs. In terms of applications, AI spans various sectors, including data analysis, automation, NLP, and healthcare, whereas GenAI (and yes, I see the spelling mistake on the slide) is creative and innovative, focusing on content creation, synthetic data generation, deepfakes, and design. So there you go. Let's talk about Jupyter. Jupyter Notebook is a web-based application for authoring documents that combine live code, narrative text, equations, and visualizations. Before it was called Jupyter Notebook, it was known as IPython Notebook, and Jupyter notebooks were overhauled and turned into an IDE called JupyterLab, which we'll talk about in a moment; you generally want to open notebooks in Lab. The legacy web-based interface is known as the Jupyter Classic Notebook, and to be honest, I get confused between JupyterLab and Classic. I think most things these days use JupyterLab, but the confusion comes from the fact that we just call them all notebooks, even though Jupyter Classic Notebook is the older one and JupyterLab is the newer one. Let's take a look at JupyterLab. JupyterLab is the next-generation web-based user interface: it has all the features of the classic Jupyter Notebook in a flexible and more powerful user interface, so it has notebooks, terminals, a text editor, a file browser, and rich outputs. The way you know you're using JupyterLab is that it will have these tabs on the side and a bunch of extra functionality.
JupyterLab will eventually replace the classic Jupyter Notebook, and that's kind of true but not fully, because in some places I still come across classic notebooks being launched; for the most part, though, it has functionally been replaced. Then we have JupyterHub. JupyterHub is a server that runs JupyterLab for multiple users; it's intended for a class of students, a corporate data science group, or a scientific research group, and it has some components underneath. You will also come across notebook-like experiences that resemble JupyterLab, because some companies extend its functionality. One example is SageMaker Studio Classic: for whatever reason, AWS spent all this time creating extensions and extending JupyterLab, and then decided not to have extensions anymore and to just use the vanilla version. There are also things like VS Code, which has notebooks, or Colab, which has notebooks; VS Code is its own kind of notebook implementation, so it's not JupyterLab, but it is JupyterLab-compatible. Just understand that you'll come across notebooks that look
like JupyterLab but are not necessarily JupyterLab, okay? Let's take a look at natural language processing, also known as NLP. In machine learning, this is a technique that can understand the context of a corpus; a corpus is a body of related text, the text you are working with. NLP intersects computer science and linguistics: if you know a lot about the nature of spoken and written language, computer science meets you in the middle so that we can make sense of it using algorithms. NLP enables us to do things like analyze and interpret text within documents, emails, and messages; interpret or contextualize spoken text, like sentiment analysis; synthesize speech, such as a voice assistant talking to you; automatically translate spoken or written phrases and sentences between languages; and interpret spoken or written commands and determine appropriate actions. Another term you'll hear a lot is natural language understanding, which is a specialized subset of NLP that goes further toward understanding meaning than more traditional, older ways of doing NLP. Anyway, let's take a look at a very simple flowchart to give you some idea of things related to NLP. This is mostly to expose you to some terms; it's not important to remember what they all are, and I can't even describe them all off the top of my head, but when you see them later you'll go, I remember seeing that term. The flowchart has three groupings: text wrangling and pre-processing; language understanding, which is
structure and syntax; and processing functionality, which is what NLP does for you in the end. Text wrangling and pre-processing is where you prepare text to be put into, possibly, a machine learning model, or maybe to be used for some kind of analysis; it's basically taking text, formatting it, and changing it. So what could we be doing here? We could be doing conversions: maybe we're lowercasing things, maybe we're uppercasing things, maybe we're turning contractions into their full forms or vice versa. Sanitation is where you strip out HTML or special characters, or remove stop words, since stop words can cause trouble later in your ML models. Tokenization is splitting the text into tokens (words or subwords) that can later be mapped to vector embeddings. We also have stemming and lemmatization. There's a lot here, but you can see it's mostly formatting the text to be utilized for something else; the sketch below gives a rough idea of what a pre-processing pass looks like in code.
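This is a minimal sketch of my own, not something from the slides: the stop-word list and contraction map are tiny hypothetical examples, and a real pipeline would use a library like NLTK or spaCy instead.

```python
import re

# hypothetical mini stop-word list and contraction map, just for illustration
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or"}
CONTRACTIONS = {"don't": "do not", "it's": "it is"}

def preprocess(text: str) -> list[str]:
    text = text.lower()                                  # conversion: lowercasing
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)                 # expand contractions
    text = re.sub(r"<[^>]+>", " ", text)                 # sanitation: strip HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)                # sanitation: drop special characters
    tokens = text.split()                                # tokenization: split into word tokens
    return [t for t in tokens if t not in STOP_WORDS]    # remove stop words

print(preprocess("<p>It's raining and the dog is WET!</p>"))
# ['it', 'raining', 'dog', 'wet']
```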
Next we have language understanding: these are processes for making sense of the text's structure and syntax. Part-of-speech tagging: is this an adjective, is this a noun, things like that. Chunking: how can we break up the text and work with those chunks later so that they still make sense. Dependency parsing: which words rely on other words, and what relationships do they have to one another. Constituency parsing (a very hard term for me to say): imagine a tree where, say, a noun has an adjective under it, which has another thing under it; if you look it up on Google Images you'll know what I'm talking about. Then we have processing functionality: what are we using NLP for? We have named entity recognition, which is where you have a body of text and it highlights important words, like important nouns it thinks you care about, or personally identifiable information. We've got n-grams; sentiment analysis (is this text positive or negative, happy or sad); information extraction (what are we trying to get out of a large body of text); and similarly information retrieval, question answering, and topic modeling. A quick taste of named entity recognition follows below.
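As a hedged taste of that processing functionality, here's named entity recognition using Hugging Face's pipeline helper, which we'll properly meet in the BERT videos later; I'm letting the library pick its default NER model, so exactly which checkpoint gets downloaded depends on your transformers version.

```python
from transformers import pipeline

# named entity recognition: highlight the important words in a body of text
ner = pipeline("ner", aggregation_strategy="simple")
for entity in ner("Andrew Brown is teaching a GenAI course from Toronto for freeCodeCamp."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 2))
```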
Again, it's not super important to know these in depth right now; the ones that matter, we will see again, and you'll know what they are then. Don't worry about memorizing this, just get that exposure to NLP terms. Okay. Hey, this is Andrew Brown, and we're looking at the concept of regression. This is the process of finding a function to correlate a labeled dataset into a continuous variable or number. Imagine we need to predict a variable in the future, such as the weather: what is it going to be next week? The idea is that you plot your data onto a graph, or vector space, where our dots are represented as vectors, and we draw a line through it, which we call a regression line. The point of the regression line is that it is our prediction: if this is temperature going over time, that line is how we figure out what things are going to be in the future. The distance of a vector from the regression line (let me grab a different colored pen here), say from this dot to the line, is what we call an error, because things closer to the line match the prediction and things farther from the line are in error from it. There are different error metrics used when fitting and evaluating the line, such as mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE), and depending on how you fit your line, the prediction is going to change. Here's a small sketch of that.
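This is a minimal scikit-learn sketch with made-up temperature numbers, purely to show a regression line being fit and those three error metrics being computed; it's my own illustration, not code from the course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error

# hypothetical data: day number vs. temperature in Celsius
X = np.array([[1], [2], [3], [4], [5], [6], [7]])
y = np.array([18.0, 19.5, 21.0, 20.5, 22.0, 23.5, 24.0])

model = LinearRegression().fit(X, y)   # fit the regression line
pred = model.predict(X)

mse = mean_squared_error(y, pred)
print("MSE: ", mse)                    # mean squared error
print("RMSE:", np.sqrt(mse))           # root mean squared error
print("MAE: ", mean_absolute_error(y, pred))

# the prediction: extrapolate along the regression line into the future
print("day 14 forecast:", model.predict([[14]]))
```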
Okay, let's take a look at classification. This is the process of finding a function to divide a labeled dataset into classes or categories, so the idea is we're going to predict a category to apply to the inputted data: will it rain next Saturday, is it going to be sunny or is it going to be rainy? We have our data and we plot it on a graph, but this time we draw a classification line that divides the dataset; if a point falls on one side, it's sunny, and if it falls on the other side, it's rainy. Again, a different algorithm is the thing doing the division, and it's going to give different results. You have logistic regression, a decision tree, a random forest, you can use a neural network, you can use naive Bayes (I always say that one wrong, so I do apologize), or you can use KNN, or a support vector machine (SVM). Understand that there are more algorithms than these, but these are the common ones, and if you want to learn more about how the different algorithms change the division, look it up on the internet; there are definitely visualizations out there. Here's a quick sketch of one of them in code.
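A minimal sketch, assuming scikit-learn and synthetic data standing in for the sunny-or-rainy example; the feature values are arbitrary, and logistic regression is just one of the algorithms named above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# hypothetical two-class weather data: two features -> sunny (0) or rainy (1)
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression().fit(X_train, y_train)   # learn the dividing line
print("accuracy:", clf.score(X_test, y_test))
print("new point lands on side:", clf.predict([[0.5, -1.2]]))
```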
Okay, let's talk about clustering. This is the process of grouping unlabeled data based on similarities and differences, and the key word here is unlabeled; when we looked at classification, that was labeled data. The idea here is that we're grouping based on similarities or differences, so imagine that this grouping of dots that are close together we determine to be Windows machines, and this other group of dots are Mac computers. Just like with classification and regression, different algorithms are going to give you different results, and the reason I show you these algorithm names is that when you do classification, regression, or clustering, you're going to see these names, because you'll have to choose which algorithm you want to use. A quick clustering sketch follows.
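Here's a minimal sketch, assuming scikit-learn's k-means (one common clustering algorithm) and two made-up blobs of points standing in for the Windows-versus-Mac groups.

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical unlabeled data: two loose groups of points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # one group, e.g. "Windows"
               rng.normal(5, 1, (50, 2))])   # another group, e.g. "Mac"

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])        # cluster assignment per point
print(kmeans.cluster_centers_)    # the center the algorithm found for each group
```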
Right now it's not so important to know them, but when they are important we will look at them in more detail. Okay, so we are going to dive into the types of machine learning in other slides in more detail, but this is just an overview so you can see these terms up front; we'll go through it quickly, grouping them by what they're trying to do. First are learning problems: we have supervised, unsupervised, and reinforcement, three terms you're going to hear quite a bit with machine learning. The key thing is that supervised is where you have labeled data, and unsupervised is where you're working with unlabeled data. Reinforcement is an agent that operates in an environment and must learn to operate using feedback; this kind of sounds like the agentic workflows or agentic coding we talk about with GenAI, which we'll learn about later, but the idea is, imagine you wanted a machine learning model that played the Mario or Sonic video game: that would be reinforcement learning. Then we have hybrid learning problems: semi-supervised, self-supervised, and multi-instance. Semi-supervised is where you have a mix of labeled and unlabeled data, a lot of unlabeled data and a little labeled data, so it's a mix between supervised and unsupervised. Self-supervised, I believe, is where the model can label its own data, but we'll confirm that in future slides. Multi-instance is where we have examples of unlabeled data and we bag them together; again, we'll cover that later. We have statistical inference: inductive, deductive, and transductive. Inductive is using evidence to determine the outcome; deductive is using general rules to determine specific outcomes; and transductive is predicting specific examples given specific examples from a specific domain. Then for learning techniques we have multi-task, active, online, transfer, and ensemble. Multi-task is fitting a model on one dataset that addresses multiple related problems. Active is where the model is able to query a human operator during the learning process. Online is using the available data and updating the model before a prediction is made, which kind of sounds like RAG when we're talking about GenAI, but again, this is general machine learning. Transfer is where a model is first trained on one task and then some or all of the model is used as a starting point for a related task. And ensemble is where two or more models are fit on the same data and the predictions from each model are combined.
So yeah, we're going to see these terms again; I'm just trying to get them in front of you up front. Okay, let's take a look at the divisions of machine learning. This is just another way to break up machine learning, and these terms reflect how we're going to structure the upcoming slides, so here's a quick overview. We have classical machine learning: the advantage here is the data is simple and you have clear features, and generally classical machine learning is extremely cost-efficient compared to other types. This is where supervised and unsupervised learning live, so when you think of classical machine learning, think of those two things. Then you have reinforcement learning: this is when there is no data, and the idea is that the model, through trial and error, figures out the right thing to do. This is where we have real-time decision making, game AI (we talked about an ML model playing Mario or Sonic, failing again and again until it can pass the game), learning tasks, and robot navigation; think of autonomous driving vehicles, which would be a good case for reinforcement learning. We have ensemble methods, for when the quality of data is a problem: you use different strategies to work with multiple models or algorithms to get a better outcome, and here we have things like bagging, boosting, and stacking; you'll definitely see the word boost more when we get to that. Then we have neural networks and deep learning; you should really just think of deep learning as neural networks. This is for when the data is complicated and/or the features are unclear, and it's where you'd use networks like a convolutional neural network (CNN), a recurrent neural network (RNN), a GAN (generative adversarial network), multi-layer perceptrons (MLPs), or autoencoders. I have a really hard time pronouncing these things, but you're going to see these terms again, so don't worry about it right now. Let's take a look at classical machine learning. When we say classical, we're talking about algorithms that have existed for quite a while, maybe as early as the 1950s, because mathematicians figured these out, and a lot of them actually relate to statistics; we're taking statistics and utilizing it in these algorithms in our computing spaces. Hopefully that makes sense.
They're called classical ML because we're dealing with well-defined algorithms; one example would be the nearest neighbor algorithm, which was invented in 1967. Lots of companies today could definitely utilize classical machine learning to solve business problems; just because these techniques are old does not mean they're not good, it's just a matter of organizations knowing how to adopt them. Let's first talk about supervised learning. This is where we have data that has been labeled into categories, and it's great when we're doing something task-driven, when we're trying to make a prediction, because the idea is we have this labeled data and then we can bring in unlabeled data and tell the machine to label it. Here we have classification: imagine we want to predict what category something belongs to; a use case would be identity fraud detection. We have regression, where maybe we want to predict a variable in the future, say a market forecast. We've covered classical regression already, so you should know what these are; if not, you'll know soon enough, because we'll cover them more than once. Then for unsupervised learning, we have data that has not been labeled. This is where things are data-driven: we recognize a structure or a pattern, rather than making a very specific prediction. Here we have clustering, where you group data based on similarities or differences; an example would be targeted marketing. Association finds relationships between variables; the use case here would be a customer recommendation.
And we have dimensionality reduction, which helps reduce the amount of data, often as pre-processing; the problem is you have a lot of data, and a use case would be big data visualization. So there you go. All right, let's compare supervised versus unsupervised learning. I know we've already talked about it twice, but we're going to talk about it again, and then again, because I'm trying to give it to you from different perspectives so you really know the difference. So what is supervised learning? This is a machine learning task or function that needs to be provided training data, and training data is where you provide labeled data, the correct answers, and the machine can learn from those results: show me how to do it, and then I can do it on my own. That's what's happening here. For supervised learning models, we have classification and regression. What about unsupervised learning? This is a machine learning task or function that needs no existing training data; it will take the unlabeled data and discover its own patterns, applying its own labels: I am an independent worker, I can figure this out on my own. For unsupervised learning models (the slide should say unsupervised there, let me just fix that), we have clustering, association, and dimensionality reduction. Supervised learning tends to be more accurate than unsupervised learning but requires more upfront work, whereas unsupervised learning still requires human intervention to validate the results. So hopefully that is clear. Okay, let's review it one more time. I know it's getting tiresome, but
it's very important that you remember the difference between supervised, unsupervised, and reinforcement. Supervised learning is where the data has been labeled for training; it's task-driven and you're making a prediction. This is for when the labels are known, you want a precise outcome, and you need a specific value returned. Here we use classification and regression as examples of supervised learning; there are more than just those two, but that's what I want you to know for now. We have unsupervised learning: data that has not been labeled, where the ML model needs to do its own labeling. It is data-driven; you're recognizing a structure or a pattern, for when the labels are not known and the outcome does not need to be precise, when you're trying to make sense of data. Here we have clustering, dimensionality reduction, and association. Then you have reinforcement learning: there's no data, there's an environment, and the ML model generates data over many attempts to reach a goal. This is decision-driven; you have game AI, learning tasks, and robot navigation. Hopefully that is clear and it's in your head. We're going to repeat these again, but with
less of this and more detail. Okay, let's talk about supervised learning models; we're going to cover classification and regression again, just so we really know what these things are. Classification is the process of finding a function to divide a dataset into classes or categories. Imagine: will it be cold or will it be hot tomorrow? Very clear, it's either one or the other; it's going to fall on one side of the line or the other. We have different algorithms we can use, like logistic regression, k-nearest neighbors, support vector machines, kernel SVMs, naive Bayes, decision tree classification, and random forest classification, so we're listing a lot more here. Then, what is regression? Regression is the process of finding a function to correlate a dataset into a continuous variable or number: what is the temperature going to be tomorrow? Here we have things like simple linear regression, multiple linear regression, polynomial regression, support vector regression, decision tree regression, and random forest regression. Again, I just want to continuously repeat these so you know what they
are. Okay, let's take a look at unsupervised learning. What can we do here? We have clustering, and again, we've covered these before, but I really want to make sure you know what they are. Clustering is the process of grouping unlabeled data based on similarities and differences; we used an example previously, you know, is this a Mac or is it a Windows machine. Here it's about age and another variable, and it's saying, do these people have cholesterol, are they high-risk or low-risk? For clustering algorithms we have k-means, DBSCAN, and k-modes. Then we have association: association is the process of finding relationships between variables, so the idea is that if somebody buys bread, then suggest butter, because based on previous combinations we know what people want. There are different algorithms for that; I cannot pronounce those words, so I'm not going to attempt it, but you can see them on the right-hand side of the slide, and there's a small example of one of them just below.
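A minimal sketch, assuming the mlxtend library and the Apriori algorithm (one common association-rule algorithm; I'm picking it for illustration since the slide's names aren't read out), with a made-up market-basket dataset for the bread-and-butter idea.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# hypothetical market-basket data: "if bread, then suggest butter"
transactions = [["bread", "butter"], ["bread", "butter", "jam"],
                ["bread", "milk"], ["butter", "milk"], ["bread", "butter", "milk"]]

te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

frequent = apriori(df, min_support=0.4, use_colnames=True)   # frequent item sets
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "confidence"]])
```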
Then we have dimensionality reduction. This is where we reduce the amount of data while retaining the data's integrity, and it's often used as a pre-processing stage. We have lots of algorithms for this: principal component analysis, linear discriminant analysis, generalized discriminant analysis, singular value decomposition, latent Dirichlet allocation... I can't say that last one smoothly; there are just too many words that are hard to say, but there are a lot of options for dimensionality reduction. Hopefully you can remember these things: classification, regression, clustering, association, dimensionality reduction. A quick sketch of the most common one, PCA, follows.
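A minimal sketch, assuming scikit-learn's PCA and random data standing in for a big dataset; the point is just to see 20 features squashed down to 2 while keeping as much variance as possible.

```python
import numpy as np
from sklearn.decomposition import PCA

# hypothetical high-dimensional data: 100 rows, 20 features
X = np.random.default_rng(1).normal(size=(100, 20))

pca = PCA(n_components=2)              # reduce to 2 components, e.g. for visualization
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                 # (100, 2)
print(pca.explained_variance_ratio_)   # how much variance each component retains
```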
Okay, let's take a look at neural networks and deep learning, first defining what neural networks are. These are often described as mimicking the brain: you have a neuron, or node, that represents an algorithm; data is input into the neuron, and based on the output, the data is passed to one of the many connected neurons. The connections between neurons are weighted, and the network is organized into layers: there's an input layer, one or more hidden layers, and an output layer. You could technically have one hidden layer, but often you have multiple; if you have three or more, we're talking about deep learning, and if you have fewer than three, that's just a neural network. Look at the visual for a moment: each node, or neuron, has its own algorithm for how it processes the data, and I'm pretty certain that in most neural networks the algorithm is the same across the nodes, but we'll talk about that as we dig deeper into the neurons themselves. Then there's the concept of a feed-forward neural network, which is initialized as FNN (I don't know why it's not FFNN, but whatever): these are neural networks where connections between nodes do not form a cycle, meaning data always moves forward. We don't have the signal going back and this way and that way; it goes one direction, which is forward. Then you have backpropagation: after everything has run through the network, it moves backwards through the neural network and adjusts the weights to improve the outcome on the next iteration. So after a run, it goes back and updates all the weights, and that is backpropagation.
This is how a neural network learns; it has to do backpropagation. Then we have a loss function: a function that compares the ground truth to the prediction to determine the error rate, in other words, how badly the network performed. Ground truth is data that is labeled, that you know to be correct. Now, about those per-neuron algorithms mentioned above: these are what we call activation functions. An activation function is an algorithm applied to a hidden layer node (one of these right here, let me get my pen out) that affects the connected output, and an example would be ReLU; we'll look at activation functions more when we look at neurons up close. There's also the concept of dense and sparse: when a network layer increases its number of nodes, we call it more dense, and when a layer decreases its number of nodes, we call it sparse. And for deep learning algorithms, we have supervised and unsupervised, just like with classical machine learning. On the supervised side you'll see things like FNNs, RNNs, and CNNs, where you're passing in labeled data for it to work; for unsupervised learning we have things like DBNs, autoencoders, RBMs, and SOMs. It's not important to remember these right now; I just wanted you to know that deep learning has supervised and unsupervised learning too. To make the forward pass, loss, and backpropagation concrete, here's a tiny sketch.
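This is a minimal PyTorch sketch of my own, assuming random stand-in data: a tiny feed-forward network with a ReLU activation, a loss function comparing predictions to ground truth, and backpropagation adjusting the weights each iteration.

```python
import torch
import torch.nn as nn

# a minimal feed-forward network: input -> hidden layer (ReLU activation) -> output
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()                                   # loss: ground truth vs. prediction
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(16, 4)   # hypothetical inputs
y = torch.randn(16, 1)   # hypothetical ground-truth labels

for epoch in range(5):
    pred = model(X)              # forward pass: data moves forward through the layers
    loss = loss_fn(pred, y)      # how badly the network performed
    optimizer.zero_grad()
    loss.backward()              # backpropagation: move backwards, compute weight gradients
    optimizer.step()             # adjust the weights to improve the next iteration
    print(epoch, loss.item())
```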
Okay. Hey, this is Andrew Brown, and we are taking a look at BERT. BERT stands for Bidirectional Encoder Representations from Transformers, and this was a model released in 2018 by Google researchers; you can find the models on Hugging Face under the Google BERT organization. The easiest way I like to understand BERT is this: there's that Transformer architecture we saw in Attention Is All You Need, and if you were to take just the encoder and stack it on itself multiple times, you would get BERT; at least, that's how I understand it. BERT is bidirectional, meaning it can read text both left to right and right to left to understand the context of text, so it really does understand the context of words and where they sit in sentences. BERT is pre-trained on the following tasks. The first is masked language modeling (MLM): they say it masks tokens in the sentence input, but I would describe it as filling in the blanks for sentences; you provide a sentence with blanks and it fills them in. The other part is next sentence prediction (NSP): you provide two sentences, A and B, and the idea is that BERT predicts whether B would follow A. By being trained on these two things, it becomes contextually aware of sentences and very good at natural language processing. Here's what masked language modeling looks like in practice.
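A quick hedged sketch of the fill-in-the-blanks idea, using the Hugging Face pipeline helper with the bert-base-uncased checkpoint; the exact scores you get back will vary by library version.

```python
from transformers import pipeline

# masked language modeling: BERT fills in the blank marked by [MASK]
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```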
Now, the model in this pre-trained state isn't something you'd use directly for much; you need to train it further for a specific task, and that could be named entity recognition, question answering, sentence pair tasks, summarization, feature extraction (embeddings), and more. BERT comes in many different sizes: you have base, which is around 110 million parameters; large, which is around 340 million parameters; tiny, which is around 4 million parameters; and roughly 24 other additional models. If you go to the Wikipedia page, it tells you what it was trained on: the BookCorpus (on the order of 800 million words) plus English Wikipedia, so this is primarily trained on English, though I'm sure there are multilingual variants. If we go over to Hugging Face, there are many BERT variants that try to be better or solve other problems, which is fine, and while BERT is an older model, it is still used and is a ubiquitous baseline in natural language processing; when I say that, I mean people use it as a baseline to compare other things against, because it's just so good that it's not going away anytime soon.
Here's an example of using BERT; let me get my pen tool out so we can get a little closer. We're using Transformers, which is a Hugging Face library, and we are loading the sentiment-analysis pipeline. What that's going to do is download a model fine-tuned specifically for sentiment analysis (in current versions, the default checkpoint is actually a DistilBERT variant fine-tuned on a sentiment dataset). I said earlier that in its pre-trained state you wouldn't use BERT for much and you'd need to train it further for these tasks, and that's exactly what's happening: it doesn't look like it, but the pipeline is pulling a model very specifically trained for this task. So here we can take these two sentences, run sentiment analysis on them, and get scores back.
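Here's a minimal version of that example, assuming the transformers library is installed; letting the pipeline pick its default sentiment model keeps it short, and you could pass model=... to pin a specific BERT-family checkpoint instead.

```python
from transformers import pipeline

# sentiment analysis with a pre-fine-tuned model from the Hugging Face Hub
classifier = pipeline("sentiment-analysis")
results = classifier(["I love this course!", "This section is confusing."])
for r in results:
    print(r["label"], round(r["score"], 3))
```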
Hopefully that makes sense; you're going to see BERT again and again and again, so it's worth your time to learn it. All right, in this video I want to see if we can get some hands-on experience with BERT, the Bidirectional Encoder Representations from Transformers model by the researchers at Google. I think we can use it to create embeddings, and that is one use case for it. I don't know if the Hugging Face Bert mascot actually has anything to do with the model, but it's kind of fun seeing Bert there. Since this is a Google researcher thing, I think we might want to try utilizing Google Colab; I wouldn't mind also popping into Vertex AI and seeing where we could run it there, but let's try Google Colab first. I'm not experienced with BERT at all, but I'm sure we can stumble through it together and figure it out. I'm going to make a new notebook, and I'll just call this bert-embeddings-simple, because all I want to do is figure out a simple way to do that. Now, there's an option here to start generating code with Gemini, so I'm going to go ahead and choose that and say: I want to implement a simple embeddings example using BERT. Let's see if it can do that for us; I've never used the Gemini feature in here, but that was really fast.
Looking through this, I've seen code like this around the internet, and it looks pretty much exactly like other tutorials; even off-screen here I have one from Chris McCormick, and if I go down, you'll notice you load the model the same way, from Hugging Face, that's where it's coming from. So yeah, this looks okay. If you're wondering where this comes from, this is Hugging Face; you're going to see it a ton of times, but I'll show it again. I'm going to search for the Transformers library on Hugging Face, and this is pretty much what we're looking at when we're utilizing it. I think these are models so commonly used that they're easily accessible, just kind of plugged into the library. Sorry, I was just grabbing the link to put in the slides for folks later. What's interesting is they have a BERT Japanese, and since I'm doing Japanese language learning, I'm kind of interested in that one; it looks similar, it might be a variant of BERT, and for embeddings you might try it for fun. But let's open this first one and see if it has a simple example of utilizing it. So here's bert-base-uncased, which is probably the most basic one we can utilize; no direct example at the top... oh, down here we have one, kind of.
I think this might be fine, so let's take a look. We have BertTokenizer and BertModel from Hugging Face, and we're loading the pre-trained model called bert-base-uncased. I keep saying this: any time there are models on Hugging Face, you can look them up, so if we want to learn more about this one, we can read about it: a model pre-trained on the English language using a masked language modeling (MLM) objective. It looks like there are some variant models we could swap in here, and there's some example code as well, nothing crazy. I was also curious about deployment and whether there was a one-click to open it in Google Colab, but there's not; that's totally fine. Back over in the notebook: we have the tokenizer, so it's loading the pre-trained tokenizer (the vocabulary), and then loading the model weights. Then we have the text we want to tokenize, using our tokenizer, and then it computes token embeddings. I'm not exactly sure about every detail, but I'd say this is fine, so we'll accept it. I wanted to see how it prints things out; it has the sentences and embeddings down below, so that's totally fine. I wasn't 100% certain, but let's run it and give it a moment. Normally I'd break this into separate cells, but since we have this big blob, I just want to see what results we get, so I'll wait for it to finish. All right, it has completed, and we're getting back all the embeddings; you can see there's a lot of stuff in here.
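For reference, here's a minimal version of what that generated notebook is doing, assuming the bert-base-uncased checkpoint; mean-pooling the last hidden state is one common (though not the only) way to collapse token embeddings into one vector per sentence.

```python
import torch
from transformers import BertTokenizer, BertModel

# load the pre-trained tokenizer (vocabulary) and the model weights
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

sentences = ["The cat sat on the mat.", "Transformers changed NLP."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# mean-pool the token embeddings to get one 768-dimensional vector per sentence
embeddings = outputs.last_hidden_state.mean(dim=1)
print(embeddings.shape)   # torch.Size([2, 768])
```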
I'm curious how we would run this on Google Cloud, specifically in Vertex AI, if they have a similar environment; I mean, obviously Google Colab is fine, but I'm trying to get exposure to as many toolings as we can. Actually, the BERT Japanese model is interesting to me, so I might try running that separately as well. I'm going to grab that code, and by the way, I'm grabbing the link and putting it in my notes so we can reference it later. This one uses AutoModel and AutoTokenizer, so it's a little bit different; I'm not sure why, maybe because it's a more generic loading path it doesn't use the BERT-specific classes and uses the Auto classes instead, but I couldn't say for certain. I'm going to add a new block; we should already have torch imported above, and if we go back to the top (let me clear out this annoying box, there we go), yes, we already have that imported. So I'll bring this in, grab these lines, bring them down, and then bring in our Japanese sentence. What does it say? That looks like neko to me, which is cat; that's the only character I recognize, so let's find out. "I am a cat": wagahai wa neko de aru. That's a fancy, literary way of saying it; I'd say it a much simpler way, but my Japanese isn't that advanced. We'll grab the next line, then the next, and then we'll grab its outputs. Examples of using the model with character tokenization... well, hold on, hold on.
This is an example of using the model with MeCab and WordPiece tokenization. Okay, up to this point that's fine; I don't think I care about the rest. Let's see if we can run this. I'm not saying I understand all of it; I'm saying you can see how easy it is to grab the code and get values out, but we really need to consult an expert, and I might pull one in later to get better context rather than just poking at these things. Okay, so we have a model error: it says to install fugashi to use this. I'll open that in a new tab, and we'll go ahead and install it; I'll add a line that says %pip install fugashi. And I'm trying to remember: MeCab is, I think, a dictionary-based morphological analyzer; I used it before when preparing for the free GenAI boot camp. If you give it Japanese text, it can break down the word forms based on a dictionary, and I believe fugashi sits on top of that MeCab dictionary; we don't see MeCab here directly, it's somewhere underneath as a library. It's saying a dictionary package isn't installed either, so we'll get that as well; I'll just add it on the end, and eventually we'll get everything installed. We might have to restart the runtime, sometimes you have to do that, but we'll try this. Working with anything around LLMs, you're constantly having to fiddle with libraries, so it's totally not unusual for this to happen.
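For reference, a minimal sketch of the Japanese BERT flow, assuming the cl-tohoku/bert-base-japanese checkpoint (the community Japanese BERT I believe this section is using) and that fugashi plus a MeCab dictionary package such as ipadic have been pip-installed first.

```python
# %pip install transformers fugashi ipadic   # fugashi wraps MeCab; ipadic is a MeCab dictionary
import torch
from transformers import AutoTokenizer, AutoModel

# assumption: the cl-tohoku Japanese BERT checkpoint with MeCab + WordPiece tokenization
tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
model = AutoModel.from_pretrained("cl-tohoku/bert-base-japanese")

inputs = tokenizer("吾輩は猫である。", return_tensors="pt")   # "I am a cat."
with torch.no_grad():
    outputs = model(**inputs)

print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))  # note [CLS] and [SEP]
print(outputs.last_hidden_state.shape)
```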
You could be writing this later on and getting totally different errors, and everyone's like, Andrew, why doesn't your code just work? The space is just like that. So, that library looks like it's now installed; let's run the next cell, and there we go, we got some output. It looks like we have some special tokens marking the start and the separator (those are indicator tokens like [CLS] and [SEP]), and then I guess it's identifying the segments of the sentence. Again, I don't fully understand all of it, but the point is we're able to get a run. I'd like to do one more thing: let's try to do this on Google Cloud. Here on Hugging Face they have options like Deploy, and I rarely ever touch this, but I'm kind of curious what happens if we deploy with Google Cloud. Now, I have a Google Cloud account; you'd obviously have to make one.
We're in Vertex AI and it's opened up Model Garden, which I'm actually surprised it's in; it says open models on Hugging Face. Now, if you do not have any room for spend, do not do this; I'm just doing this so we can explore together and see what you'd get if you pressed that button. Please don't follow along here, because this might cost something and I want you to stay in the free tier. It wants to deploy to Vertex AI, the fully managed AI platform; if you've never heard of Vertex AI, think of it as similar to SageMaker Studio or Azure ML Studio. Down below, we're bringing in the model; we have our endpoint, a region to deploy to, and the machine specs, where nothing is displayed. It says your endpoint will be deployed with the following settings, so it doesn't look like we get to choose, but it's going to use a c3-highcpu machine with accelerator type unspecified, and then we'd get an endpoint we would hit; I'm not exactly sure how I would end up utilizing it. I have to select something, but I have no options here, and maybe the reason is that I don't have access to this machine type. So I'm going to search for what this is in Google, because I want to know its cost. This is the C3 series: fourth-generation Intel Xeon Scalable processors, with Titanium (custom-built security microcontrollers, which doesn't seem like something I'd be denied access over). That's not GPUs; it seems like this runs on CPUs, and that's what we're always looking for, what are we running this on. So I'm pretty sure this is running on CPUs; the question is, why don't I have access to launch one? Maybe I'll go ask Google Gemini.
I think Gemini still has a free tier, so let's go find out why I can't launch this. I'm asking: I can't seem to launch this; how do I get quota within Google Cloud? Because maybe I have to put in a service limit request. Yeah, quota limits: it says navigate to the quota page. Oh, it's just quotas; you'd think I'd know, but honestly I don't change quotas very often. Oh, and by the way, this launched me into an existing project; you may have to create a project before doing any of this. I'm going to go here first and see if we can find quotas; you've got to spell it right if you want it to show up. Okay, we might not be able to finish this today, because if I don't have access and it takes too long, I might not be able to complete it, but that's totally fine. In here it's saying: go to the quotas page, identify the limiting quota, look for the quota that's preventing you from launching the instance. So here we go; we have some things here. I wish they had a simple search... oh, they do, maybe. Let's filter by Vertex AI; we do have some options. What I'm looking for is the region we're in right now,
which is Iowa, I believe. Yeah, Iowa. So somewhere here we'll have Iowa; actually, what region is Iowa? us-central1. Okay, so we're looking for us-central1... that would be this one here; it says five. Let's click into this one. To use this API you may need credentials... I mean, I don't want credentials per se, I just want to be able to launch this. I feel like creating credentials wouldn't necessarily make it work, but let's press it anyway. It says the API is enabled, so that's not my problem. Okay, let's look again: look for the quota that's preventing you from launching instances, such as vCPUs; compare the current usage; use the request quota increase button on the quota page; provide the details. Yeah, I'm not sure. Let's back out for a second, back one more time, and back one more time. What I might want to try is just typing in C3, because that's the type of compute we're looking for, C3 CPUs. It says unlimited, and a value of 8, so we definitely have access to these; why there are two entries, I don't know, but let me open this one and then that one. I just want to see if something's not enabled... no, this is enabled as well, and this is enabled as well. Oh my goodness. Let me just go figure this out;
I'll be back in just a moment. Okay, you know, another thought I have is: do I have to create a machine spec before this, and maybe there's no item listed because I didn't create one? That's what I'm thinking now, so I'm going to open a new tab, because I don't think we have a service limit issue here; maybe we just have to create a machine spec, though I'm not exactly sure where we would create that. Let's go over... oh, I don't want to launch Flux; let's go into something like Llama 3.1, since these are ones we can easily deploy. And see, now we're seeing machine specs, and we didn't specify machine specs. I'm not sure why it says the quota limit for the NVIDIA accelerator in that region is zero usage; maybe it's just where we're launching it. Let's see if we switch to another region... all regions, nothing; Toronto, nothing. This is really odd. Okay, what I'm going to do is ignore this for a second and go back to the Model Garden, and let's see if we can find BERT this way. I can just type in Bert, because we're looking for generic BERT. Fine-tune and deploy BERT... I don't want to fine-tune it, I just want to use it. A neural-network-based technique for natural language processing... again, I don't want to train it per se, but we'll click into this one, and again, it's trying to fine-tune. Maybe that's what it wants to do when it launches. What if we click fine-tune, what do we get? I just want to see if we get that window... no, this is not what we want at all, because we just want to launch a pre-made model, that's all; we don't want to create a pipeline here today. That's not exactly what I was hoping for; clearly there is some way to deploy it, but it's not showing up here today. Let me just try this one more time; worst case,
we could just go somewhere else, like Amazon SageMaker; I just really wanted to do it here. Yeah, we just don't have any machine specs. Odd, odd, odd. Anyway, I'm going to check service quotas one more time, but if I can't figure it out, we'll just roll back on this. Okay, one thing I wanted to do was a sanity check: I made my way over to Google Compute Engine, and I can see C3 right here, and I can select it, and there doesn't seem to be any restriction on me launching it. Remember I said earlier it can get expensive; well, twenty cents for me is not that bad. I just don't understand why I can't select it over there, but that's totally fine; I don't think we're going to be able to launch it there today, so that one is a bust. That doesn't mean we won't use Vertex later on, but maybe we can deploy this over to SageMaker. Now, I know SageMaker really, really well, and Azure ML I'm all right with. So here's another example: SageMaker sometimes has JumpStart models that are already pre-made, but it doesn't look like it has much here, just some code. It looks similar, but notice that it's bringing in boto3 and sagemaker, so it has a SageMaker variant of the Hugging Face deployment that we can utilize. I mean, this is fine, but this is not doing embeddings per se. I'm kind of curious what happens if I click this.
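For context, the SageMaker snippet on those model pages looks roughly like the sketch below; this is my own hedged reconstruction, only runnable inside an AWS account with SageMaker permissions, and the framework version numbers are assumptions you'd adjust to whatever the page currently lists.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()   # assumes you're in a SageMaker notebook/role

# point the container at a Hugging Face Hub model and task
hub = {"HF_MODEL_ID": "bert-base-uncased", "HF_TASK": "feature-extraction"}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",   # assumption: pick versions the page lists
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(initial_instance_count=1,
                                     instance_type="ml.m5.xlarge")
print(predictor.predict({"inputs": "Hello from BERT"}))
```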
Oh, so this is the AWS Inferentia and Trainium option; it's just slightly different code for running it on AWS's AI accelerator chips. And there's no JumpStart entry; I'm not sure why they mention CloudFormation, I guess they just mean they'll have a CloudFormation example. So that's kind of a bust too. Let's take a look at Azure ML; here they say create a workspace and discover and deploy models from there, and yeah, their examples aren't any better. Sometimes this stuff is good, sometimes it's not, and that's just what it is. Now, we could still run this on any of those platforms, but I just wanted a one-click solution, so I think we're done here for BERT. This exploration is fine, though; you can see how sometimes these conveniences look like conveniences but they're not. Anyway, I will see you in the next one, ciao. ... You hear me? Yeah, I hear you. Good. Hey everyone, it's Andrew Brown, and welcome to another video.
I have Rola with me, who is definitely an expert compared to me; as you just heard me talking about BERT, I know nothing. Let's see if we can find out how BERT really works. I have my slides pulled up, and Rola is going to help us enrich the information. When I first learned about BERT, it was in a machine learning certification, and I didn't think much of it because I thought it was old and maybe folks don't use it anymore. But I was at an event with Rola, the AWS Montreal user group, and I heard her say there are use cases where you can still use BERT, or should use BERT, where LLMs might be too much work, so I wanted to dig a little more into that. I understand BERT to be this Bidirectional Encoder Representations from Transformers, researched by Google, and I saw a video showing the Transformer architecture where they just split it in half, took the first part of the architecture, stacked it, and called that BERT. So is that true? Is it just cut in half and then stacked? Rola: Well, Transformers have two pieces, right? They've got the encoder piece, which encodes data: it takes natural language and encodes it into a mathematical representation. And then you've got a decoder: if you've got a mathematical representation, you can pass it through a decoder and get
natural language back. The first generation of LLMs, I think, came as an encoder-decoder couple, so you would put in natural language, it goes through this mathematical system, and you would still get language out of it. Exactly that: there's the encoder-decoder pairing. GPTs, for example, are decoder-only, but BERT itself is an encoder-only model. What that means is you have to think of these things in terms of what is the input and what is the output: an encoder-only model, in our case a language model, takes in natural language and comes out with a mathematical representation. And what BERT is, then, is that plus a few extra layers; these things are neural nets at the heart of them, so you add a few layers and you can come up with different representations. What BERT is really good at is things like sentiment analysis or classification, so think of it as: the input is text, a lot of text, and the output is less text, either a word, a sentiment, or a classification, something like that. Does that make sense? Andrew: Yeah, and one thing I think I got from that as well: let's go back to the architectural diagram, because I just want to confirm this is what I understood. I'm trying to get one of these images a little larger here; it's being a little
difficult today, I'm not sure why. I want to go back to that Transformer architecture and make it a bit larger if I can, so give me a moment to grab the link; I'm just copying the link address. Here is the Transformer architecture, and I said BERT is just this left side stacked onto itself, and that's what we get for BERT; but I think you suggested that GPT is the right side stacked upon itself. If you just took the right side, would that be considered GPT? Rola: Yeah, GPTs are decoder-only models; in a nutshell, yes. They're not encoder-decoder, they're decoder-only models. Andrew: Okay, interesting. So let's go to the next part on BERT: BERT is bidirectional, meaning it can read text both from left to right and right to left to understand the context of text. Rola: Sorry, that's not exactly what it means; the
naming is a misnomer. It should really be named non-directional rather than bidirectional. The idea is this: the sequence of words often matters, right? If you say A killed B, or B killed A, those are very different understandings of what happened. Traditionally, the way this worked is we had RNNs, and different variations of recurrent neural networks, and they handled it through sequential understanding. The Transformer architecture changes that, and attention changes that, in that sequence is understood differently, not sequentially, and this is where the non-directionality comes in, I think. Andrew: Okay, that makes sense. So it takes in all of the data at once; it doesn't read left to right, nor right to left, it just takes in all the data, but internally it works via a mechanism where it's not reading through a sequence; it's really the attention mechanism that lets it highlight certain words in association with others. Okay, so, have you ever seen
the movie um um uh it's a comic book movie um have you heard of Dr Manhattan have you ever SE heard of Dr Manhattan so Dr Manhattan is uh uh he's all knowing not like he becomes um uh I don't know he becomes like kind of like a God and he and he's aware of everything all at once at all time so would we interpret its understanding as it contextually is aware of all the words in all directions would that be a better description or um I think the w i it is aware aware of
all of the words but it is able to give certain words more importance based on what it understands of language so there's a combination of things where one language uh where one word kind of enhances another um and the it starts to pick up these nuances of word connections based on its own training so it is aware of all of the words it just understands which words it really needs to focus on right all of the articles for example doesn't necessarily care for um does that make sense yeah yeah it's contextually aware okay so aware
extion see instead of think instead of thinking of like it's reading left to right just understand that it's contextually aware of each word and it's important in a sentence okay and the movie I was thinking of is called Watchman if anybody cares I don't know why I I just uh I had to I had to remember what the movie was um so Bert is a pre-trained model on the following task so two things that it says it does is mass language modeling so MLL and the other one is next sentence prediction NSP so you know
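To make "contextually aware of every word at once" a little more concrete, here is a minimal sketch, not BERT itself, of the scaled dot-product attention idea being described; the tiny vectors and the helper function are made up purely for illustration (a real model would also use learned query, key, and value projections):

```python
# A minimal sketch (not BERT itself) of scaled dot-product self-attention,
# just to illustrate "contextually aware of every word at once".
# The tiny 4-dimensional vectors here are made up for illustration.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # X: one row per token; in a real model X would be learned embeddings,
    # with separate learned query/key/value projections.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)        # every token scored against every other
    weights = softmax(scores, axis=-1)   # attention weights, rows sum to 1
    return weights @ X, weights          # each output mixes all tokens at once

X = np.random.default_rng(0).normal(size=(3, 4))  # three toy "tokens"
out, weights = self_attention(X)
print(weights)  # no left-to-right order: every token attends to all tokens
```

Every row of `weights` mixes information from every token simultaneously, which is the non-directionality being described.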
So BERT is a model pre-trained on the following tasks: masked language modeling (MLM), and the other one is next sentence prediction (NSP). So my thought is, if you train the model on these two things, does that mean in its pre-trained state that's all it can do at this point? Like, it can infer: if you were to do these two tasks, it would be able to do them in this state? Because I was trying to figure out what you can do with this model. I assumed you couldn't do anything with it unless you train it again, but can you do anything with it in the pre-trained state?
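As a concrete illustration of one of those two pre-training tasks, masked language modeling, here is a minimal sketch of querying a pre-trained BERT as-is through the Hugging Face fill-mask pipeline; the example sentence is made up:

```python
# A minimal sketch of using pre-trained BERT, as-is, for its masked language
# modeling objective via the Hugging Face fill-mask pipeline.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# BERT's mask token is [MASK]; the model predicts the most likely fillers.
for candidate in fill("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```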
I think there's a lot. So there's something called emergent tasks in LLMs, where you train it for a specific thing, but it turns out it does this other thing really well, and we call that an emergent task. And we're learning that the bigger the model size, the more emergent tasks it has. So the fact that it is pre-trained on these things means we know it can do these things, because it has been trained on them. But there have been a lot of emergent tasks, and I think a lot of people were surprised at how much, for example, GPTs can do in terms of solving different problems, and that's because of the size. Now, BERT is a model that is about 340 million parameters, so it is a couple of orders of magnitude smaller than our bigger models. You have to remember this is one of the first models in the Transformer architecture: the architecture came out in 2017, and this came out in 2018. So it is one of the first models, and it is one of the smaller models at 340 million; the GPTs right now are about 1.2 trillion, so a couple of orders of magnitude bigger, and I'm sure they can do more. I have to remember use cases I have seen; we have projects where we are running sentiment analysis and classification with BERT, but I'm sure there are other things it can do, I just haven't played with it enough. Okay, so that word emergent, did you say emergent models or emergent tasks? Emergent tasks, yeah, things it can do. So again, to ground it in a real-world example, and I'm going to keep doing this: it's like how Ozempic was supposed to be for something else, and now it's used for weight loss. Exactly, you discovered that it can do this. Right, so was that the case with BERT? Like, did they not expect it to, or was it that they made the Transformer architecture and then
they said, okay, we're going to just take the first half and see what we can do with it, and then they were like, oh, I guess it can do this? For that I'd have to go back through the literature. Sure. I think they were just experimenting with different models and things like that; I don't know whether this is the one for sequence-to-sequence, I'd have to go back and see. Well, if you want, I have a pause feature; I can pause. Yeah, let's pause and see. Okay, we're back from pause land here, and we did a bit more research. I don't know if we remember exactly what we asked, because we had a really good sidebar conversation, but I think we were talking about the discovery of BERT and what came before it, like, did GPT or BERT come first? So it depends on the references; we looked at two different ones. Like we said, the Transformer paper, which is the "Attention Is All You Need" paper, came out in 2017, and it was a collaboration involving Google but also the University of Toronto; I should say that as a Canadian, people forget that it's the University of Toronto. And then in 2018, BERT came out as an encoder-only model, and the GPTs came out as well, as decoder-only models. Based on my reference, the book I'm reading, BERT came first, but when we looked on the internet there were references that said GPT-1 came first. So I would say they came out in the same year; they're among the first two that came out. Well, that kind of makes sense, because if they had this Transformer architecture and then split it in half and played around with it, I could see them doing that. I can't imagine that they made them in
isolation and attached them together, or maybe they did, I'm not sure. All we know is that all these things happened close together, and we are where we are now. Anyway, we'll continue on here. So BERT can be fine-tuned to perform the following tasks. Now, I only knew BERT for one thing, and that's to use it for embeddings. Actually, I shouldn't say that, because I did run BERT a few times in this GenAI Essentials course for different fine-tuning, but I did not realize how broadly it could be applied. So we have named entity recognition, question answering, sentence pair tasks, summarization, feature extraction, embeddings, and more; I actually have another example, I can't remember what it is, but it's on the next slide. So there are all these things you can do with fine-tuning. What do you see folks using BERT for the most in the industry today? Like, what would people gravitate toward BERT for? So I've seen it work with classification: you give it a piece of text, and then you classify that as a particular label you're interested in. And that makes sense for all of these things if you think about what they are; named entity recognition, and I see sentiment analysis as well, which is also a label in a sense; they're all the same type of work. So encoder-only models are really good at analyzing text to reach a conclusion, to do some sort of task that reaches a conclusion, not at generating in a sequence-to-sequence system. So yeah, this is all in line, and I think the most I've seen is classification and sentiment analysis. Okay, so classification and sentiment analysis; that's good, because those, or rather embeddings plus the other two you mentioned, were the ones I was utilizing it for. I'm surprised it can do question answering, but, oh yes, I think even if it does question answering, it's fairly short; the output context window would be fairly limited. Well, this wouldn't be something people would use on their website, say, as a customer service bot; it would be too incapable of that. Yeah, I don't think that's what it's made for; the context would be fairly short. So there are multiple sizes of BERT: we have 110 million parameters (BERT base), 340 million parameters (BERT large), and 4 million parameters (BERT tiny), and they said there were something like 24 other varying models. I'm assuming bert-base-uncased is the most used model;
that's the one I keep seeing the most. I'm not sure exactly why, but obviously I would imagine that you're going to get different levels of performance based on the number of parameters being used. Is there any use case for such a small model as BERT tiny at 4 million, other than learning? Well, I think you could potentially put it on edge devices or something like that; it depends again on what you want to do with it and how well it does, so you'd have to test. It's really hard to know what a model will do in any particular situation. What we do know, though, is, okay, sorry, what we do know is that there's a really good paper that came out called the Chinchilla paper, I don't know if you've heard of that one. Okay, so the Chinchilla paper looks at optimizing model sizes versus the data versus the compute. Let me pull it up, actually; let me get you the name of that. So it really looked at the relationship between the model size, the dataset size, and the compute resources required, and what it came out with is the realization that, obviously, the bigger the model, the more information it can understand and the more things it can do, but also the bigger the model, the more data it needs. So it came up with the idea that many models seem to be overparameterized and undertrained. So can we
do a lot with 340 million? I think we can; it depends on what data it's seen and what data you give it, so yes, you can, if you fine-tune it specifically. So sorry, the one key thing I heard that sounded very important was that there are models that are overparameterized but undertrained, meaning they have a lot of connections between their nodes, but the number of training passes, or the amount of data going through, is not enough. Yeah, so there's a lot of space left in that model to learn; it's like having a big brain but not actually filling it with much, in a sense. So for the size of the model, we have to ask: is 340 million enough? Well, it depends on what data it's seen, right? These things tend to be very difficult to judge like that. It's a combination, and the bigger the model is, the more data it should see for it to do a decent job. And so the Chinchilla paper came out with the conclusion that a lot of models seem to be overparameterized and undertrained; they would do better if they saw more data. And just as a side note, I think they came up with the number that the dataset size, in tokens, should be about 20 times the number of parameters. So for a 340 million parameter model, you'd multiply that by 20 to get the ideal dataset size for that model.
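As a back-of-the-envelope illustration of that roughly 20-tokens-per-parameter heuristic (a rule of thumb attributed above to the Chinchilla paper, not a guarantee):

```python
# Back-of-the-envelope math for the ~20-tokens-per-parameter heuristic
# mentioned above (a rough rule of thumb, not gospel).
params = 340_000_000            # a BERT-large-scale model, ~340M parameters
tokens_per_param = 20           # the heuristic mentioned above

ideal_tokens = params * tokens_per_param
print(f"{ideal_tokens:,} training tokens")  # 6,800,000,000 -> roughly 6.8B tokens
```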
Okay, so I guess my other thought is: if you have a model that has a lot of parameters but you don't do many passes of training and your dataset isn't that large, is the size of the model going to be smaller because the training was less intense? Or is the parameter count going to basically determine the size of the model? The parameter count, which is in a sense just how many parameters this model has, is constant. So even before a model is trained at all, it still has the exact same number of parameters. What ends up happening in the training world is that those, let's say, 340 million parameters are initialized to random weights, and then as it sees the data, it makes a prediction, and then it goes back; we say it optimizes an objective function: it estimates how far off it is and then adjusts those parameters accordingly, and then it loops over and over. It makes a prediction, calculates its errors, readjusts, and does that over and over until it minimizes its error margin. So at the beginning of training you have 340 million parameters that are just random, and at the end of it you have 340 million parameters that have some sort of representation, not of the real world, but of what it trained on. And so that's where the data comes in: how well are these parameters tuned, how good are these parameters at allowing you to complete a task, if that makes sense?
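Here is a minimal sketch of that predict/measure/adjust loop, as toy linear regression in NumPy; every number and variable name is made up for illustration:

```python
# A minimal sketch of the loop described above: random parameters, predict,
# measure the error, adjust, repeat. Toy linear regression in NumPy.
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(scale=0.1, size=100)  # "truth": w=3, b=1

w, b = rng.normal(), rng.normal()   # parameters start out random
lr = 0.1                            # learning rate (a hyperparameter)

for _ in range(200):
    pred = w * X[:, 0] + b                  # 1. make a prediction
    err = pred - y                          # 2. how far off are we?
    w -= lr * 2 * (err * X[:, 0]).mean()    # 3. adjust parameters against the
    b -= lr * 2 * err.mean()                #    gradient of the squared error

print(w, b)  # should land near 3 and 1
```

Note the parameter count (here just `w` and `b`) never changes; only the values stored in the parameters do.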
Actually, maybe I should share something with you: I did a little representation, on a linear regression and a logistic regression, of how it starts and how it adjusts. You can see it in a notebook; I'll share that with you. It's really nice to watch it train and adjust. That sounds really cool. One thing I was trying to figure out, for folks who might think a larger model in file size means it's more trained or more intelligent: what I'm hearing is that the size of the model is based on the number of parameters, because that's the amount of data being held within the model weights, and all you're doing is adjusting those weights. They're just numbers, right? So it's not going to be bigger or smaller; I mean, there might be some flux to some degree, but it's not like if you did ten times more training it's going to be ten times larger. It's really the parameter count that determines its end size. Exactly; so regardless of where the model is, whether it's good or bad, it'll have the same size; it is a constant size. What really matters is what those numbers, those 340 million parameters, are, because you have to think about it: it's a mathematical model, right, and that's what matters. As for the file size, though, some people tell me, oh, the file size increases. The file size should not increase; the reason it increases is because certain programs attach metadata. When you train, the optimizer sometimes attaches metadata about training state, and that's what makes a bigger file. The model itself should always be constant at whatever size you set it at. Okay, that makes sense. So there are cases where the file gets larger, but it's not for the reasons you might think; it's because, as you're saying, there's additional data being attached as metadata, and that makes total sense.
Just continuing on here with BERT: there's obviously a bunch of variants. I didn't look at any of the variants; I just know I always see the word "distil", so I assume that's a more optimized model. I don't know if it's a pruned model or whatever you call it, but I know they're generally more performant, like two or three times, and they're cutting corners to do that, not in a bad way; I mean, they're figuring out a way to make it more performant. I haven't looked at all of these either, but you have to think about these as, so I think RoBERTa, they all came out later; RoBERTa came out in 2019, a year later. You have to think of this as natural, like the evolution of man: every year there are new models based on new understanding, new concepts, better optimizations, and so this is the natural evolution of it. Yeah, so I understand that; I guess I was just really getting hung up on the word "distilled" because I keep seeing it, and every time I come across a distilled model they describe it as two, three, four times faster and they've done something to it; again, I'm not sure if it's pruning or some other kind of optimization to make the model more efficient. For instance, Whisper has Distil-Whisper, and I just keep seeing the term "distil". So I guess my real question is: when you see these terms like "distil", do you know what that means, or are they just putting that name there? Is there a rhyme and reason to the naming across models, or is there no convention and people can attach whatever they want? Like, sometimes I see models that have "Omni" in front of them; are they just putting "Omni" in front of everything because they think Omni sounds cool? Yeah, I think these are researchers in labs, or people in different companies, and there's no real naming convention, so they can realistically name things the way they want. We just talked about how "bidirectional" is somewhat of a misnomer and should be "non-directional". So the namings, I don't attach too much to them, and they also want them to be really cool when they make an acronym out of it, so sometimes the name is a bit of a misnomer for the sake of the acronym.
But a distilled model: so there are a lot of ways you can optimize a model. Like you said, some of it is pruning; some of it is these teacher models. AWS, actually, at re:Invent, announced that in Bedrock you can now create these smaller student models, and what that allows you to do is take a really big model and create a smaller model that is very optimized for your task. And the idea is, and we talked about this in the previous session we ran, but maybe we can mention some of it here: these models can be really, really big, they're fairly huge, and a huge model means more latency, more resources, more cost. So there's this idea of creating smaller models, distilled models, in various ways: you can prune, you can teach a student model, you can do different things; the idea is to reduce cost, latency, and resources. And there are various ways we can do that; I can potentially put together some research for you on the different ways it's done.
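As one hedged illustration of the teacher-student flavor of distillation just described (only one of the several approaches mentioned, and not necessarily how any particular product does it), here is a sketch of the core idea in NumPy, with made-up logits:

```python
# A minimal sketch of the teacher-student "knowledge distillation" idea:
# the small student is trained to match the big teacher's softened output
# distribution, not just the hard labels. Logits here are made-up numbers.
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T     # T > 1 softens the distribution
    e = np.exp(z - z.max())
    return e / e.sum()

teacher_logits = [4.0, 1.5, 0.2]           # pretend output of the big model
student_logits = [2.5, 1.0, 0.8]           # pretend output of the small model
T = 2.0                                    # distillation "temperature"

p_teacher = softmax(teacher_logits, T)
p_student = softmax(student_logits, T)

# Cross-entropy of the student against the teacher's soft targets; training
# the student to minimize this pushes it to mimic the teacher.
distill_loss = -np.sum(p_teacher * np.log(p_student))
print(distill_loss)
```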
So I think what I'm seeing here is that "distil" could suggest this teacher-to-student knowledge transfer thing, where you have a more intelligent model transferring knowledge to a more cost-effective model, but also a warning that naming conventions are not necessarily set in stone, so someone could put "distil" in front of a model and it could mean something different. And that's just a general warning we might suggest: you have to look at everything and see what it actually does, regardless of its name tag. I think naming conventions are not there yet in terms of consistency across different teams. Yeah, like I might create a model and then decide to call it distillation, but I don't actually know what distillation means. And I came from a chemistry background; I think the word distillation means to take the essence of something, right? When you distill a chemical, you're removing all of the impurities and getting the essence of that chemical. See, I would just make a model smaller somehow and then put "distil" in front of it, and everyone would be like, what are you doing, Andrew? But I'd be like, I don't know, I thought that's what it was. Yeah. So then I put this here: while BERT is an older model, it is still used as a baseline in natural language processing; that's specifically what Wikipedia said. So they were saying it's, I guess, a comparison point for when you're doing NLP experiments, to use it as a baseline. I don't know exactly what they mean by that; I just know that's what they say, so I'm not sure how you'd use it as a baseline, but that's what they say.
I'm not sure. I think the biggest thing in my mind, like we said, is that it is an encoder-only model, so it's great for when you have text that you want to do a very specific task on, where you want to reach a particular conclusion; that's where these encoder-only models come in. And so here I have an example of BERT doing sentiment analysis: we're using the Hugging Face pipeline, we're using bert-base-uncased, and then we have a couple of sentences, and then, I guess what it does here is it downloads the classifier model. I'm not sure how you like to work with models, and I'm not sure if Hugging Face is too high an abstraction for what you want to do day to day, but as far as I understand what's happening here, when you use the Hugging Face pipeline and you say sentiment analysis, even though we're specifying the bert-base-uncased model, it's going to download a version of the bert-base-uncased model that is fine-tuned for sentiment analysis. I think that's what's happening here, as that's what the documentation kind of suggests.
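For reference, a snippet along the lines of the one being described might look roughly like this; it's a sketch, not the exact slide code, and note that in practice it's a checkpoint actually fine-tuned for sentiment (the pipeline's default, for example) that carries meaningful sentiment weights, whereas a raw bert-base-uncased would need fine-tuning first:

```python
# Roughly the kind of snippet being described: sentiment analysis with the
# Hugging Face pipeline. Left to its defaults, the pipeline pulls a checkpoint
# already fine-tuned for sentiment; raw bert-base-uncased would need
# fine-tuning before its sentiment scores mean much.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a fine-tuned model

sentences = [
    "I love how simple this library is.",          # made-up example sentences
    "This documentation is confusing and frustrating.",
]
for result in classifier(sentences):
    print(result["label"], round(result["score"], 3))
```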
But yeah, this is an example of BERT, and that's all I got. I'm not sure if you have any thoughts on the code, or if there's anything to talk about around the code for pre-trained models, or for further training models. Well, going back to the re:Invent keynote from Werner Vogels, the idea of "simplexity", remember, that was a really good talk, and this idea of abstracting things for you. So I think there's a lot of that in all of these: the fact that you can do this with a huge model in a few lines of code is really impressive, but what that also means is that there's a whole lot of complexity under the hood. And the way you need to deal with these is to actually understand how the system works, how the API and all of these libraries work, what this provider deals with. So the way I like to work is to stick to a single provider, or at least a few providers, so that you can understand their way of working and how to look at their documentation, and do all of that. Why? Because, again, there is a huge amount of complexity that is removed and abstracted for you, and you can just call a few lines of code, and while that sounds great, to really understand what is going on you have to peel the lid off a little bit. And so if you get used to particular providers, then you understand how their documentation works, where to find particular information, where to read particular things, if that makes sense. So that word "simplexity": it almost sounds like the two are opposites, but the idea is you have a complex system and they simplify it for you, so you understand these complex tools through a simple interface; you are doing complex things, but things are abstracted away from you.
Exactly, like look at that: the number of things that happen for it to understand that sentence. There's a process of tokenization, it's cutting the text into tokens, and then it's embedding them, and then it's running them through a mathematical system, and then it's bringing them back to a label. So there's a whole lot happening, and you write two lines of code, right? It's pretty impressive. But as the person using it, I should generally understand what's actually happening; love the complexity and appreciate how much is being abstracted away for me, or recognize that there are cases where, yes, this is good in this use case, but you might have to go and do this at a lower level. It's the same thing as with high-level and low-level languages: with lower-level work comes more complexity, but you get more flexibility. Yeah, but that's BERT. And I don't know, do you think they named it BERT because of the Sesame Street character? Like somebody had a kid, and they were like, let's figure out a way to name it BERT? It's very possible. You know, I worked in labs for a long time, and a lot of times we did reverse-engineer the name to fit the acronym, so yeah, very possible. So you're part of the problem, is what you're telling me. I mean, marketing: it has to be easy, it has to roll off the tongue. Well, I appreciate your expert input on BERT here; I definitely understand it a lot better. I hope that people watching, when they hear me talk, realize how much information is missing from what I'm saying; I'm just trying to give them the most practical route, the best way to understand it. Kind of like Chef Ramsay, where sometimes he'll have videos where he talks about the process of something and it's totally wrong, but the outcome of what he does is correct. And that's what we call a practitioner: someone who's good at doing something but doesn't necessarily know exactly how they're doing it, and gives inaccurate explanations. So folks just need to understand that as I'm making content here, and that's why I have you here, to expose the lack of knowledge I have. So I appreciate your time, and we will see Rola in more videos, and we're going to continue on with the course. [Music] Ciao ciao. Let's take a look at Sentence Transformers, also known as SBERT; I imagine it stands for Sentence BERT. It is built on top of BERT, and it creates a single vector for an entire sentence,
and when comparing similar sentences, this is much more performant than simply using BERT, which would look at every single word. These things are turning text into vectors, and if every single word has to be a vector, that adds up very quickly. So Sentence Transformers addresses the limitations of BERT specifically when you are dealing with full sentences, when you care about comparing sentences. If you are working with a workload that is on a token-per-token basis, then maybe BERT is better; so it's not that Sentence Transformers is better, it just depends on your use case. But here's an example of using Sentence Transformers: here we're just importing sentence-transformers, and, I'm trying to get my pen tool out here, but what you'll see is that we are pulling in a very specific pre-trained model. I'm not exactly sure what all-MiniLM-L6-v2 does; if we went onto Hugging Face we could read about it, or the documentation on sbert.net, but the idea is that we have these two sentences, we're going to encode them into embeddings, and then we will be able to compare them. Sentence Transformers is a really useful library because it allows us to do embeddings, semantic search, retrieve and re-rank, clustering, image search, and more. Sometimes you'll come across models that expect you to use Sentence Transformers for the embeddings, so it's good to know when a model is actually using Sentence Transformers or not. Okay, but there you go. [Music] Hey, this is Andrew Brown. I want to get you some hands-on skills and knowledge with Sentence Transformers.
This is something that just kept coming up as I was learning about embeddings, so I figured we should learn the basics of it. I'm not the best at describing all of the use cases for Sentence Transformers; maybe I'll bring in an expert who can help us understand it a bit better, but I do think it's worth our time to make sure we know how to utilize and run it. So let's go ahead and explore. I'm going to need some kind of environment to run this in, and I don't think I need anything too complicated, so I'm just thinking about where we could run this; maybe Google Colab might be something we could use. So we'll go ahead and type in Google Colab; there are a lot of places where we can use notebooks, and today I'm going to use this one. I'm going to go ahead and sign in, so just give me a moment to sign into my account. All right, now I've logged in, so let's go ahead and create ourselves a new notebook; in the left-hand corner I'm going to create a new notebook, and we'll give that a moment. Now, if you have to create an account: I've already created one, so I honestly don't remember the setup, but I'm sure you'll be able to figure it out. And so here we have our coding space, and let's see if we can get this to work. Before we do, let's just rename this to something like "sentence transformers basic". Okay, then I'm going to go back over here and give this a go. The first thing we'll need is sentence-transformers, so go ahead and import that. Now, the thing is, I don't know what this environment already has in it; it might already have some things pre-installed, and one way we could check, because we're trying to import stuff here, is to list what's installed. I wonder, is it pip show or pip list? Let's see if that works; I don't recall what the command is, but we'll just search "show all Python libraries installed". There's definitely a pip show command, but the one we want is pip list. Okay, so we'll go back over here, and right away we're getting a large list. One thing I'm wondering is: do we have sentence-transformers? So I'm going to search for it, and I don't see it in our list. We could also do pip show sentence-transformers and see if we can get the exact version; this might not be pre-installed, and that's totally fine if it's not, but it is, and we have version 3.2.1. So that is interesting. I'm going to go ahead and continue on. I don't know what the latest version is, but sometimes you might want to check this, so I'm going to type in sentence transformers; I'm going to assume this is on GitHub, and it's over here under UKPLab. If we drop down here we can see tags, and we have 3.3.1 as the latest version; this one's on 3.2.1. Sometimes you might want to check what the difference is, so I'll go over to releases here. Okay, so we have some changes: massive CPU speedups with OpenVINO, that's really good. So, you know, the question is, should we use the latest one? I don't know; it really depends on whether this stuff will work on the current version, but I'm going to just give it a go. But notice here it says "load a pre-trained Sentence Transformers model",
so it looks like it's bringing that model in from somewhere. Since this is maintained with Hugging Face, I'm going to assume it's something available on Hugging Face; let's go ahead and type that in, and there it is. So here we have it: this is a Sentence Transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering and semantic search. So it seems like Sentence Transformers has a bunch of models it can load, or variants on its original model. But if we go over here, I'm just curious, can we see more? Yeah, we have embedding models, parallel sentence datasets; so it looks like there are a bunch of embedding model variants under here that we can utilize, but we'll stick with the default one. I don't know enough about any of these, but it is curious: LawGPT sounds interesting, sounds like it has something to do with law data, but so far we are seeing, oh, that is Chinese text, yeah; that's not Japanese, I don't see any Japanese text here. I'm not sure about this one, just kind of curious. But anyway, we're going to bring this one in here, and I'm going to go ahead and hit play. It says "name SentenceTransformer is not defined"; that's fair, because we didn't bring in the import. So we'll go back over here, I'm going to add a new code block, and we'll paste it in. I'll hit run and give it a moment. Okay, and now it should be imported; we'll just wait a moment, I'm not sure why it's struggling so much here, but oh, there we go. And so we'll go ahead and bring in this model. Now, I don't know how large this model is, but I'm pretty sure embedding models aren't as large as a full large language model, so I think whatever compute we have here will be fine. If we go up to the top we can see it says Python 3, Google Compute Engine back end; does it tell us anything else? I don't know, but it's always good to know what kind of compute you have, and I always forget what Colab has. So, Colab, Google compute, free, what is it, including GPUs and TPUs, but what exactly does it provide? I'm not sure, and here someone asks "what exactly is the compute"; this is two years old, so probably out of date. So I'm not seeing any clear answers as to what it is. I know when we're using SageMaker, or SageMaker Studio Lab, it's very clear what it is underneath. But that's totally fine; I think this will work here in Google Colab, there's no reason why it wouldn't.
So we have some sentences, the sentences we want to encode; I'll bring these up a bit larger, copy this text, bring it down, and paste it in here, and we'll go ahead and hit run. The next thing I want to do is bring in the actual calculations. This encodes the sentences into embeddings, so this will actually give us the embeddings, and shape is us literally seeing the dimensions, so we have 3 and 384. And this one is doing a comparison of embeddings, comparing each one against all the other embeddings: we have three, so we get a grid of nine; it's basically making a matrix of comparisons of them all. Okay, we'll go ahead and run this. Great, so we have our comparison.
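Putting the notebook cells together, the whole example amounts to roughly this; the three sentences mirror the library's own documentation example, and `model.similarity` is available in sentence-transformers v3 and later (the notebook above was on 3.2.1):

```python
# Roughly what the notebook cells above amount to, end to end.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)   # (3, 384): three sentences, 384 dimensions each

# similarity() compares every embedding against every other one,
# giving the 3x3 grid seen above.
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```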
So you might be asking: okay, we did this, but why, Andrew? Well, the idea behind embeddings is that we eventually want to store them in a vector store, so that we can find things that are similar, and this is specifically creating embeddings for sentences, sentences that are comparable. Down below, we are simply calling model.similarity to say which ones are similar, and in a sense this is us doing kind of like the search or fetch against, let's say, a vector store; it's not one, but here we are performing both the embedding and also the comparison to then retrieve things that are similar. So hopefully that is clear enough. There's clearly a lot more we could do here, but I think this is sufficient; I just want you to get some exposure to Sentence Transformers, and you can see that it does a lot of things: they say it can do computing embeddings, semantic textual similarity, semantic search, retrieve and re-rank, a lot of different stuff. And then it talks about the pre-trained models down here, which I didn't really explore earlier, but you can see there are some listed. Hopefully it's very clear: Sentence Transformers is for creating embeddings specifically for sentences. And so this example is done; I'm not sure if it's necessary to save, I'll just save that there, and we are 100% done here; hopefully you learned a little bit about Sentence Transformers. [Music] Okay, let's take a look at what a perceptron is. A perceptron is an algorithm for supervised learning of binary classifiers, conceived in 1943, with the machine built in 1957: the Mark 1 Perceptron, which is the name of the machine.
It was able to do some form of image recognition; what exactly, I don't know, I wasn't able to extrapolate that, but you can see all of the interconnected wiring, kind of like the human brain, where you have these connections and layers. And so this is kind of where the idea of a neural network came from, and the fact that it's so old just shows you that we've been doing ML longer than you might think. But hopefully that lays the groundwork for the word perceptron; we'll take a look now at a perceptron network. [Music] All right, so let's take a look at a basic perceptron network, and you might be saying, why are we so interested in this very old type of network? Well, it's not old: today's neural networks are perceptron networks. It just goes to show you that the concept is not new; it's just that we have now scaled it, we have a lot more compute, and we're not connecting everything by hand. So a basic perceptron has an input and an output layer, each layer contains a number of nodes, and nodes between layers have established connections that are weighted. So here is that example: the number of nodes in the input layer, right over here, I'll get my pen out, is determined by the number of dimensions of the input vector. What does that mean? Well, remember our graph: a vector is taking a dot and putting it somewhere, so if you had a graph, or a vector space, that had an X and a Y, then you have two inputs for the node, X and Y; and it doesn't have to be X and Y, it could be different kinds of values, but that's the point. The input layer is just connection points; nothing that this layer does will modify the data, it's just the starting point. The number of nodes in the output layer is determined by the application of the neural network: if you have a yes/no classification, then you would only have one output node, because you just want to know, is it yes or is it no, is it zero or is it one. So it would not matter if there were a thousand input nodes; if your classification is yes or no, you only need a single output node. The output nodes and other layers can modify and compute new values based on the input data, and data moving between nodes is multiplied by the weights. That is what a weight does: it strengthens or weakens the value it is applied to. The weights will be modified during the training process to produce a better outcome. So hopefully that is clear; the only thing you don't see here is the hidden layers, those additional layers. Anyway, we'll move on now to talking about how the algorithm of the actual neuron works. [Music]
Okay, let's take a look at activation functions. When data arrives at a node that can perform a computation, all arriving input data is summed, and then an activation function is triggered. The idea here is, say you have two nodes with connections to the output node: notice that it's summing, that's the mathematical symbol for a sum, and then we have a mathematical symbol for a function; so it's going to sum the inputs and then trigger the activation function. The activation function acts as a gate between nodes and determines whether output will proceed to the next layer. The activation function will determine if a node is active or inactive based on its own output, which could be in a range like zero to one, or negative one to one. There are all sorts of activation functions you can put in here, and this is not the full list. And depending on what you're watching, because I'm going to have this video in more than one course: if you're in a beginner course we will not show you literally how these activation functions work, but in a more advanced ML one we will, because you'll want to know them there. So just understand that if you don't see exactly what these look like, it doesn't matter right now. Okay, so we have the linear activation function, which can't do backpropagation; it just passes along the data. Then we have nonlinear activation functions, which can do backpropagation and can stack into many layers. Here we have binary step: if greater than a threshold, then activate. We have sigmoid, used in binary classification, susceptible to the vanishing gradient problem; again, if you are doing real ML with me, we will talk about these things, and if you don't see them in this course, it's because I'm trying to make things easy on you. We have tanh, a modified, scaled version of sigmoid, still susceptible to the vanishing gradient problem, which is something we really want to avoid. ReLU, the most commonly used activation function (and we're missing an L on the slide there, nobody tell me), will treat any negative value as a zero. We have leaky ReLU, which counters the dying ReLU problem with a small slope for negative values; parametric ReLU, a type of leaky ReLU where the negative slope becomes a learnable parameter instead of being fixed at something like 0.01; and the exponential linear unit, similar to ReLU, with no dying ReLU problem, though it saturates for large negative numbers. We have Swish, an alternative to ReLU from the Google Brain team; maxout, used in a maxout layer, which chooses the output to be the max of its inputs; and softmax, something you'll see a lot in architectural diagrams: if you look at the Transformer architecture, look for the word softmax, you'll always see it near the outputs; it converts the outputs into probabilities for multiple classifications. So I might cover these in depth or I might not, based on the course, but that is the overview of the activation functions, and there's a quick code sketch of a few of them below. [Music]
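Here are quick NumPy sketches of a few of the functions just listed, purely for reference:

```python
# Quick NumPy sketches of a few common activation functions.
import numpy as np

def linear(x):        return x                                  # identity
def binary_step(x):   return np.where(x > 0, 1.0, 0.0)          # on/off
def sigmoid(x):       return 1.0 / (1.0 + np.exp(-x))           # range 0..1
def tanh(x):          return np.tanh(x)                         # range -1..1
def relu(x):          return np.maximum(0.0, x)                 # zero out negatives
def leaky_relu(x, a=0.01): return np.where(x > 0, x, a * x)     # small neg slope

def softmax(x):
    e = np.exp(x - x.max())           # stable exponentials
    return e / e.sum()                # probabilities summing to 1

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(softmax(x))    # five probabilities that add up to 1
```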
All right, so now we're taking a look at the activation functions one by one, the first being the linear activation function, also known as the identity function. It's a straight line, as you can tell here; the model is not really learning, it does not improve upon the error term, it cannot perform backpropagation, it cannot stack layers, it only ever effectively has one layer. This means your model will behave as if it's linear, so it can no longer handle complex nonlinear data. Its range is unbounded, so it's infinite, and its derivative is a constant of one: what you put in is what you get out. So why would you want to use this? I think it's used for inputs, because if you're just passing something along, that's totally fine, but if you had multiple hidden layers with this, it's not going to be very useful. But there you go. [Music] Let's take a look at the binary step activation function. This function will return either 0 or 1: if the value is zero or less, it returns zero; if the value is greater than zero, it returns one. That's why it's called a binary step function: it's clearly in one place or the other. It can only handle binary classification, so on or off, true or false. It has a range of zero or one, and it is bounded, so it's not infinite. It's one of the earliest-used activation functions, not used much today, but when we were looking at that example of producing yes or no, you could see that this could be the activation function on the output node, because that'd be very clear; but you can see it's very, very simplistic. [Music] Okay, let's take a look at the sigmoid activation function, which is a logistic curve that resembles an S shape, and there it is. It can handle binary and multi-class classification, so think cow, horse, pig, as we are looking at multiple types of classification; we can now stack layers; we have a range between zero and one. It tends to push the activations to either side of the curve, with clear distinctions in prediction, and it's one of the most widely used functions. Near the ends of the function, y responds less and less to x, and this causes the vanishing gradient problem; when we say vanishing gradient, look at this: it just flattens out, the gradient vanishes, and the network refuses to learn further, or learns drastically slowly. So if values are out here, you're going to run into trouble. Sigmoid is also analog, meaning almost all neurons will fire, will be active, so activation will be both dense and slow and costly. Think about that versus binary step: with binary step it's either on or off, because remember, the purpose is that if it's zero it's not going to pass data along, and if it's one it is. Whereas with sigmoid, it could technically be zero, but even down here it's a little bit on, right? It's always on: either really on, or a teeny tiny bit on. So there you go. [Music] All right, I want to admit something that's really embarrassing: when we initially listed out those activation functions, I think I swapped the letters around when pronouncing it; in my mind I knew what it was, but the pronunciation came out wrong, so I do apologize for that. It is tanh, the hyperbolic tangent, and it is the same shape as the sigmoid function but scaled and made larger, so it looks really, really similar. It can handle binary and multi-class classification; because it's analog, just like sigmoid, we can stack layers; we have a range between -1 and 1; the gradient is stronger, so it has a steeper curve. It still has the vanishing gradient problem, like the sigmoid. Choosing between tanh and sigmoid is based on your use case: tanh can help avoid bias in the gradients, and tanh can outperform sigmoid,
so it depends on whether you need it or not. [Music] All right, let's take a look here at ReLU. ReLU stands for rectified linear unit, an activation function where the positive axis is linear and the negative axis is always zero, so it looks like that. And again, remember the point of activation functions is that a node is either on or off, or on to some degree. Here the range is zero to infinity: we have a positive axis that is unbounded. With sigmoid and tanh, almost all the neurons fire, and this leads to things being dense; remember, dense as in adding more information as it goes, as opposed to staying the same or less, which is slow and costly. ReLU, on the other hand, will sparsely trigger activations, because its negative-axis gradient is zero: if something is really low, it's going to be zero, not a teeny tiny bit on. It's less costly and more efficient, so it's a lot faster. The negative axis with a zero gradient has a side effect called the dying ReLU problem: the gradient goes toward zero and gets stuck at zero, because adjustments due to input or error have nothing to adjust against, so the nodes essentially die. [Music] Okay, let's take a look at the leaky ReLU activation function. Leaky rectified linear unit is where the positive axis is linear and the negative axis has a gentle gradient close to zero. Do you notice that each one of these we look at is trying to solve a problem and be better? Hopefully you're seeing that as we go through these activation functions. So it is similar to ReLU, but it reduces the effects of the dying ReLU problem; it's "leaky" because the negative axis leaks, which keeps some nodes from dying. We also have parametric ReLU, a leaky ReLU where the negative slope, typically fixed at around 0.01 in plain leaky ReLU, becomes a learnable parameter; and ReLU6, a ReLU where the positive axis has an upper limit, so it's not infinite; it's bound to a max value of 6. [Music] Let's take a look here at the exponential linear unit, also known as ELU. It has a slope toward negative one on the negative axis, and a linear gradient on the positive axis, so that's what it looks like; kind of like, what was the last one called, I briefly forgot, the one where it was zero in one direction, but anyway, something between ReLU and leaky ReLU. ELU slopes toward the negative-one value, and it pushes the mean of the activations closer to zero; activations closer to zero lead to faster learning and convergence. ELU avoids the dying ReLU problem, but it saturates for large negative numbers; everything is a trade-off with these things. Okay, let's take a look at the Swish activation function. It has a slope that dips and eases out to zero on the negative axis, and a linear gradient on the positive axis, so it looks similar but a little bit different. Swish was proposed by the Google Brain team as a replacement for ReLU; it's called Swish because of its swishing dip. It looks similar to ReLU, but it's a smooth function: it never abruptly changes direction. It is non-monotonic, meaning it does not always move in one direction. Similar to ReLU, we'll have sparsity: very negative values will zero out. There are other variants in the Swish family: Mish, hard Swish, and hard Mish. [Music] Let's take a look at maxout. This is a function that will take multiple inputs, select the maximum value, and return that value.
Maxout is a generalization of the ReLU and leaky ReLU functions; a maxout neuron has all the benefits of ReLU neurons without the dying ReLU problem, but maxout is expensive, as it doubles the number of parameters for each neuron. [Music] All right, here's our last one: the softmax activation function. It calculates the probability of each class over all possible classes, and when used for multi-class classification models it returns the probabilities of each class, with the target class having the highest probability. The calculated probabilities will be in the range of zero and one, and the sum of all the probabilities is equal to one. Softmax functions are generally used for multi-class classification on the output layer; again, as I said, if you look at the Transformer architecture, which probably is in this course, you will see it there, and you'll see it in other ML model diagrams for sure. You can only assign a single label to a probability with this. [Music]
Okay, let's define an algorithm and a function. An algorithm is a set of mathematical or computer instructions to perform a specific task, and an algorithm can be composed of several smaller algorithms. You're basically saying how you do something; that's what an algorithm is: how are we going to do something. I want to take a look here at the k-nearest neighbors (KNN) algorithm, which can be used to create a supervised classification machine learning algorithm: tell me who your closest neighbors are, and we will infer that you can be considered of the same class. Within KNN you can use different distance metrics, such as Euclidean, Hamming, Minkowski, and Manhattan; there are all different ones you can utilize. A function is a way of grouping algorithms together so you can call them to compute a result; sounds like a machine learning model, right, where you have a grouping of algorithms. So look at this KNN here for a moment, because we do see this happen a lot, but k-nearest neighbors is just: how close am I from here to here to here? It's literally in the name: who are my nearest neighbors. So KNN itself is not machine learning, but when applied to solve a machine learning problem, it becomes a machine learning algorithm, as in the sketch below. [Music]
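As a minimal sketch of KNN used as a supervised classifier, here's what it might look like with scikit-learn; the toy points, labels, and choice of metric are made up for illustration:

```python
# A minimal sketch of k-nearest neighbors as a supervised classifier,
# using scikit-learn. The toy points and labels are made up for illustration.
from sklearn.neighbors import KNeighborsClassifier

# two features per sample; labels 0 and 1 form two loose clusters
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = [0, 0, 0, 1, 1, 1]

# k=3 neighbors; metric can be "euclidean", "manhattan", "minkowski", etc.
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X, y)

print(knn.predict([[2, 2], [7, 9]]))  # -> [0 1]: class of the nearest neighbors
```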
Okay, let's take a look at what a machine learning model is, but before we do that, let's define what a model is in general terms. In general terms, a model is an informational representation of an object, person, or system. Models can be concrete, so they have a physical form: think a design of a vehicle, or a person posing for a picture. Then you have abstract models, expressed as behavioral patterns: think mathematical formulas, computer code, written words. So what is a machine learning model, then? An ML model is a function that takes in data and performs a machine learning algorithm to produce a prediction. The machine learning model is trained, not to be confused with the training model, which is still learning to make correct predictions; an ML model can be the training model, simply deployed once it has been tuned to make good predictions. So normally you'd have training data, let's say labeled data, and here you're going to have your learning algorithm and you're going to put it through training; that's your training model. Then you have hyperparameter tuning, where you are continuously tweaking the model to get it to where you want it to be. Then, once you deploy the model, that is your trained model, your machine learning model, which can go and produce predictions, and from there you could provide it unlabeled data, because its goal is to make predictions, and that could be labeling data or doing other things. And we call the interaction with the deployed machine learning model inference: when you are inferring something, you are providing data and saying, hey, can you make a prediction for me? That's what inference is. [Music] Okay, so let's take a look at what a feature is. A feature is a characteristic extracted from our unstructured dataset that has been prepared to be ingested by our machine learning model to infer a prediction. ML models generally only accept numerical data, so we prepare our data into a machine-readable format by encoding, which we will visit later in more detail. So let's talk about what feature engineering is: feature engineering is the process of extracting features from our provided data sources. Imagine you have your data sources, from which you get your raw data; you're going to clean and transform that into features, turning it into machine-readable information for your machine learning models, and then you go from there. [Music] So what is inference? Inference is the act of requesting and getting a prediction, and when we're talking in the context of machine learning, we're inputting data into a machine learning model that has been deployed for production use, to then output a prediction. So imagine our raw data is a banana, and we say "tell me what this is" to the machine learning model; it's going to bring back information saying it's a yellow banana, with a confidence score of 0.9. If we look at the textbook definition of inference, it's the steps in reasoning, moving from premise to logical consequence, but I think it's easiest to remember it as the act of requesting and getting a prediction. [Music] Okay, let's talk about parameters and hyperparameters.
A model parameter is a variable that configures the internal state of a model and whose value can be estimated: the value of a parameter is not manually set; it is learned and output after training, and parameters are used to make predictions. Then we have the model hyperparameter: a variable that is external to the model and whose value cannot be estimated from the data. The value of a hyperparameter is manually set before the training of the model, and hyperparameters are used to help estimate the model parameters; so we have things like learning rate, epochs, and batch size. And here's kind of a diagram, which hopefully helps this make sense: imagine you have a variable and you want to input it into your model, and we'll just make a box here to indicate that this is the model; it's going to go into layers, and we'll talk about this again later on, but parameters are the connections between nodes. The idea is that each connection will have a value, and it will have a weight, and those weights are that internal state, those parameters. So when you want to utilize something for training, you're going to pass in content or variables, it's going to go through all those layers, and then all these connections, these parameters, have to be set so you get the result you want. Hopefully that is clear, and there's a small sketch of the split below; we will cover it again later on if it's not clear. [Music]
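Here is a small sketch of that split using scikit-learn; the specific estimator and numbers are made up for illustration, but the pattern holds: hyperparameters go in before training, parameters come out after:

```python
# A minimal sketch of the distinction: hyperparameters are set by hand before
# training; parameters are estimated from the data during training.
from sklearn.linear_model import SGDRegressor
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 2.0 * X[:, 0] - 0.5                    # "ground truth": w=2, b=-0.5

# hyperparameters: external, chosen manually, set before training
model = SGDRegressor(eta0=0.01, max_iter=1000, learning_rate="constant")

model.fit(X, y)                            # training estimates the parameters

# parameters: internal, learned values, not manually set
print(model.coef_, model.intercept_)       # close to 2 and -0.5
```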
Okay, hey, this is Andrew Brown. Let's take a look at responsible AI, specifically for AWS. Often you'll see a list of things like fairness, explainability, privacy and security, safety, controllability, veracity, robustness, governance, and transparency; this is the list that AWS defines, and others, like Microsoft, have similar lists, so they're more or less the same. But for exams like the AI Practitioner, they might give you a list of these, so you might want to remember those key terms. Let's go ahead and see what we have in terms of resources for responsible AI here. We have model evaluation on Amazon Bedrock; we have Amazon SageMaker Clarify, which we look at later, that's for explainable AI, to determine what's going on; and again we have Guardrails, we have a lab on that, so we look at that. We have Clarify again, Clarify again, Model Monitor, which is more about monitoring the degradation of a model, and we do talk about that; and Amazon Augmented AI, which is a human reviewing the endpoints; so all these things are covered. Yeah, it doesn't look like they have a whole lot here. We'll see: AI service cards provide transparency documentation on intended use cases and fairness; I know Microsoft has something very similar. But yeah, I guess they're just down below here; not super exciting, to be honest. You've got a bunch of stuff you can read through so you can see how they're being responsible with it, I guess. So nothing super, super exciting here, but I guess Clarify is their big thing, that and remembering this list. [Music]
[Music] Okay, all right, let's compare AI versus generative AI, first starting with AI. So what is AI? It is computer systems that perform tasks typically requiring human intelligence. This includes problem solving, decision making, understanding natural language, and recognizing speech and images, and AI's goal is to interpret, analyze, and respond to human action, to simulate human intelligence in machines. I put a large emphasis on the word simulate, because it's not emulating. Simulation is where we are mimicking aspects resembling the behaviors of humans or other things; emulation is when we're actually replicating exact processes in machines, the real virtualization of the human mind, and you could say that's pretty much AGI, artificial general intelligence. But the point here is that AI is a simulation, not an emulation. AI applications are vast and include areas such as expert systems, natural language processing, speech recognition, robotics, and more. For industries, it's across the board: if you're talking about B2C, everyone's probably experienced a customer service chatbot (I hate them, but that is probably the number one use for gen AI or AI); we have e-commerce recommendation systems, so think of using Amazon and maybe it's using AI there without you being aware of it; autonomous driving vehicles; medical diagnostics; so lots of verticals for different industries. Let's talk about generative AI. Generative AI is a subset of AI focused on creating new content or data that is novel and realistic. It can interpret or analyze data, but it also generates new data itself. It often involves advanced machine learning techniques, so we've got our GANs, our VAEs, our GPTs, the transformer models, and we're going to be talking about Transformers, not the Autobots and Decepticons but the actual architecture for LLMs. Generative AI has multiple modalities, and if you've never heard the word modalities, think of it like senses: you have vision, touch, taste, things like that. For gen AI we have modalities for vision, for text, for audio, and there are some odd ones like molecular; one thing generative AI can do is drug discovery via genomic data, hopefully I'm saying that correctly. A lot of people associate generative AI with large language models, which generate human-like text. LLMs are a subset of gen AI, but they're often conflated with being all of gen AI because they are the most popular and developed, and they have a strong correlation to the text modality. Large language models can be multimodal, meaning they can work across multiple modalities, but it's mostly text. So let's just sum this up and make sure we know the distinction between AI and gen AI. AI is focused on understanding and decision making; gen AI is about creating new and original outputs. That doesn't mean gen AI can't do the former, but it has that added benefit of generation. For data handling, AI analyzes and makes decisions based on existing data; gen AI uses existing data to generate new data and unseen outputs. For applications, AI is generally more broadly applicable, whereas gen AI is very focused on creative, innovative generation of synthetic content. I think you understand the distinction between the two, and there you go. [Music] Hey, this is Andrew Brown, and we are taking a look at what a foundational model is. A foundational model is a general-purpose model that is trained on vast amounts of data. We say that a foundation model is pre-trained because it can be fine-tuned for specific tasks. Imagine text, images, video, structured data, all sorts of data in massive amounts; that is what produces your foundational model, and from that foundational model it can do all sorts of things: prediction, classification, text generation. It might be limited to very specific things it can do, but the point is that it can do a lot of things. The reason I bring up foundational models is because you hear the term a lot when we're talking about LLMs, and it becomes a bit confusing how to distinguish between LLMs and foundational models. But LLMs are a specialized subset of foundational models: they are foundational models that use the Transformer architecture. If you remember that LLMs are foundational models specifically using the Transformer architecture, then that will help it all make a whole lot of sense. [Music]
Okay, let's talk about large language models. This is going to be short, even though there's a lot you could say about them; I just want you to remember one key thing. A large language model is a foundational model that implements the Transformer architecture, and we're going to spend a bit of time learning about the Transformer architecture in upcoming videos. The idea is that you have natural language as your input (I'll get my pen tool out here); it goes to the large language model, which predicts output words, and as it produces each word it feeds it back in and continues until it is done. During the training phase, the model learns the semantics or patterns of language, such as grammar, word usage, sentence structure, style, and tone. That's what makes it so good at interpreting language and producing output that reads like real language understanding: it has the ability to capture the semantics of language. It would be simple to say that LLMs just predict the next word in a sequence, because as you use the model it outputs a word on the end and keeps feeding the sequence back in until it's done. But the honest truth is that researchers do not fully understand how LLMs generate their outputs; there are so many layers, and so much going on, that at this point the level of complexity makes it very difficult to truly understand how the model is reasoning its way to an output. It looks like it's just going word by word, but there is a bit more to it, okay. But there you go.
[Music] The Transformer architecture was developed by researchers at Google and is effective at natural language processing due to multi-head attention and positional encoding. Here is that architecture; it comes from the white paper "Attention Is All You Need", because attention is the special mechanism it uses to pull off the feats it does. What came before it were CNNs and RNNs, convolutional neural networks and recurrent neural networks. Recurrent neural networks could kind of do what Transformers do, but they had an issue with scaling and with remembering everything they were looking at, and this architecture found a way to solve that, with positional encoding and multi-head attention. How important is it to know this architecture? It's nice to know, so you get a bit of a feel for what's going on, but to be honest, when you work with LLMs constantly you kind of forget about it, and it doesn't really inform your workflows or decisions. What it does give you is more confidence reading white papers and looking at other architectures, and that's why we're looking at it. Transformer architectures are made up of two components, or two parts. We have an encoder (let me get my pen tool out here so it's very clear; it's this thing here, that is our encoder), and, as you can guess, the one on the right is going to be our decoder. Let's read about what the encoder is: it reads and understands the input text; it's like a smart system that goes through everything it's been taught and picks up on the meanings of words and how they're used in different contexts. So that's the high level. Then the decoder: based on what the encoder has learned, this part generates new pieces of text; it's like a skilled writer that makes up sentences that flow well and make sense. As far as I understand it, you put your data in here (it has to already be embedded), it goes through the encoder, and the encoder's output goes into the decoder. Then, as the model iterates, each word is produced one at a time: it produces the sentence with the next word, then feeds all of it back in, again and again, looping until it runs out of the ability to write or decides to stop. And there are very specific components that we're going to look at, the multi-head attention and the positional encoding; we didn't really describe them here, but there they are, and you'll see them up close in just a moment.
[Music] Okay, so tokenization is the process of breaking data input, in most cases text, into smaller parts. Here on the right-hand side, imagine you have a string: you're going to break it up into its parts, which we represent as an array here, and then give each part a unique ID from the model's vocabulary. When we're working with LLMs, inputs have to be tokenized, and depending on which LLM you're using, a different tokenization algorithm applies. For example, if you're using GPT-3 you'd be using byte pair encoding; if you're using BERT you'd be using WordPiece; if you're using Google T5 you'd be using SentencePiece. You won't really notice this when working with LLMs, especially if you're utilizing something like Ollama or a managed service, because those APIs take care of the algorithm for you; you just input text and it works. But under the hood, the input text must be converted, or tokenized, into a sequence of tokens that match the model's internal vocabulary. What do we mean by internal vocabulary? Well, when an LLM is trained, it creates an internal vocabulary of tokens of all the stuff that it knows: if you consume the world's knowledge, you want to take all that text, break it down into its unique components, the tokens, and assign a value to each one. These large models can have vocabularies of roughly 30,000 to 100,000 tokens, more or less depending on the model, but tokenization is very important so that the model understands what's going on. There are some things we could talk about, like what happens when the model encounters a token it doesn't know, but for the most part this is the tokenization you need to know, okay.
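If you want to see real token IDs, the tiktoken library (the tokenizer used by OpenAI's models) makes it easy; here's a small sketch, and the exact IDs you get depend on the encoding you pick:

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a byte-pair-encoding vocabulary

ids = enc.encode("The quick brown fox")
print(ids)              # unique IDs from the model's vocabulary
print(enc.decode(ids))  # round-trips back to the original string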
[Music] Okay, let's talk about tokens and capacity, because it really matters for how much you can produce. When using Transformers, the decoder continuously feeds the sequence of tokens back in as output, to help predict the next word. So what are we talking about here? Imagine our input is "the quick". We feed that into the encoder, which produces semantic context so the decoder knows what to do with the text, and then the decoder outputs the next word: "the quick brown". It then feeds that sequence of tokens back into the decoder and produces the next word, again and again. So the question is: what capacity is required to run this? There are two components we care about, memory and compute. For memory: each token in a sequence requires memory, so as the token count increases, the memory use increases; the memory eventually becomes exhausted and you cannot produce any more. For compute: a model performs more operations for each additional token, so the longer the sequence, the more compute is required. A lot of AI services that offer models as a service will often have a combined limit on input and output, because it really comes down to the length of the sequence: if you have a huge input, you're not going to be able to generate a lot of words, because you'll hit that sequence token limit a lot quicker. Hopefully that makes it clear how memory and compute are intertwined with tokens. The way costs come down is that providers have to figure out ways to make the model more efficient, which helps reduce the memory and compute. There are other things you can do, too: if a conversation gets too long, you can summarize the conversation and feed the summary back in; it doesn't use exactly all the context it had before, but it can do something similar and keep that conversation going, okay.
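Here's a minimal sketch of budgeting against a combined input-plus-output limit with tiktoken; the 4,096-token limit is just an assumed example, since every model defines its own:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_LIMIT = 4096  # assumed combined input + output budget for some model

def tokens_used(messages: list[str]) -> int:
    return sum(len(enc.encode(m)) for m in messages)

conversation = ["You are a helpful assistant.", "Summarize this article..."]
remaining = CONTEXT_LIMIT - tokens_used(conversation)
print(f"{remaining} tokens left for the model's output")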
[Music] So what are embeddings? Well, before we can answer that, we need to answer: what is a vector? A vector is an arrow with a length and a direction. That is the simplest explanation; if you're talking to a mathematician, they'll have a fancier one. The reason this matters is that a vector needs to exist in a vector space. So what is a vector space model? It represents text documents, or other types of data, as vectors in a high-dimensional space. Right now we're only looking at a 2D plot, but in reality this would be at least a 3D space. The idea is that we have these documents, each representing some form of data, and they are plotted into our vector space with distances between them. And the distances between documents correlate to the relationships between them: maybe these documents up here (I'll get my pen out) all have something to do with vegetables, these ones all have something to do with meat, and this is dairy products over here. The way these things are organized in the vector space really depends on the type of embedding you use. So what are embeddings? These are vectors of data used by ML models to find relationships between data, and you'll often be using a machine learning model to create embeddings; there are specialized machine learning models just for embedding. You'll see a company like Cohere, which creates its own ML models; they'll have Command R, but they'll also have an embeddings model, and what it does is take an input and output embeddings to be placed into a vector store. Different embedding algorithms capture different kinds of relationships: the relationship could be similarity between words in terms of how they're spelled, it could be the length of a word, or the relationship could be contextual, like whether the context relates to a specific industry or vertical. So the embedding changes the relationships that get projected into the vector space, and the ML models that produce embeddings are looking at not just a single relationship, like length of word, but multiple relationships, correlating them to place the vector in the space. You can think of embeddings as external memory for performing a task with machine learning models; embeddings can be shared across models, which gives us a multi-model pattern to help coordinate tasks between models. But, yeah, there you go.
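Here's a tiny NumPy sketch of the idea; the 3-dimensional vectors and their values are made up for illustration (real embedding models output hundreds or thousands of dimensions), but cosine similarity is the standard way to measure how close two embeddings are:

import numpy as np

# toy 3-d "embeddings" (real models output hundreds of dimensions)
carrot  = np.array([0.9, 0.1, 0.0])
potato  = np.array([0.8, 0.2, 0.1])
chicken = np.array([0.1, 0.9, 0.2])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(carrot, potato))   # high: both in the "vegetable" cluster
print(cosine_similarity(carrot, chicken))  # lower: different clusters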
[Music] go positional encoding is a technique used to preserve order of words when processing natural language Transformers need positional encoders because they do not process data sequentially and would lose order of understanding when analyzing large bodies of text the precursor to Transformers is um RNN so uh recurrence neural networks they operated in uh sequential order so they could retain the order of words however uh it made it hard to scale and to uh remember a large amount of words uh to a point so positional coding is a way to fix that in the architectural diagram
for Transformers you'll see positional en coding right after embeddings in this architectural diagram here we have positional codings up here so the idea is we have our input it gets token ized um turned into tokens then embedded into embeddings it's going to go to the positional encoding where it inserts those uh points and then we're on to our Transformer here okay but let's take a look at the input a bit closer so imagine you have each of those words or those tokens you're going to give them a positional vector and that's how it's going to
keep track of words as it's getting mangled uh and interpreted through the whole architectural uh diagram okay [Music] let's take a look at attention so attention figures out how each word or token in a sequence is important to other words within that sequence by assigning them word weights or token weights or attention weights if you will um so I want to talk about three types of attention we have self attention cross attention and multi-head attension and some of these are combined you'll see that in a moment but let's talk about the first one self attention
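The original "Attention Is All You Need" paper used sinusoidal positional encodings, where each position gets a unique pattern of sine and cosine values: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). Here's a short NumPy version of that formula:

import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    pos = np.arange(seq_len)[:, None]      # each token's position
    i = np.arange(d_model)[None, :]        # each embedding dimension
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions get sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions get cosine
    return pe                              # gets added to the embeddings

print(positional_encoding(seq_len=4, d_model=8).shape)  # (4, 8)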
[Music] Let's take a look at attention. Attention figures out how important each word, or token, in a sequence is to the other words within that sequence, by assigning them word weights, token weights, attention weights if you will. I want to talk about three types of attention: self-attention, cross-attention, and multi-head attention, and some of these are combined, as you'll see in a moment. The first one, self-attention, computes attention weights within the same input sequence, where each element attends to all other elements; when you see this, it basically means that as attention happens, the sequence keeps feeding right back into itself. It's used in Transformers to model relationships within sequences, so words in a sequence. Then you have cross-attention, which computes attention weights between two different sequences, allowing one sequence to attend to another; this is used in tasks like translation, where the output sequence (decoder) needs to focus on the input sequence (encoder). And we have multi-head attention, which combines multiple self-attention or cross-attention heads in parallel, each focusing on different aspects of the input; it's used in Transformers to improve performance and capture various dependencies simultaneously. So how can we look at this practically in the Transformer architecture? Here you can see that, in blue, it says multi-headed self-attention. It's multi-headed because it's receiving multiple inputs: you see V, K, and Q. I believe Q is for query, K is for key, and V is for value; it has something to do with how search engines think. Kind of like if you were to use YouTube and type in a query, it would match against keys, which would then return you a value. That's the best description I can give for it. It's self-attention because it feeds its own sequence back in; it's the same sequence back and forth. On the other side, we have multi-headed cross-attention: it's multi-headed because it's receiving multiple inputs (again V, K, and Q), but it's cross-attention because it takes sequence inputs from two different sources. Remember, cross-attention means two different sequences: V and K come from the encoder, and Q actually comes from the decoder. It's cut off here, but the idea is that the decoder feeds itself right back into itself, passes through this block here, and that gives us the Q, which goes right there, okay. Again, it's not super important to memorize this; it's just to give you a bit of exposure to these architectural diagrams and to show that there is a way to understand them. They can get very involved, and it might be hard to retain the information unless you are actually very invested in understanding and building these things.
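To ground the Q, K, V idea, here's a minimal NumPy sketch of scaled dot-product attention, the core operation inside each attention head; the random matrices are placeholders for learned projections of the token embeddings:

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # how well each query matches each key
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax -> attention weights
    return weights @ V                              # weighted mix of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, 8 dimensions
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)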
[Music] Okay, when we're talking about large language models, there's this idea of fine-tuning: if we have a model we don't like, we can do something to it to make it work a little bit better. To understand fine-tuning and the ways we can fine-tune, let's first talk about the components involved, starting with hidden layers. When training, you have layers of nodes, also called neurons, so think of your brain, and between these nodes there are connections. Connections usually run between, or across, layers, but connections can also be within the same layer, and that's where we get the concept of self-attention. If you remember, attention is really important when we're talking about Transformers for large language models, and if we represented it, it would look more like a layer connecting back to itself; that's why we call it self-attention, a layer that feeds back into itself. Connections can also span multiple sets of hidden layers, with those connections computed in parallel. I'm going to just draw this here: imagine we have another layer with nodes, and this one feeds into that one, but this other one is coming from over here; now it's called multi-head attention, because it's coming from multiple sources, and in fact some of these come all the way back around and feed in again. So those are ways we can feed our data forward. Then we have parameters. Parameters are the weights of connections: over here on the right-hand side (pen tool out again) we have a weight, and this weight is the representation of the connection between these two nodes, so it's a value. A connection might have one parameter, but it can also have multiple; in most cases it's one, but you can imagine that for the number of nodes in each layer, each one has to connect to all the nodes in the next layer, and that adds up really quickly. Let's look at some Transformer models, or large language models, and how many layers they use, to get perspective. Take GPT-3: GPT-3 is not new, and in fact it is one of the smaller families of models you can still fine-tune, with variants like Babbage or Davinci; if you use, say, Microsoft Azure AI Studio and you want to do fine-tuning, you can train GPT-3 models. The large variant has 96 layers, and in terms of parameters, that's 175 billion, so you can only imagine how many nodes and connections are going on in there. Then we have BERT: BERT has 12, or up to 24, layers; BERT is still useful, and it's a much simpler Transformer we can utilize. We have GPT-2, which has between 12 and 48 layers, so the same as BERT or more. Then you have Google's T5, which has 12 encoder and 12 decoder layers, or up to 24 layers in larger variants. So when we're talking about fine-tuning, it's going to involve tweaking which layers and connections we train, and things like that. But let's go define fine-tuning.
Fine-tuning is retraining a pre-trained model's weights, its parameters, on a smaller data set. A model's weights are the output state of a model, and in this case, when we talk about fine-tuning, we're talking about a trained model's output. So then, what is supervised fine-tuning, or SFT? This is where the data set we provide has already been labeled. Imagine we have a bunch of photos of animals and we label what each animal is, so that when the model is training, it's like it has a cheat sheet telling it exactly what it's looking at. We're explicitly telling the model what the data is, as opposed to training our base model, which might be unsupervised, where we're not saying what anything is; imagine trying to do supervised training on a huge data set, labeling all of that would be very difficult. So the idea is that we produce our base model first, and in the case of LLMs the base model is the foundational model. You take an existing foundational model, and once we decide to fine-tune it, it's now being called a pre-trained model. So understand those terms: we have FM, base model, and pre-trained model; they don't necessarily mean exactly the same thing, but they represent the same thing at this point in time. So we get ready to take our base model and fine-tune it: we bring in our smaller data set (I'll just clear all the ink off the screen here), we retrain, and we produce our fine-tuned model. Now, when I say we're producing or outputting these models, we're not actually outputting new models; we're outputting the model's weights. We're not creating new code; we're creating new output states of the model. It often sounds like we're creating something new from scratch, but that's not necessarily true. Let's now talk about the types of fine-tuning, because there are a lot of approaches, and this isn't even exhaustive. First, changing the data set, the data you're going to put in. We can do instruction fine-tuning: that's where we take a data set that shows exactly what we want, as in "if a person says this, you do that", so you're giving an example of what a person says and what the outcome should be. Then we have domain-specific fine-tuning: that's where you take a knowledge base, or a data set of specific knowledge, to update the model on that knowledge or make it focus more on that knowledge set. So if we had a generic LLM and we wanted to make it specifically for learning cloud computing, I could load it up with the most up-to-date cloud data, or even my own material, to make it teach the way I would teach. Then we have changing the method of training. There's full fine-tuning: this is where all the model's weights are updated, and it's expensive; think of full fine-tuning as traditional fine-tuning, where you take the existing model's weights after base training as the starting point and run them through the training process again. You can combine these approaches: you can do full fine-tuning and change the data set, together or separately, it's up to you. We have parameter-efficient fine-tuning, also known as PEFT, a term you'll see come up a lot: it only updates a small set of parameters during training and freezes the rest. There is a subset of PEFT called LoRA, which we're not going to talk too much about here, but I want you to have exposure to the term. If you don't need to update every single parameter, you're going to save money. Another way is last-layer fine-tuning: this is where you freeze all the layers except the last layer, and when we say freeze, we just mean we're saving the state at that point in time, or telling training to skip those layers until it gets to the last step, and then we basically train just that single layer, and apparently that works really well. Another thing we can do is pruning: this is where you remove parameters, and people might want to do this to make the model smaller and more efficient, because maybe we can remove parameters and it will use less compute, or be faster, for some trade-offs. There are two ways to do this: train-time pruning, where we somehow encourage the model to drop or remove connections or neurons during training, and post-training pruning, which is basically you mangling the model weights file, the file that's output. So yeah, a lot of options here, but there you go.
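Here's a minimal PyTorch sketch of the last-layer idea, using a made-up toy network rather than a real LLM; the layer sizes are arbitrary, but the freeze/unfreeze pattern is the core technique:

import torch.nn as nn

# a toy network standing in for a much bigger pre-trained model
model = nn.Sequential(
    nn.Linear(768, 256), nn.ReLU(),
    nn.Linear(256, 10),            # the "last layer" we want to retrain
)

for p in model.parameters():       # freeze everything...
    p.requires_grad = False
for p in model[-1].parameters():   # ...then unfreeze only the last layer
    p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")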
[Music] Let's take a look at labeling. Data labeling is the process of identifying raw data (images, text files, videos) and adding one or more meaningful and informative labels to provide context, so a machine learning model can learn from it. With supervised machine learning, labeling is a prerequisite to producing training data, and each piece of data will generally be labeled by a human. On the left-hand side is an example from Amazon Rekognition, where it's trying to identify bounding boxes or classify an image under particular categories; that's an example of supervised machine learning, which requires labeled data. With unsupervised machine learning, labels will be produced by the machine and may not be human-readable. Then there's this concept of ground truth: a properly labeled data set that you use as an objective standard to train and assess a given model. The accuracy of trained models depends on the accuracy of your ground truth, so ground truth data is very important for successful models. [Music] Okay, let's take a look here at data mining. This is the extraction of patterns and knowledge from large amounts of data, not the extraction of the data itself. The industry has this thing called CRISP-DM, which defines it in six phases: business understanding (what does the business need?), data understanding (what data do we have?), data preparation (how do we organize the data for modeling?), modeling (what modeling techniques should we apply?), evaluation (which model best meets the business objectives?), and deployment (how do people access the results?). That gives you an idea about working with data mining, okay. [Music] Let's take a look here at data mining methods. These are ways that we find valid patterns and relationships in huge data sets, and they're important when we're talking about machine learning, because sometimes that is exactly what the model is trying to do: find a pattern or relationship and predict from it. I'm not going to read through all of this, because you can read it if you want, but these are terms we've seen already: classification, clustering, regression, sequential patterns, association rules, outlier detection, and prediction. Notice down here that prediction uses a combination of other data mining techniques, such as trends, clustering, and classification, to predict future data, which is fine; but classification, clustering, regression, and association, these four, are going to show up again and again when we're looking at classical machine learning models. Anyway, I just wanted to include that, even though this is more of a data slide.
[Music] Okay, let's take a look here at knowledge mining. This is a discipline in AI that uses a combination of intelligent services to quickly learn from vast amounts of information. It allows organizations to deeply understand and easily explore information, uncover hidden insights, and find relationships and patterns at scale. This is a term that was kind of coined over at Microsoft; you don't hear about it at AWS or GCP, but it's still a good concept to know. The other thing is that when we look at RAG, retrieval-augmented generation, there is a lot of overlap; in many cases you can look at RAG as knowledge mining. But let's talk about what we have here. The first step is ingest, then enrich, then explore. Ingest: ingest content from a range of sources, using connectors to first- and third-party data stores; we have structured data like databases and CSVs, and unstructured data like PDFs, video, images, and audio. Enrich: enrich the content with AI capabilities that let you extract information, find patterns, and deepen understanding. For managed AI services, that means vision services, language services, speech services, decision services, and search services. Now, those literally map to Azure AI managed services, but if we're talking about AWS: for vision, we're talking about Rekognition; for language, I'm trying to remember the AWS service that does NLP here, I can't remember it off the top of my head; for speech we have Polly; and for search, which is not necessarily AI, it could be Kendra. So there are a lot of managed services that can be utilized at that level. Then we have explore: explore the newly indexed data via search, bots, or existing business applications and data visualizations. It could be used in a CRM, in an ERP system, or in Power BI, and, I didn't list it here, but the results could also be returned to an LLM to interpret, completing RAG. So there you go. [Music] Let's take a look here at data wrangling. This is the process of transforming and mapping data from one raw form into another format, with the intent of making it more appropriate and valuable for a variety of downstream purposes, such as analytics. It's also known as data munging; I don't know who comes up with all these terms, they're crazy. There are six core steps behind data wrangling. Discovery: understand what your data is about, and keep domain-specific details about your data in mind as you move through the other steps. Structuring: organize your content into a structure that will be easier to work with for your end results. Cleaning: remove outliers, change null values, remove duplicates, remove special characters, and standardize formatting. Enriching: append or enhance the collected data with relevant context obtained from additional sources. Validating: authenticate the reliability, quality, and safety of the data. Publishing: place your data in a data store so you can use it downstream. When we're talking about AWS specifically, for data wrangling there is SageMaker Data Wrangler, and there's AWS Glue DataBrew; and there's that concept of knowledge mining, which we said was more of an Azure concept, but you can apply it to AWS, where you could use managed AI services to enrich your data. Anyway, this is the concept of data wrangling; the way I think of it is just pre-processing or cleaning your data for use by your ML models. But there you go.
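Here's what a few of those wrangling steps might look like in pandas; the file name and columns are hypothetical, just to illustrate cleaning, standardizing, and publishing:

import pandas as pd

df = pd.read_csv("raw_sales.csv")                       # hypothetical raw extract

df = df.drop_duplicates()                               # cleaning: duplicates
df["price"] = df["price"].fillna(df["price"].median())  # cleaning: null values
df = df[df["price"] <= df["price"].quantile(0.99)]      # cleaning: outliers
df["city"] = df["city"].str.strip().str.title()         # standardize formatting

df.to_parquet("clean_sales.parquet")                    # publish for downstream use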
[Music] Let's take a look here at what data modeling is. First, let's answer: what is a data model? It is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. A data model can be a relational database that contains many tables, as in the example here on the right-hand side. A data model could be conceptual: how data is represented at the organization level, abstractly, without concretely defining how it works in software, so think people, orders, projects, and their relationships. It could be logical: how data is presented in software, so tables and columns, or object-oriented classes. Or physical: how data is physically stored, such as partitions, CPUs, and tablespaces. We had another definition for what a model is when we were describing machine learning models, but I'm giving you another perspective, from a data perspective, of what it can be. So what is data modeling? It is the process used to define and analyze the data requirements needed to support the business processes, within the scope of the corresponding information systems in an organization. And there is this very complex diagram here that kind of shows you what data modeling can look like, but yeah, there you go. [Music] Okay, let's take a look at data analytics. Data analytics is concerned with examining, transforming, and arranging data so you can extract and study useful information. A data analyst commonly uses SQL, BI tools, and spreadsheets, and the workflow looks something like this: data ingestion, data cleaning and transformation, dimensionality reduction, data analysis, and visualization. It's really important to understand data work if you want to work with machine learning, because machine learning is just complex algorithms that predict or forecast things based on data; so again, we're going to keep spending time learning about data topics to help us with machine learning. [Music] Okay, a data scientist is a person with multidisciplinary skills in math, stats, predictive modeling, and machine learning; you're basically bringing computer science, math and statistics, and domain knowledge together into one role. There are these other skills that almost make you a data scientist, but if you're missing one of them, you're not necessarily a data scientist. The reason computer science is important is that a lot of machine learning models are based on algorithms, so having that traditional comp-sci background helps. Classical machine learning heavily relies on statistics, so that math and stats background is very useful. Software development skills matter because you're going to be writing lots of Python, or, if you want something very performant, lower-level languages. You need traditional research skills, because you need to source, clean, prepare, and analyze data, and make sure it's valid. And to build anything of use, you need deep domain knowledge in a specific industry. So there you go; the definition and responsibilities of data science can vary per company, but a data scientist will generally have strong specialization in one of these three, so just understand that I'm defining it here and you'll probably see variations of the definition online. [Music] All right, let's do a data role comparison, just in case we're not sure what all these data roles are. Data mining: get knowledge about a particular data set and use this knowledge for learning or processing purposes. Data wrangling: convert and map data from its raw form into another format, with the purpose of making it more valuable and appropriate for advanced tasks such as data analytics and machine learning. Data analysis: use existing information to uncover actionable insights, answering questions for better business decision making. Data scientist: multidisciplinary skills in math, stats, predictive modeling, and machine learning to make future predictions. Data engineer: focused on the infrastructure and architecture of data generation and the movement of data, deploying machine learning models at scale or in a distributed architecture. So there you go.
[Music] All right, when we're talking about data sets, there's the training data set, the validation data set, and the test data set. The training data set is the actual data the model is going to learn on. The validation data set is used to check whether the model is working correctly, specifically when we're looking to tune our model's hyperparameters. And then you have the test data set, which sounds very similar to the validation data set, but it's there to provide an unbiased evaluation of the final model after it's been trained. So where is ground truth in all this? All of these data sets can contain ground truth data, data that has been labeled as correct; the model doesn't know that, but you know it, and you use it as a means to test against your model, okay.
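A common way to carve out the three sets is two calls to scikit-learn's train_test_split; the 60/20/20 proportions below are just an example choice:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# first carve off a held-out test set...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# ...then split the remainder into train and validation
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200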
[Music] Okay, so we're talking about corpus. A corpus is a large collection of naturally occurring text, organized in a structured way for analysis. A corpus could be as little as 50,000 words or as much as tens of millions of words, and corpora can be sourced from books, newspapers, magazines, transcripts, and web pages. Examples: a company needs to create an English dictionary, so they need a corpus of text to provide examples for the dictionary's words; or a company needs to create academic text, so they need a corpus composed of transcripts from lectures and seminars. Corpora are intended to be analyzed to see how a language is being used. Now let's talk about what corpus linguistics is: this is the study of language that uses corpora to perform statistical analysis, hypothesis testing, checking occurrences, and validating linguistic rules, all within a specific language territory. Corpus linguistics is the act of looking for patterns that can be associated with lexical and grammatical features; those identified patterns are used to answer questions like: what is the most frequently used word, how do people use certain words, how often are certain expressions used, how many words does a person use to carry a conversation, and which words are used in a formal situation. So yeah, corpus and corpus linguistics are terms you'll come across when learning about machine learning, and I just wanted to make sure you knew what they were. [Music] Let's talk about what a data set is. A data set is a particular kind of data item that serves a specific purpose, for operations that can be performed using data in machine learning, and the following data types are important to know. We need to know what qualitative means: measured by the quality of something rather than its quantity. Then you have quantitative: measured by the quantity of something rather than its quality. Let's go down the tree on the qualitative side. Here we have categorical: these are values that are labels. You have discrete: something that is countable and finite, where only certain values are possible. Under discrete we have binary: data that has only two possible options, say 0 or 1, or true or false. Nominal: labels where order does not matter. And ordinal: labels where order does matter. On the quantitative side we have numerical values, so these are just numbers. You have continuous: not countable, with infinitely many possible values, and measurable. Underneath we have interval, a continuous value that has no true zero, and ratio, a continuous value that includes a true zero. These terms are going to come up when you're reading about machine learning, and having a general idea of them makes things a lot easier, because they're less programmer terms and more mathematical terms. [Music]
Okay, hey everyone, it's Andrew Brown, and I'm back with Rola again, taking a look at leaderboards. I think something that's really important is being able to understand how your model compares to other models, so that we can make the best choice when choosing our models. But I'm going to be honest with you: I don't know the best way to evaluate these, so I'm hoping Rola can help us dig a little deeper than what I was looking at in prior videos. I'm going to go ahead and share my screen; I already have a few links pulled up here, and I'm making sure it's the correct screen. I think we can all see my screen now; I'll go side view for right now. I have pulled up artificialanalysis.ai; we also have Anthropic Claude, I believe, and which model is this, oh, it's just the model family, so Opus and Haiku; we also have the Hugging Face leaderboard; and we have LiveBench. What I want to do first is go over to a model card and talk a little bit about what a model card is, and then Rola will promptly correct my description. Basically, a model card describes how the model works, how it performs, and, I don't know, other additional information that you might want to know about the model. What would your description of a model card be, Rola? What's the purpose of a model card? Yeah, so the model card is a summarized,
kind of like a resume, if you will, for a model. It'll tell you the basic architecture of the model, it's supposed to tell you what data it was trained on, sometimes how to use it, and everything you should need for using that model. So think of it as a resume. And, now, resumes can be standardized; is there some level of standardization with model cards, or an unwritten agreement that we should always have certain benchmarks, or do people basically just see what other people do and go, I like that, I'll throw that in my model card? Yeah, I think there's a big problem with standardization in general in the field. I think there's a common understanding that model cards are important and that they should exist; in terms of what goes on those model cards, I don't think there's general agreement. There are of course best practices, and different people doing different things, but yeah, I think every company is putting different things in them. One thing that I see in model cards a lot,
and again maybe it's not standardized, but I'll see these metrics or rubrics where they're showing different things they've run their model against. I'm assuming the things on the left-hand side, like MMLU, MATH, and HumanEval, are data sets, or some benchmarking they want to perform, and this is the result it got. I see this all the time, but one thing I hear a lot is to be cautious of benchmarks that come from the provider itself, because they could be biased, intentionally or unintentionally. So even if we have these, can we trust them, and where's the better place to look? And even if we have these benchmarks, what can we even do with them? How does it inform us? Okay, it's good at math; what does that mean for the Japanese game I'm trying to build, a visual novel, right? So there is another leaderboard (and you're the one that shared all these leaderboards with me, to be honest), one you showed me was Artificial Analysis. It looks like this basically has a collection of a lot of models, and if we go in and take a look at, let's say, Anthropic Claude 3.5 Sonnet, if we can find it (I know Anthropic is very popular these days), here we have June and October; October is newer, so we'll go ahead and choose that one. What I'm looking for down here is MMLU, right, and then we go over here and we see MMLU, and I'm going, oh, that's cool, but I still don't know what MMLU is. Okay, so let's maybe step back one, and then we'll talk about MMLU specifically. So there's this idea of benchmarks; actually, let's start one step further back. Traditionally, predictive machine learning, not the generative kind, was more task-specific, so you would come in, take a model, teach it a very specific task, and you would
evaluate it on that task. Generative AI, these very big models, have more general capabilities; they can do a lot of things, and there are what we call emergent capabilities as well, things we didn't think they could necessarily do, or that they weren't trained to do; it turns out that because they're big models, and because they see a lot of data, they can do a lot of things. And given that these are general models, not very task-specific, how do you know what they can do and how well they can do it? That's where the idea of benchmarking them comes in, to have these standard tests, kind of like we take the SATs, or different tests to qualify for university; there are these different tests for these models today, and there are a bunch of them, as you can see, and they're all made to test a very specific thing. The MMLU is Massive Multitask Language Understanding, and what it mainly checks is accuracy and acquired knowledge, so it's looking for knowledge; there's MMLU-Pro, which checks for reasoning as well. So what you're going to see is a lot of different benchmarks; technically they're sets of tasks, and they're geared to test something specific, and if you look at some of the leaderboards, they'll tell you which. And, sorry, just as a sidebar, because I keep seeing this word Chinchilla, and I don't know anything about
this model: what is with this Chinchilla model? I know this is totally off to the side here, but what is it? So this is a model, and actually I'm more interested in the Chinchilla paper than in the Chinchilla model. This is a really cool paper that came up with this model, and what it looked at was model size versus data set size versus compute resources; it really looked at all of the resources around model training. It's a really cool paper, and I think everybody should read it. It came out (we mentioned this in a previous video) to talk about the resource requirements of training, the data set required, and they argue that a lot of models are over-parameterized and under-trained, and that you need about 20 times the data set tokens compared to model size. So that's what that is. Oh, okay, so basically this is a research model; the idea was, let's make a really big model with a lot of parameters and then figure out how much data we have to pass in. If you view the PDF and scroll through it, it goes through different-size models and how much they need in terms of FLOPs, the compute demands, and the data set demands; it's really a paper that looked at the landscape of what people are doing versus what you should be doing in terms of training these models. Oh, okay, a really cool paper; I feel like that would demystify a lot about how much data you need to train, right? Yes, yes, exactly, and I think they came up with this 20 times rule: 20 times the parameter count, in terms of the number of tokens you need, as a proxy for data size, okay.
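As a quick worked example of that rule of thumb, using Chinchilla's own published size of roughly 70 billion parameters:

params = 70e9                # Chinchilla itself was ~70B parameters
tokens_needed = 20 * params  # the paper's ~20-tokens-per-parameter heuristic
print(f"~{tokens_needed:.1e} training tokens")  # ~1.4e+12, about 1.4 trillion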
So let's go back over to here; sorry, I just kept seeing it and needed to know what it is. And so I'm not going to ask what Gopher is; I imagine it's probably another research model. But anyway, over here we're on the Measuring Massive Multitask Language Understanding page. Yeah, so if you go back to the Artificial Analysis leaderboard, what you can see is MMLU; that's a reasoning and knowledge test, and that's what that benchmark is used for. If you see, for example, HumanEval under it, that measures coding capability. So each of these benchmarks is looking for a very particular capability, and depending on what you want to use the model for, you would come and look at its performance on that very particular data set. So if you want to use something for coding, you're not going to look at the MMLU, because that doesn't help you; you're going to look at the HumanEval performance. I see. So let's say I'm trying to build a visual novel generator, because I want to immerse myself in Japanese language learning; I would want to look for a specific benchmark that probably has to do with the creativity
of natural language, and I'm not sure which one that would be. But how would you know? Where would you find a list of possible benchmarks? Is there a standardized list somewhere? I don't know that it exists in any one particular place; I could look one up for you. But what usually happens in academic literature (you have to remember that although these things are very popular in industry, they have their roots in academia; all of these are peer-reviewed published papers) is that there's usually a summary review, and I'm sure if we go to PubMed or the archives we'd be able to find a paper. Let me actually look through the archives really fast for "benchmark". You're looking that up, seriously? Oh, sorry, yes; no, I think I've seen one in the literature, so you should be able to find an academic reference. I usually go for academic summary papers that go in and assemble everything that was published, putting it in a single place, and if you want, after this video, I can look one up and we can attach it to this video. That's reasonable. So, I mean, it looks like a lot of these, here at least, and we haven't talked about Hugging Face, but Hugging Face has their own leaderboard, and it looks like they have multiple leaderboards, so I guess it depends on what you're looking at. We're just in the Open LLM one, so that's obviously going to be open language models, and they have some very specific benchmarks here. I count, what is it, one, two, three, four, five, six, seven, so we have those seven there; how many do we have over here? We have basically six. I wonder if they're settling on the same ones each time, or maybe it's task-specific. If we go over to here, this one has
human reasoning, so it changes a little bit. But I guess I look at this and think, okay, math is great and all, but does everyone care about math? I'm just trying to make sense of it: I'm coming to Hugging Face and I want to choose a model; is this enough information for me to start figuring out which model to choose, or not? Well, if you want to choose a model, then you first have to decide: what do I need that model for, right? What task is it doing? Is it coding, is it creating language, is it doing reasoning problems? That's number one. Oh, sorry, so let's say creating language, right? So we go here, and this is all we have, and they seem very math-focused, at least; I mean, this one's general reasoning, right? I think we said MMLU is general reasoning. The reason math is important is that it's the Achilles' heel of these things: they don't do math very well, and so there's a general perception that they don't have a good understanding of the real world; they don't understand physics, they don't understand math; they're putting language one word after the other, and a lot of the time they don't understand the underlying reasoning. Physics and math are still hit and miss, and that's why they're important, and why we need to understand, if of course it's relevant to the task we're doing, whether these models are good at that, and whether they're starting to pick it up. So I guess my question is: if I'm doing creative world-building for my Japanese visual novel, does it need to know math, because that's part of the world-building? No, no, it doesn't, but that's why there are so many different benchmarks, right? If we were all looking for a single thing, there would be only one metric to look at, but because these things can do a variety of tasks, and because we use them for a variety of different things, there's a variety of different benchmarks. What you need to do is understand which benchmark specifically reflects the metric that
you care about, and then look at that. Now, we talked a little bit before about how the actual provider has their own benchmarks, and I should mention that a lot of these are peer-reviewed; again, a lot of this is academic, published in peer-reviewed academic papers. When you come in to publish a new model, what peer-reviewed means is you submit the work to a prestigious journal, and then at least two or three people have to evaluate it, and one of the evaluations is: well, how does it compare to what already exists? So as part of the publication process, there is some sort of benchmarking or metric assessment. Of course, that is done by the providers themselves, so there's potentially some bias, like you said; it could be totally unintentional, but it is subjective, and that's why these third-party leaderboard systems exist. And of course, the other thing that's important is that a publication is a single point in time, but these models are coming out all the time, so these leaderboards are cool because they keep up to date: they re-run things at a regular interval to see how models compare over time, and that's why they're interesting. It's always really cool to scroll through the initial paper that came out and see how the model did at the time, but the field is moving really, really fast, and these leaderboards run by third-party companies are interesting for seeing how things still compete as new models come out. Yeah. So what you want to do is understand what you want to use the model for, and look at the benchmark that covers that specifically; a lot of these leaderboards tell you what each benchmark measures, but you can also do some work yourself to understand what the benchmark is. Then
Then you want to look at performance and quality and all of that, which I think the leaderboards have. This one is really cool, LiveBench, and the reason I sent you that one is that there is a worry there might be some contamination in the benchmarking, in that these benchmark datasets are online, and a lot of what these LLMs are being trained on is data from online. A very important part of evaluation is that you don't cheat, that you don't already know the answers. If you have a test and you already have the answers and you studied them because your friend gave them to you, then we have a problem: the test is inaccurate. And since all of these benchmark datasets are online, and a lot of companies scrape online data as part of their training set, the answers could be leaked. So, just to try to rephrase: when you go and evaluate a model, there is data that is supposed to be withheld, that nobody knows the answers to, so that you can fairly evaluate it, but if they scrape the data, they might not even know it's in their model. Exactly, so there is a concern that these models are seeing the answers to these benchmarks as part of their training, and that makes the benchmarks themselves pretty inaccurate. What LiveBench does is renew the questions almost every month, so it's not a static snapshot that's online; there are questions constantly coming in, and that helps with the problem of models having seen the data as part of their training data, so it would be a little bit more accurate in that these questions are refreshed. Well, this sounds like real school then: they leak the answers and you're trying to prevent it from happening.
Exactly: they leaked, and now you make new questions every month or two. I recently had my Japanese lesson two weeks ago, and my teacher thought I did a really good job because she had used a piece of material from Shun's podcast, but what she doesn't know is that I had watched it earlier in the day because it came up in my feed. I didn't want to tell her; she was like, wow, you're doing really well, and I was like, oh yeah, but it wasn't intentional, I was just exposed to that information. And it's kind of like this, where you don't realize that the information out there could be accessible to your model as well; it already has that knowledge, so you're going, wow, you're doing a great job, and the model's going, yeah, I guess I must be really smart, when maybe it has seen that material before in its exact form. So this is really cool that LiveBench has some refresh cycle built in. One nice thing this one has, and this is just me looking at these things superficially, is that it's telling me a reasoning average; you go over here and this is reasoning, and knowledge, I guess, and MMLU. Yeah, exactly, so they tell you what the benchmark is and then what it's supposed to be testing for specifically. And there's a whole bunch; we can see that most leaderboards are choosing about six of them, and it looks like they're pretty much the same six.
But I think, based on what we know about how they're being used: we are using them for coding, so coding is there; we are using them for language and accuracy, so that relates to hallucinations; and for reasoning questions. So for the common things we use these models for, of course you would have metrics to give you an idea about their performance. Okay, and so then here comes my hard question, and I already know the answer to it, which is: what is your favorite model, Rola? And you're going to tell me it depends. It depends, it really depends, honestly. Surprisingly enough, and I don't know if I should say this, I don't use GenAI as much as I probably should. But I can tell you this: my team where we work, we do love Anthropic; we use a lot of it, and it's probably biased, because apparently Anthropic and Claude are really, really good at code generation, and we are coders, so we use a lot of Anthropic and Claude. At the same time, it's like, because you have deeper knowledge of machine learning, I think that maybe you know when not to use GenAI. Like, I'm really good at web application development; everyone uses Tailwind, and I can use Tailwind, but I would rather just write it by hand because I can get the exact results in the smallest footprint, and why wouldn't I do that if I knew how? So do you feel that a lot of the time you're just using the most cost-effective, precise tool because you have the domain knowledge to do so, and it's not that GenAI is bad, it's just that you know better? It's not that GenAI is bad, no; it's very powerful. I mean, listen, I'm an expert, I'm a professional, and like you said, you gain your experience, so I know what I know and how to do things. What GenAI is really, really good at, I find, is that I always have a hard time starting. If you're starting a report or a paper or something, it's really hard to put down those first few paragraphs and decide how to direct it, and that's where GenAI comes in: if it creates a draft for you, then it becomes really easy. I find I can jump in a lot faster with a draft than when I'm creating from scratch, so I use a lot of GenAI in draft creation, and in just polishing things a little bit. And I'm using large language models to learn Japanese, and that's about it; I don't really have any other use cases, but I'm sure other people do. I hear in health there's a lot of interest; again, I'm not exactly sure in what, but there's a lot of interest in evaluating large language models to do something with DNA or the like, I imagine, but that's beyond me. But yeah, again, I appreciate your time helping us look at leaderboards.
It's a lot more clear to me now how I can go about trying to choose a better model, and I hope it makes a lot more sense for folks that are watching. I'm sure you're going to pop up again and again to help us out, so again, I appreciate your time, and we'll get back to the course. [Music] Hey, this is Andrew Brown. In this video I want to take a look at the available AI-powered assistants that are out there. We're just going to pull them all up; I just want you to know where they generally are and get you some exposure to ones you may have never used before. I think a lot of people know ChatGPT, so that might be the first one we'll take a look at; it will be in its own tab here. So we'll open this one up, and I actually have a paid version of ChatGPT. I think the frustrating part is that every one of these AI-powered assistants costs 20 bucks, so you can't use them all; you'll have to pick your favorite. I like ChatGPT because, while it's not the best, it just generally works really well and keeps going. But it depends on your use case. Let's go open up another one: Gemini, by Google, which is another AI-powered assistant. I don't do a whole lot in here, but you can see I do a bit. Meta has one, Meta AI, which is another one. Another one is Mistral AI; whoops, I flipped my mouse over. Mistral AI, so here's another one that you might not be familiar with. I'm trying to think what other ones there are... you know what, I only missed one, and it was Anthropic's Claude, which, by the way, is one of the most capable models, at least right now. So we'll say Claude, and that's claude.ai, and I used to pay for Claude; it's really, really good. I'm going to see if I can log in here to get to it today. Obviously, if you have paid versions versus free ones, you're going to get different levels of performance, but I just want to pull them up so you can see the interfaces are somewhat similar. We'll go ahead and try the chat here; we'll open up Mistral AI and sign in with Google. And there are probably other AI-powered assistants out there; definitely in China they have their own AI-powered
assistants. This is kind of hard to find, because if you're not in China you wouldn't necessarily know, and I'm not even sure how they're accessible over here, but I know there is an AI-powered assistant there; it might be integrated into WeChat or Alibaba. I simply don't have access to it, so I'd love to show it and take a look, but I can't. One thing I want to point out about Meta AI and Mistral AI is that these are powered by open-source models. With Meta, it's probably being powered by Llama; if we go over here, this is the open-source model for Meta AI. And Mistral's assistant is powered by Mistral models, same name. You know, I don't know what the goal of Mistral is with their AI-powered assistant, because I don't feel like this is something they're trying to offer as a consumer product; it's more to show the capabilities of what Mistral can do. Whereas somebody like ChatGPT, they want you to pay for it: they have right now a $20 tier and a $200 tier. And Gemini, I think they want you to pay for this as well; they have an upgrade button here, and we can go over here and just take a look at the cost, and it's $21; it's always about $20. But you go to Meta AI and I don't think there's anywhere to pay or upgrade; here you can store your conversation history and generate images. I'm going to log in here; we'll log in with Facebook, just a moment... there we go. But again, I don't think there's any way to upgrade per se, because that's not their intention with this thing; it's there to show off their models, Llama, and I don't think they intend to make it a paid thing. This is also, I think, integrated into Facebook, so you're probably using it there if you ever use it in Facebook. But the capabilities are just different. I find ChatGPT is really good at going forever: whatever I want to do, it just goes and goes and goes, and I think what it's doing underneath is constantly summarizing information. I've never experienced a point where it said, hey, you've used up your limit, except for certain ChatGPT models. There are different models here: GPT-4o is the latest one, and, oh, maybe it's o1-mini, sorry, it's o1.
The o1 models will have a limit, whereas 4o just goes and goes and goes. There are some other models, like GPT-4. So again, if you're doing day-to-day stuff, ChatGPT generally works pretty well. Gemini is known for being able to parse documents really well, so maybe we can find a PDF to parse. I'm going to search for a PDF white paper on Transformers; the paper is called "Attention Is All You Need", right? So maybe what we could do is download this and extract the first page. If I go to print here, I'm on my Windows machine, but I can see if I can just save the first page, so give me a moment. I'm going to drop this down, say Adobe PDF, and I just want page one, just the first page. It says it's going to print, but it's actually going to save. Okay, so I'm saving this to my desktop.
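As an aside, if you'd rather not fight the print dialog, you can split a page out of a PDF programmatically. Here's a minimal sketch using the pypdf library; the file names are made up for the example.

```python
# a minimal sketch using pypdf (pip install pypdf);
# the file names here are hypothetical
from pypdf import PdfReader, PdfWriter

reader = PdfReader("attention_is_all_you_need.pdf")
writer = PdfWriter()
writer.add_page(reader.pages[0])  # grab just the first page

with open("page_one.pdf", "wb") as f:
    writer.write(f)
```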
I'm renaming it to "attention is all you need", or just "page one"; I'm going to call it page one on my desktop. And so now what we can do is try to drop this into our AI-powered assistant. I'm looking for page one... all right, I'm going to try to drop it in, and it says the file type is unsupported, which to me is unusual; I thought Gemini supported PDFs. But in the dropdown here we have Flash and Advanced, so maybe this is something only Advanced can do. Let's go back over here, because I'm told it's really, really good at processing PDFs and docs, so we'll take a look. So: Gemini Advanced does support PDF files if you put them in a Google Drive and refer to them by name. But is that in the free tier? I don't know. That's something I had seen before that was really great, but maybe I can't show it here today, which is a little bit of a shame. But we can upload images, so maybe that's what we'll do. First of all, let's just see if ChatGPT can handle our PDF; I have the paid version, so it probably can, which is probably unfair. So: tell me about the PDF. It can handle it, no problem. It says here, however, that the extracted text includes many encoding issues which make it difficult to read, and asks if I'd like it to attempt to extract it and make it clearer; go for it.
And this is something that is a lot of work to set up on your own, but if these tools can do it, it makes your life a lot easier. I've found that I get variable results when bringing in PDFs, though. Now, I'm not sure why Gemini isn't handling it here, but what we could do as a cheat, because it can take images, is take a screenshot of the page and drop it in that way. So let's see if I can do that. I'm going to make the text a little bit larger so I can see; this is really small, so I don't expect that to work, but let's see if I can paste it in here. So: can you OCR this for me? Let's see if it's capable of doing that. And actually it's doing a very, very good job... oh, look at that, it showed us that it was doing it, and then it decided it couldn't do it. That's really interesting. Okay, I'll say: but I just saw you OCR it and then you stopped.
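As an aside, if an assistant keeps bailing on OCR like this, a local OCR pass is always an option. A minimal sketch with pytesseract is below; it assumes you have the Tesseract engine installed on your machine, and the file name is made up.

```python
# a minimal local-OCR sketch (pip install pytesseract pillow);
# assumes the Tesseract engine itself is installed, and
# "page_one_screenshot.png" is a hypothetical file name
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("page_one_screenshot.png"))
print(text)
```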
How about this: we'll go back and try that again. Can you extract the text out of this image? Yeah, so there, it's doing it... what, that's crazy, it totally did it again. So that's kind of odd, but whatever. Let's go over to here; I don't know if we can upload it, but it said over here we should be able to upload images, so maybe it can handle an image: can you tell me about "Attention Is All You Need"? All right, and so it's outputting some results. We'll make our way over to Mistral; it can search the web, oh, that's cool. What's Canvas? I don't even know what that is; Mistral AI Canvas, let's take a look: a new interface that pops up in the chat window when you need to go beyond conversation into ideation. Oh, okay, so let's see those capabilities; maybe it's just side windows. So that's the web search; this is what we want to see here. Okay, so Canvas is the same thing as what Claude has, where they have pop-out windows. But here we can upload a PDF, I believe, so let's give that a go: can you get the text from this PDF? See if we can handle that. It's really interesting: it's saying it's garbled, and asking, would you like me to provide a cleaner version? Yep, do it. Okay, we'll go ahead and do that. Again, I'm not paying for Claude here, but I like that this was in here before, so it might be nice to try that out later on: a technical passage that explains the thing, but unfortunately much of the text is corrupted. So it's really interesting that it's not able to handle it. With Gemini, I definitely think if we had the Advanced version it would probably let us do it; I think they might be gating that feature. And then here we can attach a PDF as well, so let's go ahead and say: can you get the text from this PDF? We'll see what it does. And this one's reading it; it's not complaining, it's just doing it, so that's really good.
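Incidentally, when an assistant chokes on a PDF like this, extracting the raw text yourself first and pasting that in often works better. A minimal sketch with pypdf, using the hypothetical file from earlier:

```python
# pull the raw text out of a PDF locally (pip install pypdf);
# "page_one.pdf" is the hypothetical file from earlier
from pypdf import PdfReader

reader = PdfReader("page_one.pdf")
for page in reader.pages:
    print(page.extract_text())
```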
Okay, so this isn't a great example of comparisons, but I just wanted you to get an idea that these assistants are out here: we have ChatGPT, we have Gemini, we have Meta, we have Mistral, we have Claude, and we're going to keep revisiting them. I might actually have a better example that we'll walk through later, but this is just an introduction. Some are paid, some have paid tiers, some will not have paid tiers, and their purposes are different. Okay, so I'll see you in the next one, ciao. [Music] Hey, this is Andrew Brown, and I want to show you a bunch of places where you can run Jupyter notebooks in the cloud. Maybe in a separate video we might show you how to do it locally, but I just want to focus on cloud solutions so that you have many, many options for where you can run this. So one that we can absolutely use is Google Colab, which we will see time and time again; Google Colab gives you free CPUs and GPUs.
So I made my way over to Google Colab, logged into my Google account, and I can create a new notebook here. We're not going to do anything fancy; I just want to show you something very simple. Let's say we want to check what version of Python we're using: we could type in, was it, python --version here and run it. Okay, we'll give it a moment to execute; not sure why it's super slow here, but it will work in just a moment. And I need the double percent, sorry, double percent.
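For reference, the rule in Jupyter and Colab is that a `!` prefix shells out for a single command, while `%%` marks a cell-level magic. A minimal sketch of the version check:

```python
# in a notebook cell, "!" runs one shell command:
!python --version

# ...or you can run a whole cell as a shell script with the %%bash
# cell magic (it must be the first line of its own cell):
# %%bash
# python --version
```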
Give it a moment... there, okay. And so that is an option we have. Now, could we upgrade this and get more compute? There is a Colab Pro: it looks like for $13.99 you can get 100 compute units per month, and with Colab Pro+ you get 500 compute units per month. I don't personally know what compute is being used underneath; I simply could not find it, but it seems to work fine. Google Colab is something that you'll see time and time again, because there will just be examples everywhere. I'll give you an example: if we search for something like "Google Colab example notebooks", somewhere like Hugging Face, because people will often have examples, you'll see a button. This is hard to see, but if we go a little bit closer, you'll see we have the button here, which is Colab; this one looks like maybe SageMaker Studio, which we might want to show in just a moment. But I'll open this up, and so this is someone else's Colab, and I can run it: I can go ahead and just start running it. It says this is someone else's notebook; we'll say run anyway, and right now it's saying too many sessions, because I have this one open here. I don't know exactly how to manage sessions; normally I'm running one at a time, but there must be some way to stop it. Oh, here it is: manage sessions. We'll go here, and you can see we have a few from before, so I'll go ahead and just delete them. Yeah: terminate, terminate, terminate, so these are all being terminated. Now, if I go back over to this one, I can probably run it. I'm not that interested in actually utilizing this; I just want to show you that I'm using an external notebook. So there, it's running; I'm sure it will work. I'm going to go to manage sessions and terminate that. Okay, this one's working totally fine, so this is one place that we can use.
Now, it has its own little experience that looks a lot different; I personally don't like Colab compared to other tools out there, but it is so popular that you're going to run into it a lot, so you want to get familiar with it. The other one is SageMaker Studio Lab, so I'm going to type in SageMaker Studio Lab. Now, there's SageMaker Studio, which is something else, but SageMaker Studio Lab has free compute and GPUs; I'm not even sure if they have a paid tier, I think it's just free. This is something you have to sign up for and request. I have an account; let me grab it, just a moment here as I log in. So I'm just trying to log in, and one thing they make you do is a lot of these captchas... choose all the chairs, that was correct, choose all the beds... I think the reason why is that a lot of these tools get abused by crypto miners, so they put some extra friction on getting in. And I got my password wrong; it's because my password manager is not working right now, so I have to type it manually. Give me another second to try typing this in again... there you go, I had to reset my password, but now I am in. This experience is a little bit different: you have two options, CPUs and GPUs, and depending on what you choose is what you're going to get. You can see here that if you choose CPUs, you'll have up to four hours at a time with a limit of eight hours in a 24-hour period, and that time is halved to four hours in a 24-hour period if you're running GPUs. So this is another great thing that we can utilize, and I do believe they have some examples. If we go over to this tab here, I believe... nope, nope, nope, but I think that when we first launch
it up, there are some examples. Let's go ahead and just start it anyway; I'm just going to use CPUs today, and so it's going to bring up the runtime. It says something went wrong, which is totally fine, so I'm going to choose some hats here... and it says, please allow the script downloading from the domain, refresh your page for more access. Today we're just getting a lot of trouble, and it logged me out, wow. Okay, what a headache; let's try this again, and I think it just doesn't like me today: please allow the script from downloading from this... that's a new error. [Music] There, it contains a description: if you can't access your account, verify that you're using the correct email and password. Yeah, there's nothing that should be causing this; I do have a username as well, so I could try that, one second... nope. So I can't use it here today, but it's pretty straightforward: you have CPUs or GPUs and you launch it up. Just give me another second; let's see if I can fix this. So what it's saying here is that I need to accept davis.waf as the trusted domain; I'm not even sure how to do that, to allow the JavaScript as a trusted domain, and for more information see Firewall Manager. But this makes no sense, because I'm not logging into my AWS account, right? It's not related. So just give me a moment. And here again it's saying, if you're using a browser security plugin that prevents JavaScript downloads, you may see this error, so maybe there's something that's preventing it; let me go figure that out. All right, so I gave the information to Claude, and it said: go into your Chrome settings and add this here. It seems silly that I would even have to do this, but maybe this could be something that might fix it. It also says to check the padlock and site settings; maybe there are JavaScript options there, so maybe I'll check that first. Site settings... oh, we have it here as well, okay, so that is a direct way to it, and it says allow. So it's set to allow; there's no reason why it would not allow that. I'll go back over here; it says that it's allowed here as well, but I'll go ahead and just add this like this, because now it should fully allow it no matter what. Let's see if I sign in here: did that fix my problem? I'm seeing bags; I hate this bag captcha, but I get it: they don't want to deal with the folks that are abusing it. So now I'm in; we'll go ahead and try this again. There might be something funny with my browser; I'm even having trouble getting to Dashlane these days. My co-founder was here helping configure something; maybe he changed something on my computer. But anyway, let's see if this goes ahead and launches. It looks like it's working, so we'll just wait a little bit. And now that it's running, we'll go ahead and open the project. I kind of remember there being existing notebooks in here, so I think we'll have something to play
around with. If we don't have anything... yes, we do, and whoever wrote these did a really good job, because I really, really like them. Maybe if we launched the GPUs, I think the repo is different, and we actually would have had some large language models we could run. This one's pretty simple, so we'll go into the Gradio one as an example; we'll click into it and select the default kernel. It looks like there's one specifically for Mistral, which is cool. And so this one is setting up an interface to greet someone, so I'm going to just run all here; yeah, restart the kernel. This is your standard Jupyter notebook experience, nothing super fancy. Here it has one little problem: it says no module named gradio. Why would it be unknown? You import it right there. I'm not sure how well maintained these are, but I'm going to drag this all the way to the top. It could also just be the environment that I chose, so I'll go down below; this one's probably better, so we'll go ahead and use this one. It does say that we had to use this one, so that might be the case, and so we're restarting the kernel here. Notice you can see the dot here; it'll tell you if it restarted up here, and it will give you information down below about the restart. I think it restarted, I think it was really fast. We'll go ahead and hit restart-and-run, and so now it might work, no problem. And here we might have had to change something; it's still saying no gradio. "You can access Gradio with the following link on your own domain"... [Music] okay, but it's still saying there's no gradio. Let's drop this down and take a look again... oh, "gradio example", so they actually have a specific kernel for Gradio, and that might have been our issue here. I'll go ahead and just run this one; that doesn't work. I can go up to the top and say restart kernel; sometimes that might help.
It still says SageMaker Distribution; no, no, I want to be in this one. Select, and we'll restart the kernel. I don't think it's listening to me... it's not, but that's fine. Here's what I can do to get this working, and we're going to be using Gradio later on, so this is not a waste of time: I'm going to pip install gradio and see if I can just install Gradio that way. Now, I might want other things installed, because a kernel like that can have pre-installed stuff, and for whatever reason I'm having a hard time switching it today; I just don't feel like stressing out about it. There's that environment.yaml file, so maybe that would set things up, I don't know, but we'll go ahead and run this and see if it works. This is what you have to do on a regular basis when working on AI projects: fiddling with these kinds of things. So it's going to start the demo; to create a public link, set share to True. I just want to see this working somewhere, so let's go down below. Here it's suggesting that I might be able to access this; I didn't set it to True, did I? But anyway, let's grab this link up at the top. I'm going to grab this for a second, go over here, and just replace a bunch of the stuff. By the way, again, this is just free, so there's no harm in you running this. I'm going to run it and see if it works, and now I have Gradio. So I just type hello... there we go, hello, submit, and it outputs "Hello hello"; I think I was supposed to type my name, Andrew. And so that's a very simple Gradio example.
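For reference, the greeting demo in that notebook boils down to something like the following. This is a minimal sketch of a Gradio interface, not the notebook's exact code.

```python
# a minimal Gradio greeting demo (pip install gradio);
# this is a sketch, not the notebook's exact code
import gradio as gr

def greet(name):
    return f"Hello {name}!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
# share=True asks Gradio to create a temporary public link
demo.launch(share=True)
```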
Now, we were seeing before on Hugging Face that there was a button to click; I've never seen a SageMaker Studio Lab one like that before, so I wouldn't mind trying it. I don't want to log out, I just want to stop this environment, so I'm going to close this tab and hit stop. You could see with Google Colab you want to stop the sessions; with this one, you want to go and manually hit that stop button, because you don't want to end up using your CPU hours. If you come back later in the day, you won't have any, so make sure you stop them. It's not going to cost you anything, but there we go; you can see I have a lot more time left over. Let's go over here, and I'm going to just click one of these... I don't want anything complicated, let's do this one, this sounds very simple. And so this is now setting the options here; down below it's showing us what it has. Does it need CPUs or GPUs? I don't know; it's not saying. Since it's doing something like sequencing, I think this probably just requires CPU, so if I hit runtime, I'm going to assume it's going to launch this notebook. Again, I've never done it this way before, so I'm curious to see what happens. Oh, that's a chair there... oh dear, I missed a chair right there... there we go, now we have to do the hats. You can see Colab is a lot less annoying. Is that a hat? It doesn't even look like a hat. All the bags... it's almost not even worth it for the free compute here, but you do get GPUs, which is nice. So we'll just give that a moment to spin up. Here we go, it started, so now we click the copy-to-project button, and I assume it's just going to import the Git repo. Clone entire repo? Sure, we'll clone it. And it's a little bit unusual where it's going; I'm not sure where it's placing it, but if we go to the top level, maybe it's somewhere here... yeah, I'm not sure where it's going; all it's doing is running git clone on the repo. Okay, great, so now it's taken over, and the notebook is here, I guess, so maybe it was in here. It actually imported a bunch of stuff; I was hoping it was bringing in one specific notebook, but it brought in a lot, so I'm not really sure which library we were trying to run. But here's an example with scikit, and that's doing something, not necessarily what we want to do, but there are lots and lots of examples. Yeah, that is SageMaker Studio Lab, so that is another option you have available to you. Okay, so we'll take a look at another one. [Music] So where else can we run notebooks? Well, if we're in AWS, we can just log into that, and I can show you how to run it directly in your AWS account.
Actually, AWS does come with a free tier for compute, for ml.t3 instances or whatever, so I'm going to go ahead and just log in; give me one moment. All right, now I'm signed into AWS, and I'm going to make my way over to SageMaker. Just double-checking my cost: nothing too crazy here. On September 24th, yeah, that was a high day, months ago, but that's okay. So I'm going to make my way over to SageMaker... okay, Amazon SageMaker AI. Oh, did they rename it? Already? It's been renamed; oh well, let's go over there, that's fine, whatever, AWS, you can put AI in front of anything if you want, I don't care. And so the way this works is you normally have to set up a domain. I already have one set up here, so I guess I could delete it, but the idea is that you pretty much go over... I'm just going to go to another region; it's so hard to show this again and again, but I'm going to go to, let's say, Oregon. What you would do is go to Getting Started, and they have "set up SageMaker domain"; this is the one you would choose, and that will set up a bunch of stuff for you. It takes a little bit of time, but I already have one set up, and I'm not going to set it up 100 times over. So we'll go back to the main SageMaker page, and... oh, do I not have one set up here? Oh, I do. Once it's set up, if you go to Studio, you can open up the studio. Now, there are notebook instances here, but I thought they were getting rid of these; the experience looks a little bit new, so I'm not sure. It says "try the new JupyterLab in SageMaker Studio", so they're not suggesting you use these anymore; they don't really want you to use this, they want you to open up SageMaker Studio. Now, I want to point out: do not launch SageMaker Canvas. It has a similar experience, but it's expensive. Do not use SageMaker Canvas; it is awful, awful, awful, at least at this time, in terms of showing you your spend. The service isn't bad; it's just a spend trap. But now we're in SageMaker Studio. In the top-left corner we're going to go to JupyterLab, and we can create a new JupyterLab space. I'm going to call it "example", as we can name it whatever we want, and we're going to create that space. I'm going to get that popup out of the way, and we have some
options here. The ml.t3.medium has a generous free tier; I would double-check the pricing there, because you could be outside of the free tier if you're outside the first year or something like that, so just look up SageMaker Studio pricing for JupyterLab. We go over here, we see RunPod trying to buy ad space above them, and we'll scroll on down. So it's the ml.t3... maybe search "SageMaker free tier AWS". And so, SageMaker notebooks: 250 hours of ml.t3.medium free tier per month for the first two months. For the first two months, so if you want to utilize this compute, make a new account and you'll get that reset, or you might be outside the free tier. But I remember this not being very expensive; it was in the pennies. If you forgot about it, though, it might cost you 60 bucks or 35 bucks for the month, so just be careful, and you can look up the exact cost of that ml.t3.medium pricing. Here they're suggesting it's $0.067 per hour, so maybe you want to be careful with that; here it's saying a dollar, so again it really depends, and maybe I was a bit off with my cost. Come on, show me the instances here, and JupyterLab... oh no, 5 cents, okay, so it was what I thought: it's a nickel per hour. But that's going to add up over time. If you ran it over, let's say, 730 hours, all day every day... sorry, let's go back here: 0.05 times 730 is about $36, see, I was pretty close.
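Just to make the back-of-the-envelope math concrete, here's the arithmetic; the $0.05/hour figure is the rate shown in the console at recording time, so treat it as an example, not a current price.

```python
# back-of-the-envelope cost of leaving a notebook instance running 24/7;
# the hourly rate is the one shown in the console at recording time
hourly_rate_usd = 0.05   # ml.t3.medium rate seen above (verify current pricing)
hours_per_month = 730    # ~24 hours x ~30.4 days

print(hourly_rate_usd * hours_per_month)  # -> 36.5 USD per month
```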
Now, when you set up SageMaker domains, you can choose to have it shut down over time, so you don't have to worry about coming back and shutting it down. Let's go back over here for a second, to aws.amazon.com, so I can show that. If we go on the left-hand side to SageMaker AI... almost all the providers renamed their stuff, it's ridiculous. So we'll go over to Studio, or maybe the domains, the domains here, because if we click into here (I didn't show you the setup), there's some configuration, app configuration, and in here you can tell it to shut down. If I go here, customize SageMaker UI... it's not there; go down here... ah, here it is. If we edit this, we can just enable idle shutdown. The idea is that if it's running and you forget about it, it will shut down for you after a period of time, so this is something you absolutely want to have turned on. I don't even have it turned on, so I'll just checkbox it on, and it has an idle time of whatever that is; this is in minutes, so I would say after 60 minutes, the lowest time, I would tell it to shut down. Yeah, I would do that, and that would be the smart thing to do, because that way at least you'll be in better shape. But anyway, let's go back over here; we're ready to launch the space. You can see we can choose the storage and other things like that, and this is where, if you're downloading a larger model, you'd want more space. But I'm just launching this as an ml.t3.medium. I didn't drop this down, but we have other options: a bunch of different compute, either with CPUs or with GPUs attached. So we're just going to wait for this space to start up; this one is one of the fastest-starting instances, so we're just going to stick with it for now. Now, you could also launch an EC2 instance with compute and install Jupyter notebooks yourself, and that might be something I'll show you, just so you know how to install Jupyter Notebook or JupyterLab servers from scratch, which is something you absolutely should know how to do, as it becomes super handy in many use cases.
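Since installing a Jupyter server from scratch keeps coming up, here is a rough sketch of what that bootstrap might look like on a fresh Linux box; the commands are the real content, and the Python wrapper is just so it reads as one runnable script. Exact flags and security setup (tokens, TLS) are left out.

```python
# rough sketch of bootstrapping JupyterLab on a fresh Linux machine;
# these are ordinary shell commands, wrapped in Python only so this
# reads as one runnable script (you'd normally type them in a terminal)
import subprocess

commands = [
    "pip install jupyterlab",
    # bind to all interfaces so you can reach it remotely; add auth/TLS
    # before exposing a real server like this
    "jupyter lab --ip 0.0.0.0 --port 8888 --no-browser",
]
for cmd in commands:
    subprocess.run(cmd, shell=True, check=True)
```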
But again, we're just waiting for this to spin up, so give it a moment... there we go, now I can open this up in JupyterLab. And so this opens a JupyterLab environment, and I'll just create a new notebook here and do our standard thing. This is actually probably pre-installed: the kernel for SageMaker comes with a lot of stuff pre-installed, and this is where you could run into issues, because if, let's say, you need a very specific version of, or very specific compile flags for, something like PyTorch or TensorFlow, you could run into problems. But it says the requirement is already satisfied, because it's already installed in this environment. We're not going to do anything fancy; I just wanted to show you how to get to this. I'll go ahead and hit log out here; I don't think that stops everything. I usually don't do that; I usually go here and just hit the stop button. Okay, so it brought us over here, which is kind of funny; I'm going to give this a hard refresh. I don't think it stopped the space, but let's take a look and see what happens. So if you want to stop it, you've got to stop it over here, and you can have it hang around, but there might be long-term storage cost; under 30 GB I think you're fine, because at 30 GB and above you're out of the free tier. Let me delete that. And so that is how we would do it with SageMaker Studio, and that's something you can absolutely utilize. [Music] Okay, so we looked at SageMaker Studio, but let's go take a look at Azure ML Studio. I'm going to make my way over to Azure; I think it's just portal.azure.com.
I'm not doing this every day; I don't really build pipeline workloads in here, but I've obviously taught a lot of this, so I'm pretty comfortable with it. And there is a way to launch things in Azure AI Studio; I think it's now called Azure AI Foundry. Did they rename it? It doesn't look renamed to me; I thought they renamed it... nope, it's still called Azure AI Studio, okay. Maybe that's an external service; I'm kind of curious. Azure AI Foundry, where's the rename... here we go, oh, maybe it's at ai.azure.com. If we go here, is that where it is? Microsoft likes to have multiple entry points into their products. I don't have a project called that; we'll select another project here, but I think this is just another way to access Azure AI Studio. I have a Japanese example here from a while ago; that's a hub. This is not the way I would like to do it, but if we click into here, this is just an existing project. I'm not suggesting you make a project in here, I just want to show you... yeah, we'd have to make a new project, but there's compute here on the left-hand side, so you could technically launch a compute instance here. I don't recommend doing it here, because the experience here is kind of broken. When you're using Azure, I would recommend going specifically into the Azure Machine Learning service, and here you can see I have an example. I'm going to go ahead and delete that, as I'm not using it right now; I'm looking for the resource group to delete. How do I keep up with all these clouds? I don't know; you're probably asking me that. But I'm going to go ahead and just delete this. Now, for SageMaker Studio I didn't show you how to set it up completely, but for this I will; it's a little more straightforward to create and delete. So I'm going to go ahead and make a new workspace. I don't know what this costs; to me it's like pennies. If you're worried about it, don't do it; if you know Azure pretty well and you're comfortable with it, go ahead. We'll just name it "example". The region does matter, because different compute is available in different areas; I don't know what to choose, so I'm going to use Central US. I don't want anything fancy here today. It's creating a bunch of these other things here; I don't think I need a container registry, so I'm going to ignore that, and we'll go next. I'm going to leave it as public, and just go ahead and create this, so we'll have to wait a little bit for it to create; it shouldn't take too long. I'm not sure if there's one other step after this, so I'm just going to wait a little bit and see if I have another thing to confirm... no, there is not. I'll be back in just a moment. All right, so that's finished deploying. I'm going to go to my resources and launch
Studio, and hopefully it looks the same as the last time I was here... quite similar. On the left-hand side, what I'm looking for is Environments, and I'm just going to ignore this. I could have sworn it was from here; if it's not, there's another place in here to launch it, but I was pretty certain... oh, Notebooks, here on the left-hand side. Yeah, create compute; I was looking for compute... oh, it's down here, Compute is down here. Okay, so we can create our compute over here, or we can just go through Notebooks and create it there. Notice we have CPUs and GPUs, just like AWS; you have options you can choose. I like how the prices are right here on the right-hand side. It's not the five cents an hour we saw on AWS, but there are other options we can utilize, so it really depends on what you want; they're just recommending ones here. I can sort this by cost: not as cheap as AWS, but still pretty good. I'll go ahead to next, and we have idle shutdown, so that's really good; we have lots of good options here. I'll go ahead and review and create, and you can see the experience is a little bit better than AWS, not that anybody ever talks about it. And so now we're waiting for that compute; we can go down here and probably see it spinning up, so now we're just waiting for it to create. All right, our compute is ready, let's take a look. We have a few options: JupyterLab, Jupyter, VS Code. Jupyter is probably the old Jupyter Notebook, so I'm not sure why you'd ever want to use that, but we can go ahead and open this one, and then there's VS Code for web. I think we could run these in parallel, I'm not really sure, but it's just going to give us an environment we can utilize, so nothing crazy here. And there's no surprise that Azure can do this, because Microsoft owns GitHub, so all this technology is probably shared between them. You'll notice we have a few different options depending on what
you want to do: we have Azure ML, the SDK, version 3. I'm not doing anything fancy, so I'll just do something simple for the notebook: I'm going to do a pip install of transformers, again and again and again until we get tired of it. So we'll go ahead and run this. Okay, we're now installing Transformers, which is great.
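As a quick smoke test once transformers is installed, you could run a default pipeline. This is a generic sketch, not something from the course notebook; the model it downloads is whatever the library's default happens to be.

```python
# quick smoke test that the transformers install works;
# pipeline() with no model argument downloads a small default model
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Notebook environments make experimenting with models easy."))
# -> something like [{'label': 'POSITIVE', 'score': 0.99...}]
```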
I'm going to go over here, and I'm not sure if it mirrors the files; I would expect that it would. I can save this file for a second; I'm just going to name it Untitled, I don't care. And this one is being a little bit unresponsive; it is in preview, so I'm not really surprised it's not working, but I don't really care. We know that we have this as an option, and so this is another way you can run compute. Obviously, the reason you use SageMaker or Azure ML is that they have kernels specific to their environments, to make it easier to work with Azure ML Studio when you're building pipelines, or SageMaker Studio. You can generally use them, but sometimes I've run into issues using these things because of the way they're set up and the stuff that's pre-installed on them. In order to delete this, I think we have to completely shut it down first. It is shutting down, so now it will delete, and I don't have to worry about that. So that is how you get a notebook running on Azure ML Studio, and it's up to you if you want to tear down the Azure workspace; I showed you that earlier in the video. I suppose I can delete it now, it's not that big of a deal: you find your resource group and you delete your resource group; that's how you get rid of this. Okay, ciao. [Music] So let's go take a look at another one, called Lightning AI. I really like Lightning AI, and they do have a free tier. Lightning AI is basically a cloud developer environment specialized specifically for working with models. I already
have an account here; I don't feel like trying to figure out how to log in, but I'm already logged in. What's nice is that it actually has a bunch of different kinds of templates that you can utilize; they have, I think, Studio templates up here. Lightning AI is totally free to sign up for, and they have paid tiers. But let's say we wanted to use something like Llama 3.2, say Llama 3.2 as an example: here's one serving it, so I'm going to click into this one; I like this one here. They have a video on how to do it, but if I just want to start using it, I just click "open in Studio". Again, it's very easy to make an account. So this is going to open up here in Lightning AI; I actually have a little plushie from Lightning AI, that's how cool they are. What's interesting about this one is that it's not just notebooks: it has the VS Code experience, it has notebooks, and you can just open up the terminal directly. The environment is controlled over here, so if I click into here, this toggles between CPUs and GPUs; you can usually switch between CPUs and GPUs, so it will shut down but bring the environment over. It's a lot more flexible; you can get a lot of information about your environment. It is super, super powerful; I would generally just work in this because it's so good. The startup time is a little bit slow, but it's not that bad. Lightning AI have created a bunch of open-source projects, LitGPT and things like that; I'm sure we're going to come back and revisit them because they're so awesome. But right now it's spinning up the GPUs, and I can't remember how much you get for free, but even within the GPUs there are options you can choose. Right now this is running on one NVIDIA L4 Tensor Core graphics card; that's not a cheap one, but I could have sworn we had a few other options there. If you hover over this one, it will also tell us... well, we're not on it right now, but I could have sworn we'd have that option; I think when we spin one up initially, we actually get to choose what kind of compute we want. But this GPU is ready to go. I'm going to click into this example, and this is the kind of experience you'd get in VS Code, so it's not
exactly Jupyter notebooks, but it has that Jupyter-notebook-like experience. If we wanted Jupyter notebooks, we just click Jupyter and we have it, so whichever one you want, the VS Code experience with notebooks or the JupyterLab experience, you can quickly switch between them; well, I say quickly, but quick enough. So I can go here on the left-hand side and click into here, and we can just run through this. Here it says to start the inference server in the terminal by running the following command, so I can just open up a terminal, if I can find it: File, New Terminal, and I'll just grab this. Again, I'm not a super expert at this stuff; I'm just copying, pasting, and working through it. So here it's starting, binding on port 0.0.0.0:8000. Here it is: it's going to serve Llama 3.2 1B, so it's downloading that model; I'm going to assume this is downloading from Hugging Face, and LitGPT is going to serve that model in streaming mode with a max of 200 tokens. And so that model is now being served; it's downloaded and serving. I think it was Llama 3.2 1B, so that's very small. And so here we have our text; we'll go ahead and run that, and then we'll go down below and run it like that. Okay, and so it's doing our summarization.
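For context, once a server like this is up, the notebook cells are basically just POSTing to it. A hedged sketch of such a client is below: LitGPT's server is built on LitServe, which by default exposes a POST /predict endpoint, but double-check the template's own docs, since the endpoint path and payload shape here are assumptions.

```python
# hypothetical client for the local inference server started above;
# LitServe-style servers typically expose POST /predict, but verify the
# endpoint and payload shape against the template's own docs
import requests

resp = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"prompt": "Summarize: Jupyter notebooks can run on many cloud platforms."},
    timeout=60,
)
print(resp.json())
```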
Now we can go over to VS Code, if you prefer that experience, and I think we don't have to run this twice because it might already be running, so I'm going to just give this a go and see if it works; it might not, we'll see what happens. That terminal is still running; I think if we go over here we can probably see our open connections. Yeah, see how we have this? This is it over here; if we click here, I think it brings us back to that terminal we were running earlier. So it's kind of cool how you can go between them; this thing is super, super powerful. I'm done with this environment, so I'm going to try to figure out how to get to it; I'm going to go back over to, maybe, "Andrew Brown" here, and we can see that it's serving. I just want to stop it, so I'm going to go ahead and delete this; we'll say delete, as I don't want to use up my GPU hours. I can only imagine at some point they might add some convoluted process for you to be able to access the GPUs, as crypto mining is a serious issue for these folks, for all of them. But if we go here, let's say I just want to do code; I'm going to go ahead and do that. We actually have some options for CPUs, and we have the language model template. I'm going to click this, because I remember being able to choose between different GPUs; maybe that's only in the paid tier. But yeah, see, now it looks a little bit different: it says 32-core CPUs. Let's see if we switch... oh, here we go, so I choose GPUs and now I have some options: we have a T4, an A10G, an L4, these are all NVIDIA cards, an A100, an H100; how many do you need? So I really like this kind of experience, because it really makes you understand what kind of GPUs you're using underneath. For CPUs, I don't really know if there are that many options; we have some, we have 32, 64, 96 cores. It's not specifying what CPUs it's using here; I'm kind of curious what it would be using, I would like to know, maybe learn more. But anyway, you can see it's spinning up that environment, so definitely we're going to spend more time with LitGPT, but we'll spend time in other ones as well. Another one we can utilize is Gitpod, but we'll talk about that next, so we'll just stop this here so we're not using it up; I'll just delete it, and we'll look at another one. [Music] Okay, let's go take a look at Gitpod. Now, Gitpod has just recently changed their model a little bit, and so what I would normally do is just launch a workspace, but I want to launch it with GPUs.
Okay, so down below here we have standard, but there's a way to do it with GPUs, which is new. So I'm going to search for "Gitpod GPUs". I was just at re:Invent, and Lou was showing this to me, so I absolutely know they have GPU support, and it's configured in their dev container. Maybe we go over to Gitpod here, just the main site; it's going to keep logging me in, and somewhere in here there's a way to get the new environment. So, I want GPUs; I know we can do GPUs here. We'll go over to maybe the resources, view all, and now we're in the docs, and somewhere here we can configure GPUs. I know we can do it, I've seen it in person, but I didn't know they had GPUs until Lou showed me, and I was like, why aren't we using Gitpod? I think it's maybe just not well known, maybe because it's so new. I saw him configure it in a devcontainer JSON file, so just give me a second, I'm going to go find it. Okay, I remember there's a thing called Gitpod Desktop, so maybe that's something I might want to download... Gitpod Desktop, no, that would run it locally, that's not what I want. Let me just look at the docs a bit more. Okay, I just absolutely cannot find it; there's no way I can find it. But that doesn't mean we can't still use Gitpod. Gitpod has a notebook experience because it has VS Code, so maybe we'll just take a look at anything we want. I'm trying to think... I don't need anything in particular, I just need an example repo, so I'm just going to type "example"; it really doesn't matter what we do. I'll even go to the AWS examples one here. I have Gitpod installed with the Gitpod Chrome extension, but you could also just put it in front of the URL: let me put gitpod.io in front and it will still work. So I'll do this just in case you don't have the extension installed... is it this, can I do that... maybe I take this part out, like that; I forgot how to do it, it's been a while... forward slash like this, github.com... nope, it's just supposed to be gitpod.io in front of here. But if I go to this URL... well, anyway, I'll just click it; it looks like the URL is a little bit different now. Anyway, if you don't have that Chrome extension, the Gitpod Chrome extension... even at the time of this video, I
don't know if they're going away from this kind of system now that they're letting you run it locally, but anyway, I just have this open, and I wanted to show you how fast Gitpod launches; it's super fast. I can go ahead and make a new file, and I specifically want to make a... I'm not going to save this, I'm just going to make a new file here; allow, yeah, allow. I'll name it example.ipynb, I think that's what it is... maybe I named the extension wrong; it's .ipynb, for IPython notebook. There we go. And so now we can add stuff here, so I can ask what our Python version is: python --version. And this is going to be really good if, let's say... I want some installations here; maybe it doesn't have the Python extension or the Jupyter extension, so we're installing the Jupyter extension so it can do Jupyter-like things. We'll just choose our Python environment, we'll use the default one, and run it. And it says it's a magic cell but the cell body is empty... I just want to run python, what's wrong with that? Maybe I'll do a pip install; I'll just try to install something here: pip install transformers. Can I do that? There we go.
there is a way to configure gpus we're just going to have to wait till we talk to Lou he's in the boot camp so he's going to definitely show us how to do it but what I've been told is that g pod now can run anywhere so like you could choose any type of compute that you want at ads or other platform forms um but it's probably in the pay tier and there's probably other caveat and I I don't have that information but just as a cloud developer environment experience I really like this when I
don't care about gpus um and if we do have gpus we'll have to learn a little bit more about that later on I'm going to go ahead and stop that workspace and so that is g pod as an example for somewhere where we can work with [Music] notebooks all right so if we took a look at G pod well we should probably take a look at GitHub code spaces so I'm back backing my ad examples repository again it doesn't matter where we are but there's a Code tab here and we go to code spaces and
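Since we'll repeat this first-cell routine in every environment we try, here's a minimal sketch of the smoke-test cell, assuming a kernel that can reach PyPI — the `!` prefix is how a notebook cell runs a shell command (a bare `python --version` gets misread as a magic cell), and `%pip` installs into the active kernel's environment:

```python
# Smoke-test cells for a fresh cloud notebook.
!python --version

# %pip installs into the environment of the currently selected kernel.
%pip install -q transformers

import transformers
print(transformers.__version__)
```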
All right — we took a look at Gitpod, so we should probably take a look at GitHub Codespaces. I'm back in my AI examples repository (it doesn't matter where we are); there's a Code tab, we go to Codespaces, and I'll launch one — we get some free compute here. Does Codespaces have GPUs? I don't know; I'd imagine it's on their roadmap. Searching "codespaces GPUs" brings up Lightning AI right away, and we might take a look at RunPod too — it's been on my roadmap, though I've never used it.

So this is launching up. I hate how it never remembers my theme, so I have to change it, and even that takes time to load. While we're here: if you click the plus you get more options about the type of machine you can launch (I accidentally created two — I'll delete the extra one). What I was looking for were options to change the compute. There's "configure dev container", which is just JSON, but back in the UI we can pick from 2-core up to 16-core machines, and you can see we get quite a bit of memory. You don't really see those options over at Gitpod — you just don't think about your compute level there — but here there is a way to change the compute underneath. If you're working with LLMs locally, even optimized ones on CPUs, you're going to need a lot of RAM; even the 2-core machine has a fair amount, but 32 or 64 GB is probably where you want to float, and obviously with more cores. Again, if you're working with serverless APIs, it doesn't matter what you use.

One thing we want to see is the notebook experience. I'll go to the bottom-left corner — I meant to go to the theme settings — and switch to a darker theme; light themes are fine, but that one in particular was horrendous. Then I'll make a new file: example.ipynb (I keep saying "ynb"; it's .ipynb, the Jupyter notebook extension). We get a similar experience here: I'll type pip install transformers, hit enter, and it wants to install the Python and Jupyter extensions so it can work with the notebook, giving you a Jupyter-ish experience. One small difference: some notebook environments show you cell execution time by default, and here you need an extra command for that — we'll see it later on. I'm choosing the default Python environment, the kernel that comes with it, and pip install transformers works fine. Could you attach GPUs to GitHub Codespaces? I don't know. Gitpod can, but I don't know how to do it yet.

So that's working fine. I'm done with this workspace, so I'll make sure it's stopped — I don't want it to keep running — then go back to github.com, into my AI examples repo, and delete it. Actually, I notice some other options here: "Open in JupyterLab" — that's definitely new, and it says preview, so I'm not sure how long it's sticking around, but I can't imagine the experience is that different. It's unable to start the codespace, maybe because I deleted it, so let's try the Jupyter Notebook template — I didn't even know that was an option. "JupyterLab is the latest web-based interface" — okay, so these are templates for working with notebooks, but that isn't actually the notebook experience I was hoping for. Going back to the repo and clicking "Open in JupyterLab" on a running codespace — this looks a lot like Gitpod right now, but it's not — opens the actual Jupyter server experience, which is what I was looking for. Give it a moment... there it is, with all my stuff; we can create a new notebook, and now we have our Jupyter notebook experience. I'll do pip install transformers, run it, and there we go. That's nice: if we don't want to use VS Code, we can use that. I'll delete that, and we're in good shape. Okay, so that is GitHub Codespaces.
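Since the machine-type picker ultimately feeds into the dev container config, here's a sketch of a devcontainer.json that declares the resources a codespace asks for — the image and the exact values are my own illustration, not what GitHub generates, and GPU availability depends on your plan:

```jsonc
// .devcontainer/devcontainer.json — a sketch; hostRequirements is how a
// codespace declares the minimum machine it needs (values are examples).
{
  "image": "mcr.microsoft.com/devcontainers/python:3.10",
  "hostRequirements": {
    "cpus": 4,
    "memory": "16gb",
    "storage": "64gb"
  }
}
```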
All right, let's take a look at Deepnote, which is another place where you can have a notebook. I've actually never used this service before, so it should be exciting to give it a go. The reason I'm looking at it is that it has a free tier — on the free tier we get CPUs, and there are GPUs listed further down, so if you wanted to attach GPUs it looks like you could.

Let's see how this goes. I'll connect with my Google account (we'd probably have to connect GitHub to do anything productive, but we'll see), hit continue, pick "learn or teach data science" — technically I'm educating — answer "which best describes you" with teacher, which I think is what they're really asking, and get through the onboarding without inviting a bunch of people. And now we're in our workspace. Oh, cool — data is front and center, and data is a big thing when it comes to working with GenAI models, or ML in general; I'm just not very good at data, despite making data courses. It looks like there are rich visualizations and things like that, but what I want to see is a standard notebook. Let's create a new notebook and... oh, these aren't the normal kind of notebooks. I thought this was going to be like Jupyter notebooks — this is really, really different.

Going through their demo: you "start with a query," so it's still notebook-like, but the experience is different. There's a "generate with Deepnote AI" option, so sure — "write me code that will use BERT for classification." We get some code; it looks a little different, but they just seem to have richer, chunkier blocks. Before running it — the code is fine — I tried adding another code block, and I'm honestly not sure how to add blocks in between; they don't make it very clear, so maybe we keep it as a single block. Definitely a different experience, but if it works, it works. They suggest pip install transformers, so I'll do that, and bring in torch as well — probably all we need. Running it took a moment to figure out: the block I was in was set to something else, and switching it to a code block fixed it (I'm not sure if you can drag blocks around).

We're not thinking too hard about what the generated code does: it brings in bert-base-uncased, creates a custom dataset, prepares the data, and builds a training loop. I do understand it, but I just want to see it work. It trains the model — we're not actually using the trained model afterward — and it completes with no problem. The only question left is how I'd use the model now; apparently you can create an app from a notebook. So I guess that's Deepnote — I was really expecting something more like JupyterLab, but it's something, so we'll add it to our list of things we can utilize. Looking back at the main site, they have a lot of data-engineering integrations, which is interesting, though I was more interested in the machine-learning parts — they can apparently serve a model from a notebook. Compute is over on the side, where we can switch over to GPUs. It really is more data-focused; maybe we'll revisit it if need be. But there you go — that's Deepnote.
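For contrast with the training loop Deepnote generated, here's about the smallest BERT-family classification smoke test I know of — inference only, using a stock fine-tuned checkpoint rather than training bert-base-uncased ourselves:

```python
# Minimal BERT-style text classification via the transformers pipeline API.
# This skips training entirely and loads a ready-made sentiment model.
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(clf("Andrew is great at teaching GenAI."))
# -> [{'label': 'POSITIVE', 'score': ...}]
```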
Hey, it's Andrew Brown. In this video we're going to look at setting up conda and JupyterLab locally. I'm remoted into a machine on my network — the Intel developer kit with a Lunar Lake chip — so I can do good things locally. This is an important video: if you have a decent GPU or CPU, there's a lot you can do locally. And even if you don't plan to use GPUs or CPUs for inference — you just want local development against serverless APIs — you'd still want this setup.

I've opened VS Code with a terminal. There's an existing project here, but we'll ignore that for now. What I want to do is install Jupyter, and I already have instructions for this, so I'll bring them over into this project. I'll go over to GitHub — the repo is called exampro-genai-essentials — and hit period on my keyboard to open the web editor. I have a lot of projects, so while that loads, I'll grab the existing code from my GenAI Training Day workshops repo and rework it a little for our use case. I'll make a new folder for local development — local-dev — with a subfolder, conda-jupyter, and a new file, readme.md, as the setup instructions.

The first thing is to install Miniconda. If you've never heard of conda or Miniconda: Anaconda — I think of it as a package manager — basically gives you a package manager plus trusted libraries commonly used by data scientists and people in the data field. It's a very common way to start working with these libraries without dealing with so many kinds of conflicts. We're going to install Miniconda specifically, and these instructions are for Ubuntu; they make a folder in the home directory, which isn't strictly necessary but won't hurt anything, and you can run this on a Mac as well. I'm going to change the instructions slightly so everything lands in your own home directory, whatever your username is — mine's andrew — and copy each line over (you'll have access to all of this in the genai-essentials repo). So: the first line (I grabbed the wrong one at first — there we go, that's the line we want), then the next line, which downloads the latest version of Miniconda, and the last one, which executes it. What's happening is that it downloads a bash file, renames it to a temp miniconda file, and then executes it, which installs Miniconda. So it's installed now — that's step one, installing Miniconda.
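Collected in one place, the install boils down to roughly these commands (the temp filename is my own choice here, and the URL assumes Linux x86_64 — adjust for your platform):

```sh
# Download and run the latest Miniconda installer for Linux x86_64.
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
  -O /tmp/miniconda.sh
bash /tmp/miniconda.sh -b -u -p ~/miniconda3   # -b: batch mode, no prompts
```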
The next step is to set it up; these are the next two lines we want to run, so I'll add a "Setup Miniconda" heading — I'll make it an H2, not that it matters. Be a little careful copy-pasting on Windows: it sometimes refuses to copy what you selected. I've run the activate line, and notice we now have (base) in the prompt — Miniconda is working. This is really important: it's how you know conda is there; if you don't see (base), conda isn't loaded. The next line is conda init --all, which sets up some defaults by modifying files. If we open our .bashrc or .zshrc — I'm not sure which one this machine uses, so I'll just cat it out — you can see it's added stuff, so that the next time we open a terminal, conda gets loaded automatically. If you don't run that line, you'll always have to activate conda manually, which is kind of annoying. So now we have conda installed.
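Those two setup lines look roughly like this (the path assumes the home-directory install above):

```sh
# Activate conda in the current shell, then wire it into future shells
# by letting it modify your shell rc files (.bashrc, .zshrc, ...).
source ~/miniconda3/bin/activate
conda init --all
```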
The next step is to get our environment ready. Often the workflow is: create an environment, then do something inside it. We're not going to do OpenVINO here today; I'm just going to make a very simple environment — "create a new environment" in the notes — called hello, with Python 3.10. I often use an older version of Python, because with newer versions, especially in data-science projects, you'll hit serious conflicts with libraries; you just find which version is more reliable for you, and latest is not always best. There's definitely a newer Python out there, but I'm making the deliberate choice to work with 3.10 — if it's the future as you watch this, figure out which versions work for you, but right now 3.10 is working great for me and I'm sticking with it.

So let's create ourselves a new Python environment. Normally it doesn't create it in place, and I'm not exactly sure where conda puts these, so let's take a look: ls -la shows the environment isn't in the project folder. I tried conda env to find out, then maybe conda show — oh, it's conda info. I'm checking whether it tells us where this environment currently is; we haven't activated anything yet, so there's no active environment, but the command we ran was: create a new Python environment called hello with Python 3.10. To use it — to activate it — we run conda activate hello... and it says it could not find an environment by that name. I could have sworn we created it. Ah — this one's called openvino: when I copy-pasted, it created one with the wrong name, which is no big deal. Trying again, it's created — it was just the copy-paste that messed things up. Now conda activate hello works, and you can see the prompt says (hello).

It's not easy to list environments — I've never memorized that — but conda info now tells us which environment we're using and where it's located, so I have a feeling that if we go to ~/miniconda3/envs and do ls -la, we'll find them all. We do: hello and openvino. If we're done with one, we could delete the directory manually, but the proper way is conda's remove command. In the notes: "get info about the current environment" is conda info, which shows where the Python environment exists. Removing environments is something you'll actually do quite a bit, because you'll install things, make a mess, and basically have to delete and start over. Before you can remove an environment you have to deactivate it — you can't delete one while it's active. I tried conda deactivate hello... it's just conda deactivate; you can't specify which one, I'm not sure why they made that choice, it's just my habit to type the name. Then it's conda remove, which is a little funny because you have to pass flags: conda remove --name hello --all, and add -y so you don't have to confirm every time — though obviously you should read what you're deleting. I'll do the same for openvino. So that's how we remove an environment — something you're going to be doing commonly.
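Here's the whole environment lifecycle from this video in one block, for reference:

```sh
conda create -n hello python=3.10.0 -y   # create the environment
conda activate hello                     # use it (prompt shows "hello")
conda info                               # shows the active env and its path
conda deactivate                         # required before removal
conda remove -n hello --all -y           # delete it entirely
```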
Now there's more to this. That was basically step one, which I'll just call "setup conda," and the next part is actually installing things — specifically JupyterLab — but we'll cover library installs in the next one. One thing I wondered: do we have pip now? I didn't have it before, which was kind of frustrating, but if we type pip — yes, we have pip. Okay, great. So I'll stop the video here and continue in a separate one; I'm saving this as-is, since it's pretty good. I almost feel like I should have split this into its own file, so I'll rename it to conda and rename the other to readme — that way the next one can be "jupyter server." I'll see you in the next one — ciao!

All right, so in the last video we installed conda, and we have it here as (base). Normally the next thing I'd show you is JupyterLab, but what I actually want to show you first is how to use conda specifically in VS Code, because that's where you're really going to want to
use it — also in JupyterLab, of course. We just need some kind of random project to work on, so I'll mkdir hello, cd hello, and open it in VS Code with code . (if you don't have that command installed, it will install itself and then let you open things this way). Close out the old window so you don't get mixed up — that can get confusing otherwise. On the right-hand side I'll make a new file called app.py. If I open the terminal, notice that conda is loaded — we see the word (base). If you don't see that, something's wrong with your installation and you have to fix it. The other reason you might not see it is if you've installed some fancy prompt theme that colors or reformats the prompt — conda could still be there, just hidden — so consider those as possible issues.

Every time you work on a project, you should absolutely create a separate environment for it, and don't work in base. I try not to install anything into base — there is a way to clear out base if you do, though I don't do it often and won't in this video — but it's going to happen that you install things there by accident. Try to keep base as clean as you can.

So let's create a new environment. I honestly don't have this memorized and often have to look it up, so give me a moment to find the GitHub repo we just had open... here it is. I'll keep it off screen and look smart by pretending I remember: conda create --name hello python=3.10.0 (I wonder if plain -n works too — that'd be nicer). Hit enter to create the environment, then conda activate hello — this is something you're going to do again and again and again, so just get used to it.

Now we want to run something, and for that we need some library installed. One way is conda-forge: conda-forge is basically a collection of trusted libraries that are configured for use. We can explore it — the packages page lists all the possible packages we can install. You can install things from outside conda-forge, but there are too many packages to just browse; if we wanted PyTorch we'd type pytorch, or scikit-learn, or pandas — whatever we want, it's there. So let's say we want to work with pandas. We could do a pip install, and sometimes you do, but you really want to default to installing from conda-forge.

Let me document this as we go. I had this file called conda — I'll call this one setup... actually "hello example" (I'm so bad, I just keep renaming these on you folks). What did we do so far? conda create -n hello python=3.10.0 -y was our first command, and then conda activate hello. The next thing is the install: conda install -c conda-forge <package>. What we're saying is we want to install something, but specifically from conda-forge — you could specify other channels here, I just don't know any besides conda-forge, so I always type that. Copying between remote-desktop sessions doesn't always work properly, so I'll type it fully: conda install -c conda-forge pandas, with a -y. We could also create a requirements.txt and pip-install from it, and that's totally fine as well, but sometimes you want to install libraries this way so you absolutely know they're coming from conda-forge. And with that installing cleanly, we now have pandas installed.
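The two install routes side by side, as they'd appear in the readme:

```sh
# Install into the active project environment from conda-forge...
conda activate hello
conda install -c conda-forge pandas -y

# ...or via pip (works too, but then the source is PyPI, not conda-forge):
pip install pandas
```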
I'm not sure pandas would show up under conda info — taking a look, no, it doesn't list libraries — but if we want to see which packages we have, it's pip list (I said pip show first; it's pip list), and there we can see pandas is installed. Now let me do conda deactivate and run pip list again: notice base comes with a lot of stuff preinstalled, but does it have pandas? It does not, because that's the base environment. Going back with conda activate hello, pip list shows pandas again. I think I said earlier that I believed environments inherit from base, but it doesn't appear that way, so I guess that was just a false statement — each environment is completely isolated, which is actually a lot better.

So pandas is installed; if we want to use it, let's write some pandas code. I don't want to figure it out myself — I'm not great at pandas — so I'll go over to ChatGPT, or whatever free model you want to use, and ask: "give me a basic example of pandas" (I should be using the mini model, since I get limited uses of the bigger one). We get a basic DataFrame example — use whatever you want here; we just want to make sure it works, that's all that matters. I'll paste it into app.py and run it: ls shows app.py, so python app.py. Notice we can use the word python as opposed to python3, because it should be routed to conda's binary. One thing you might want to check when using Python is where the binary lives: whereis python shows it's loading from Miniconda. There have been cases where I've had to use python3 instead, which is kind of annoying, but we can check that too: whereis python3. Sometimes both are installed, sometimes only one; here it looks like python3 is a locally installed version outside of conda, and we clearly want the conda python. I know it's confusing, but it's just how it is. Anyway — did our file work? It did.
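For completeness, here's a minimal app.py in the spirit of what the chat model hands back — the exact DataFrame contents are my own stand-in:

```python
# app.py — a basic pandas smoke test.
import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Grace", "Edsger"],
    "score": [95, 98, 91],
})
print(df)
print("mean score:", df["score"].mean())
```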
run this as well I I'll go copy it over here to our uh repo over here app.py and so we'll go back over to this and we'll just do a bit of documentation so we had cond install we had cond uh or so we had PIP show or sorry pip list then we had cond deactivate and then we we had pip list we have uh cond activate hello and then we do python app.py okay so this is um hello example and so here we are creating our uh hello environment we uh install pandas in our
uh hello en environment we observe which packages are installed we observe which packages are installed in base it's good to make notes like even if like this is really simple just going through the habit of doing this really helps commit it to long-term memory for yourself uh and so I just that's why I always do this is it's kind of like your opportunity to do it like two three times over uh we can use Python uh python uh binary to execute U the python in the context of cond and we can type in where is
python check the check which uh uh what binary is being loaded okay we do where is Python 3 one thing we might also want to check out is like what if we did a uh requirements.txt here and we're not using the for it should still work okay and so I need some kind of Library here we'll just say torch and so I'm going to go here and do pip install hyphen R requirements.txt okay so it's installing no problem here I maybe I shouldn't have installed torch because that takes a bit of time yeah it's a
little bit large so I'm just going to stop that and you'll notice like torch is almost like um a gigabyte so you might want to clean up your environments from time to time maybe I should have chose something smaller uh you know like python. EMV we'll end up using that quite a bit so python Dov so do um pip install hyphen R requirements.txt so it'll install from that file here okay so that's really good um I'll go back over to here and I'll just make a new requirements here uh this will be just python just
again writing it again python. EnV that's so we can load environment variables and actually we'll use that quite a bit so that one's a good example to have here we'll go back over to our read what else did we do we did um uh python or sorry pip install requirements.txt has have a hyphen R in front of it if you forget it don't worry like after a week of me not working with python I always forget that command we'll do pip pip list again here because I want to see if it installed it so we
have python python. EMV here another thing that I should show you is um working with not Jupiter Labs but it's the environment specifically for um vs code so we'll do that in the next video here and so I'm just going to stop it I'm going to keep this environment around because this one's useful for us I'm going to go ahead and just quickly Commit This code here so we have it for later not sure if that's really the right commit messages but no one judged me on my commit messages okay see you in the next
All right — in this video, what I want to show you is the notebook-like experience within VS Code, with conda. We already figured out how to do that in cloud environments, but to get it set up locally we're going to need a couple of extensions — at minimum, Jupyter. Let's make a new file called app.ipynb. I'm not sure how I've memorized that extension at this point — it stands for IPython notebook: "ipy" for IPython and "nb" for notebook. That's the way I'm going to remember it from now on.

I'll make a new code cell and try to import something — import dotenv, since that's how we're going to bring in environment variables. Hitting run immediately pops up the kernel picker, because right now there's no kernel selected: it doesn't know about conda, and it may not even know about WSL — if you've installed WSL and don't have the right plugin, this might not work. I can see Python 3.12.3 offered, but I don't want that; what I actually want is the Miniconda environment — the same one in the terminal down below. I know it's recommending the base one, but we really want hello, and, surprisingly, selecting it just works.

Sometimes you need to install a couple of things first, and maybe they're already installed here: I already have the WSL extension. Another one you might need, if this isn't working for you, is Remote Development / Remote Explorer — I would install that one. In fact, right now I'm remoted into this machine using Remote Desktop, but I could connect to it through that plugin instead and not have to work this way; I just don't have that set up today and I'm not planning to show it here. Then there's also Jupyter: if I check the running extensions, there's one called Jupyter that you'll need. Normally, when you click run, VS Code asks to install these and you just say yes; if not, type "jupyter" in the extensions search (typed correctly!) — I already have it installed, and it may pull in a few companion extensions too, so just let it install whatever it wants. The idea is that the kernel picker up here found our conda environment. Normally you're supposed to create an ipykernel within the environment, which we'll show in a moment, but for now this is connected to conda.

So, with dotenv available, let's make a new file called .env, and in it put an environment variable — it doesn't really matter what; something like HELLO=world. Back in the notebook, it's load_dotenv from dotenv that will load it, and I usually keep my
imports on a separate line. Running the cell: "Running cells requires the ipykernel package." That's something we don't have installed. I was thinking we wouldn't need it until the next video, but notebooks won't work without ipykernel, so it looks like we do need it now. VS Code offers to install it — with --update-deps --force-reinstall, which I don't know that we need — but I'd probably write this a little differently myself.

Let me document this in a new file, vscode-notebook.md (I keep accidentally creating folders instead of files — I promise I'm a really good coder; if you saw me in MacVim you'd say "wow, Andrew, you're really good," but I'm fumbling in this editor). What I'd do is conda install -c conda-forge ipykernel — the prompt's version would work, but this is probably better. ipykernel is the way we create our own kernels. Make sure, again, that you are in the correct environment, then run it; it installs. I kind of feel like after that you usually have to do something else, but maybe not — let's try running the cell again... it's still complaining. Close the notebook without saving, reopen, run again — still complaining. Let me look up ipykernel on my other machine: there's pip install plus python -m ipykernel install, and there's creating the conda environment with ipykernel from the start — which probably would have been easier.

So rather than fighting it — we could probably have restarted and done what it asked — it might just be easier to delete this environment and start over. So: conda deactivate, then conda remove -n hello --all -y, then recreate with ipykernel baked in: conda create -n hello python=3.10.0 ipykernel. That just makes our lives a lot easier, if we always remember to do it. Actually, wait — the docs also show that python -m ikernel install --user line, so I wonder if we only need that and don't have to rebuild. Trying just that single line (what it does, I do not remember): "no kernel exists." Okay, fine — we'll keep that line in our instructions anyway.

So let's run through it all: conda deactivate; conda remove -n hello --all -y to remove the environment; recreate from scratch with conda create -n hello python=3.10.0 ipykernel — I think anything you list after the Python spec just gets installed too, so we could add more packages, like torch, but today it's just ipykernel; then pip list to see what's installed... except — oops, we have to activate first: conda activate hello. And now it says the package doesn't exist? Did we type it wrong — "ipy kernel"? It looks right to me; let me double-check. The docs say ipykernel: source-activate your environment, then python -m ipykernel install --user to install the kernel — which is more or less what we did. Sometimes when I make these from scratch I have fewer issues, so let me retry; I could have sworn the single-line create worked, or maybe I needed -c conda-forge — I've definitely done it on one line before. conda install ipykernel... didn't find it — see, that's why I think you need conda-forge. Still not finding it — oh, we didn't activate the environment; that's very easy to forget. conda install ipykernel again... still not finding it. Am I crazy? Do I not know how to install this? Am I spelling it wrong — am I putting an extra letter in there? ...Yes, I am. That's probably why it hasn't been working this entire time — you're thinking, "Andrew, you're crazy" — and I bet it would have worked as a single line as well.

So, definitively this time: conda deactivate (thank you for the interruption, Defender), conda remove -n hello --all -y to remove everything, then conda create -n hello python=3.10.0 ipykernel -y. Then conda activate hello and pip list — do we have ipykernel? We do.
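Here's the working recreate sequence, minus my typo (ipykernel is one word):

```sh
# Rebuild the project environment with the notebook kernel baked in.
conda deactivate
conda remove -n hello --all -y
conda create -n hello python=3.10.0 ipykernel -y
conda activate hello
pip list | grep ipykernel   # confirm it's really there this time
```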
So the question is whether we still have to run that odd python -m ipykernel install --user line — I don't know exactly what it does; it looks like it registers the kernel for the user — so let's see what happens without it. I save the notebook and go to select a kernel... and now that the kernel exists, it should be compatible, but it doesn't show our environment, even after a refresh. This can be a little finicky and I don't trust it, so I'll close VS Code and reopen it, conda activate hello (we don't strictly need to right now, but we will), run the cell — it only complains because no environment is selected — pick a different one under "Python environments," and there it is. Isn't it interesting that a restart fixed it? I like how it indicates which entries are conda environments — this one is, that one isn't. So now we're using 3.10, and the error is just that the module isn't installed — which makes sense; it's not that it's broken, we simply don't have dotenv in this fresh environment.

We could install it with conda as before, or via pip and requirements.txt if you want that, but another way I wanted to show is putting the install right in a cell: %pip install python-dotenv. Sometimes I like to silence it, so I'll add -q to make it a little cleaner. (You can also use the exclamation-mark form; I do the percent form, and maybe I do it wrong — sometimes it tells you to restart the kernel afterward, but it's probably fine in this case.) Now that runs with no problem. Then we'll need import os — I'll add it over here and save — and print out the variable: os.environ("HELLO")... "environ object is not callable"; maybe it's square brackets, or os.environ.get — Python isn't my primary language, so I'm always kind of guessing — there we go, and we have it.

So it looks like we didn't need that third python -m ipykernel install line after all, though I have seen it before — it makes the environment show up as an ipykernel, and without it, sometimes it works and sometimes it doesn't. This is the proper way to do it; now that I have it written down, it totally makes sense.
Hopefully that wasn't too confusing. You're often fiddling with this stuff — I could be on ten different machines and have ten different problems even while following the same instructions — so don't get discouraged if it doesn't work perfectly for you; local development is a bit of a challenge. In the next video I want to show you installing the JupyterLab server: this setup is fine, but you might also want a JupyterLab server, as it's just a different experience. I'm stopping the video here and leaving everything as it is — see you in the next one, ciao!

All right, we're going to pick up where we last left off. We got the VS Code setup working, but another way we might want to work is with JupyterLab, so what I want to do is get JupyterLab installed. Let's make a new document over here so you can follow along — this one's going to be "jupyter server."
I suppose this one could be separate, so I'll make a new folder called jupyter-server with a new readme. I'll want a fresh environment — it's always good practice — so first I'll delete the old one: conda deactivate, then conda remove -n hello --all -y. Now let's see if I can do this from memory. We can still work in the hello folder, that's totally fine — it's a very small application. First, create an environment: conda create -n serve python=3.10.0 ipykernel -y (serve, for server) — that gives us our base install. I think there was a spelling mistake, because it complained; I have a habit of typo-ing these, so that's my fault, but now it's installing.

Installing the Jupyter server is really, really simple: all we need is conda install -c conda-forge jupyterlab. Again, you probably don't have to pass -c conda-forge — I do it out of habit, to say explicitly where to pull from. Say yes, and while it installs, let's note the steps so far: conda create -n serve python=3.10.0 ipykernel -y, then conda install -c conda-forge jupyterlab. Now we need to start it, and the way we start it is: jupyter lab --no-browser --allow-root --ip 0.0.0.0. You don't have to pass --no-browser — you could let it open one if you wanted — and binding on 0.0.0.0 means we can access it from anywhere on the network. (I spelled --allow-root wrong the first time — if you spell things wrong, they won't work.) Hit enter, and now it's starting on port 8888.
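The whole server setup, start to finish — including the activate step that, as you're about to see, I manage to forget:

```sh
# Fresh environment for the server, JupyterLab from conda-forge, then launch.
conda create -n serve python=3.10.0 ipykernel -y
conda activate serve
conda install -c conda-forge jupyterlab -y
jupyter lab --no-browser --allow-root --ip 0.0.0.0   # listens on port 8888
```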
I'll open the browser, and here is our initial server. It wants us to authenticate, so we need to set up a password or a token — it's up to you. If you're using this often, bring the token in and set your own password. I'll copy the token and paste it in — a lot of folks have trouble pasting this token, but you can paste it once up top and never do it again, or set a password; worst case, you can tear the server down and set it up again. And we're in, with our Python environment and our existing notebook — we can run it, and it works.

There are some things we don't have here, though: extensions you might want to install for JupyterLab. Let's see if there are fun ones — searching for popular JupyterLab extensions installable via conda. One I always want is the git one. On the left-hand side, the extension manager asks whether we want it enabled — yes. These can be installed via conda, and the one we want is jupyterlab-git.

I'll leave the server running — I don't want to kill it — and open another terminal... and we're in base right now. Wait: did I install all of this in base? What am I running here — is this base? Yes. Let me stop the server for a second — we installed everything in base. Remember how I said you need to activate? It happens sometimes; I guess I forgot to activate serve. Totally fine — we'll look at how to reset base another time, but we are working in base and we really shouldn't be. The server's starting back up on port 8888, and let's say we just install the extension in the (wrong) same place: conda install -c conda-forge jupyterlab-git -y (I believe the conda package names match the extension names).

Back in the browser, I'm not sure whether we have to restart the server for this to take effect. A hard refresh later, we still don't see it — it did pop up and complain a second ago. What was wrong? "Git extension unavailable. Please ensure you have installed the JupyterLab git server extension... to confirm the server extension is installed, run jupyter server extension list." I'm not sure what it's complaining about, and I don't trust its complaints, so I'll do the old trick of stopping and starting the server again — it didn't know about that extension before. Refresh, and now it shows up. So that's one way we can install extensions.
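In command form — installed into the environment the server actually runs from (serve here, not base like I just did), followed by a restart so the server-side piece loads:

```sh
# Install the git extension where the server lives, then restart the server.
conda activate serve
conda install -c conda-forge jupyterlab-git -y
# Ctrl-C the running server, then start it again:
jupyter lab --no-browser --allow-root --ip 0.0.0.0
```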
I think we can also install extensions from the panel on the left-hand side, but sometimes your environment doesn't allow that, so sometimes you have to go through conda. Is there anything else we'd like to install? Pastel colors — that sounds like fun. Clicking the little install button: "you need to refresh the page" — nice, a hard refresh... and I don't know, it says it's installed, but it doesn't feel like it's here. I don't trust it, so I'll close the tab and reopen localhost:8888 — still no pastel colors — then do the usual trick of stopping the server and restarting it. It says it's installed; why don't I see it? The instructions say to install the JupyterLab extension via pip or conda, so I'll do it through conda instead — again, I don't really trust extensions installed through the UI; I feel like if you want something installed, you really have to do it this way. Install, say yes, restart the server, give it a nice refresh... it's definitely installed, and the instructions continue: "install the extension, start and restart JupyterLab if it's already open, then go to Settings." Oh — it's not that it wasn't installed; we just had to go to Settings → Theme. Oh, cool, we have all these themes — I think these are all the ones we installed, and that one's pretty nice. So maybe the UI install did work on the left-hand side and I was just being ignorant.

What other kinds of extensions are there? Let's see if there's another one — another cat one (I'm not a huge fan of cats — sorry, cat folks). I'm trying to find one we'd notice right away if we used it. Can we get Ruby in here — another runtime? Typing in the search — it is searching, it's just slow to update the list down below; there we go, now I'm seeing results. This one has no page — I don't think I trust extensions I can't find pages for. Can we type "game" — is there a game? Nope. Ruby? Nope. Anyway, I feel more confident installing via the terminal and restarting.

So that's the JupyterLab server. If you want to do more with it, there's obviously more configuration, but this is how I have it set up on my machine, in another user account: when I want to work on the Intel AI developer kit from my main computer, I just connect to the server — it runs on a particular port, so when I start it up, I go to the name of the actual server (not localhost), and that's good enough for me. Let me bring the instructions back into the readme — it's very frustrating to copy and paste out of this terminal: the startup command, the git extension we installed, and the themes one; nothing crazy. We'll commit all of that, and you're pretty much set up for local development now. There you go — I'll see you in the next one, ciao!
Let's take a look here at zero-shot prompting. This is when the model can perform the expected task with no prior examples provided during prompting. An example might be that you want to classify some text. The text here is "Andrew is great at teaching GenAI"; do we classify it as neutral, negative, or positive? The output, the next thing the large language model would write, would be "neutral". The idea is that we didn't have to provide examples of what neutral, negative, or positive looked like; the model already had that built-in knowledge from when it was trained, so it can perform the task directly, and that's what we call zero-shot.
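For reference, here's a minimal sketch of that zero-shot classification as an API call, assuming the openai package is installed and OPENAI_API_KEY is set in your environment, as we do elsewhere in this course; the model name is just an example:

```python
# A minimal zero-shot classification sketch using the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Classify the following text as neutral, negative, or positive.\n"
    "Text: Andrew is great at teaching GenAI.\n"
    "Sentiment:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat model will do
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # e.g. "neutral" or "positive"
```

Note there are no labeled examples anywhere in the prompt; the model relies entirely on what it learned during training.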
Few-shot prompting is when the model can perform the expected task with examples provided during prompting. So imagine you have a series of tweets and your model can't do sentiment analysis out of the box; you provide examples of what negative and positive tweets look like, and the idea is that it predicts the label for the next one based on the prior examples. We could also call this in-context learning, because you're providing the examples in context, in the prompt message, at that point in time. So there you go.
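Here's a minimal few-shot sketch of that tweet example; the labeled tweets are made up purely for illustration:

```python
# A minimal few-shot (in-context learning) sketch: labeled examples in the prompt,
# then an unlabeled item for the model to complete.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Tweet: This new phone is amazing! // positive\n"
    "Tweet: My flight got delayed again. // negative\n"
    "Tweet: Just finished the best workout of my life! // positive\n"
    "Tweet: The service here is terrible. //"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected: "negative"
```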
Let's take a look at prompt chaining. Prompt chaining is where the output of one LLM prompt is used as the input for the next prompt in a sequence. The reason to do this is when the context window of the prompt is too small to complete a single large or complex task, so we break the task up into smaller tasks. An example: imagine there's a movie we want to watch to learn Japanese, and we want to take the subtitles file, convert it into English, get grammar rules around it, and then provide it as a JSON Lines file. This is actually something we do in the boot camp, and I didn't even realize I was doing prompt chaining when I did it. I tried to give the model one very complex prompt and it just would not do it; it would make a lot of mistakes or give me bad information. So I had to break it up into smaller tasks: iterate over each line in the subtitles file and say "translate this line from Japanese to English", then ask it to produce a list of grammar rules it found in the sentence, then ask it to explain each grammar rule and provide five examples, then format the original line, the translated line, the grammar rules, and the examples into JSON. Again, this is just where you're breaking up a task because the LLM cannot handle such a large task in one go, okay?
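Here's a minimal sketch of that chaining pattern; the ask helper and the sample subtitle line are hypothetical, just to show each prompt's output feeding the next prompt:

```python
# A minimal prompt-chaining sketch: each step's output feeds the next prompt.
# The `ask` helper and the sample subtitle line are illustrative only.
import json
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Send a single prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

line = "猫が好きです。"  # one subtitle line

# Step 1: translate the line.
translation = ask(f"Translate this line from Japanese to English: {line}")

# Step 2: a separate, smaller prompt for the grammar rules.
rules = ask(f"List the grammar rules used in this Japanese sentence: {line}")

# Step 3: feed step 2's output into the next prompt.
explained = ask(f"Explain these grammar rules and give five examples each:\n{rules}")

# Step 4: assemble everything into one JSON record (one line of a JSONL file).
record = {"original": line, "translation": translation, "grammar": explained}
print(json.dumps(record, ensure_ascii=False))
```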
Let's take a look at chain of thought. Chain-of-thought prompting is when you tell the model to do step-by-step reasoning so it produces a more accurate result. Without chain of thought you'd say something like "which two prime numbers sum to 10?" With chain of thought you'd say "please solve the problem step by step: first outline your reasoning as a clear, logical chain of thought, then provide the final, concise answer." This is not a huge issue for larger modern models, but the smaller models sometimes just don't get things right unless they talk out loud and work through it. So sometimes you get dramatically better results just by telling the model to think out loud and think step by step, and that's all that is really happening here.
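A minimal sketch of that chain-of-thought instruction, using the prime-numbers question from the slide:

```python
# A minimal chain-of-thought sketch: the same question, but with an explicit
# instruction to reason step by step before answering.
from openai import OpenAI

client = OpenAI()

cot_prompt = (
    "Which two prime numbers sum to 10?\n"
    "Please solve the problem step by step. First outline your reasoning as a "
    "clear, logical chain of thought, then provide the final, concise answer."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": cot_prompt}],
)
print(response.choices[0].message.content)
```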
Let's take a look at tree of thought. Tree-of-thought prompting is where the prompt instructs the model to explore possible states and transitions rather than trying to produce an answer immediately in a single chain of thought, basically creating a decision tree. We kind of do this in the boot camp with the Japanese language construction helper, where we actually define states and tell the model how to transition between one and the other. Here is a diagram you'll see a lot; every time you look up tree of thought, everyone uses the same diagram, so I pulled it in as well. Here you see input-output prompting, which is just where you give an input and get an output, like zero- or few-shot; chain-of-thought prompting, where we tell the model to think things through; then self-consistency with chain of thought, where the model produces multiple answers and uses voting to come to a final result (I don't think we made a slide on that, but it is another thing we can do); and then you have tree of thought, where the model has multiple ways it can go through the logic: it can add branches and remove branches. The idea is that it's not taking a linear path; it's going to evaluate multiple options to get to the end. This one's a lot more complex, but I thought it was really cool to demonstrate here.
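Here's a minimal single-prompt sketch of the idea; the step list mirrors the restate / generate paths / evaluate / expand / answer pattern we'll see in the demo that follows, and the arithmetic problem is just a toy example:

```python
# A minimal tree-of-thought sketch: one prompt instructing the model to branch,
# evaluate, and prune rather than answer in a single linear chain.
from openai import OpenAI

client = OpenAI()

tot_prompt = (
    "Solve the problem below using a tree-of-thoughts approach:\n"
    "1. Restate the problem in your own words.\n"
    "2. Generate three distinct solution branches.\n"
    "3. Evaluate each branch and discard the weak ones.\n"
    "4. Expand the most promising branch.\n"
    "5. Provide the final answer.\n\n"
    "Problem: What is 15 * 14?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": tot_prompt}],
)
print(response.choices[0].message.content)
```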
Okay, hey, this is Andrew Brown, and I want to see if we can figure out how to do tree-of-thought prompting. It's not super easy to do, so I'm hoping we can leverage something like PromptHub, where they say they already have an existing template we can work with. I'm finding that the examples online all show either the Game of 24 or creative writing, so maybe if we can get one of these templates from somewhere we can get started here. I do believe they have a button here for tree-of-thought prompting, so now we can go take a look at PromptHub, which is a place where you can manage prompts, which is not a totally bad idea. I'm going to sign up here, make a new account, and we'll see if we can find that prompt in here. So I've just verified my email; I'm going to continue on and skip the "set up your team's key" step... actually, we could put an OpenAI key in here, and luckily I do have OpenAI, so maybe that's not a bad idea to try. Of course you'll need an OpenAI account to do this; we do this many times throughout this course, so if you're not familiar with it, just watch for now and you can always come back to this video later on. I already have a basic project set up; you can create new projects. I'm going to manage projects and make my way over to API keys. I clearly already have an API key, so I'm going to revoke it and make a new key called "prompt hub"; we're just trying to get into PromptHub because they're promising there's a template here. We'll go ahead and dump that key in there and hit save and continue. I don't have anyone else to invite, so let's make our way into PromptHub. We are now in PromptHub, and I want to see some of these templates, so let's go over here and see what we have. What am I looking for in here... "Chain of Destiny", that's funny... I'm looking for tree of thought, so that's what we want to find here. I'm going to go ahead and type in "tree of thought".
Maybe we can get it if we go back over here and click this again, okay, and then it brings us exactly to the template we want. So here's an example of the tree-of-thoughts one; we'll read through it and try to figure out what we have here. "A prompting method that instructs the LLM to traverse many different paths when completing a task. A movie recommender is used as the example task; update the variable and steps for your use case. You are a super-intelligent movie recommender AI. Here's your task: your goal is to leverage this information and use the tree-of-thoughts method to come to an optimal conclusion. Understanding friends' preferences: review the hardcoded movie preferences provided for each friend. Create a list of unique movie characteristics, including favorite genres, preferred actors, and any specific movie elements they enjoy. Using the gathered preferences, generate a list of potential movie options that align. Evaluate the potential movie options based on each friend's preferences; you can assign a value of 1 to 10 to each movie representing how well it matches their interests. Now that you have a list of movies with corresponding scores, review the evaluations and select the movie that receives the highest overall score, considering the combined preferences of all friends. Finally, reveal the perfect movie choice: output the selected movie." That's a lot of stuff going on there; how can it do all that in one go? I cannot imagine it would. "Use the tree-of-thought method to navigate through the steps and generate the best movie recommendation for the friends." So this sounds cool.
cool um how can we go ahead and utilize so we can hit add to library so now we have it here and since this is hook up to gen I'm assuming that we can input the stuff here is the task Okay so I'm here we have a preset here this is on turbo I like how it shows the price here that's actually really smart but I'm utilizing um GPT 4 usually I think it's also more cost effective and by the way if you don't have paid you can just watch me do this and we'll find
out here but I think the the version went down here but I like how they have that um okay so I guess my question is how am I inserting this here is your task okay so I'm not 100% certain on how to utilize this let's go look at another one here article content code so let's go take a look here and say prompted Hub uh variables like they don't really explain prompt Hub they're not really explaining how to set those there just give me a second here let's find out all right so there're suggesting there's
a thing called variable let's this one so let's go take a look and see if that's true so we'll go over to here back to promp tuub um all right well what we can just hit run test and find out what happens run test I think it's doing it how many times I'm not sure uh oh how many times is it running once twice three times so I guess it's running great so now where's the Run can I see [Music] it well this is gpg 40 that look kind of looks like this is the output
of the last test run, one second ago: "Introduction to prompt engineering". Okay, cool, so right away it's already giving us results, so I guess it actually did all of it. Oh, variables, here we go: "generate 10 article ideas for blog posts about prompt engineering". But this is not exactly what we want to happen, because we need to read this prompt a little more carefully one more time: "your goal is to leverage this information and use the tree-of-thoughts method to come to an optimal conclusion; review the hardcoded movie preferences provided for each friend; create a list of unique movie characteristics including favorite genres". So for this to work, we need three friends, and I need to know five movies they like. Let me ask: "give me three friend names and five movie titles they like" (oh, I spelled it wrong). The idea is I'm just doing this to feed it some kind of data. And so we can see we have these three folks here; we'll copy this text, I'll make my way back over, and we'll paste it in. It's not pasted the nicest way, but this should be sufficient for it to work. I'm going to click this again to make sure that's set; that's set, so let's run this again. Okay, now it should do what it's supposed to be doing; we'll give it a moment. All right, so we have a result
back here. Step one, understand friends' preferences. I'm going to copy this text so we can go somewhere else and have an easier time reading it; just give me one moment. All right, I've just brought this into VS Code and I'm previewing it, because it did look like it produced Markdown. Step one, understand friends' preferences: Rachel seems to enjoy fantasy-adventure films with a mix of animation and live action; her movie list also includes a mystery film, suggesting she enjoys a good plot twist. His movie preference leans towards dramas and thrillers with complex narratives; his list includes films with strong performances and thought-provoking themes. Priya seems to appreciate a variety of genres. Step two, consider the preferences of all three: some potential movies that could suit their tastes include The Dark Knight, Lord of the Rings, The Prestige, etc. Evaluating the movie options: here it looks like it goes through and gives scored values for them. Select the best movie: Lord of the Rings has the highest overall score, closely followed by The Dark Knight and Toy Story. "The perfect movie choice for movie night would be The Lord of the Rings: The Fellowship of the Ring; this epic fantasy adventure is renowned for..." etc. So that's really cool that it goes through all of that. What surprised me in the prompt was that they just said "use tree of thought", so clearly the model must have built-in knowledge of how tree of thought works. I figure if you're using a dumber model that doesn't know it, you
might have to explain in more detail what tree of thought is. Probably what we could do, if our model didn't know tree of thought, is ask: "I am using a smaller model that doesn't know what tree of thought is; what can we do to prompt it so it understands how to perform tree of thought?" The answer: explain what it is, outline the steps of tree of thought, and combine the explanation with instructions. So I guess the question is: is tree of thought one-shot, as in a single prompt? Because I was expecting tree of thought to be multiple prompts, like prompt chaining. But no, it's a reasoning framework within a single prompt, okay; there you go, I guess that kind of makes sense about how it can be utilized. Let's take a look at a demonstrative example: "we will use a step-by-step reasoning process called tree of thought; in this approach we break down the question or problem by exploring multiple branches of ideas." The steps of tree of thought: restate the problem, generate possible paths, evaluate promising branches, expand on the best branch, provide the final answer. The example prompt: "You are an AI assistant. We're going to solve problems using a step-by-step reasoning process called tree of thought. Restate the problem: clearly restate the question in your own words. Generate possible paths..." okay, it's just repeating the same steps, so we'll go down below. Example question: what is 15 * 14? Restate the problem: I want to multiply 15 by 14. Possible paths: branch A, branch B, branch C. Evaluate the branches: one is a quick approach if we can recall the answer; another is more tedious but guaranteed. Then it picks the best branch and expands on it. Okay, so that's really cool, that's starting to make sense to me. So yeah, I guess that's tree of thought. There are so many techniques in prompt engineering; I'm not sure if I showed this in a previous video, but let's go over to a prompt engineering website; there are so many prompt
engineering websites now, some good, some bad. The Prompt Engineering Guide, here it is; this is one I like quite a bit. There are just so many techniques and ways of doing prompt engineering. A lot of the time, when you're using LLMs, you end up inherently using a prompt technique without realizing you're doing it. I think I mentioned this before, but when we were doing the Japanese subtitle thing, I was using some form of state and transition management, which looks very similar to tree of thought, and I ended up using prompt chaining intentionally as well. So I would say don't worry so much about memorizing all these prompt engineering techniques; you'll find that you use them indirectly. But it does help to generally know some of them, and to come back and review, so you can go, "oh yeah, okay, I am using prompt engineering, I just didn't realize I was doing that." There's also a big argument about the difference between prompt engineering and prompt design. Some folks
would call what we just did prompt design, and say that prompt engineering requires a lot more technical coding work to orchestrate something larger than simply writing a prompt. I'm not going to fuss about prompt design versus prompt engineering; I'm just considering it all prompt engineering. But yeah, there you go. Let's take a look at CO-STAR. CO-STAR is a prompting framework created by Sheila Teo, and it basically follows a mnemonic: Context, Objective, Style, Tone, Audience, and Response. This is nothing super fancy; it's just a prompt template you can utilize. I use something similar, but the idea is that you provide a context, where you say what you're doing. Here it says: a free online introductory Japanese language workshop for beginners. The objective: encourage people to register by highlighting the quick, fun, interactive nature of the session. Style: engaging, inviting, clear, educational. Tone: persuasive yet friendly (it's a bit odd that the style section mentions tone when we also have tone down below, but that's okay). Audience: busy adults curious about language learning, with no prior experience. Response: a concise, impactful Facebook post. So this is just one of many ways you can format a prompt document, but because it's called CO-STAR, I thought it might be easy for people to remember and adopt as a way of writing prompt documents to get the result they want. But there you go.
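To make the framework concrete, here's a minimal sketch of that same example laid out as a CO-STAR prompt template in Python; the section headers are just one common way to format it:

```python
# A minimal CO-STAR prompt template, using the workshop example from the slide.
costar_prompt = """
# CONTEXT
A free online introductory Japanese language workshop for beginners.

# OBJECTIVE
Encourage people to register by highlighting the quick, fun, interactive
nature of the session.

# STYLE
Engaging, inviting, clear, educational.

# TONE
Persuasive yet friendly.

# AUDIENCE
Busy adults curious about language learning, with no prior experience.

# RESPONSE
A concise, impactful Facebook post.
"""
```

You would then send costar_prompt as the user message the same way as in the earlier examples.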
Hey, this is Andrew Brown. In this video I want to see if there's a way we can implement ReAct prompting. There are managed solutions out there, but I want to see if we can write this from scratch, as I don't believe it would be very difficult to figure out. It looks like there's a video down here, but what we can do is see if we can use ChatGPT to help us write it ourselves. So I'll ask: "I want to implement an LLM solution that uses ReAct prompting; how can I implement this?" Okay, we'll take a look and see what we can do here; we'll give it just a moment, and I'll get my
head out of the way. All right, so I've asked ChatGPT to help implement it, but I don't want to implement it using LangChain or LlamaIndex. Those frameworks would let us do this easily, but I really don't want to, as I feel they add an extra layer that doesn't help us that much. I'm going to go over to GitHub, to our ExamPro GenAI Essentials repo. Just so you know, prompt engineering appears a bit earlier in this course, so if this stuff is out of your realm right now, you can always come back to it when you've learned a bit more coding. I'm going to open this up in GitHub Codespaces today, if that's okay; sometimes I use Gitpod, sometimes GitHub Codespaces, sometimes local development, we like to shake it up here. We'll open that up, and as that's opening, let's give this a read. We'll create a prompt that instructs the model on how to respond in the
ReAct format. Typically you do something like this: we'll have a system prompt. "You are an AI assistant that can perform the following steps: reason through the problem by describing your thoughts in a Thought section; when you need to use a tool, output an Action section with the tool name and its inputs; after the tool call, you'll see an Observation section with the tool's output; continue the cycle of Thought, Action, and Observation as needed, and end with a concise Final Answer that answers the user's query. The chain of thought in Thought sections is only visible to you and is not part of your final answer; the user should only see the final answer." That sounds interesting so far. The user prompt: "What is the capital of France plus the capital of Germany?" Then: perform a single manual ReAct step with an LLM. That kind of sounds like what we would like to do, to do this manually.
So I'm going to copy this. We have lots of code lying around; this is actually old code, I can tell right away because it has the two capitals in there. Here it says we pass in a list of system messages plus the user's and the ongoing conversation, parsing the Action and Observation. Let's go back over here; I'm going to switch the color theme to dark, and I'm going to make a new folder (yes, allow me to paste) for prompt engineering / prompting, and we're going to see if we can implement this ReAct ourselves. And it's not React the JavaScript framework, it's ReAct the prompting framework. So I'm going to make a basic file here, not .py but an IPython notebook, basic.ipynb. We'll also create a new .env file, a new .env.example file, and a new .gitignore file. I've been using OpenAI quite a bit, and we'll keep using it, it's fine. I do have a Claude subscription with Anthropic with credits on it, but the only way you'd be able to follow with that is if you pay for it. If you don't have the credits, or a comparable option, that's totally okay; just do your best to follow along with whatever solution you have. We do have a whole section where we show you how to programmatically work with these in the workbenches and playgrounds section, so if you have yet to do so, go check that out. But I'm going to make my way over to the OpenAI implementation, which is over here, and I'm going to grab some of this code to get started with.
Here, this one is actually a really good example; I'm just going to copy it and save myself some trouble, paste it in and let it replace everything, delete this one, and rename it to .ipynb. I'm going to need a key from OpenAI, so we'll go over there, give it a moment, go to the API login, and make my way over to my basic project. If you don't have a project, go ahead and create one. I'm going to go over to API keys; I have one here from earlier, which I'm going to delete, and we'll create a new one called "react" and place it into the basic project. It will generate a new key for me. I'm going to go back over here and into the .env file; I'll copy the OpenAI variable names, because what we need is these three things, and I'll paste them into the example file too. Also, while I don't forget, I'll set up dotenv here. Then we'll go back, copy the key, and bring it down; this is the value that goes into the first variable, so we'll cut that out and place it in there. I also need a project ID and an org ID, so I'll go back to the API login, and on the left-hand side go to "basic", into manage projects, where I can grab the thing I want, which is the project ID for "basic". We'll go back over and put that in. Then I'll make my way over to... I'm getting a little confused about where I am... yeah, Organization, General, and here we can grab an organization ID. We'll go back over and paste it in, and now we have those three set, okay.
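For reference, here's roughly what the finished .env file looks like at this point (the values are placeholders; as far as I know, these are the variable names the OpenAI Python SDK picks up automatically):

```
# .env (placeholder values)
OPENAI_API_KEY=sk-...
OPENAI_PROJECT_ID=proj_...
OPENAI_ORG_ID=org-...
```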
Those are all set. Let's go back over to what's going on here... I think I named the file wrong; it's not supposed to be a folder, oh darn. I'm going to delete this, one second, copy the code again, and paste it back in. There we go, so this should now all be configured to work. This is not really the example we want, but let's just make sure it works first. I'm going to let it install Python, Jupyter, whatever it needs, down here as well; we'll give it a moment. We'll go to Python environments and choose 3.12, run this code, give it a moment, import our environment variables, and just make sure this still works. Okay, it does. So now the next thing is to incorporate some of the outputs we got from ChatGPT. The first thing is that we need that system prompt document, so we'll copy it, and I'm going to make a new line above and call this one system_prompt. I think it's
triple quotes; if we do triple quotes, we can do this, right, there we go. And now, instead of the inline string, we can use this as our system prompt. I'll bring this onto a new line, okay, so we have that, that's good. We'll go back over to ChatGPT, and this is our current user prompt, so we'll go down here, this will be user_prompt, and we have that: what is the capital of France plus the capital of Germany. Okay, I'll run this; I'm not sure what we expect to see as a result, but I'm going to bring this onto a new line, since I'm iterating on this and don't want to call the API a hundred times. We'll run this, no problem, and I want the content in here, so I'll just do this, one second. So here I have the output. Thought: the capital of France is Paris and the capital of Germany is Berlin; the question seems not to ask for a mathematical sum but rather to combine the two capital names in some way. Action: no tools needed for this task, I will simply formulate the answer. Observation: the capitals of France and Germany are Paris and Berlin. Final Answer: Paris and Berlin. So that's the first example of it. But let's change it to say: what is the weather in Thunder Bay, Ontario, Canada today? That's my hometown. I'll update that, run this again, and let's see what we get this time. It says: I need to find the current weather information for Thunder
Bay, Ontario; I will look for a reliable weather source to get the most recent data. Action: get_weather, Thunder Bay, and so on. Then it says the above action simulates calling a weather tool; in this scenario it's using it to gather current weather data. Observation: assume I received the weather data for Thunder Bay, which shows conditions such as temperature, precipitation, and a general description; the weather in Thunder Bay is... etc. So the thing is, when you're doing this, it's producing a Thought and then an Action, and here it needs to call a function; it's telling us right there what action it wants. The idea is that we can parse that out and then call that function ourselves, and that is what's happening here. So, as ChatGPT laid out, performing a single manual ReAct step works like this: check if the model produced an Action (it did); if so, parse out which tool it wants to use and the input; run the tool in Python to get a result; append an Observation message to the conversation; call the LLM again with the new conversation context; and repeat until we get the final answer. They describe this big iterative process, but it's a lot simpler than it sounds. We have that function, but we need some way of parsing out the action, so I'm going to ask: "this is what the action looks like in the text; how can I parse it out using Python?" And you could
even tell the LLM exactly how it should format actions; we're not doing that right now, we're just going to figure out how to extract the text we have. We'll give it a moment to give us some code back. Oh, weird, it didn't actually do the thing we asked; I'll ask it to try again. It still didn't do it, so we'll say: "hey, you didn't give me any code". We'll ask again, and now we have a regular expression for grabbing the text. I'm going to see if this works, so we'll bring it over, and I'll make a new line here. We're importing the regular expressions module, doing a pattern match for our action, and then trying the match. We're bringing in our text; our text here is this, so I'm going to change this a little: I'll cut this out, assign it to a variable called text, and still print it out. So now we have our text, and then we go down here, print the parse result, and we got the tool, get_weather, and the input, Thunder Bay, Ontario, Canada. So now what we would do is define a function called get_weather that takes that input.
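Here's roughly what that parsing step looks like; the exact Action format is whatever your system prompt tells the model to emit, and this regex assumes one particular shape just for illustration:

```python
# A minimal sketch of parsing a ReAct "Action:" line out of the model's reply.
# Assumes the model emits a line like: Action: get_weather("Thunder Bay, Ontario, Canada")
import re

text = (
    "Thought: I need current weather data.\n"
    'Action: get_weather("Thunder Bay, Ontario, Canada")'
)

match = re.search(r'Action:\s*(\w+)\("([^"]*)"\)', text)
if match:
    tool_name, tool_input = match.group(1), match.group(2)
    print(tool_name, tool_input)  # get_weather Thunder Bay, Ontario, Canada
```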
We could hardcode the input and not do this for real, but you know what, let's ask ChatGPT: "how can I get the current weather using an API in Python?" and see if we can get an example; maybe we can implement it for real. Okay, it says "create a free account on OpenWeatherMap", sure, let's try that. Go here... okay, that's fine, sign in. Well, let's first take a look at pricing. Yeah, there's a free tier, 60 calls per minute, so we'll go with that. I'm going to make a new account, generate a new password, save that, I am older than 16, I agree, and I don't want a bunch of alerts, so I'll skip those. I am doing this for educational purposes, ExamPro Training Inc, save. Okay, so now I need an API key. We'll go over here; I have a key right there, so I'll copy it and go back over here. Does this key need a very particular environment variable name? They're not specifying one, and that's okay. So I'm going to go back into our code and add a weather API key to the .env; what's that thing called, OpenWeatherMap? I'll match the original example and name it for OpenWeatherMap, like that. Okay, so we have our key in here, so let's go back to our code, which is here. There's a lot going on, so hopefully this just works. I'm going to copy that, go back over here, and paste it in.
And we do have that API key somewhere in here... here it is. In here I'm going to say os.environ.get and pass whatever the name of the key is. It's acting like it doesn't know what that is... let's see, import os; I'm pretty sure we imported that already, but I'll import it twice. We have a main section here; I don't need a main section, as this is a single function. So instead of having the API key hardcoded like this, I'll take it out and grab the API key from the environment instead, right here. Then we'll take this part out; this is pretty straightforward, so I'm just going to delete it. We have the city name and the API key; it uses imperial units for Fahrenheit, but oh no, we want metric, we're Canadian, or at least I'm Canadian, I'm not sure what you are on the other end there. There's a big example of handling the returned data; that's fine, I'll take that out too, and this is our simplified function, into which we can pass our city name. The idea is that if we got get_weather as the tool name, which we did, then we want to call this. I'll define the function, and we have these two things, so on the next line we'll say: if tool_name equals "get_weather", then we call the function with the tool input. I'll go up here and assign the result to weather_info, then down here print weather_info. I'll indent this so we have less of an issue; we need a colon in there, okay. So this should in theory work. The only thing we need to do (I'm fixing the indentation here, it's acting a bit funny) is reload our .env file, because it holds those credentials.
So I've reloaded it; I'm going to go all the way down here and run this again so it loads in that environment variable, and let's see if our tool works. It's trying to call it... 401 Unauthorized for the URL. Okay, so it's saying our request is unauthorized. Is it though? I mean, the key is active, right, there's the API key. Oh, maybe I have to confirm my email; just give me one moment. All right, I've confirmed my email, so now I'm curious whether this will work. We'll run this again and try one more time... Unauthorized for URL. So maybe it doesn't like the URL that's here. Let's go back; I mean, this was generated by ChatGPT, so it might not be using the latest endpoint. We'll go over to the API docs, and this says 3.0, so you can already see there's a difference in how to make an API call. I'm going to go back over here, and this is data/weather, so I wonder if there's a weather API endpoint like
this one, to get access to current weather and minute forecasts for one hour. Okay, so how do I do it by town or city? Click on this again; the API docs say it contains four endpoints and provides access to various data, so we have current weather and forecast. That sounds like what we want, yes. The API actually looks like this: latitude, longitude, and an exclude parameter... so maybe you can't just pass a city anymore, or maybe ChatGPT's example was out of date. So we'd need an additional step to get the longitude and latitude, which is kind of annoying, but if this works, we'll just utilize it. You can see how it wasn't as straightforward as we thought it would be. We'll paste this in as such; it names the query parameters this way, so really what we want is lat and lon. I don't know if we need the exclude part; we probably don't. We don't need the API key here, it's already in there. So what we really need is the latitude and the longitude. This is not going to be a perfect example, but let's just assume we knew the latitude and longitude of Thunder Bay. So, latitude and longitude of Thunder Bay: which one's which? Can we get them represented as decimals? It's probably fine. The latitude is 48.3809; I know this isn't really how we want to do it, but I'm just going to hardcode it in, and this is our longitude. Then we'll go back over here. So even though we're passing the city name, we're not actually utilizing it anywhere, just because we found out the parameters don't match what we're doing. Oh yeah, we need a colon in here. All right, we'll give this another try, run it, and try to get that tool again... it says 401 Unauthorized. It is attaching the API key, but is the parameter name specified a little differently here? This one is
appid; was the other one named differently? Maybe that's why... no, that's right. So I don't know; we generated a key, so in theory it should work, but it's not. Anyway, it's not working, and I'm not going to fiddle around with this forever, so I'm just going to say: "give me mock data for the following structure". We'll just mock it and continue on from here. There we go, so we'll grab it like this, go back, and drop it in like this. I would love for it to work, but we're not going to lose a lot of sleep over it; we know what we expect to receive back. I'll comment the real call out for now, and down below the function will now return the mock data, and we'll get back our weather info. So now the idea is that
if we go back up to our original one here: note, the above action simulates calling a weather tool; in this scenario I'm using it to gather data. We'll go back over to ChatGPT and read what we're supposed to do next. I think we need to append the Observation. Let's go back up: check if the model produced an Action; if so, parse out which tool; run the tool in Python to get the results; append an Observation message to the conversation; call the LLM again with the new context. Okay, great, so we have that. We have the text output from earlier, and if we print text, we have this. So it's saying to append: I'll do text equals text plus a new line plus "Observation:" plus the weather_info, and then I'll print out the updated text. It's acting a little funny, but we'll just make a new line here like this; it has one little extra space, but I don't think that's going to hurt. So now we'll go back up to this part, copy it, and go all the way down to a new line. This is our text that we got back, and I think it goes back in as an assistant message, right? And then we have our updated text... hold on a second, this is still the user prompt, right; then we got back our assistant text, which goes in as an assistant message; and then we have our updated text. So we have a bit of a conversation history here; I assume that's what it wants us to do. We'll run this, and... I've got to go back up to that other one; I'll call this text_two, and we'll run it. Final Answer: in Thunder Bay, Ontario,
Canada, today it is 5.2 degrees Celsius (our mocked value). One thing I'm thinking is that instead of having four separate conversation turns here, we could probably have just appended to a single message, but I believe it feeds it all in either way, so it doesn't really matter. The point is that hopefully you understand how ReAct works: the model produces a Thought, you get an Action, you run it and supply an Observation, and then you get your Final Answer. And there we were able to do that. There are frameworks that will automate this stuff, but it's better to know how the manual moving pieces work. You can tell the system prompt exactly what you expect the tools to be; you could even tell it exactly what tools it has, because here it was just making tools up, since it didn't know what it actually had available.
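Putting the pieces together, here's a minimal sketch of the whole loop we just walked through manually; client, system_prompt, and get_weather are assumed to be defined as in the earlier cells, and the Action format matches the regex sketch from before:

```python
# A minimal manual ReAct loop: call the model, parse an Action, run the tool,
# append an Observation, and repeat until there is no Action left.
import re

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the weather in Thunder Bay, Ontario, Canada today?"},
]

for _ in range(5):  # cap the number of think/act cycles
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    text = response.choices[0].message.content
    messages.append({"role": "assistant", "content": text})

    match = re.search(r'Action:\s*(\w+)\("([^"]*)"\)', text)
    if not match:
        print(text)  # no action requested: this should contain the Final Answer
        break

    tool_name, tool_input = match.group(1), match.group(2)
    if tool_name == "get_weather":
        observation = get_weather(tool_input)
        messages.append({"role": "user", "content": f"Observation: {observation}"})
```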
We'll call this one complete, so we'll commit it as "react example", and there we go, okay, ciao. Hey, this is Andrew Brown, and in this video we're going to take a look at the OpenAI API. If we go over here to Product... their UI on their website is awful, I don't know why they make it so hard to find, and it's very slow to load, but if you drop it down you can get to the API login. We do use OpenAI quite a bit, just because I find it one of the easiest ones to utilize; it's not the
best, it's just what I'm very comfortable with. When you want to start working with it, what you're going to end up doing is creating a bunch of projects. Here there's a project I don't really want anymore; I'll manage my projects and delete it (it takes a moment). You come with a default project, as you can see here, and you might have to fill in your company information, you probably do. But the idea is that when we want to work on something, we first create a project, say "basic", and then from there we create an API key, which we can find here on the left-hand side. You can see I have a key from a different example before; we'll open this up, I'll call the new one "basic", select that project, and create it. I'm going to make my way over to... I think I already have, yep I do, an environment open here, so I'll open that up. I use OpenAI quite a bit in this course, so you're going to see me do this again and again; don't feel bad if I'm kind of glossing over it. We'll go over here on the left-hand side, give it a moment, and we'll set up a folder for OpenAI. I'll make a new file here called basic.ipynb, we'll make another new file called .env, we'll
make another file in here called .env.example, and another called .gitignore, okay. I believe the key is called OPENAI_API_KEY, so that's what it's going to be named. We'll go back over to our API key, copy it, and paste it in here. I'll go over to the left, copy this part, and we'll go back over and paste it in. Now we actually need a few other things: a project ID and an org ID. Maybe you don't, but I'm pretty certain that I do. I'll go back up to manage projects, and from here I can grab the project ID, which is right here, so I'll grab that and drop it in. The next thing I'll do is go to General and grab my organization ID, and we have that there. These are supposed to be in the .env, of course, so I'll bring them over and clear these out. But I'm not sure why I'm missing my OPENAI_API_KEY, because I thought I just pasted it in... maybe I did not. Let me look at the files... oh, I pasted it into the .gitignore. Well, there we go, that's where it went. I'll cut this whole thing out, bring it over, and drop it into the .env, and then we'll go over here and ignore the .env in the .gitignore, as per usual. And so there are a few things we need
to install. The first will be pip install -q python-dotenv, and then openai, so I'll run that. We'll go to the next line here; I may pull in Streamlit later, as we do have some code for it. I need this stuff here to load in the .env file, so that will load the environment variables, and now we just need some code examples. If we look up the API reference, they have cookbooks down there, though I never go to those. Oh, that's cool, like if we want to do structured outputs they have an example, but some of these are pretty complex, and I'm just looking for something really basic. Maybe we'll just look at the API docs; here we'll go to the quickstart, and yes, it's OPENAI_API_KEY, so I got that right. Here's a very basic example, so we'll copy that, go back over here, and paste it in... wait, that doesn't look like Python, that looks like TypeScript or JavaScript. We want Python, so we'll grab this one instead; I knew right away that was not Python. We'll paste it in, and I actually use GPT-4o mini quite a bit these days, so we'll run this and see if it works, and it does: the example prompt is "write a haiku about recursion in programming", okay, so that's why the output looked a bit odd at first.
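For reference, here's a minimal sketch of the setup from this video: load credentials from the .env, then make a single chat completion call. It assumes python-dotenv and openai are installed and that the .env contains OPENAI_API_KEY (plus, optionally, the project and org IDs):

```python
# A minimal OpenAI quickstart: load .env credentials, then one chat completion.
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # pulls OPENAI_API_KEY (and friends) into the environment

client = OpenAI()  # picks the key up from the environment automatically

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about recursion in programming."}],
)
print(completion.choices[0].message.content)
```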
I do believe OpenAI has a free tier; I'm currently using the paid tier, and I have some credits loaded up. I've found that even if you load just $5 onto these things, it lasts quite a long time, though the credits do lapse after a year, so make sure you utilize them, as losing them can be very unfortunate. But that will be our OpenAI example, and we'll call this one done and dusted, okay. So we have now utilized OpenAI, but I do want to point out that OpenAI has a really good playground. These videos are meant to be playground videos, but I've really been focused mostly on working with the APIs programmatically. If we go over to the playground, we have the chat, we have a bunch of options; there's not much to talk about, but it is another way we can work with this stuff, and we might come back to these environments later on and play around with them, okay. Hey, this is Andrew Brown, and in this
video, what I want to explore is the Anthropic workbench, as that is another way to work with Anthropic. A lot of people might just go ahead and try Claude from here, and I actually do have a paid account, which I got just for this boot camp, as usually I'm sticking with ChatGPT these days. But there is another portal somewhere here: if we go over to API and then "start building", that brings us to console.anthropic.com and the workbench. Sometimes these are called workbenches, sometimes playgrounds; I believe Anthropic calls theirs a workbench, and we're now in here, yeah, we see Workbench up here. I guess the question is: can we utilize this on the free tier? I think we have to buy credits to use it, but we can find out by working with it directly. The free plan is for evaluating capabilities before commercial use, so it sounds like we can use this for free initially. I'll set up my key here, and let's see how we can programmatically work with Anthropic's Claude. So here we have a key. I'm going to go over to our GitHub repo, go down to GenAI Essentials, and open our code base in GitHub. I'll delete this old codespace; we'll use GitHub Codespaces here today. It doesn't really matter what you use, as this is all serverless, but I'm going to use GitHub Codespaces, so we'll give that a moment to launch. All
right, so we have launched this up. We've got color themes, and I want to change this to a darker theme so I can see what I'm doing; there we go. We do have a folder for Amazon Bedrock over here, which works with an API, but I'm going to make a new one; we'll call this one "anthropic". In here I'm going to make a new file, and I guess the question is: should it be a script or a notebook? I kind of like working in notebooks, so maybe we'll make this a notebook. I'll name it basic.ipynb, which I believe is how we make the notebook, and we'll start adding code. I'm going to use whatever the default kernel is, so we'll install and enable that so we can get going; give it just a moment. All right, that's now installed; we'll choose whatever the default environment is. Now, we have our API key. I'm not ready to use it just yet, but what we can do is put it into our .env file, as I'm getting pretty good at loading external environment variables, and I'll save that temporarily. I'm going to make a new .gitignore, and in there we're just going to ignore the .env, as I don't want to commit that to my repo. I know we'll want to install (whoops, sometimes I type a few stray characters by accident) python-dotenv, so that's one thing we'll absolutely want. I also need to figure out what the Anthropic library is called; I do not know what it is. So if we go back over to the workbench... I know they had that beginner material they were showing us, but I want to get some code, so we'll go over here, and this looks like a good
start. It's interesting: they have other implementations, one for Vertex AI, which is cool, and also the Bedrock implementation. We'll go over to Python, as that's the version I'll mostly be utilizing, Python everywhere, all the time, even though my favorite language is Ruby. We'll paste in this code example, and it actually wants to load the environment variable, which is perfect, so I'll copy this over and set the value as such. I'm also going to make a .env.example, so that when I commit this, other folks who run it can follow along with their own key. We'll go back over here, and this is the default variable name that gets loaded in, so I think we don't even have to set the key explicitly, because it will get pulled in, but we will have to load python-dotenv: from dotenv import load_dotenv, then load_dotenv(). It's not autocompleting, which is fine, but I'll go look at another example where I know I used this... here we go. I just want to make sure, without a doubt, this gets imported correctly. There we go. The library we need to install here is anthropic; I mean, that kind of makes sense. Let's see if we can get it installed; I probably could have added -q there to make our lives cleaner. There we go, and now we'll do our import, and the code is pinned to a very specific Claude model.
We'll run that, and it says it's a deprecated model, and then: "it looks like your credit balance is too low to access the Anthropic API; please go to billing", etc. I mean, we just want to evaluate it, right? So maybe this model in particular is not something we can utilize right away. What if we went to an older one, like Haiku? I know I'm out of credits, but I thought I could at least evaluate it; maybe I can't. Let's take a look: does the Anthropic workbench have a free tier? If it doesn't, that's totally fine. "However, please note that the API is subject to commercial terms of service." Okay, so which model can we use for free? It's not exactly telling us which one it is. I was hoping we could go in here and find a very specific model; there are some older models here, so maybe if we choose this older one... there's no easy way to copy its name, but I'll hit "try it"... no, that's not really helping me. So maybe I'll right-click and Inspect to get the content; they just don't make it easy to copy here, so I'm using the inspector to grab that value. Maybe if we try this older version of Haiku, we can use it for free; let's find out. I don't know for certain, we're just trying some things out here. So I'm going to go back
over to our example. I'm going to add a model_id variable and paste the name in as such; I want to keep both of them around so I can quickly switch between the two. To me it'd be crazy if they even had a free tier, just because they're always exhausted in terms of resources, but we'll set the model_id, comment out the top one, and try running this again. It still fails, so clearly we need a newer model. Let's go grab a newer one next; actually, I can probably just grab it from my inspector window, I'm not fully tapped out yet. I just want to see, maybe it has to be the latest model and that's why it's still not working. We'll go back over here and paste this one in instead... I don't think anything changed. So yeah, I think I'm just going to have to buy some credits, and that's not a big deal, but I was hoping I could show you something with the free tier. At least you now know that you're not really going to be able to easily get a free tier. Let's go over to settings, then billing; I'm going to load myself up with some credits, so I'll complete setup and get going, okay. And I'm just showing you a little bit here; I just clicked on this, so
I'm just showing you a bit of the step before I fill in some of my personal information. I guess we're a small business, we are based in Canada, and it's for both, I suppose. What tasks will Claude be used for? Educational purposes. Will Claude be used to provide legal or medical advice? No. Will you be incorporating the API into any products or services intended for users under the age of 18? No. We'll hit continue, and now this is where I'll have to hide my screen, so I'll be back in just a moment, okay. All right, so I set up billing, and for now I'm going to start with $5, as that's a small amount. "Auto-reload credits when the balance reaches a certain threshold": I'm not going to do that today, but I'll go ahead and complete the purchase. In theory, the purchase should be completed and I should have those $5... I'll give it a moment... there it is. So now I should be able to use the API. Let's go back over and run this; it should pick up the key. It says "at least one message is required", so now it's not an issue of whether it works, it's just that we don't have a message filled in. I'll go up here and put in messages, and also bump up the font a little, it's a bit small, okay. We'll bring that down, and I'm going to go
ahead and bring this here. The next thing I need is an example of actual messages, so we'll search for the Anthropic Messages API. Now, I honestly find the easiest way to use Anthropic is via Amazon Bedrock, but use whatever you need to use. We'll say Python here; I'm just trying to find that example... yeah, so it's role and content, which is very straightforward. I'll copy that from wherever this is and paste it in; I'll fix it up a little so it's less of a mess, okay. We'll leave it as is and see what happens, and we'll zoom out a little so we can see what we're doing. "At least one message is required"... oh, you know what, we didn't actually replace this value. We'll give that a run, and there we go, we got a response back, very straightforward.
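For reference, here's a minimal sketch of that Messages API call, assuming the anthropic package is installed and ANTHROPIC_API_KEY is set (for example via the .env we just made); the model name is an example and may differ from whatever is current, so check the docs:

```python
# A minimal Anthropic Messages API sketch: load the key from .env, send one message.
from dotenv import load_dotenv
import anthropic

load_dotenv()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name; verify against the docs
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)
print(message.content[0].text)  # first content block's text
```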
Nothing super exciting here. I know there are other capabilities the Anthropic API has, but I really don't investigate the individual ones, because I generally use Anthropic through Vertex AI or Amazon Bedrock; sometimes, though, they'll have functionality that is only accessible through their API. Let's go over to their API reference; I'm just looking to see if there's anything interesting. We have Messages, which is pretty straightforward; we can stream messages; we can list models; we can do batch messaging, which I would assume is more cost effective, just like OpenAI's. There's the legacy one, which I assume we wouldn't use anymore. There's an Admin API, but I can't imagine that's something we'd ever really want to fiddle with. Then we have Amazon Bedrock and Vertex AI down below. I guess what I was looking for was some of those beta features, because there are always really interesting things Anthropic can do. Actually, down below here it says "ask AI", so: what are some of the beta features via the API I can use with Anthropic? I always like hearing about the really cool things, like being able to control your screen and so on, but they're not clearly listed in the main API reference, so maybe their AI assistant can help us; we'll give it a moment. That's what I was thinking of: the computer use beta, which allows the model to interact with a computer desktop. Here's an example of the code... we don't actually see it... oh, there's the example, there we go. And message batching: here it's saying it's in beta, though I don't think it's in beta anymore; I think it's just part of the API now.
I'm not sure how out of dat it is but yeah I mean that's pretty straightforward we can go back over to the workbench itself um I don't think there's anything exciting here to talk about in the workbench okay yeah yeah yeah so I mean something that the API supports I believe is tool use so I guess if you wanted to test tool use in here that's something that you could do which is kind of interesting so I guess we have an example there um do I want to show tool use right now probably not if
we do something with let's say rag then we could come back to that right now I'm not that interested in showing off uh tool use but yeah there's your your basic starting uh starting place with anthropic um I'm going to go ahead and commit uh this stuff here so just say anthropic example okay and so you can see it's not super complicated the only thing that I did not like and I mean I understand why they did that but they forc you to put in your um Canadian business number um for taxes and I suppose
you're supposed to do that, but I don't love filling out forms; I had to go ask Boo for the business number. I'm going to make sure I get rid of the API key, as I do not want to have issues here, so we'll just delete this key, and now we're in good shape, okay. [Music] Ciao. All right, so we looked at SageMaker and also Azure ML Studio, but let's take a look and see if we can run notebooks in Google Cloud. Obviously we have Google Colab,
but let's see if there's something directly in here; I honestly don't remember. We'll go over to Vertex AI, because if there was one place it would be, it would be there. We could also just type in notebooks in the search here and see what we get, and so we have Workbench. I think that's their managed way of running notebooks... it is, okay, great. Why they call it Workbench I don't know, but it is one way of doing it. I already have an existing project here, so you'd have to create a project; I keep using my free-credits project and I never seem to clean up my projects, one day I will. It's just that managing projects in Google Cloud is kind of a pain compared to something like Azure resource groups. So we've now enabled it (that was really fast) and we'll go ahead and create a new instance. We probably should be paying attention to the cost: over here it's showing us about a dollar, so that's kind of expensive, but maybe we can choose what
we're running underneath. So here we can see we have... I mean, these are environments, and I guess they're just pre-built environments for us; we're not really choosing exactly what we want, and all environments have the latest GPU libraries installed, okay. So we have Debian and Ubuntu. I like Debian, but let's go over here: does this one become cheaper if we switch over to that? No, it does not. I'm going to go back to just Python with Intel MKL. What is Intel MKL? You'd think I'd know, I usually know most of this stuff... oh, Math Kernel Library, okay. But I was hoping I could choose... oh, here it is, okay, so down below we have our configuration. C3 was what we were trying to launch earlier for something, but you see we have a few different options: AMD, Intel Skylake. I want something that's just going to be cost effective, and I only want a single vCPU. Let's choose N1, is that cheaper? I just remember Skylake being cheaper, so I went here, and now we're getting into something more reasonable; this is even cheaper than the ml.t3.medium. So we have a lot of options here, but again, I'm not trying to run something for real; look at how much memory I have, this isn't something that you run a real workload on, it's just me showing you how to launch the notebook. We have us-central1 (Iowa), which is totally fine. I'm going to go down below and
create this; all the other settings are fine, and we'll take a look and see what we get. I'm really happy about that cost though, that's a really good cost. I don't know if it has a free tier, to be honest. Here it says Vertex AI Workbench user-managed notebooks is deprecated and they recommend you migrate to Vertex AI Workbench instances. Oh, is that what we're using right now, instances or user-managed notebooks? JupyterLab 4 is now available in this... well, how do I know what it launched? I don't even know what it launched. Managed notebooks, okay, I'll give this a refresh... I don't see anything, but I definitely am creating one up here, see All Activities. I've got to turn on that MFA; nobody make fun of me for not having it on, I'll get my MFA turned on before the boot camp starts, don't worry. But I'm not sure: is it launching, or what's going on? Executions? No. I hate this, because I can't tell if it's installing right now or not, and I don't have any notification that suggests otherwise, so I'll give this another hard refresh. And so I'm on Instances here, right? I don't mind launching two, I just have to pay attention to it, so I'm going to create a new one, default, default, this is fine. Is this experience a little bit different? Oh, did it take me to a much more... no, not really; I'm not sure why this is so different. Create new... I don't want a T4 GPU, no thank you; I wonder if that's what Colab uses. But here we have an e2-standard for 20 cents an hour. Why was the other one giving me so many options? Here we go, instance. What does user-managed look like? Managed notebook? Well, what did we have just a second ago? Why is the experience completely different? Whatever. I'm going over here on the left-hand side to Instances, as that's what it's recommending us to do. I did create this new one; I don't even know how to get back to that other interface we just had, but that's totally fine. And so here it
says e2-standard. I'm going to go to Advanced Options, as I just want to see if I have a little more flexibility to change my machine type, and I'm going to go over to N1. Is it any cheaper? Oh, it is, okay. But I don't know what happened to my other one there. And this will shut down after idling; I'll set that to 10 minutes. I like that, that's good, so we'll go ahead and do that. That's like the lowest setting I've seen so far, I think. Anyway, I'm not sure what happened to that other notebook, and I thought it would show up in my notifications; I guess not. Still just clicking around here... nope, I don't know what happened to the other one. I'm sure in a month Boo's gonna be like, hey Andrew, you spun up something that has cost. Down here we also have Colab Enterprise, which I'm obviously not going to use, I'm not an enterprise, but I guess they have that as well. We'll stick with Workbench here and just wait for that to become available. So we click into it... we're just waiting for it to start, I guess, okay, so we'll hold on tight. All right, it's up now; I had to refresh, as you sometimes have to do. Let's open up JupyterLab here, okay, we'll just give it a moment... nice little Google Cloud logo, and here we are. The experience is obviously a little bit different; I don't recognize some of these, like the notebook executor, so it looks like they
have some little differences here. We have a nice little notebook template, and some other things, but I'll just stick with regular Python for now. I'm going to go back to the launcher, we'll choose the Python 3 ipykernel, and we'll do our usual pip install of transformers, nothing exciting here, okay.
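As a quick sanity check that the fresh environment actually works (this isn't from the video, just a minimal sketch), you could run a tiny Transformers pipeline in the new notebook:

```python
# %pip install -q transformers torch
from transformers import pipeline

# Downloads a small default model on first run; fine for a smoke test
classifier = pipeline("sentiment-analysis")
print(classifier("Running notebooks in the cloud is pretty convenient"))
```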
All right, I guess we can modify the hardware here too. It kind of reminds me of SageMaker Studio Classic, in that there are a few more little integrations (I actually prefer SageMaker Studio Classic to the new SageMaker Studio). Anyway, we installed it there, and so now we have another environment that we can work in; if you did not know you can do that, now you do. I'm going to go ahead and delete that here today, and so that is another place where we can use notebooks in the cloud. [Music] Okay. All right, in this video I want to take a look at Cohere, so I'm going to go ahead and sign into Cohere. What I really
like about Cohere is that they have a very generous free tier, so you can get started with them very quickly. I've already created an account before, so I'm just connecting here, and now I'm in my account. Obviously we could attach billing and usage here, but I do not plan on doing that today unless I need to. We're going to make our way over to the API keys, and you can see I already have a key generated from before. I think I'm still using this key, so I'm going to leave it around, as I have somebody building something out for our boot camp with it. I'm going to go over to GitHub, to the GenAI Essentials repo, and I might already have an environment running here... I do, so I'm going to open that in the browser. Of course, if you do not, then you will have to launch one in GitHub Codespaces, or use whatever you like here today. I'm going to create a new trial key; this will just be an example key, so we'll generate that key, and now we have our example key, which I think we can look at at any time, so that's totally fine. While this environment spins up: Cohere has its own playground, so you go over there, and we have Chat, Classify, Embed, and Generate (which you do not use anymore). I like their playground because it really does give you a lot of quick examples very quickly on how you
can utilize it. On the right-hand side you can see they have tool use, and they have connectors to other things like web search, which is very nice. You can provide a response format so that you get back structured JSON if you provide a JSON schema. So there's a lot that can be done here, but again I just want to show you how to work with this programmatically very quickly. Our environment is spun up, so I'm going to make a new folder over here and call it claude... or not claude, cohere, I apologize. We'll make a new file called basic.ipynb, I'm going to make a new file called .env, then I'll make a .env.example, and then we'll make another one called .gitignore; in the .gitignore I want to ignore .env, so we'll go ahead and do that if I have yet to do so. Did I commit the stuff from last time? I did. We'll go over here and add some code. I'm going to assume it's pip install cohere, so I'll do a %pip install with -q for cohere, and we'll also want python-dotenv; I'm going to make sure that's a percent-style magic. Then I'm going to copy over the cell that loads the .env, as we want to import the environment variables into our notebook here.
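This .env pattern repeats for every provider in the course, so here is the general shape, as a minimal sketch; the exact environment variable name each SDK reads automatically differs, so treat CO_API_KEY below as a placeholder.

```python
# Cell 1: install dependencies quietly
# %pip install -q cohere python-dotenv

# Cell 2: load secrets from .env so they never get committed to git
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from ./.env into the process environment
api_key = os.environ.get("CO_API_KEY")  # placeholder name; check what your SDK expects
```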
Now I just need some Cohere code. If I go to Chat, I wonder if they have... yep, they have code up here in the top right corner. We'll grab this example, go back over here, and paste it in. I need to grab my API key, so I'll click the copy button, and it looks like it's not named what it's supposed to be, so we could just hardcode a value here. I'm going to paste this in and call it COHERE_API_KEY; I'm not sure if that's what it's supposed to be implicitly called. Sometimes we can look that up, but I don't always care to. In here they're saying this is where you load your key, so I'm going to do os.environ (I'll have to also include import os), and we'll say os.environ.get("COHERE_API_KEY"), so that will bring it over here. Obviously it wants us to have a message here. This is set up for streaming, which is fine; there's nothing wrong with streaming, we can absolutely do that. It looks like we already have some chat history in here, so I'm going to trim this down a little. I'm not sure if this is their older API, because I'm pretty sure they standardized on a newer format, so I'm going to look up the Cohere API, because this code looks a little bit old to me, right?
And if I go to their docs (I just want some examples), the API reference says chat version two, right? So I go here and notice it says messages, so this is really what we want. We can obviously do streaming, but this is actually what we want; I think maybe the code in their portal is a little bit out of date, which is totally fine. It also looks like we have some variants here if we're trying to do different things like documents or tool use; the documentation is really, really good for this platform. I'm just going to replace this with this version, using Command R+, which is totally fine; it already has information for us in here. I don't know what it uses as a default to load the environment variable, so I'm just looking for that here... just a moment, sorry, Boo was asking me for my baster, he's making a turkey, but anyway, I'm back. Somewhere in here there must be an indicator of what the env var for it is, so just look up the Cohere env var, because maybe there is one. The example is calling it that, so maybe I have guessed correctly and it will actually automatically load that one in; I'm just going to assume that is correct. We'll run this, then the next cell, then the next one, and we do get an output... yep, we get a response.
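Pulled together, the working cell looked roughly like this. It's a sketch based on Cohere's v2 chat interface, assuming the key is loaded from .env; the model name is an example, so check Cohere's docs for current ones.

```python
# %pip install -q cohere python-dotenv
import os
import cohere
from dotenv import load_dotenv

load_dotenv()

# ClientV2 is Cohere's newer, messages-style chat interface
co = cohere.ClientV2(api_key=os.environ.get("COHERE_API_KEY"))

res = co.chat(
    model="command-r-plus",  # example model name
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(res.message.content[0].text)
```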
There's a little more work we'd have to do to format this response, but that's as simple as it is to work with the Cohere API, and again it has a generous free tier, which is really nice. I'm going to delete our example key as I do not need it any longer, and we'll commit this with a message like "cohere api". This would be a very good one to utilize if you have no spend; this one is very generous. The other thing is that you can also download their models from Hugging Face, so they have them in another format. I wouldn't call them open source, but they're openly available to utilize locally, so I don't think there are limits on that; you can use them as much as you want, but obviously for commercial use it'd be a different story. So it's nice that you can have a model that you could use serverlessly, or download and utilize as much as you need to, until you need to put it into production. But there you [Music] go. All right, let's take
a look at AI21. AI21 is known for models that have much longer contexts than other ones, so if you need a lot of text, like large documents, that's where this one can come into play, and it can be very useful. Let's see what we can figure out here. I believe they have some kind of playground and API; we have chat with Jamba or build with Jamba... actually, I don't know, we could chat with it, so let's go over here. I guess they also have a chat environment just like any other one, so we'll take a look and say continue, and now we are in Jamba chat. I can just say "hello how are you"; okay, it says "hello I'm doing well how can I assist you today". It would be really interesting to see how this one would perform against other AI-powered assistants, but I don't think this is intended as an AI-powered assistant; this is more just a demonstration of what the model can do. Let's go over to start building, as this is really what I'm interested in. Now we have the playground, so we have chat, conversational RAG, and the RAG engine; I could see how this could be really good for large code bases. Let's go over to View Code, and we have some code here. As per usual, we're going to go over to GitHub, and I'll make my way over to the GenAI Essentials repo and open this in the browser, okay, and we'll just give that a moment.
That is now opening. I've had this environment open from previous videos, but of course there's nothing special we're doing here that is out of the ordinary. It looks like we have JavaScript and curl options, but mostly it's Python, so we'll give AI21 Labs a try here: I'll make a folder called ai21-labs, and we'll make a new file called basic.ipynb, and in here we'll add a couple of new code blocks. I'm going to grab these and bring them over, and I'm going to pull this one out into its own cell; I like to do the imports separately if I can. We're going to want our pip install: I'll do -q and bring in ai21 and also python-dotenv. From our previous examples, I'm just going to split here and go to our Cohere example where we loaded our imports, and bring that down as such; maybe take that one out, I might not even use the import os, but that's totally fine. Let's give that an install. It looks like we're going to need a key, so I'm going to make a new .env file, a .env.example file, and a .gitignore file (I think I spelled that right), and we're going to ignore the .env file. Back to basic.ipynb: I like how it shows us exactly what the key is supposed to be called; not everybody does that, but for this
one it does, which makes my life a lot easier. Clearly we're going to need a key, so let's go back over to AI21 and see if we can find one: Settings, then API Keys. It looks like we already have a key, which is interesting because I don't remember generating one, but we'll regenerate it; maybe they give you one by default. Okay, but where's the key? How do I get the key? "Do not share your key with others or expose it in the browser or client in order to protect security... view the usage of the key." Okay, but how do I use the key, where do I get the key? Oh, you know what, I bet it copied it to my clipboard. Let me see... nope. When I press this it asks "are you sure"; yes, let's rotate it out, okay... oh, it's over here, Copy. That is not easy to see; whoever did the UI on that needs to not do that. So we can load it this way, that's totally fine. It's using Jamba 1.5 Large; I'm not sure what their latest one is, so we'll look up AI21 models, because isn't there a newer one? Let's go over to pricing here... no, I guess we just have Mini and 1.5 Large, which is fine. Let's go back over here; so we have this one here, and we have a bunch of stuff in
here, like documents and tools, which I don't plan on using; we're just testing the message formatting, which is very boring. Also, we're not printing anything out here, so I assume we need to print back a response; I'm going to go ahead and add a print(response). Then we're going to need some messages in here, so I'll say messages. I do not know what format it wants, but let's just assume it wants the usual role and content, so we'll say role user and then content "hello how are you", okay, and we'll go down here and paste it in as such. Now let's see if we can run this stuff: we'll do this one, then this one, then this one... we get an error back. It says "object has no attribute model_dump", and it's talking about Pydantic version two. I don't know if I need to specify exactly what response format I... oh, well, first of all, did we even place this in the messages? We did do that, but I don't plan on using documents or tools, I don't plan on having a stop sequence, and everything else is fine, so let's try this again. "Object has no attribute model_dump": it does not like our messages, so I need to find some example code. We'll go over to the documentation, say Get Started, and see if we can quickly find an example with the API... so far not seeing it, but Python... let's go down here. Okay, so it needs to be in
a very particular UserMessage format. I've seen this with a few APIs; not all of them, but some do this where it's structured, and I guess maybe that's what it was talking about with Pydantic. So let's go back over here... it looks like that's already being... no, it says ChatMessage, but we have UserMessage, so I almost wonder if this stuff is older. I'm going to go ahead and just do this; I'm not sure why there are inconsistencies in the docs, but sometimes these companies do not have time to go and update their docs, which is not a good idea, but it's just what happens. So here we have messages; I'll just copy that, okay, and it's doing some of the formatting for us here. Let's run this... well, first we'll run this, then we'll run this... and now we get a response. Okay, so that is a way we can start working with AI21.
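For reference, the cell that finally worked was shaped roughly like this. This is a sketch assuming the ai21 Python SDK's chat-completions interface and the structured ChatMessage type its docs show; exact class, method, and attribute names may differ between SDK versions, so treat this as an outline.

```python
# %pip install -q ai21 python-dotenv
from dotenv import load_dotenv
load_dotenv()  # expects AI21_API_KEY in .env

from ai21 import AI21Client
from ai21.models.chat import ChatMessage

client = AI21Client()  # picks up AI21_API_KEY from the environment

response = client.chat.completions.create(
    model="jamba-1.5-large",  # example model name from the playground
    messages=[ChatMessage(role="user", content="hello how are you")],
)
print(response.choices[0].message.content)
```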
We obviously haven't covered the advantages of AI21 in depth; this was just to show you how you can start working with it. But as far as I understand, their big pitch is that they have giant context windows, so if you have very large documents, this could be a model that performs very well for that. We'll go ahead and commit our code with a message like "ai21", and we'll consider this one done, and I will see you in the next one, okay. [Music] Ciao. Hey, this is Andrew Brown. In this video I want to give you an introduction
into Amazon Bedrock. Amazon Bedrock is a model-as-a-service offering, or a collection of things you can do with LLMs, and we're going to go through the most basic parts: learning how to start using it, deploying a model, things like that. So I'm going to go over to Amazon Bedrock, and once this is loaded, on the left-hand side we have a lot of options. Now, I've already utilized this within the North Virginia region, so when I go over to the model catalog, which lists all of our possible models, I already have a bunch activated. But if you are doing this for the first time, you've got to go all the way down to the bottom, go to Model Access, and modify your model access. You might see an option here that says "grant me all model access", and I'll show you what I mean if I switch over to a region that I know I have not done this in, like Europe (Ireland)... and I don't have it, yeah, there we go, it says enable all model access. So that's what you want to do; just go ahead and proceed with all of it. Some options might require you to fill in some information, and some models may take time to get access to, so it's best for you to do this as soon as you can. Anyway, you can see I have most model access here, and if I wanted to modify my model access I could just checkbox things here, but I try to get access to as
much as I can, just because it makes my life easy, and basically everybody that uses Amazon Bedrock does this. It looks like there have been some changes here; I like how they now say fine-tuning distribution, because custom models was not clear before, which is fine. Anyway, I have a lot of models here, and if I want to start working with them I can work in the playground. The playground, also known as a workbench depending on which provider you use, is a place where you can start working with these models. Here I can go ahead and select a model, but before I do that, notice I have two options here: Chat and Single Prompt. If you're using something like Azure AI Studio they'll be called chat and chat completion; on Amazon Bedrock it is called Chat and Single Prompt. When we want to have a conversation we'll use Chat, and when we just want a single-turn response we'll use the other one. In here on the left-hand side we have a bunch of options: AI21 Labs, Amazon, Anthropic, Cohere, Meta, Mistral. There are some other models that should be here that I don't see in the categories; maybe we'd have to deploy them and then infer with them another way. But here are all the new models. I think the Amazon models are under the free tier, but honestly a lot of these things cost pennies, so let's take a look: search Amazon Bedrock pricing, okay, and go down to the generative AI pricing. You can see a lot of these
models are like $0.002 or $0.008, and this is priced per 1,000 tokens, so you need to use quite a bit before you are seeing super spend. You do have to have some spend, but again, I think the Amazon models might have some free tier; where that is I don't know, and it would probably be on Titan Text Express. So just understand it's very hard to get around those costs, and even if you had credits from AWS there's no way to get around this stuff, so just understand that you are going to have some spend on AWS, and probably on Azure and GCP too, when using model-as-a-service. Notice that the billing is on inputs and outputs, which is a little bit different from other providers. I know there are other models in here that I couldn't select from; I know IBM has Granite, and I believe that would have to be deployed from the Marketplace. But for now we'll go with the things we can use. So I'm going to select a model: I'll go to Amazon and we're going to use Nova Lite. Notice we have the option On-Demand; if we click around we might see more than one option, okay. So we have inference here, and before, what would happen is you'd have On-Demand and Provisioned, and I'm not sure why I can't find that today. Maybe it's under Meta? Yeah, I don't know; there used to be two options here and I'm not sure why it's only showing On-Demand. But On-Demand
means you deploy it and then you consume; whatever you consume is what you spend. There is another mode called Provisioned, but I'm not sure why it's not showing up here today; that's totally fine. So I'm going to choose Nova Lite and we'll hit Apply (sorry for being all over the place), and so we have our Amazon Bedrock chat experience here. I can go ahead and just say "hello, how are you" if we want to start working with this. Okay: "hello I'm doing well thank you for asking how can I assist you today if you have any questions or need information on a particular topic feel free to let me know". Pretty straightforward. This is not the same thing as an AI-powered assistant: an AI-powered assistant doesn't give you all these parameters and dialogs you can work with, and it's also going to have a lot more going on in the background. You could build your own AI-powered assistant within Amazon Bedrock, where it's under Agents, so you could create it there, and you can see I already have one created from before. But yeah, this one is pretty straightforward. You should know how to programmatically work with Amazon Bedrock, and that's something I think we should spend a little bit of time doing, so we're going to need a notebook. Let's make our way over to SageMaker, okay. I showed you in another video how to use SageMaker; they renamed it to SageMaker AI, obviously wanting to put AI in there to let people know what it is.
So if I go over to Getting Started here, or sorry, Studio, you should already have one set up from the time we were going through notebooks; if you don't know how to do that, go take a look at how to do that. SageMaker Studio is opening here. I'm going to go over to JupyterLab and we're going to create a new Jupyter space; I'll say "my space", and we're going to go ahead and create that space. This is an ml.t3.medium, which I believe costs a nickel an hour, so if you can't afford a nickel, don't run this. You could also use Amazon Bedrock locally, it's just an API, so you could forgo this cost if you want. But again, I always try to use cloud IDEs, and I'm trying to use the one that is specific to the provider, so if you want to go use Gitpod or GitHub Codespaces on your free credits, you absolutely can, but I'm running it here today; just choose whichever notebook environment you want to utilize. So this is just starting to get spun up, so I'm just waiting, okay, and I'm going to pause here and wait till the space becomes ready. All right, our environment is ready. I'm going to go ahead and launch Open JupyterLab. Because we are again using Bedrock, everything is being taken care of for us; we're not downloading a model, so a small amount of storage is easy for us, and we do not need any kind of powerful compute whatsoever. I'm going to go here on the left-hand side and create a
new notebook, and I'm going to rename this notebook... what will I rename it to? Just "basic". Let's see if we can make some code here. Now, we do have Amazon Q Developer here, or Amazon Q as they call it, and if I have access to this I might ask it to write me some code: give me an example of using Amazon Bedrock via the Python SDK. It just happens to be here, so I'm going to give it a go... "you're not subscribed to Amazon Q Developer, please request access". Not going to do that, so that's totally fine; I will just go look it up manually. Say Amazon Bedrock API, and if we go over here to the API reference... honestly, I want Boto3, to be honest: Amazon Bedrock Boto3. If we go to the documentation we have a bunch of clients, so we type in Bedrock here: we have Bedrock, Bedrock Runtime, and Agents for Bedrock. Agents for Bedrock is for Amazon Bedrock Agents, Bedrock is for infrastructure on Bedrock, and Bedrock Runtime is for actually working with models, so this is the one that we want. They have InvokeModel and the newer one called Converse; I'm going to use Converse here today because that one's a lot better, and so we're going to start working with this one. I'm going to try to use my memory to do this: we're going to say import boto3 (it would be funny if I could just do this from memory), and I need to
initialize a Bedrock Runtime client. So I'm going to try client = ... again, just trying to do this from memory, like that, and... yeah, it doesn't like that. I was hoping I could just do that from memory, but I guess not exactly, so I'll go back a step... yeah, I was just missing this, okay. I try my best to do things from memory because it's just better long-term for me. We'll go back over here and grab response = client.converse(...), and it has some options: it needs the model ID, okay. See, you don't need ChatGPT when you've got Andrew to help you here. So we're going to go over, not to SageMaker Studio, but back to Bedrock, because we need a model ID, right? If I want to use something here, I'm going to go to the model catalog, which shows there are like 46 serverless models, which is great, but we're going to stick with the Amazon models here for today. I'm going with Nova Lite, so I'll click into that... oh, it's multimodal, that's pretty sweet. But maybe all I want is actually text, so I might change it; I'm going to go back to just Amazon here and actually pick Nova Micro, as all I want here is text, to be honest. The model I'm looking for is this model ID, so I'm going to grab that (it's actually up at the top as well, which is a little bit easier) and we're going to paste that in. I'll bump up the font
here a little bit, I know it's very, very small. We need our standard messages in here, and that is pretty straightforward: the role is either user or assistant, which is pretty much standardized in the Bedrock Converse API, which is one reason why I like it. And so here is one message, and this is going to be the user; apparently you always have to start with the user first. And I think it'd be content... I'm just guessing here, okay, I've got to go back over here... yeah, we have content, and then we have to specify what kind it is within it. So the content is really going to be this here, and the reason we have to do this is because some models take in text and other ones don't; you see text, image, document, tool use, so what you're passing in is going to vary. We're just passing in text, this is a text-only model, so this is what we have to do. There's probably a shorthand for this so we don't have to write as much, but I'm just going to go with the full form here, it's totally fine. I'm going to wrap that as well here, and we still need this to close. The only thing I don't have here is, let's say, the system prompt, which is down below here, so that's going to be system, then text. Okay, so go ahead and do this; this one is similar, but it's just a text string, so this will be this, okay. [Music] The system prompt: "you
are correcting my English grammar and spelling mistakes", okay. So I've prompted it, told it what to do. There is a little highlighting issue here, so it does not like something, I'm just trying to... oh, you know what, it's because I have a single quote within the middle of there. There we go. And so this should be the most basic thing that we need to set up, so I'm going to go ahead here and just write something: let's say "hello whats the weather like around here", okay. I'm not sure if it will be able to do this, but we will give it a try; I've never used Micro, I don't know how intelligent it is. It doesn't like the spaces here, so I'll just take those out; it's not really a problem, it's just spaces. And "positional argument follows keyword argument"... I'm not sure what it's trying to say there, but I'm just going to bring this onto its own line, trying to get rid of all the clutter here. Yeah, this one has... oh, that's an equals, okay, let's do that instead. There we go, and that looks good to me, so I'm going to go ahead and run that... and we have an error. We'll go down: invalid type for parameter system, it got a dictionary but it's expecting a list or tuple. So I'm going to go back over here... oh, and it wants an array of them, okay, that's fair. I mean, I don't know why it would only ever take one, but we'll go ahead and do that. And so now I'm going to go ahead and print the response; let's see what we
got. We have a bit here... there's a little too much going on, so I can't really see what I'm doing, but we do have output here. So I'm going to go ahead and try this (I'm really just taking a guess here)... there we go, getting closer: we have the message, okay, and then I want the content. I know it's very verbose, but once you get your head wrapped around it, it's not so bad. I'm going to assume there's a zero index in there, since a lot of times these are in arrays... and sure enough: "here is the corrected version of your sentence: hello, what's the weather like around here? Here are the changes I made." Okay, so there you go, that is the most basic example of using Amazon Bedrock. There are a lot of features and a lot of different API surface, but this will get you the most basic stuff you need to know, so I'm going to call this done.
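Assembled into one cell, the finished code looked roughly like this. It's a sketch of the Converse call, assuming you run it somewhere AWS credentials are already configured; the Nova model ID is an example, so copy the exact one (possibly region-prefixed) from your own model catalog.

```python
import boto3

# Bedrock Runtime is the client for invoking models
# (the plain "bedrock" client is for management operations)
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-micro-v1:0",  # example ID; copy yours from the console
    messages=[
        {"role": "user", "content": [{"text": "hello whats the weather like around here"}]}
    ],
    system=[  # note: a list, even for a single system prompt
        {"text": "You are correcting my English grammar and spelling mistakes"}
    ],
)

# The reply is nested: output -> message -> content (a list) -> text
print(response["output"]["message"]["content"][0]["text"])
```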
I guess I could download this code; let me just think about this for a second. Yeah, I don't really have a repo yet, but I'll go ahead and make one just in case people want it. So I'm going to go to ExamPro here (obviously in other videos it'll look like it already existed), and I'm going to make the repo quickly. So I just created a repo called gen-essentials, and I'm just going to start placing these files somewhere: go ahead and download this... download, download, download... here it is, okay, great. And I'm going to make a new folder here; we're looking at models-as-a-service, but I'm just going to say bedrock, and that's what it'll be called... actually, we'll call it amazon-bedrock, I suppose. And I'm going to bring in that file here, so if you for whatever reason want to run the exact same one, there you go. Okay, I'll go ahead and commit that, and before we get out of here I just want to make sure I stop that workspace. I'm going to go ahead and stop it, and once it stops you can go ahead and delete it. There we go, and I'll
see you in the next one, okay. [Music] Ciao. All right, let's take a look at Google AI Studio, which is another way that we can work with AI models; obviously you could use Google Cloud through Vertex AI Studio, but this again is another way to do it. Let's go ahead and develop ourselves a new prompt. I'm going to go ahead and just agree to the terms, and apparently I am logged into an account here. Now I wonder if this is tied to... if I go to plan information, is this
tied to my Google Cloud account? Let me take a look here... "you can upgrade to paid billing in your Google Cloud project associated with this API key". So basically you would still have to have Google Cloud and associate it with it, but I already have a Google Cloud account, so I should be fine. I'm going to go ahead and get an API key here and create an API key. I don't have an associated project, so I'm assuming that I will just start to be able to use this, and if I want to go beyond the trial then I will have more access to it. I'm going to copy that key for now; we're going to make our way over to GitHub, and we're going to go over to the GenAI Essentials repo, as that is what we keep on using here today. I actually still have a workspace running, so I'll go ahead and open that, and while I'm waiting I'm just going to paste the key over here so that I have it somewhere, okay. We'll go over to the API quick start guide; you can see that they've got a thing here that you can work with, but we want to get some example code working here. Oh, did I just close out my key? I totally did, but I still probably have it on my clipboard, so that's probably fine; I'll just be patient here and wait for this to load so that I don't lose my key. There you go. So I'm going to go over here and we will
make a new folder here: Google [Music] AI... what's it called? google-ai-studio. And I'm going to go ahead here and make a new file, which will be basic.ipynb, okay. I'm going to make a new file here, .env, and I'm going to paste in that key, and we'll go ahead and make another one here, .env.example. I'm also going to make a .gitignore, and as per usual we are going to ignore the .env file. I'm going to go over to another basic example, like the OpenAI one, and grab the part that loads in our environment variables. Let's go back over here and see what information we have: here we have the pip install for google-generativeai, so we'll bring that in. I feel like this is very similar to something we did before, but it's totally fine, because we're just coming at it from another interface. So we have python-dotenv, and we'll go ahead and give this an install... oh, you know what, we need to put the percent sign in front of
it, otherwise it's not going to work, so we'll try this again. I don't know if we need the -U to be there, to be honest, but we are bringing it in. We'll bring some code down here and take a look and see how we can get started. So here is an example of making our first request; I'm going to paste it in here. We have our API key, and I'm going to load that in from the environment variables, so I'll also import os here, and I'm just going to call os.environ.get("GOOGLE_API_KEY"); then we'll go back over to our .env, where this will be GOOGLE_API_KEY. So now I can go back here... I actually have to run this twice, I keep forgetting to do that in the correct order. And this should load it, bring in Gemini 1.5 Flash, and yeah, the API looks very straightforward. If it works, that's really awesome... it's saying "no progress found", which has nothing to do with this, it's totally fine... and we are getting a response back. So you can see that the Google API is really, really straightforward; I guess that's all there really is to it, very straightforward.
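The whole thing is small enough to show in one piece. Here's a sketch using the google-generativeai package, assuming GOOGLE_API_KEY is set in your .env; the model name is the one I picked on screen, but any current Gemini model should work.

```python
# %pip install -q google-generativeai python-dotenv
import os
import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Hello, how are you?")
print(response.text)
```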
At least you know where the Google AI Studio playground is. It would take a little bit more work to work with Google Cloud and link it over, but what's interesting is that you can obviously bring in files from your Google Drive, and any of the functionality that you want to work with, you can work with programmatically, so that's very interesting. But I will see you in the next one. [Music] Ciao. Hey everyone, it's Andrew Brown. In this video I want to take a look at the Vertex AI Model Garden offering, so we can start working with models very easily. I already have a project set up here (I just throw everything into my free-credits project), and I promised that I would turn on MFA; I think I'm just going to go ahead and do that very quickly, so just give me a
moment, okay. All right, so I've turned on MFA so no one can look poorly upon me anymore. I'm working in an existing project... maybe I should make a new project; let me get rid of a few of these first, just give me a moment, okay. I just want to Manage Resources: if you've never had to delete a project, it's pretty straightforward, you just go ahead and delete it. So I'm going to go ahead and just quickly delete as many projects as I can... okay, there we go, that was some good cleanup. I'm going to go ahead and create a new project; this will be just gen-essentials. I'll go ahead and create that project, just because if we run into any specific issues with activating stuff, I want to show you that. It's really easy, once you have a Google project, just to work within it, but always starting from scratch is kind of a pain. I want to make sure that project is selected, so I hit Select Project, and so now I'm under gen-essentials. Let's make our way over to Vertex AI, as this is where we're going to be able to do the stuff that we want to do, okay. I'm just going to be honest with you: I've not been using Vertex AI that much as of late, but I'm sure we'll figure it out very quickly. "Get started with Vertex AI... empowers machine learning... enable all recommended APIs". Let's give that a go. It would be nice if it told us what it's enabling, but I'm going to press that and we'll see what happens; it might tell us up here in
the top right corner. So yeah: Compute, Dataflow, Notebooks. Those are the three key ones that we want, and that's what I was wondering about; it's like, when we start trying to use a model, do we have compute? So this is Vertex AI, but then we have Vertex AI Studio down below here. We obviously want to work with a notebook, so we'll definitely do that, but this is kind of where the GenAI stuff lives, down here below. So we have open free-form, for non-chat tasks like classification and extraction (remember, over on AWS they called it single prompt or something like that, and over at Azure I think it's called chat completion), so they call it free-form, and then we have a chat. Why everyone names it differently I don't know; they like to keep us on our toes. And we obviously have a bunch in the Prompt Gallery, but we go over to here, and it's a very similar interface to the other ones. The thing is, we probably want to utilize some models, but we haven't really had to deploy anything, and I'm not sure if we even have to deploy anything in here in the same way. But let's say we wanted to use something; let's go take a look here. I like how we just have Hugging Face models here, and a bunch of them, but I'm just looking for all the foundation models... here we go, this is what I want to see. And I'm just looking for something that I recognize, like Haiku, because this is something that you normally would have to accept permissions to gain access
to. "Page not viewable for organizations; to view this page, select a project." Okay, and so you have to enable it. So if we want to use it, we'd have to enable it first; that's what I was expecting, right? It's something similar to model access, but we're doing this at the individual model level. But let's go back over to our chat here, and what I want to do is see if I can select Anthropic here; what will happen if I do this? And so it says to use it you have to go here and enable it, okay. So if we want to use it, I'd have to go over to here, and then we'll give it a moment, and then we'd have to enable it, right? And it asks for some information, so we have a lot of stuff here; I'm going to fill it in, just a moment. All right, so nothing super sensitive here. "Do any of your use cases have additional requirements?" No... nope, we're not doing anything crazy here. Let's go ahead and hit Next, just to show you what this experience is like. I like that it shows us the pricing right away, so 1 million tokens is about a dollar-something here; we'll go ahead and say Agree, so that we know we're agreeing to those costs. And it says that we purchased it; we didn't really purchase anything, we just enabled it. I'm not sure if it's enabled; I'm going to give this a hard refresh, as sometimes the UI is not as clear as it could be.
Okay, so we have that there, and I like how we can open this right away in a notebook. Let's take a look at some View Code here... I mean, that's fine. I'm curious to click this, but this is going to open up in Colab Enterprise; I don't want to open it in Enterprise... oh, I did not want to open it in Enterprise. I mean, it's fine; let's go take a look here: Colab Enterprise pricing, what's the cost? Oh, okay, it's pretty reasonable. So then what's Colab Enterprise versus Workbench, why do we have two then? Give me a second... okay, so this one's saying developer-focused and this one is more collaborative; this one's serverless, that one's user-controlled; limited versus extensive; automatic versus configurable. Okay, well, I don't mind that, I suppose that's fine; I may have to make another video on it. So anyway, I guess we have an example here, and if this example works we can just go ahead and utilize it: Claude on Vertex AI. So we'll scroll on down... we have httpx, Google Colab... I'm just reading it to see what it does... select the model... that's a mess of code. [Music] What I'm looking for here is what the code looks like; they must have an SDK... it's using Anthropic's Vertex SDK, okay. Huh, that's interesting. So I'm not sure about this, so I'll go ahead and delete this and confirm, and I'm going to go over here: now I have no notebooks running, which is totally fine. So yeah, I'm not 100% sure about that.
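For the curious, the Anthropic-on-Vertex pattern that notebook was gesturing at looks roughly like this. It's a sketch assuming the anthropic package's AnthropicVertex client, your own GCP project ID, and a region where the Claude model you enabled is actually available; model names and regions change, so verify them against the Model Garden card.

```python
# pip install "anthropic[vertex]"
from anthropic import AnthropicVertex

# Auth comes from your gcloud application-default credentials
client = AnthropicVertex(project_id="gen-essentials", region="us-east5")

message = client.messages.create(
    model="claude-3-5-haiku@20241022",  # example Vertex model name; verify in Model Garden
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(message.content[0].text)
```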
That's not exactly what I wanted, but I want to see how we can programmatically work with this stuff. I mean, the chat is going to be very straightforward: if we go here and we use Gemini Flash, which is a super, super fast one, we go down below and just say "hello how are you"; you get the idea of what it can do, it's going to talk back to us. But I'm looking for some code, and up here it says Get Code, so let's go click that. Okay, so this looks good, and this says open a notebook... yeah, so this is more what I want, so let's go ahead and click that, okay. If this opens in Colab Enterprise, that's totally fine; Colab Enterprise, I guess, is totally okay to use. Okay, so it says install the Vertex AI SDK, open a terminal window, and enter the command below... well, why don't we just run it here? Okay, we'll go ahead and give this access, and give that a moment to install or do whatever it wants to do. There we go, and now it is starting; I think it was just starting the compute underneath, and that's why it was so slow. It said this is going to run for 17 hours; I assume we can stop it somehow. If we go to Runtimes... I'm not sure... oh no, no, no, I don't want to do that, I was hoping we could just see the runtime, but I'm just going to let it do what it wants to do here. So it seems like it's almost done, so we'll just wait here a
little bit more. There we go: "the following packages were previously imported in this runtime; you must restart the runtime in order to use the newly installed packages". So we'll say yes, let's go ahead and do that. You know, this isn't my favorite experience; I think I like Workbench because we get Jupyter notebooks, which I think is what we were utilizing there, but this is fine. Did this restart? I cannot tell if it did; I'm going to assume that it did. Let's go ahead and look at the next code block: we have vertexai generative models, so we're bringing that in (this kind of looks like Hugging Face to me, the way they're doing this). And so we're initializing with our gen-essentials project, which is what we're in right now, in us-central1, which is fine; we're specifying the model here, we are going to start a chat, and we're going to send a message. It has a generation configuration somewhere... here we have safety settings. I don't feel like I'm setting them here, but... oh, it's down below, okay, so I guess you can set them there and they get pulled up into here. That's interesting; I thought that was only hoisting, which is what we call it in JavaScript, where you can define it later but it'll work earlier; I didn't think Python could do that. We have max output tokens, temperature, stuff like that, and then the safety settings; I'd rather have those above, it'd be less confusing for me. The safety settings are basically your guardrails, so we say no hate speech, no dangerous content, nothing explicit, stuff like that, and that's very typical of Google, putting that front and center.
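Condensed, the generated notebook boils down to something like this. It's a sketch using the vertexai SDK, assuming your project ID and region; I've trimmed the generation-config and safety-settings dictionaries for brevity, since the generated notebook spells them out in full.

```python
# pip install google-cloud-aiplatform
import vertexai
from vertexai.generative_models import GenerativeModel

# Uses your gcloud application-default credentials
vertexai.init(project="gen-essentials", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")  # example model name
chat = model.start_chat()

response = chat.send_message(
    "Hello, how are you?",
    generation_config={"max_output_tokens": 1024, "temperature": 1.0},
)
print(response.text)
```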
I mean, that's not a bad idea. Let's go ahead and run that and see what happens... okay, and we get an output, so there you go, pretty straightforward. Obviously there are a bunch of other services in here; if you were using AWS these would all be isolated services, but Google collects them under here for image and other kinds of generation. Let's go over to our Colab Enterprise here, okay. I'm going to go ahead and just delete this; yes, I want to delete it. I didn't bother putting it in our repo, just because the code is so readily available. Now that's deleted, I'm going to go over to Runtimes; you can see this is the runtime that's running right now, so I want to go ahead and delete it and confirm. I'm not sure if you can just stop a runtime, so I'm going to go ahead and do that, and that way we're not going to have continuous cost. But yeah, there you go, that is using Google Cloud
Vertex AI [Music] Studio. All right, let's take a look here at using models-as-a-service within Azure AI Studio. Now, I know that there is ai.azure.com, which is the new Azure AI Foundry, but it's weird, because it's called Azure AI Studio, they renamed it Azure AI Foundry, but it's still not called that in the main one, and that's kind of annoying. And so you can go through here, or you can go through here, and which it should be, I really don't know.
I suppose we could explore it through this way; the other way is also easy as well. But right away it's asking you to go create a project, and what I'm going to do is make my way first over to Azure AI Studio. I just want to make sure I get rid of anything that I might have here, so we can kind of experience what that other page might look like if we didn't have anything, because I think that's the first thing. So I'm going to go in here and just tear down a few things, like this resource group; you obviously wouldn't have this, I'm just trying to get it reset back to zero. We'll delete that, and same thing with this one; I'm going to go into this one here, give it a moment, and it's within the same resource group, so just give me a moment for that to tear down, okay. All right, so that should be gone now, there we go. So I want to go back over to Azure AI Foundry, which is ai.azure.com, and I just want to
see if it makes us create anything... no, it doesn't, okay, great. So I'm going to go ahead and create a project here, and I don't mind whatever the name is, we'll stick with the standard, but what I find is that the location greatly matters. Notice that it's going to create a hub anyway; another way to create it would be through Azure AI Studio here, and if you create a project there, we probably create the hub and the project at the same time. So here it's creating the project; I'm going to go back here, and if we create the hub... okay, it's a little bit different, but the way it used to work was you make a hub and then a project. I don't know, they're making it really complicated here, but this is fine, I'm going to go ahead and create this new one here. But where the project exists really affects what models are available to you, so you might have to create multiple projects to get to other models. So we're going to go ahead and hit Create there, which I did, and now I'm just waiting for this to create, so we'll just wait a moment. Okay, there we go. So we have a bit of UI here; I don't need the intro information to explore, but there are a couple of resources that get created for us: one is Azure AI Services and then Azure OpenAI. Azure OpenAI is going to get us access to the OpenAI models; Azure AI Services is going to let us interact with the other things on the left-hand side here. And I guess we have an inference endpoint, which is
what we need to utilize to access things, so it did spin up a lot of things here. On the left-hand side we have a bunch of interfaces, and it's a little bit different from the last time I saw it, but where we are looking for stuff is within the model catalog. I have activated models prior to this, but when you are first doing this you will have to go open this up and request model access. So I'm opening this up here, and it's been a while, so I'm going to see if it tries to ask for model access... and I already have it. So you will have to click something, fill in a form, and wait a while (it could take a couple of days), and then you'll be able to deploy models. So if we want to start working with things, we're going to go over to Code... oh, this is different, templates and tutorials. No, no, that's not what I want; I want the Playground, okay. And they changed the interface again here, but I'll say Try Chat Playground, and so now what we'll have to do is create a deployment, and so we need to deploy a model. So we have a bunch of models here; I'm going to go with GPT-4o mini. Notice that some say chat completion and then other ones just say completion. I think I said this before: on AWS it's called chat and single prompt; on Azure I think I said it was chat completion and chat, but it's actually chat completion and completion; and then when you're on Google it is chat and some
weird name I can't remember, but it's not normal... free-form, they call it free-form. I don't know why the naming is so bizarre. But anyway, I'm going to go ahead and choose o1-mini. It really depends on what you choose here, because some of them have restrictions on how they're deployed, and so it's not showing me much here; I'm not sure why this is slightly different. This is for chat completion, so maybe I'll try GPT-4o... okay, I can deploy this no problem, and we should choose our deployment options, which are now up here, I guess. So, managed compute and serverless API; you used to have to select between the two. And so I'm looking here, I'm saying, okay, where are my options? They've changed this, so I don't trust this; I'm going to go back over here, in the top left corner, back out of o1-mini, and just choose the regular GPT-4o, okay. If I click Deploy, what do I get? Now I get the normal UI I'm used to. So here I have more options, and Azure has the most options out of all providers in terms of deployment options; I don't know if that's a good or bad thing, but they just do. So if we open this here in a new tab, it should tell us what these all are, okay. And you'd think that Global would be more expensive than Standard, but that's not always the case. Okay, so here these are pay-per-token: this one, this one, this one. And so I'm just trying to think here...
Global standards probably pretty good uh batch is something that is offline so the idea is that you're not going to be to infer with it right away so global standard is what I want here today I'm not going to go into all these in great detail I have a course specifically for Azure AI that if people are interested they go take um so now what I can do is go over to the playground okay we can well I don't need to create a deployment hold on here let me just go ahead and refresh this page
because we've already deployed a model. By the way, that model is serverless, so if we're not using it, it isn't costing us anything right now. It's saying no deployment exists; I'm pretty sure I deployed something, and there used to be a tab here for deployments, so where did they put it? Maybe it's "model endpoints" down here. Here it is: the model is deployed, so I'll open it in the playground, and we get a similar interface to the other providers. Let's just say "hello, how are you doing," hit Run, and run again. It's not submitting for some weird reason; Azure is known for being a bit buggy, to be honest, especially the UI, so that's not working as I hoped. I'll try this again; now we see our deployment, and there we go, it responds. Okay, can we get some code? That would be really
nice. There's probably a code example somewhere here; View Code, and yes, here we have one. I don't like this particular example because it isn't using the OpenAI SDK, which is what I'd rather do here. For OpenAI models you can literally use the OpenAI SDK to talk to Azure's deployments, but for other models, like Claude or Haiku, I'm not exactly sure; does Azure have Claude in the catalog? I believe they do, so maybe you'd have to go through Prompt Flow or the Azure SDK for those. Let's take a look at the docs. See, here they're saying "use OpenAI," and it's as simple as that, but what if you want to use a model other than OpenAI's? They advertise access to leading providers, but that's through Azure AI Inference, so I'm wondering if Azure AI Inference is now the standardized client for all of them. We can use GPT-4o through it, so let's see how you do it. All right, that's how you do it, so maybe that's what we'll try to use here today.
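For reference, the OpenAI-SDK path against an Azure OpenAI deployment generally looks something like this. This is a minimal sketch, not the exact code from the portal; the endpoint, key, API version, and deployment name are placeholders you would swap for your own values:

```python
# Minimal sketch: calling an Azure OpenAI deployment with the OpenAI SDK.
# Endpoint, key, and deployment name are placeholders, not values from this walkthrough.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # assumed; use whichever version your resource supports
)

response = client.chat.completions.create(
    model="gpt-4o",  # the *deployment* name you created, not the base model name
    messages=[{"role": "user", "content": "Hello, how are you doing?"}],
)
print(response.choices[0].message.content)
```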
So I need some kind of environment to work in. I'm going to use Azure Machine Learning Studio here today, if that's okay with everybody; you don't have to, you can use your local machine or whatever you want, I'm just trying to get us as much practice as possible with these tools. I'll create a new workspace; I don't remember what the other one was called, so I'll just name it "aml." Hit Review + Create; it didn't like something, so "aml" it is. Review + Create, and we'll have our environment up in just a moment; I'm waiting for the deployment to finish, so I'll be back shortly. Okay, that's done. Let's go to the resource and launch Studio, and on the left-hand side I'm making my way over to Compute; I just want a notebook, and we can do that here as well. Interestingly, there's also a model catalog in here; there are just multiple ways of doing the exact same thing. We'll create ourselves a compute instance; the defaults are mostly fine, I just want CPUs at the lowest cost possible since we're not doing anything heavy, so we'll use a Standard_DS1_v2. Idle shutdown after 60 minutes is fine with me; Review + Create, then Create. Now I'm waiting for the environment to provision under Compute. Okay, it's running, so click Open JupyterLab and that brings up a JupyterLab environment; give it a moment to get going. We're not doing anything fancy, so between the SDK v2 kernel (I'm not sure what the v2 refers to) and plain Python, I'll click the Python 3 (ipykernel) option; we're just using standard Python. Let's make our way over to that code and grab the link, since you folks might be wondering how to get it later; I'll put it in the, you know, whatchamacallit. I actually want Markdown there, so I'll switch that cell to Markdown, and the next one is actual code.
Let's see if we can get this to work; APIs are always changing, folks, and you've got to keep up with them. I was doing this three months ago and it's a completely different experience. Here it says you can use the project client to configure things, and there are chat completions and embeddings clients; we want chat completions today, so I'll grab this code. We happen to be using GPT-4o, so that's perfect, and again, the reason we're not using the OpenAI SDK and are using this client instead is that this one can target multiple model providers. It needs a project object, so where does that come from? We need to bring in this import; I'll put it on a single line and run it. Good. Then it says to create a project client: we already have a project, so we don't need another one, we just need to load it, and the instructions say to copy the project connection string from the Overview page. So it isn't creating a project; it's saying you already have one and you're bringing it in here. If we go back to the AI playground's Overview page, the connection string is right there, so I'll paste it in. That looks fine; hit play. Oh, I didn't import those, so let's bring the imports down here (apologies for the font, I just realized it was too small), hit play, and then run. Now, we didn't authenticate in any way, so I wasn't sure how that would work; I assumed it wouldn't, but it did. I thought there would be a popup or something, but maybe because we're already inside Azure ML it just picked up our credentials. We're getting output back, so that's perfect.
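Here's approximately what that project-client flow looks like end to end. This is a hedged sketch based on my reading of the azure-ai-projects preview SDK, not the verbatim notebook code; the connection string is a placeholder, and DefaultAzureCredential is what silently picks up your identity inside Azure ML, which would explain why no login popup appeared:

```python
# Sketch of the Azure AI Foundry project-client flow (azure-ai-projects preview SDK).
# The connection string is a placeholder; copy yours from the project's Overview page.
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

project = AIProjectClient.from_connection_string(
    conn_str=os.environ["AIPROJECT_CONNECTION_STRING"],
    credential=DefaultAzureCredential(),  # picks up the compute's identity automatically
)

# Get an inference client that can talk to the models the project has deployed
chat = project.inference.get_chat_completions_client()
response = chat.complete(
    model="gpt-4o",  # the deployment name from earlier
    messages=[{"role": "user", "content": "Hello, how are you doing?"}],
)
print(response.choices[0].message.content)
```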
I'll rename this notebook to just "basic," clear all cell outputs, and download it; this is another one that's useful for our repo. Now, this is Azure AI Foundry; I didn't make a notebook like this for Google, just because the code there was so straightforward (they gave us the code, so I wasn't going to change it), but this one is a little different, so it was worth the extra work. Oh, sorry, I grabbed the wrong file name for a second there; let's bring this into an azure-ai-foundry folder and make sure the connection string is in it, then clear all cell outputs again. I don't want all caps in the folder name (I must have hit Caps Lock by accident), so "ai-foundry" it is; we'll save that. And that is the easiest way to get working with Azure AI Foundry. The model is still deployed; I'm not going to undeploy it, but if
you want to get rid of it, you absolutely can. So, where do we undeploy this? There's an Edit button, but there used to be a delete button; that's interesting. The model catalog isn't very clear about it either, so I'll go up a level into the project. I really don't like this interface; it's very similar to the Azure OpenAI Studio one, and I can't say I'm a huge fan of either. Models, Data, checkboxes... how do I undeploy a model? Let me search "Azure AI Studio undeploy a model": select the deployed models page, select the deployment, then Delete; one of these docs is for Language Studio, which isn't what we want. Okay, this is typical Azure for you, but I found it: what you do is click into the model deployment itself, and then you can delete it there. I'll delete it, just in case people are worried about costs, and now that model is deleted. I'm going to keep this Azure AI Foundry project around; we'll use it again for Azure ML. As for the compute instance I'm running under Compute, I could just stop it, but
I'll just delete it; it's not that much work to spin one up when we need one. And there you go: that's working with Azure AI Foundry in the most basic way programmatically, and I'll see you in the next one. Ciao.

Hey, this is Andrew Brown, and we're taking a look at Hugging Face. If you've never used Hugging Face before, it's awesome, because it contains models, datasets, and more. But Hugging Face is not just a website, a repository for things; they also have a series of libraries that make it really easy to work with models, and you're absolutely going to learn about a bunch of the different Hugging Face pieces. You can see we have Transformers, Diffusers, Datasets; these are libraries you can use, and then there's the Hugging Face Hub itself. What I'd like to do is download a model and play with something, but first, make yourself an account. Once you have your account, something you'll have to do a lot is set access tokens; I'm not sure if I have any around right now, and it wants me to log in, so give me a moment. There we go, I'm in my Hugging Face account. I'm going to delete my old key for now; creating keys to work with models is something you'll do a lot, but we'll come back to that in a moment, because what I want to do first is look at some possible models
that we could work with. If we click on Models, you can see there are a lot. Right now we're specifically interested in large language models, but there are of course other kinds we might care about, and there are tons and tons, because there are base (foundation) models, and then people fine-tune those models to do other things, so you end up with a lot of stuff you wouldn't expect. What's interesting here is we have things like text-to-3D; I don't know much about that, but it might be fun to give it a go at some point. It's really hard to tell what's a big deal and what's not, but you can sort by most downloads, and it looks like openai/shap-e is a very popular one. I think I already have access to that model, since they've already accepted my request, but a lot of the time when you work with models, say we go over to an Intel one here, some of them will expect you to accept terms. I'm just trying to find one where I haven't already done this before; it's not happening right now, but sometimes you have to accept the terms, so look very carefully and make sure you accept them before you use the model, otherwise it won't let you, and there can be a waiting period. I'm sorry I can't show you that directly. With Hugging Face, model pages sometimes
have options on how you can deploy them. For this one all we have is Diffusers, but for other kinds of models, say we go to natural language processing and pick text classification, we get more options: how to deploy the model, how to train it, how to use it with specific libraries. If I click through, it gives me exact code for working with it, which is something we can use. The Deploy tab can be useful too, because if you're trying to figure out how much compute you need, sometimes they'll tell you in there. There isn't much information for this one, but if we go over to, say, SageMaker and look carefully, they might tell you it's using an ml.m5.xlarge, and then you can look that up and see what size it actually is. So let's search for that instance type; as usual the search results are noisy (I don't want RunPod, get out of here), I just want to know how large it is. An m5.xlarge has 4 vCPUs and 16 GB of memory, and if we wanted the cost we'd have to look up pricing for it. This is exactly how we'd do it on SageMaker; if there were a JumpStart we could go that way, and if we wanted to use Inferentia and Trainium we could do it that way too, since the code is available to us. The Google Cloud button apparently just opens another tab and doesn't show us anything, but normally you get that sidebar with all the information in it. So there's a set of targets we can deploy to, basically single-click deploy, which is really nice; sometimes it works, sometimes it doesn't. There's obviously training too, and Transformers is another way we could use the model.
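As a rough illustration, the SageMaker snippet on those Deploy tabs generally looks something like the sketch below. Treat it as a hedged approximation rather than the exact generated code; the model ID, versions, and role here are illustrative assumptions:

```python
# Sketch of deploying a Hugging Face model to a SageMaker endpoint.
# Model ID and versions are illustrative assumptions, not copied from the Deploy tab.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

hub = {
    "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",  # a text-classification model
    "HF_TASK": "text-classification",
}

model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.37",  # assumed; the real Deploy tab fills in tested versions
    pytorch_version="2.1",
    py_version="py310",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # the instance size mentioned above
)
print(predictor.predict({"inputs": "I love this course!"}))
```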
Anyway, let's go use something practical: Llama 3.2, assuming that's still available. The 1B variant is a very good starting model because it's very small, so you can run it on most machines, or on modest cloud compute, and that's what I'd like to work with today. I'm just trying to decide where to run it, because I want it to work for everybody, so maybe we should deploy it to something; let's use Lightning AI today, just because it's so easy to start working with. They probably already have a Llama 3.2 template, but I want to start from scratch, so I'll go up here and launch myself a Studio. The UI is a little all over the place: it started a project from a template, but I don't want a template, I just want to start creating, so the wording is a little funny. Let's go Home; here we go, now I can start a new Studio, and we have options like code and model. Let's hit Start. It defaults to GPUs and I wanted to start on CPUs, but we'll see what happens in a moment. Okay, it's on CPUs, which is fine; with Llama 3.2 we'll actually need GPUs if we want to really work with it, but we're going to start with CPUs first, and we'll let this environment spin up. All right, our Lightning environment is up, and you can work in a Jupyter notebook or right in the editor; how do I want to work with this today?
I think I'll work in Jupyter today; it's up to you, either way it's the same compute, but I'm going to switch over to it. We also have TensorBoard here on the left-hand side. I'll make myself a new Python 3 notebook and rename it to "hf-basics" so we know what it is, and we'll start working with this model. Well, first you have to accept the terms: see, it says "gated model: you have been granted access." Be very careful to actually complete this, because I've seen people who think they've been granted access but really haven't done it yet. Now we can start working with the model programmatically. On the model page we have Transformers and vLLM; vLLM is a way of serving a model, and Transformers is another way. It suggests using the huggingface_hub library to log in with an HF token that has gated-access permission; that's one way, and providing the token directly is another, but I'll grab this code. The first thing we have to do up top is pip install transformers. "Could not find a version that satisfies the requirement"; I probably spelled it wrong. There we go. Why the package is called transformers when the site is Hugging Face, I don't know; the naming really is confusing. If you don't like all the install output, you can pass -q and it will still install, just less noisily. We also need to set our Hugging Face API key, so I'll head over to the site in a new tab, go to Access Tokens, and create a new token,
set it to Read, name it something like "hf-example," and create it. So now I have a token. There's also the huggingface-cli; it's not that I've never tried it, but it's been a while, and I found it kind of a pain to set up, but I'm curious. So we have this login command; I don't trust running it in the notebook context, so I'll make a new terminal and paste it there, then type clear. Does this machine have conda installed? It does, but I'm not sure the terminal and the notebook share the same environment, since all these environments can be in different contexts; we only have the one here, so I'm assuming it's already set to that one. Before running anything, let's try conda list, then look at conda env; I'm assuming it's using conda underneath and I just want to see what environments we have. It's oddly hard to find; right, maybe the environment is a local thing. Well, anyway, I think it's using Python 3 here, and we did install transformers there, so let's just see if it works; if it doesn't, we'll circle back and do something else. This is not
usually how I like to do it, but I'm trying something different for a change. It says "enter your token," so I'll grab it and paste it; of course, use your own token, not mine (I'm deleting mine anyway, so you're not going to get it). Hit Enter. "Add token as git credential?" Sure, I don't know; it says it cannot authenticate as a git credential since no helper is defined on this machine, so let's try again. I realize the text is small, so I'll bump it up, grab the token again, copy it, paste it, hit Enter, say yes, and now it says the token is active. Just because the login works in the terminal does not necessarily mean it will work in the notebook, though; that again comes down to whether I'm actually in the same environment context, and I don't know where that environment lives.
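For the record, there are two common ways to do this login, sketched below; the token value is obviously a placeholder. The CLI route is what I pasted into the terminal, and the Python route avoids the terminal-versus-notebook environment question entirely:

```python
# Option 1 (shell): log in once per machine/environment
#   huggingface-cli login
#
# Option 2 (Python): log in from inside the notebook itself,
# which sidesteps any terminal/notebook environment mismatch.
from huggingface_hub import login

login(token="hf_xxx")  # placeholder; never commit a real token
```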
Usually when you have conda active, the prompt shows the environment in parentheses, but this prompt looks a bit modified, so I can't really tell what's going on. I'll type conda info and see what I get; oh, it's telling us where it is, right here. The activated environment is called "cloudspace," which isn't exactly the same name as in the UI, but it's probably fine, so I guess we'll see. Anyway, back to the code: we import transformers and create a pipeline with the "text-generation" task and "meta-llama/Llama-3.2-1B", and this should download the model. If I hit play... down here we get an error about not being able to grant access, so it doesn't think the token is set, and this is exactly the kind of environment issue I was worried about. I always forget how to do this, so let's ask ChatGPT: how do I load an env var into JupyterLab? Because I imagine we could make a
.env file and load it that way. Yeah, that's one of the suggestions, and there's python-dotenv, which feels a little safer, so I'll go with that. Back in the notebook I'll add python-dotenv to the install cell and hit play, so that installs both packages. Then we need our .env file, so I'll make a new file; it ends up named with a .txt extension, .env.txt. We need the specific env var name Hugging Face looks for, because it has to be exact: it's HF_TOKEN. I'll paste my Hugging Face API key in there, again trying to do this in a way that's reasonably safe (if somebody smarter than me knows a better way, I'm happy to hear it), and save it. Since I want to load that very specific file, let's ask ChatGPT: what if the file is called .env.txt, can I load that? There's a way to do it, so we'll grab that code, bring it in a line below, run it, and then run the pipeline cell again, and now it's downloading the model.
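A minimal sketch of that loading step, assuming python-dotenv and a file literally named .env.txt; load_dotenv takes an explicit path for exactly this case:

```python
# Load HF_TOKEN from a non-standard dotenv file name (.env.txt is this notebook's quirk)
import os
from dotenv import load_dotenv

load_dotenv(".env.txt")          # explicit path; plain load_dotenv() would look for ".env"
print("HF_TOKEN" in os.environ)  # sanity check that the token is now visible
```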
Understand that when we're just downloading models, it's fine to be in CPU mode, but when we actually want to do inference we'll probably want GPUs. If you had a special type of CPU, like Intel Lunar Lake, or an Apple Silicon chip, those have iGPUs built in, so your machine might accelerate this without a discrete GPU, but we're not using anything like that here today. This could actually work on a Xeon too; with something like a 4th-gen Xeon Scalable processor and enough cores, you could technically run it that way. There's no harm in trying: if it works on 4 vCPUs with 16 GB of RAM, that's fine as well, we just won't get the best performance. So we're letting the model download; it's going well, and it reports "device set to use CPU," which is totally fine because that's what we're on right now. I'm
going back over to the model page, and there's a bit of code to "load the model directly," which would also download the model. Wait, what did the previous cell do, then? It downloaded safetensors files right here; where exactly they went on disk, I don't know, but it definitely downloaded them. With the direct approach we have the tokenizer and the model, so I'll grab that and paste it in; it shouldn't download twice. "AutoTokenizer is not defined" because we didn't import it; sorry, you either do it this way (loading everything directly, which I often do) or that way, and here we're using the pipeline. So let's read the high-level pipeline documentation and see how it works: you have the pipe object and then you pass it something, so let's grab that; that's where I was getting mixed up. Running it like this, it says "Setting pad_token_id to eos_token_id for open-ended generation," but it still worked; I think that has to do with tokenization, and there may be some underlying config file we could tweak, but it clearly still generated output. It says "this restaurant is awesome, the food is delicious and the service is great, the only downside is the location, I wouldn't..."; we provided some text and it continued it up to a point. Let's go learn a little more about pipelines.
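Putting the last few steps together, the pipeline path condenses to something like this sketch, under the assumptions we've been working with (gated access granted, HF_TOKEN already loaded into the environment):

```python
# Sketch: text generation with the high-level pipeline API.
# Assumes HF_TOKEN is in the environment and gated access was granted.
from transformers import pipeline

pipe = pipeline("text-generation", model="meta-llama/Llama-3.2-1B")
result = pipe("This restaurant is awesome", max_new_tokens=40)
print(result[0]["generated_text"])
```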
The docs say pipelines are a great and easy way to use models for inference, and that the pipeline abstraction is a wrapper around all the available pipelines, which makes sense. Their example does text classification, but what we did was text generation, so there are different types of pipelines, and what would be interesting is to find exactly where the documentation is for each type. The text is really hard to read on the right-hand side (I'm not sure why they make it so gray), so I'll zoom in. I'm imagining our pipeline is this one right here; if we click into it, yes, that's the one we're using, and in the example they just named the object "generator" instead of "pipe," based on what it does. So that's an example of it working. I was hoping to see a little more about the parameters, which I don't see, but that's fine. Another thing we could do is switch over to the summarization pipeline, so let's try that for a second. Scrolling down, what we can do is switch the task out to "summarization"; I'm not really guessing, it has it right here, and we have a few parameters, so I'll grab those and run it; it shouldn't download the
model twice. That's fine. Now it says the model LlamaForCausalLM is not supported for summarization: only certain models can perform these tasks, so you can't use meta-llama/Llama-3.2-1B here and we'd have to use something else. The error lists supported classes like BartForConditionalGeneration and BigBirdPegasus; I'm not familiar with either of these models, so let's figure out what they are. Searching around, we find the MarianMT model, and the docs describe it as a framework for translation models, using the same models as BART; translation should be similar, but not identical. The docs example loads the model explicitly, but I'm curious: we came in through the summarization pipeline, and the docs example there actually uses t5-base, which is fine, but I'm wondering if I could just drop the Marian one in. Let's try; it might not work at all. It says it's "not a local folder or a valid model identifier" in that context; yeah, I figured as much, so maybe the pipeline only accepts models compatible with the task. Let's see if ChatGPT can help: I ask about using this model in the summarization pipeline in Hugging Face, and it says the model is for translation tasks, which is what I was looking for, so I didn't really read the rest. Again, I'll point out I'm not an expert at this; you kind of start figuring out what to copy and paste and what doesn't work.
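To make the distinction concrete: summarization needs a sequence-to-sequence model, and a causal LM like Llama gets rejected. Here's a minimal sketch using facebook/bart-large-cnn, a well-known summarization checkpoint (my choice for illustration, not one used above):

```python
# Sketch: summarization requires a seq2seq model; a causal LM like Llama 3.2 1B is rejected.
# facebook/bart-large-cnn is a standard summarization checkpoint, used here for illustration.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
long_text = "Your long article text goes here ..."  # placeholder input
print(summarizer(long_text, max_length=60, min_length=20, do_sample=False))
```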
"The tokenizer cannot be instantiated; please make sure you have sentencepiece installed to use this tokenizer." So I'll go up and add that to the install cell; this, again, is the kind of thing that happens constantly. There's also a warning that "the current process just got forked after parallelism has already been used; disabling parallelism to avoid deadlocks," which sounds like something I wouldn't want anyway. Trying again, I get the same sentencepiece error; sometimes when you have issues like this you just need to restart, so I'll say Restart Kernel and run the cells again from the top to see if we can get this working. Now it recommends sacremoses; I don't know what that is, so let's take a look. It appears to be a tokenizer package (and yes, the name really does look like "sacré Moses"). I'll install that and restart the kernel again. Look, I don't fully know what I'm doing here, but I'm getting it to work, and that's all that really matters for now; I mean, you should eventually know a bit more. No more errors, but this isn't really a good summarization test, so I'll ask ChatGPT: can you give me a large amount of text for summarization? And it goes and goes and goes.
Good, so I'll grab this text and bring it over; this is going to be my big block of input. It pasted with a lot of junk line breaks, but I don't care, I'll leave it, because I can barely see what I'm doing. Run it: "unterminated string literal." I knew it would do something weird like that, so I go to the end of the string; it doesn't like something in there, because you can see it isn't terminating correctly, so I keep joining lines until it highlights properly. Excellent; that was probably just the way I did the multi-line string. Now it warns: "your max_length is set to 512, but your input_length is only 452; since this is summarization, where outputs shorter than the input are typically wanted, you might consider decreasing max_length." But it did attempt a summarization. It's doing something funny, though: the output is in French. If you look at the model card, it's an MT English-to-French model, intended for translating English to French; we were using it for summarization, so it's kind of interesting that it produced French, but that actually makes sense. Let's go down and try t5-base instead (I'm not sure how large that one is); the docs example suggests a TensorFlow setup and says it has
to use a specific tokenizer, so I'll grab it and paste it in. Run: "pipeline cannot infer a suitable model class for t5-base." I'm not sure why that's happening, so let's step back; this obviously isn't a great summarization example anyway. Let's check whether there's a translation pipeline; there is, and the example is English to French, which might be perfect. Is it using the same model we were using? Not necessarily, but I'll switch mine over to it. We'll take all that long text out of here (holy smokes, that's long) and, for those who don't know, I'm in Canada, and yes, we do speak French here, though not so much me personally. I'll write "hello, how is your day going today," hit Run, and out comes the French: "Bonjour..." I don't know how to say it aloud, but there we have our translation.
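Condensed, the working translation path looks roughly like this; a minimal sketch assuming sentencepiece and sacremoses are installed, with an illustrative output:

```python
# Sketch: English-to-French translation with a MarianMT checkpoint.
# Requires: pip install transformers sentencepiece sacremoses
from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")
print(translator("Hello, how is your day going today?"))
# e.g. [{'translation_text': "Bonjour, comment se passe votre journée aujourd'hui ?"}]
```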
So that's pipelines. Do I like pipelines? They're okay; they do standardize things to a degree, and there are all sorts of them, natural language processing, multimodal ones, and so on, but there are other, lower-level ways we might use Hugging Face. We'll do more with Hugging Face, but I'm going to stop the video here. You can stop this environment whenever you want by going back to the dashboard; I'm not going to stop it because I want to continue, but I just want to give us a checkpoint here, so if you want, you can put it to sleep, or delete it if you really need to get rid of it. I'll export this code in its current state in case anybody wants to play with it, download it, and bring it into our repo under a hugging-face folder; we're going to keep playing around with Hugging Face because there's so much to do in here that I'm not even sure what to show you. I'll see you here shortly.

Okay, welcome back; we're continuing to learn about Hugging Face. We used a pipeline, and there are different ways we can work with models; pipelines are one of them. What I want to do now is go back over to Hugging Face, keep focusing on Llama 3.2, and see what else we can do with it. We went to the Transformers tab and used a pipeline, which is one way; there's another way, where we directly load a model, and that might be how we want to work with it. So I'll copy this code and go back over to Lightning AI, which
I still have running with our CPUs. I'll make a new notebook (I really should have named the first one "pipelines," but that's fine) and rename this one to "hf-direct-model." I'll paste the snippet in, but I'm not exactly sure how to fill in the rest; I normally have a base of code that I work from and tweak. Let's see if an assistant can help; you know what, we should try models other than ChatGPT every day, so I'll go over to Gemini and say: I want to chat with Llama 3.2, and I need you to complete this code. If I see the code, I'll recognize whether that's what it's supposed to look like. Wow, that was fast, really fast; I guess we're on Flash, so that makes sense. Here we have the same Transformers lines and some Python below to complete the code. It sets the device to "cuda" for GPU and defaults to CPU otherwise; that actually looks pretty good, so I'm really happy with that. We'll go back over to the notebook; I think even if you didn't specify anything it would fall back to CPU. Sometimes you see numeric device IDs here instead of "cuda",
and GPU it might be zero and one uh and one might be CPUs for some weird reason then down below we have a function okay so let's go ahead and copy this and see what we have go down here and I'm going to break this up into three functions I think that's a little bit better okay and so we have our function where it's going to generate text and so we have the tokenizer okay yeah because when you have an llm or a model you need to uh tokenize it before you put it into the
class okay so that makes sense and then we have model generate so we're calling do generate on a model so if we looked up maybe like Auto auto auto model for casual llm if we looked at this probably yeah we have Auto classes so in many cases the architecture you want can be used and guess from the name or path of pre train model okay so we have that what I'm looking for here is it's just the functions like generate and things like that so do we have generate here um I mean suggesting that it's
in here somewhere generate generate generate generate generate uh or am I just in Auto classes hold on here I'm just in regular Auto classes I want this one in particular um this one here okay and so what I'm looking for here this is hugging face uh API somewhere else so we can configure it m okay but where does that generate thing come in play from pre-trained because maybe yeah you have the model here right and and then we can call functions on it like generate but where does that come from so let's go back over
to Gemini it's like where does where does the generate function come from the generary function is powerful okay great but I can't find it okay but where let's go here and see it might have a docs for us there it is text generation each framework has a generate method for text generation okay so basically whatever it's so there's different Frameworks like P torch tensorflow flax Jacks and so I guess the idea is that it's mapping to that one underneath with a generic one there I still don't understand how it's getting there but I at least
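Assembled from those pieces, the direct-load path looks roughly like this; a minimal sketch, assuming the same gated access and token setup as before:

```python
# Sketch: loading Llama 3.2 1B directly instead of via pipeline().
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-3.2-1B"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

def generate_text(prompt: str, max_new_tokens: int = 100) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generate_text("Hello, how are you?"))
```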
That's totally fine; I think this is going to work, and we can set things like the maximum number of tokens. Let's give it a run. Do I have to install anything? Probably transformers at least, so pip install transformers; it might already be in this environment, but I'll run it anyway. By the way, when you're working on dramatically different things you'd probably want to spin up a separate Studio, the same way you'd spin up a separate conda environment when working locally with code. I'll silence the install output again, since it's kind of annoying, then run the next cell, and it's going to download Llama 3.2 1B. We actually already have it downloaded from the pipeline earlier, so I assume it will go to the same place. We can actually tell it where to download, by the way, if we want it in a very specific folder, and that's something you might want to do: when you're working with containers you need to place the model somewhere outside the container, and it's just good to know exactly where your files are, so I prefer to say explicitly where it goes. Really, what it's doing is literally downloading the contents of the model repo's files. So let's ask Gemini whether we can change the code to say exactly where the model should be downloaded, because I don't feel like hunting for the default location. Yes: it's the cache directory, cache_dir. I'll grab that; the path I want is just "models," so I said we'd do it later, but let's do it now: I'll make a new folder called models and use a relative path like "./models". It called the variable model_path, so I'm not sure whether this should be a generic models folder or a folder per model name; I don't love that it didn't give a concrete example, but let's just try it, because I want to see it download.
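The relevant change is tiny; a sketch assuming a relative ./models folder next to the notebook (from_pretrained accepts a cache_dir argument for exactly this):

```python
# Sketch: pin the download location with cache_dir instead of the default HF cache.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-3.2-1B"
cache_dir = "./models"  # relative folder next to the notebook (an assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id, cache_dir=cache_dir)
model = AutoModelForCausalLM.from_pretrained(model_id, cache_dir=cache_dir)
```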
Will it complain that the folder doesn't exist? Let's see. Oh, right: we have to load the token in, like we did in the other notebook, so we need to bring in python-dotenv. I'll add it up top, grab the loading code, run this and this; it says we need to restart the kernel, so let's restart and run it again. I don't know if this is going to work without that folder existing, but... oh, it does. If we go into models, it created a llama-3.2 path inside it; so we could have just pointed at a generic folder after all, silly, and Gemini didn't quite understand what I was asking. I'm going to stop this download because I don't want it going there, and clean things up a little: delete, delete, I do not care, just delete the contents of the models folder. Let's open up a terminal (we can actually see our terminals over here; that one was from earlier, so I'll make a fresh one down here), type ls, go into models, and rm -rf the llama-3.2 directory. Back in the file browser, we now just have an empty models folder, which is a bit nicer, and we'll download it again. This is how I'd probably work with it: Transformers does store models somewhere by default, I just don't know where offhand and can't be bothered to look, and I like to know where my models are; it's just how I do it. We'll give the download a moment, and again, if you need GPUs you can always switch between CPUs and GPUs here: when you're
downloading models, use CPUs, and then move to GPUs when you actually run them. All right, the model is done downloading; let's look at its contents. We have blobs, refs, and snapshots; the blobs aren't that useful to browse, but inside snapshots the files look familiar. If we compare with the model page on Hugging Face, yes, it's these files: the model.safetensors weights and some of the others; not all of them, but most. What's really important to note is how large these files are, because the safetensors file is basically the model weights. If this were an older PyTorch checkpoint the format would look a bit different; I usually see .bin files there, and I think there's another PyTorch format too, but I don't remember what it is. Anyway, the model is now downloaded, and we can tell it to use CUDA. It says "name 'torch' is not defined," so this is where we have to include torch, that is, PyTorch. I'm going to put that install on a separate line, because it's a bit of a pain in the butt and I want to see all of its output.
It's going to be a big old mess. I'll restart the kernel. One thing I don't want to do is constantly re-download a model that already exists, but I think if it's already on disk it won't do that, so that's fine. I'll go back to the top and run this cell, then this one; I don't think it will download twice, and if it did we could put code in there to say "hey, don't do that," but I'd expect it to see that the files are already there. Yes, it did. Now we can run this. "torch"... it's right there; oh, right, we have to import it, so I'll add import torch. Torch might have already been pre-installed, I'm not sure, but I didn't get any errors, so that's probably fine.
It's printing out a lot of stuff, maybe from the torch import; I'm not sure why, but I'll just ignore that for now. So we've created our function, and down here we have a loop. I wasn't sure this would work, since I didn't see it expecting any input from us, but it does say user input, so let's see what happens. Great, we can type to it; I'll say "hello," hit Enter, and it warns: "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behaviors. Please pass your input's attention mask to obtain reliable results." We do get a response back: "hello, my name is Maddie, I'm a 20-year-old and I'm living in..." whoa, what the heck. This is screenshot-worthy; it's like this thing was built for catfishing. Sorry, that was so unusual I had to take a screenshot and share it with my socials, because I literally did not expect it to make up such a weird scenario. "Are you real?" Let's see if it even knows. It's working, and even though we don't have a lot of CPU here (you can see the RAM and CPU usage up top), it works really well: "I'm a real estate broker in the state of Florida." That is really great. Now, about that warning: this is something I've had to solve before; sometimes there's a configuration file you have to tweak, or it
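For what it's worth, the usual parameter-level fix for that warning looks like the sketch below; this is the common approach (passing the attention mask and an explicit pad_token_id), not necessarily the exact change this notebook needs:

```python
# Sketch: silence the attention-mask / pad-token warning by passing both explicitly.
inputs = tokenizer("hello", return_tensors="pt").to(device)
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # tells generate() which tokens are real
    pad_token_id=tokenizer.eos_token_id,      # Llama has no pad token, so reuse EOS
    max_new_tokens=100,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```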
could just be input parameters. This is where, if you use the pipeline, there's less fiddling around, but with direct model control you have a lot more control over exactly what you're using. Anyway, that's direct model loading; I'm not sure how much more of it I need to show you folks, so we'll call this one dusted and done, and then we'll take a look at something else we can do on Hugging Face, because there's a lot, and we should learn as much as we can; I think it's super valuable. Let me just clear these out: right-click, Clear All Cell Outputs. Right now we're working with Lightning AI, but understand that I like to do local development as well; I'm just trying to give you a place with reliable compute, assuming this platform stays free while you're using it. I'll download this notebook and bring it into our repo on GitHub, into the hugging-face folder; that's our direct-model notebook. There we are, thank you very much, and I'll see you again for more Hugging Face.

All right, let's keep going. So far we've worked with Transformers, the library, and that's gone pretty well. Also, we've been using the plain 1B model, and it's actually surprising that it worked so well for conversation; normally you'd use the 3.2 1B Instruct variant.
Some APIs just aren't going to work with the generic base model, and you have to use the Instruct model. If we look at Llama 3.2 1B Instruct, just understand that if we used something like vLLM or TGI, the base model might not be as cooperative as we'd like. I'm not sure what options they show over here; I was just taking a look, and we've got a lot of options, which is interesting. What else could we look at? We could upload models, things like that, but you can also use the model directly on Hugging Face: on the right-hand side of the model page there's the Inference API widget. I'm not sure what it costs, but some models are free to try; if I run this compute here, I didn't have to set up anything, I'm just pressing the button, and it works; it's completing the text. I'm not sure how they have the compute to do this, though this one is a cached example, so let me change it: "Andrew Brown is the best cloud instructor because..." and let's see if this works. Okay, cool. Down here there are some pre-built examples, and if we view the code, you can see it's directly using the Hugging Face Inference API, so you're literally hitting their endpoint and using their compute. Do they have pricing for this? Probably; Hugging Face is a bit of a maze, but if we find the serverless inference docs and the pricing page, it mentions things like ZeroGPU and Dev Mode for Spaces, and Inference Endpoints. I'm not exactly sure how the pricing breaks down, but clearly they offer inference here, with a few different examples, including direct curl if you just want that.
So that's great. They also talk about Spaces, which I believe involves Gradio. If we go to Spaces... maybe not; this just looks like curation. The reason I'm getting confused is there's also a thing called Gradio spaces, and Gradio is maintained by Hugging Face, I believe, which is where the mixup comes from. Okay, let's figure out Spaces: "discover amazing AI apps made by the community." If we click Create a New Space, we're asked for a Space SDK: Streamlit, Gradio, Docker, or Static (I guess static HTML). So this is starting to make sense: these are ways of serving things with a UI, and Spaces lets you use any of them. Interesting. Let's say we click into this one; it's 3D generation. Ah, okay, so a Space is clearly a way to put a UI on a model and start working with it. Something very similar to this is Replicate, which we'll look at some other time; on Replicate you browse around, find a model, say under Explore, and it has an interface where you can just start using it, though I'm not sure you can make your own there; maybe you can. This Space says it's public, and its instructions say: "upload an image and click Generate to create a 3D asset; if the image has an alpha channel, it will be used as the mask," and it's running on Zero. What's ZeroGPU? "A new kind of hardware for Spaces: shared infrastructure that optimizes GPU usage for AI models and demos on Hugging Face."
Okay, that's kind of cool. So we want to generate something, and I need an image. I'm trying to think of something interesting; a cup isn't that interesting, but we'll do it anyway. This Lego cup is pretty cool, though I'm not sure it would work for this example, and it's also kind of copyrighted, so maybe we shouldn't. Let's use the standard party cup; I feel more confident with this one. There we go, a Uline image, and I probably got that because I use Uline all the time to buy things; that's where my snowsuit came from, by the way, a freezer jacket, cold-storage coveralls; best thing ever. Anyway, I have that image downloaded. I still have the Lightning environment running; since I'm not using it I should probably stop it, so I'll put it to sleep for now, yes, sleep. Back over to wherever that Space was, closing out some tabs so things are a little less confusing; here it is. I have the cup, so let's just try it and see what happens. We have our cup; now what? "Upload an image and click Generate." Did I click it? I did, so it's spinning and processing; it suggests it could take 22 seconds, though we don't really know for certain. Whoa, that is awesome; that is really cool. The next question is whether I can download this asset: "if you find the generated 3D asset satisfactory, extract the GLB file." So we'll extract it; I think I've been seeing some of these things online, and now it's running another step to do the extraction, so I'm just waiting for it. I feel like I could make a video game so quickly these days because of this; it's amazing. Not a good video game, but a video game.
Down below it's just processing, so we're waiting for that extraction; we'll give it a moment. All right, let's download the GLB file. I actually don't know the GLB format well; I wonder if I could open it in Blender. Do I have Blender installed on this computer? I do not. What else can render GLBs? For those who don't know, I used to do 3D modeling a long time ago, but it's been a while. It says Blender supports it, but I just don't have it here, and there are GLB previewers; it's already previewing in the browser, so I don't really need to do much more with it, but it would have been interesting to open it in something I know and see how clean the geometry is. For those who don't know, 3D models are made up of vertices and edges, and based on how the shapes are laid out, some meshes are better than others, so that might be worth looking into. Dragging the file into the previewer, it loads, and it's textured too, which is nice. I wonder if we can see its skeleton or wireframe; there we go. Oh no, I mean, it's good, but whoa, that's a little too dense. Can I turn anything off here? None? My goodness, is it really that many vertices? It looks good, but the question is how many polygons it has, and a GLB polygon count isn't easy for me to check right now; again, I'd have to bring it into something like Blender. Does it look good? Yes. Would I use it in a game? I don't know; it depends. It's rendering fine in my browser, so I guess it's fine, but notice over here the polygons look a
lot smaller so I don't know the only way I would really know is that I have to bring that into um I'd have to bring that into a blender and I don't feel like installing that here today um let's go take a look at another one here I'm just curious let's go generate this one out this is why we need more expensive CPUs because people can be uh um using these kind of assets but there is a way that I could bring it into blender and then low poy it that's really good that's really really
good. Let's generate that out, that looks like fun. Oh, I already generated it; what I meant to do was click extract GLB. It didn't take that long, though. There we go, let's click this. Now it says 42 seconds, so we'll wait a moment until that's done. All right, this one has generated out, so let's take a look. I can see the polygons, so maybe the polygon count isn't as high, but I just don't have confidence in that preview app I was using. It looks good, and it runs in my browser no problem, so that's great. I'm not sure how it textures the parts it can't see, but it seems to do a really good job, so that's pretty cool. Let's go back: are there any other Spaces we could check out? There's a Flux fill and outpainting one, wow, and another 3D one, but we don't need to do 3D twice. Let's try the Flux outpainting one. I guess I need an image, so let's go ahead,
and I've been watching Dandadan, so I hope I don't get anything inappropriate here. Maybe we take this image; Dandadan is pretty popular right now. I need to crop it somehow, so I'll bring it into Paint really quickly, if I can just find it: open with Paint. Where are you, Paint? There we go, everyone's favorite app. We drag the image in and, oops, that's not what I meant to do. It said outpainting, right? Let me double check: I believe that means extend the scene. So we go back over here; it lets you extend and pick your expected ratio, okay, great, and I like the image ratio I'm providing. There's also a resize input option; does the input have to be that ratio? Can I restrict it to a square size? I'm trying to make a perfect square here, one second. This image is 1920 by 1080, so I'm going to drag the selection and see if I can bring it smaller; I want to get this to exactly 860. There we go, okay, great. So now we have this image, and I'll save it, probably to my desktop, and drag it in here. I'm not going to provide any prompt. The ratio is 1:1, and we'll say 75%. What was the image size again? 860 by 860, so that's fine; we'll just try it. I haven't provided any instructions, and it probably needs some if it's going to do a good job, but I just want to see if it extends the image. Again, we're just trying out Spaces for fun here, and it's unbelievably awesome that this runs free on ZeroGPU.
So I'm going to pause here. There we go, and it did extend the image. Let's give it a prompt like "anime school boy walking around outside their school"; maybe that will help, because while it did generate something, we can't say it didn't, we gave it no information, so this run might do a little better. Again, we're not spending any money, just clicking on things. Here we go, and interestingly we're getting Japanese text, so it knows this is anime. Is the text making any sense? I don't know, but it might be fun to see what text it actually produced. I'll grab it and go over to something like ChatGPT, or we might be able to do it in Gemini. Let's try Gemini: can you translate the Japanese text to English? I'm not sure if Flash will do it, but we'll see. It gives an answer, though I'm not sure that's really what the image said; or did it grab it from a similar existing source? That's really interesting. We have "sometimes he sings in a high-pitched voice," which translates to "he can sing a high note." So what the model may have done is it had source images in Japanese with subtitles and it literally pulled them in, because I said anime. It's really interesting that it did that, but it did extend the scene, so it kind of worked, and then we have some obviously unusual stuff; again, this is totally free, so I can't expect amazing results. So yeah, that's Spaces, I guess. We could create our own Space, but I don't really have any use cases for creating one right now; clearly we would just upload Gradio or Streamlit code, and I don't care about doing that right now. So I think we'll call this an exploration of Spaces and continue on to see if we can find anything else in Hugging Face.
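One thing worth knowing is that you can also call a Space from code instead of clicking around its UI. Here's a minimal sketch using the gradio_client package; the Space id, the endpoint name, and the parameter name are all hypothetical stand-ins, since every Space exposes its own API:

from gradio_client import Client, handle_file

client = Client("someuser/image-to-3d")      # hypothetical Space id
# client.view_api() prints the real endpoint names and parameters for a Space
result = client.predict(
    image=handle_file("cup.png"),            # uploads the local image
    api_name="/predict",                     # endpoint name varies per Space
)
print(result)                                # often a path to the generated file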
All right, so we are still in Hugging Face, and one thing we haven't done is work with any datasets. I don't work with these very often, but sometimes you need them when you're doing training. I recognize SmolTalk as a dataset I've seen before. What is it? It says this is a synthetic dataset designed for supervised fine-tuning of LLMs, and it was used for SmolLM2-Instruct. So datasets are something you can load in and then use for training. There's probably a really easy dataset we can use, so let's go back to Gemini. Where are you, Gemini? It's crazy that Google tried charging for this initially; obviously they have the Flash model now, so they went back on that. I'll ask: I want to use a very simple dataset in some kind of example for Hugging Face. This is something we should give a go. It gives me a small inline dataset, but I want to use an existing dataset, not upload one; at least it's telling us we could upload one, which is fine. It doesn't seem to know of a basic one, but it does note that you might need to preprocess the dataset for a specific task, which could involve tokenization, feature selection, things like that. It also suggests you can fine-tune BERT for question answering,
so this might be something we could do. I don't really like what it's giving us for examples, though, so I'm going to go over to Claude. I'll say: I want to use a small dataset to train a pre-trained BERT to do Q&A using Hugging Face datasets; can you give me a code example that will run on a small amount of CPUs? That's not the best wording, but I have a feeling Claude might perform a little better here. Not to say Google Gemini is bad, it was fine, but I need something more concrete. And now we're seeing this SQuAD thing again, so maybe that's a dataset; this answer is giving us version 2. I'm not familiar with SQuAD, but it says Stanford Question Answering Dataset, so that makes sense. In the dataset we have a title, "Beyoncé," a question like "when did Beyoncé become a star," and then the expected output, so we're seeing example data. That's pretty clear.
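Loading it yourself is a one-liner. A minimal sketch, assuming the datasets package is installed; the slice syntax is a handy way to keep CPU training fast:

from datasets import load_dataset

# SQuAD v1.1 -- slice off a tiny subset so this stays CPU-friendly
train_ds = load_dataset("squad", split="train[:100]")
eval_ds = load_dataset("squad", split="validation[:50]")

print(train_ds[0]["question"])
print(train_ds[0]["answers"])   # {'text': [...], 'answer_start': [...]}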
So let's give that a go. I'm going to bring back the environment I was running a moment ago in Lightning AI. Again, we can use whatever we want; I'm just using Lightning AI here today. Because we're going to do some supervised fine-tuning on a very basic model with BERT, I figure it will be fine just using CPUs; I could probably do this on my own computer or my Mac, but I'm going to give it a go here using Claude's code. Sorry, Gemini; your code looked okay, I'm just not sure about it. Claude doesn't say how much compute we'd need, so I'll ask: how much CPU and memory would we need? I really don't know, and this is where, if I asked my friend Rola, she'd know right off the top of her head. It says minimum two CPUs, 4 to 8 GB of memory (8 to 16 recommended), and a training time of one to two hours on a small instance. Okay, but I need to train something in 15 minutes; what can I do for a 15-minute training window? That sounds great. Now it's suggesting SQuAD, which is just a different dataset from SQuAD v2, and I'm not sure it really matters which one we use. Let's take a look: the data is a little bit different, and this
one has about 87,000 rows while SQuAD v2 has about 130,000 rows, so the original dataset is a little smaller. Back over here, the 15-minute plan says to use the fast distilled BERT. You're going to see this term a lot: distill. What does it mean? It just means they've done something to the model to make it smaller; distillation is the process of transferring knowledge from a larger model to a smaller one. So yeah, that makes sense, and this one says fast because DistilBERT is about 40% smaller and 60% faster. The plan also reduces the dataset to 100 training examples and things like that, so it's greatly reduced, and it suggests the training can finish within 10 to 15 minutes, which sounds a lot more reasonable. So I'm going to create a new notebook and call it hf-datasets. Again, I'm not a super expert at this stuff, but you can see we're getting by pretty well. Right now we're in VS Code, but we could switch
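To make the distillation idea concrete, using the distilled model is just a different checkpoint name. A minimal sketch, assuming the standard Hugging Face checkpoint:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
# expect a warning that the QA head weights are newly initialized --
# that's exactly why we fine-tune on a downstream task before using it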
over to JupyterLab if that's what people prefer. I'm indecisive; I switch between both of them all the time, but I think I'd rather do this in Jupyter today. Again, if my friend Rola were here, when she talks about models she might say: hey, you don't always need an LLM, you can use a smaller model like BERT, and that's exactly what we're doing right now. I'll name the notebook hugging-face-dataset. We're going to have to do a pip install for this, with -q to silence it, bringing in transformers and datasets, and we'll go ahead and install those. We'll also need some of our other code from before, so I'll bring in the python-dotenv setup and run that.
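For reference, the first cells end up looking roughly like this; a sketch, and the .env loading is only needed if you keep tokens like HF_TOKEN in a local file:

%pip install -q transformers datasets

from dotenv import load_dotenv
load_dotenv()   # pulls any tokens from a local .env file into the environment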
I also need this other code, so I'll bring that in as well, above here; sometimes you have to restart the kernel after installs, but not always. Here we bring in datasets and then a bunch of imports for Transformers, and the rest of the code looks okay, not that I'm an expert, but we'll work through it and see how far we get. I'm going to break it into a few cells and start dragging things up. I know it looks really complex, and by the way, if you're thinking "Andrew, I'm not getting the same code," you can just grab it from the repo, because I'm going to drop it in there after this. This preprocessing function is really large, so we'll read through it and see if we can make sense of it. Oops, I'm not sure why I did that; my keyboard was messing up, so I'll grab it again, and the last chunk can go into a separate cell, who cares. Now that it's broken up, it's a little more controlled and we can see what's going on. First, we load our libraries. Then we load the dataset, telling it to only provide the first 100 examples, and we also load an evaluation dataset to make sure things have worked correctly. Next we load the smaller model and tokenizer with from_pretrained and the model name; remember, we loaded a model directly before, and that's exactly what we're doing here. Could there be a pipeline for this? I don't know, maybe. It warns that some weights were not initialized and that you should train the model on a downstream task before using it for prediction and inference; that's what we're doing, that's our goal. Then comes the preprocessing function. Preprocessing means it's going to do something to our data to prepare it: we pass in the examples, grab the questions, grab the context, and tokenize the inputs, because the model requires the data to be tokenized. Then, using the offset mapping on the tokenized inputs, it finds the token indices and records the start and end positions of the answer. We kept getting errors about start and end tokens earlier, so clearly having those in your tokenization process is important. This function is just taking the raw data and getting it ready for model input.
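Here's a minimal sketch of what a QA preprocessing function like this typically does. It's simplified from the generated code (for instance, it doesn't handle contexts that overflow max_length, which real SQuAD pipelines add), so treat it as illustrative:

def preprocess(examples):
    inputs = tokenizer(
        [q.strip() for q in examples["question"]],
        examples["context"],
        max_length=384,
        truncation="only_second",     # truncate the context, never the question
        return_offsets_mapping=True,
        padding="max_length",
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = inputs.sequence_ids(i)
        # token range covered by the context (sequence id 1)
        ctx_start = seq_ids.index(1)
        ctx_end = len(seq_ids) - 1 - seq_ids[::-1].index(1)
        # map character positions to token positions (left at 0 if truncated away)
        start_tok = end_tok = 0
        for t in range(ctx_start, ctx_end + 1):
            if offsets[t][0] <= start_char < offsets[t][1]:
                start_tok = t
            if offsets[t][0] < end_char <= offsets[t][1]:
                end_tok = t
        start_positions.append(start_tok)
        end_positions.append(end_tok)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    inputs.pop("offset_mapping")      # not a model input
    return inputs

You'd then apply it with something like train_ds.map(preprocess, batched=True, remove_columns=train_ds.column_names).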
So next we process the dataset: we call map on it, which iterates over it in batches, applies the preprocessing function we defined, and formats the data. We run it, and down below: error, "batch length 71 does not match text pair 697." I'm not sure what to do about that, so we'll go ask Claude, because I don't know enough about the configuration to make the fix myself. It says the preprocessing function isn't handling the batched inputs correctly. I'd rather it just gave us the changed lines, but it didn't, so I think we have to grab its new preprocessing function, which looks a bit different and mentions a batch size of 32. I can't tell exactly what changed; I would have preferred a diff, but I'll swap it in and hope that's everything, again just guessing based on what I'm reading. Is this our old evaluation dataset cell? That one has changed as well, so I'll replace it too, and I'll scroll up to see if anything above changed. Nope, those are still the same, I believe, so those are fine. Have the training arguments changed? I'm not sure we've gotten there yet, so I'll just grab the latest version; nothing's changed, and I'll keep the copy with the comments in, I prefer that. So it has prepared our dataset, we run this cell, and now we're ready to train. Let's run the training arguments. What's our problem? "Using the Trainer with PyTorch requires accelerate," et cetera, "please install transformers[torch]," and it's asking for a very specific version. I'll bring that install in, save the file, and run it again; we might have to restart the kernel, which is kind of annoying, but I'll try anyway. It says please run pip install transformers[torch] or pip install accelerate. I don't quite trust this, so I'll put it on its own line and run it: "no matches found for transformers[torch]," and you may need to restart the kernel to update the packages. I'll restart again, sorry, and actually I think we're supposed to take one install out and have the other in instead, so I'll do that and save. Running it: no matches found again, which is annoying, because I have used transformers[torch] like that before. We'll put it in there anyway and see what it says. It says you need to install torch; fair enough, I didn't think I'd have to do that here, I assumed it was already pre-installed. Nope. Restart again, no matches found, this one's annoying, and each restart means I have to re-prepare my data, not that it takes very long. I'll go to the top and just run everything. This stuff is tricky, right? I know this exists as a package, but now we're at least getting slightly better results: down below, the Trainer is saying transformers[torch] does not exist, just flat out that it doesn't exist. Do I have to install something else? Maybe. Let's just look up the library then: transformers[torch].
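(In hindsight, this "no matches found" error is the classic symptom of a shell, zsh in particular, trying to glob the square brackets instead of passing them to pip. Quoting the extras usually fixes it. A minimal install cell, assuming a Jupyter notebook:)

# quoting stops the shell from expanding the square brackets
%pip install -q "transformers[torch]" accelerate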
How do I install transformers[torch] when it says no matches found? Claude says a "no matches found" error when installing often indicates a compatibility issue between the Transformers library and PyTorch. That's fine, but then we'd have to create our own conda environment, which is kind of a pain. This is where it gets tricky: I guess we could create a new conda environment, we could do that, but it's annoying, so I'm going to go over to the terminal. I did this in one of our other boot camps, or somewhere, so I'll go get the code, because I can never remember how to do it off the top of my head. It's in the GenAI full training day... not that one. We'll go to the ExamPro GitHub, into repositories, and search genai: that's our GenAI Training Day workshops, and in the intermediate one we were setting up a Jupyter environment, so that shows the way you do it manually. conda is already activated, I believe, so I need to create myself a new environment. I'll bring in that command; it names a different environment, but we'll switch that out for hf-datasets. And: "conda create is not allowed; a Studio has a default conda environment; start a new Studio to create a new environment." That's what Lightning is telling us to do, which kind of makes sense. So instead I'll try installing from conda-forge: conda install -c conda-forge transformers torch, and see if that works. It's not exactly what I want. Oh, it's -c conda-forge, hyphen c. No, I did that; I keep typing "conda force" when it's supposed to be conda-forge, sorry, one second. That's still not working, so let's think: conda-forge is where the package would be coming from if we chose to install it that way, because underneath this Studio is using conda as a single environment. So we should be able to explore conda-forge. Oh, we can't? A lot of package managers will let you search; maybe here. I thought there'd be a list of packages... yeah, here we go. So conda-forge does have transformers, but it doesn't have transformers[torch], and I'm not really sure what the square braces do; I thought maybe that was a separate package. How do I conda install transformers[torch]? All right, I'm going to spend a little time figuring this out and I'll be back in just a moment. Okay, the docs don't help very much, but they show installing with a Python environment, installing from source, installing with Anaconda... here we go, installing from the conda channel conda-forge. Okay, but
all right, that's fine, I can give this a go. No matter what environment we use, we're going to have different challenges; if I did this locally or somewhere else it would be just as much of a pain, and I've installed transformers[torch] before with no issues. We'll run this line and see what happens. It's doing something, so we'll give it a moment. All right, it finished executing; I'm not sure that's going to help us, so let's take a look. No matches found, again. So we'll just do pip install transformers; I don't know if we even have pip installed here, but we'll try. Oh my goodness, how am I supposed to get this to run? Let me figure it out. By the way, it shows that pre-trained models download to a cache folder; I was saying earlier I didn't know where they go, and it's telling us right there. One thing the docs suggest: for CPU support only, you can conveniently install Transformers and a deep learning library in one line, for example Transformers with PyTorch, and down below there's Transformers with TensorFlow 2. But I can't seem to specify any of these, which is really annoying. Let's go back and read this warning carefully: FutureWarning, deprecated, will be removed in a version... no_cuda is deprecated. We're on CPUs here, so where does it say no_cuda? We don't have any no_cuda... oh, it's right here. It says it will be removed in version five, use use_cpu instead, so we'll set use_cpu=True and try that. That's not our actual problem, per se: "using the Trainer with PyTorch requires accelerate" at some minimum version. So let's do pip show accelerate, then pip install accelerate, or maybe conda install accelerate? We'll try pip install accelerate. It's installing version 1.2.0, which is not exactly the version it asked for, but we can give it a go. I also don't know if we have to import accelerate anywhere; we never imported it. I'll run this and restart the kernel just one more time, and we'll try again. Now it says: module pyarrow.lib has no attribute... some attribute error from PyArrow,
and all I can think of is that it's because we installed the new Transformers stuff, so now we're getting PyArrow issues. Let's see if Claude can answer it, because how would I ever know how to fix this on my own? It says the error occurs when an outdated version of PyArrow is being utilized, so we'll try upgrading it; and if for whatever reason this environment just won't work, we can kill it and bring up another one. We'll restart the kernel again and retry. It could be that the much newer accelerate I installed is causing the issue, I don't know. Now the dataset cell has no problem, the model cell has no problem, and the preprocessing function is fine, so we hit play on the next one, and we're on to training, and we get an error, which is actually a bit of progress. It says: FutureWarning, evaluation_strategy is deprecated and will be removed; use eval_strategy instead. So I rename evaluation_strategy to eval_strategy in the training arguments, run this again, and now we have no errors. Do you see how I got this to have no errors?
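For the record, the training setup ends up looking roughly like this. A minimal sketch, assuming a recent transformers release (on older releases the arguments are still called evaluation_strategy and no_cuda), and tokenized_train / tokenized_eval stand in for whatever your mapped datasets are called:

from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    output_dir="./qa_model",
    eval_strategy="epoch",            # the renamed evaluation_strategy
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    num_train_epochs=1,
    use_cpu=True,                     # the renamed no_cuda flag
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
)
trainer.train()
trainer.save_model("./qa_model")      # weights and config land in this folder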
The question is: will it work? That's a different question. We'll hit this cell, which actually does the training. How the heck did it train that fast? I have no idea. It's dumping the model right here into the output directory, so if we wanted to put it somewhere else, say in a models folder, we could; sorry, the training is really happening on this line, and we could have brought it up next to the earlier line, which probably would have made more sense, but it's training, and going very quickly. The question now is how we use the trained model; can we use a pipeline? And we're out of free Claude messages. That's fine; I can always wait until tomorrow, or we have other assistants we can use, but this code seems okay and we can probably get by with it. So training is complete, and now I want to load the model. This is where we load it; the function returns the model and tokenizer, so I'll run that. Then we have our actual Q&A function, so let's bring it in; there's probably a way we wouldn't even have to use functions. Let's read this code: it takes in the tokenizer, which we have, it takes in the model, and it takes our question and our context. We pass the question and context to the tokenizer to tokenize our inputs and get back tensors; then it gets the model predictions, which makes sense; then it finds the start and end positions, and I wasn't sure at first why we needed that; and then it converts the token positions back into a string. So I don't fully understand it, but I kind of do. The part we care about is the final bit of code, so we'll grab it; Claude doesn't know we're doing this in a notebook, which is why it's giving us functions and stuff. Here we load the model, we have a context about who created Python, and we can see if it works. Run it: "torch is not defined." Sure, we'll import torch way up at the top, on a new line, then go all the way back down and run it. And look at that: we ask who created Python and get an answer out of the context, 1991. That's pretty cool. So we were able to use BERT and we trained it with a dataset. If I showed my friend Rola, I think she'd be proud of me for using a smaller model. That's what we had to do there.
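Stripped of the helper functions, the inference path is roughly this (a sketch; the start and end logits are argmaxed to find the answer span inside the context):

import torch

def answer(question, context):
    inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)                  # start/end logits per token
    start = torch.argmax(outputs.start_logits)     # most likely first answer token
    end = torch.argmax(outputs.end_logits)         # most likely last answer token
    tokens = inputs["input_ids"][0][start : end + 1]
    return tokenizer.decode(tokens)

context = "Python was created by Guido van Rossum and first released in 1991."
print(answer("Who created Python?", context))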
And you can see there were a lot of little things we had to do to get that to work. I'm going to download this notebook and bring it into our repo, so if you need the code, it will work; just understand that you'll get different results in different environments, and in the future it might not work at all, because these things break really easily. You have to have the confidence to work through them. I'll name it datasets-example-huggingface. Is there anything else to show in Hugging Face? Probably; there's so much in here. What I'd do is go to the docs: there's Transformers; there's Diffusers, which I have used, and that's image generation, so maybe we'll look at that; there's Transformers.js, state-of-the-art machine learning for the web, which I believe is just a wrapper; we have PEFT, which is parameter-efficient fine-tuning and which we might do later; we have TGI, which we might do later; we have Optimum, which we will definitely do later; and we have Evaluate. There are a lot of these, and I think we'll just come back to them when we need them, because there's just so much. That probably gives you an idea of how to start working with Hugging Face. We didn't show how to upload or push your own models, but I'm not really worried about that right now. So there you go, that is Hugging Face.
Ollama is a large language model manager that makes it easy to download, install, and run LLMs on your laptop or desktop; you can find the GitHub repo for it. Ollama runs models directly in its own runtime environment, and each model is self-contained, so there are no complex dependencies: you download the installer and get going on Windows and Mac (it probably supports Linux as well, but I didn't check). Ollama can also easily download updates for models. What you'll do is go to the website, and you'll see exactly what command you need in order to run, or serve, a model after your installation. You primarily use this via the command line; there's no UI, but there are other open source projects you can pair with it to get an interface, so you can have a full ChatGPT-like experience. Here you can see I want to run Llama 3.2 1B: it's going to download the model, then serve it, and then I can talk to it. I'm doing all of this on my Mac M1, which is not a very recent computer, so you can see that you can do this on last-gen hardware, no problem. Hey, this is Andrew Brown, and in this video I want to explore Ollama. Ollama is a way of downloading and serving models, and the idea is that it makes local development super easy. It's probably not something you'd use in production, though I'm sure some people try. If you really aren't comfortable with Hugging Face, or with downloading models yourself, this will save you a lot of time, and it works for
very specific models. Anyway, my local computer is not the best, so I don't really want to install Ollama here: my CPU is from 2016, and while my GPUs are fine, I still don't want to run it on this machine. However, I do have an Intel Lunar Lake developer kit, and I'd like to utilize that, because the processing power on it, even though it's a mobile chip, is amazing. So I'm going to open up RDP, as I've connected that machine to my local network. I strongly recommend that if you need to invest in more local hardware, you buy a mini computer and attach it to your network: a full laptop might be $2,000, whereas a mini computer might be $500, and you just utilize your existing computer, though you'd have to learn a little about networking. I've got a co-founder, so if I have a problem with RDP, he'll help me with it. I'm connecting here as a user on that machine; my computer is named lunar-lake, which makes it easy for me. I'll remote into that machine, say yes, and here we are. Another option: I could run JupyterLab on that box, so if I have models and want to use the compute on that computer, I can connect through JupyterLab and don't have to RDP at all, but RDP is pretty good for Windows-to-Windows remote desktop. This one is Windows 11; my old machine is Windows 10. I'd love to have Windows 11 on my other computer, but it doesn't have the TPM, the Trusted Platform Module security chip, so I'm stuck on Windows 10. I've actually already installed Ollama here, because I had a bad take and had to restart; I had it paused and missed a lot of footage. But I'll show you the steps I did, which were very straightforward: I
ama okay and if it it didn't we could type in ama serve to start AMA but it's already running because I can see it down below here and so if we want to download a model it's uh what we can do is go back over to here to oh llama and we go to models and we have llama 3.3 it's 70 billion parameters which is giant um but let's start with something small let's try with llama this is 3.1 but I'm going to go with llama 3.2 and I'm not married to llama 3.2 it's just
the fact that it's so small and easy to run that I I like to use it as as examples for the time being let's go ahead and run that so it's going to download the model and then it's going to serve the model now we can use hugging face we can use uh VMS and TGI and stuff like that but this just makes it really really easy you can also um programmatically interact with uh AMA and we don't see an example here but we'll see it later on so this is downloading the 1 billion parameter
model it's obviously not going to take very long to do we'll say hello how are you who are or we'll say who are you and we'll get get a reply back and so very good accurate thing here if we type in for slash we can see that we have uh some actions here so I'm going to just say buy okay you can load the session restart it over set some variables I'm going to type in ama and what I want to do here is just list out the ones I have I'm going to go ahead
and just delete this because these do take up space and you don't want to fill up your drive um this computer here again is a lunar Lake until developer kit so it's not the biggest thing uh but it is uh it is very very powerful it's more powerful than my my my last generation graphics card um and so you know what's interesting about this is just the fact that this could be in a phone like I just think of this machine as the capabilities of a phone and having that local compute just like blows your
mind you know I can run llama 3.21 billion I can run mol 7 uh it's unbelievable here but we just deleted this one here let's go back over to here and we'll go to models okay okay I want to try something else like mistol so mistol is a little bit larger it's chunkier this thing uh would hang and cause problems on my on my other machine my older machine um could I run this on a macm one probably not no but let's go ahead and run mistol so this is mistal 7 and it's about uh
4 gigabytes in size, so we'll give it a moment to download and run; I'll be back in just a moment. All right, it's downloaded, so let's talk to it: "hello, who are you." And that's a really good reply; it says it's a friendly and helpful AI designed to assist you with various tasks, though it introduces itself with a name I don't recognize. I thought you were Mistral? It explains they're both AI models developed by Mistral AI but serve different purposes: one is a general-purpose assistant designed to assist users, while Mistral is the large-scale model focused on generating text that can be used in applications. I did not know that; that's pretty cool, and I'm really impressed. It's crazy to me to think this could be something running on your mobile phone, locally. What's really important to think about, if you're building apps, is that this stuff is really expensive to run right now: using serverless models is inexpensive for us, but those companies are losing a lot of money. If we can shift this compute locally, you could be building apps that tap into what you can run on local phones or other small devices, which is kind of really cool. But I'm going to say bye to this, because I would love to try out QwQ, "Qwen with Questions," the latest hot model. It's the Chinese model, it's supposed to be really good, and it's a 32 billion parameter model; I think they have a larger one as well, and we saw Llama 3.3 at 70 billion parameters, which I don't think I want to try. But at 32 billion, I'd like to see whether this machine grinds or croaks on it; it's not designed to run super giant models, but I want to push it to its limits. Again, I'm just impressed we can run Mistral 7B or Llama 3.2 1B; it handles three billion and seven billion parameters no problem. And the point is, if I wanted to build a language learning app, I could just build it on my machine here and use this, as opposed to worrying about whether I have internet or things like that. Anyway, let's delete this model, because it's a little bit large; it's not that big, but these things catch up to me, and I don't have a lot of space on this drive.
So I'll say remove and get rid of Mistral, and then I want to try QwQ. It's crazy, but we're going to do it. I'll go over to "Qwen with Questions" and let it rip. There's only the 32 billion parameter one available on Ollama; that doesn't mean there aren't other sizes, it's just what Ollama has available and has optimized. You can see I was actually downloading it before, because this is my second take, so I don't have to wait as long, but I'll wait a bit and then we'll give it a go. All right, it's done downloading and it's loaded. I just feel like this should not be able to run on this machine, but let's find out: "hello, who are you." We'll give it a moment; I can hear the Intel developer kit spinning up and the fans going, so this one's a little bit hard for it. But it does run: oh, it's made by Alibaba Cloud, I didn't know that. So it works, but it's a little too much for this machine; we can use it, but this is where we say bye-bye. That one was fun, I couldn't believe we did it. The next question is: okay, we're running this locally, but how can we programmatically work with it? The idea is that if you don't want to work with Hugging Face and figure out how to serve models, which, by the way, is not that hard, this is supposed to be the easy way. Before we look at the programmatic side, I want to get rid of that model we just downloaded, because it's really big, so we'll say ollama remove for Qwen with Questions. That one's deleted, and I'll type clear. Now, there was one model that stood out to me here: Llama 3.2 Vision. So far we've only been working with text; how would we load an image in here? We wouldn't be able to, right? What I noticed is that we can use the Ollama Python library to interact that way. So I'm going to pull it, ollama pull, where the earlier command was run; we're just pulling the vision model, because we're not going to use it the same way, and we'll download it like that. I think we'd still have to run it, but maybe not. The docs show a chat example.
I'm trying to decide how I'm going to do this on this machine, because the code side uses WSL locally, but we'll just try it out. I'll go over to VS Code. Is it not installed? Maybe I installed it under another user. Well, if it's not installed, let's install it right now. It says it has to run as administrator, so let me find the installer again; I have multiple users on this box, so I'm a bit confused about what's where. We'll install it and create a desktop icon, and hopefully what I'm doing over here is fine. We'll launch VS Code, and the vision model is still downloading in the background, which is fine. What I need here is WSL 2. I do have WSL already installed, so we'll install the WSL extension for VS Code. I'm not going to show you how to install WSL itself; that's kind of a pain on a Windows machine and you'll have to look it up yourself, but it gives you a Linux-like environment, and it's really what you'll be using. The extension is almost done downloading, and while that's going I'll make a new folder. I like having a sites directory, so I'll go to This PC... how do I get to this user? No, that's the Ubuntu home. I'll just create it on my desktop; usually there's a user folder, but I'm not sure where it is here. I'll call it sites, and in there we'll have a simple example in another folder called ollama-code. What I want to do now is go back to VS Code and open this folder,
so we go to open folder, Desktop, sites, ollama-code, and select the folder. Now we're in that folder, and I'll make a new file called basic.py. Nothing fancy; we're just going to use the code that was provided to us, and I don't need to put this in a repo because there's not enough going on here to be exciting. I'll grab that code and paste it in. It should know to connect to our model, but we need to provide it an image, and there's no image here, so I need to go grab one. What's something cool that's not copyrighted? A public domain image. Here we have a horse; that looks good. Bring that into sites and rename it to image; I like how I spelled that wrong, but it doesn't really matter, and we all know I tried to spell Ollama right. We'll bring it over here. So this is hooked up, and the model is downloaded, although we never said to run it, so I'm not sure how that's going to work. We'll open a new terminal and run python basic.py, and it says no module named ollama, so we'll pip install it; I'll make a
requirements.txt in here. I can't believe how good I'm getting with Python, and I don't even like Python. We'll install from it; that's the most basic thing we need, and I'm hoping it's going to connect to Ollama and start the model up. Now let's run it, and it looks like we're not even running this in WSL; if it works, I guess I don't care, but we really should run this in WSL. I can hear the Intel machine spinning up, so it must be working, but nothing's happened so far, and I'm not sure if this is just a hard thing for it to run. I'll add a "hello" so I know it's doing something, and again, I don't see or hear anything, so I'm going to stop this. Can I even stop it? I can't; I think it's hung, so I'm going to kill it and reopen Ollama. I think we could also start it with ollama serve, so I'll switch over to an Ubuntu WSL terminal, because I don't want to run in PowerShell, and try ollama serve. It doesn't know about ollama there. Of course, because it's installed on the Windows side, how would it know? I'd have to install it in WSL and download the models again; I guess we could do that if we wanted to use it within WSL, but I'm just going to
try this again, and this time I'm also going to download the other model, Llama 3.2 1B. I don't actually want to run it, just pull it. I'll adjust the code a little: the existing file is for the vision model, so I'll make a new file called chat.py, and the other one will be image.py. The code gets copied over; they're both a chat, but one is going to be just plain chat. chat.py will use llama3.2:1b and just say "hello, who are you," and I'm hoping that will just work. I'll type /bye in the Ollama session, and then we'll try running python chat.py. Oh yeah, WSL doesn't have anything installed, so I'll run it from the Windows side again: python chat.py. I don't normally run it this way,
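For reference, here's roughly what chat.py contains: a minimal sketch using the official ollama Python package, assuming the Ollama service is running and llama3.2:1b has been pulled. (Depending on the package version, the reply may also be accessible as response.message.content.)

import ollama

response = ollama.chat(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response["message"]["content"])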
but we did get output back, and the plain chat one worked fine. It kind of messed up on the image one, though, and it might be my code; the image is here, it's local, so maybe it's just too hard for the machine. Let's try it again. We do get information back (that was my stomach, by the way, if you heard it), so let's hang on and see if it actually returns anything; we'll give it some time. I can hear the machine
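For comparison, image.py looks roughly like this, with the same sketch caveats; note that llama3.2-vision is an 11B model, so on a memory-constrained machine expect it to crawl:

import ollama

response = ollama.chat(
    model="llama3.2-vision",
    messages=[{
        "role": "user",
        "content": "What is in this image?",
        "images": ["image.jpg"],   # path to the local file
    }],
)
print(response["message"]["content"])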
spinning up again. There's probably some way to monitor it; there are these tools over here, and I can't remember what it's called on a Windows machine... system monitor? performance monitor? Oh, Task Manager, there we go. What I'm looking at is our usage: 82% and 77% right now. I'm not used to this interface; the Performance tab is a bit more normal. CPU is really high, memory is really high, so I think it is doing something, but I guess we'll have to wait. How large was that model anyway? Let's go back and look: oh, there are a few different sizes, and uh-oh, the vision one is an 11 billion parameter model, so it's not actually small, it's pretty large, larger than Mistral. Okay, that makes sense; that's why it's so slow, so maybe we've hit the limit of what this machine can do. It might still produce something, but the question is whether it's going to be stuck running in contention, because look at the memory: 31.6 GB total, with only 8.9 available, so it's really at its max. I think the problem here is the memory limit, not the CPU; I just need more memory. I don't think this is ever going to finish, so we're just going to have to let it go, but at least we figured out how to programmatically work with this. I can copy this code over; there's no reason I can't continue to use it, I just don't know how to get it off this computer. Anyway, that is Ollama from code. Obviously this is one way we can interact with it; another way
would be something like OPEA. I'm not sure why anyone would want to do this, but I guess if you are a company with container workloads and you just don't want to wire it up yourself... If you don't know OPEA, it is a framework for running GenAI workloads in containers. If we go into its comps, its components repository, there's a project in there for running an LLM: go to llm, then text generation, and there's an Ollama component, so you could use Ollama with LangChain. That might be a way you use it: connect it to a container workload and use LangChain and Ollama together. But honestly, you should just spend a little time learning how to pull models and serve them, because it's so darn easy. I don't really touch Ollama much; it's nice, it's just too basic for me. Anyway, to clean up, I'll type ollama list, because I want to get rid of that large model; I've got to spell ollama right for it to work. ollama list, then ollama remove the Llama 3.2 1B, and I'll remove this other one too. And there you go, that is Ollama.
All right, let's take a look at llamafile. This is a single-file format that contains both the model's weights and a way to serve the model; you can find it at its GitHub repo, and it's really as simple as that: you download the file, call it like a binary, pass it parameters, and it works. It works by combining llama.cpp and Cosmopolitan Libc. We'll talk about llama.cpp separately, because it's a way to run models on whatever your modern CPU is rather than relying on GPUs; it does that by reimplementing Meta's Llama architecture, for models compatible with it, optimized for CPUs and written in C++, so it can run on multiple CPU architectures. It does also support GPU use, and it runs on multiple operating systems. With llamafile, the weights are embedded as part of the file, so you literally take the file, drop it anywhere you want, and use it, and it supports a bunch of different model formats, probably whatever llama.cpp can support; you can see there's a ton we can utilize. So there you go. Now let's actually try llamafile, which is by Mozilla. I honestly haven't had many use cases for it, but it might be fun to try it out and see how it works. The idea is that llamafile gives you this single distributable file we can utilize, so let's see if we can make that work. There are very specific models
available, and Llama 3.2's small variants are ones I like quite a bit, so it would be interesting to pull one and run it. If I'm going to run this, I should probably connect to my remote machine: I have an Intel Lunar Lake developer kit with more compute on it, so I'll quickly start that machine up. All right, it's started on my network, so I'll go to Remote Desktop and connect to it. Could you do this on some other kind of compute? Probably, but I'm choosing to use this here today. It's just a Windows machine with a Lunar Lake processor in it, which is probably as good as the RTX 3060 sitting beside me, if not better. You can see the mouse I have installed is the Razer one, but the idea here is to navigate to llamafile. I typed llama index by accident; I meant llamafile. Apparently we can just download one. The quick-start example looks like it's set up for Mac; is this only for Mac, or do we have Windows instructions? "If you're on Windows, rename the file by adding .exe," so it should be as simple as that. The quick-start file is a LLaVA 1.5 7B llamafile, but I want to download something a little more modern, like Llama 3.2 3B, so I'll see if I can grab that; there should be a download link. We'll hit download, and now we're downloading the llamafile. While that's going, let me look back at the instructions: if you're on Windows, rename the file by adding .exe to the end, and it's as simple as that. If you're on Mac or Linux, you chmod +x it instead, and technically we have Windows Subsystem for Linux installed on this machine, so we could do it the Linux way, but we'll do it the Windows way; you should be comfortable with either Linux or Windows. We'll give it a moment to download, and we can try executing it both on Windows and on Linux, but we'll focus on Windows in the
short term. Now the file should be done. Great, I want to find where it is, so I'll go to my downloads and rename it as suggested, adding .exe to the end; yes, change it. Now it's suggesting we run it, so I'll open up Terminal on Windows 11. The file is in my downloads, so we'll go into downloads, and we prefix the name the Windows way, with a backslash rather than a forward slash and dot. Hit enter and see if that executes. It's launching up; that's pretty cool, and it's very similar to the Ollama experience, to be honest. "Hello, how are you?" That's pretty good. We also have it being served on a specific port, port 8080. I like how it tells us exactly what hardware it detected, though it's interesting that it's calling this chip Alder Lake when this is a Lunar Lake machine, unless that's another name for it. Another interesting thing is you can see the model is a GGUF; we'll probably talk about GGUF files in another video. Down below, I imagine we can interact with it programmatically: it is running here, and if we open this up we can see we could hit the URL endpoint for it, which is kind of cool, or we could use curl. Now that we've got it working on Windows, it would be interesting to try it in Linux, as that's probably the more common use case. I'm not sure how we exit; I just hit Ctrl+C multiple times. Then I'll go back and take the .exe off the end of the filename, if I'm allowed to. I'm not seeing extensions here, so that's something to fix in the settings: show all extensions, don't hide any file types. There we go; now I
can rename it and take off the .exe, changing it back into a binary. I'll open up VS Code, which should have WSL accessible, which we set up in a separate video (which one, I cannot tell you). We'll drop this down, make our way over to WSL, and give it a moment to open. Great. There should be a way to access the file here, since it's going off my local filesystem; we go back a couple of steps, and do we have downloads in here? We do, excellent. I'll do ls -la to check whether it's executable. It looks like it is, though I'm not 100% certain; if it's not, we'd chmod u+x it, which is the usual thing to do. It looks the same to me, so I think it's executable, and we run it much the same way, with ./ this time. Again, a similar experience: we can talk to it there, but I don't want to talk to it in the terminal; what I actually want to do is write some code. I showed the Ollama stuff before and didn't figure out a way to access it another way there, so I'm going to open up another Ubuntu WSL terminal, cd back, and make a new directory. This is going to be called llamafile-basics (I like how I spelled ollama wrong previously as well),
I'm just going to go here and I want to interact with that endpoint so uh we'll go back back over to this and we have a curl but I'm more interested in the python code so we'll go grab this one here we are using um uh llama so I'm not sure like well hold on this is llama CCP Port 80 if you're already uh software development opening I python package then you should be able to do it okay that's interesting so it works with the O open AI interface it's not an open AI model but
that's interesting that that you can utilize that I thought you'd have to use hugging face or something you probably can use hugging face as you probably just point it to an end point but let's go back over to here into vs code and I'm just type in code period to open this up I guess it's installing it I actually found this out today I was working with my friend Tim and I actually let this install the way and then it actually opened up a new one so that was really good there we go so now
we're in the context of this project which is a little bit better um and so I'm going to make a new file here this is going to be called app.py I'm just pasting the contents in here and for the most part I don't think there's really much we have to change I don't think we need an API key because there's nothing special here it is running on Port 880 I'm not sure if it's version one I think it's just port 8080 um it says llama CC CPP which I'm not exactly sure about but let's go
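For reference, here's a minimal sketch of what that app.py boils down to, assuming the llamafile server is listening on localhost:8080 and that "no-key" is just a placeholder key (the base URL and model string come from the server's own docs, so treat the details as assumptions):

```python
# Minimal sketch: calling a llamafile's OpenAI-compatible endpoint with
# the OpenAI Python client. Assumes the server runs on localhost:8080
# and that no real authentication is configured.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # llamafile exposes an OpenAI-style API here
    api_key="no-key",                     # placeholder; the local server ignores it
)

response = client.chat.completions.create(
    model="LLaMA_CPP",  # model name is largely ignored by the local server
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
)
print(response.choices[0].message.content)
```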
Let's try running it. I'm going to go to my terminal — actually, I'd kind of prefer this as a notebook; it'd be easier to work with. The extension is .ipynb — I have that memorized now. I lost my code and really mucked that up, so I'll delete the file and try again: app.ipynb, and we'll add our code block. It's a little harder because I'm on a remote desktop, so I have to click everywhere I need to go, but we'll paste the code in and run it. It might not know what openai is. It's acting kind of funny — let's reopen this; is this not VS Code? It is. Maybe we select a kernel — oh, it wants to install the Jupyter extensions. Fair enough; first time doing it here. I probably would have kept that for another video, but that's okay. It's installing the Python extension and the Jupyter extension so VS Code can actually work with Python notebooks; give it a moment. There we are — both installed. I'll try selecting a kernel again and pick Python 3.12, hit play, and it says running cells requires ipykernel and pip — and there's no pip installer available for this environment. It's just such a fresh environment: if I check the terminal, does pip exist? Does pip3? No — Python isn't really set up at all. I'm trying to decide whether to get that working now or show it in another video, since we're supposedly going to set up the environment properly later. I'll leave it alone: I won't set up the notebook right now (I want that for another video, which you may or may not have seen already), so I'll take this file back the other way to plain Python, save it, and run it the old-fashioned way. python app.py — it doesn't know what python is. Are we in WSL? In bash, python3 works — there we go, we have Python 3 for sure; sometimes this is a bit finicky. So python3 app.py — and understandably it fails, because the openai package isn't installed. Can I pip install openai? pip3? No — there's no pip, so I have no environment set up, and that's an issue. In a separate video I'll need to actually show how to set that up. So I'm not going to worry about this — I'm confident the Python code would work — and we'll just focus on the curl approach, because I don't want to end up installing Conda and all that here when I want to leave that for a separate video. The example curl command targets port 8080 and passes Authorization: Bearer no-key — I don't think it requires a real key — so maybe we can try to run that.
I don't know why some of this is shown in the docs the way it is — it doesn't make much sense to me — but we'll go back over here. Is the server still running? We renamed the file, remember. Let's go back a couple of steps, since the binary is in our Downloads, and run the Llama 3.2 llamafile — wait, I thought we were already running it. Maybe in another window? Yes, we were, over here. I'll stop this one; I don't want it running twice. So it is running right now, in WSL. In another WSL window — I apologize for the mess — do we have curl? We do, excellent. I'll paste in the curl command and hit enter, and it says it cannot connect on port 8080, even though this is on port 8080. I just need a scratch pad, so let me paste the command in and look at it carefully. It says /v1/chat/completions — I'm not sure that's really the endpoint for this server; it looks like the standard OpenAI-style path, which wouldn't necessarily be wrong. Nothing looks wrong here per se, but I'm not sure what "LLaMA_CPP" is — I honestly haven't seen that before — so I'm going to open a new tab and look it up.
Maybe it's a very specific model name — oh, they're saying llama.cpp, so they're literally referring to inference via llama.cpp. But we're using something else, a GGUF file, so let me go back over here. There should be some way to make this work; give me a second, I'll be back in a moment. Okay — it's still not working. I'm thinking maybe I'll just copy the command verbatim and give that a go, since I had taken out the Authorization header. Maybe we need to actually pass Bearer no-key, and that's the reason it's failing — when I tried a plain curl against localhost it also said it couldn't connect to the server. So hit enter, and now it works. The earlier attempt didn't error out per se, but this version has the bearer token in it — so why didn't the other one work? Let's paste that one in again — ah, it was missing the closing parenthesis, so maybe that was the problem. Try it and hit enter — yeah, I don't know; they just didn't give us good code, I guess. I don't love this — it seems like a lot of work. We might be able to get the programmatic route working, but since I don't have the Python environment set up here and I don't want to turn this into that video, we'll call it quits here. Still, it's interesting to see that you can run this.
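If you do have a Python environment handy, here's a hedged sketch of the same request the working curl made — a raw POST to the OpenAI-style endpoint with the placeholder Bearer no-key header (the model string is an assumption carried over from the docs example):

```python
# Sketch of the working curl call as a raw HTTP request. Assumes the
# llamafile server is on localhost:8080; "no-key" is a placeholder since
# no authentication is configured.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer no-key",  # placeholder token
    },
    json={
        "model": "LLaMA_CPP",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```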
Maybe when we get a bit more experience with llama.cpp this will be a lot easier, or once we have a proper environment it will be less of an issue. But we got it running, so we can clearly work with it — it's just that if you really want to work with it programmatically, you'll have to do more work. So why would you want this? Well, say you want to bundle an app and you just need a single binary to drop onto a machine — that is obviously a really easy way to do it. Personally, I don't think it's hard to use servers that actually serve model files, so I suppose you'd choose this when you don't want to include any additional code. Often if you're serving a model you're working with Python anyway, as a Python app. But if you had one binary that was your llamafile running the actual model, and another binary for your app, and you just wanted to drop the two onto a machine, then I
suppose that could be great for local inference. But that's it, and I'll see you in the next one. Ciao. [Music] Let's take a look at LangChain. This is an open-source framework for rapidly prototyping agents or LLM workloads, and here's an example of using LangChain on the left-hand side. Features include prompt templates, messages, output parsers, document loaders, text splitters, adapters for various vector stores, adapters for various embeddings, various retrievers to implement RAG, indexing, tool-use support, support for multimodal models, agent workflows, and example selectors. It has a lot of functionality built in so you don't have to code these things yourself — you can just use the SDK, start building, and connect to many different LLMs. LangChain was very popular for quite a while, but I don't personally recommend it, because I feel the API breaks quite a bit, so a lot of the documentation doesn't match by the time you want to bring it into production. I also wouldn't recommend it for production use cases, because you're basically limited to a single server — you have all these moving parts that really need to be broken up into containers or similar. However, while I say LangChain isn't well suited for production, the company has built a suite of tools so that if you want to use LangChain in production you can buy into their system, the LangGraph Platform. They also have another library called LangGraph; I don't know much about it, but my friend George, who knows a lot about building agents, says it's more low-level, so it might be a better version of LangChain — maybe he'll show us how to use it at some point. So you could ship an LLM workload as a minimum viable product with LangChain, but you'd probably have to rebuild it later using multiple containers or similar infrastructure. LangChain was one of the earliest GenAI frameworks that made it easy to swap out LLMs behind a unified API, and when LLMs had small context windows, LangChain worked around that limitation by providing multiple chaining strategies, like summarization — that's what I remember LangChain for, and it's why it's called LangChain. There's a competitor called LlamaIndex; I do prefer LlamaIndex, but to be honest I'd rather not use either and just write the code by hand, and I think a lot of people are going that direction as they find it easier. They're still around, though, and they're great for learning — so if you feel comfortable using them, go ahead and use them.
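To make the "prompt template plus model plus output parser" idea concrete, here's a minimal hedged sketch using LangChain's pipe syntax — it assumes the langchain-openai package is installed and an OpenAI key is configured, and since imports have moved around between LangChain releases, treat it as illustrative rather than canonical:

```python
# Minimal LangChain sketch: prompt template -> chat model -> output parser.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI  # requires OPENAI_API_KEY in the env

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")  # model name is an assumption; swap freely
parser = StrOutputParser()

# The pipe operator chains the three steps into a single runnable.
chain = prompt | llm | parser
print(chain.invoke({"text": "LangChain chains prompts, models, and parsers together."}))
```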
[Music] Okay, let's take a look at LlamaIndex. This is an open-source framework for rapidly prototyping agents or LLM workloads, and here's an example of some LlamaIndex code. LlamaIndex has wider support for data connectors and advanced RAG techniques, which is one reason you might choose it over LangChain — though the two are coming close to parity, so it really doesn't matter which one you use. I don't really recommend LangChain or LlamaIndex, because I feel they're an additional abstraction that doesn't always pay off in the long term, but I do want to mention both because you might be interested: they look like opinionated frameworks for building against multiple LLMs with a single API, even if they don't necessarily deliver on that promise, at least not right now. LlamaIndex has concepts like data connectors (so you can easily connect to multiple data sources), data indexes, engines, agents, observability, evaluations, and workflows. LlamaIndex has its own platform called LlamaCloud for enterprise production, similar to LangChain having its own platform offering. And just like LangChain, LlamaIndex isn't designed for production use cases — it's not something I'd want to put on a server at scale — but if there's an enterprise offering from the actual creators, they may have some way to make it work; then again, you're buying into their moat. So I'm not 100% certain I recommend these tools, but if you're learning to work with LLMs and you find them easy, they're something you can use. Sometimes they come with predefined agents, so you can see how agentic workflows and more advanced techniques are put together, and then figure out how to manually convert that over and build it yourself.
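Here's a hedged sketch of the basic LlamaIndex flow described above — load documents through a data connector, build an index, and query it (a basic RAG loop). It assumes the llama-index package (0.10+ layout) is installed, an OpenAI key is set for the default LLM and embeddings, and there are text files in a ./data folder:

```python
# Minimal LlamaIndex sketch: documents -> index -> query engine (basic RAG).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # data connector
index = VectorStoreIndex.from_documents(documents)     # data index (embeds the docs)
query_engine = index.as_query_engine()                 # engine on top of the index

print(query_engine.query("What do these documents say about GGUF files?"))
```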
[Music] Okay, hey, this is Andrew Brown, and we're taking a look at llama.cpp. llama.cpp is an inference server for LLMs: it implements the underlying architecture of Meta's LLaMA models in C and C++, intended to allow
improved performance when running models on CPUs. The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware, locally and in the cloud. Here's an example of using llama.cpp with Ruby — they have bindings for almost every possible language, which is really nice. They also have a CLI tool and a lightweight server, which are also really nice, plus lots of integrations with UI tools. llama.cpp works only with GGUF-format files for model weights. So this project is really intended for broad hardware support and for running very specific open-source models. I'd love to show you how it works — I've been having a lot of trouble with my local machine, but I'm trying to make a video for you, because llama.cpp comes up time and time again and I'd really like to showcase how to use it. [Music] Okay, hey, this is Andrew Brown. In this video I want to see if we can implement llama.cpp.
I'll just tell you up front: I've tried this video about nine times and I keep running into issues, because I'm having problems with my local development setup, and Lightning AI is giving me permission issues. So now I'm tempted to do something that seems bizarre for this particular lab, but it might work: I'm going to use Gitpod, because I know it has Homebrew, which gives me a high chance of getting this installed and working. So I'm launching Gitpod, and I'm going specifically to the Hugging Face GGUF / llama.cpp page, because the instructions there feel most likely to work. It says you can install llama.cpp through Homebrew on Mac and Linux, and that's exactly what we'll do. With Gitpod launched, I'll install it via brew and hopefully it just works. I should have been able to do this on my Intel machine, but I've been running that box for a very long time — it's a development kit, not a proper computer, and I've possibly given it some trouble by never letting it power down and cool off. Anyway, this installs llama.cpp; apparently you can also build it from source, which we won't do today. Once installed, you should have the CLI tools: llama-cli and llama-server. We'll give that a moment to install.
Alright, it should be installed now. Let's type llama-cli and see if we have it. It says "failed to open file: no such file or directory" — that's because we don't have any model files yet. It's clearly trying to run something, but we need to give it a model, and the example command specifies one from Hugging Face, a Llama 3 8-billion-parameter model. I don't really want an 8B model, but the fact that the command is right there might make this really easy. You can also pass -cnv to run the CLI in chat-completion (conversation) mode — I don't necessarily need that, but let's run the example and see if it works. Hitting enter, we get "command not found" for the --hf-repo flag — it's saying the flag doesn't exist, which is fine. I'll run llama-cli --help and look at the flags, because I need some way to download the model — and it looks like the flag is now -hfr. Hitting the up arrow to recall the command is kind of annoying, so I'll make a new plain-text file as a little scratch pad, paste the command in, and adjust: the syntax is a little different — -hfr for the Hugging Face repo, and -hff for the file within it. That's fine; we'll clear the terminal and run that instead. Now llama-cli starts, tries to load the model, and fails: "no such file or directory / unable to find" — but that attempt used double-hyphen flags, so maybe single-hyphen is what it wants. We'll check the instructions again — this is kind of a headache — yes, it's a single-hyphen flag with the full flag name. So maybe the flags were fine all along and the real problem is that the multi-line command lacked a line continuation. Let's give that a go — enter — there we go. So the code copied from their page was a bit messed up; they should have had a backslash at the line break, so it's a problem with their snippet. And now it looks like it's downloading the model.
I'm not certain where it's downloading to — maybe in place — but it is downloading the model, so we'll just wait for that. Alright: after the download it ran — interactive mode, "You are a helpful assistant." I say hello. This model in particular is kind of large for this computer, I think... there we go, there we go — it's working. I would rather have run a smaller model, but I wasn't sure where to find one, because the docs pointed at this one specifically. Is there a one-billion-parameter version? That's what I'd really prefer... yes, this one seems a lot more reasonable, and it doesn't look like there are any special access requirements to grab it. So I'll tweak the command — the 8B one did work, but I want something a little more performant, and these 1B-parameter models are really good for that — and replace the repo accordingly. On the model page, under Files and versions, there are several quantized variants of it; I'll just take the first one. I really don't care how well it performs — I just want it to work and download quickly, and to know we can swap these out easily later. So I copy that filename, paste it in, and hit enter. It obviously has to download the new model, but this one should be way less painful, since the last one took forever. How large is this file? 733 megabytes — and it's almost done downloading already, so we'll give it a moment. Cool — we say hello, and it replies with "what are you up to." We're just being silly here, not having a meaningful conversation, but that's one way to work with it. Now I'll make a new folder called llama-cpp — I honestly didn't think this was going to work, but it did, so I'm really surprised — and add a readme.md with this working command in it, so you don't have to go looking for it.
I'll also grab the model link so you can see where it came from. Now the question is: can we work with this programmatically? I'll make a new file — basic.py is fine; we'll do it that way. Going back to the llama.cpp docs, they have bindings; we'll work with the Python one, because I just want my life to be easy. We'll do something real simple: I'll grab the package name, create a requirements.txt, paste it in, cd into the llama-cpp folder, and run pip install -r requirements.txt. While that installs, we'll need a little bit of code, and they have an example, so I'll grab it. I'm not sure how this will work, because I don't know where the model downloaded to — the example takes a model_path, but I wonder if we could specify a Hugging Face repo instead; that would be really cool. Let's look at the Llama class: are these all the parameters? If I search for "hugging face" in there, we do get a Hugging Face option — okay. Anyway, I know the model is somewhere, so: where does Hugging Face store models when you download them? There's the ~/.cache folder — okay, you've got to help me a little more than that — ~/.cache/huggingface/hub? Let's ls and see if that even exists. There is a cache, but — oh, there's a llama.cpp directory. Let's ls in there and see what we have — it's actually storing the models over there, which is nice. Note this assumes you've already used llama-cli to download a model, because that's how the files got there; if you haven't done that, this won't work. And we do have some models downloaded there.
I want to use the smaller model, which is this one, so I'll grab its path. That looks fine — I'm not sure if the example's other parameters will work with it, but I guess we'll find out in a moment. We already did the pip install, and I believe the file is saved; I'll save again just in case. Run python basic.py and see what happens — it says the model path does not exist. Sure it does! Of course it does... maybe it can't handle the tilde in the path, so I'll run pwd and give it the full absolute path — back into workspace, then the GenAI Essentials repo, then llama-cpp — and try python basic.py again. Now it's running the model, and it outputs its result. There's a lot of other stuff in the output — oh, the example sets echo so the prompt is echoed back into the output, which is what we wanted. I'm not sure what's going on with all the other logging, but the completion does come back — I guess that's all the output. It is very verbose, though — searching "llama cpp verbose off" suggests verbose=False, which might be a little easier for us to work with. Let's see if that works... there we go. Run it again, and the output is far less verbose.
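Putting those pieces together, here's a hedged sketch of what basic.py ended up looking like — the model path is illustrative (it assumes llama-cli cached a 1B GGUF under ~/.cache/llama.cpp), and the prompt/stop parameters follow the llama-cpp-python README example:

```python
# Minimal llama-cpp-python sketch: load a local GGUF and run a completion.
from llama_cpp import Llama

llm = Llama(
    model_path="/home/gitpod/.cache/llama.cpp/llama-3.2-1b-instruct-q4_k_m.gguf",  # hypothetical path
    verbose=False,  # silence llama.cpp's very chatty load/inference logging
)

output = llm(
    "Q: Name the planets in the solar system. A: ",
    max_tokens=48,
    stop=["Q:", "\n"],  # stop before the model starts inventing the next question
    echo=True,          # include the prompt in the returned text
)
print(output["choices"][0]["text"])
```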
Which is nice. So there's our implementation with llama.cpp. Obviously there are a lot of integrations and other things built around it — the hardest part is just getting hold of the GGUF files. Again, this was something like my ninth attempt at getting this to work, and this time it worked fine, on Gitpod of all things. I love Gitpod, but I'd wanted to use my Intel machine or Lightning AI or something else — sometimes the easiest thing is the right one. So that's fine; I'll go ahead here and just label this llama.cpp, and I'm at least glad I got to show it to you, because this thing comes up again and again and again, and I really wanted to show you how to use it. There you go. [Music] Ciao. Before we can talk about bitnet.cpp, we need to talk about what a 1-bit LLM is. One-bit LLMs represent the most extreme form of quantization, using a single bit — a zero or a one — for each parameter. We learn about quantization in another video, but the idea is this: quantization takes the model's weights (and its activation functions) from a higher-precision representation to a lower-precision one. If a weight was represented as FP32 — a 32-bit floating-point value — here we're literally boiling it down to the smallest possible representation, zero or one. Obviously this would greatly reduce the model size and computational needs.
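As a toy illustration of that idea (this is not BitNet's actual training or kernel scheme — just a sketch of how drastic the compression is), consider collapsing FP32 weights down to sign bits:

```python
# Toy illustration of extreme quantization: FP32 weights -> 1-bit signs.
# Not BitNet's real algorithm; just shows the size arithmetic.
import numpy as np

weights = np.random.randn(4, 4).astype(np.float32)        # 32 bits per weight
one_bit = np.where(weights >= 0, 1, -1).astype(np.int8)   # keep only the sign

print(one_bit)
print(f"FP32 bytes: {weights.nbytes}; packed 1-bit bytes: {weights.size // 8}")
# 16 weights: 64 bytes at FP32 vs. 2 bytes packed at 1 bit each (32x smaller)
```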
So there's a framework called bitnet.cpp, made by Microsoft — that's why I'm calling it Microsoft's bitnet.cpp. The idea is that if you compare it to something like llama.cpp, which is already optimized in C and C++, you see huge gains — something like 55 to 70 percent, depending on the model you're using. However, as great as that sounds, there's a limitation on which models you can use with it: there are the BitNet b1.58 models — the Llama-based one would be the most interesting to try out — and the Falcon 3 family. I haven't had much chance to play with this, but I'd love to, so hopefully this video will be followed by a lab if I find time to figure it out. [Music] GGUF and GGML are file formats used for storing models for inference, often GPT-style models. GGUF is the successor to the GGML format, so you'll mostly see GGUF; we still mention GGML because they're basically the same thing, just the older and newer versions. GGUF is a binary format designed explicitly for fast loading and saving of models; it's specifically designed to store inference models and to perform well on consumer-grade computer hardware. GGUF also supports fine-tuned models: models initially developed in frameworks like PyTorch can be converted to GGUF format for use with those inference engines. On the right-hand side we're looking at Google's Gemma LLM on Hugging Face, and you can see they have a GGUF file. GGUF files can be executed by Ollama, GPT4All, and llama.cpp — I sometimes write CCP instead of CPP; let me double-check — yes, it's llama.cpp, so apologies for the earlier slip. That makes sense, because when we use Ollama — a way of serving models very easily on consumer-grade machines like laptops and desktops — having that GGUF file is a great benefit.
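If you're curious what's inside one of these files, the llama.cpp project ships a small gguf Python package that can read the header — a hedged sketch below; the package's API has shifted between releases, and the filename is illustrative:

```python
# Hedged sketch: inspect a GGUF file's metadata with the `gguf` package
# (pip install gguf). The path is a placeholder.
from gguf import GGUFReader

reader = GGUFReader("gemma-2b.Q4_K_M.gguf")    # hypothetical local file
for field in reader.fields.values():
    print(field.name)                          # keys like architecture, tokenizer info
print(len(reader.tensors), "tensors in the file")
```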
But there you go. [Music] Let's take a look at context window caching — also known as prompt caching or context caching — which is when computed context is stored in memory to help improve response times for LLMs. Where would prompt caching be useful? Chatbots with extensive system instructions, repetitive analysis of lengthy video files, recurring queries against large document sets, and frequent code-repository analysis or bug fixing. Prompt caching is offered by some providers for very specific model versions — I'm not listing the versions here, but two providers that do context window caching are Google Gemini and Anthropic Claude. I haven't seen many LLMs implement it, and it's not something I would even know how to begin implementing — though if we just talk it through for a second, maybe doing it locally wouldn't be that hard to reason about. Here's an implementation with Google Gemini — let me get the pen tool out to make it a little clearer. The idea is that, using the Google Gemini API, we create a content cache, and we provide it the contents of a video file (remember, this is really good for very large video files or very large document sets). Gemini converts that data into the token format the LLM needs, and then every time we invoke the model it doesn't have to take that data and convert it into its tokenized form again. If we wanted to implement this ourselves it would be really tricky: we'd have to understand the model's architecture, split off the part that does the tokenization, tokenize the data ourselves, and then inject the data after that tokenization step — you'd really have to mangle a model to do it. Anyway, this feature is really great, but it's only available for very specific models. Not only would you gain improved response times for very large files or documents, you could also save money: with Google Gemini the cached tokens might be billed at a reduced rate — and when I say "might," they are — though I don't know whether the same holds for Anthropic's Claude Sonnet. Again, I've only seen this in a few models, so it's not ubiquitous, but it's worth checking out if you're working with very large video files or large document sets.
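Here's a hedged sketch of that Gemini flow using the google-generativeai SDK. The SDK surface has changed across releases, so treat the names as illustrative; it assumes an API key is configured, a caching-enabled model version, and that the uploaded file has finished processing:

```python
# Hedged sketch of Gemini context caching: upload a large file once,
# cache its tokenized form, then reuse the cache across requests.
import datetime
import google.generativeai as genai
from google.generativeai import caching

video_file = genai.upload_file("lecture.mp4")   # hypothetical large input

cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",        # caching needs specific model versions
    system_instruction="Answer questions about this video.",
    contents=[video_file],
    ttl=datetime.timedelta(minutes=30),         # cached tokens expire after the TTL
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)
print(model.generate_content("Summarize the first five minutes.").text)
```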
[Music] Okay. Structured JSON is when you want to force an LLM to produce structured JSON as its output, and there are multiple techniques to do that: context-free grammars, finite state machines, and regexes. Structured output can come from a third-party library, or it can be built into the API of an LLM, in which case the provider is implementing that extra step for you. There are multiple strategies, but from experience I'll say: just telling an LLM to give you JSON back is not easy, so you really do need a secondary step outside the LLM to force it to do exactly what you want. Generally, here's how the process works: you have an input and an LLM, and every time the model produces a token, you have a schema — represented as Pydantic or JSON Schema — and you force the next token to be one that still makes sense. If you're producing JSON, the very first character should be a square brace or a curly brace; if the first token out of the model doesn't match the regular expression, it gets thrown away until the output matches what's expected. You're basically forcing it, token by token, to produce exactly what you want. Hopefully that makes sense — this stuff gets really complicated. Some providers also tell you that you have to instruct the LLM to generate JSON in the prompt: Cohere, for example, specifically says to tell the model to generate JSON. For the most part, though, if you're using a third-party library rather than the LLM API's built-in support, you don't have to do that. Okay — let's look at two ways to use structured JSON where it's built into the API; in a separate video we'll look at how to use a third-party library.
With OpenAI, they have an API for structured output that requires the use of Pydantic. Pydantic is a validation tool for validating data structures in Python, and you can use it to represent your JSON — it does get converted to a JSON schema at some point, but requiring Pydantic is something you'll commonly see with structured outputs. Here we're defining a schema — I want a bunch of phrases, and each phrase is represented by an action verb — and we pass that in as the response format, which produces JSON back. I have another example with Cohere, which uses JSON Schema (as in the json-schema.org specification) instead: we're asking for the exact same thing, but now we define a JSON schema rather than a Pydantic model, and pass it along as the response format.
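Here's a hedged sketch of the OpenAI variant just described. The parse helper lives under a beta namespace and the schema is made up for illustration, so treat the details as assumptions:

```python
# Hedged sketch: OpenAI structured outputs with a Pydantic schema.
# Requires a recent `openai` package and OPENAI_API_KEY in the env.
from pydantic import BaseModel
from openai import OpenAI

class Phrase(BaseModel):
    action_verb: str
    phrase: str

class PhraseList(BaseModel):
    phrases: list[Phrase]

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",                     # model name is an assumption
    messages=[{"role": "user", "content": "Give me five motivational phrases."}],
    response_format=PhraseList,              # Pydantic model becomes a JSON schema
)
print(completion.choices[0].message.parsed)  # a PhraseList instance, not raw text
```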
But I do want to point out that getting structured JSON back is very challenging. Sometimes if you make the schema too complicated, you get really bad results; sometimes the naming of the fields within the JSON object really helps the model. You really have to work hard to get good JSON output back, and sometimes it forces you to use specific models — often the only reliable results come from something like OpenAI. The frustrating part is that we don't know exactly how their structured JSON works under the hood, and a lot of these are beta features, so I don't know if they'll vanish in the future in favor of third-party ones. But we'll look at third-party ones we can implement as well. [Music] Okay, let's take a look at Instructor, a third-party library that can produce structured JSON output. I was super excited to find this — there were a bunch of other libraries that didn't really work, but this one worked perfectly, at least for the code examples I pulled from its docs, which you can find at python.useinstructor.com. The only integration I kind of wish it had was something like Amazon Bedrock, but I understand that's not there at this time. Anyway, here's a simple example with Groq — and when we use Groq later, we end up using this implementation — and you can see it uses Python's Pydantic. The idea is that you can use this output format with any of these providers: it's not reliant on a provider's built-in structured output, it just works across all of them, which I think is really cool.
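A hedged sketch of the Instructor pattern — patch an OpenAI-compatible client so responses come back parsed into a Pydantic model (names and model string are illustrative; Instructor exposes similar from_* helpers for other providers such as Groq):

```python
# Hedged sketch: Instructor turns chat completions into validated objects.
# Requires `instructor`, `openai`, and an OPENAI_API_KEY.
import instructor
from openai import OpenAI
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_openai(OpenAI())
user = client.chat.completions.create(
    model="gpt-4o-mini",        # assumption; any supported chat model works
    response_model=User,        # Instructor validates/retries until this parses
    messages=[{"role": "user", "content": "Extract: Jason is 25 years old."}],
)
print(user.name, user.age)      # a typed Pydantic object, not raw JSON
```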
But there you go. [Music] Okay, hey, this is Andrew Brown, and in this video I want to take a look at Open WebUI — I believe that's what it's called; if not, we'll find out. I'm told it gives us a friendly chat UI that we can use with Ollama and the like, and that's probably a great idea if we can get it working: if you have a decent CPU or GPU and, say, ChatGPT or Claude is down, you can use open-source models to get something similar. So this is something I'd like to try out — and if it works, of course I'll include it, so if you're watching this video, that's the case. It seems very easy to install: pip install open-webui and then serve it — if it's that simple, that'd be great. We can also run it as a Docker container, whether Ollama is on the same computer or on a different one. So the question is which way to run it; I like the idea of the Docker container, so I think that's what I'll do today. I believe Docker is installed on this computer — if not, we should install it; let me check. We don't have Docker, so I'll install it: sudo snap install docker (I'm not sure what the password is, but maybe that's it), and we'll see if we can run this within WSL — if we can, that'd be great. Alright, supposedly Docker is now installed. I'll grab the docker run command from the docs and paste it in — and we get "permission denied," because Docker is installed but apparently needs sudo. I actually don't like it when Docker requires sudo, but since I used sudo snap install I'm not exactly sure how to fix that, so I'll just prefix sudo for now; at some point I'll figure out how to use Docker without sudo — if it had been a manual install, I'd know exactly how to fix it, but not with snap. So now it's pulling the image, and we'll just wait a little.
I would absolutely know how to fix that but not with pseudo snap and so we will pull this Docker container and we'll just wait a little bit here all right so um that's now pulled now it's not running per se I thought maybe we would just run as we did it there um so we're going to grab this one more time we actually aren't running are we running AMA right now it is running okay so make sure that it's open and I'm going to go ahead and paste this in I'm not sure if it's going
to work from here but we have again it's the uh that issue there so I'll type in pseudo and it says already in use oh you know what maybe it's running in the background okay so I'm going to do Docker Docker PS pseudo Docker PS that's going to drive me crazy how can we use use Docker without pseudo because I'm not I'm not GNA stand for that and really depends on how we've installed it ster D it's something like this so like Docker User Group yeah I do have instructions yeah to run a Docker command
without pseudo you need to add your user who has root privileges to the docker group okay well let's just try that maybe that will fix our problem because that'd be a lot nicer to do that yeah because I use pseudo snap I kind of regret installing it that way I might end up uninstalling Docker just so that it's done correctly but for now it is running so we say pseudo dock pseudo Docker uh PS and it says this running on Port 880 or it's Port 3000 we'll find out here in just a moment so let's
try uh 8080 first if it's not then it's Port 3,000 I can never remember what direction it goes in it's never very clear yeah see I would have thought it would have been Port 80 8080 because it's like for 3000 to 880 but really it's 3,000 goes to Port 880 and so here we are in our environment it's very interesting get start with open open web UI open webui does not make any external Connections in your data securely stays here okay so I assume I'm creating one locally I'm not sure why I have to enter
a password, but I'll use a very simple one — and I'm not sure whether it sends any data out, but anyway, we're in. There are no models available, though, so if I just say "hello" it probably won't work — right, there's no model selected. For this to work we'll need to add a model, and pressing the plus button just confuses it. So let's open a terminal and pull a model with Ollama. What's the command — ollama pull, I think. ollama pull llama3.2:1b — I'm really just guessing here — no, that's not it. Gemma is supposed to be pretty good; let's try pulling the two-billion-parameter Gemma 2. I just want to pull it, not run it, so we pull it like that and wait for the download; back in a moment. Alright, we've downloaded a particular model, so back to the Open WebUI interface — again, I don't have much experience with this, but let's find out: does it show up? If I open the model dropdown... it doesn't see the Ollama model. I'm not sure if that's because I'm in a Docker container inside WSL and it's just not aware of Ollama on that network — that's a good question. That seems like a very hard thing for
it to figure out. [Music] So here's what I'll do: I'm going to sign out and try running things the other way around — because Open WebUI is running in WSL right now, and if I type ollama inside WSL, Ollama is not accessible there; my install is on the Windows side. So it really depends whether I want Ollama on the exterior or the interior, because otherwise the container can't reach it. I'll remove the model from the Windows side — ollama rm gemma2:2b — so there's nothing in that Ollama, and install Ollama inside WSL instead with sudo snap install, because all I can really think of is that the container doesn't know about Ollama on the Windows side from within WSL. This is where networking becomes really important. I was just working with Tim — you'll see him in the boot camp as we work on a project — and his background is more networking while mine is more development; for him this would be nothing to figure out, but for me it's very challenging. My friend B would probably have an easy time with it as well. Okay, there we go — Ollama is installed in WSL, so ollama pull on this side. Maybe the other issue is that it's not running — though I'd hope that if the UI is aware of an Ollama install, it can list the models that are there whether or not they're loaded, and start them up on demand. This pull is going pretty quick, so it's not that bad — almost done. Alright, we've pulled the model; back over to localhost again. Maybe the server has to be running — we don't have exact instructions — so I'll log in again. I think Open WebUI is really meant as a local server — like if you had a bunch of people with local compute on a local network that you want to serve; maybe that's the real use case. It's still not seeing the model here, so I'll run ollama... I'll just say run. Let's give this a refresh. Okay, I still don't see it. Let's actually give the docs a read: "Open WebUI can be installed with the pip Python package manager; before proceeding, ensure you're
using Python 3.11 to avoid compatibility issues. If Ollama is on your computer, use this command" — installing Open WebUI with bundled Ollama support allows a streamlined setup with a single command. Okay. So earlier we ran the docker command with the -d flag — that's how we know it's still running in the background. I'll leave this tab open, open another, and do sudo docker ps — I really regret installing Docker via snap; now I have to sudo every single time. I want to stop this container... what's the command again? docker kill — I forgot — is it kill? Yes, docker kill. So sudo docker kill, and now docker ps shows it's not running anymore — perfect. Now I'll go back to the instructions: I want a similar run command, just without the -d, so I have control to stop and start it and my life's a little easier. I bring it back over, paste it in — with sudo in front, as it always wants — and it complains that a container with that name already exists. I don't care; I'll just call it 2. Back in the browser... and now it's complaining with an error. I'm not really liking this — there's probably more to getting it to run properly as a container, more to read about — but I just want this to work, so I'll take the easier route: pip install open-webui. It says there are no matching distributions... are you sure? Probably because of the Python version — python --version says 3.12, which I'd think should work for
"open your terminal and do pip install open-webui." [Music] Okay, thinking about this for a second: let's take a closer look at what that docker run command is actually doing. We've got the name (fine), the restart policy (fine), running in daemon mode, the add-host host-gateway flag, a mounted volume, and the container name. I'm not sure why it cares that the name is the same — sudo docker images looks fine, and sudo docker ps shows no container running — but if I paste the command again it says "the container name is already in use by container...; you have to remove or rename the container to be able to reuse that name." I just want to reuse the container based on its name! Okay — try again — there we go. Open it up... it shows an external error without telling me why — oh no, wait, now it's working. Okay, great. So try signing in again — my email and the password I set — and I'm now logged in, but it still just doesn't see Ollama. Let's go over to Settings and see our options: Interface, Personalization, Audio, Account, Admin Settings, Connections. Here we have the Ollama API section, which says "manage API connections" — so maybe the issue is where it thinks Ollama is. I'm going to open ChatGPT in another window — minimizing this for a second — and ask: I'm trying to use Open WebUI to load models from Ollama, and in the admin section it shows this. I'm not sure how it would know where my existing Ollama install is — again, I'm terrible at networking — but is 11434 Ollama's default port? It is: 11434 by default. And services running in Docker can't directly access localhost on
your machine; you use host.docker.internal, which resolves to the host's IP address so the container can route traffic back to the host system. Okay, that makes sense — and if Ollama were running, it would be on port 11434, so that makes sense too. Many web-UI frontends for LLMs include a default connection string, so verifying that Ollama is actually running on port 11434 is a good thing to check. Back on the remote desktop, in VS Code, I run that check and get "unknown port" — so I type ollama, because it should be running right now unless we told it to stop... maybe it's not — oh, it is, right here. I'll quit out and run ollama serve — and it says the port is already in use. I wonder if another instance is the issue, so I quit that one and try again — still in use. ollama stop... ollama serve... it already says it's on port 11434. The help output lists the subcommands — create a model, run a model, pull a model, list models — so clearly Ollama is currently running, already on port 11434 inside WSL; it's not running on the Windows side anymore, so there's no way the port is being taken up from over there.
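As a quick sanity check (a hedged sketch — Ollama ships a REST API on its default port, including a model-list endpoint), you can confirm from Python that the server is up and see what's been pulled:

```python
# Sanity check: is Ollama answering on its default port, and which
# models are pulled locally? Uses Ollama's /api/tags endpoint.
import requests

resp = requests.get("http://localhost:11434/api/tags")
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])   # e.g. gemma2:2b if that's what was pulled
```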
So back in the browser on the Connections page — I wish there were a "test connection" button; that would make this a lot easier — there's a troubleshooting link: "If you're experiencing connection issues, it's often because the WebUI Docker container can't reach the Ollama server. Use the --network=host flag in your docker command to resolve this. Note that the port changes from 3000 to 8080, resulting in the link http://localhost:8080." So we did exactly what the docs originally told us, and now they're telling us to use this other command instead — which is fine. Meanwhile: docker ps... I really don't like the way I've installed Docker. Can I sudo snap remove docker? It first has to stop the service, which is fine — but how do I stop Docker? That's not usually how I use it... probably sudo systemctl stop docker; that seems like one way. Did it already stop? No. Try stopping it this way... yeah, see, I don't know. [Music] Maybe one reason it can't stop is that it's still running a container, so sudo docker ps — okay, now it says nothing's running, which is great. I'll just wait a moment and try to uninstall. Alright — I've removed the snap-installed Docker, which I again greatly regret ever installing that way.
I have installation instructions somewhere, but I'm just going to look up "Docker Ubuntu," because I find the instructions right on the Docker website work a lot better. I have the page pulled off-screen and I'm grabbing the steps from there: before you install Docker, you need to set up Docker's apt repository. I'll run through all of it — I didn't want to do an apt update right now, but I'll see if I can get away without one; you usually can't, and the instructions do an apt-get update at both ends anyway, so we're doing one either way. A little aggressive on both sides, but that's fine. Once that's done we grab the rest of the install commands, answer yes, and then notice they show sudo docker run hello-world — I really don't like needing the sudo, which we'll fix in two seconds. Paste it in; it pulls the hello-world image and runs. Now, the docker group: I guess we could have added the group earlier, and maybe that was the problem all along, but I think this install route sets things up better. I run the group commands — it created the docker group and added my user — so now: docker run hello-world... hmm, still not working. Maybe the shell just isn't aware of the new group context, so try again... sudo systemctl start docker, then docker run hello-world... still permission denied connecting to the Docker socket. One second — this is frustrating, this is just annoying. I've solved this before, in the GenAI training-day workshop, because having to sudo was really annoying me, so maybe I have the commands there... yes, these three lines. I'll run all three — the key is that the last one (newgrp docker) has to run or the group change doesn't take effect in the current shell; the group itself obviously already exists. Now docker run hello-world — and it ran. So now we're in a saner world where we can work with Docker in a much easier way.
So back to the troubleshooting doc: they said if we're having issues, run this particular command, and before I run it, let's compare it with the old one. This kind of makes sense — instead of relying on the internal host-gateway IP, we pass the host network explicitly; we're not mapping ports anymore, we're just saying host, and it runs there directly. Honestly, I think this version looks better. It's pulling the image again — I thought we already had it; maybe the image path is slightly different, so Docker thinks it's something else — and maybe this way it will be able to detect Ollama. Alright, the container is pulled and, I assume, running, and I believe it said it should be on port 3000. Let's take a look... sometimes it throws that error even when there isn't a problem, so we'll try one more time... hmm. Let's docker ps and see if we can get into the container — I don't remember the exec command off the top of my head, but the Docker extension for VS Code makes working with containers a lot easier, so let's grab that. Okay, extension installed; give this a refresh — "permission denied trying to connect to the Docker socket." Oh, come on — what a pain, what an absolute pain. This could in theory be really good if we could just get it to work. "Docker failed to connect"... I installed Docker the most manual way I could... maybe there's a Docker-extension permissions setting in VS Code? We already did the groupadd and added the user... Maybe VS Code just isn't aware of the new group — let's close it completely and reopen it. Back over to the Docker extension once it loads — and now we can see the container. Great: all we had to do was restart VS Code, so that was not too bad. I'll right-click the container and view the logs — no errors, it looks like it's running, and it says port 8080. So I wonder if it's actually running on port 8080, even though I was pretty sure it said port 3000. Let's try 8080 — yes, it's actually on port 8080; I could have sworn the doc said the port changes from 3000 to 8080, so that checks out. That's totally fine — I'm
fine with that. I have to create an account again for the first time — I guess these accounts are stored locally per container — so use whatever you want, with a simple password, and... there we go. Nice. I don't think the model is running yet — it's just pulled — but let's try: "Can you produce a simple example of pandas?" There we go — that's nice. Of course it isn't super fast — fast enough, but this is a 7-billion-parameter model, and it's quite impressive that I'm running it locally. So I now have a local UI, which is really nice. With a bit more GPU under the hood, I could see this replacing the smaller hosted models — the minis and things like that; not that you ever seem to run out of uses for those, but if you were doing a large batch job, you could do it locally, and that would be very cool. That's all I really wanted to show — but you saw there was a lot of struggle with Docker containers and such, which just shows you that it's not all about working with GenAI; you still need to know all the tooling around it. Anyway, that's done, so I'll stop the container — docker ps, then docker kill it. The only thing I wonder is whether it's storing state inside the container, so that I lose those accounts every time; there's probably a way to persist it by connecting a store. I could see that if you have decent compute on your network, then everybody could use this — and I like that. So anyway, there you go: that's Open WebUI. [Music] Hey, this is Andrew Brown, and in this video I want to take a look at GitHub Copilot. I actually heard it just got a free tier —
it's showing right here now available for free up to 50 chats in 2,000 codes of completion so let's go ahead and give that a go um hopefully it's as easy as going to here to find it and actually we have co- pilot open here but what I want to do is actually install into vs code apparently you can install into a bunch of stuff the fact that it could be worked in neovim sounds really cool um I'm very very tempted to import that into neovim um I do not know yeah I have neovim installed on
this computer but I probably should just keep it simple here as I don't think everyone could handle that so we'll go ahead and click on Visual Studio code obviously you should have a GitHub account before you do that and we have the install button and this is going to open up vs code so this is clearly going to be my local developer environment you can install this into GitHub codes spaces but you could also install this into um you can install into gitpod unfortunately because they use a thirdparty Marketplace um at least in the current
version of G pod classic maybe the any when you can so this is installed I already actually had it installed before um I believe it's done an update so sign up for GitHub co-pilot free um I mean let's click it I I figured I'm already signed up okay so I'm over here and so we can start our free trial so let's go ahead and do that oh that's how they get you no no no no hold on hold on hold on you're using copilot for free okay okay so we don't have to do this but
at least we know it's like $10 a month or $100 a year that's not bad for pricing uh in can Canadian dollar is not so great our dollar is a lot weaker right now so if I did 10 USD to CAD it's $14 so not the best price for Canadians okay but um yeah that is what it is now I've always found that the code produced by Code Pilot is not as good as chat GPT or Claude um but we'll see um I also heard that you now have the option to change uh your model
which is something that might be interesting to check out, so I'll give that a go. Let me open this up and we'll take a look at Copilot. What I need is our actual GenAI Essentials repo, because we'll work in a new project here. I'll open up the terminal; I'm not sure if I already downloaded it... it's called genai-essentials... it is not downloaded, so I'm going to clone it. It's a public repo, so you can clone it as well. I'll go over to GitHub, find GenAI Essentials, and grab the SSH URL (clone it however you like). git clone, paste that in, and that downloads the repo. We could have done this in GitHub Codespaces; I just don't feel like launching Codespaces right now. I'll cd into the genai-essentials directory and run code . to open it up. So now we're in Visual Studio Code; it's getting ready, and I'm going to make a new folder called github-copilot. What I want to do is see if we can make our Linktree clone, as that would be a good thing to build. "Your GitHub Copilot access has been disabled by the organization": that's fine, I just want to use regular GitHub Copilot, and that shouldn't be an issue. We were paying for it at the organization level, and I'm not sure if we've just unsubscribed temporarily. So I'm not sure if GitHub Copilot is working here, but let's find out: I'll type in copilot and see if we can get working with it again. The last time I used this... maybe I just have to open up a file. It depends on what we want to write this in. I really like Ruby and would love to make a little fast HTML app in it, or maybe Go; I don't know Go very well, so I think that would be a really good challenge, so let's write it in Go. We'll create main.go so it knows there's a Go file here; yes, install the Go extension. I mean, I know Go, but it's not something I use every day, so I'm not going to remember how to write Go code from scratch. So I want to bring up the GitHub Copilot chat, which is over here. I don't like it on the right; I'd rather
have it on the left, but I can't figure out how to drag it, so it's just going to be what it is. "Copilot is powered by AI, so mistakes are possible; review the output carefully." OK. So: "I want to write an API backend for a simple web app that is a Linktree clone." Let's just see what it can do. And by the way, look down here: we can change our model, and it's switching between the two. I wonder if we'd get more choices if we were paying; you'd think I'd know, since I'm supposed to be making a GitHub Copilot course (we're still working on it). So here's a proposed directory structure for the Go API. It's a lot of stuff, but sure. "Create workspace"? Can I not create it in place where I am? Can we just insert the code in place? I don't really want to make a new workspace... oh, OK, well that's not very helpful. So I'll just start typing. What's the keyboard command for GitHub Copilot? Let's go find out: "GitHub Copilot hotkey VS Code". That seems like the first thing they should tell you, right? And... yeah, this is not helpful. Let's go back; I wonder if I could just open the code
here. There we go: maybe I can just open the code here and replicate the structure, since there doesn't seem to be any other way to bring it in. So here we have a main application, and it says link-tree-clone. I don't really like the way this is written, so: "is there any way you can simplify the amount of files for the Go app?" What it suggested felt like way too much. There we go, that's a lot better. So now we have main.go, go.mod, and go.sum, and again, I'm not expecting you to know any of these, but go.sum is what gets generated, not something you write by hand. What we have here is "insert in place" and "insert after cursor", so we'll go into main.go and insert that here, then go over to go.mod and insert that into here. And yeah, that's what I thought: you wouldn't be writing go.sum yourself; that's generated when you actually build the project. So in main.go it's bringing in Gin (Gin is a web framework for Go), plus godotenv to load a .env file, which I like, and log and net/http, all good. We have links, which is pretty much what we'd expect it to have, though it's returning the links as plain strings. That's not very structured; I feel like you'd want a name and a title on each one, but that's fine, I suppose.
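To make that concrete, the generated main.go was roughly this shape. This is a sketch under my own naming, not Copilot's literal output, assuming the gin-gonic/gin and joho/godotenv modules it listed:

```go
package main

import (
	"log"
	"net/http"
	"os"

	"github.com/gin-gonic/gin"
	"github.com/joho/godotenv"
)

func main() {
	// Load .env if present; a missing file is fine for local defaults.
	_ = godotenv.Load()

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080" // default when PORT isn't set in .env
	}

	r := gin.Default()
	// First pass: links are just plain strings, the unstructured
	// shape complained about above.
	r.GET("/links", func(c *gin.Context) {
		c.JSON(http.StatusOK, []string{
			"https://github.com/example",
			"https://youtube.com/@example",
		})
	})
	log.Fatal(r.Run(":" + port))
}
```

With this shape, GET /links just returns a hardcoded JSON array of strings, which is exactly the limitation we'll fix in a moment.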
Then over in go.mod, those are the two modules it's installing. OK, so the next question: how do I start working with this, how do I build the code and run the server? It's been a while, and I probably don't have Go installed on this computer, but we'll see. "Navigate to the project directory," of course, so we'll cd into github-copilot. I'm going to make a subfolder called link-tree-go and grab these two files... actually it probably has to be link-tree-clone, because that's the module name, and I have a feeling the name matters here; that's just from memory. I just want to rename the folder... there we go, sorry, I was getting mad. So, link-tree-clone, and we'll cd into it. I'll bring the chat over a bit so the text wraps and we can at least see it. It's suggesting we navigate to the project directory; oh, do we have to be up one level? Maybe not. Let's see what happens, because that was the old folder name and we didn't tell it we changed it. I'm going to go ahead and try this, but Go's not installed, so I'll make a new README file and ask: "how to install Go on Ubuntu WSL". Of course, if your setup is different you'll have to figure that out yourself. Which option do we want? I don't know, so I'll take golang; I think they're just different compilers. We'll try this,
though we probably should look it up properly; I'm just going to go for it. GCC (gccgo) sounded pretty good too, but we'll wait for this to install, and I'll be back in a moment. All right, so supposedly we have Go installed, and the way we'll know is if this actually works. We run go run main.go and it says expected 'package', found... OK, so we go back over here: the package line is commented out. Yeah, that's what I thought was a bit unusual; I wasn't sure, it's been a while, but that makes sense. So let's run go run main.go again, and it is compiling; it should compile into a single binary, which is super nice. We'll give it a moment, and I'll give this a refresh: now we have a go.sum file. "Now you can run the server with main.go." OK, so what did we do? We did go run main.go... oh no, here they had a build step. "Alternatively, you can run the server directly without building it first." Oh right, right, fair enough. So we could build it and then run it; it depends on how we want to do it. I'll ask "how to build", since I kind of like the idea of building it. The whole point of this is that we shouldn't need to know Go. Well, we should
know Go to some degree, but if we can do this without having great Go knowledge, that's going to be really good. Clearly it's expecting a .env file, so we'll make a new .env here. I'm not sure what to put in it, so looking back at the code: godotenv loads it, and right now the only thing that's read is PORT, and if it's not set it defaults to port 8080, which is totally fine. So let's hit enter, and the app is now serving on port 8080. We'll open the browser; nothing exciting, because this is not a website, it's an API that we're designing. Back in main.go, all it has is /links, and that should serve back JSON, as it says here. So in the browser we type /links and we get back some example links. OK, we're starting to get somewhere. Now, it'd be nice if it started suggesting code inline, and I'm not sure why it's not doing that; give me a second. It's suggesting that if I don't see completions there may be a setting to change: File, Preferences, Settings. I'm not sure that's true, but we'll take a look. Maybe there's something for GitHub Copilot specifically... Copilot, there it is. It's enabled... "enable auto completions":
it seems like it is enabled. All right, that seems fine, so I'll go back... oh, now it's suggesting; maybe that option I changed did help a bit. So maybe we start documenting: I'll write a comment saying "returns an array of objects with name and link", hoping it will change the handler. Copilot... here we go: comment, explain, editor inline chat. Great, so I'll say "change to have an array of objects with name and link", because that's really what we want. That's GPT-4o, so that's better, and we'll hit accept. Now, to see this we'd have to rebuild every time, which is not a big deal; it's not going to hot reload, at least I don't think so. So I'll stop it, run it again, go back to the browser, give it a nice refresh, and now we're getting back the structure we want: an array of objects with a name and a link.
One thing that's not being returned is a profile; that's something we had in earlier builds, and I think I'd prefer to have it, though I'd have to think about what goes in it. We do have a front end from another app, so maybe we should bring that code in here. I'm trying to think of where I may have written it before and whether it's synced into this repo. Thinking... I don't see it in here, most likely because I didn't copy it in; we generated a Linktree front end using, what's that platform called, Lovable, right, and we also made one with v0. But the question is whether we should build our front end with a different tool or code it all from scratch, and it's a lot of work to make a front end from scratch, so maybe we just focus on the API endpoints. We don't have to make a fully working app; I'm being a little ridiculous here. Let's just change this a bit more: "I need to also return the profile information of the user, so go ahead and do that." And so now we have that new structure.
Again, this won't appear unless we do a rebuild; that makes sense, because it's a compiled binary. So I'll run it again. The next thing I'd want is to change the code so that it loads from a database: "can we load the data from a SQLite database?" Let's see if it can do that. I'm not exactly sure what it means when it shows it like this: "apply in editor"? OK, it added it to the correct file, that's good. Did it have the previous version? It did, great, so we'll accept those changes. We'll say "apply in editor" and it jumps to the file, which is also good, and we're waiting for it to make the changes... there we go. Here we can see the clear difference: we're importing the SQLite driver and, oh, GORM, the Go ORM, and we're creating structs to represent the structure of that information. Here we're loading the SQLite file called links.db, which is totally fine, and we have db.AutoMigrate, so the idea is that we describe the database structure in the structs up top and then apply the migration. That's cool; I don't think you'd keep that in your code like this forever, but for a simple example it's fine. Instead of hardcoding the data, we're now pulling the links in from the database; though notice this is only bringing in the links and has ignored the profile.
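The GORM version it generated was along these lines. This is a hedged sketch, assuming gorm.io/gorm with the gorm.io/driver/sqlite driver, which is what the imports appeared to be:

```go
package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
	"gorm.io/driver/sqlite"
	"gorm.io/gorm"
)

// Link doubles as the GORM model and the JSON shape served to clients.
type Link struct {
	ID   uint   `json:"id" gorm:"primaryKey"`
	Name string `json:"name"`
	URL  string `json:"link"`
}

func main() {
	// Open (or create) links.db; AutoMigrate creates the table to match the struct.
	db, err := gorm.Open(sqlite.Open("links.db"), &gorm.Config{})
	if err != nil {
		panic(err)
	}
	db.AutoMigrate(&Link{})

	r := gin.Default()
	r.GET("/links", func(c *gin.Context) {
		var links []Link
		db.Find(&links) // SELECT * FROM links
		c.JSON(http.StatusOK, links)
	})
	r.Run(":8080")
}
```

AutoMigrate creating or updating the table on startup is convenient for a demo, even if you'd use real migrations in production.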
I'll accept it anyway, and we can always tweak from there. I also need some test data, so back in the README: "create SQLite table with some seed data." It says: run the following commands to create and seed a SQLite database called links.db. It's not giving me inline suggestions, which is fine, so I'll right-click, Copilot, editor inline chat: "generate the example code needed", and see if it can do that from context. That looks good to me; I'd imagine it creates the SQLite file in one go. I don't know if I have SQLite installed on this computer; we'll find out in a moment. I'll stop the app, drag this over, and check: which sqlite3. So it's already installed. If it isn't on your machine, something like sudo snap install sqlite3 might work; you'll just have to figure it out for your setup. We'll go ahead and run the command; I have no idea if it's going to work, and we have to run it in the exact same directory or it won't. Hit enter... did it create a links.db? Give this a refresh. It did.
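If you'd rather not depend on the sqlite3 CLI at all, the same create-and-seed step can be done from Go itself, reusing the exact same GORM setup as the app. A sketch, with made-up seed rows:

```go
package main

import (
	"gorm.io/driver/sqlite"
	"gorm.io/gorm"
)

type Link struct {
	ID   uint
	Name string
	URL  string
}

func main() {
	db, err := gorm.Open(sqlite.Open("links.db"), &gorm.Config{})
	if err != nil {
		panic(err)
	}
	db.AutoMigrate(&Link{})
	// Insert a couple of throwaway rows so /links has something to return.
	db.Create(&[]Link{
		{Name: "GitHub", URL: "https://github.com/example"},
		{Name: "Blog", URL: "https://example.com/blog"},
	})
}
```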
If we want to explore that file, I'll look for a SQLite file explorer extension; there's probably one here somewhere. I'll try this one; I remember the most popular one did not work well for me, which is why I'm trying a different one. I don't see it... oh, you know what, maybe we just click on the file and it will work. There we go. So the data is inserted into the database, perfect. I'll go back, save my files, stop the server, clear, and do go run main.go. What I'm hoping is that it loads that data. It's kind of cool, because as we slowly build this we're learning how the code works. This is a different approach from app prototyping: with prototyping tools, if you know your front ends really well you don't really want to review that code as much, whereas for backends I'd step through everything slowly. Now that we have this running on localhost:8080, let's give it a refresh; there should be no change, it should be the same data. Did this start? I think it's still compiling the file, so we have to wait a little. All right, after a short wait it is now serving; a nice hard refresh, and the data is the same. Yep,
and we see that it's serving it. Just in case we're not 100% certain it's live, I'll go over to the database and make a change, since it should be serving the data live. I'll put an exclamation mark here if it lets me... it says read-only. Is there a way to edit it? I remember that was the pain with these viewers: they don't always let you edit. Maybe I can just add another row? No. So I don't like that editor, since I can't do anything with it; I'll try this other one instead. I think they all work the same way (you click the file and get the interface), and this one looks the same, making me think it didn't actually... can we change it over here? No. Just terrible editors. So I'll go back; I just want to make sure it's working, so I'll write an additional INSERT line, this one for "ExamPro Training Inc", and copy it. I don't want to stop the server, because I want to see that it works without restarting. We'll cd into github-copilot/link-tree-clone, paste that piece of SQL into the terminal, hit enter, and that should be inserted. Give the browser a nice refresh, and now it's serving it. So there you go: we can write Go code without knowing much about Go. I'm going to add a new file called .gitignore and ignore a couple of things, the .env and links.db, as I do not feel like committing those here today. Is there more you can do with Copilot? Absolutely, but for the most part it's pretty clear how it works. So we'll call that done, and
there you go.

All right, in this video I want to take a look at Amazon Q Developer. We already built a simple API in Go using Copilot; let's see what that experience is like in Amazon Q Developer. Since we're building things that are very simple, I expect these tools to perform pretty decently on small code examples; it would probably be a better test to generate infrastructure and things like that, but it is what it is. I already have the repo cloned from earlier, so if you did the GitHub Copilot video you're at the same place as I am: I cloned the GenAI Essentials repository. I'm going to make a new folder called amazon-q-developer, and a folder inside it called link-tree-clone, because we're going to do something very similar. I wonder if this one will have an easier time, because being in the same
codebase, it might pick up on the existing code. Right now I have GitHub Copilot activated, so I'll go turn that off; we can't have both in here at the same time. I'll say "disable", and since there's more than one extension it disables the chat as well. We'll give this a refresh... I think it's disabled, and we'll reload the window anyway so there's no confusion. We actually already have Q installed, but all you'd do is search for Amazon Q and install it, so I'll just reinstall it. Nothing super exciting; I'm not sure why it's bringing up Dev Containers, and I don't think that's related. So now I have Amazon Q installed; notice the little red indicator down below, which means we normally need to go and log in. On the left-hand side we have the icon. It depends on whether you have the paid or free version: on the free version you use your Builder ID, and that's what I'm using here today. I'll hit continue, it authenticates me and tries to open the browser, so let's proceed to the browser, and here it says "confirm that this code matches the one given to you." It didn't take me straight to the Builder ID; if you don't have one, your process might look a bit different, and an AWS Builder ID is something you'll want to create first. If I already have one, it should keep logging me in. So you'd go here, create a new Builder ID, and that's how you'd work with this, but I just want to confirm the code is the same... yeah, looks the same, so we hit confirm. It's interesting they don't make you copy and paste it; that would be more secure. It looks like I do need to log into my Builder ID; I thought I was already logged in, but apparently not. It says "create a new Builder ID", but I already have one, so I'll go down below. I need to get into my password manager, so give me a moment... all right, I think I'm ready to sign in. And it didn't work, so give me a second to work through this stupid interface. There we go, a second attempt, and it looks like I'm getting access. We didn't hit any limits with GitHub Copilot, and I don't think we'll hit limits here with Amazon Q Developer either, but we'll
find out as we work with it. We'll hit sign in; I think we are signed in, yes, we are, and we'll hit acknowledge. So let's start working with this. I'll go into the folder and create main.go. I remember that if I start typing it should autocomplete, and I'm not sure why I'm not getting autocomplete here today. "Work on a task using agentic capabilities": let's explore that. Implement features or make changes across your workspace, unit test generation, documentation generation, code reviews (oh, they've done a huge update to this thing), upgrade library and language versions of your codebase. These are all cool, but I really just want to start coding a Go app, so I'll say: "I want to build a simple Go web app that is API only that is a clone of Linktree." Let's enter that and see what we get. The experience is definitely feeling a lot better than the last time I used it. We're getting some code; obviously the insertion isn't the same kind of experience as Copilot, but that's totally fine. We're getting links and a profile (I like that it's bringing in the profile), and I like that it's breaking the code down into functions. It's all a single file, and codewise it actually looks pretty good to me. I'll insert at the cursor: we have encoding/json, log, Link and Profile types, getProfile, updateProfile, addLink. It's really going all the way; it even generated the routes, and down below, some instructions for running it.
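The skeleton Q produced was along these lines. This is a sketch with assumed type and handler names, using the gorilla/mux router it chose, and in-memory data since the database comes later:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/gorilla/mux"
)

type Profile struct {
	Name string `json:"name"`
	Bio  string `json:"bio"`
}

type Link struct {
	Name string `json:"name"`
	URL  string `json:"url"`
}

// In-memory state, matching the first version before a database is wired in.
var (
	profile = Profile{Name: "Andrew", Bio: "Example bio"}
	links   []Link
)

func getProfile(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(profile)
}

func updateProfile(w http.ResponseWriter, r *http.Request) {
	if err := json.NewDecoder(r.Body).Decode(&profile); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}

func addLink(w http.ResponseWriter, r *http.Request) {
	var l Link
	if err := json.NewDecoder(r.Body).Decode(&l); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	links = append(links, l)
	w.WriteHeader(http.StatusCreated)
}

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/profile", getProfile).Methods("GET")
	r.HandleFunc("/profile", updateProfile).Methods("PUT")
	r.HandleFunc("/links", addLink).Methods("POST")
	log.Fatal(http.ListenAndServe(":8000", r))
}
```

gorilla/mux gives you method-based routing via .Methods("GET"), which is presumably why Q reached for it over the standard library's default mux.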
I want to take some of the instructions it gave. It's suggesting a couple of things: go mod init link-tree-clone, and then, to use this API, install the gorilla/mux router. Couldn't we specify that in our go.mod file? This is one way of doing it, which is totally fine. (I'm not sure why Dev Containers is running over here; I'm just going to ignore it.) So I'll cd into the correct directory. I'm not going to follow the instructions exactly, because I feel like it might lead me slightly astray; we'll follow them, just a little differently. I'll cd into the link-tree directory, and... actually, I don't know, because if we ran its setup it might give us some good defaults. We don't really have much code in here, so you know what, I'm just going to follow its instructions. I'll delete this file, because maybe it'll generate default files we actually want... and I ended up deleting the README by accident, but we'll cd back a directory and run it. Let's see what it created, because we're supposed to be in that folder, but it did create the go.mod file, so I didn't have to delete anything after all. We'll go back to link-tree-clone, and
then recreate our readme.md and main.go. I think we can also drag the chat over to the other side if we want an easier time working with it; I've seen people do that. Wait, what happened to my chat history? No, no, no. Where did my chat history go? I move the panel and I lose my chat history? Please tell me it's still here... are you kidding me? Where does the chat history go, Amazon Q Developer? This is so stupid. It would be great if Amazon Q retained chat history so I could come back to it another time; someone has already complained about this, but how about the fact that just moving it from one side to the other makes it vanish? "Do you recall the code you just gave me for the Go app?" That's frustrating. "The Linktree clone. Great, give me all the code, because I don't have it in the chat history." That is so stupid. I mean, we
can get it back, so it's not that bad, but just moving the panel shouldn't do that. At least the code looks good, assuming it's not hallucinating. Did that insert? Maybe not, because it's still generating; it was having a bit of a hard time, so I'll wait. OK, that's inserted. It's going to use gorilla/mux, which gives it more advanced routing; Gin is a lot smaller, but that's totally fine. I do like the structure of all this: we have profile and links endpoints, and we have a create endpoint right here, so that's good as well. We'll have to cd into that directory again, so: link-tree-clone, then go mod init so we get the initial file, and then installing the router just adds to it. To be honest, that's probably the better way to add it; from what I recall of these Go apps, it's very finicky to edit the go.mod file by hand, so this is a lot nicer. "You can test endpoints using curl": oh cool, it gave us curl examples. I'll put those in the README under "curl examples" and fix the heading levels, though I probably don't need to do all that. Again, I'm not sure why I'm not getting autocompletion; normally it suggests things as I type, but to be honest I actually don't like autocomplete, so if it's turned off by default, that
is an improvement to me. So we have some examples; let's try to run main.go, since we're now starting to get some experience with this. There's a complaint at the top. We have package main, so that's correct; is this actually in the link-tree directory? It is. Run it again. What's the problem? expected 'package', found... at line one. Let's take out the first line; I'm not sure why it's there, maybe it's just a comment, but the first line has to be the package declaration or Go will complain. And here we have port 8000 instead of port 8080, which doesn't really matter. I'll bring the server up... we have a 404, which is fine, and we'll look at /links. We don't get anything there, but honestly this API should really be accessed via curl, so if we can't hit it directly in the browser, that's totally fine. Let's look at our README examples; if we can curl it, that should be sufficient. Now, to be fair, we actually don't have any data, so if nothing was returned that'd be expected. Here we're adding a link, here we're updating the profile, and here we're getting the profile; I guess there's no "list links", which actually kind of makes sense. We'll copy this curl; we need another tab, so I'll stop the old one from earlier. It doesn't really matter, I just want to be in the same
directory, though it doesn't actually matter for curl right now. So I hit this curl and we do get data back. That data is hardcoded right now; if we go back over to main.go, you can see it hardcoded down here. Now, can we right-click? We can: we get an Amazon Q context menu, just like GitHub Copilot. I like that we can generate tests; that's really important, though not something I'm going to do this moment. One nice thing about Go being so statically typed is that you don't have to write as much test code anyway. So we have a profile, and I really want this profile to come from a database. I'll right-click, inline chat: "this code should pull from a SQLite 3 database." Let's see how it handles that. OK, it's calling it profile.db, which is totally fine, but that is a lot of code, a boatload of code; I wonder if it really needs that much, though it is doing quite a bit. We have the database setup... this is also sitting in main, and I'm not sure I like that, because it really should be fetching the profile inside the handler over here; this code is just loose in main now. That's not to say the code is bad, I'm just saying this is what we got. Is this accepted? It's still highlighted green; how do we accept it? It doesn't say... oh, here it is: accept, enter. Sure, we'll take it, but this code isn't going to work where it is, so I'm going to cut it out of main and bring it into the handler, because that's where we're actually looking to get our data. We're really just copy-pasting whatever right now; we're on two spaces, but the linter is forcing four, whatever.
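Once moved into the handler, the database half looked roughly like this. A sketch, assuming the mattn/go-sqlite3 driver and a one-row profile table, not Q's exact code:

```go
package main

import (
	"database/sql"

	// Blank import registers the "sqlite3" driver with database/sql.
	_ "github.com/mattn/go-sqlite3"
)

type Profile struct {
	Name string `json:"name"`
	Bio  string `json:"bio"`
}

// getProfileFromDB opens profile.db and reads the single profile row.
// Opening per call is wasteful, but it mirrors where the demo is at this point.
func getProfileFromDB() (Profile, error) {
	var p Profile
	db, err := sql.Open("sqlite3", "./profile.db")
	if err != nil {
		return p, err
	}
	defer db.Close()

	// QueryRow is enough here since we only ever store one profile.
	err = db.QueryRow("SELECT name, bio FROM profile LIMIT 1").
		Scan(&p.Name, &p.Bio)
	return p, err
}
```

The blank import is the standard idiom for registering a driver with database/sql without using the package directly.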
So now the code expects a profile.db to exist. Next I'd say: "for our profile.db database, can you write me code to create the SQLite 3 table and seed data." Let's see... no, I don't want all this, this is too much; it's writing Go code to verify the data and everything. "No. Just give me a simple command I can paste into the terminal." That's way too much; it's trying to write code, and that makes sense, but I want command-line code. There we go, that's a lot better. I'll copy this and put it in our README under "create and seed SQLite database." It's a bit of a mess: "can you format the last line so it's multi-line?" Thank you, that's a lot easier to read. So far it's doing much more comprehensive work, which is good. I'll paste that in and mark the block as sh, which is more what I'd like to have. The links it generated aren't that great, but that's fine; it's just producing nonsense seed links.
I'll paste that down below and hit enter, and now, if we refresh, we should have a profile.db in here. If I click it, it's not opening; I'm surprised, because we had a SQLite viewer installed earlier. Let me check whether it's installed... it is, so I'm not sure why it's not picking it up. I'll install the other viewer, go back to the file, and yeah, now I can see it: we have links and then we have our profile, so that's pretty good so far. Back in main.go, the database should give us some data, and the handler should be able to serve it up. The only thing I'm wondering about is this code we dropped in: should we really be opening the database on every request? It probably won't matter at the small scale of our project, but let's see if this code works; it is a bit more complex. Down in the other terminal where we run the app, I'll try to start it again... undefined: sql. To be fair, we didn't update the imports to take the new code into account, so let's go up here. There was something called goimports before that it was trying to install; let's go find that. Yeah, I don't
know what it was. "go import..." What's this? No, that's just sort; I don't care about that, I just wanted goimports. I guess that's probably a pain point for folks. So: "what command do I run to install the Go libraries for SQLite 3?" Hopefully it understands the context of what we're talking about. Is that the same library we're using? I'm not sure; let's go look. Actually, we don't have anything imported for SQL right now; maybe I needed to take more of the generated code and didn't. This one is using database/sql and go-sqlite3, plus log; interesting how they're listed, and I think that's because we generated it inline, which is why it's confusing. Anyway, I'll paste the imports in as-is. So database/sql is the generic interface, and go-sqlite3 is the driver; remember the other app used GORM, so there are some parallels there. I don't love this whole "+incompatible" version suffix, but we'll see what happens. All right, supposedly it's installed, so let's run it and see if it works... undefined: sql, line 29, right here. But we have sql imported up top; what am I missing? There's a bunch of other output about go mod tidy and such, which I don't care about. So I'm going to
go here and ask it to correct it for us; clearly something's missing, maybe an initialization step, I'm not sure. "You need to import the sql package." Is it not right here? It is. "The key imports you're missing are database/sql..." Maybe I didn't save the file. Save, run again... it clearly is there, and this is the standard SQL interface. It says it's running, and I don't see a failure yet; we'll give it a moment. I was kind of expecting to see the request logging we saw with Gin, but the thing is, Gin has built-in logging and this one might not, so that experience might be a little different. Next we'll run the bash command again: I'll go over to our README, copy the line, and paste it in... let me try one more time; it's giving me a little trouble copying and pasting today, I'm not sure why. Hit enter, and we are getting the data back, and this is the data that comes from the profile table, so this is working as expected. We could take this further and finish it all up by implementing all the other handlers; I'm not sure I really want to, as that could be time-consuming, but we'll give it a try. Oh, "edit with Ctrl+I" finally popped up, that's cool. Ctrl+I isn't actually working as expected; it says Ctrl+
I here, but whatever, it's not really doing it. So: "change this code to work with our SQLite database." Let's see if it can do that; if that just works, that'd be awesome. Looks fine to me. OK, I'll also just do this next one, might as well go for it: "change this code to work with our SQLite database." Again, looks OK to me at a glance. The only one left is updateProfile, which is a bit more involved, but let's just do it, let's go for it: "change this code to work with our SQLite database." (Oh, if you're wondering what that audio was when you heard my phone for a second: I was watching a video about Christmas lights, where a person from Norway visits Japan and says that as a foreigner they're so impressed at how good the Christmas lights are. Totally not important.) Anyway, now that we've done that, let's stop the server and start it again; if it all just works, that'd be amazing, though generally we would check each handler as we go. The fact that it can write test code also sounds very impressive to me, and I'd be curious to see that. It's started on port 8000. I'm going to close the chat tab on the side; I know it's going to make us lose our history, but that's just what we're going to have. So I would like to add a new link: I'll add "ExamPro Training Inc" here. I'm not going to test them all, just this single one. Copy the curl, paste it down below, hit enter, and now we'll go get our profile and see if that appears in there. Do we see ExamPro? I don't think we do. Clear... I don't see it. Let's go over to our
database file. Yeah, I don't see it inserted there either. Oh, you know what, I don't think the file saved. I'll just stop the server; I don't know why, but I keep failing at the most basic thing, which is saving the file. Now we do have a real problem: result on line 101. We go to line 101 and it says "result declared and not used." Fair enough, it doesn't appear to be used; that is correct. So I'll hit "fix", and I like how I didn't even get to tell it what the problem was; how does it know what it's fixing? It says: removed the hardcoded profile links array since we're now using a database, added proper error handling instead of log.Fatal, got the auto-generated ID from the database, returned just the newly created link instead of all the links, and added a proper HTTP status code. That all sounds good, and yes, it does use result now, so I'm surprised the code wasn't right the first time.
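For anyone following along, "declared and not used" is a hard compile error in Go, and the shape of the fix is roughly this. A sketch with an assumed links table, not Q's literal output:

```go
package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3" // registers the sqlite3 driver
)

// addLink inserts a row and actually reads the sql.Result, which is the
// fix for Go's "result declared and not used" error: the compiler refuses
// to build while a declared variable is never read.
func addLink(db *sql.DB, name, url string) (int64, error) {
	result, err := db.Exec(
		"INSERT INTO links (name, url) VALUES (?, ?)", name, url)
	if err != nil {
		return 0, err
	}
	// Reading LastInsertId both uses the variable and gives us the
	// auto-generated row ID to return to the client.
	return result.LastInsertId()
}
```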
I'll hit insert at cursor, hoping it replaces the existing function. No, it did not; it just inserted it below. So I think the top one is the old one... that's hard to say now... no, I don't think it is. That looks like the old one, so let's insert at cursor again. Oh, maybe that was all right after all; I thought it was the other one. Let's hit up-arrow in the terminal. We might have to fix the other handlers too, because they're probably prone to the same issue; it almost seems like you'd have to run it and then tell it to fix each piece. It's running, so clearly there's no problem. First I'll run the profile curl, then hit up-arrow until I get back to the insert one, hit enter, and go manually check the database... I don't see it inserted, but this view might not be refreshed. There it is. So it is being inserted, excellent. I'll go back up, hit enter, and now it's inserting. Could we correct the other handlers? Probably. Do I care to? No. But is the experience better than GitHub Copilot, at least in this comparison? The code produced by Amazon Q Developer is much, much better. The only thing is that the UX, the developer experience, is not as nice; the fact that I dragged the panel over and lost my chat history was annoying, but we were able to work around it. So yeah, I'd say I'm surprised at the results here, but these things vary in quality, and we can't always expect them to work this well. I'm going to make a new file called .gitignore and just add profile.db to it, refresh, and there we go. So: surprisingly, Amazon Q Developer was OK at coding. Anyway, I will see
you in the next one. Ciao!

All right, in this video I want to see if we can give Gemini Code Assist a go. This is the competitor to GitHub Copilot and Amazon Q Developer. I'd expect this one to work better in Go, because Go came from Google; I'm pretty certain. "Who invented Go (golang)?" I could have sworn Google had something to do with it... "it was designed at Google". There you go, I'm not wrong. So let's try it. Gemini Code Assist Enterprise is available at a per-user monthly price with a 12-month commitment, and this page describes how you can use Gemini Code Assist in Google Cloud to do the following in VS Code and so on. Great, so how do I start working with it? "Before testing Gemini Code Assist's capabilities on your code, you must make sure..." I don't care about that; let's make our way over to VS Code and see what we can do. It'd be really annoying if I have to pay for it, but I wouldn't be surprised with Google. To be honest, this is the wrong repository; we've been working locally this entire time for GitHub Copilot and Amazon Q Developer, so let's go over to the right one. I'll search the extensions for "code assist". Is that what it's called? There's Google Cloud Code, and then there's Gemini Code Assist: "Gemini Code Assist + Google Cloud Code" for Visual Studio Code, which will
"bring the power of Google Cloud and Gemini to help build your apps faster." That sounds good. I just want to write code right now, I don't want to deploy anything, so I really just want this. The question is: will I have to pay for it, or does it have a free tier? That's what I want to find out, so we'll go ahead and install it. All right, I have it installed; now the question is how to use it. On the left-hand side I can see Gemini Code Assist. I should disable Amazon Q before I proceed so we don't have conflicts: search for Amazon Q, disable it for now, and reload the window so there's no confusion. And the question will be whether I can use Gemini for free or will get hit with "you've got to pay for it." "Log into Google Cloud": sure, we'll do that; Google Cloud is normally really easy to get into, so this shouldn't be too much of a problem. OK, so now I'm authorized, that sounds great. Back over in VS Code... nope, wrong repo, this one here... and now it's talking about picking a project. I don't want to connect it to a project, I simply want to code on a project, but that's totally fine: "you agree to allow Google to enable the required API", yeah, sure. Select a project: I actually have one called gen-essentials; I can't remember what we did in there, but we set it up earlier, and I'll say enable API. Of course, if you don't have one, you'd have to go create one first. "On and after January 27th, access to Gemini Code Assist will require an active subscription with an assigned license." I'm not sure if that's a trial based on the time period I'm in right now, or if that's just when they start enforcing it. If this is the future and you can't do this for free, well, sorry; just watch the video and get an idea of what this thing can do.
So now we have it open; let's see what we can start doing. I'll make a new folder called gemini-code-assist, a folder inside called link-tree-clone, and a new file called main.go; we're getting pretty good at this now. Already the experience seems better: it's showing me Ctrl+I, which I like. But if I hit Ctrl+I... that's not working. If I try Alt+I, Windows+I... none of them work. It could be that there's something wrong with my hotkeys, but for whatever reason they don't work. That's totally fine, because I'm going to drag the Gemini Code Assist chat over here instead. It says "open a file, get code suggestions as you type and hit tab." So I'll just start: "I want to build a simple Go web app that is API-driven that is a clone of Linktree." Let's see how it does, and we'll give it a moment to generate. Not the fastest, but let's just see how good the code is. We are also using the trial version, I guess, so maybe that's why. We have a bunch of output, which is fine. OK, where's the code? "This is the Go implementation that you'll use." OK, where is the code? Give me code, please. I'd definitely use Ctrl+I if it worked... maybe it's Ctrl+Shift+I? Nope,
still doesn't work, so we're not getting any code. So far, that's not good. I'll open up the command palette and search "assist"; this one says Ctrl+Enter, so let's try Ctrl+Enter and see what we get... nope. Oh, I was so excited for this to work. Well, "Code Assist: Generate Code" exists, so why doesn't Ctrl+I work? Let's go to my settings, since maybe something's overriding the hotkeys. "Why is my Ctrl+I in VS Code not working?" Maybe something is overriding it: "Toggle Keyboard Shortcuts Troubleshooting." So I'll go over here: Developer, Toggle Keyboard Shortcuts Troubleshooting... what did that do? Nothing, as far as I can tell; I'm not sure what it was supposed to do, but that's what the internet told me to do, so that wasn't very helpful. It's awkward, because if I can't do Ctrl+I, how am I going to do this? All right, so what I'm going to do... oh, here it is, it is actually outputting stuff below: "ignoring single modifier ctrl due to it being pressed with other keys... extension: Vim... ctrl+i." So maybe it's the Vim extension that's overriding it. Let me disable Vim, say disable, and reload the window; it's unfortunate if I can't use Vim with this. (I don't want to do an update just now.) We're giving this a moment for Gemini Code Assist to reappear: "open a file, get code suggestions as
you type." OK, so we'll do Ctrl+I here now, and now we got something. Do I want to generate code? Absolutely: "build a simple golang web app that is API-driven which is a Linktree clone." Let's see if it can do that. The code was fast. We have net/http, so it's deciding to do everything plain, and we did say simple, so technically it's correct: it did the simplest thing it could. It didn't use Gin, it didn't use gorilla/mux; I liked the gorilla code, but it was doing exactly what we told it to do, so that's good. Next question: I have my app code, but how do I run it? That's what I'm going to ask it. It's literally using all built-in libraries, which is really impressive as well. I'm not sure why this part is so slow; the code came back instantly, but this chat response is incredibly slow. Oh, and you can click the output to grab it as well.
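A standard-library-only version of this app really is that small. Here's a sketch of the shape Gemini produced, using only net/http and encoding/json; the routes and field names are my assumptions:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type Link struct {
	Name string `json:"name"`
	URL  string `json:"url"`
}

// links starts as a nil slice, so /links returns null until something is added.
var links []Link

func main() {
	http.HandleFunc("/links", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(links)
	})
	http.HandleFunc("/add", func(w http.ResponseWriter, r *http.Request) {
		var l Link
		if err := json.NewDecoder(r.Body).Decode(&l); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		links = append(links, l)
		w.WriteHeader(http.StatusCreated)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

One side effect worth noticing: encoding/json marshals a nil slice as null, which explains the null we see from /links in a moment.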
Now it's talking about running it in a Docker container; I suppose that's one way we could do it, and if this thing is oriented around gcloud, that kind of makes sense. For local development it says go build and then run the binary. I'd say the answer isn't great; it likes to bring in Google Cloud stuff I didn't ask for in particular, though I was pretty vague in how I described things. So I'll go over to the terminal and cd into the Gemini folder. We have main.go but not our other files, though we don't really need any other modules, so I don't think that matters. What was the command again... go run main.go, and we have to cd into link-tree-clone first. So: go run main.go, and that should start up the app. It did start, and it was super fast, because it's using all built-in libraries. We have a 404, which is totally fine; it depends on which routes it registered. If we scroll down we have /links and /add, so we go to /links and we get null back, because there's no data. If we look up at the code, we don't even have any kind of data being inserted, which is fine. I'm going to go here and just say generate.
So, generate: "the app should use SQLite 3." Hopefully that's enough. Now it's bringing in go-sqlite3 and database/sql, which is fine, and it has an initialization function to set up the database. I don't really want it to initialize in here, and the addLink handler... I don't really want that code in here either, so I don't think I'll take that code; I can decline it. Let me try again: "I want to use SQLite 3 to read data from the database for links and adding links. I don't want to initialize the database in this file." The other thing I don't like is that this tool is very single-file focused, where the other ones seemed a bit more workspace-aware. We have that goimports thing again; I can install it this time, because if it can manage the imports for me I would love that, but it complained here, so that's no good. I'll just accept the suggestion. Do I like it? I don't know; we have double imports, where normally there'd just be one. (Oh, and I don't have Vim anymore; I had to turn Vim off to use this thing. To be fair, I would have had to turn it off for the others too; they just didn't conflict.) And here it says CREATE TABLE IF NOT EXISTS: I told it not to do that, I explicitly told it not to, and it still did it anyway. But it does have the SELECT from links, and the code looks OK; I'm not sure why we're getting problems over here... oh, I see, it sourced it from somewhere and it's saying "I grabbed this code from here, just understand that's where I grabbed it from." That's fair; I've never seen one of these tools do that, and it's interesting if you're concerned about where your code is coming from.
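For what it's worth, the initialization it kept insisting on is the common pattern. A sketch, assuming the mattn/go-sqlite3 driver and our links table:

```go
package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3" // registers the sqlite3 driver
)

// initDB opens links.db and creates the table on first run. This is also
// why, later on, the table only exists after the app has been started
// at least once.
func initDB() (*sql.DB, error) {
	db, err := sql.Open("sqlite3", "./links.db")
	if err != nil {
		return nil, err
	}
	_, err = db.Exec(`CREATE TABLE IF NOT EXISTS links (
		id   INTEGER PRIMARY KEY AUTOINCREMENT,
		name TEXT NOT NULL,
		url  TEXT NOT NULL
	)`)
	return db, err
}
```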
The only problem is we don't have a database yet, so I'll ask: "can I get a command that will create and seed the links.db data? Can it be any bash command?" Let's see what we get... no, that's not what I asked; I need some seed data. The thing is, this tool might be more optimized for Google Cloud than for writing code, which isn't really what we're doing right now, so this may not be an entirely fair test. We do get a file out of it, so I'll call it links.sql (that was kind of interesting) and insert the SQL there. Then I'll bring the terminal up for a second and go to my other tab... oh, I'm running both; this one is the link-tree app, this one is Amazon Q Developer. I'll cd into the correct directory, gemini-code-assist, and paste it in. I didn't call the file seed-data, I called it links, so I'll change that in the command. Enter... "no such file or directory". Are you sure? Oh, you know what, we're not in the right directory; that's why. Now: "no such table: links". I was hoping it would create the table; it didn't. It would create the table if we ran the app first, so I'm going to stop and then run the app, and that way it'll create the thing we need. It doesn't have the go-sqlite3 module set up here either, so I pasted in a request for the line to install it, as we have done that before
in another video. This is the line I wanted, and I like how it has a terminal button: I hit run and it pasted and ran the command in place, which is nice, because copy-pasting can be annoying, especially on Windows. So it should install the driver and set up go.mod... but go get is really hanging, and I'm not sure why, because before we had initialized the module, right? I think the reason it's hanging is that it needs the module initialized first. "I don't have a go.mod file; how do I initialize one?" I'll stop this; the install line is fine, but it's not going to work without a go.mod file. Here it is: go mod init. Run that, then hit up-arrow, and notice the install worked instantly; it was definitely missing that. Let's go back and try to build now. Is it running over here? No, so we'll run it here: "unexpected type name, line 69". We go to line 69 and take a look at the problem; we can already see there's a highlighting issue right there, and it looks like some code is missing and some code is duplicated. Do we have addLink twice? Yeah, we have addLinkHandler twice, and this looks like the old code. I'll go down and delete it... maybe I deleted out some code that's necessary; we have addLink and getLinkHandler, and we need this one, right? OK, I just undid my edit and I don't have the duplicate anymore. I'll hit up-arrow: "redeclared in this block". So things are still being declared twice: addLinkHandler, and getLinksHandler, and another one right here. These duplicates look a lot smaller, so I'm going to take those out. What a mess. We'll try this again: port 8080. Sure, we'll open
the browser we have over here let's try links the I'm doing this so that creates the table it never created the table so could I get this working absolutely do I want to get this working no this is kind of a frustrating experience at this time I would probably not recommend at least for coding wise um Google Google um Google's or Gemini's code assist it could be really good at generating out um Google Cloud code infrastructure that's not what we're testing it for here um so I might be misrepresenting it here because it's not necessarily
performant on coding task but this is a bit of a headache to get working and I don't even want to try it any further um so just say Gemini code assist and we'll call that good enough for now and if nobody likes what I did there well whatever I'm doing the best I can here I don't want to waste too much time on this if we can't get it working in a state that we want but there you [Music] go hey this is Angie Brown we are going to be taking a look at codium wind
editor. Now, Codeium is an extension that works just like GitHub Copilot or Amazon Q Developer, but apparently they now have a full-blown editor. You could still use the Codeium extension, but I'd rather go for Windsurf today, so let's download it, give it a go, and see if we can build our little Golang app in their interface and get an idea of how the experience differs. I guess Codeium is moving into the space of what Cursor does. I really like the idea of how Codeium has built their infrastructure — they have a really good video online about how they implemented it — but let's get this installed, launch it, and see what the experience is like. There are a lot of coding tools out there, and we'll go through as many as we can without overdoing it. I'll leave the installer options alone and hit next; be back in just a moment while it installs. There we go — looks like Windsurf is installed, so we'll hit enter. I've never used the Windsurf interface; I have used Codeium in the past and I liked its code generation, but it's been at least six or seven months since I last used it. I'm just allowing access to Windsurf on another screen — I'm not sure why it popped up on another screen, as that's not my main one, and I can't even move it. This is so frustrating: I have to hit "get started" and I can't move the window — come on, let me move it. How am I going to show this off if it's on the wrong screen? One second; I'll rearrange my screens a little so you can at least see what's going on. It's now over here — you can see the spreadsheet I used to get things started. I click "get started" and it offers import from VS Code, import from Cursor, or start from scratch. Actually, Vim is my default key binding, so I'd like that. Dark seems nice — let's go for Tokyo Night, just because it has a bit of a Japanese vibe. We need to log in; I should have a Codeium account already. It keeps redirecting to the other screen, but it looks like I'm still logged into Codeium from a long time ago — it says "hi Andrew, let's surf," and now I can move the window. I don't know why they did it that way. I'll move back to our original screen — and I mean, it looks kind of like Cursor, that's for certain. Let's open a folder: I want to open that sites directory, so I'm looking for it off screen... what did we call that folder? GenAI Essentials, there it is. We'll select that folder, it opens up, and I'll hit trust. I'm not a really big fan of these all-in-one interfaces like Cursor
— or Windsurf; I don't know, we'll find out. It just feels like you're deep into their ecosystem, but maybe the experience is really nice with Cascade. "With Cascade, kick off a new project or make changes across your entire codebase" — oh, that sounds fun. I'll make a new folder here called windsurf — we're using Codeium, but we'll call it windsurf — and inside that another folder called link-tree-clone. Is that a file or a folder? I can't tell. I'll make a new file called main.go, which is something we've been doing constantly, and we'll install the Go extension down below. I do like that the interface is very compact, and we have Cascade up here — I'm not sure why it's called Cascade, but that's fine; it must be the name of their chat, I suppose. So I'm going to say: I want to build a simple web app that is API driven, written in Golang. I wonder if we can bump up the font — this is a little small. There's a write mode and a chat mode: when write mode is on, Cascade can make changes to your code. We imported an entire folder, though, so I wonder if it will get confused by that. The other thing I noticed is that we can change models — there's a Cascade Base model, and Cascade Base doesn't use premium user prompt or premium flow credits. I could have sworn this is what Codeium was engineered for, but anyway, we'll see. "I want to build a simple web app that is API driven, written in Golang, to clone a Linktree-style app." So I guess it's just using Claude underneath — but I really thought Codeium had their own models, so now I'm confused. Their site says "the AI that's in your IDE of choice, Forge, in-browser grounded chat" — okay, then
what's Cascade? Now I'm confused; let's take a look at the pricing. Well, it's "free forever" on Cascade credits — I don't know, now I really don't know, but that's okay. Let's go back and see what we have. It looks like it's working in our folder, which is good: "I'll help you create a Linktree clone with a Go API backend and a simple front end; let's break down the steps." It sees we have a main.go file — a Go backend with REST API endpoints, a simple front end. I like that it's going to produce a front end; that's kind of cool — this is so far pretty cool. It suggested a terminal command; I don't want to reject it, but I don't want to run that command yet either, so we'll just accept all and look at what it generated. We have Gorilla — which, as we've learned, is there for the routing — and CORS handling, obviously because they built the front end for us, which really surprised me. We have our struct, which I think comes from Gorilla or whatever their ORM-ish solution is, some filler data, which is fine, and handlers: get profile, update profile, add link, plus some links in there, and those look pretty good. We have the CORS options handling, which is really, really good — I really like this code. Let's look at the static files; I'm curious what we have. A very simple application in plain JavaScript — wow, they're just hitting hidden home runs here. Now you might say, "Andrew, wouldn't you prefer React or whatever?" I guess so, but the fact is I kind of like this: when I start a project I like to use the simplest kind of JavaScript and HTML and keep everything minimal, so this is a really good balance of using a higher-level abstraction while keeping the code minimal. Of course, this is just Claude Sonnet underneath, right? So we're really getting excited about Claude Sonnet here. But I can't really tell where Windsurf starts and where Windsurf ends — that's not clear to me.
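To make the shape of what it generated concrete, here's a minimal sketch of that kind of app — Gorilla Mux routes, permissive CORS, JSON handlers, and a static front end. The route paths, struct fields, and in-memory data are my assumptions, not Windsurf's exact output:

```go
// Sketch of a minimal Linktree-style API: Gorilla Mux router, JSON
// endpoints, permissive CORS for local dev, and static files for the UI.
package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/gorilla/mux"
)

type Link struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// filler data, like the generated version used before we asked for SQLite;
// a plain slice is not safe for concurrent writers, but fine for a sketch
var links = []Link{{Title: "GitHub", URL: "https://github.com/example"}}

func getLinks(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(links)
}

func addLink(w http.ResponseWriter, r *http.Request) {
	var l Link
	if err := json.NewDecoder(r.Body).Decode(&l); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	links = append(links, l)
	w.WriteHeader(http.StatusCreated)
}

func cors(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*") // wide open: dev only
		next.ServeHTTP(w, r)
	})
}

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/api/links", getLinks).Methods("GET")
	r.HandleFunc("/api/links", addLink).Methods("POST")
	// everything else falls through to the static front end
	r.PathPrefix("/").Handler(http.FileServer(http.Dir("./static")))
	log.Fatal(http.ListenAndServe(":8080", cors(r)))
}
```

Windsurf's version had more handlers (the profile endpoints and so on), but this is the basic skeleton.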
So maybe that's something we should read a little about before we proceed, because I just don't understand what Windsurf is doing that's so special compared to everything else. "Cascade: a collaborative, agentic chat experience" — so these are probably its features. Cascade must just mean its flow of how it actually works, and it's only available in the Windsurf editor. "Agentic capabilities unlock a new level of collaboration between AI and human." What can it do? Full contextual awareness, suggest and run commands, pick up where you last left off, multi-file editing — and the experience is really nice. So I guess it's essentially an enhanced chat, is what they're trying to say. On the right-hand side: updates the schema, type sanitation, renames variables, automatically binds handlers — pretty standard stuff. Then there's Codeium autocomplete. The only challenge for me is that since I don't know Golang that well — not writing it day in and day out — I never really know what to write, so I'm very reliant on the tool. Let's go back: toggle Cascade with Ctrl+L — does that work? It does, no conflicts with my Vim commands, and we're still in main.go. So the one thing I'll say here is: I want to use SQLite — SQLite3 — as the database, and let's see if we can do that. I can't remember what else it said it had besides Claude and GPT-4o; that's cool, because those are basically the two main ones I use. So if, instead of paying for GPT-4o and Claude Sonnet, you pay for this, it seems like you get the basics of everything — and it's really good at putting things where they should go. This is super, super nice. Over here in my plan I can actually see my credit use: using a premium model, Cascade costs one prompt credit per use, so I have 50 left. If I go to pricing, there are 500 premium user prompt credits, then 1,500 premium credits on the next tier. That's fine, but the thing is, if I were paying $20 a month for ChatGPT, I'd basically never run out of GPT-4o — I don't think I would; maybe I would, I don't know. Still, the fact that it puts everything in the correct place in such an easy way is really nice.
Taking a look here, nothing's changed, so I'm not sure why it did that. That's been replaced, which is good; it's actually using linktree.db, which is a much better name. It's creating the tables — not exactly where I wanted that to happen, but that's fine — and this is definitely way easier than working inside Claude directly. I don't want the go mod tidy suggestion, though, because again I'm going to run this in WSL; maybe it doesn't know that I'm using WSL. So I'll go over to Ubuntu — does it not have that extension installed? Sure, we'll install it, even though I'm pretty sure it already has it. Now what I'd like to do is cd into windsurf/link-tree-clone. So we have code here, and the one thing I'd say we don't have is — well, actually no, it does set up the tables initially. So I'll just ask: in the readme file, can you create a readme with a bash command to seed the database? Hopefully it creates one that seeds the database. I'm very curious what this would be like if I switched over to GPT-4o, but Anthropic's Claude is working pretty well. I also wonder — this is obviously hitting the API endpoints, so maybe they're doing some other magic underneath. We'll go over to our readme.md, and it's okay, but it keeps assuming we're on Windows, and we're not. So this is go-sqlite3 — yeah, that's fine. Why does it keep suggesting tidy? I thought tidy was Windows-specific... no, wait — it's the module tidy command: go mod tidy makes sure go.mod matches the source code in the module; it adds the missing module requirements necessary to build the current module's packages and dependencies. Oh, okay, that actually doesn't seem like a bad idea; let's run it. It's saying the module can't be found and can be installed with this — maybe we'd have to install that module another way. "Now let's install the required Go dependencies and run the server" — and here it says to do go mod tidy. Go mod tidy, go mod tidy, go mod tidy... for some reason I thought it had something to do with SQLite3 specifically; I must have been misreading something earlier. Anyway, I guess that got everything we wanted installed.
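So, spelling out what tidy actually does here (a sketch — the driver package is my assumption):

```bash
# go mod tidy scans the imports in your source, adds any missing
# require lines (e.g. github.com/mattn/go-sqlite3) to go.mod,
# removes unused ones, and updates the go.sum checksums
go mod tidy
go run main.go
```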
The other thing we need to run from our readme is this line here; I'll paste it in below, hit enter, and now we have our linktree.db. We can grab an extension to inspect it — a SQLite viewer; we'll go with this one, as it works well enough (if you find a better one, definitely tell me — it's not my favorite, but it works). We'll open linktree.db — I just want to make sure the data is there — and we have links and a profile, which is really good. We'll go back over, and to start it, it's just go run main.go. That starts the backend, but we also have a front end, which I wasn't expecting; we can test it in a moment. I'll open another Ubuntu window and see if we can get the profile information. I also kind of wonder if it's pulling in any context from our existing code — I don't think it has. Here it says it's running on port 8080... oh, it's still starting, so we'll give it a moment to get ready. All right, so that built — oh, "module requires go 1.16." Well, what version of Go did it set? This says 1.21. Okay, so we'll just change the Go version — can we just do that? go mod tidy, go run main.go... I mean, it just says it wants this very specific version. One thing I'd like to check is what versions the other projects were using. The Gemini one — which was just not good whatsoever — says 1.13, and I put 1.16 as the minimum there; GitHub Copilot, where we had moderate success, used 1.18. So there's some variation, but it seems like if we pick one of these versions we'll be in good shape — 1.16 was the minimum that it wanted. I'm assuming it's building... unless I need to delete the go.sum file. Yeah, it's not doing anything, so I'm going to delete the go.sum file — I don't even know how I know what I'm doing, but I do — and we'll do go mod tidy... there we go. Now we have a new sum file, and we'll run go run main.go and see if we have some success. One other thing I wonder: can we check what version of Go we're running — go version? That probably doesn't work the way I think it does. Anyway, we'll wait a little and see what happens. I still don't know why it's not working; I've been waiting here a while. So I'm going to do something that's a bit of a cheat and go to one we know works — maybe the Amazon Q Developer one... oh, never mind, it actually worked. It's saying the module requires 1.16 for the SQLite driver, so the issue is specifically with go-sqlite3. I'll go back to our go.mod — it doesn't like this version of the SQLite package. I suppose we could just ask it to fix it.
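All the relevant knobs live in go.mod — roughly this shape (module path and versions are illustrative, not what Windsurf actually pinned):

```go
// go.mod — the "go" directive must be at least as new as the minimum
// demanded by your dependencies; go-sqlite3 was demanding 1.16 here
module example.com/linktree

go 1.21

require (
	github.com/gorilla/mux v1.8.1
	github.com/mattn/go-sqlite3 v1.14.22
)
```

When the directive and the dependencies disagree, you get exactly the "module requires go 1.16" style of error we kept hitting.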
I just don't want to waste our completions — but I guess we have tons of them, so I'll go ahead and do that. Let's give it a try; we have to accept this first, though. Where did it insert it? Not where we want it, that's for sure, so I'll clear that out — I think it didn't understand what I meant by "insert," but that's fine. go mod tidy, go run main.go — we'll give it a moment; it seems like this one's actually working, so we'll wait a bit. All right — oh, it says the port is already in use, so I must have one running from before that I need to stop. There we go; we'll go back, hit up-arrow, and try to start this again. There we go — now it's serving on port 8080. It doesn't give me a little popup, but that's okay; we can make our way over to our own browser and go to localhost:8080. Oh — we have an interface? How do we have an interface? You know, we didn't really review the code, but I bet it's serving up the HTML. Let me go take a look — I would have thought, since we told it to be API-only... maybe they have a single endpoint for it, so it's mostly API-driven. We have the main API: /api/profile, /api/profile/links, a path prefix, port 8080, add link — but what's serving the front end up? Maybe it just knows to serve it because it's using Gorilla Mux... yeah, we didn't specify, so it must just know. Oh, here it is: serve static files at this location. They added it here, and that's how it knows — and that's how we have the interface. That's really, really cool. Now here's a question: could I actually update any of this data? We don't have that in the interface, but if we look, we have /api/links and add link, so let's go to our readme and see if we have any examples. Okay — add a link; this one would be for Twitter, and we don't have Twitter up above, so hopefully it just works. I'll go over to our other tab, paste it in down below, hit enter, go back over here, refresh — and we now have Twitter. That's pretty sweet; it just works.
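That readme example was essentially a curl POST against the links endpoint — something like this, assuming the route and JSON shape from the sketch earlier:

```bash
# add a new link via the API, then refresh the front end to see it
curl -X POST http://localhost:8080/api/links \
  -H "Content-Type: application/json" \
  -d '{"title": "Twitter", "url": "https://twitter.com/example"}'
```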
Okay, so that's awesome — and I guess that's basically Windsurf. I just don't feel like we used it to its full potential; it did such a good job working with Cascade in so few exchanges that I felt we didn't really get to fully experience it. If we right-click here, I'm curious whether it has the other commands — we have "chat with Cascade" and "generate unit test for function," but that's specific to Go. So I don't feel like I fully used Windsurf, but this experience was really, really good. Maybe if we were coding something like Ruby or Python, writing the code ourselves with more direct control over it, we'd get an even better picture — but this experience, at least in this limited run, was pretty good for me. So we'll go ahead and commit this. Yeah, I think Windsurf is my favorite so far, but we'll see — maybe Cursor is going to change my mind. It doesn't have access to that repo, which is totally fine; I'm just going to cheat and go over here, because these are shared repos, and add it over... whoops, not here — I'll go to this one, because it's the same source code, and it should show up. Oh, it's right there. We'll sync the changes from over here, and I will see you in the next one. Ciao.

All right, so in this video I want to take a look at Cursor. Cursor is very, very popular, but in my experience I have not liked it as much as other folks have. We'll give it another try, though — maybe it's better than my last experience. What I found back then was that it would get in the way while I was typing; however, since we're working with Golang, I don't feel like we're using the inline completion feature as much, so I might have made a mistake coding in Golang as opposed to something like Ruby or Python. But I'm going to keep sticking with Golang, and we'll evaluate this tool as best we can and see what it can produce. Cursor plugs into other LLMs, so if we're using Sonnet I'd expect very similar results to Codeium — and I really did like the Codeium experience — but we'll see with Cursor in just a moment. I'm going to install the latest version; I already have Cursor installed, but I'm not sure which version, so let's make sure we have the latest set up, and we'll get to it. All right, so we have Cursor; it sees we have a Windows system with WSL, so it's getting the WSL plugin so it can work with it.
And I mean, this experience looks pretty much the same — it's even opening up code I had run here before — but I'm going to open the folder we've been working in. It's called GenAI Essentials... if I can find it... here it is, so we'll select that folder. As per usual we'll continue the pattern: this will be a cursor-ai folder, and within that a link-tree-clone folder with our main.go. I remember when I first used Cursor it took me through a whole tutorial of all the commands; I don't remember what they are, so I'm just going to stumble my way through. We'll ignore that — that seemed to happen in the other editor as well — open up our terminal, and make sure we're over in WSL so we're not confused. So the question is: how do we bring up the chat? We'll go down to the bottom left corner... and right away they're not making it easy — if you didn't get the tutorial at the beginning, you wouldn't remember how to do it, and somewhere in here you'd think they'd show the options. So we'll look up Cursor's hotkeys, or maybe just go to the docs really quickly. General, accounts, usage... what are the hotkeys? Ctrl+K is one of them, but I want chat. "Overview: Cursor chat lets you ask questions and solve problems — Ctrl+L." So we have Ctrl+K and Ctrl+L; we'll do Ctrl+L — I'm not sure why it's Ctrl+L, but that's fine. And notice we can change to whatever model we want, so if we wanted Claude Sonnet we could; right now we're set to cursor-small. To be fair, if we really wanted to compare this, we should use the same model Codeium used, so I'd imagine we just choose Claude Sonnet here. You can already see way more options — tons and tons of options. I just don't like how the font is so darn small; they assume everyone's eyes are great. Yes, when I was 19 I wanted everything small like that, but folks, pick some decent sizes by default. Okay: "I want to build a simple Golang web app that is API driven, which is a clone of Linktree." We'll let it go off and do some cool things, and zoom out a little while it's going. I'm not sure what Composer is — what's the difference? While that's running, let's go take a look at what Composer could be. Does it say anywhere here?
Here it is: "Composer is your AI coding system that lives in your editor; it helps you explore code, write new features, and modify existing code." So this might be more comparable to Windsurf's — what did they call their thing — Cascade. That's fine, but I'm going to take my prompt and actually try again, because I want to give Cursor its best chance to impress me. We'll go over to Composer and... nope, that's not what I wanted; we'll go back up — come on, copy, copy, copy — there we go, and we'll stick with Claude Sonnet and submit again. There are also different kinds of agents: normal and — oh, agentic. I'm not sure what the difference is, so we'll do agentic, and now I'm hoping this will do something similar to Cascade. Oh — okay, it's really going through the agentic process; that makes sense, kind of like having v0 in place. So we have one file here and it's still going, so let's let it keep going. Okay, this is more akin to what we're looking for — I'll just accept that. It's bringing in Gorilla Mux as well; it has a kind of similar structure — there's a link store (don't love the name, but that's fine) — it responds with headers, and it looks more database-driven, that's for sure; it's not staging any pretend data, and we have more API endpoints. So maybe some of this agentic stuff helps. Now I'll go over here and say: okay, I want to use SQLite3 as the database; also, can you create a readme file with a bash command to seed the database? While it's doing that, I wonder — if I right-click here... no. I really like the context menu the other editors have, where you right-click and choose options; there are things I like in Amazon Q Developer, but I also just like that this is happening and working correctly. Last time I used Cursor it didn't have this Composer mode, and it kind of dumped things in places I didn't want — I think it was the live inline coding that was really annoying for me. Here we can review it quickly, which is cool: database/sql with SQLite looks fine... okay, that looks fine; sure, we'll accept the changes, and I guess I'll accept the readme as well — that seems okay. We do have some steps to perform, because it didn't create that one file we should have — it's go mod init; I can never remember: "how do I initialize my go.mod file?" Oh — it just created it for me, so that's good; I guess I'll accept that.
It also pinned some versions — it probably would have been better to run the installs, because we'd get much newer versions, but that looks fine. Let's type go run main.go... oh, we're not in the correct directory, so I'll cd into cursor-ai/link-tree-clone and run go run main.go, and we'll see if it starts up — back in just a moment. All right: "module requires go 1.16" — this is an issue we've seen before. Let's just have it corrected for us... oh my goodness, I hate when my keyboard does that; my keys stick and I'm not sure what it's doing now — it just accepted what it already had. Okay, cancel this, cancel this, I don't want that. I already accepted the file, so I'm not sure — this is where, again, I don't like Cursor's interface: it's capturing these commands. Let me undo — all I wanted was to copy this part and paste it in — and hit reject; I'm hoping that didn't undo those files. Okay, a problem with the Go version — we'll accept that; it just changed the version. I'm not sure if it changed the go-sqlite3 version as well, but I guess we'll find out. Is it still generating? Okay, accept — and I guess it's running stuff for me. Okay, great — can I now do the stuff I want to do? There we go. Let's see if it builds this time; if that's the issue, then it'll be the go-sqlite3 version we have to tweak. We'll give it a moment to do whatever it needs to do. All right, that didn't exactly fix our problem, so I guess I'll try again — same problem. Now we're just wasting our credits trying to figure this out, but it's tried changing the version again; I'm not sure whether just going down a version will help, but we'll see. I'm waiting to see if there's anything else I have to do — maybe go mod tidy... go mod tidy, then go run main.go — and we're getting into better shape now. However, I'm running something on port 8080 — probably Windsurf from before — so I'll close out Windsurf, go back, hit up-arrow, and we'll attempt to run this again. It's now on 8080, so I'm going to go over here and take a look at 8080.
It'd be fun to have a little front end — just because this is Cursor and I want to compare it even more closely to Windsurf: "can you create a front end for this app?" I think that would just be nice to have. Okay, it looks like it's doing something similar; I just don't know why Codeium managed it in one go while here I have to ask for it — maybe if I ran these experiments a few more times we'd find out for certain. I'm going to accept that stuff, assuming it was... I guess it wasn't done; it's interesting that I can accept it even though it's not finished, but we'll wait a little bit. As you can probably tell, I'm not a fan of Cursor — it's fine, it's doing what it's supposed to do, but oh my goodness, this just goes and goes. How much stuff are you making here? Oh — it's making an admin panel. Okay, I should not complain; maybe we'll get something cool. Still going — this one's a hard worker — so I'm going to pause until it's done. All right, I think it's done. We'll just accept the changes, and now we have a front end. I'm going to stop the app and start it again — hopefully it still works, though we didn't really test it at full capacity. I'm also going to seed it with data, assuming it gave us a seed example — I think I asked it to do that for us, and yep, we have an example here. So I'll copy it, paste it in, hit enter — "no such table: links." I was hoping it would make the table... maybe the code creates the table for us; actually, we do have a table, it's right here. You know what the problem is? It's me — I'm in the wrong place. When we first started the app, it probably created the table and seeded it there, so go ahead and hit enter again: now that data should be in there. I'm not going to check; I'll assume it worked.
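That "table appears only after the first run" behavior is the pattern these generators tend to produce — schema creation at startup. A sketch of what that typically looks like (file name and schema are my assumptions):

```go
// Sketch: open (or create) the SQLite file and ensure the schema exists
// before the server starts — which is why seeding fails until the app
// has run at least once.
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3" // registers the "sqlite3" driver
)

func openDB() *sql.DB {
	db, err := sql.Open("sqlite3", "./linktree.db")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS links (
		id    INTEGER PRIMARY KEY AUTOINCREMENT,
		title TEXT NOT NULL,
		url   TEXT NOT NULL
	)`); err != nil {
		log.Fatal(err)
	}
	return db
}

func main() {
	db := openDB()
	defer db.Close()
	log.Println("schema ready")
}
```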
I'm going to save all files — it looks like some files aren't saved... no, maybe they are; I thought these circles meant unsaved changes, but I might be misreading what I'm looking at. Okay, now let's go take a look at the app and see if it works. That looks really good — that's actually pretty good. Let's go to the admin panel and see what we can do: we'll say "Exam Pro Training Inc", https://www.exampro.co, and add the link — I mean, it added the link, so that was pretty good. We go back over here, we have our clone, and it works. It's fine — I guess it did what it said it would do. I still think I could probably get this to work with Codeium too, and I just don't like Cursor's UI and the waiting around, but the agentic coding was pretty good. Is this better than something like, say, Lovable or Vercel's v0? I don't know — v0 seems really focused on Next.js, so if that's what you like to code in, that's probably what you'd use; Lovable is more tuned for Supabase, and I haven't really tried generating backends with it. But tools like Cursor and Codeium definitely seem to handle backends and front ends pretty well. We didn't really give Amazon Q Developer a chance to make a front end, so it might have done an okay job. You can see there's variance across all these tools, so I'm not sure what to say — but if you really want, you can install all of them and use all the free compute until you get something working, which is interesting. We'll call this video done, and I'll see you in the next one.

All right, let's take a look at Sourcegraph Cody. I didn't even know about this one, but my friend Kirk on Twitter was like, "hey, did you not check this out?" — so I figured, well, let's give
it a go. I don't know if it's open source, but it seems like another one we can utilize. It's interesting — there's a virtual code AI summit, which is a bit silly. Let's connect with Google and see what this thing is all about; there's a different tool every other day, but I'm interested to see what we get. So here we can install it into VS Code — that's how I want to use it — upgrade, unlimited, etc. Let's install it into VS Code. Over here we have Cody; we'll install it and close this out — and since we still had Gemini: goodbye, Gemini, you are the worst of the worst; I'm so sorry. "AI that uses your codebase as context: Cody is an AI coding assistant that helps you understand and write code faster; you can choose between GPT-4 and Claude Sonnet." Sounds a lot like the other two. Oh, look at that interface over there — that looks really exciting, so we're going to find out pretty quickly. We'll sign in — I use Google, so I'll make sure I connect with Google so my life is super easy — authorize the VS Code extension, go back to VS Code, hit open again, and I believe we should now be enabled. There we go, and it's showing GenAI Essentials. Technically we really want it scoped to a specific repo, but hey, if it picks up on all the other stuff, good for it. We have document code, explain code, find code smells, generate unit tests — this looks a lot more similar to Amazon Q Developer in terms of commands. I really want to be specific about what context I add, but I'm not sure how to change it, so I guess I can't easily. Can I drag this window? Let's move it to the right — I'm just getting used to things being on the right. Get out of here, Gemini — I thought we got rid of this; let's give it a reload so we don't see Gemini anymore, and close all of these out. I'll make a new folder on the left-hand side — whoops, why is this just hanging out over here on its own? We'll delete that out of there, that's okay, and make a new one: this will be cody, and inside it a new folder — is that a folder or a file? — new folder: link-tree. I mean, if they're all using Claude Sonnet, they're all going to perform roughly the same. main.go... Cody, what's going on here? Okay, great — now it's in the context of main.go, which is great. I want to start generating some code here, so: we've got prompts, Claude Sonnet — oh, we've got a lot of options, even Mistral; I think this is the one with the most options I've seen so far, which is really nice. Obviously Claude Sonnet is very popular, so we'll go ahead here and
say: "I want to build a Linktree clone using Golang and SQLite3." I'm working a little differently here, but let's let it go ahead and do that. And so we have chat — I thought it would insert stuff or do something there; I guess it's not agentic, at least from what I'm seeing — I don't see any agentic features here. The Cody commands are new chat, edit code, document code, explain code, generate test — very similar to Amazon Q Developer. Interesting. Again, I'd like it to just insert the code at this point, since all the other ones do — well, let's just hit apply. Okay, it does do it, and we accept it; then we go to the next one and say execute. Hopefully it runs in the correct context — confirm and run — yes... no, this is not in the right place; it didn't know to go to the right directory, so we have to help it out and try again in cody/link-tree. I'm going to switch them both over so we're not getting confused. All right, we'll try this again. I will say I like the UI here; it's pretty nice. Now it should have that file in here — does it? Give it a refresh; there we go, we got a go.mod, and we'll execute that. It kind of feels like I'm doing the work too — and I mean that in a good way; like, I know what's going on here. Then we have go run, and I'm really surprised that's all we need — maybe that's all we need, right? This one isn't using Gorilla; it is making linktree.db, which is fine, and it is creating the database itself — not exactly how I'd like to do it, but that's totally fine. I don't really want to start it right away, so: "okay, can you also code a front end for me?" I wonder if it will know how to do that. Yeah — here it says index.html, but where is it going to stick it, is what I wonder. Also, look at the file tree over here: it's in the wrong location; I need this to be in this folder, so I'll move it there... well, hold on a second — yeah, this is the code it just generated, I believe it is, and it's not in the right directory, so we'll move it there and say replace. And I do want this, but where's it going to go? Okay — it's not putting things where I want them to be. There's probably a way to help it contextually know where to stick things, but right now it's set to this "current repository" and, again, I'm not sure how to change it. So it's not a per-folder kind of thing — it's pinned to that repo. I mean, as long as the code works, that's all that matters. But wouldn't it have to change some of the server code to serve the front end? How is it going to serve the static page? Oh — right here, it's already there, so maybe it already updated it.
Okay, so we have index.html. "Can I get a readme with a bash command to seed the database?" While we're getting that, let's try to start this up: go run main.go — what do you mean, "expected package"? It's right there, is it not? We go here: package main, right there. Try this again: go run main.go — now the same things are undefined. Interesting. Okay, so we'll make a new file called readme.md, and we do have a bash command in the answer, so I'll paste it in here — I guess I did say "bash command" but I didn't clearly say to put it in the readme file, so that's not what it did. Also, this file is incomplete. Going back to the other one, I tell it these things are missing, and it says there's no context to fetch — because I moved the file, now it's confused. So at this point I just feel like I'm using Claude directly — which is fine, Claude is good — but now I have to do the work of putting everything in the right place. And this is where I'm going to stop, because I could bring the pieces over here and just dump them in, but I'm starting to feel like this isn't doing what I want it to do, and now I'm doing more of the work — you know what I mean? At that point I might as well just be using Claude separately and copy-pasting in. So we're going to call this done. I'd say Cody is okay, but it wasn't really doing what I wanted. To be fair, I was working within a subfolder of a repo, so maybe it would have an easier time if it were the top-level repo — but it doesn't seem as sophisticated as something like Codeium or Cursor, whatever they're doing under the hood with agentic workflows and the like. Some of the experience was a little bit nice, but just not enough to be worth our time.

Hey, this is Andrew Brown. In this video we're going to take a look at StackPack, which is a way to generate things like infrastructure as code. I originally knew it for generating IaC — specifically Terraform — but I believe that since I last looked at it they're extending into agentic models for building out your DevOps code across various frameworks. I believe this
is a new feature, so it's not 100% done; we're just going to focus on its core functionality. Let's take a look and see what we have. I've already made an account — you can make a free account — and I guess we have to create our first flow. The UI might be a little different based on what you're doing, but I'm going to just call it "my simple flow." Notice that these are public right now, so other people might see what we're generating. On the left-hand side we can drop down and change what we want to generate: GitHub Actions, Dockerfile, Kubernetes, or Terraform, and we have these three modes — generate, modify, and ask. So if we want to generate Terraform, we can do that: I'll say "I want to deploy a Rails app to AWS using EC2" and hit enter, and that will generate some code for us while we wait a moment. This project is really interesting because George, the creator, will actually end up talking with us in the GenAI boot camp, sharing his wealth of knowledge on structured generation and agentic workflows — the approach he takes to building this stuff is really, really interesting, and that's one reason I wanted to show this product off, as we'll actually be talking to its creator. Here we can see it adds a region, instance type, key pair, security group, and IAM role, things like that; we have a security group set up here and an IAM role, so it's pretty straightforward in terms of what's going on. It's choosing Amazon Linux 2, which is fine — I might want 2023 here, but again, that's based on the knowledge it can pull at the time. And here we have some basic instructions for setting up a Ruby app, and that looks pretty much right to me. I'm not going to go through the whole shebang of deploying this, but if you're looking for a tool specifically intended for IaC, this one is really good. My difficulty has always been trying to use the generic tools to generate IaC — one hard part has always been Terraform for Google Cloud, and even Amazon Q Developer has not been consistent in the past. So this is a specialized tool for IaC, if you're looking for something there; just play around with it and get an idea of whether it might be useful
for you to look at.

Hey, this is Andrew Brown, and we're taking a look at v0, which is a product by Vercel for rapidly building out applications. I wouldn't call it an end-to-end solution, but it's pretty darn good at building things. What I want to do is keep using the same example for app prototyping: "build me a clone of Linktree — I should be able to set a tiny profile and a few links; please code me a back end." Now, I've never actually asked it for a backend — I usually use it for frontend code — but since this is a Vercel product, I'm hoping it will produce whatever their framework is, Next.js. We'll see what it produces, and hopefully I don't run out of my free tier while working through this. Okay, there it goes — it's writing. So far it's doing placeholder content, which is not exactly what I want, but I do like that I can see the structure of what I have on the left-hand side, and it is separating things out into data. It appears to be using Next.js — again, I'm not super familiar with Next.js, but I'm really liking it so far. The last time I used v0 it wasn't as detailed; it just generated individual components, but right now it's doing a good job. And we get another one here — profile — and it looks like it's hitting /api/profile; that looks good. Is it still writing at the bottom? There we go: it says this creates a basic Linktree clone with the following — display user info, things like that — and suggests adding authentication so only you can edit it, plus more styling. But the question is: does this actually use some kind of backend? I really like this preview toggle, but this is JSON it's loading, right? If we go into our routes we have a GET request — yeah, this is a Next.js server, so this is a server-side request for data. I'd say it's not persisted per se, because it's just referencing data in a data.json file; again, I'm not an expert in Next.js, but I've seen a little of it, so I'm quite comfortable with this. It's loading the profile information here, but if we were to POST information, it would stringify it and actually write to the file — so this would actually persist, well, at least here. It might; I'm not sure whether you can write to files in this environment. But let's just add an "exam" entry with the exampro URL — and it's there, so yeah, technically that does work. This is good enough, actually, to be honest.
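For the curious, a Next.js App Router endpoint that persists to a JSON file looks roughly like this — the file path and shape are my assumptions about what v0 generated, not its exact output (and note that file writes won't survive on serverless hosts with read-only or ephemeral filesystems, which is why I'm unsure the persistence holds once deployed):

```typescript
// app/api/profile/route.ts (sketch) — GET reads the JSON file, POST rewrites it
import { promises as fs } from "fs";
import { NextResponse } from "next/server";

const DATA_FILE = "data/profile.json"; // hypothetical path

export async function GET() {
  const raw = await fs.readFile(DATA_FILE, "utf-8");
  return NextResponse.json(JSON.parse(raw));
}

export async function POST(request: Request) {
  const profile = await request.json();
  // "stringify and write to the file" — this is the persistence in question
  await fs.writeFile(DATA_FILE, JSON.stringify(profile, null, 2));
  return NextResponse.json(profile);
}
```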
So now that we have this, what I'd like to do is get it into a GitHub repo; let me see if I can sync it. We go up here: "add to codebase — run this command to add this block to your codebase." That's not exactly what I want to do, but let's take a look... oh wow, so maybe what it's saying is that I could literally use this one-liner to add this entire page, this entire block, to my website — that's actually kind of interesting. But what I wanted to know was: is there a repo? (I didn't mean to fork it — that's totally fine.) All right, I really do want to deploy, and I already have a Vercel account — if you don't, just go create one; they're free, and I'm on the free tier right now. We're going to add this to Vercel and create a new project — I'll just call it vercel-link-tree — and create that. You'd have to make a Vercel account separately from this. I'll go over to Vercel and take a look — for those who don't know, the GenAI boot camp is actually running on Vercel; I just found it really easy to do that. So right now this is not connected to a repo, so I suppose this would be the way you'd do it — but is this deploying? Let's go look; I mean, it's here, but I would not call that synced. We also have a console, which is kind of interesting — but how do I put this in my repo? Oh, maybe it's over here... no. Okay: "v0: push to GitHub" — can we do that? "Install a fresh copy in a new project" — so basically the way we'd have to add it is literally what they describe: run that shadcn command. I'm going to go try that really quickly. I'll make a new repo — I can make it public, it doesn't matter — under Omen King; this will be v0-link-tree-example, and I'll leave it public for now. I'll create the repo and open it up in GitHub — I almost kind of regret not adding the dotfiles. I was hoping it would show me the buttons here, but they're also over here. I do want to open this in Gitpod, so I'll click that. If you don't have that Gitpod button, there's the Gitpod Chrome extension — Gitpod has a free tier — and if you can't do that, you'll have to use GitHub Codespaces; you'll have to figure it out, as we use a bunch of different types of developer environments. Sometimes I do things locally, sometimes Gitpod, sometimes Codespaces — I use a combination of things to keep you on your toes.
But you should be fluid — you should be able to work in any kind of environment, because you never know what you'll have to use tomorrow. Anyway, so we have the repo here, completely empty, and it's saying that to install this, we pull in this block. I'll go back over — very interesting way to add it — and we'll run this single line. I wonder whether I'd have to create the project first... it says "install a fresh copy of Next.js." Well, let's just run it and see what happens without adding Next.js first. Okay: "install the following packages?" Yes, we'll do that. I just wonder how it behaves when there's nothing here, so we'll expand this a little. After a little bit of a wait: "this directory does not contain a package.json" — yes, let's get us all set up; if you can create all this stuff for me, I'd be very happy. So we'll just let it configure a new Next.js project. If you're not familiar with Next.js, it's a full-stack framework created by Vercel — obviously very easy to deploy to Vercel — and it's a bit like Ruby on Rails in spirit, so I feel like I can make my way around it. I think it might use Express.js underneath, but I'm not 100% certain. We'll let it create everything; it says it'll take a few minutes, so I'll pause so we can speed through this. All right, so we'll choose a style — maybe New York, neutral, I don't really care. "Would you like to use CSS variables?" Sure, why not, and it'll just continue on; again, I'll pause if this takes a long time. Oh, now it's registering... now it's installing dependencies — yeah, this is going to take a little bit of time... oh, it just finished. Okay, great. "page already exists — would you like to overwrite it?" I don't know, would I? I'm trying to find where page is — I don't even see it in here... "read our docs, go to Next.js" — I mean, this looks like a very generic page, so I would probably say yes. And so now we have it here, and it's saying there's a newer major version, which is fine. Let's run this locally and see if we can get it working. Normally you'd go to your package.json; this is using Next, so we do npm run — I guess in this case it's start — and it's saying "cannot find package.json." Are you sure? It's right there — I have it open right in
front of us. But you know what — whoa, what's going on here — it made the project inside of a folder, which is a bit silly. So we'll go in there, and I'm just going to run npm install, then npm run start. It says "could not find a production build in the .next directory — try building your app with next build" before doing that. Okay, so npm run build is the first thing we'll do — I spelled it wrong at first; if you spell it wrong, it won't work — and now it's creating an optimized production build. I mean, I just wanted to start using it... oh, you know what, it's probably npm run dev. There we go — now it's started on localhost port 3000, so let's open it up.
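The three scripts I was fumbling between are the standard Next.js ones — roughly this (an assumption; check the generated package.json):

```bash
npm run dev     # next dev   -> hot-reloading dev server on :3000
npm run build   # next build -> optimized production build into .next/
npm run start   # next start -> serves the production build (requires a build first)
```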
So we'll go to Ports, click through, and see if we have a working application... give it a moment... and it already has an error: "import Profile — cannot resolve components/profile." Okay, fair enough. We'll go into our directory, and we do have a profile component here, so I'm not sure why it can't resolve it. It's probably the import in page.tsx — where is this page.tsx? Yeah, okay, it's in here, and it's interesting: the components are outside the app directory. I'm not sure if that's intentional, but they're clearly at two different levels. I wonder if we can use an @ sign in the import — let's see if that fixes our problem. With TypeScript, this has to be configured somewhere: if the alias is configured, imports default to resolving from the root, and then this would be less of an issue. So maybe the generated code just wasn't contextually aware of where it was being loaded from; we might still have some other issues, but for now that should be fine — the other imports are already using @ signs.
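That @/ alias is usually defined in the tsconfig paths — something like this in a standard Next.js setup (a sketch):

```jsonc
// tsconfig.json (excerpt) — maps "@/..." to the project root, so
// "@/components/profile" resolves no matter how deep the importing file is
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@/*": ["./*"]
    }
  }
}
```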
We'll go back over, and now our app is here. I'll say "Exam Pro", https://www.exampro.co, and hit add — so now that's been added; it's there. So that's good, but the question is: can we get this deployed? We do have it working now in this state, and we have a repo — we don't have many files in it, which is good, so node_modules must be getting ignored. The only thing is that the app is not at the top level, and I'm kind of concerned that if I don't move it to the top level, it's not going to work. So I'm going to have to grab the contents of the subfolder and move everything up a level. Hmm, that's not exactly working, so let's look up "move the contents of a folder up one level, Linux" and get a command for that. Yeah, okay, that's pretty simple — you'd think I'd know it off the top of my head, but nope. So: move my-app to... hold on, let's try this again. Another thing we could do is go into the folder and do mv . ../ — no, that's not working as I hoped. Oh — a wildcard, that's what I'm missing. Let's go back up a level, and here we'd do mv my-app/* . — okay, there we go; now we should have nothing in there. It didn't move the .next directory, though, which is kind of frustrating — maybe I can drag that one up... no, it's not really dragging, so we'll just move that individual one. I want to move everything up, and I'm not sure exactly what needs to move, so I'll just move everything out of there. We still have a few other files, so I'll just drag them over — it might be easier to do that; yes, I do want to move them — and now I'll get rid of the empty directory.
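The gotcha here is that a bare * glob doesn't match dotfiles, which is exactly why .next (and the .gitignore) stayed behind — roughly this, with my-app standing in for whatever the generated folder is called:

```bash
mv my-app/* .                          # moves visible files only; dotfiles are skipped
mv my-app/.next my-app/.gitignore .    # move the hidden ones explicitly
rmdir my-app                           # now the directory really is empty
# or, in bash: shopt -s dotglob && mv my-app/* . && rmdir my-app
```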
It didn't pick up the .gitignore either — sometimes that happens, and that's always the annoying part. But I'll just do git status here, because it can be a little hard to follow in the left-hand panel, and I just want to unstage all of our changes. You know what, it's been so long that I don't even remember how to do that anymore — how to unstage changes. Wow, I actually can't believe I can't remember that; that's crazy. I think it's git reset... I didn't find it, but I'm just going to try git reset, then git status — okay, perfect, excellent.
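For the record, the two usual ways to do this are:

```bash
git reset                    # unstage everything (working-tree changes are kept)
git restore --staged <path>  # newer, per-file equivalent
git status                   # confirm what's staged
```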
So now we have the files laid out like this, and if we add them all back we'll have less of an issue. I'll commit this as "commit project." Now that we have the project and we know it works — because if we deployed it and it didn't work, we'd still have trouble — we'll go back over to Vercel. It's created the project, but we have to link the repo, so go to GitHub here; and you have to use a personal account — absolutely have to — or it won't let you link it up. So we have v0-link-tree-example, and we'll connect that — again, this is all free tier, so this won't be much of an issue, at least I think it's free tier. And so now it's connected; let's go over to the project — we don't have any deployments here, which is fine. Let's go to deployments: I wonder if we can deploy from the interface here — maybe not — but we have a repo for it, so we've connected the repo. I'm kind of doing this in a weird order, but the idea is that we need to configure an existing repo for Vercel. So let's say I have a project that I need to add to Vercel — maybe we'll just install the Vercel CLI: npm install vercel -g. I'm just guessing here; I've done it before, but it's been a while, and worst case I'll just delete that other project and deploy it again. So maybe what we'll do — I know this is a little bit messy — is go back over to Vercel, wherever it is... I'm not sure why I can't find it here... click back, here we go, and I'm just going to delete this project, via settings, because it's probably easier to push a new project from scratch than the way we're doing it here. Okay, we'll hit continue and go back over here — it's installing the global CLI, and I'm not sure why it's hanging, but maybe it needs some time, so I'm just going to wait a little bit.
that took some time to install but it is now installed so that that is now in good shape here um so I'm just going to I think it's just versel and that will set up a new project so continue with GitHub so log into versell yeah let's do that we'll go ahead and just say GitHub here um that might be hard because I'm in a cloud developer environment so I'm just trying to think of like how can I do this can I even do this in in in git [Music] pod versel login in in Cloud
developer environment how would I do that sometimes they'll have like a flag like --no-browser or whatever so this might not be something that I can log into here so what I'm going to have to do is I'm actually going to have to use my local developer environment this is where I'm kind of running into an issue so I'm getting back to this repository here I'm going to shut down this Gitpod and I need to clone this repo now so I'll give this a nice hard refresh oh did we not push our code no no no no we pushed code for this didn't we no okay so I'm going to go back over to gitpod.io thankfully the commits are still here so it's not that big of a deal but I guess I didn't push to the repo so I'll just reopen that up again no biggie let's give that a moment to restart okay and so I mean I could have sworn we pushed this git push there we go so let's go back over to here give it a hard refresh and so now it is here so we are
in good shape I'm going to go ahead and stop this workspace oh you know what I could have done I could have just opened it locally like Gitpod has an option to open in local VS Code I could have just done that but that's okay I'm just going to clone it anyway it's not a big deal so we hit Code here and we'll copy this I'm going to make my way over to um anywhere here so I just need to clone this repo here so I probably have a tab uh we'll just cd back in here I always like cloning things into a site directory that I have and then I'm going to cd into this directory and I'm just going to open this up in VS Code okay and so make this a bit larger um okay so now that I have this open here what I want to do is I want to open up my terminal and I'm going to just get this installed so let's say npm install okay we'll give that a moment once we do npm install then we'll do npm install vercel -g but we'll
do this first okay all right so after a little wait there it was installed we'll go ahead and just make sure I install uh Vercel if it's not installed on this computer already this also did take a little bit of time so we will wait for that to install as well all right now that that's installed we'll go ahead and just type in vercel and it will want to authenticate this time this time we'll actually be able to open a link and we will authenticate with GitHub as I believe that's the way I use um Vercel is through GitHub so it says that is authenticated we're going to make our way back over to um uh VS Code wherever that is and we'll say yes go ahead and set that up we'll select those projects we will not link it to an existing project we'll let it use that name it lives in this directory so I guess we could have kept it in the my-app directory if it was an issue but it's not um yes we will work with those settings I assume that it knows what it's doing so we'll let it do
that um which setting would you like to overwrite I don't know uh enter here okay and again because this is Vercel and it's Next.js I just hope that it just easily works the first time but we did correct our issues uh when we were in Gitpod so it should be good now this will probably deploy this to I mean it says it gives us a production link but I'm not sure if it's actually going to deploy initially to production but I'm going to pause here and just wait for this very long building process to finish okay all right so this is finished um we do have a production link so let's go ahead and take a look here so I'll grab this one here if it does work that'd be great uh we have a 404 which is totally fine let's go over to Vercel we'll go back to uh the Vercel logo um I'm going to give this a nice hard refresh here and we have the link tree example here question is is it deployed in production I'm not sure I don't think so there is a commit issue
um it failed to build errors defined but never used so this is the thing with TypeScript if you don't use something then it errors out when something's not being used which is kind of annoying um so we'll go back over to here again not sure why the build takes forever but we'll go and check out um wherever that was where did it say the error was this was in routes.ts so go into routes.ts and it was saying error was never used okay so I'm just going to console.log out error so that it doesn't complain as much and maybe this will now build properly so sometimes what you'll want to do is do npm run build and just see if there's any errors here it's just this build takes freaking forever but we'll give it a moment we'll see what happens okay all right so the build seems like it found no problems otherwise it would say something here and I think we can go ahead and now just run vercel again technically I'd like to deploy this in production but we'll just do the regular vercel command for now and so we did just recently build so maybe it'll just take that build and push it and so we will wait I know it's going to build it again so we'll wait and see what happens here all right after a little wait we have no errors down here below so it makes me think that this is probably working correctly we'll go over to here and we'll give this a nice hard refresh and we still get an error I'm not sure if that's the old one or not so I'm going to
click back over to here and we'll go into link tree and we'll go to deployments and this one's ready so again I didn't say deploy it to production uh usually you do a --prod but we don't have a URL hooked up so maybe it doesn't make a difference oh no this is actually in preview because it was authenticating here so if we wanted to deploy this to production for everybody to see we would just have to change one flag um but I'm going to go ahead and just see if I can add this link and if it will persist okay oh I entered the URL uh maybe this is not a colon there we go and we'll refresh there we are and so I built my own little link tree very quickly deployed to Vercel now if we wanted again to deploy this to production all we'd have to do is go ahead and do --prod I'm not going to do that here today I'm not really worried about it but unless you do that it's going to go to this link and this is actually a preview link and people can't access that from anywhere you'd have to do --prod
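A quick sketch of the whole CLI flow we just stumbled through (project names aside, these are the standard Vercel CLI commands):

    npm install -g vercel   # install the CLI globally
    vercel login            # opens a browser to authenticate (GitHub in our case)
    vercel                  # from the project root: link + build + preview deployment
    vercel --prod           # same, but promotes the deployment to production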
let's go ahead and delete the Vercel project and now we're basically well that was the deployment that wasn't necessarily the project this is free tier so this project is not going to hurt us if we keep it around but I'm going to just go clean up because I want to and we'll go ahead copy this we'll copy this we'll delete it there we go I will see you in the next one okay ciao hey everyone it's Andrew Brown and today we're going
to be taking a look at Gradio so Gradio is another thing that can quickly uh do app prototyping it's very popular among data scientists and uh people in the data space that know Python really well but don't know how to build front-end applications because we can use minimal code uh to build out interfaces also I think there's ways of hosting Gradio like there's something called Gradio Spaces I believe I'm not 100% certain it's cool that they also have a playground of stuff that we can work with so this might be an example of where we can actually start working with Gradio to learn it so I'm just going to increase the size here and take a look yeah see over here we have Spaces I think that's Hugging Face uh Hugging Face Spaces I'm not sure let me just click this here yeah it's Hugging Face Spaces so um that's somewhere where we can deploy it I'm also just uh wondering if there's like a Spaces CLI uh we'll say Hugging Face because I've seen this CLI before for Spaces but yeah I'm not 100% certain where it is but here on
the right hand side it is loading I'm not sure why it's going so slow here but here we have an interface for it so this might be better to play around with this locally as we don't need any anything special to do that um so I'm going to use my local developer environment uh today so um I'm going to pull in I got a lot of projects open here so I'm just going to close out a few of these here so I have a little bit more focus and I'll just close this one out here
and so we have this here but I'm going to go down below I'm just going to make a new repo um actually this is something we could probably just do in Gitpod as well um I think I'll do that in Gitpod here today because we have the uh GenAI Essentials repo and I'd rather work in that one um I'm not always committing all the code into the repo but I would like to commit this one to the GenAI Essentials repo and so I'm going to go ahead and open this up in actually you know what we'll do GitHub Codespaces this time I feel like I should alternate so we'll go ahead and make a new GitHub Codespace I don't like GitHub Codespaces as much as I like Gitpod but again variation is useful I'm going to go down below and change my settings and I'm just using the free tier here so this should not cost you anything and go back down here to theme and we'll turn this into dark mode okay now Gradio code is apparently really easy to generate using VS Code um I'm not going to do that right away here because I feel that um we should just learn the basics of Gradio and um I'm just going to install one extension you don't have to install this I'm installing it for me if you don't know Vim do not install this but it makes my life easier moving things around it will remap all your keys so just don't install it this is for me I'm not sure if there's a Gradio plugin but there is it'd be interesting to see that we have a Gradio syntax highlighter um no that's not really
that useful to me but we'll go back over to here and let's start working with Gradio so the first thing is we're going to need to install Gradio so I'm going to make a new file here um yeah I'll make a new file we'll call it requirements.txt and I'm just going to drop in gradio I would expect that Python would already be pre-installed on here so I'm just typing python --version and we have a decent version here so I'm going to cd into that we're going to type in pip install -r requirements.txt and so that's going to get us set up with Gradio here and we should start working through these examples as they say it's really easy to use um so I'm going to pull this off screen and I'm just going to write it out and you write it with me as we work through this um but I think this might be a better way to learn Gradio as we go through it um again I'm not a super expert but I'm very good at coding so this should be very easy so let's go ahead and start
with our hello world and so I'm going to say hello_world.py and so the first thing we're going to need to do and sure we'll install the Python extension why not um we'll go back over to our hello_world.py and so we're going to need to import gradio that'll be the first thing that we do okay and so the second thing we'll want to do is we'll need to create an interface so let's go ahead and type in demo equals gr.Interface um and you know I mean we have these examples here but I would be kind of interested in looking at the documentation so let me see if I can find the Gradio documentation okay we'll pull this up here this might be a better way to learn so it says build and share machine learning demos and web apps using the core Gradio Python library so we have Interface Blocks ChatInterface Textbox Image Audio Dataframe so let's go take a first look at Interface so Interface is Gradio's main high-level class and allows you to create web-based GUI demos around a machine learning model um the function is to create a GUI from the desired input components and desired output components so we did this and so here we have the interface so maybe we'll just grab a bit of this code okay which we were already kind of doing here and so we have here a function an input and an output I don't like four spaces I like two spaces and even the creator of Python regrets not enforcing that so we'll just switch over to two spaces here but let's try to understand what we have so here we have the Gradio Interface and these are the options we have we have a function that can be called we have possible inputs we have possible outputs we have examples we have cache_examples cache_mode and a bunch of other flags I didn't realize how much was in here that's a lot okay so if we go up to here what do we have we have our first Interface and it's taking a function called image_classifier which shows two things here it's literally a function we have an inputs called image and an outputs called label all right so let's go run that and see what we get I feel like we wouldn't get a whole lot but we'll type in python hello_world.py here we'll see if that runs it okay so that started up on port 7860 we'll open this in the browser we'll give it a moment here to connect should not take too long and so we have an interface what's really interesting is the input is an image so I guess when we said inputs is an image then we get an image right that's really interesting and the outputs is a label which is over here
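For reference, the hello world we're running is roughly the docs example, something like this (the hard-coded confidences are what make the next part funny):

    import gradio as gr

    # a stand-in "model": it ignores the image and returns fixed confidences
    def image_classifier(img):
        return {"cat": 0.3, "dog": 0.7}

    demo = gr.Interface(fn=image_classifier, inputs="image", outputs="label")
    demo.launch()  # serves the UI on http://localhost:7860 by default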
does this actually already work and do something that'd be really interesting if it does so I'm going to see if I can find an image to upload here really quickly so I have like a green screen photo of me so I do this I say submit oh wow we actually got something and I'm a dog I'm not a cat but I'm a dog that's really interesting that it actually does something I really didn't expect that but you know what it looks like this is a hard-coded value so it's always going to be 0.7 and 0.3 right okay so here I bet if I change this to like 1.7 it almost looked like that was a functional app but it's not it's all complete nonsense right but we go here and we feed it an image which does absolutely nothing and now notice the value is 70% I thought I changed it right to 1.7 maybe we have to stop it and restart it for this to work right okay because maybe it didn't pick up that change again this is all just nonsense but we're going to go ahead and submit it so now look it's 170% right so that's really interesting another thing that's interesting here is it says uh running on your URL so you could set share equals true on launch and then it'll be publicly available I'm not sure if this would actually work with this but we could try that and see what happens if we do that so we'll go ahead and stop it because this is running in a Gitpod and so it wouldn't necessarily have it but notice that it now has this URL so Gradio is basically setting up a tunnel so you can go over here and access it somewhere else oh wow it does work so there you go that is like a super easy way that you could share this with anybody and that's one of the powers of Gradio but so far this doesn't make a whole lot of sense what we're doing so let's go back to the docs and see if we can learn a little bit more of what is going on here because we clearly have different kinds of inputs and there are different kinds of components
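The change that produced the public tunnel URL is just one argument on launch, something like:

    # share=True makes Gradio open a temporary public tunnel (a *.gradio.live URL)
    # so anyone can hit the app running on your machine
    demo.launch(share=True)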
so I mean we put a string in called image but how do we know what kinds of inputs we have that's what I want to know here so I'm kind of looking at this and saying okay um that's cool and all but how did I know that that was my inputs so here we have components here on the left hand side so I wonder if there's like an Image component there is okay and so that says create an image component that can be used to upload an image and there's an example of it so I wonder if we could have also put in a gr.Image and if that would have done the same thing it looks like it takes parentheses but let's go back over to here for a second and instead of having this as our inputs I'm going to go up here and just define a new variable called inputs let's do this instead and I think it's actually just gr for gradio as we are using the abbreviation and um obviously it can take an array of them so I'm just going to go ahead and change this to an array of inputs and let's go switch this out and see if this still works right that's what I'm kind of expecting is this will still work again we're not building anything useful right now we're just kind of learning the components of it to understand the level of configuration that we can do and so if we go back over to here it should still be the same thing right we should not get an error does it load it does excellent okay so now we kind of have an idea that we can load them in
we could obviously have just done image what if we do two here what happens now if we do two so we're going to go hit down I don't need the shared link here so I'm just going to say share equals false because that's a bit taxing on it and so I'm going to go ahead and hit up okay and so now this is running we'll go back over to here we'll get a refresh and so I'm expecting maybe two images in here so now we have two images right um of course it doesn't understand its input I think here uh the input's coming from whatever this image is so if there was some kind of information about the image that could be interesting so maybe what we could do is just say print and give me information about the input that is being passed along I believe this is short for input so I'm going to stop this again as I don't believe this hot reloads there might be an option to tell it to hot reload whenever we change the code I don't know we'll give this a hard refresh I'm going to go ahead and drag this on over I'm going to hit submit oops we'll hit submit again uh and maybe it wants both the images I'm not sure let's try this again here submit and we have an error that's totally fine let's go back over here and take a look and it could be that we don't have the right number of parameters okay so I'm guessing again I'm just guessing without really reading about this but I'm assuming that maybe we have input one and input two because we have two inputs so I would think this would take two right uh before we do that we'll have to just add the second parameter like that there and let's see if that actually works okay we'll give this a hard refresh again right now we're building nonsense but we will get to something here in just a moment okay so we'll drag in this image here and then uh the image down below here that looks good we'll go ahead and hit submit no problems now so my hunch was correct and we get back uh matrices of information so probably there's a representation of what that data is
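So the rule of thumb we just discovered: one function parameter per input component. A minimal sketch of where the file is at now (the parameter names are mine):

    import gradio as gr

    # two input components means the function receives two arguments,
    # each delivered as a numpy array by default
    def image_classifier(input1, input2):
        print(input1)  # prints the raw numpy matrix for the first image
        return {"cat": 0.3, "dog": 0.7}

    inputs = [gr.Image(), gr.Image()]
    demo = gr.Interface(fn=image_classifier, inputs=inputs, outputs="label")
    demo.launch(share=False)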
we would probably have to look at the component itself to understand it so if we go back over to the documentation we have the Image here okay an input component passes the uploaded image as a numpy array a PIL.Image or a string file path depending on your type okay and for SVGs the type parameter is ignored so maybe there's like a way to say I want this to be returned a certain way do we have initializations here so here we have value and I think it's defaulting to an array but maybe we'd rather have it as something else right so I wonder if what I could do is go back over to here okay and I could say value equals a PIL image numpy array or path oh that's for the default image that's not necessarily what gets passed along so I was hoping that we could control its outputs but maybe we don't have control over that yeah yeah yeah so here it represents it as a numpy array so I think maybe we don't have control maybe it always passes it along as a numpy array
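Reading the same docs more closely, the knob being hunted for here is actually the type parameter on the component rather than value — a hedged sketch of what I mean:

    # per the Image docs quoted above, type controls the format your function
    # receives: "numpy" (the default), "pil", or "filepath"
    inputs = [gr.Image(type="pil"), gr.Image(type="pil")]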
and as an output it can also be an output component okay but how do we know what it is your function should return one of these types mm okay here we go so no that's not exactly what I wanted but like this would take an input and then the output is an output right like if we wanted to change this to our output we'd go back over to here and say outputs here well that actually depends on what's coming back here because if we have this data this is what gets passed into here so we'd have to pass the image along for it to work so that's probably not a good idea so again I was just hoping that we could change it so that it's a numpy array but it's totally fine if it's a numpy array I don't mind that and so that's you know starting to understand how these components work right so that's pretty clear let's go take a look at Blocks what's that so Blocks is Gradio's low-level API that allows you to create more custom web applications so I'm going to go back over to here and we'll
just rename this to interface.py okay because that's really what we're doing here um I guess the other question before we move on is is there another type of interface Interface is Gradio's main high-level class yeah yeah yeah okay let's go take a look at Blocks next so Blocks is Gradio's low-level API that allows you to create more custom web apps and demos than Interface okay compared to the Interface class Blocks has more flexibility and control over the layout of components okay that makes sense so over here we have an example let's go ahead and grab it okay we'll make a new file here oops that's not what I wanted we'll make a new file called blocks.py and we'll dump this in here and so I'm going to stop this app here and so now we're going to go ahead and do python blocks.py and that's going to start up on whatever our default port is here we'll give this a hard refresh should be the same port and what do we get so now we have something here so let's take a look and see what we have so we have a block called demo so we have this area where we have markdown that we're rendering then we're just telling it to make a row okay in this row we want a text field here okay we have a row of textboxes we have two textboxes input and output and then we have our button here and then we have a button click okay that kind of makes sense so basically it sounds like it just gives us a little bit more control over our layout
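The snippet we pasted into blocks.py is roughly the docs example, something like:

    import gradio as gr

    def update(name):
        return "Welcome to Gradio, " + name + "!"

    with gr.Blocks() as demo:                      # the block called demo
        gr.Markdown("Start typing below and then click Run to see the output.")
        with gr.Row():                             # lay components out side by side
            inp = gr.Textbox(placeholder="What is your name?")
            out = gr.Textbox()
        btn = gr.Button("Run")
        btn.click(fn=update, inputs=inp, outputs=out)  # wire the button to the function

    demo.launch()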
oh it seems like we can change our theme okay I mean it already looks like the theme is Soft but let's go ahead and try that so we'll just say theme equals soft there's an example so I'm going to stop this restart this let's see if this changes its look give it a moment here to connect okay it kind of looks different I'm actually kind of curious what other kinds of themes there are let's go down here to themes maybe they have the options over here and obviously we can specify it this way as well is that what our theme looks like there kind of I'm just again I'm trying to see if there's like a dark mode how do I get a dark mode theme we'll search gradio dark theme mode blocks why don't we just try dark and see what happens okay because maybe that already does exist and we'll just stop it again this is how you learn you just play around with things so you get results and we'll go back over to here and no it does not so I think that maybe for that one we might have to override some of the theming to get it but um that's fine but what I'm wondering about here is that we have like gr.Row yeah okay so Blocks have layouts and these are its components underneath okay so that kind of makes sense so we have an Accordion okay so we could try adding this in here and see what happens okay so that's one element that we could add um we have Column let's go try adding that in here okay this is all just nonsense so far
hopefully this doesn't break our app uh what else do we have we have Group we could do that as well I'm not sure if we already have a group in there I don't necessarily see one but we'll go ahead and do that oh we have Tab that's kind of cool we'll go ahead and grab that I'm not sure how it's going to load those images because we don't actually have those images so that might error out because we don't have um those images okay so uh there's also render um it looks like it's a decorator so uh yeah it is a decorator for dynamic layouts so the components and event listeners the app renders change depending on the logic okay so that one's a little bit more complex I'm not sure if I would want to show that to you right now but we'll go ahead and stop this and see if this just works so it says with as demo uh we just have a duplicate line here and this is also a duplicate line there so we'll take that out we'll try this again we'll see if it takes it okay it does and let's see what uh mess we have here okay so we have something very complex as you can see again I really don't like this theme too much but uh it's kind of cool to see what these things do so that's kind of Blocks for you let's go back to the main list here as they were telling us there were some other components that we might be interested in it was back over here so we have ChatInterface which is next let's take a look at this one I mean
Textbox is not that interesting but I'm going to assume that ChatInterface gives us a chat interface so we'll go ahead and do that so we'll make a new file chat_interface.py we'll paste that in here and then we'll just back out here so um we'll say python chat_interface.py whoops okay we'll go back over to here we'll give this a hard refresh and so now we should have a chat interface again I really apologize for the lack of contrast here but this is what I have so hola hello okay so we click the button and then it puts it into the chat and it can say hi but I'm not talking to anybody here so this is not going to really do a whole lot but you can see we're having a back and forth conversation here um so that's pretty darn straightforward okay it's just slightly formatted differently so there's that
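The chat_interface.py we just ran is roughly the docs demo — a canned bot with no model behind it yet, which is why it isn't really talking to anybody:

    import gradio as gr
    import random

    # fn receives the new message plus the conversation history
    def random_response(message, history):
        return random.choice(["Yes", "No"])

    gr.ChatInterface(fn=random_response).launch()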
and we'll go back to the main docs here I kind of lost where we were go back to the docs chat interface image audio dataframe Dataframe might be interesting this component displays a table of values spreadsheet-style I mean that's just a very specific type of component yeah you can see we have a lot of different components and so basically there's inputs and outputs and we just have to be able to map things to them okay so maybe what we should do is try to build something really simple in this um so let's actually try to build something now I said that you can use something like ChatGPT so maybe that's something we will do here today I'm actually going to use Google actually let's use Meta AI let's see if it
can do it and I'm just going to tell it to like uh you know write me a chat interface using Gradio um and it should work with well I'm trying to decide here what this could work with um OpenAI with OpenAI okay we'll try that because now we have a general understanding of how this works so I'm going to go ahead and copy this here now if you don't have an API key for OpenAI this is not going to work for you but I'm going to go ahead here and just say open AI or we'll just say real chat we'll say chat.py here and I'll paste this code in here so we have here gradio openai we have our API key that we can utilize here we have the OpenAI chat completion so the idea is that we have the chatbot which is its inputs I think this is just inputs here like if we did this this is inputs maybe it's not actually um because that other one's slightly different we'd have to go check the documentation
here so we go to ChatInterface let's look at its initialization oh so it's fn okay with Python you don't have to explicitly name things but this function here is fn right but it's going to mess up because we'd have to change all of these like this so the first one is the fn one um I mean it doesn't say it's multimodal next I'm not sure what it's taking next it must be taking the textbox here so we have our input or response or title or description and this here is trying to use Davinci okay so I mean Davinci is fine but that's kind of a very dumb model um but I'm going to make my way over to OpenAI and they might have a free tier I'm not sure if they do to get to log in here it's kind of hidden um I think you click that logo no maybe it's under products it's almost like they don't want you to use OpenAI and so I'm actually in the API login you'd have to create an account to have the API and they might have a free tier
does the OpenAI API have a free tier let's find out uh they might but anyway I know that I have my own account here so I'm just going to go ahead and log in going to log in with Google I believe as my account if I don't it's a new account that's totally fine and so I need to create myself a new project so I'm going to go to where is this I can create a new project here this will just be um gradio and from here I need to generate out a new key it's a little bit of a mess so it's a bit hard to see what I'm doing I'm going to go to manage projects we have API keys up here but if I click into this project okay this is fine but I need my API key so I click here and I go to API keys I already have a key that I generated out for another use I'm going to create this one here today this is going to be called gradio example and I'm going to scope this based on this project here the gradio one I'm going
to give it uh full access here I'm going to get this secret I'm going to delete the secret later so it's not a big deal if you see it and I need to bring this back over to my code now we should load this via environment variable so I'm just going to stop this here and I was going to export this variable as the OpenAI API key um well you know what actually I'd rather load this in from a .env file so I'm going to make a .env.example here and there's usually three things we have to set we have the org ID the project ID and then the OpenAI API key and um obviously these will be examples but I'm going to create a real .env here just so that um I'm not uh causing myself some issues here okay and I'm just going to go ahead and make a new file here called .gitignore and we are going to ignore the .env file okay just so we don't commit that and then I need to paste this in so um there's probably very specific env var names for OpenAI
uh here yeah like this one's called OPENAI_API_KEY worst case we can just set them manually um normally a lot of libraries will have default env var names and then you load them in here yeah this one says OPENAI um so we will grab that one here and I'm going to go back over to uh GitHub Codespaces which is what we're in here today and then there's obviously an org ID which we'll need and then a project ID before all you needed was a single key but now I believe that you need all three and you should use all three if you can um so in here I need to go back to well we need this project here so I'll go back to manage projects and this is just loading the projects and so this is the ID of the project I'm going to grab this here I'm going to bring this on over to here nope not there this is the project ID and then I also need the org ID which is over here so tons and tons of keys so now that I have those three keys I need to configure the client so that it uses all of those
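So the .env file ends up looking something like this — the exact variable names are whatever you choose, though if I remember right the openai library will pick up OPENAI_API_KEY, OPENAI_ORG_ID, and OPENAI_PROJECT_ID automatically if you use those names:

    # .env — real values live here; .env is listed in .gitignore
    # (.env.example holds placeholders and does get committed)
    OPENAI_API_KEY=sk-...
    ORG_ID=org-...
    PROJECT_ID=proj_...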
I'll go back over to here we can see that it's using this I don't think it's a capital C but to be fair Meta's code might be a little bit out of date here but it did give us the basis of something so that's totally fine and we see OpenAI being configured that way so I'm going to go over here and search the OpenAI API reference okay and in here it's going back oh it's right up here okay and so I'm looking for how we configure this so you can see this is a little bit different actually gpt-4o-mini is probably what I want to use so I'm actually going to copy this here and um go back over to our code base which is over here and paste it in and so this one's called I'm just going to name it openai here uh I prefer client to be honest I'll bring this on down here I'm just going to ignore this now I'd rather have this streaming to be honest so I'll bring this over here
and I just know from experience that um it takes all three but we need to set those somewhere so let's go to authentication okay here we go so this is where we are setting our keys what about our other one our actual main one okay that's fine so maybe the API key will get picked up from the environment variable over here but we'll explicitly set up the org here so in here this is import I believe import os and then this here is os.environ square brackets and this would be org ID as that's what we called it and then here this would be os.environ and I believe it's square brackets it might be uh oh no you know what it's .get I don't know how I know that it's that but I know that it's .get and we'll say project ID okay and so we're not explicitly setting the API key this way I'm going to assume that it's just going to get loaded into the env vars and it'll just implicitly know it uh one thing I would like to know is how we can set up this chat interface to be streaming
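Putting that together, the client setup we've talked through looks roughly like this (the env var names ORG_ID and PROJECT_ID are just what we put in our .env):

    import os
    from openai import OpenAI

    # OPENAI_API_KEY is read implicitly from the environment by the client;
    # org and project we pass explicitly from our own variable names
    client = OpenAI(
        organization=os.environ["ORG_ID"],
        project=os.environ.get("PROJECT_ID"),
    )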
now this just says Interface is that correct because this one's called ChatInterface so it makes me think that Meta didn't produce the correct code and so it wasn't intelligent enough to do it sorry Meta I guess you're the wrong one to do it let's go over to Gemini and I'm going to give it its code okay so uh can you fix my code so that Gradio is using the correct chat interface and it's hooked up for streaming because it clearly isn't hooked up for streaming and I can only imagine that Gradio can do streaming so here we have updated code um an OpenAI API response I'm not sure why we need that there this is still using Interface okay that can't be right because this is Interface uh you aren't using ChatInterface why not is there a reason why you're not going to use it okay so this looks a lot better but this is where it helps to have actually explored the code first and if we hadn't done that we would have probably thought Interface was totally fine so I'll paste this in here and the code's a little bit different I don't know if we really need this in fact I don't think it's even being used right away I knew that made no sense um also oh you know what it's saying it can't be resolved because we haven't installed it yet so we've got to go back over here to the requirements and add openai okay so then we'll go back over to our chat.py and so it's using the usual structured interface so that kind of looks correct to me
it's passing the messages in so that looks good as well um how do we know that it accepts streaming I don't know so like I would think that you'd have to do something special to do streaming so I'm going to go back over to here did it just get rid of streaming no it still has streaming on here um and so I would go back over to Gradio I'm going to type in stream so here's an example of chat streaming so there's no specific setting for streaming it just seems like maybe it just knows how to stream okay so let's go back over to here and let's go ahead and start this and see if it works no module named openai well oh did we do our pip install first pip install -r requirements.txt okay and then we'll try to start this again now if we used Claude Sonnet 3.5 the latest version I bet it would get this on the first try but again we're just trying different models and seeing what happens um as that's just a good way to test things in general okay so here it says the api_key option must be set either by passing the value or the OPENAI_API_KEY environment variable oh you know why this didn't work is that we didn't add a way to load the environment variables so we'll add python-dotenv to the requirements why it's called python-dotenv I don't know but when you actually import it it is actually called just dotenv which is kind of annoying so we'll go over to here and we'll say import dotenv and we'll say dotenv.load_dotenv() and um that should work now good
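That quirk is worth a two-line sketch, since the package name and the import name genuinely differ:

    # requirements.txt needs the package name:  python-dotenv
    # the import, though, is just dotenv:
    import dotenv
    dotenv.load_dotenv()  # reads .env into os.environ before OpenAI() is constructed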
so we'll go ahead and hit up and so that should load in the environment variables it looks like we got no not an error but we'll go back here and now run the app and so now it's started on this other port we'll go back over to here we'll give this a hard refresh we'll see what we have great and so now this should be hooked up uh one thing I don't like is how hard it is to see stuff here so I want to get a better theme in here so I'll go back over to Gemini can you create a high contrast theme for Gradio because it's hard to see the default one okay let's see if it can do that at least and so here we have a much higher contrast one and so I'm hoping what we can do okay I'm going to copy this here I'm not sure if it's going to work but I'm going to copy it here and maybe we can pass it in here let's just go check the API to find out if that's the case we can pass a theme there okay good so we'll just say theme and then we'll put in our where's the custom theme create high contrast theme I'll just say high theme here and I'm just going to go grab this function and we'll go down below just to see if we can make it a little bit more readable because it is a little bit hard to see high theme okay we'll start this up give it a nice hard refresh uh I would not call this a high contrast theme I would call this really hard to see but maybe what we could do is adjust this ourselves
because clearly it's not listening to us um yeah it's using the soft theme okay what are the default themes okay let's see what it has here these are just terrible these themes I don't want soft is there anything besides it oh we have Base Default Origin Citrus Monochrome Soft Glass Ocean can we at least see the theme dark mode yeah I'd love that can I just have that like this this one's even better okay how do I get this one okay great can I have that theme the left panel is where you create the theme the different aspects of the theme okay can I have it please may I have it how do I get it the strange thing is the dark theme looks easier I mean it looks like it's just the base theme in dark mode to be honest so maybe what we can do is just go back over to here how to set dark theme on Gradio I'm not sure why this is so darn hard maybe you can't force it you have to have a toggle or something oh this is so silly because I want to be able to see what I'm doing and it's just really hard to see anything all right well I'll just take out my custom theme because that clearly is nonsense here I mean we could get it to work it'll just take a lot more time but I do apologize I find again this very hard to see and I just realized that that's unfortunate so we'll give this a moment here okay so let's see if this works hello uh I want to learn Japanese can you teach me some basic phrases let's go
ahead and hit enter and we got an error yeah no surprise so it says here uh not enough values to unpack okay so something is clearly wrong with this assistant message yeah I don't think this did a good job so what I'm going to do is copy this here let's go over to something that actually can make good code we'll go over to Claude AI um this code is not working can you help me out and let's see if Claude can solve it Claude's really really good at coding I just want to see if it figures out what's wrong uh change the organization project to using the API no that's not right fix the model name gpt-4o to gpt-4 there isn't a model called gpt-4o-mini I mean there definitely is so that's wrong using delta.content instead of text to access the stream content add a None check um no I mean you're wrong on many parts I just want to solve this problem so we go back over to our example here we'll see what's going on here so it's like this and we'll see what it says so it's putting my code back to normal which is good so it says here change the parameter order to match Gradio's expectation simplify the history handling okay so this looks a lot better um so maybe yeah again it just was not compatible with the way Gradio handles the code so we go back over to here I think what it's changing is this component right here that part there so we'll go back and paste it in uh yeah it looks a bit different for sure okay um I think I might have lost that code there we'll go ahead and paste this in again
and I'm going to stop this and start this again if it doesn't work then we'll manually debug it because um you know again these things don't always work as expected can you teach me some basic Japanese phrases for uh restaurants let's see if that works is it going to stream and we got another error so it's interesting that it can't solve it here 'Choice' object has no attribute 'text' okay so let's take a look here and see what we have so we have this configured this is totally fine we have user assistant which might be correct we have add the uh pending message that's fine we have the client completion and we're passing back a stream in the stream we are iterating through it and we're doing this 'Choice' object has no attribute 'text' I mean there's nothing in here where we're setting choice right so that's why it's kind of confusing and we're not even uh defining the text like any text boxes so let's go back over to here let's go tell Claude that we have this error and see if it can figure out what the problem is now okay so it's that delta.content issue that it's adjusting again which is over here so I imagine what it's changed is just this part so let's just grab the part that it's changed oh choices here okay that's what it was talking about choices.text all right it wasn't clicking there a second ago okay so we'll still stop this we'll try this again and we'll go back here and I'm just going to copy this text I'm going to give this a hard refresh and we'll paste this in again we'll hit enter and there we go so now we have our little chat
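To recap what all that debugging converged on, here's a sketch of roughly where chat.py ends up — assuming the tuple-style history this version of Gradio passes, and using delta.content with a None check, which were the two actual bugs:

    import gradio as gr
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment

    def chat(message, history):
        # history arrives as (user, assistant) pairs; rebuild OpenAI-style messages
        messages = []
        for user_msg, assistant_msg in history:
            messages.append({"role": "user", "content": user_msg})
            messages.append({"role": "assistant", "content": assistant_msg})
        messages.append({"role": "user", "content": message})

        stream = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, stream=True
        )
        partial = ""
        for chunk in stream:
            delta = chunk.choices[0].delta.content  # not .text — that was the error
            if delta is not None:
                partial += delta
                yield partial  # yielding is what makes Gradio render it incrementally

    gr.ChatInterface(fn=chat).launch()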
um could we take this a bit further one thing that I would really like is to add on the left hand side a um image generation yeah maybe we can take it a little bit further so we know that we can create like a left-hand column like we can use Blocks to create a column so I think maybe what I might want to see is like an image and um let's see if we can ask Claude to do this so can you add an image to the left column can we have uh two columns one for image and one for chat um I want to generate an image from the response of the last chat message returned by the assistant okay and so I just don't know how the outputs will work but let's see what it does and then read through it but obviously we would use that Blocks thing as we saw before yeah so there's our Blocks that's setting up a single row and then within it we have a column I don't know if we need to have a row here that might be um kind of useless here but here we have our column which makes sense connect the chatbox output to the image so that's what I didn't know we could do here so when we hit the image then it's going to hook up this generate image from the last history here okay and it's using DALL-E 3 which is actually perfect because that's an API that's part of OpenAI so let's go ahead and copy this one thing I don't think we need is that row column but we'll go ahead and see what we can do here okay so I'm going to try and stop this here actually before I do that I'm just going to undo I'm going to say uh chat_with_image.py and then I'll just paste this in again here and we'll just say python chat_with_image.py it's already erroring out because it doesn't know what chatbot is here also it doesn't know what user_input is either so I'm not sure uh why it did that we have it over here
did you mean submit function uh possibly but I don't know what it's supposed to be okay so clearly it has some old code here but we'll go ahead and just ask Claude to fix it again one thing we might want to check here is that we're returning back the actual chat interface and so if we go back to the documentation this is something we can check I don't want you to be 100% reliant on the code so that's why I just constantly keep checking the code here so that you or sorry not the code but I don't want to be 100% reliant on uh generative AI to code and so that's why we constantly keep checking here so this returns back an object so what I'm trying to figure out from the code is what does it return it doesn't tell us okay can we get submit anywhere in here submit so there is clearly a submit button dot on this but that's if we initialized it so these are components does it tell us maybe what it returns it doesn't a chatbot with a custom chatbot that includes the placeholder what does ChatInterface return in Gradio because it's not showing us an easy way to find it that's kind of frustrating like how would we know what we would return here replace chat interface with custom chatbot implementation combine the chat interface with it yeah so I think it made it up because I'm not sure if you can just do chatbot and chain chat interface on like that so here we have message well first let's just see if this works and then the second thing is we'll read the code and see if we can make sense of it but normally what I would do when I'm coding is I would go look it up and say okay what does it return but the documentation doesn't tell us anything unless there's like um let's say a Gradio API reference maybe that might be better no I would describe this documentation as not very good okay yeah you have a chatbot so here's this chatbot dot like where does this come from like how do we know what object is being returned it just doesn't tell us that's very frustrating um so we'll go back over to here and I mean maybe if we see the component then we know that there's a button so if we go to submit here there's no submit button in particular here and that's why it's confusing but we'll go back over to our interface and see if it actually worked um over here so it says it's running so maybe it did work so I'm going to go back over here and give this a hard refresh okay so now we have our chat on the left hand side and this on the right hand side
but this is now a custom chat it's not that integrated one so our experience is not as nice could we have used the chat interface possibly I would probably want the image on the uh left hand side and then the other one on the right so I'm just going to flip that around I would also like to try to see if I can get rid of this row as I don't think it's necessary as we already have a single one here so I'm going to go here and just flip these around here okay and then I'm just going to go ahead and stop this so let's see if that row was useless give that a moment to start okay so without the row it doesn't work okay so that's what I wanted to find out so we'll go back here and put the row back in we'll bring back our indentation I'll stop this refresh that we'll start this again and we lost our column but I put it right back in there so why can't I have it back okay well I'll just undo it till I go back to this one okay so I'll just undo this again and restart it back up oh come on it is not even trying to work here so we'll go here I'm just going to paste a two-space tab here first of all I really do not like four spaces so I'm just going to quickly fix that as that's kind of bothering me but I'm trying to swap the locations so one's on the left one's on the right it seems like it's kind of caching it maybe we'll just open up our developer tools and tell it to not cache it anymore and that might resolve our issue here as I'm tediously changing the indentation for no particular reason uh but yeah I want the image to be on yeah I want the image first and the chat second so let's try to stop that again and I'm going to reopen it this time what I'm going to do is I'm going to open up the inspector and I'm going to make sure that it's not cached so we go to network here as long as this is open and disable cache is on there that might fix our issue here we'll give this a hard refresh and that might fix our problem I don't understand it worked before right actually the image is not showing up here at all oh you know what are we starting up the wrong app maybe that's our problem that's what it is okay so we'll start this up again I would like to know what Gradio returns maybe Gradio has a GitHub repo obviously it would but like how do we know what's being returned and that's what's kind of frustrating here
I feel like the docs should tell us that let's go into the Gradio repo as that might be where it is and then in here we have our components and then in here we have the chat interface probably over here maybe it's not under components yeah it's just there and so what I want to know is what does it return so we have ChatInterface okay what do you return you must return something the chat interface returns itself okay so is there like a submit button on here if I type in submit I mean there's a submit function in here right so we have a submit function so maybe the other one could have been correct and maybe it could have gone off of submit but without knowing exactly what it returns and it not being defined that kind of makes it a little bit hard otherwise you have to kind of dig through the code to figure it out so anyway we have our chat here and um I don't expect it to work with DALL-E but if it does that would be really great so um can you tell me what the weather or uh describe a nice scenic view in a single short sentence because what I want to see we'll hit enter here is I want to see if it calls the other one DALL-E like we didn't write any of that code but if it works that'd be really interesting now while that is going we can go check our back end let's see if we have any errors I don't see any errors here so I'm assuming that it might be working we have a golden sunrise cascades over the gentle hills painting the sky in vibrant hues of orange
and pink a tranquil lake mirrors it with breathtaking beauty that's really awesome that worked perfectly um again I would prefer if this used the chat interface but probably um if we spent a bit more time we could figure it out let's go take a look at this code just to make sure we understand it before we move on here but I would call Gradio done at this point so here where we have OpenAI we'll split the screen here so we can keep track of where we are then we have this generate image prompt right so the idea is that it's bringing in text and then we're calling client.images.generate we're using DALL-E 3 we're telling it what size of image we want the quality of image we want and then once it generates it returns that URL
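Roughly what that generate-image half of the file is doing — the model and size values here are assumptions based on what's on screen:

    # turn the assistant's last reply into an image via the OpenAI Images API
    def generate_image(prompt_text):
        result = client.images.generate(
            model="dall-e-3",
            prompt=prompt_text,
            size="1024x1024",
            quality="standard",
            n=1,
        )
        return result.data[0].url  # DALL-E 3 responds with a hosted image URL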
then here we have another function for chat and generate so we have user input and history okay so the idea is that we have our messages we are iterating through them and appending our messages because they're stored in a plain array and we're attaching them as user assistant alternating right so we're going user message whatever and then we append this message here yeah so it seems like there is a series of messages and one's assistant and then one is user that's what it's saying so it's like a bunch of those okay do we have to destroy that structure uh I don't know I mean for ChatGPT we have to do this but maybe messages this is the format that Gradio actually takes okay then down below here we have the response and we have a stream and so we are sending the stream in parts here okay and so we return it here so I would think it doesn't seem like it actually streams it because it looks like it's taking the stream iterating through it and collecting it before sending it over let me go back over here let's observe that again did it actually stream okay how about a sci-fi setting okay so hit enter yeah it's not writing it out it's not streaming it okay so it's kind of useless the fact that we're using the streaming API here because it's not doing the typewriting thing what it's doing is taking the response and collecting it so let's search gradio streaming API streaming outputs yeah that's what I would like to do so here it says to yield okay so there is a way to change this to be streaming and so we have this new one the only thing I would like to do is change this code so it's streaming
so that's fine so then we have our block we have our row we have our image output we have our chatbot from the Chatbot component so as opposed to actually using the chat interface we are creating a Chatbot component and then we have a text field so they're completely separate and then we have a clear button which we could get rid of I don't even think we really need a clear button I'm just going to get rid of that I don't really want it I'll leave it in just because I don't want to change the code too much no I'm going to get rid of it I don't even want a clear button oh you know we can just comment it out there we go so we have that and so the message when it's submitted has respond and so it looks like it does chat image chat and generate image so it returns the image URL and appends it to the history chat and generate yeah okay so generate the chat and then we get the image URL back where does this get called oh here okay so after the response it does it there you know what I don't think we could change this to streaming because we would have to wait till the final response came well actually you know what we could probably change it so that if it finished we then put it in here but I'm not confident that this code would be easy to do if it was streaming with an image so maybe we should leave it this way but you know it would be better if at least this one was streaming so let's go back over to this one I don't know I'm not sure how to do it for the chat interface oh sorry maybe chat here we go because this one does yield right and they were saying like if you did yield it would stream it so let's go back to the other one for a second maybe the other one was streaming and I just didn't notice let's go back to this one hello run let's see what happens okay all right so for the second example streaming might be too hard to do there might be a way to do it the first one does stream so that's okay we'll call this done um and yeah I'll just go ahead and commit this code I just want to make sure that the .env is not being included it is not so I'm going to go ahead and just add uh gradio example okay and because I want to be responsible I'm going to go over to my OpenAI account wherever that is and I'm just going to go ahead and get rid of the gradio example key revoke there we go and I'll see you in
the next one okay ciao hey this is Andrew Brown in this video I want to take a look at Streamlight or Streamlit it depends on how you pronounce it I guess streamlit is the correct way of saying it very similar to Gradio um in the sense that uh you have um Python and you're able to just start writing something and get something back um I like Streamlit more than Gradio but it really just depends on uh your preference but you need to just kind of explore it to find out they have a lot of different examples and it's really good at generating out um code examples very easily in any place so it's up to you apparently there's like an account that you can make I'm going to assume that that's so that you can deploy it to their cloud um which might be something you might want to do apparently it's free so maybe I'm going to go ahead and sign up for this I didn't even know you could sign up for it and I'm going to go ahead and just make a new account now I'm just kind of curious like could you sign up for Gradio and I didn't do that earlier no so maybe Gradio is made by Hugging Face that actually might make sense who makes Gradio Hugging Face okay so that's why there's the Hugging Face Spaces okay but we never deployed to Hugging Face Spaces when we were doing Gradio but I imagine it's a very similar setup so I'm just going to go ahead here and just enter in a couple things they don't look like they're all required so I'm going to see if I can just skip them here but I've got to at least enter my country okay so now we're in here and we can go ahead and create an app this way no apps to show in this workspace but there's obviously other ones here but this is not what I'm interested in well actually that one looked really nice here like its UI so I would just say that Streamlit can make much nicer looking UIs I'm not sure what's going on here but um let's go ahead and start using it
I don't have the GenAI Essentials repo open right now, so I'm going to open it in GitHub Codespaces again. You'll notice I use all sorts of developer environments (sometimes Gitpod, sometimes GitHub Codespaces, sometimes whatever) and just switch between them; that's how it is. I have an existing Codespace, but I'll delete it and make a fresh environment. The first thing I want to do once it loads is switch the theme to dark mode, but it's having a hard time letting me pick a theme. Any day now... still nothing. Well, while that loads, let's go over to Streamlit and see if we can make sense of its documentation. Installation should be pretty straightforward: just a pip install of streamlit, and then they have an example, so I assume you make a hello.py file. The Codespace has loaded now, so I'll switch to dark mode and install the Vim extension. Don't install Vim unless you know Vim keys, because it remaps all your keys; I need it because I'm so used to Vim now that I can't do anything without it. I'll allow clipboard access, make a new folder called streamlit, and see if we can get some basics down. First we need to install Streamlit. This one is a bit different because it installs a binary, so when we launch our apps we won't type python; we'll type streamlit and then the name of the Python file. I'll make a requirements.txt, write streamlit in it, cd into the streamlit directory, bump up the terminal font so we can read it, and run pip install -r requirements.txt. That installs it. That part is straightforward, so let's go look at some concepts. I feel like the docs make it really complicated for no reason,
because it's really, really easy to use. "Run apps," yeah; what I'm actually looking for is the components. The thing is, most of the time when I use Streamlit I just use something like ChatGPT or Claude to generate the interface. What I'm trying to find here are the actual components we can use, and this page gives us something to start with, so let's try working with elements randomly first. We have Streamlit installed, so we'll make a new file called hello.py, and the first thing we'll have to do is import streamlit. Just like Gradio, I bet you can alias it. What abbreviation do they use? They use st, so I'll assume that's what we're supposed to do at the top: import streamlit as st. Then we'll try the first thing, st.write, and just say "hello world", and down in the terminal we'll type streamlit run hello.py. I think that will just work; it starts on port 8501, so we'll open that up, give it a moment... and we have some text. There we go. We can write more text and say "goodbye moon." I saved the file and refreshed the page, and the change just showed up.
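For reference, the whole hello example fits in a few lines; this is a minimal sketch of what we just typed, launched with streamlit run hello.py.

```python
# hello.py -- run with: streamlit run hello.py (serves on port 8501 by default)
import streamlit as st

st.write("Hello, World")
st.write("Goodbye, Moon")
```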
That live refresh is one nice thing we weren't getting with Gradio; maybe Gradio has a development-mode flag, but with Streamlit I just refreshed and the change was there. So we have our two lines. Let's go back to the docs and see what else we can do text-wise. st.write_stream: "write generators or streams to the app with a typewriter effect." Oh, that looks kind of fun, so let's try it with "where are you, Mars?" After a refresh: "expects a generator or stream-like object as input." Okay, so we can't just dump a string in there; it wants a very specific type of object. Interesting. The docs also say that any time Streamlit sees a variable or literal value on its own line, it automatically writes it into your app. Let me try a couple of quick experiments, st.write with "hello" and then a bare "hello Andrew" on its own line; I'm just guessing here. No, that didn't work either (the first error was left over from before). Reading it again: "any time Streamlit sees a variable or literal value on its own line, it automatically writes that to your app." Maybe it means something slightly different than I think, but that's fine; we can keep going through the docs.
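For reference, here is a minimal sketch of what st.write_stream actually expects: a generator (or stream-like object) rather than a plain string. The slow_text name is just mine.

```python
import time
import streamlit as st

def slow_text():
    # st.write_stream wants a generator or stream-like object, not a plain string
    for word in "Hello from a generator, streamed word by word.".split():
        yield word + " "
        time.sleep(0.1)

st.write_stream(slow_text)  # renders with a typewriter effect
```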
So that was the first feature. Next is magic: "magic commands are a feature in Streamlit that allows you to write almost anything in markdown without an explicit command." Oh, okay, it's markup; I thought it was some weird injection thing. Basically, if we write a multi-line string (I believe this is how you do multi-line strings in Python) and put "# this is magic" in it, it should render. Let's see, hard refresh... and there it is; it automatically formats it as markdown. It can apparently take other formats too: the docs say you can draw strings, markdown, data, and charts without explicit commands. We don't have pandas in our requirements, but we can quickly add it and run pip install -r requirements.txt; it says the requirement is already satisfied, so Streamlit must have brought pandas in already (pip seemed upset when I added it, so I'll just ignore that). We'll copy their dataframe example, "draw this dataframe," though we do have to write it out; I didn't think a bare variable would just render, so I'll wrap it in st.write. I'm not sure why the examples are incomplete like that; that's what I don't like about Python community documentation, it's always incomplete. But it's rendering the dataframe, so clearly st.write can take a few different types. There's a matplotlib example as well, so we'll grab that, pass the figure to st.write, and hard refresh. There it is; okay, great, we're starting to get some charts.
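Putting those three pieces together (magic markdown, a dataframe, and a matplotlib chart), a minimal sketch looks like this. I'm using st.pyplot for the figure as the explicit call, though st.write can render it too; the data values are just made up.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import streamlit as st

# "magic": a bare literal on its own line is rendered as markdown automatically
"""
# This is magic
"""

df = pd.DataFrame({"first column": [1, 2, 3, 4], "second column": [10, 20, 30, 40]})
st.write(df)  # renders an interactive dataframe

arr = np.random.normal(1, 1, size=100)
fig, ax = plt.subplots()
ax.hist(arr, bins=20)
st.pyplot(fig)  # st.write(fig) works here as well
```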
That's pretty cool. Let's go back to the docs and continue on. We have text elements; we saw a bit of that already, with headings, body text, and so on. If we explicitly want markdown, there's st.markdown, which might make a little more sense for what we're doing. It wouldn't change the output; it would just make it explicit that we're expecting markdown and not something else. I'm going to close some of these other tabs so we stay a bit focused. There's also st.title, which I'd expect to show up at the top, so let's add st.title("hello") at the top of the file and refresh. There it is. It also looks like you can put other nice little things in titles, like colored text; that must be their extension of markdown. It didn't color it blue at first because I missed the colon in the :blue[...] syntax. Fixed, refreshed; okay, cool. I was thinking maybe you could specify a title level, but it's a single type of title; instead we have titles and then headers, and I imagine headers act as the different heading levels. Let's drop a header in and refresh. That's nice, I like that. We want the divider, so we set that, and then there's a subheader. So it's not H1/H2/H3; you just specify what kind of heading you want. I'll grab the gray-divider variant and put it between these two, and the subheader gets a nice little gray line, because that's what we set the divider to. We already did markdown, and there are a few other elements: caption, code, divider, echo, latex, text. Pretty straightforward. st.code is probably a useful one if you want to show a code example, and you specify the language. I wonder if it supports my favorite language, Ruby; that would be nice.
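Here is a quick sketch of the text elements we just walked through: a title with a color tag, a header with a divider, a subheader, explicit markdown, and a code block (the Ruby snippet is just mine, to test the language support).

```python
import streamlit as st

st.title("Hello :blue[world]")            # :color[text] is Streamlit's markdown extension
st.header("A header", divider="gray")     # draws a gray divider line under the header
st.subheader("A subheader")
st.markdown("Explicit *markdown* rendering")
st.code('puts "hello"', language="ruby")  # yes, it supports Ruby highlighting
```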
So we have our code example; that's fine. We can even render raw HTML, which seems pretty cool. Those are the text elements; next are data elements. Clearly it can render dataframes (we saw st.write do that implicitly, but there's a dedicated element if you want one), and obviously tables too. What's st.metric? There's an example; let's see it. That's really cool: it's a metric in the dashboard sense, like showing a temperature along with its change.
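The metric element from the docs example looks like this; a one-line sketch:

```python
import streamlit as st

# a dashboard-style stat: a label, the current value, and a delta shown with an arrow
st.metric(label="Temperature", value="70 °F", delta="1.2 °F")
```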
You could have a whole series of metrics, so if you wanted to build a custom dashboard, you could. st.json, I assume, just pretty-prints JSON; that's nice. Then there are chart elements and input elements. Everything is very straightforward, and it's clearly an extremely robust system. I could see doing a full course just on this, but right now we want to start using it in some practical way, so I'm going to make a new file called chat.py.
What I want to do is use Claude, because it will probably do a better job of this than I would, and tell it: "I want to use Streamlit to make an app where one column has an image and the other column is a chat interface. We want to use OpenAI, such as GPT-4o mini and DALL·E 3, and the response from the assistant should generate an image in the image column." Hopefully that wasn't the worst-written prompt, but let's see what we get; we already built something similar in the Gradio example. I've found that models generate Streamlit code better than most other frameworks, so I have a strong feeling this will work the first time. Maybe I have too much faith, but Streamlit is really well optimized for this, and it's just more robust than Gradio. Gradio is nice too; it's whatever you want to use, I guess. We'll copy the bigger example it produced and paste it in. You can see that if you wanted to build a really polished system, you'd have to spend more time learning the UI and probably configure it manually (I would configure it manually), but right now we're trying to get this going as quickly and easily as we can. I already have a .env.example from before, so I'll bring that over, along with python-dotenv. I decided to use OpenAI; we could have used anything, but the OpenAI API is just really easy, so that's what we'll do today. I'll stop the app and run pip install -r requirements.txt; we might need more packages later, but that's enough for now. I have the OpenAI platform open here. If you don't know how to get there, I show you in the Gradio lab, so maybe do that one first if these are out of order. For this one we have to set more than just the API key. I created a new OpenAI project for Gradio, so I'll make another one here called streamlit; I'm not sure how many projects you can have, but I'll make another.
Now I'll create the key, call it stream, scope it to the streamlit project, copy it, go back over to the .env, and paste it in, wrapping the value in quotes. I want to make sure I bring in my .gitignore, because I don't want to commit the .env; that's covered. I also need to set two other values: the project ID and the org ID. The reason I'm using OpenAI is that they have a free tier, at least right now; I'm not sure Claude does, which is why I'm not using it today even though Claude is really, really good. I'll go to the organization settings, grab the organization ID, and drop it into the .env, along with the project ID from the projects page. Now we're fully configured. That doesn't mean it will work, but the only other thing we need in requirements.txt is openai, so I'll add that and run pip install -r requirements.txt again. With that installed, let's just run it and then go look at the code: streamlit run chat.py. It started up no problem, so we'll go back to the browser (it should be the same URL), do a hard refresh, and give it a moment.
We have an error, which is fine; the code isn't exactly right. The client setup is wrong, so I'll go over to our Gradio example, where I know we have it configured correctly (they called the variable client there too, which is how I like to name it), and fix that. We also never loaded our .env file, so I'll grab those three lines from the Gradio example and paste them in. Refresh; still a problem: it says the OpenAI API key isn't set. The .env only gets loaded once, so we have to stop and restart the app rather than expect it to reload dynamically. It's still complaining, so let's double-check: we're in the streamlit directory, the key is definitely set... oh, the variable name is spelled wrong. That's my mistake, and the other name might be misspelled in the old file too. Fixed; let's try again. Do we get an interface now? There we go. So let's ask: "can you describe a scenic view of Japan?" If it works the first time, that would be great. It responded, but it didn't stream, which is totally fine, and I don't see any image generated, although that doesn't necessarily mean it didn't work; it could just take longer, and we don't really have responsive feedback here. It would take more time investigating Streamlit to figure out how to configure it for better feedback. Let me zoom out so I can read the code more easily, close the tabs we don't need, and pare the code down, because there's a lot of it: there's a clear button that clears everything, which is fine; column one has a heading; and I don't care about these extra interface bits, so let's take all that out to make it easier to look at.
I'll take out this header as well. So we have our message handler: add a message to the chat, generate a response with GPT-4; then it should go over to DALL·E, and the image response should come back, set the image, and rerun. But it never did that; the image never generated. So let's have Claude update the code with some debugging. Also, it's using gpt-4 even though I told it to use GPT-4o mini, which is annoying; maybe it thinks that model doesn't exist. I'll copy its new version (it brought back the things I just deleted, which is fine) and swap the model to gpt-4o-mini myself. There we go. Another thing: this version doesn't read message.content; when streaming, you read the content off choices[0].delta for each chunk, I think. Hold on, let me check against the Gradio example that we already have working. DALL·E 3 looks fine, GPT-4o mini is in; my edits might break it, but let's try. Oh, it already has an error about secrets.toml: "valid paths for secrets... no secrets found." I'm not sure why it's asking about secrets; I'm not that familiar with all the settings here, so let's search "secrets.toml streamlit." It's an optional file in your working directory; secrets can be defined both globally and per-project, with the project file taking precedence, and you can set your OpenAI API key and other values there and reference them in code. That's really nice.
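For reference, the mechanism it was complaining about works roughly like this: you put a .streamlit/secrets.toml file in the working directory and read values with st.secrets. The key name here is just illustrative.

```python
# .streamlit/secrets.toml would contain a line like:
#   OPENAI_API_KEY = "sk-..."
import streamlit as st

api_key = st.secrets["OPENAI_API_KEY"]  # raises an error if no secrets.toml exists
```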
But we already load the key another way, so I'm not really worried about using secrets; the only question is where the generated code references them. Oh, there it is; the code is still calling st.secrets, so we'll fix that line. It's probably the better way to use secrets, but I'm not going to do that here. Wait, did we not get rid of all the secrets references? No; get out of here, secrets, we used a more generic approach, and that's better. Okay, that was the problem, and we have to stop and restart again. The code also got more complicated than before; I don't want all of that, but fine, let's just ask: "describe a tropical drink." The chat works, but we still don't have any image generated. There are no errors per se; it's just not producing it.
So now here's the question: if I add a print("hello"), will it print anything? Stop, start again... "hello" does appear in the terminal, so that's a way we can debug it. And the chat still worked with our changes; I think the key point is that when it's a stream, you read delta content from each chunk, which is exactly why we had issues back when we were doing Gradio (that might be in a different video). What I'm wondering is whether it ever reaches the image step, so let's print "generating image"... oh look, it's conditional: you have to say certain trigger words or it won't generate the image at all. No, no, no; I want it to happen every single time, so that's why it wasn't working. Let's change that condition, go back, and ask: "describe a tropical drink, just a single-sentence description." Now it's generating an image, but it's rendering it inline in the chat, which is probably something the chat interface component can do; and it did also send it over to the image column. That's perfect, then; that's fine. It was a very different experience than I expected, and we'd have to read more about that component to understand it. So where is the relevant code? Here it is: st.chat_message.
The way it works is: we have st.chat_input for the input box, then st.chat_message("user") for the user's message and st.chat_message("assistant") for the reply, and it writes each one out. I'd summarize it this way: Gradio's components are easier to understand, but Streamlit is a lot more robust, and if you invest the time you can build much more complex apps with it than with Gradio. I don't want to go too deep here, though, because it can get really confusing really quickly.
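To make the moving parts concrete, here is a pared-down sketch of the kind of app we ended up with. This is not the exact code Claude generated; the column and session-state names are mine, and it assumes OPENAI_API_KEY lives in a .env file next to the script.

```python
# chat.py -- run with: streamlit run chat.py
import streamlit as st
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()  # reads OPENAI_API_KEY from the environment

if "messages" not in st.session_state:
    st.session_state.messages = []

col_image, col_chat = st.columns(2)

with col_chat:
    # replay the conversation so far
    for m in st.session_state.messages:
        with st.chat_message(m["role"]):
            st.write(m["content"])

    if prompt := st.chat_input("Say something"):
        st.session_state.messages.append({"role": "user", "content": prompt})
        with st.chat_message("user"):
            st.write(prompt)

        with st.chat_message("assistant"):
            stream = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=st.session_state.messages,
                stream=True,  # chunks arrive as choices[0].delta.content
            )
            # st.write_stream renders the stream and returns the full text
            reply = st.write_stream(stream)
        st.session_state.messages.append({"role": "assistant", "content": reply})

        # always generate an image from the assistant's reply with DALL-E 3
        image = client.images.generate(model="dall-e-3", prompt=reply[:1000])
        st.session_state.image_url = image.data[0].url

with col_image:
    if "image_url" in st.session_state:
        st.image(st.session_state.image_url)
```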
So we'll wrap this up, since we've produced something very similar to what we wanted. We did sign up for Streamlit's cloud. I don't really care to push to it, but since we signed up, maybe we should at least see how. I'll sign in and hit "create an app": it deploys from a repo, so you'd have a GitHub repo, sync it, and it would deploy from there. That makes sense, but I'm not going to do it. I'll just sync my changes and do a bit of cleanup: back in the OpenAI dashboard I'll delete that key, and I have a couple of junk projects there, so I'll archive the gradio one and archive streamlit too. There we go, and I will see you in the next one. Ciao!

Hey, this is Andrew Brown. Let's take a look at Lovable. Lovable is an AI-powered assistant, and what I like about it is that it's really good at building things end to end for you. Whereas something like Cursor or GitHub Copilot is designed to help you as you code, this is intended to build your entire application for you. So we can go in and describe something we want, maybe something like:
"build me a clone of Linktree for developers." We don't have to specify what it uses, but generally it will use Tailwind CSS, TypeScript, and shadcn/ui, since most of these tools are optimized around that stack. We could also attach an image if we had an example, but it really is as simple as writing that one line. Hit enter, and from here it just starts building it for us. What I really like is that it does its own corrections as it goes. It can also hook up to Supabase; I actually haven't done much with Supabase lately, but maybe we'll try connecting it once we have an interface. Right now I'm on the free tier. There is a paid tier (probably $20 a month), but we're going to get away with something very minimal here. As you can see, it's dumping out code, and I really like that it shows you what it's considering as its version one; I suppose you could tell it to version things. It should give us a live preview, so I'll pause and wait until it's fully generated. It literally took about ten seconds, and here we have our own Linktree; it's now implemented.
Something we might want to do next is sync this to GitHub. I'll sync it to my OmenKing account and make a new repo called code-connect-tree, and now it's there; I can open the codebase in VS Code if I want to explore it (on GitHub, just hit the period key to open github.dev). It's bringing in every single shadcn/ui component. If you don't know shadcn/ui, it's a set of solid prebuilt components you can just drop into your code, built on top of Radix primitives, and the idea is that Lovable composes the UI out of them. It drops all of them in regardless of whether you use them, but anyway. Back in Lovable we have our interface, and the next question is: how would I hook this up to Supabase? I've actually never done that before. There's a button that says "connect to add authentication, store data, and call third-party APIs," so I'm interested in what we can do with that. Let's connect Supabase. I don't think I even have a current Supabase account, so I'll continue with GitHub and authorize it. If you don't know Supabase, it's an opinionated serverless backend. We'll accept the defaults, scroll down, and create our organization: call it exampro, educational use, and stick with the free tier. I know we can hook this up; the question is whether Lovable can also write the code for it, and I believe it can, so I'm going to authorize Lovable, which means Lovable will actually be able to set up the backend.
Okay, Supabase is connected, so the next question is whether it can wire up the backend: "can you code the backend for this app to Supabase, please?" You probably don't have to say please, but it doesn't hurt. I'm curious how far it can go, because this is a beta feature, and I'm keeping the scope of the project really small to see how far we get. It says: click the Supabase menu in the top right, connect your project, and once connected it can help. It is connected, and I'm not sure how much we can do on the free tier, but we'll see. It looks like it's updating the existing files to integrate Supabase, and then it reports: "I've implemented the Supabase backend integration with GitHub authentication, profile management, and link storage. You'll need to create the database tables in the Supabase dashboard and configure GitHub OAuth for authentication to work." So I guess the question is: how do I do that? Let's head over to the Supabase dashboard. I thought it would have created a new project, but it didn't. Let's sync the code back to GitHub; actually, it may have already done that. Refreshing the repo, we have three new commits (connect component, projects, features) that just happened. Clicking in, it added a supabase.ts client file, so we'd have to fill in the project URL and a key there, plus some UI bits: a button, a toast.
It looks like it ripped through a bunch of files, and there's more here too. This file is what actually builds the Supabase queries; it depends on how the database is modeled. We have await supabase.from('profiles').select(...).eq(...).single(), a query-builder style, which is really clear. Then there's the OAuth sign-in, which is fine; nothing super complicated. So what table name is it expecting? It's expecting a table called profiles.
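The generated client code is TypeScript (supabase-js), but the query-builder shape is the same in Python with the supabase-py package. This is my illustration of the pattern, not code from the repo, and the URL, key, and UUID are placeholders.

```python
from supabase import create_client

# the URL and anon key come from the Supabase project's API settings
supabase = create_client("https://<project-ref>.supabase.co", "<anon-key>")

user_id = "00000000-0000-0000-0000-000000000000"  # the signed-in user's UUID

# roughly what the generated fetchProfile does: select a single row
# from "profiles" where id matches the signed-in user
profile = (
    supabase.table("profiles")
    .select("*")
    .eq("id", user_id)
    .single()
    .execute()
)
```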
So let's see if we can make that ourselves. Again, I don't have much experience here, but we'll create a new project under the exampro organization and call it code-connect-tree, matching the repo. By the way, everything you create here is public-facing initially, so understand that before you create anything. It wants a database password; for now I'll use Testing123456! (nothing complex, and I'm tearing this down before I publish this). I'll pick the Canada Central region, since that's where I am, and look at the other security options: the Data API versus connection strings only, the auto-generated HTTP API for Postgres, and so on. It really depends on what the generated code wants, and looking back at the code, it wants the anon key. I don't know what that is, so let's check the docs before going further: Supabase provides two default keys when you create a project, the anon key and the service_role key, and you can find them under the API settings. The anon key allows public access; fair enough, that makes sense. For the Data API we'll stick with the public schema, which lets it query all tables in that schema; that's fine, so we'll create the project. It's Postgres underneath, and I know Postgres pretty well, so I'm comfortable hitting create. Now we have our API keys. This light theme is a little hard on my eyes; I wonder if there's a dark mode. There's also an assistant built in, which is kind of cool; everyone has an assistant these days. The fonts are pretty small (they must be young folks over there), but here is the anon public key: "this key is safe to use in a browser if you have enabled row level security for your tables and configured policies." Now, we obviously can't commit this key to public code, and I wasn't quite ready to grab the keys yet anyway; what I really want is to download the codebase and start plugging these values in.
So I'll grab the repo and open it in a developer environment. I accidentally opened GitHub Codespaces, but let's use Gitpod today; again, use whatever you want, this is just an easy way for me. (I'm not sure whether Gitpod Classic is going away at some point; if it is, we'll have additional videos on how to run this locally.) While it starts up: Lovable also throws some extra code into the project that I don't personally like, but it needs it in order to generate things. If you look in index.html there's a GPT Engineer script, which is their module; in production you'd probably want to rip that out after you're done using Lovable. Gitpod seems to already know what to run even though I didn't create a .gitpod.yml; it's probably doing npm run dev. If you're doing this locally you'd run npm install first and then npm run dev yourself, so let's do that. It started on localhost:8080, so I'll open that port and preview the website. It shouldn't fully work yet, because we haven't configured the Supabase client file; we need to fill in those two values, the project URL and the anon key, and it would probably make sense to import them as environment variables. So let's ask Lovable: how would I import environment variables for these two?
I don't really want it to update the code, though it probably will; that's fine, we can just pull the changes if it does. Its answer: we're already connected to Supabase, the connection details are automatically managed by Lovable's Supabase integration, and you can simply use the built-in connection. Okay, but my question is: are those values safe to embed? Did it actually pull in my real values, or make something up? I'm not sure it's safe, but these anon keys are public-facing by design (they didn't hide them), so maybe it's okay. I'd still feel more comfortable passing them as environment variables, but that doesn't seem to be the approach here, so I'll just go with it for now. One second: discard changes, then pull, because the file changed... actually, maybe they hadn't changed anything; it was just showing me the proposed diff. That's fine. Again, I'm tearing all this down afterward, so for test purposes I suppose we can do this; I just want to see if their code works. If you're not comfortable doing this because you're building something real, that's totally fine, but I don't see much risk here; we're on two free tiers and I can't imagine much damage being done. So now we've hooked up the URL and the anon key. Back in the terminal, we probably don't even have to stop and start it. We have this configured, but we never enabled authentication, and we also never created the database tables. How is it going to work without either? Well, that's fine; let's just see what happens.
The app is running, so I'll try to log in with GitHub: "provider is not enabled." That's what I was thinking. I didn't necessarily want GitHub as the provider, but I did say the app was for developers, so it kind of makes sense that's what it chose. So where do we enable this? Maybe under hooks, maybe under integrations... let's just search "supabase github login." The docs say: to enable GitHub auth for your project you need to set up a GitHub OAuth application and add the credentials to your Supabase dashboard; that is, create and configure the OAuth app in GitHub, add the GitHub OAuth keys to your Supabase project, and add the login code to your Supabase JS client app. Okay, but where exactly do the keys go? I would have thought under project settings, in configurations or API, but it's not very clear. Wait, here are the full steps: go to Authentication, then Providers. So back in Supabase: Authentication, Providers.
They actually support quite a few providers, which is really good. GitHub is in there, and we want to enable it by filling in the credentials, but first we have to create the OAuth app on GitHub's side, and for that we need the callback URL. The callback is project-specific, so that's not a problem; the docs say "find your callback URL here," and there it is, so we copy that. Then we go to the OAuth apps page on GitHub (you'd need a GitHub account for this). You can see "seeswimmer," a really old app I made a long time ago; this new one will be called code-connect-tree. It doesn't have a proper homepage URL because it's not a real application yet, and I'm not sure this will work without a proper one, so for now I'll use the Gitpod URL, since it's public. Then we paste in that callback URL, which I still have open. There's also "allow OAuth apps to authorize via device flow"; I don't know what device flow is, but the docs say to keep it unchecked,
so we'll go ahead and register the application. Now we have a client ID and can generate a client secret, which is clearly what we need to copy. The client ID goes into Supabase, and generating a new secret requires me to authenticate with GitHub Mobile, since I have two-factor on. My phone is acting a little slow today... come on, phone... sometimes the request just doesn't show up, so I'll hit retry after it times out. There we go: enter the two digits from my phone, and GitHub Mobile lets us through. Now I've generated a secret; of course, do not share these secrets with anybody (I'm deleting mine afterward, so it's not a big deal). Paste it into Supabase and hit save. I really should be paying attention to the instructions in case there's more to fill in; not really, we just don't have a logo. "Enter your GitHub credentials into your Supabase dashboard": I believe that's what we just did. "Expand GitHub in the providers list, toggle it enabled, and hit save": done. If it's really that easy, that would be awesome. The last step is "add the login code to your client app; make sure you're using the right Supabase client with the following code," and the snippet is a sign-in-with-GitHub call. I would think that's what Lovable already did, but let's check the commit history to be sure. Yes, it added that code, and it matches the recommendation; it looks correct.
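The docs snippet is supabase-js in TypeScript; supabase-py exposes what I believe is the equivalent call, sign_in_with_oauth, so take this as an assumption-heavy sketch of the same step in Python.

```python
from supabase import create_client

supabase = create_client("https://<project-ref>.supabase.co", "<anon-key>")

# asks Supabase Auth for a GitHub authorization URL; the browser then
# completes the OAuth flow and returns to the callback URL we configured
res = supabase.auth.sign_in_with_oauth({"provider": "github"})
print(res.url)  # the URL to send the user to
```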
The generated version is slightly different; the button has a variant="destructive" on it, which is totally fine, I don't really care. There are also sign-out routes; I don't know if we need all of that, but let's just see if it works. Back in the app, hard refresh, click sign in with GitHub: it opened the correct authorization page (I didn't even mean to click, but it worked), and authorizing looks good so far... but then it tries to redirect back to localhost:3000, which is not where this is running. I'm not sure why it did that; sometimes you have to specify the redirect explicitly, and the settings show the right callback, but for some reason it sent us to localhost:3000. Is it hardcoded anywhere? I searched the code for "3000" and there's nothing that would redirect there, which is kind of annoying. So maybe we cheat: I'll grab that redirect URL, point it at the right host, take out the double slash, and hit enter. Now we're signed in; it actually worked. But it says "public links does not exist," because there's no data in the database, and this app doesn't even have a way to insert data.
We could tell Lovable to give us a way to add that stuff, but I don't really care; I just want to see how far we can get. So we need to add some tables. I think one of them was links, but let's check the code to be certain what it wants. In the components... actually it's more top-level than that: it goes index or main, then App, so main loads App, and inside App we have routes to the index page. In the pages folder, index is where the data is fetched: we have the auth, and it calls fetchProfile and fetchLinks. So there are clearly two tables, profiles and links; it does something like "get profiles by id with everything," and links has a profile_id and a created_at. It would be really nice to know the full schema, because it's not very clear from this, so I'll just ask Lovable: what is the table structure for links and profiles? It didn't tell us anywhere, so we'll have it spell that out. Alright, here's what it suggested; we have our table structure, so let's go back to Supabase and try to match it.
First up is the profiles table; I believe it's called profiles. (I really hate how my password manager keeps popping up over the form, and it always shows my birthday, so if anybody ever wants to get me something, you know my birthday now.) Creating the table, I see "enable row level security," with a note that policies are required: you need to create an access policy before you can query data from this table, and without a policy, queries return an empty array; you create policies after saving the table. Fair enough. So we create profiles with the columns username (text), full_name (text), avatar_url (text), and bio (text), and save it. Then the links table; the column editor is further down the page (I'm zoomed in, which explains why I couldn't see it). Here we have profile_id, which we want to link to the other table; I'm assuming a foreign-key link is how you do that. When I pick the reference, it offers the auth user's UUID column as well. I don't think it matters for us, because I don't believe we're querying by that, so I'll keep it simple. Then title (text), url (text), description (text), and tags, which the suggestion says is an array: text, defined as an array. There's the option once you look for it; pretty easy so far. Save the links table.
As the warning suggested, we probably need policies in order to query these tables, so let's create some. For links I'll start from the "enable read access for all users" template; it asks for a SQL expression in the USING clause (oh come on, I didn't want to have to write anything), but isn't the template itself the answer? There we go. Then one for profiles. Maybe that first one wasn't scoped just to anon; if I click this, it applies to everybody, I think. Let me go back... can I save that? There we go. Right now the exact policy scoping doesn't matter much for us; as it's set, with anon, you'd effectively have to log in to use the service anyway, so it's not really a proper Linktree, but anyway.
Now that we have these tables, let's go back to the app and see what we get: "invalid input syntax for type bigint." So it's probably an issue with the id column; I said it didn't matter, but it's complaining about bigint. I wonder if we can change the column type on the fly. Back in the table editor: it's kind of crazy that you can just switch things like this (in a good way, as long as you know what you're doing). In the original suggestion I think they were using UUID, and sure enough: "cannot change type bigint to uuid." So it really is bigint. I'm fine with regular integer IDs; I just have to figure out what value is actually being passed. "Invalid input syntax for bigint: <value>", and that value looks like a UUID, so the code is effectively hardcoded to that shape. Looking at the profile code isn't very useful, and I'm hunting through index... the thing is, when we signed in via GitHub, it returned a user ID, and that UUID is what gets queried. So we probably have to recreate these tables. I know this is a bit of a mess, but I think we have to delete the policy and the tables (that foreign-key relationship won't let me recreate the table otherwise) and just make them again. A little time-consuming, but not too bad. So we'll recreate the profiles table, and this time the primary key definitely has to be
a UUID; down in the column editor we set the id type to uuid. One moment, I just disabled Dashlane because it keeps popping up and that's really annoying. Then the columns again: username, full_name, avatar_url, bio, all text. I'll pull this off-screen so I can do it quickly: username, full_name, avatar_url, bio, and this time we made sure the primary key is a uuid. Save. Now we'll recreate the links table; it's saying its id is also a uuid, and if one is going to be uuid we might as well do both, no reason not to. Here we have profile_id, linked to the profiles table, specifically to its id; then title (text), url (text), and tags (text, defined as an array). Tags, text, url, title, profile_id; save. "Table links has been updated, but there were some errors." I'm not sure if that's good or bad. I notice this one has six columns where the last one had seven, and I'm not sure why, but looking at it,
the columns are still there, so I'm going to ignore that error for now. I'll create a new policy, select access scoped to anon, and save it; we will insert data, but we'll do it through the dashboard, so the app doesn't have to. (This autofill thing has to get out of the way again.) Policy saved. Again, I'm not sure what it thinks is wrong with the table; it looks fine to me. Back in the app, refresh: "JSON object requested, multiple (or no) rows returned." That's fair, because there isn't any data yet. We need to create a profile row, but right now I don't know the user's UUID, because signing in didn't insert a user row into our tables. Somewhere in here, when it logs in, it fetches that data, and there's a user ID in the session, so I'll just console.log the session user's ID; that way we can figure out the UUID for this particular user and insert it into the database. Refresh the app. Where do we see logs? This is client-side, so it will be in the browser dev tools; let me open those up (this is all a bit funky), and in there,
after a refresh, there it is: the UUID we need to insert for our profile. I'll copy that and go back to Supabase, over to our database. We want to insert data; the SQL Editor would work, but the Table Editor is simpler, so I'll click into the profiles table (this popup has got to get out of here, it's really annoying me) and insert a new row: the id is that UUID, created_at can keep its default, username Andrew Brown, full name Andrew Brown, no avatar right now, and for the bio, "I teach cloud." Save; now we have a profile record. We might still have some issues, though: hard refresh, and we still get "error fetching profile: JSON object requested, multiple (or no) rows returned." This could be a policy issue as well; we're not 100% certain. Looking at fetchProfile again: if there's an error it returns this, which basically means it got no data back, and that makes me think there is an issue with our policies, since the error specifically says "error fetching profile."
So back in Supabase I'll go to Policies under Database; I never really learned how these work, I just skimmed through it. I'll edit the read policy and change it from anon to all users to make this easier, and now the profile shows up. So clearly anon wasn't the role that was actually accessing it. Again, I'm not an expert in Supabase; I'm just trying to get it to work today, even if it's not fully proper. This isn't a lesson in Supabase; it's just so we can complete both the back end and the front end. Now let's insert some links. Despite that error still appearing, I'll go to the Table Editor, open links, and insert a new row: the id can default to whatever it wants, select our profile for profile_id, title ExamPro, url https://www.exampro.co, and tags is a list, maybe cloud, aws, gen-ai as examples. Save: "not a valid array." Well, you didn't specify what the array is supposed to look like! What if I format it as an array literal? There we go, that saved; the editor just isn't super friendly about it.
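I did these inserts through the Table Editor, but for reference, the same thing done programmatically with supabase-py would look roughly like this. The column names are the ones we just created; the key and UUID are placeholders, and a service-role key may be needed if row level security blocks anonymous inserts.

```python
from supabase import create_client

supabase = create_client("https://<project-ref>.supabase.co", "<service-role-key>")

user_id = "00000000-0000-0000-0000-000000000000"  # the UUID from the console log

supabase.table("profiles").insert({
    "id": user_id,
    "username": "Andrew Brown",
    "full_name": "Andrew Brown",
    "bio": "I teach cloud",
}).execute()

supabase.table("links").insert({
    "profile_id": user_id,
    "title": "ExamPro",
    "url": "https://www.exampro.co",
    "tags": ["cloud", "aws", "gen-ai"],  # a Python list maps to the text[] column
}).execute()
```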
Back in the app, hard refresh, and now we have our links rendering. So we've built Linktree in its most basic form. We could do a little better than this, but for thirty or forty minutes, that's pretty darn good. Now let's tear everything down, since I'm completely done. I want to get rid of the Supabase project; I'm on the free tier, so I'm not worried about spend, but in case there are any keys that shouldn't be out there, I'll type the project name to confirm and delete the project. Good. I don't need the GitHub OAuth app anymore either, so back in developer settings: advanced, delete, delete. That app is gone. I can commit the code changes; it doesn't really matter, but if you wanted to run this yourself, you could, so: commit, sync. This is actually a private repo, I believe; let me check... yes, it's private. I could share it publicly, but if you're doing this for real, I think you should generate it yourself in Lovable and try to replicate it, so in this case I won't provide the code; I think that's better for your learning experience. But there you go: that's Lovable. Ciao!

Hey, this is Andrew Brown. In this video we're going to take a look at FastHTML, which is for building modern applications in pure Python. I believe it handles both the front end and the back end, whereas something like FastAPI is only the back end (I don't think the two projects are related). This could be considered similar to something like Gradio or Streamlit,
though a bit more polished. Anyway, we'll go take a look and see what we can learn. I'll open this up in a new tab, make my way over to our GitHub repository for GenAI Essentials, and open a new environment in GitHub Codespaces. Looks like I have one from before, which I'll delete, as we're starting from scratch. Sometimes I use Gitpod, sometimes Codespaces, sometimes local developer environments; I'm all over the place because I want you to be comfortable anywhere you're working, so you should learn a little bit of everything. While that environment is loading, let's look at FastHTML. FastHTML is "a new way to create modern interactive web apps": it scales down to a six-line Python app and scales up to complex production applications with auth, databases, caching, styling, and more. So this sounds like a full-on framework, whereas the other ones are a bit lighter, but it also sounds like you can start with a minimal amount of code to get going. Let's go over to the components... actually, the docs are probably better. Okay, it looks like we pip install python-fasthtml, and here's a minimal app we can get started with, so let's give that a go. I want a theme that's easier on the eyes, so I'll switch to a dark theme. I don't think it took, so I'll try once more; it's a little unresponsive, but I think it's just thinking... there we go, that's better. We'll bump up the font, and I'll make a new folder called fast-html. I know a lot of folks like FastHTML, but again, I just don't have tons of experience with it yet. For me, all frameworks are basically the same at this point; I don't find one better than another, but exploring them is good, because then you can choose what works best for you. Anyway, here's the minimal example; I'll copy it, go back over, make a new file called main.py, and paste in the example.
Let's take a look at what we have so far. And yeah, we'll install the Python extension, why not. We're importing from fasthtml.common, and we're initializing the app with fast_app(), which gives us app and rt; I'm not sure what rt is for as of yet. Then we have a get route, and inside it a "Hello World!" You can see it renders a div, and the div has an hx_get pointing at /change, so I'm assuming that wires up some HTMX behavior; I'm not actually sure yet. The point is: will this work? What I'm confused about is where serve comes from. Maybe the star import; that's probably the case. I'll cd into fast-html and do a pip install -r requirements.txt, and for that to work I should probably type it correctly, so let me autocomplete it. That installs everything. Let's see if we can get this running: python main.py. Give it a moment to start... it started on port 5001, and if we open that in the browser, we have ourselves a hello world. A very basic start; it feels kind of like Ruby's Sinatra, to be honest. Scrolling down the docs: "running the app will print out the following," and then adding interactivity, surprisingly easily, thanks to HTMX.
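For reference, here's roughly what the file looks like by the end of this step: the minimal app from the docs plus the /change route they have you add next. Treat this as a sketch of the same idea rather than the exact file on screen:

from fasthtml.common import *  # the star import brings in fast_app, serve, Div, P, etc.

app, rt = fast_app()  # rt turns out to be the route decorator

@rt('/')
def get():
    # hx_get tells HTMX to fetch /change and swap the response in when clicked
    return Div(P('Hello World!'), hx_get='/change')

@rt('/change')
def get():
    # the single-line function from the docs, spread onto its own lines
    return P('Nice to be here!')

serve()  # also comes in via the star import; defaults to port 5001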
Okay, so: modify the file to add this function. Let's give that a try... oh, it's literally a single line. It's weird to have it as a single line; I don't feel comfortable with that, so I'll spread it onto its own lines. We'll go back over, give it a hard refresh, and all I did was click it: I click the text and it triggers an action. That's kind of interesting. You now have a clickable element, and the page changes when the text is clicked. Okay, next: "getting help from an AI." This is where it gets interesting, because we'll want an AI to help write this code. I haven't seen this much elsewhere, but I think it's a cool idea: there's a tool out there that your LLM knows nothing about, so the maintainers wrote a big LLM-friendly knowledge document you can feed to a model. The docs say it directly: Cursor, ChatGPT, Claude, and Copilot won't give useful answers about FastHTML out of the box, and to fix that problem they provide an LLM-friendly guide that teaches how to use it. I suppose if we're using Copilot we could just drop that document in, but it's hard to say. Because the document is so large, I feel like you have to use something like Claude Sonnet with Projects to do this, and I found earlier that I wasn't able to attach my company credit card to Claude, so I think what I'll do is resubscribe with my own card. I'll show you what I mean: I downloaded the document before and renamed it to prompt.txt, and if I attempt to upload that file on the current plan, it says the conversation is 25% over the length limit, so in this version I can't use it. This is with Claude 3.5 Sonnet, so the only way to do this is to upgrade my plan. Give me a moment; I'm going to upgrade, even though it's really expensive for me. All right, I've resubscribed to Claude. I hope it's worth it. What they're talking about is that you can create Projects:
on the left-hand side I can create a new project. I want this one specifically for writing FastHTML, so I'll call it "FastHTML Code Writer," with the description "write FastHTML code." I'll show a version of this with Claude and then we'll do ChatGPT; if you don't have Claude, just watch for now, and then we'll see what the experience is like with ChatGPT. We'll create the project, and I want to bring in that content, so I'll upload the prompt.txt file. Now it should have context on what's going on. It says 23% of knowledge capacity is used, so obviously the free tier couldn't handle it. Then we'll say: "read the project knowledge, read the prompt.txt document, and confirm you understand its contents." Hopefully it doesn't write tons back, because that slows things down, but the idea is that I want it to know about the prompt document first; sometimes you have to do this with these tools. Okay, it says it now understands the document, so the next step is to have it generate some code. I want it to use FastHTML to write a chat interface for OpenAI. (I am subscribed to Claude, but I'd have to put credits on the API side to actually call Claude as the backing model, and my company card isn't working there, so I can't get into the Workbench without some extra setup.) So I'll prompt: "I want to use FastHTML to write a chat interface for OpenAI GPT-4o mini. Can you give me the code for that?" Let's see how it does. All right, that was very quick; I didn't even need to pause. So we have some code. Is it any good? I'm not sure. We'll copy the contents, head back into GitHub Codespaces, make a new file called chat.py, paste it in, and take a look at what we have, starting from the top.
We're importing from fasthtml.common, we're importing openai (which is correct), and we're importing os. We're building our app, and it's telling us to set the OpenAI API key directly in the code. That's one way you can do it, but it's not the way I like: I would rather create a .env file, along with a .env.example, plus a new .gitignore that ignores the .env. We've done this a couple of times now, with Streamlit and the others. I'll grab my example values and paste them in, and we'll put the real key in place in a moment, after we've audited the code. One thing we did in the Streamlit and Gradio videos is load the key from a .env file, so we'll do that here too. We'll need to update requirements.txt (we're sharing it with the other examples), and the package is called python-dotenv. I think it's silly that they have to put "python" in front of the name, but that's just how it is. I think we'll also need openai in there, so I'll add that as well. One thing I do a little differently is how we initialize the OpenAI client; this is how I prefer to initialize it, so I'll swap that in. What's interesting is that I don't see the client being called anywhere yet.
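To make that concrete, here's the setup pattern I keep reusing across the Streamlit, Gradio, and now FastHTML examples. A minimal sketch, assuming your key lives in a .env file as OPENAI_API_KEY:

import os

from dotenv import load_dotenv  # the package installs as python-dotenv
from openai import OpenAI

load_dotenv()  # pulls OPENAI_API_KEY from the .env file into the environment
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# the current API style: client.chat.completions.create,
# not the old module-level openai.ChatCompletion.create
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in Japanese."}],
    max_tokens=500,  # optional; omit it and the model uses its default cap
)
print(response.choices[0].message.content)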
So I'm carefully looking through the rest, and I see the OpenAI usage is based on old information; it doesn't look current at all. I tell it: "the OpenAI API you're using is old; that's not what it looks like now." Let's see if it can figure out what's wrong. It's definitely incorrect. Further down, it's also not using GPT-4o mini; maybe it doesn't know about that model, so let's tell it. It's interesting that I'm paying for this and it's wrong. A bit frustrating. Now it's saying gpt-4, which is still not right, but that's okay. I'll copy just this part of its output; I don't feel like replacing the top of the file and causing problems there. Back in our file, it's still not producing the latest API code (frustrating), so we'll paste it in as-is, which is fine because we have our own correct client setup over on the right-hand side. We'll copy this part in. I'm not sure why it has chat histories now, but it seems to have made some revisions; the styles have also been greatly reduced. Not sure why that changed, but whatever. Further down, the API calls look a little better, but we want gpt-4o-mini, which is over here, so I'll copy that across. I'm not sure if it needs max_tokens; the other version has it, so I'll save myself trouble and bring it over too. Maybe there's a default, I'm not sure. And I really, really hate four-space indentation, but we'll live with it in this video. This version wraps the call in a try block and the other didn't; it should be no surprise that ChatGPT writes better OpenAI code, as you'd expect. But the question is: will it work? Here we have divs for the user message and the assistant message, each with a cls attribute. cls is just how FastHTML sets the HTML class, since "class" is a reserved word in Python. The other attributes are all HTMX; looking one up, hx-swap-oob lets you specify that some content in a response should be swapped into the DOM somewhere else, out of band. So it'd be really interesting to see if this works.
To me, though, the code just doesn't look very good, so I'm not confident about it, but that's okay. We'll make our way over to OpenAI, to the login... nope, not that one, I want the API platform login. From here I need to generate a new key. I actually already have a project called FastHTML; I must have created it earlier. If you don't have one, just create one, it's not a big deal. Once we have our project, we go to manage projects (show me projects, here we go); I've already created this one before. If you don't have an OpenAI account, go create one; they have a free tier, and if the free tier is gone, you'll just have to watch along, but we'll learn as we go. I'll paste the project name in, then go over to General, grab my organization ID, and paste that in as well. Then I need my OpenAI API key. I've already created one previously; as you can see, I made an attempt before (I think I forgot to unpause the video). I'll call the new key fasthtml, give it all permissions, and there's our key. We'll bring it back over, paste it into our .env, and save. I'll scroll up, go over to the terminal, stop the server, and do another pip install of our requirements. So this is all set up; whether it works is a different story. Run it... I thought maybe FastHTML would have some kind of extra magic on the front end, but it's just using HTMX. It shows "hello world," and I think that's because I ran the hello-world app instead of the chat one, so let's change the command over to the chat file.
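As a reference point while we test it, here's a minimal sketch of the shape a FastHTML chat app takes. The route names and layout here are my own simplification for illustration, not Claude's actual output, and it assumes the client setup from earlier:

from fasthtml.common import *
from openai import OpenAI

app, rt = fast_app()
client = OpenAI()  # assumes OPENAI_API_KEY is already in the environment

@rt('/chat')
def get():
    # a message area plus a form that posts to /send and appends each reply
    return Titled("Chat",
        Div(id="messages"),
        Form(Input(name="prompt", placeholder="Ask something..."),
             Button("Send"),
             hx_post="/send", hx_target="#messages", hx_swap="beforeend"))

@rt('/send')
def post(prompt: str):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}])
    return Div(P(f"You: {prompt}"),
               P(resp.choices[0].message.content))

serve()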
Let's see if this loads... nope, we have an issue: "did you mean openai?" Looking at our code, I think our import is incorrect, so I'll take the import from our previous Streamlit example and paste that in. Now we have no errors, and it's serving on port 5001. It's a bit odd how this text field renders, but we'll go down and type: "hello, can you teach me the top 10 Japanese verbs?" We hit send... and I see nothing happening. Going down below, it looks like it came back. Okay, I was just impatient, and it actually did work. I guess this is fine; for some reason I thought I could keep typing into the response area, but I guess not. "Of course, here are the most used verbs along with their meanings and examples." One thing I might want to do, depending on whether the response actually comes back as markdown, is render it as markdown. Let me go back and ask: "is there a way we can format the messages as markdown? Just show me the code changes." Let's see if it can do that; it might be interesting. (Yeah, that price point is hard.) Okay, so it's importing markdown; that's the first thing we'll bring in. The next thing is down here: the format-the-message part.
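The idea it's going for is roughly a helper like this: run the assistant's reply through the markdown package and tell FastHTML not to escape the resulting HTML. The helper name and class are mine, not necessarily what Claude produced:

import markdown
from fasthtml.common import Div, NotStr

def format_message(content: str):
    # markdown.markdown() returns an HTML string; NotStr marks it as raw
    # HTML so FastHTML renders it instead of escaping it
    return Div(NotStr(markdown.markdown(content)), cls="assistant-message")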
I'll copy this in, and we'll take a look carefully. What I'm looking for is: what are the actual changes? It didn't highlight them like I asked, so I'm just comparing carefully against what we had before. There's clearly more styling, so I'll grab that. The next thing is this post handler; let me look through it. The session handling looks the same; this bit is obviously wrong; but honestly the only changes I see are styles: message, pre, message code, user message. There must be something different here... format_message. Okay, this is a bit different, so we'll copy it over. Back at our user interface, we'll give it a refresh. I'm not sure it will persist the previous messages, but we'll hard refresh and see. "No module named markdown": fair enough. We'll stop the server and install markdown. I'm assuming the package is just called markdown, hopefully (it didn't tell us which one), so we'll add it and do pip install -r requirements.txt, then run again. "Can you teach me the top 10 Japanese verbs around cooking?" How about that. We'll give it a go, and this time we just have to be a bit patient. When I tried this the other day with ChatGPT it actually attempted streaming, which was very impressive, though I never got to run that code. So, patiently waiting... it seems like there's probably a problem; I don't think it should take this long. Yeah, we have an error: it returned an error for the chat ID. Maybe we changed some of our code? I didn't change anything that should affect line 52, and there is a session in play, so I think this might just be an issue with running it a second time, not necessarily anything wrong with our code. I'm trying to decide how to fix it. Well, maybe Claude can just fix it for us. I was thinking I'd have to clear my cookies, which I really don't want to do, but the session is probably stored cookie-side. Let's see what it suggests... okay, it just changed things a little up top, initializing the session before it's used. Now, is that the same as what we have? I'll copy it one more time (I don't trust my copy job) and paste it in. Edits saved; hard refresh. "Can you teach me the top 10 Japanese verbs for gardening?" Let's see what we get... there we go, it came back, and it's formatted. We get verbs like "sentei suru" (that's a fun one to say) and "mizu o yaru," which is an easy one. So this is working.
Now the question is what this experience would be like with ChatGPT. I think it's worth exploring, just because they gave us that document. So I'll go over to ChatGPT, and back in our workspace I'll rename the Claude file, since this next one will be the ChatGPT example (we're still using OpenAI as the model provider either way). I have a paid version of ChatGPT, and there's o1, which has more advanced reasoning, but I can't attach files to it: with o1, search and attachments are unavailable. When I try to upload, it only takes images. If I switch to GPT-4o, can it take a document? It can, okay. So GPT-4o can take it. o1 would be a lot better because of the reasoning, but if I want to get that document in, this might be the best way. "Read the prompt.txt so you understand how to work with FastHTML." I almost wonder whether, since some of these models can go out to the internet, it even needs the upload; it could just fetch the doc itself. But anyway, let's use basically the same prompt as before. We'll give it a moment... it's definitely different: it's using websockets, which is actually really cool. This is what I mentioned before; when I tried it previously it suggested websockets too, so I like that. But where's the GPT-4o mini part? There's no GPT-4o mini code. Where's the OpenAI code to actually call the model? Again, I think o1 would do a lot better, and since we can't upload a file to it, we could dump the file's contents into o1. I've done that before, though I don't like doing it. And it looks like 4o is already falling back to the old API; yep, so I'm not really impressed. Let me switch models: I'll start a new chat with the advanced reasoning model, because this isn't performing how I want. I can't attach the file, but I can paste its contents, so I'll open the file (give me one second), copy it, and say: "I'm going to provide you a body of text so you understand how to work with FastHTML. When you have understood it, confirm that you understand it in a single sentence." If we don't add that, it's going to output a lot. I've pasted it, basically dumping the whole thing in. I don't know if this will work, but we'll see... I feel like it's going to dump a lot back. Oh, it didn't, excellent. It confirms: it now understands how to utilize
FastHTML's decorator-based routing. Okay, great. Now I'll ask: "I want a FastHTML chat interface; it should render markdown. Can you give me the code for that?" I have more faith in o1 doing a better job, but we'll see. All right, it finished generating some code; let's take a look. The OpenAI code looks fine at a glance (it's not streaming per se), but it's using the old API. What the heck. Okay: "use GPT-4o mini and use the new OpenAI API." I don't know why it keeps doing that; I guess its information is just a bit old. We'll give it a moment to update... what is it doing? No, no, no. It's not doing a good job here, but we'll try again. It's interesting, because the first time I tried this the other day I actually got good results, and now I'm getting kind of random stuff. We'll see if we can get something better, watching as it codes... no. We could keep playing with it, but clearly it's not doing what we ask, and while we could tweak it into working (we already saw how with Claude), both Claude and ChatGPT-4o are failing at this task. And it's not the FastHTML part they fail on; it's the OpenAI part. So I almost feel like if I provided an example of correct code, it would do better. I didn't want to do that today... well, let's just try it, because the whole point is to see if we can get this working. I'll grab the Streamlit code, which is small, copy it, and say: "still wrong. Here is an example of using the correct model and OpenAI API with Streamlit; can you learn from that and fix the FastHTML code?" We'll paste that in. I almost think AI21 might be good at this, since they have very large context windows and might do a better job, but I'm not sure. I keep calling it "a121"; it's AI21. I wonder if it's supposed to sound like Area 21, or Forever 21, not sure. Anyway, we'll give it a moment and see: with a comparable example, maybe it can do it. Eh, this one actually has a two-column chat layout, which is kind of what I'd actually want. Something we hadn't tried before is, "given this working example, can you replicate it in this other framework?" I'm not sure why it's structuring it that way, but fine, I suppose it's instantiating things, which is fine. I don't know why it's mocking the class, though; there's no reason to mock it. And I don't know why it's
writing so much documentation; that's really annoying. But it's interesting that it gets closer to the solution when given an existing example; I just hadn't thought to do that before. All right, let's grab this. There's a lot going on, but maybe we can make it work, so we'll go back to our FastHTML files and paste it in. The only thing that doesn't make sense is the OpenAI client definition up top: it doesn't believe the new client interface actually exists. It already made a mistake, and this is something we learn about these models: they double down. (I'm going to install the Vim extension. Don't install this yourself; it's just for me, since I use Vim keybindings to move around easily.) The model is wrong, but it thinks it isn't, so it labels the client a "mock example." It's almost mocking us, as if to say, "are you sure this thing actually exists?" Then it writes chat.Completions with a capital C before create, which is really annoying. We'll go down here and delete that part; this looks correct, that's fine; we'll bring this over a little; this is looking like what we want. All the API calls look correct at a glance. Okay, let's see if it works. I run the ChatGPT version of the file and it says "no module found." Did I name something wrong? Hold on... maybe it's something to do with how I named the file, but we'll ignore that for now. I'll go to port 5000, which should still be over here... okay, maybe not, no ports are open: "Error loading ASGI app." I'm not sure what that is, so just to rule it out I'll rename the file to something simpler, in case the naming is throwing it off (I'm not sure why it would), and run python chat.py... so that clearly isn't the problem; I'll rename it back. "Error loading ASGI app" again, so there's clearly something it added in here that the loader doesn't like. Maybe it's how it launches the app at the bottom. Comparing with our Claude version: that one just calls serve() at the bottom, so I'll change this over to serve() as well. Is there actually an app object called app in here? There is, and the other file has that too and just calls serve(), so serve() probably infers the app automatically. I'll try this again. I'm just guessing; I don't know this framework.
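That would explain the ASGI error. A sketch of the difference; the uvicorn line is my guess at the kind of entry point that got generated, and its string module path has to match the actual file name, which is why renaming the file changed the behavior:

import uvicorn
from fasthtml.common import *

app, rt = fast_app()

if __name__ == "__main__":
    # fragile: "chat:app" must match the module (file) name, so renaming
    # chat.py breaks this with "Error loading ASGI app"
    uvicorn.run("chat:app", host="0.0.0.0", port=5001)

# the FastHTML way: serve() resolves the calling module's app by itself
# serve()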
But notice we don't have a problem now, so clearly my guess was good. Going back over, it's loading. We have some style issues, but that's okay. The question is: does it work? "Can you describe, in one sentence, a landscape in northwestern Ontario?" I'm not expecting this to work, but if it does, that'd be really awesome. We'll be patient while it thinks... checking the logs, it looks like the page got a 100... and we never got a response. Oh, there it is, a 200, and then a 303 from the POST. What the heck is a 303? I've never seen one before: it's a redirection status code, where the server wants the client to use a different URL than the requested resource. Okay, so it did not work. We'll call this done; that was a little disappointing. I thought it would do better than that, but we get what we get. I'll add all changes. And I don't think we committed earlier, so let me make sure I didn't commit the .env by accident. I don't think so; we're fine, and we'll push. But there you go: that's what it's like trying to port code from one framework to another. There's probably a way to train a model specifically to port from one code base to another, but I feel it would have to be expert in both and have good examples. Now let's clean up: I'll get rid of our FastHTML API key, go to manage projects, and delete that OpenAI project. Of course I'll keep my Claude subscription around, since I'm sure we'll be using it in other videos; I'll just archive that chat. And that's FastHTML, as best as we can do. One more thing I keep forgetting, but I'll do it in this video so you know how to do it
yourself: back in this repository, I'm going to make sure I destroy the GitHub Codespaces environment (I keep forgetting to do that), so we'll hit delete. And I'll see you in the next one. Ciao!

Let's talk about sandboxing. Sandboxing is when we put a piece of our workload in its own isolated environment, such as a container, and it's something we'll probably want to do with LLMs quite a bit, because LLMs have the ability to exhaust all your resources: they use a lot of memory, and as a conversation continues, the memory footprint grows; they can hang; they can crash. This isn't just for LLMs; it applies to other AI models as well. So we find that containers, whether on Kubernetes, in Docker, or whatever container format you use, are very useful for isolating these parts of your code from everything else. Maybe smaller startups can get away with everything in a monolith, but containers come into play quite quickly if you're building something that doesn't use managed services.
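As a sketch of what that isolation can look like in practice, here's the docker Python SDK with hard resource caps. The image, limits, and port mapping below are placeholder assumptions, not a recommendation (a real TGI container also needs a model-id argument and a volume for the weights):

import docker  # pip install docker

client = docker.from_env()

# run the inference server in its own container with hard resource caps,
# so a runaway model can't take the rest of the host down with it
container = client.containers.run(
    "ghcr.io/huggingface/text-generation-inference:latest",  # example image
    detach=True,
    mem_limit="8g",           # cap memory before it exhausts the host
    nano_cpus=4 * 10**9,      # cap at 4 CPUs (units of 1e-9 CPUs)
    ports={"80/tcp": 8080},   # expose the server on localhost:8080
)
print(container.short_id)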
All right, let's take a look at OPEA, which stands for Open Platform for Enterprise AI. It's a collection of open-source Linux Foundation projects that provide blueprints for deploying AI workloads using containers. You can find the projects on github.com under opea-project, and the ones we're talking about are GenAIExamples, GenAIComps, GenAIEval, and GenAIStudio. There are more, but the top two I want to focus on are GenAIExamples and GenAIComps, because they're the most mature and, I find, the most useful; as the others mature, they'll be worth looking at too. With OPEA, the projects are unopinionated templates that let you deploy in various formats (Docker Compose, Kubernetes) and onto various hardware, like Intel, AMD, and NVIDIA. Let's talk about GenAIExamples first, then GenAIComps. GenAIExamples is a collection of megaservices for specific AI workloads, and from the names you get an idea of what those workloads are: if you're building code generation, there's the CodeGen example; if you're doing translation, there's one for that; likewise video Q&A. The idea is that these complex workloads are already set up, so if you're trying to build a chatbot with RAG, you can use this. Let me get my pen tool out... here it is, okay. The idea is that you deploy this as a set of containers: if you're serving a model, you use the TGI service, but other pieces sit in front of the TGI service. Before anything reaches it, you might do reranking, retrieval, and embedding, and those are all separate containers, some of them running models of their own. If you're doing embeddings, an embedding model runs there, and obviously the TGI service down here runs the LLM. You might need a vector store, which is another thing to provision, plus a front end and a way for the front end and back end to talk to each other, which is the gateway. So there are a lot of moving parts, but it's a vanilla template: you can go in and tweak it to your needs,
and it comes in multiple configurations based on your use case. All of these components are generally built from GenAIComps, the microservices, which we'll talk about next (I do see the spelling mistake on the slide; I apologize for that). GenAIComps are microservices you can use as the building blocks for your own AI workloads. Each microservice can be configured to work in various ways with various technologies, and here's a bunch of the microservices they offer. For example, on the left-hand side I've highlighted llms: this is the piece you put in front of TGI, or vLLM, or whatever is serving the model. You'd say, "okay, I'm doing text generation, so I'll go into that folder, and I want to serve it via TGI," and inside you'll find a Docker container. The point is they have this for all the different variants, and all of the microservices are organized this way. The examples are similar: they have these steps, but broken down by where you want to deploy and what hardware you want to use. Hopefully you see the value in that, but we'll do a hands-on lab so we know for certain.

Hey everyone, this is Andrew Brown. In this video I want to take a look at utilizing OPEA. I'm logged into my AWS account, and I figured we can deploy something on an EC2 instance. Now, we could do this locally,
but I just think that, generally, if you're going to use OPEA you're going to be putting it into production, so I want to put it somewhere we'd actually deploy it. We could probably also do this in Gitpod, but we'll do it here. In the GenAI Essentials repo I don't have the code for this yet, but I built it for another workshop, so I'll copy that link and we can bring it over and run it here. That repo is the GenAI Training Day workshops, and in my intermediate workshop I have one specifically for OPEA. What we're after is the opea.sh file, which gets us mostly set up; there's also a CloudFormation template in there that can launch it, but I think we'll set this up manually. I'll grab it, make a new directory called opea with a folder called basic inside, and create the opea.sh file, and we'll walk through it step by step. What I want for this is a new Ubuntu instance, I think. Well, let me think for a second, because on this instance we can run whatever model we want, and I want to run Llama 3.2 1B Instruct. For that we need about four gigabytes of RAM. We know that because, when we talked with Rola elsewhere in the course, she explained that 1 billion parameters comes out to about four gigabytes if you're using 32-bit float precision, which is what this model could be using.
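That back-of-the-envelope math is worth writing down (weights only, ignoring activations and the KV cache):

# rough memory estimate for model weights
params = 1_000_000_000  # a 1B-parameter model
bytes_per_param = 4     # FP32; 2 for FP16/BF16, 1 for int8
gb = params * bytes_per_param / 1e9
print(f"~{gb:.0f} GB of RAM just for the weights")  # ~4 GB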
So four gigabytes of RAM is most likely what we'll need. I want to deploy this to a Xeon processor, I think, so I'll search for EC2 Xeon processors. The reason: if we look up OPEA on GitHub, go into GenAIExamples, and from there into ChatQnA, which is what we're deploying today, it has different deployment configurations. We're using Docker Compose today, but notice it asks where you want to deploy: on GPUs (AMD ROCm or NVIDIA), or on CPUs, and under Intel there's Gaudi, Xeon, and AIPC. I have an AI PC, by the way; we could deploy to that, but we'll go with Xeon. Now, Xeons are enterprise-level processors, but if you ever wondered what it would cost to buy one yourself: let me look up Dell AI workstations. My company just bought me a new computer (I was trying to convince Boo to let me get a Dell AI PC and he just said, "no, you don't need one"), but I thought it'd be interesting to show, because these actually come with Xeon processors; I think it's the Dell Precision AI line. Let's see if I can find it: workloads, desktops, Precision fixed workstations. Dells are really good computers to buy. They're a bit expensive, but if you buy their extended warranties they're really, really awesome. There are different ones here, but I specifically want the AI workstations, so go to all workstations for a second... here we go, and we scroll down to the workstations for AI. If you ever wondered whether you could buy yourself a Xeon, you absolutely could. There are these tower workstations, all sorts of kinds; the Tower 7875 shows $4,784, probably mostly because of the GPU. Going into one, customize and buy... this one's using AMD, which isn't what I want, so I'll go back, since we want one with Intel specifically. Maybe it's this tower, or this one... "this product is no longer available," so it'll be the Dell 7960. Scrolling down, now we see an Intel Xeon processor. How much does the processor itself cost? It's hard to tell, because the configurator shows prices as add-ons: one option is $471, and the whole machine is around $6,000 as I click between configurations, so the processor is maybe $400 to $500. My point is that you can get an Intel Xeon, but the bulk of the cost is really in the video card. This is an AI-powered PC, and... is the video card even selected yet? Oh boy, maybe that's the price without the video card. Anyway, the point is you can buy an Intel Xeon 4th-generation processor (they have fifth generation here too), and this gives you an idea of what these workstations cost. I did get an Intel machine, so that's kind of exciting. I'm showing you this because, say you're a company: you could buy one of these machines and have a very proficient box working in your office for local inference. Anyway, back in the repo you'll notice there are different compose files:
there's Pinecone, Qdrant, vLLM. If you want to serve this with vLLM, you'd use that one; if you want Pinecone as your RAG store, you can do that; if you want it without RAG, you can do that too. So there are a lot of variants; I'll just launch the most basic one. I've actually had a lot of trouble configuring vLLM and getting it to work, so if they have one here that works, that's really cool, and that's the whole point of this project: to give you a setup that works, which you then configure for yourselves. We want to deploy to a Xeon, so we need an instance type that has one. I'll search "EC2 instances Xeon fourth generation," and yeah, the m7i (and r7i) families are what I believe we're looking for. If you don't have the money to spend, do not run this. Again, we could run it on an AI PC, and maybe I'll attempt that afterwards; I haven't decided. But let's launch an instance. It shouldn't cost a lot (this should cost you less than a dollar to run), but if you forget to turn your stuff off, you might run into issues. So I'm looking for the m7i generation: I drop this down and type m7i. The "i" is for Intel (m7a is the AMD variant). Notice here we have two vCPUs with 8 GiB, then four vCPUs with 16 GiB. We really should run this on the xlarge, and notice the per-hour cost looks pretty low, but I'm going to try the large. I haven't done this on a large before; I imagine we can, but it is a little small. I'll proceed without a key pair and launch in my default VPC. We do need to allow traffic from the internet, so I'll check both boxes; I probably only need HTTP, but if we want to interact with it, we have to allow that. Then I'll drop this down: I have an IAM role I created previously, an EC2 SSM role. If you're wondering what's in it, I'll show you quickly; nothing super exciting. In our roles: if I were making it from scratch, I'd choose EC2 as the trusted service, hit next, and type SSM. It's basically this one managed policy; sometimes you'll see the older one, which says it will soon be deprecated, so use the newer one (AmazonSSMManagedInstanceCore) instead. That's the only policy attached to the role, and it's what I'm launching the EC2 instance with, so that we can use Session Manager to easily access the instance. With that set, I'll go all the way down, and it all looks good. The only thing I'll change is the AMI: Amazon Linux is fine, but I want this super easy, and Ubuntu 24.04 LTS will make our lives so much easier. We'll launch the instance. Again, this costs pennies, a dollar or two, but if you can't afford the dollar, don't run it; just watch me. I might do a video doing this locally, and the process is basically the same; the only difference is that you'd use the Docker Compose file from whichever folder matches your hardware. Anyway, we'll get this running; I'll wait for it to initialize and then we'll jump into the machine. All right, our instance is running. Let's connect: we'll go
here, click connect, and use Session Manager to connect to the instance. Give it a moment, and we'll switch to the right user account... actually, this is an Ubuntu machine, so it's sudo su - ubuntu. Now we can install everything. I have a full automation script and CloudFormation for this, but I think it helps to run through it manually. I don't know that we need the AWS CLI in here; I know it's in the script, but I don't think we end up using it... let me do a quick search... oh right, it's there in case we store the token in Parameter Store, but I don't plan on doing that today, so I'll take it out. And I know the file says it's a bash script, but we're not going to run it as one; we'll go line by line. First, clone the GenAIExamples repo. I'll paste that in (sometimes you get junk characters at the front that you don't want, so I'll clear those out), and now we're cloning the GenAIExamples repository. We cd into the ChatQnA directory, specifically its Docker Compose folder, and we can see the options below. Next, install Docker. This just uses the instructions straight off the Docker website for Ubuntu; nothing fancy going on. You could literally go to the Docker website and follow the exact same instructions, but it's easy to run them here. Once that's installed, we want to make Docker sudo-less. I like it sudo-less because it makes life so much easier; I run into issues when running Docker with sudo. I don't run tons of stuff at scale in containers in production, so someone might have a different opinion than me about this, but for my workloads I prefer sudo-less containers. We'll wait for the install... there we go. Then run these three lines, which make Docker sudo-less; it says the docker group already exists, which we can ignore, that's not a problem. Now we have some environment variables to set. We want the Hugging Face API token. We could store it in Systems Manager Parameter Store, but I'll set it manually here, so set your key at this point. You probably know how to generate a Hugging Face API key by now, but I need a new one as per usual: go to Access Tokens (I need to sign in, one moment), delete the old one since I'm done with the MongoDB video, name the new one "opea," and create the token. Copy it, come back, paste it in temporarily, and export it: paste and enter. I don't want this key committed to the repo, so we'll leave the placeholder in the script alone. Next, the host IP. Honestly, this just has to be something routable; you could use 0.0.0.0 if you wanted, or 127.0.0.1, I believe, but we'll paste in the provided line, which sets the host IP. Then there's no_proxy: you can set this up to work through a proxy, but we're accessing this machine directly, so we do not
need to do that. Now we cd into the Intel Xeon directory, and there are a few things to change. There's this set_env.sh file where we need to swap out the model. You could run the provided one-liner, but I think it's easier to open the file and modify it, so: clear, then vi set_env.sh. There are some environment variables in here, and you need to know how to use Vim to do this; if you don't know Vim, make sure you learn those keys, they're not hard. We have an embedding model, a reranking model, an LLM model, and the RAG settings; for RAG we're using Redis to store our data. The LLM is set to Intel's neural-chat 7B v3-3, which is kind of large and will take forever to download. It's fine, but we want something smaller, so I'm bringing in the Llama model. I won't type it in manually; I'll go look it up. In particular we need the Instruct version, and this is why we set up the Hugging Face token. I type "llama 3.2 1B instruct," and notice I've already been granted access to this model. You need access to this model to utilize it, so click through whatever you have to and wait. If the page doesn't say you've been granted access, you don't have access; stare at that page and get access as best you can. Another video in this course shows how to get model access on Hugging Face, and I've found a lot of people stumble here because the UI just isn't very clear. You can see the model is about 2.47 GB; it's a good thing we chose this one, because the Intel one is just giant. What we want is the model name, meta-llama/Llama-3.2-1B-Instruct, so I grab that part, copy it, and paste it into place. That looks perfect. We can leave the embedding model alone; the embedding model matters more for the RAG side, and it won't be an issue. That's all I wanted to change, so :wq to save and quit, and it's set. These lines in the script basically do the same thing... actually, it swaps out both lines, perfect, so you can run those instead if you want. Now we want to actually run this, so we'll need these two lines. I'll copy that, and we'll
hit paste and enter. Then I copy the next line and hit enter, and now it's pulling down all the containers: the retriever, the TEI embedding and reranking services, the Redis vector database, the TGI service, all of it. If we go to the OPEA documentation... actually, you know what, I want the docs, so we'll go to the OPEA site, then Technical Documentation, then Examples on the left-hand side, the ChatQnA sample, Overview. Scrolling down, this is basically what it deploys: a bunch of these different containers, all orchestrated together. Going back to the terminal, it's finished pulling, and it says "no space left on device." Oh no. I forgot to launch this with enough storage, so I have to do this whole thing over. I'll terminate this, but hey, we got the first run down, so that's pretty good; it's not a big deal, it doesn't take that long. I'll terminate the instance, go back to EC2, and launch a new one. Practice makes perfect, right? And I'm telling you, it's easier to launch a new instance than to try to resize the disk, because that's a bit more work. This time I'll name it opea-chatqna, choose Ubuntu, that's fine, and we said m7i... no, I didn't say I was going to remove anything; I'm not sure what it's on about. The console is running slow and I don't know what's going on, so I'll go back to instances (I don't trust it) and launch a fresh one: name it opea, choose Ubuntu... what's it trying to spin up... there we go. Type m7i and pick large. I've never used a large for this, but I think it should be sufficient; normally you want at least four vCPUs for something like a seven-billion-parameter model, and we're only running a one-billion-parameter model, so that's pretty easy. I'll skip the key pair again and select both traffic checkboxes. For storage, we only have eight gigabytes by default; the model itself is only a few gigabytes, but it's not just the model downloading, there are a bunch of containers and models, so I'll bump it to 30 GB, which I believe is within the free tier. It actually says so right there, so I'm still in the free tier, at least for storage, not for compute. I'll go down and choose that IAM role again, the one I showed you how to make yourself, and launch this instance. This time I think we've got it. We'll wait for it to launch and get initialized; it's not that bad having to redo this. All right, the OPEA instance is running. Let's connect, and we'll go
ahead and do this again, and this time we'll get it right. I think I kept the token around... yep, so my life isn't too painful here. Sometimes it takes two runs to do something, but that's okay; we're learning by doing it more than once. So we're back in. Going back to our instructions, we want to do all of this again. I'm going to be risky and paste in that entire block to see what happens, and hopefully it just works; I have run it end to end like this before. I need my keys, so I'll paste the token in, then grab the next block, and give it a moment to run. I'll pause until it's done. There we go. Next block, hit enter: that just gets us into the Intel Xeon directory and sets the Hugging Face value. We can run the next part, but I want to make sure it absolutely worked, so I'll paste it in and then cat the env file to verify, and it is correctly set, so I have confidence that's correct. Next, copy these two lines and paste, and now our environment variables are set. We don't strictly need them in set_env.sh; they could just be exported inline like this. That's how they do it, though I might have coded it a little differently. Now we'll do docker compose -f compose.yaml up -d, and that's pulling the containers, which is what it was doing last time. And here's the thing with this layout: going back to the architectural diagram, the only piece that absolutely has to be working is the TGI service, because, as we learned in other videos, TGI is what serves Llama, or our models in general. So we want to make sure it's working: even when all the containers are up and it says "hey, we're ready," it isn't actually ready until the TGI service downloads the model (it might even convert the model weights) and then serves it. Only when it reaches the connected stage do we know absolutely that it's working.
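Once everything does come up, an easy end-to-end check is to hit the ChatQnA gateway directly. A minimal sketch; port 8888 and the /v1/chatqna path are what the ChatQnA README documents at the time of writing, so adjust if your compose file differs:

import requests

# the ChatQnA mega-service gateway listens on host port 8888 by default
resp = requests.post(
    "http://localhost:8888/v1/chatqna",
    json={"messages": "What is OPEA?"},
    timeout=300,  # the first request can be slow while the model warms up
)
print(resp.status_code)
print(resp.text)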
that it's working but you can see it's pulling very quickly because we chose a smaller model um and what's interesting is that I don't even think the documentation says that it can work with that one it absolutely can um but uh you know I don't think they're always updating this um or it's hard to keep up to date with the lest models but I was able to get it working no problem so yeah there's a lot of documentation here but um do not worry I have taken the time yeah there's the mega service um to
to work with it. Okay, so it's still pulling; we'll just wait a bit. All right, they've all pulled and they're starting up now; it looks like they're all started, but we need to take a look, so I'll run docker ps, which lists our containers. I have a nice little command that makes this easier over in the other repository, from the GenAI training days workshop, so I'll grab that quickly. If we go over there, into, I think it was, OPEA, I have all the instructions, and this one line makes our lives a lot easier: if we run it, we can see exactly what's running, since it just formats the output so things are a little more legible. As long as the TGI service is running, it most likely is working, but the reason we did this is to get at the container ID, since you could get it from plain docker ps too if you look carefully. We want the container ID so we can check its logs, so I'll copy it, run docker logs with that ID pasted in, and now it tells us where things are at.
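In case you want the same shortcut, this is the general shape of it, a sketch rather than the exact line from the workshop repo; the --format string and the "tgi" name filter are assumptions based on what we're doing here:

```bash
# List containers with just the columns we care about (ID, name, status)
docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"

# Follow the TGI container's logs until it reports the model is served;
# substitute whatever name/ID `docker ps` showed for your TGI container
docker logs -f $(docker ps -qf "name=tgi")
```

We're watching for the log line that says the service is connected; until then, the stack isn't really ready even if every container shows as started.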
Right now it's downloading safetensors files, so it's not ready; it has not yet served the model, and we're waiting for it to reach the connected stage. If I keep hitting up-arrow and re-running the logs command, it looks like nothing is happening, but it clearly is; it's downloading the safetensors, so I'll just keep re-running this until it works. I'm going to wait a couple of minutes. The other things in the logs are a warning that the token file was not found, and a note that the model is supported by TGI but it sets a default of 4096; I'm just making sure there are no real issues. "Cannot determine GPU compatibility: Torch not compiled with CUDA enabled" is totally fine, because we want to run this on CPUs. We'll just wait a little, because it's still working and there are no failures as far as I can tell. All right, it's been a few minutes; let's take a look and see if our model is actually running. Notice there's a lot more happening in the logs now,
but it's still not ready; it's warming up the model. And remember, we chose a very small instance; it says "large," but usually I'd use a 2xlarge or 4xlarge, so my expectation is it'll be a little slower, though I think we can pull it off. If we'd chosen a 2xlarge this would already be running, but that's what you find out as you work with different compute: you figure out what size you need. But, oh, can I even hit up-arrow anymore? It's not letting me; in fact, the session is becoming unresponsive, so I'm starting to wonder whether I made a mistake and didn't choose a large enough instance size. Hard to say; yeah, it's totally unresponsive now. I'm going to terminate, not the instance, just the session, and connect again, because it might be exhausting the CPU. I'm not sure if we can see the CPU here; not all of it, but you can see CPU got up to 52%, so it's definitely not exhausting that, for sure. The other thought is that maybe we didn't have enough memory available for everything else. Yeah, it's completely unresponsive, so I think this is where I made a mistake and we just need to go up one size. You learn; this is definitely not going to work on the m7i.large, so we'll go ahead
and terminate it, and we'll do it one more time. I promise this time it will work; they say if you do something three times it's locked into your core memory. So we'll launch an instance, and I'll call it opea-3, as this is our third attempt. We'll go over to Ubuntu and choose m7i; yeah, we have 8 gigabytes, and this is going to be four vCPUs; I don't know why I tried it with less, I just thought it might work. We'll check both of these boxes, go down to advanced details, choose our instance profile so SSM works, and launch the instance, then wait a little for it to be ready. And you know what, I already made a mistake, because when we launched that instance I forgot about the storage size. So I'm going to do it again, for the fourth time; this time I'll get it for certain, but you know, practice makes perfect. We'll say launch instance, opea-4, whoops, four; we'll go over to Ubuntu, choose the size, m7i.xlarge this time, scroll down, proceed without a key pair, check these two boxes, and choose 30 gigabytes of storage. This one has 16 gigabytes of RAM. The other thing I was wondering is whether we ran out of storage space before, so I'm going to bump this up to 40 just in case, and then make our way over to the EC2 SSM role. Now we should have everything, even the firewall ports open, so we'll launch it, and this time, this time it will work. It has to work; I don't want to re-shoot this video,
but these are the common problems; I almost always run into them, and that's why I'm not starting over, because I think it's worth showing this stuff off. We do have to wait for the instance to pass its initial checks, so a couple of minutes here. All right, let's connect to the instance; it should be ready now. Sometimes this happens with Session Manager: I just click back and forth and then it lets us connect. At this point we should be super experts at getting this configured, so we'll type sudo su ubuntu, whoops, sudo su ubuntu, and I'll go back over to the instructions. I still have the token in there, so now I'm really going for it: I'm going to paste the whole block, all the way down, and just single-shot it. Let me adjust this, grab it, trim it, and single-shot it; I'll undo this so I still have those groups. We'll just wait for this to complete. All right, we ran all that, and it looks like we're okay, but I want to make sure the last command actually ran; it does not look like the file is in the correct location as I expected. We did run all this stuff, but I have a feeling it hit this step and then it didn't
work. So I'm going to copy and paste this part manually; you've got to pay close attention to what's going on here. I'll cat out the file that was supposed to be replaced, excellent, and now we do our docker compose up, and this time, for certain, it's going to work. We'll hit enter and pull those images; we'll wait for the pull, then do the same thing as before, which is looking at the TGI service and making sure it launched. All right, it pulled all the containers and it's creating them now; we'll wait until they're in a started state, and there we go. We can do our docker ps, but of course we want our nicer formatted line, so grab it and paste it in. I'm looking for the TGI service, and we'll do docker logs, and whoops, the ID didn't copy; we'll try again, right-click copy, right-click paste, sometimes that's what you've got to do, and then I get some weird terminal output, but that's fine. So it's downloading the model again; we'll just wait a few minutes, but I think this time we won't have an issue, because now we have enough memory, enough CPU, and enough storage. All right, let's see if the model is in a running state; I've been waiting a little bit,
and now it says Connected, so we're in good shape. It also says invalid hostname, defaulting to 0.0.0.0; the hostname we set wasn't accepted, but that's okay, because 0.0.0.0 works for us too. There are a few different ways to interact with this model, but the way we really want is the web interface. It's being served on a public IP address; hopefully you have a public IP, and if you don't, you'll have to run it again. I'll go over here, hit enter, and in theory I should get an interface, and I do: we have ChatQnA. Now, I'm told this can handle RAG with particular documents, but first I'll just say "hello," and it replies "Hello! How can I assist you today?", so it's running totally fine. I'd love to try out the RAG; I always somehow avoid it, I don't know why, but in here it should be able to explain to me how
the RAG works; give me a moment, I'll read through it quickly. I remember there was an example file, but I can't seem to find it; it's probably somewhere else. Maybe if we look at the architecture we can figure it out. So we have our vector data store, fine, and it does information extraction; here it's saying OCR, PDF data extraction, web crawlers. The diagram illustrates the flow of information through the chatbot system, starting from the user input, going through the retrieval, analyze, and generate components. What I'm trying to understand is the part where it extracts the information, because we do have an upload interface here; maybe we can explore the code and figure it out. So I'll go over to the OPEA GitHub, into the repositories, into GenAIExamples, and hit period on my keyboard so it opens up in github.dev. We'll go into the Intel CPU Xeon folder and then into the standard compose file, which is what we're using. We have some services: the Redis database, the dataprep-redis service, which depends on the Redis vector database and the TEI embedding service, and we have the retriever. What I'm trying to figure out from this is: what is the thing doing the conversion? And here we have ChatQnA itself, coming from the registry
there, and that's totally fine. So that's what we need to figure out: what's actually running under the hood. I believe these are built on, if we go back over to the actual GenAIComps repo, all the examples are composed of the components there, so if I hit period here I can poke around, because I just want to understand who's doing the OCR, if it's doing OCR at all. If it works, that's fine without us 100% knowing; we can figure that out later. But we can clearly see a lot of different microservices here; it says these are enterprise microservice building blocks for building enterprise applications, and these are the supported ones. And there are these dataprep ones, so I wonder if what's happening is that there's some data preparation going on, specifically this one: the data prep microservice aims to pre-process data from various sources, either structured or unstructured, into text data, and convert the text data to embedding vectors. So this is the thing we actually need; it was called dataprep, which is over here, and in here we're using Redis, so let's go read the README. They provide the data prep microservice for multimodal data input, and for text input they provide two frameworks, LangChain and LlamaIndex; they also provide
LangChain-Ray, which uses Ray to parallelize the data prep for better performance; I imagine if you're using Ray it's parallelizing the work across workers underneath, as that's what Ray is normally used for. The two folders are organized the same way, so you can use either framework for the data prep service by following the instructions. I'm not sure which one this deployment is using, but the idea is we can look at one or the other, so let's just assume it's LangChain and take a look. There's a Dockerfile here that sets some stuff up, and its entrypoint is prepare_doc_redis, so if we go into prepare_doc_redis, now we're seeing some code. We can see LangChain here, and what I'm looking for is what happens during preparation. Here we have the dataprep source utilities under comps, with a document loader and a bunch of other components; let me go up here. These are some of the utilities being used, and I'm just trying to figure out what it can process. There's a converter from .doc to .docx, so at least it can work with docx files, and it can work with PowerPoints, so it supports a wide variety of documents. But does it have PDF? That's what I don't know; there's a PDF loader, so it looks like it supports a wide range of things. Let's find out if we can actually get this to work; I haven't played around with the RAG enough to know if it
will work. I'm trying to think of something I can download, so I'll search for a Japanese language learning PDF, and we'll go to "Easy Japanese," which I've actually heard of before; I've downloaded it before and have yet to use it for my own language learning. The question is: does it have text we can extract? It absolutely does; we certainly have a lot of interesting stuff in here. Great, so I'll download it; it's three megabytes, not a big deal. I'll save it to my desktop, open it up, and give it a quick rename to "easy Japanese" so I can find it easily. Now I'll go back over here, where is it, there it is, and we're going to upload this file. So I'll hit upload, choose the file, find easy Japanese PDF, and now it's uploading. I'm not sure how long it takes, so we'll wait a little and see what happens. Just remember the type of file we're using is not a simple file; it's obviously mixed-language, so I'm not sure if we should have only been using purely English here; we'd have to do a little more
research to figure this out. The UI says "please upload your file or paste a remote link, and the chat will respond based on the contents." It's still spinning, which makes me think it's still uploading or processing; I'm not 100% certain, but we'll open the network tab, since that might be one way to keep track of what's going on. I'll give the page a hard refresh, and now it's saying the site's not reachable; give it a second, it might be my internet, honestly, half the time my internet kicks out here. We'll go back to our instance and make sure the containers are still running with docker ps; yeah, they all appear to still be running. Oh, you know what, I think it's because the browser put https in the URL; sometimes that happens. I also went full screen by accident and couldn't get out; hold Escape, there we go. So I'll click into the address bar and take the https off, and there we go. Is the document uploaded? I don't know, but this time I'll keep the network tab open, so if there's a problem with the upload we'll find out. I'll upload that file again, easy Japanese, open, and watch the network tab; hopefully I capture the request, and it is sending the data over, so when we get a response back we'll know it's okay. I'll pause the video and wait a little, but maybe the file's too large; I'm not sure if there are any file size limits, but we'll do our best and wait. All right, I've waited a bit, and it's still "uploading," so clearly something's wrong with the file I'm uploading. I don't think it's the code, but it might be too difficult to debug here. I'm going to see if
I can find a simpler file, say a docx for learning Japanese; I just want a really, really simple one. Maybe what I'll do is make my own Word document. I'll go over to ChatGPT, switch to GPT-4o, I think that's the one with the advanced output, and say "generate me a learning guide for Japanese language learning," and I know I spelled it wrong, I don't care. I'll open up a new document here, and it goes and goes and goes, okay. I'll copy the output, paste it into a Word doc, and Save As to my desktop as per usual, calling it "learning guide Japanese." So that's now a docx file. I'll go back over here and upload that file; just give me a moment to choose it, you can't see it, but I'm choosing it off-screen, if I can find the file I just made. Sometimes when you put things on your desktop they're not actually on your desktop, which is very frustrating. What did I call this file? I know you can't see what I'm doing, I'm just talking to myself while I work through this: learning, oh, L for learning. I swear, sometimes I put things on my desktop, the file totally exists, and it refuses to show up no matter what, so I have to make a junk folder to put it in just so I can find it; it's so stupid. I'll save this file again, Save As, into a new folder, even though I shouldn't have to
do that. I'll try uploading again; it definitely exists in this folder, all files, oh, maybe it was always there. So now I have a Word doc, and I'm hoping it can process it; if it can't, it's not a big deal, but it might require more work that's outside the scope of this to debug. This would be a lot easier to debug locally, where I'd have all the containers and could inspect them easily; we can do that here as well, but I'd much prefer to do it somewhere else. We do have all these services, and the one doing the prep is dataprep, so we can look at its logs: docker logs, copy this container ID, paste, enter. And here it says they did upload it: "easy japanese.PDF does not exist," and an "unsupported color space CMYK" error around the image conversion, so it definitely was attempting to process the file, and then "easy Japanese does not exist." So I'm not exactly sure what's wrong; we'd have to do more debugging to figure it out. This is still really, really good; I'd just have to spend more time on it, and to be honest I always have a bit of trouble with RAG. There are files that do work with this, I just can't find them right now, so we'll consider this done; that's the best we can do today, and we'll go ahead and terminate it. So that's an easy way to launch container services, and you're getting an idea of how you could work with this stuff. I'm going to make sure I delete my key out of here, something I forget to do quite often; I'll go over here and redact that part, so it just says "set key," and update this WIP. And I will see you in the next one, ciao.
Let's talk about KServe, which is a more generalized tool for deploying machine learning models on Kubernetes via Knative. Knative is a serverless way of working with Kubernetes; we won't get too deep into that, but we will talk about KServe and how it can be used to deploy ML models, and also large language models. Here's an example of some configuration code we'd use to deploy a model with KServe as a service within Kubernetes; we're specifying the runtime, the KServe PyTorch server, though I know this example is actually using a Hugging Face transformer model served through it. KServe can serve the following kinds of models: TensorRT, TensorFlow, PyTorch, scikit-learn, XGBoost, and ONNX, and there are a lot of models that are ONNX-compatible, so you can see it can do a lot. If you look there, I'll get my pen tool out if I can find it, notice the layers: we have Knative, which is the serverless layer, and then Kubernetes, which utilizes the underlying resources. This is not something I'd demo, just because it's a lot of work to show, but I want you to generally know that there is this generic way to serve models in Kubernetes, and it can serve LLMs as well.
All right, let's take a look at vLLM, which is an open source library to serve large language models; I believe the V stands for virtual. It's really straightforward: the idea is that you can use the vLLM CLI to say vllm serve, and I believe you'd install that with pip install vllm. You can also serve it via a Docker container, and I will say that, because there are various ways to serve it, probably the best way, or the way I like, is through containers; you'll find that for a lot of these servers that's how they recommend you run them, or it's the only way. I could go through all the functionality of vLLM, but to be honest most of these serving frameworks basically do the same things, as they're all trying to keep parity with each other. The only reason you might choose one over another is if you were using a very specific model that only worked on vLLM but not on TGI, or vice versa. So it's not worth getting into every little bit, because I couldn't find a huge difference; even in the actual docs they talk about all the things it can do and all the ways you can deploy it, and I'd say it does have more deployment methods than the others, but honestly most people run these things in containers. So there you go.
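As a quick sketch of the two approaches just described, it's something like this; the model name is just an example, and flags can change between vLLM releases:

```bash
# Option 1: install the library and use its CLI to serve a model
pip install vllm
vllm serve Qwen/Qwen2.5-0.5B-Instruct   # example model; serves an OpenAI-compatible API on :8000

# Option 2: run the official container image instead (the way I'd lean)
docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
  --model Qwen/Qwen2.5-0.5B-Instruct
```

Either way you end up with an OpenAI-compatible endpoint, which is part of why swapping between these servers tends to be low-friction.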
All right, let's talk about Ray Serve. Ray is a collection of libraries for AI workloads, and Ray Serve can be used to serve AI models, such as LLMs, with vLLM. I know at one point there was something called Ray LLM, but since vLLM has done so well, Ray retired that open source project, and now they recommend using vLLM. You can find Ray under the ray-project organization, and Ray Serve is there as well. Ray is actually a collection of libraries, and we're focusing specifically on Ray Serve right now; if you've ever heard of Apache Spark, Ray is often positioned as a replacement for Apache Spark, but we're not talking about all the other parts of Ray, just Ray Serve, since we're only concerned with serving LLMs. The reason you'd use Ray Serve with vLLM is that vLLM on its own is only designed to scale to a single server; if you want it distributed, to take your LLM and utilize multiple machines, you'd use Ray to spread the workload across them. The way you do that is, obviously, you have to write Python code; it's not as simple as launching a single container, you will have to write code to do this, but it does make things a lot more powerful. So there you go.
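To give you the flavor of what that Python code looks like, here's a toy Ray Serve deployment; it just echoes a prompt back rather than wiring in a real vLLM engine, since that integration varies a lot by version, so treat it as a shape, not a recipe:

```python
# pip install "ray[serve]"   (a real LLM setup would also need vllm)
from ray import serve

@serve.deployment(num_replicas=2)  # replicas can be spread across a Ray cluster
class Echo:
    async def __call__(self, request):
        # In a real deployment, this body would hand the prompt to a
        # vLLM engine and stream tokens back instead of echoing
        body = await request.json()
        return {"echo": body.get("prompt", "")}

app = Echo.bind()

if __name__ == "__main__":
    serve.run(app)  # serves HTTP on 127.0.0.1:8000 by default
```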
All right, let's talk about TGI and TEI. These are both open source libraries by Hugging Face: TGI stands for Text Generation Inference, and it's for serving LLMs, while TEI is Text Embeddings Inference, which is specifically for serving models whose output is embeddings. You can see the way you use them is very similar; the only way I really know how to use them is with Docker containers. There's probably some way to do it with Python code, but containers are really the way you want to use TGI and TEI. Why do they have two separate servers when both workloads are language models? I don't know; maybe it's something to do with the code, but a lot of the time these servers implement some level of the architecture underneath to adapt to a bunch of different model families, and obviously serving text-embedding models is going to be a little different from serving generation models, so the separation kind of makes sense to me. You can see it's very straightforward to serve a model; one thing you'll want to notice is the model flag. I'll get my pen tool out here if I can find it: notice this one runs gpt2, and this one passes a variable, but the idea is that in these Docker containers you're literally passing in exactly what you want to run.
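Roughly, that pair of commands looks like this; image tags move fast, so check the current ones, and the volume mount is just so downloaded weights survive container restarts:

```bash
# TGI: serve a text-generation model (gpt2 here, as in the example on screen)
docker run -p 8080:80 -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id gpt2

# TEI: serve an embedding model; same pattern, different server and model
docker run -p 8081:80 -v $PWD/data:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id BAAI/bge-base-en-v1.5
```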
Okay, let's talk about TensorRT-LLM. Before that, let's talk about TensorRT, which is an ecosystem of APIs for high-performance deep learning inference; it's specifically for optimizing models for target hardware, specifically NVIDIA GPUs. I didn't say only NVIDIA GPUs because I'm not sure whether it can be used for anything else, but generally it's for NVIDIA GPUs. To be honest, learning how to code against TensorRT is out of scope for this, as it's very challenging; however, I did want to talk about TensorRT-LLM, which is more reasonable, something we might actually be able to do. The idea is that TensorRT-LLM lets you serve an LLM on the TensorRT engine using Python code. What you'd do is download a checkpoint, and if you don't know what a checkpoint is, it's a saved state of a model's weights; you convert that over to the TensorRT-LLM checkpoint format, then build a TensorRT engine from that converted checkpoint file, and then you can run inference. This is not something I'd ever do a lab on, it's just too complicated, but because you'll see TensorRT mentioned a lot, I wanted to make sure you got some exposure to it.
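To make the three-step flow concrete, it looks roughly like this; the script names and flags here are from my memory of the TensorRT-LLM examples and change between releases, so treat every path below as an assumption and follow the repo's own docs:

```bash
# 1. Convert a downloaded Hugging Face checkpoint into TensorRT-LLM's format
#    (conversion scripts live under the per-model folders in examples/)
python examples/llama/convert_checkpoint.py \
  --model_dir ./llama-hf --output_dir ./trtllm-ckpt

# 2. Build an optimized TensorRT engine from the converted checkpoint
trtllm-build --checkpoint_dir ./trtllm-ckpt --output_dir ./trtllm-engine

# 3. Run inference against the built engine
python examples/run.py --engine_dir ./trtllm-engine \
  --input_text "Hello, world"
```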
Hey everyone, in this video I want to show you Intel Tiber AI Cloud, which is Intel's platform for using their latest GPUs and CPUs, and they also have a bunch of notebooks there, so if you want to play around with some complex AI projects, you absolutely can. I'm already signed up, so I'll just sign in, which brings me to console.cloud.intel.com. This product has been renamed a few times: it used to be Intel Developer Cloud, then Intel Tiber Developer Cloud, and now it's Intel Tiber AI Cloud, but that's no different from any other provider renaming things a hundred times. Let me get logged in here, just give me a moment; I just have to do an SMS verification, oh, I think I typed the code wrong, so I'll try one more time, sometimes I mess up the codes, and now I'm in, great. This thing has been getting better and better every time I come in here, so I've been pretty excited about it. The US-Region-1 compute is sold out right now, so you can see it's very popular. You can run Kubernetes clusters on here, I have yet to do so, and you can launch compute from here, and the only reason you'd launch compute here as opposed to something like AWS is that they have compute you simply can't get other places. So if you want to utilize, I'm not sure if it shows up here, I might not
have access, as you might have to request it, but there could be things like Gaudi 3. I don't see it here; I think I'd actually have to request access to Gaudi, and it says Gaudi 3 AI processors are now available in the cloud but your account has to be approved for it. Gaudi 3 is their latest AI accelerator. You can see they have a bunch of processors here, whereas on AWS they only have Gaudi 1; if you want Gaudi 2, I think you can get that on IBM Cloud, but Gaudi 3 you can currently only get here. What I really want to show you is the learning notebooks, as this is a way to start working with a bunch of stuff right away. So if you want to learn how to fine-tune Llama with Hugging Face, with a complete project set up, you can launch it here; why it's not working right now is probably because I right-clicked it, trying to open it in a separate tab, and that messed it up, so I just need to log back into my account, give me a moment, my autofill is fighting me, there we go; I'm sending another code and verifying it. We'll give this a moment to start up; it's starting a JupyterHub environment, and this is just another place you can learn: obviously there's SageMaker Studio Lab, there's Google Colab, but then there are also these. We just have an environment, it's pretty straightforward, you run it and you can learn on it, but this one in particular says Gaudi 2 accelerator, so it might be using Gaudi 2; I didn't realize I spun up Gaudi 2, oh yeah, we are using Gaudi 2. So this is a great way, if you didn't have any other way of accessing AI accelerators, because they get really expensive, to start learning with them. I'm carefully looking here; I don't remember exactly which part of the code is the Gaudi 2 part, it's been a while, but the point is that if you want to use Gaudi 2 processors, this is a place you can do it. Then obviously we have the notebooks for the Max series GPUs, and if you want to learn SYCL you can do that as well. Again, if you're buying hardware, and it happens to be Intel processors, Max series GPUs, or Gaudi 1, 2, or 3 accelerators, you can play in this playground to learn how to work with them before you make those hardware purchases, and it's good in general for learning their stuff. I just wanted to show you that; that's about it, and I'll see you in the next one, ciao.
All right, I want to take a look at RunPod. I've never used it before; it's a place where you can develop, train, and scale AI models, and I think it's unfair not to look at all the possible options. I've heard about this one but never bothered signing up, so I'll sign up today and we'll find out if there's a free tier. I'll use my Google account. If it turns out to be paid, don't worry about it, just don't use it, or use it if you have the money and like the experience; we're just exploring and making sure we have options. I don't want to subscribe to the newsletter, but I'll accept the terms of service. Again, I've never used this service before; the interface looks okay. What I'm looking for is notebooks, since they said they can run notebooks, but clearly it can run other things: deploy a GPU pod, autoscale your workload, traffic, storage, refer friends to earn credits. I'm not sure if we have any usage; it says $0.00 at the top, and the spend limit is $40 an hour a day, so I'm not sure what we have here. But where are these notebooks they promised us? Do we have to deploy a pod? I guess it's just like running anything, right? So, does RunPod have a free tier? Let's go take a look: "more power, less cost." I mean, they had free GPUs a couple of years ago, and to be fair I understand if they're not providing that for free anymore, but maybe we can still try it for fun and see if I can launch stuff here. I'm seeing things for serving, that's vLLM, that's a server, but I'm thinking what RunPod really is, is it just lets us run anything. So let's search "RunPod notebooks": select a pod, choose your GPUs, choose an instance, search a template; they have templates that
already exist. I just want something very inexpensive as an example to show, so I'm looking through here; these are GPUs, let's go over to CPUs, since I really don't want to spend a lot, I just want to see it working, and I might have to attach a credit card for this. I want general purpose; the compute-optimized one is more expensive than that one, you'd think general purpose would be less, but that's fine. We'll go down below and choose a template; I think they said there was one for notebooks, Jupyter, nope. Let's go back: "select a pod, choose a GPU instance, search for a template that includes Jupyter Notebook." So we go here and I don't see it, I don't see a template for it; edit template, no; over here there's templates, new template, again there are no templates to search. So I'm not sure about this, but clearly I could install one myself and run the notebook by hand; that's one way of doing it, but I don't know if I like it. I'm not saying RunPod is bad, but it's clearly not showing a fast, straightforward way to set up a notebook, though we clearly could do it. What if I go back to GPUs for a second, discard changes, back to pods, and choose one of these; I don't care which, I'm looking through here, how about an A40?
I like how there's more information about it. Anyway, we'll change the template, and now if I search Jupyter, Jupyter, Jupyter, there's a PyTorch one, a RunPod PyTorch template. Again, I'm just looking for the lowest cost, so this one's pretty low, the RTX A5000. I can't remember, is RunPod the one where they're borrowing GPUs from other people? Oh, I clicked on a Lightning AI ad. "Other people's globally distributed GPUs for the cloud," so I think this is the one where you're utilizing other people's GPUs; that's what I think it is, but anyway, I don't care, I just want the lowest-cost one. I'll go over here and check the templates are still available, yes, and I haven't paid for anything yet, so I'm not sure how this is going to work, but we're going to choose on-demand and say Deploy On-Demand, and it says insufficient funds, so I have to load up a little money here. Just give me a moment. Hmm, I don't know if I want to do this, because you can't choose less than $25, right, and I don't know if I'm going to come back and
use RunPod. So you know what, I think this is where I'd call it quits; it looks fine, it has templates, but I don't feel like dumping $25 in here. If it were less, like $10 or $5... maybe I can choose the amount: can I say $5? The lowest shown is $2; well, let's try $2, I can handle $2, because I'm going to be spending on a lot of services, and if I keep throwing $30 here and $30 there for services I'm not using, someone's going to get mad at me, like, Andrew, why do you keep doing that? So let's try two, nope; can I do five, nope; can I do ten? Ten works, that's the lowest I can do, and that's totally fine, so I'll say purchase, card ending 9518, I don't know if that's my credit card or someone else's, give me a second. And I tried to buy it, and it wouldn't let me; it's tricky with my company credit card, so I guess I can't really test. You know what, give me a second, I'll just use my other card, and that doesn't work either. So that was my best attempt with RunPod today; it's not their fault, they're just using a third-party payment widget here, and I imagine what would happen is we'd launch that Jupyter notebook and have that experience. It's just another place where you can buy GPUs, and now that I remember, RunPod is the one where, if you're looking for the most cost-effective option, you can use these distributed GPUs. Maybe at some point I'll come back and figure this out, but it's not working today, so I'll have to move on; at least we took a look around and know that the minimum load-up is $10. As someone just exploring, I wish I could load up even $5; RunPod, if you're listening, that's something you might want to consider. But we did give it a look, and I might revisit it later; that was the experience we had. All right, an email just came through saying my payment went through, so despite it saying it failed, maybe it actually worked. I'll log into RunPod, and notice we actually have $10 now, so we clearly have some spend we can use; let's go ahead and deploy a pod,
and notice that with GPUs, that's the only place we see that Jupyter notebook template right now. I'm sure I could get it working on the CPU one, but I don't want to do that today. It's interesting, we have Secure Cloud and Community Cloud; I'm not sure what the difference is, maybe a class of hardware, and I'd imagine Community gives you more options and is more cost-effective, but I'm going to go with Secure Cloud today, as that was the default. It looks like we can also choose from a lot of locations for where our stuff runs. Again, I'm looking for the most inexpensive thing we can run, and it would be nice if this were sorted by cost; maybe we can do that, but I'm not seeing it today. So I'd just say I don't like that I have to figure out the cheapest option myself, I kind of wish this were a table, but we've got what we've got, so I'm not going to complain too much. We have the RTX 2000 Ada, and I wouldn't mind trying that out, it's pretty cost-effective, that's a card that goes right into a laptop, but it's showing low availability. I don't need much here; we have high availability for the RTX A4500, so let's use that, I just want this to work today, and we'll go ahead and change the template. It said there's one specifically for Jupyter, so
I'm looking for Jupyter; I don't see it, so I'll just type it in, and we have a Jupyter PyTorch template, which seems like it would have Jupyter ready to use. I only need one GPU today. There's a one-week savings plan, that's cool, but I don't need something for a week, though I could see that being interesting if you needed compute for a period of time. Let's Deploy On-Demand and see what we get. Now we're waiting for the deployment; not sure how long it should take, but I can't imagine that long, and yeah, it's doing stuff, it's downloading, so we can clearly see what's going on; I'll wait until it's completely ready. All right, I'm still waiting for this to spin up, it's been quite a few minutes, but you can clearly see it's downloading containers, so that's basically what I'm waiting on. The other thing is I'm not sure where this is actually being spun up, since I didn't specifically pick a location; what does RO stand for, Romania? I suppose I don't think it would really matter if it were Romania, as I'm not looking for super-fast latency here, but it is taking time to pull, and it could just be that the image being used is really large, so it's hard to say. But even if this does get working, then what, you know what I mean, because I wasn't sure exactly how I'd get into it, oh, we have a Connect button. So I think once this is completely done, if it finishes, we'll have a Connect button, and we'll give that a go. All right, so after a very, very long wait, and it took more than a few minutes, I now have it running, and I've already had a bit of spend, no big deal, as it says 40 cents an hour; GPU Cloud
spend is about that. Now let's hit Connect, and we have a couple of options: basic SSH terminal, SSH over exposed ports, start web terminal. What I want is to just connect to it, so HTTP port 8888, marked not ready. I'm not sure what port this runs on by default; we do have a public IP address here. What's the default port for Jupyter Lab or Jupyter Notebook, is it 8888? It is, and it's suggesting this one's not ready, but we'll click it anyway. Okay, so we're on RunPod here, and I see something called AI-Dock, which I've never seen before; let me search, RunPod AI-Dock, what is it, a login, what would it be, just give me a second. All right, so here we have a username, but it doesn't specify what the password is, so I don't know what it is; we can tell what the username is, or part of it, yeah, this is the username, but how would I know the password? I wouldn't, and there's no indicator of how you get it; it just says enter the username and password. So I'm going to make a note for them; I'm trying to be helpful, they might not like my feedback, but I'll give them some feedback. So here I am filing it: I'll go to this page and tell them,
something like: "I launched a GPU pod with the PyTorch Jupyter template, and I wanted to show people how to connect to the web terminal so they can work in a notebook on a pod. When I click web terminal, I have to enter a username and password; it's clear what the username should be, but I have no idea what the password is supposed to be, and there's a lack of documentation explaining what to use." There we go, that's all I can do. Again, it's not very clear; maybe we could use part of the SSH key to do it, but there's nothing here, it just says password, which isn't helpful whatsoever. Anyway, we gave it our best try, but so far I probably wouldn't use this, I don't particularly like it, and considering how long it took to start up, that's a problem: if you have a workload, you generally need to know how long it takes to start. It also feels like there might be additional spend, though if you're running something over a week or two that's less of an issue. Definitely something I wouldn't try again for now; and I think it stopped, yep, it's exited, and we'll call this one done, ciao.
All right, let's take a look at Groq. Groq is a service that serves up open source models, providing inference via an API and also a web interface. If we go down here we can actually start trying Groq out. What's really cool is it also has a voice feature using Whisper Large v3, so we could talk to it here. I want to see if I can capture it, yeah, we are capturing system sound, so let's see what happens. I'll sign up with my Google account, so I should be connected now; I'll say allow when the site asks, then: "Can you teach me how to speak Japanese?" and I'll hit pause here. Right away we're getting output back, and you can just see how fast it is; they're emphasizing how fast their inference is, the number of tokens being used, things like that, which is really cool. But if we want to start building with it, we can go over to the playground, and I believe we can get a free API key; I literally just used my Google account to get connected. So maybe I'll go ahead and create a new API key; we'll give it a moment, I do need to verify with Cloudflare, we'll name it "my API key" and create it, and now we have an API key we can use. I'll store it over here for the moment, since I'm not ready to use it just yet. But let's say we want to start working with it: we do have a playground, but I want to work programmatically, and we can see there are
a bunch of different models we can use: Whisper Large v3, Llama 3.1 8B, Gemma at 9 billion parameters. Let's say we want to use Gemma; we can view some code, grab it, and maybe start working with it. We have an API key over here that we're holding on to; I'll copy it and make my way over to GitHub, and I'm actually going to launch this in GitHub Codespaces today. So we'll launch Codespaces and give it a moment to connect; it always takes a little time. All right, our environment is up. I need to switch to the dark theme, otherwise it's going to be a little hard on me today. I'll make a new folder called groq, and I just want to bring in that code example we saw. If we go back over here, we have our code example; I like that they show other implementations, but I'll stick with Python, and this will just be gemma.py. I'll paste it in. We're probably going to need groq as a requirement, so I'll create requirements.txt and type in groq; and it looks like the API is compatible with the OpenAI API, which we see in quite a few places. So I'm going
to do a pip install -r requirements.txt. I'll bump the font up a little, I realize it's a bit small. We might as well get python-dotenv in here too, since we shouldn't load variables externally, and we'll add a .gitignore that ignores the .env file. Now I'll add a new .env file, and it will hold our API key; we'll call it GROQ_API_KEY. There might be a very specific environment variable name that gets loaded automatically; I'll load it manually if I have to, but since the sample isn't showing it, there must be a specific one. So we'll just say GROQ_API_KEY as the env var; it probably has a very, very specific name, oh, it is GROQ_API_KEY, did I get it right without even knowing? I did, excellent. Now I'll do another pip install -r requirements.txt because we added a requirement. I know there's other code where we've loaded in our env vars, so I'll go down into maybe this one here and grab those three lines, as we'll need all three of them, maybe not the os one, but we'll see, and paste them in. The idea is that the client should now be contextually aware of that environment variable. It's using Gemma 9 billion, the "it" variant, which stands for instruction-tuned. I'm hoping this works; again, I haven't put any credit card in here, so
this will be really interesting; it might be very limited, but if it works, that's good enough for me. We'll type python gemma.py and hit run, and we get an error back: "minimum number of items 1, invalid request." Well, I did exactly what the sample told me to do, so I'm not sure what's wrong, but let's look carefully and see. Scrolling down: error code 400, so that's a bad request, minimum number of items 1, invalid request. Let's take a look; oh, there are no messages, we haven't passed any messages here, so that totally makes sense. I'm not sure of the exact shape, so I'll assume it uses role: we have role, I'll say user, and then I'll just say message: "hello how are you today." Let's see if that works; I'm just guessing the format. And it says: "the following must be satisfied: messages... message is not supported, did you mean name?" Okay, so clearly there's a very specific format it's looking for. It doesn't show us an example, but we'll go into the Groq API docs and find out what it is: it's role and content, as they show here; I'm not sure if it varies by API, but we'll try it. And it replies "as a large language model I don't have feelings," so it's working now, and wow, is this fast.
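Put together, the working script looks roughly like this; the model ID is my guess at the Gemma 9B identifier Groq uses, and loading the key from .env via python-dotenv is the pattern we just set up:

```python
import os
from dotenv import load_dotenv   # pip install python-dotenv
from groq import Groq            # pip install groq

load_dotenv()  # pulls GROQ_API_KEY out of the .env file

# The Groq client reads GROQ_API_KEY from the environment automatically
client = Groq()

# messages must be a list of {role, content} dicts, which is the
# OpenAI-compatible chat format the error above was pointing us toward
completion = client.chat.completions.create(
    model="gemma2-9b-it",  # instruction-tuned Gemma 9B (check current model IDs)
    messages=[{"role": "user", "content": "Hello, how are you today?"}],
)
print(completion.choices[0].message.content)
```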
I guess that's the reason they say it's such a big deal. You can see there's some OpenAI compatibility here; it can do text, speech, vision, but it's utilizing open source models, so if you're serious about using open source models, that kind of makes sense. It clearly has integrations with a lot of different tools and services over here on the left-hand side. I wonder if it can do structured JSON; I don't see it here, but maybe there's a way. Let's search "Groq structured JSON," structured output, that could be really interesting. It looks like there is some stuff out there, though I'm not sure exactly how to do it; well, it looks like something has a response_model, so there seems to be a way. I'd be really interested to see if that works, since structured output has always been a really large pain point for us. So I'll make a new file, structured.py, and we'll grab this example; it's using Pydantic, which is fine, and I'd just love to see it work. I was talking to George, who knows a lot about structured output, and he's had a lot of trouble with it, so if he's having trouble, then clearly it's not a me problem; you know what I'm saying, it's just a hard thing to do. We'll go over here and grab
this code. I think we already have os imported, so I'll take that out. I'm not sure what this "instructor" is; maybe that's the thing doing the structured output. I'm not sure where that last link even came from, so let's search "Groq structured output," oh, it is from Instructor: "the most popular library for simple structured outputs." Did we just find a cool library for structured outputs? If this works I'll be pumped. Okay, we are totally taking a detour, this could be its own video, but we're dropping in here because I'm so darn excited. So we have Instructor; I'll split the screen and look at our structured output code. We have Groq, and I know what Pydantic is, it's a way of representing typed structures in Python, so we'll put that in there; typing I think is separate, so we need to import those separately. Let's see if we can do the requirements; I don't know if the library is literally called instructor, but we'll try anyway. Then python structured.py, and we got structured output, wow. I don't know how to tell you this, but it's been so hard to find a library for structured output. The question is what it works with; if it works with OpenAI that's great as well, and it seems it does, but the fact that it works with Groq really opens the door to a lot of opportunities for
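For reference, the detour boiled down to something like this; this sketch assumes Instructor's Groq integration looks the way it did when I grabbed the example, so double-check the library's current docs:

```python
import instructor
from groq import Groq
from pydantic import BaseModel

# Pydantic model describing the exact structure we want back
class User(BaseModel):
    name: str
    age: int

# Wrap the Groq client so responses are parsed/validated into the model
client = instructor.from_groq(Groq())

user = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # any Groq chat model should work here
    response_model=User,           # this is the structured-output piece
    messages=[{"role": "user", "content": "Extract: John is 30 years old."}],
)
print(user)  # -> User(name='John', age=30)
```

The nice part is that if validation fails, Instructor can retry with the error fed back to the model, which is exactly the pain point we've had doing this by hand.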
us. So I was really excited to find that just now; maybe I'll make it a separate video, but there you go, that's Groq. As you can see, it's super easy to use and lets you run open source models super fast. I guess the question is how much Groq costs, since we were just using the free tier. We go to products, pricing, and it's priced per million tokens, which seems very cost-effective. It doesn't look like you load up credits, it's pay as you go; let me check how you pay: settings, billing, pay per token, so you scale up as you go. Oh, you can't do this yet: you have free use with low rate limits, which is fine; pay-per-token is marked coming soon, so you can't do that yet; and beyond that, you have to pay as a company. That's really interesting, I would have thought this would be ready to go. Once they get pay-per-token this could be a game changer, at least for the open source area, but I guess right now you just get to use what they give you in GroqCloud, so that's what you get. All right, see you around, ciao.
All right, let's take a look at Replicate, which, by the way, has the coolest name ever; I imagine it must be inspired by Blade Runner. Replicate is a place where you can run projects, serve them up, similar to, say, Hugging Face Spaces. If we go here we can explore a bunch of different models and just use them: if you wanted to use Flux, you could go here and pay to use it directly. It also looks like they have an API, so a lot of these models are set up so you can install the Replicate client and call them directly; each model has both a playground and an API, and you can implement it in various languages. They do have some free-tier usage, I believe, so let's sign up; I'll sign in with GitHub. Obviously I had an account from before, but let's look at what we can use for free, because some things are free and some are paid.
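To show the API shape, calling a model from Python looks roughly like this; the model slug is illustrative, and the client reads a REPLICATE_API_TOKEN environment variable, so set that first:

```python
import replicate  # pip install replicate

# Run a hosted model by its owner/name slug; inputs vary per model,
# so check the model page for the exact parameter names it accepts
output = replicate.run(
    "black-forest-labs/flux-schnell",   # example slug from the Flux family
    input={"prompt": "a portrait photo of a black bear"},
)
print(output)  # typically a URL (or list of URLs) to the generated image(s)
```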
I really don't care about the onboarding right now; let's go explore and see what we can use for free, because there must be something, and I imagine it's probably the featured models, since that's how they show them off. Let's take a look at "affordable and fast images". Here you can see about $0.0022 per image, but I'd have to add a payment method to utilize this one, so I'm trying to find one I can use for free. If I can't, I can add a credit card, it's not that big of a deal. Let's check the Playground: "a new way to generate images on Replicate; the Playground encourages rapid-fire experimentation". No, I'd still have to set up billing. I know I've used things for free here, so I'm just trying to find where. Yeah, maybe the free models are just gone. "Does Replicate have a free tier? You can try featured models out for free, but after a bit you'll be asked to set up billing." So maybe the reason I can't use the featured models anymore is that I've had my account for too long. The dashboard says the same thing: create an account and you can try featured models for free, but after a bit they ask you to set up billing. So that's what's happened here.
They're basically asking me to set up billing, so I guess I'm just going to do that; I'll be back in a moment. Okay, I have a card hooked up. At first it didn't seem like there was a way to manage my spend, but there is a spend limit: you can optionally specify a monthly spending limit, and once you reach it, they'll prevent new runs. I'll set $5 as my limit. I wish cloud service providers had that; that would be really cool. So now we have a $5 limit, and the idea is we can use any of these models we want. Let's go to explore and find something fun: generate speech, generate images, make 3D stuff, upscale images. There's just tons and tons of stuff in here that's really cool.
Restoring images sounds really cool, but maybe we'll just do some image generation, nothing fancy. There's "visual instruction tuning towards large language models", vision models with GPT-4-level capabilities, but what's the cost to use it? Down below I can't even tell; before, it showed the price right there, which was really useful. And this model is for looking at an image and producing output ("yes, you are allowed to swim in the lake", given the image), whereas I want to actually generate an image. So back to explore, and we'll use that affordable and fast one; it says $0.0022 per image, which is really small. We'll do a 1024 by 1024 portrait photo: "a portrait photo of a raccoon or a black bear". We run that, wait a moment for it to generate, and there we go, a portrait of a black bear. Wow, that is a really good output; I like that quite a bit, so I'm going to drag it over here. As you can see, that was very, very quick.
Let's check our billing: so far we can see that fraction-of-a-cent charge, which is really cool. Now let's see how easy it is to work with this programmatically; I imagine it will be as easy as anything else we've used. I'm going to go over to our GenAI Essentials repo and quickly add some code. I think I saved the code last time I was here working with Groq; yes, I did, and I might still have an environment up, so I'll open this in the browser. I'm sure you've seen me launch GitHub Codespaces multiple times; if you do not know how, I've got a course on it (I've got a course on everything). On the left-hand side I'll make a new folder called replicate, and in it a new file called gen.py. Then let's go get the API docs.
I want the Python example, and the first thing we have to do is install replicate. I'll make a new requirements.txt with replicate and python-dotenv, since we keep using that, then cd into the replicate directory and run pip install -r requirements.txt. I'll also make a .env file and a new .gitignore, because we want to ignore our .env file; that's something we should obviously never commit. Back over here we have this API key in particular. I'm going to show it on screen, and hopefully Replicate lets me delete keys later; if it doesn't, that would be really annoying. We made that key earlier, right? Checking my API tokens: yes, we generated that one, so I will rotate it out after this video. I'll clear out the placeholder, paste the key into .env, and wrap it in quotes (not sure if we have to). With the install done, I'll go back to the code; the docs suggest this is all we need, and it looks pretty straightforward. The only thing missing is loading the environment variables, so I'll go down to our Streamlit example, which has those two dotenv lines, and grab them.
I'll paste those two lines into gen.py, and let's see if this works: python gen.py. It works, and now we have our output.
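For reference, the whole script ends up being roughly this shape. This is a sketch rather than the exact file from the video: the model slug is the "affordable and fast" image model we were using, which may change over time.

```python
# a minimal sketch of the Replicate Python client with dotenv; assumes
# REPLICATE_API_TOKEN is set in a .env file next to this script
import replicate
from dotenv import load_dotenv

load_dotenv()  # the replicate client reads REPLICATE_API_TOKEN from the env

output = replicate.run(
    "black-forest-labs/flux-schnell",  # illustrative model slug
    input={"prompt": "a portrait photo of a short-haired dachshund"},
)
print(output)  # URL(s) pointing at the generated image
```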
We didn't change any of the default parameters, so it clearly generated from the defaults. Let's set the prompt to a portrait photo of a dachshund, since that's the kind of dog I have, a short-haired dachshund, delete the old output, and wait a moment for it to generate. It did a really good job; it kind of looks like my dog, though my dog's a bit fatter, so I'll say "a short-haired fat dachshund" and see if it gets even closer. Yeah, that's looking a lot like my dog now (again, mine is a bit fatter), but that's an example of that generation; very, very similar. Our example works great, so we'll commit it as "replicate". You can see how Replicate could make things really easy if there are very specific services you want to utilize; it's just dead simple. I'm going to go delete, or I suppose disable, my API token, and there you go, that's Replicate.
Hey everybody, it's Andrew Brown. We're going to be taking a look at Cloudflare,
as they do have an AI offering. I believe they have a vector database and also a means to deploy open source AI models, and I would love to take a look and figure it out together, because Cloudflare is super fast and I hear they have really good pricing; we might compare that here. I'm going to get my head out of the way and log in. You would have to create a Cloudflare account first and also hook up your credit card; I do not believe these features are in the free tier, but we'll take a look and see what we have. Under the AI category on the left-hand side we have Workers AI, Vectorize, and AI Gateway, and you can see they have a bunch of other services as well, so they're definitely growing as a company. Maybe one day they'll be a full cloud service provider just like AWS, but right now I would call them a cloud platform with a lot of cool stuff. Let's look at the first one. Workers AI offers a catalog of AI inference models that you can access from a Worker or via a REST API. "Worker" sounds like compute to me, so let's find out: Cloudflare Workers provides a serverless execution environment that allows you to create new applications. So it sounds kind of like Azure Functions; it's just serverless compute.
Below there's a guide to build and deploy a Llama 3 Worker; Llama 3.2 1B would be really nice to deploy, and then we can use the REST API, which is very clear. We also have some examples down below, which is really cool. Speech-to-text would be really sweet; I wonder if it uses Whisper, as I really do like Whisper. Opening it in another tab: it is using OpenAI Whisper, so that could be a very specific use case I'd like to use. However, I'm not sure how easy it would be to demo the code samples; actually there is one right here, served up using an example from Azure. We could do that, but it might be hard to build something with microphone capture, so maybe we'll just stick with text for now. Let's go back up.
What I'm interested in is Llama 3.2. We have models listed down below, so let's expand that and go into text generation. Gemma is good as well, but since I've been using Llama 3.2 as a baseline, I think we'll stick with it and go with the 1 billion parameter model. "Launch the Workers AI playground; try out the model in the playground": let's check that out first. It's just a playground, which is cool; I didn't know they had one. We'll say "hello, I need help studying Japanese". It's not going to be very good at this, but we'll give it a go, and there we go, the assistant replies. It has a little bit of a different interface, and we have some options here, not tons. Over here we also have the code if we want to utilize this: you can deploy a Workers AI Worker using the current playground's message settings. So we do have code, but the question is how we would go about deploying it.
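Before deploying anything, it's worth knowing you can also hit Workers AI directly over its REST API. Here's a minimal sketch of what that could look like; the account ID and token are placeholders, and the model slug is the one shown in the catalog at the time, so check Cloudflare's docs for current values.

```python
# a minimal sketch of the Workers AI REST API; ACCOUNT_ID, API_TOKEN, and the
# model slug are assumptions -- verify them against Cloudflare's documentation
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
API_TOKEN = os.environ["CF_API_TOKEN"]
MODEL = "@cf/meta/llama-3.2-1b-instruct"

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, I need help studying Japanese."},
    ]},
)
print(resp.json())  # the generated text is nested under the "result" key
```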
Good question. Going back up here, I bet if we press this button it would automatically start a deployment. Not necessarily; there is a deploy button down below, but I'm wondering how I'd do a custom deploy. Obviously I can press this button, so let's try that. It performs a deployment and actually tells us where it's going to deploy, up here. That was really fast; I cannot imagine it's ready for inference that quickly, but it's really interesting how quick it was. Obviously I would like to develop the project locally for sure. It says to install Wrangler; I didn't even know what Wrangler was, so let's look. Wrangler is Cloudflare's command-line tool for building Cloudflare Workers, so maybe this is the way we would actually build a Worker: npx wrangler init my-worker sets up a new Worker, we get some configuration, and we have development and deployment commands. It seems like this is how we would do it, and presumably we'd drop in that code block. We'll come back to that in a moment, as right now we just want to deploy this Llama 3.2 template. I'll go back (I've lost the page, but the back button works) and continue over to the project.
It's under Workers & Pages. Over on the right-hand side we have my account details; I'm not sure which plan I'm on here, so just be aware of that. Down below we can see the version history: one from a minute ago and another from a minute ago. I'm not sure what the difference is; maybe they're just showing builds over time. I can't click into anything there yet, which is totally fine, maybe because I never hooked this up to a real GitHub repo, since this is the build history and I imagine it has to be linked to a repo. We can deploy different versions if we need to, and up here we have our deployment. So my question is: it's deployed, but what does the code look like? Let's go over here and edit the code.
I can imagine this is literally the code that's been deployed, and if this is like Workers generally, this is our single-file entry point. Okay, this is starting to make sense: it basically feels like Azure Functions or AWS Lambdas. It probably already has these libraries pre-loaded; we have env.AI.run, so we're literally just calling it and working with it. The idea is that it already has an endpoint that's accessible over here, and it already shows some input and output: the input prompt says "tell me a joke", and the response comes back ("here's one", etc.). I didn't tell it to open that embedded preview; I didn't know you could embed it like that, but that's really cool. My question is still: if I wanted to use this myself, how would I go about doing that?
Going through this again: probably what I could do is produce my own curl, since this looks kind of like the workload, but how would we go about triggering it? There's also the developer flow we saw before: build, preview, and deploy your Workers from the Wrangler command-line interface. I think what that does is initialize a new project based on this already-existing project, so maybe that's what we want to do to bring it down to the ground, and then we can push it to our own repo. So I'm going to open up VS Code and give this a go. I obviously have a random project open on the left-hand side (it's not random, it's all our Japanese language learning stuff), but we'll open a new terminal and drag it up. I don't have Wrangler installed, so I wonder if I have to install the Wrangler CLI first; I'm just going to run it anyway.
I'll cd into my sites directory, or wherever you want to place this, because I think npx will pick up whatever it needs. It says it needs to install the following package: create-cloudflare. What is create-cloudflare? Looking it up: C3 provides a CLI command for creating Workers and Pages projects, npm create cloudflare, powered by the create-cloudflare package. I'm just not sure if that's the same package or a slightly different, specialized one. There's also a notice that Wrangler version 2 is only receiving critical security updates and they recommend migrating to Wrangler 3, so there might be some overlap; this isn't uncommon, AWS has CLIs that do the same thing too. Over here it asks in which directory we'd like to create the app. I didn't actually choose any of these options, but it looks like it's working through it, and all that stuff is fine.
I guess because it's an existing project, it's just defaulting to what's already there, so we'll give it a moment to complete; I'll be back in just a moment. All right, it's installed, and now it's actually running npx wrangler, so clearly create-cloudflare is a precursor that sets up Wrangler, and now it's doing quite a bit of work. I'm really loving this experience; it's really, really nice. It opened an OAuth page on a local port, so we're just granting Wrangler access to my account, and that's absolutely what we're doing: access granted. I'll make my way back over to VS Code. It should now know we're logged in; sometimes login can mess up, but it's okay. I'll close that tab, I don't think it's required. The point is that we have authenticated.
We'll give it a moment here; it might be stuck, so I'll just pause and wait... there we go, it did eventually work, which is great. So wrangler login logged in as my account (not sure why I capitalized the A, but that's totally fine), it downloads the existing Worker files, and asks "do you want to use git?". Absolutely, yes, hit enter. It's really interesting that the interface is so advanced right in the terminal; I'm not sure how that would work on a remote server, but it probably would. And our application was successfully created locally. It tells us to change directories, which of course we'd love to do, and we have npm run start to start a dev server. I'm not exactly sure how that works given we're running AI: the local runtime itself probably doesn't contain the Llama runtime; they probably have other servers that it connects to, which would explain why the deployment was so fast. That could also explain the pricing being based on what you use rather than time running, which we'll find out in a moment. We have npm run start and npm run deploy. Something that's really hard over at AWS and other places, Azure and Google included, is local development, so if this is like Lambdas but npm run start just works, that would be really, really amazing. I'll go down below, and I actually just want to reopen this in VS Code.
Let's take a look at what our code sample looks like. We've opened a new VS Code project; I still want to keep the other one around, so I'll just move it out of the way. We have a package.json, which is pretty straightforward; I'm curious what it installed. Just wrangler; that's really nice. We have an index.js that looks like a simple function taking a request and an environment, absolutely a serverless function, and then it's just running inference. Really, that's all there is. We have a wrangler.toml to provide some configuration; I'm not sure why we need the binding for AI, but maybe without it the environment won't let us use it, and if you want observability it's just a flag there, plus it declares the entry point. It would be great if it could take Ruby (I love Ruby), but if it's only JavaScript that's totally fine; when we're dealing with serverless functions deployed to remote edges, JavaScript seems to be the choice. I'm not sure why, but there's probably a technical reason for it, so I can understand why we might be limited to plain old JavaScript. That's all really straightforward, so I don't feel I need to change anything here, but I would like to figure out how to actually test this.
I'll open a new terminal, because this runs online but we're not running it locally, and it'd be interesting to try that. I'll type npm run dev, which will probably start on some kind of port; while that's starting I'll open a new tab. It says it's collecting some information, starting the local server, and offers to open dev tools. I've never seen such advanced terminal stuff; it's really impressive. I'm hitting D for dev tools but nothing's happening, and I'm not sure why... oh no, it was just delayed, I was just impatient; it's working. The only thing is, right now this isn't going to do anything, right? Oh, it actually does. Okay, great, it gave us back output. Interesting, this is the developer tools, but I'm not exactly sure what to do with that and I'm not really interested in it. What I want to know is how we do inference.
have to send it a payload what is it expecting let's see if we can figure it out by just looking at it so if we go here we can see that um we already have a chat built into here which looks like it's hardcoded because we have system and user and so it's already set up to give out a response so you are a helpful assistant who won the World Series uh in 20 in 2020 right so if we were to infer this even if we weren't to uh send it any payload because it doesn't
appear to be including any payload here it should um well hold on a second hold on a second because we have two things here we have llama 3.8 instruct simple so simple uh simple single conversational style input and then we have chat so there's actually two things that are going on in here um so we go back to that uh over here let's try D for developer tools I'm just going to be patient here and wait just a moment I think what's that what that will do is it will show us the output nothing's opening
so maybe it's maybe it's B open a browser so I'm G to hit B to open this in the browser yeah so it's b d is not doing anything for me that's totally fine so we'll give it a moment here and uh I think this is just built into the browser and so here we have inputs tell me a joke and here's the response input systems your helpful assistant you won the series okay so basically they just have two different examples is what we're seeing here right um and so we might just want to narrow
We might want to narrow this down and say: I just want one. I don't want two tasks, just the single input, so I'll remove one. We have the input chat and the response; do we need to return the whole chat? We could output it if we want, but honestly I would rather just return the response, so I wonder if I can return response directly and take the rest out. I'll save (notice the dev server updates) and hit refresh; it should either give us an error or work, and we should monitor it, since I'm not sure what happens when you get errors with this. It says "response is not defined"; that's totally fine. It's possibly complaining because it's not a JSON object, since it's trying to serialize JSON and that's a string, so I'll wrap response in an object, and that might fix our issue.
Again, I don't know any of this; it's my first time using Cloudflare Workers specifically with AI. I have used Cloudflare, it's just been a while. At least it tells us exactly where the error is: "response is not defined" is still an issue, and I'm not sure why, since it's clearly defined right there. Let me undo for a moment: I'll take this one out, and with inputs, chat, and response, let's remove just one for now and see what happens, because this one should work. I want to pare this down to the minimal stuff we need. It's still saying "response is not defined", but it is defined, so I don't really trust what it's saying. I'll stop the server and start it again... you know what, it doesn't have "let" in front of it; that's the problem. I think the previous one was declared, we overrode it, and that's why it wasn't working correctly. So I'll go back over here with more confidence that this might work now. I don't know if it needs to be wrapped in a JSON object, so that might be another problem, but that error should go away this time. If we just get back a string, that would be nice... and we do: it wraps it in a response. So that's a very simple way to get a response back.
But right now this isn't super useful, as we don't really have any inputs. So what I might do is console.log the request; I'm curious whether it will actually show up down here, I really don't know. I'll also console.log the env, and give it a query string: ?hello=world. I'm not sure it can handle query strings, but we'll give it a try, and we do get output, so that's good. Scrolling up, what I'm looking for is where my hello=world is. We have the URL, and this is the request; I'd assume it would be in the request, so looking carefully we've got host, user agent... maybe it doesn't take GETs, maybe it only takes POSTs. I don't see it, but I'm sure that's not hard to figure out; we just need to know how to pass something along. So let's look at how to pass parameters to Cloudflare Workers.
It must be really easy to do; we could probably even ask ChatGPT or something and save ourselves some time. So over in ChatGPT I'll ask: I'm using Cloudflare Workers; how do I pass parameters in a request? One thing is, how do I know whether it's POST or GET? Maybe it works with either; and can you use query string parameters? It says you can: if you're passing parameters via query string, you can parse them using the native URL class in the Worker. Well, we are using a query string, but it's probably not the most ideal way; we should probably be utilizing POST input. So let's take a look; I'm going to try to bring this code over, and I know you can't see it, but I'm carefully looking at it for a moment. It shows this addEventListener style, but it also has some more concise way; maybe the addEventListener version is older code, the way you used to do it, or for an older version of Node.js, I'm not sure.
The first thing is that we should make sure the request is a POST; that's not a bad idea. So I'll go here and say: if it's not a POST, return this response, "send a POST with JSON data". This is probably not the best way to do it, there might be a proper way to handle errors, but I'm just going to do this. Then, in the request, apparently we can just get the data this way, supposedly, and we could also expand the data out so that if there are specific parameters we can pull them out individually. So all we need here is a user_input field, and I'll put that into the messages. I'd also give it a better system prompt. I'm trying to think of something it can do in a conversation, because we don't have an interface showing what the first message was, and this is a chat conversation (I almost should have done chat completion here), so I'll just say "help me with Japanese language learning". I'm not sure if I spelled "language" right, I always spell that wrong... yep, apparently I got it right for once, so we're bringing that in.
I'm hoping this works. The only thing is that we need a way of actually doing inference against it, so I'm going to make a new bin directory with a bash script called... infer, or test, or chat? Let's just call it chat. I need a curl command and I really don't want to write one from scratch, so: can you give me a curl command that goes to localhost on... what port is it? Now I've lost it, I've got too many windows open... here we are: port 8787. So localhost:8787, a POST, with a payload containing user_input. Let's see if ChatGPT understands that. Yeah, that's basically what I want. Can you make it a bash script? I just can't remember the shebang; there's a particular shebang that I like, but we'll take either one
as long as it works. Oh, that's way too much code, but we'll take some of it. I'll grab the shebang (not the one I like, but it will probably work), and it actually lets us provide user input as an argument, which is kind of sweet; I like that. You know what, I'm going to take the whole script, because it does what I want. It's a little verbose, and maybe we can pare it down in a moment, but it takes user input as an example argument, it has error handling that we can swap out very easily (something we might want to do shortly), and it builds and prints the JSON payload before passing it along. I think it's fine. Back in the terminal: by the way, if you want to utilize this, I'll place it in the repo so you don't have to re-create this stuff.
You'll find it at github.com/ExamProCo/GenAI-Essentials. Okay, I've made the script executable so we can run it, and this should do local inference. I ran it and nothing happened; it didn't echo anything, so did it work? Let me just print something, echo hello... oh, you know what, I didn't give it input. So we'll say "how do you say eat in Japanese". Now it says missing user input argument; are you sure about that? Maybe we need to wrap it in quotes. Let's try again; if this works, I'd be very happy. It's running slowly, so I imagine it's trying to do some inference... look, it actually does have it: the local server is reloading, and it received "how do you say eat in Japanese", which is good. I'm waiting for a response, but it's just hanging, which makes me think it's stuck trying to do the curl and get the response back. Oh no, there we go; it was probably just generating, it was just taking time. That makes sense. "In Japanese, there are several ways to say it..." So we've figured out how to implement it this way.
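If you'd rather test from Python than bash, the same call is a couple of lines with requests. This is a sketch assuming the Worker reads a user_input field from the JSON body, as we wired up above, and that the dev server is on the default port 8787.

```python
# a minimal sketch: POSTing to the local `npm run dev` server; assumes the
# Worker expects a "user_input" key, as set up in this walkthrough
import requests

resp = requests.post(
    "http://localhost:8787",
    json={"user_input": "How do you say 'eat' in Japanese?"},
)
print(resp.json())  # e.g. {"response": "..."} given how we wrapped the output
```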
Let's say we wanted to deploy that change. I imagine we just do npm run deploy, because I saw that earlier, and while that's deploying I'll exit this out. I'm not sure how fast deploys are, so I'll pause and wait. All right, after a few minutes it's deployed, and it's at this address. So to test the deployed Worker, I think we just copy this address, go back to our chat script, and swap it in where the URL is; I'll comment out the localhost one and try this again, so now it should be inferring against the deployed endpoint; we'll give it a moment. Remember, this is happening on the server now, so those console logs we had before won't show here; but we got a response, and it clearly worked. This is super easy; I can't believe how easy this was. What I'm interested in checking next is whether we have any visibility or monitoring of what we've done, so let's go back over to the dashboard.
I want to look at this Worker in particular. Now, this repo isn't pushed anywhere; I did create a .git directory, but I'll probably end up deleting it, because it would have to stand alone and I don't want that. The metrics page is in beta, which surprises me, but we are getting some endpoint history. I'm not sure how useful this is right now; it's fine, but where's my logging? That's what I would actually be interested in, since we obviously have some console logs. Oh cool, there are direct logging integrations here; that's nice. I'm not really certain where we'd find the direct logging, but it's nice that it pulls up here. That works, so I think we would call this done. There are a few other offerings under AI: we have Vectorize, which is probably a vector database, and then we also have AI Gateway. But I'm going to tear this down, as I'm going to consider this done.
(As I said, I did create a .git directory, but I'll probably end up deleting it, because I want to drop this into the main project.) Oh, there's a daily usage limit: we were actually running on the free allotment so far, which is pretty sweet. I did not know that; I thought maybe we were on paid because my credit card was hooked up, but apparently we're on free tier usage. From this page it's not very clear what we're using. The other page we saw is for selecting models and deploying them very quickly, so basically just templates to deploy; that's very straightforward. What I'm going to do is go into this project and tear it down, since I'm done with it. I don't believe it costs us anything, but it is kind of open to the internet, and I'm not exactly sure how to protect it; maybe that's where you'd put it in front of that other Gateway.
If we have to spin up another one, we'll do that shortly. For now, we'll go ahead and delete it; the Worker is deleted, so we don't have to worry about spend. I want to get this project into our GitHub repo, github.com/ExamProCo/GenAI-Essentials. I'll open that up, go back to the VS Code project I had open, and rm -rf the .git directory, since we're not going to use it that way; I just need to open this up in the file explorer and bring the code over. I really have to close some of these other tabs; I have way too many open and it's really confusing me today... there we go, this is the one I want. So I'll make a new folder called cloudflare, and since this one is just the AI Workers example, a new folder inside called ai-workers, and bring in all this code. I don't know if it contains any kind of sensitive information; checking wrangler.toml, as long as it's not my permissions or tokens, I don't think it is. Since the template ships with a .gitignore, I think it's pretty safe, because I'd assume they wouldn't let anything bad get committed.
We'll go ahead and commit it. There's obviously a lot of stuff, and one thing we should ignore: it's not ignoring node_modules, which it absolutely should; we do not want that. I refresh, and no, it's still trying to take all of node_modules. Did I spell node_modules wrong? I don't think I did. It's kind of annoying; maybe it hasn't committed the change to the .gitignore, so let me add the .gitignore by itself and commit just that one. Sometimes this stuff is finicky. Well, I did that, and now we have all these other changes; I just wanted it to ignore node_modules and I don't know why it won't. Do I have to go top-level for this? I'll make a top-level .gitignore... oops, I didn't mean to make that a folder; we'll try one more time... and copy the contents over. You know what, this is misbehaving and I don't like what it's doing, so I'm just going to discard all the changes: discard, delete, get out of here. I don't want to do this a hundred times. Even I get frustrated with this stuff all the time, and I know what I'm doing,
and still we have these problems. So here's what I'll do (I did not want to have to open the git tooling here today): I'm going to just clone the repo fresh and then bring the code over that way. New terminal, git clone, and bring it down; I should have permission to do so... oh my goodness, I hate this so much. Give me a second... all right, I'm back in here, and that might fix a lot of my problems. I'm just hoping there's no sensitive information in here, since the .gitignore wasn't acting properly, but I'll go in, drag the files over, and say replace. Expanding it and checking: is there anything sensitive in here? No, not really; no specific configurations except the name of the project up here, which I don't think is an issue, but I'll just rename it to "replace-me" anyway. Now we can go ahead and commit this: "AI Worker example for Cloudflare". We'll call that done, and I'll see you in the next one. Ciao.
Let's take a look at TPUs, which stands for tensor processing unit. This is an AI accelerator, an application-specific integrated circuit (ASIC), developed by Google for neural network machine learning using Google's own TensorFlow software.
It's hard to talk about TPUs without also talking about TensorFlow, which we cover elsewhere in this course. TPUs are designed for a high volume of low-precision computation, and Google has been making them since 2016, so getting close to ten years now. Here's an example of a TPU v3 and a TPU v5p. If you really like the TensorFlow framework, Google Cloud is going to be a great place for you to hang out, because they have TPUs you can spin up there. So there you go.
Let's talk about iGPUs, which stands for integrated graphics processing units. This is when a CPU contains capabilities for performing tasks similar to a dedicated GPU. An example might be something like the Intel Lunar Lake chip, which contains multiple systems, including an iGPU. Here is a picture of the actual chip itself; sometimes I like to say CPU for a chip, but a chip can contain many systems. Obviously it is a central processing unit with other processing units within it.
If we look a bit closer, we can see the die, the actual pathways on the chip, and all these regions map to very specific functions, kind of like your brain, where one section is specifically the GPU. You'll also see NPUs, which are specific to AI workloads, so more modern chips (the ones we mean when we talk about AI PCs) have all this stuff built in to allow you to run AI models on your laptops and your mobile devices. There's another term, dGPU, which stands for dedicated graphics processing unit; that's just your normal graphics card, but you'll sometimes see the term when you need to distinguish between iGPUs and dGPUs. How well do iGPUs perform? Surprisingly, extremely well. A dedicated GPU is obviously going to be better, but you can get a modern machine with an iGPU that performs like the last or second-to-last generation of graphics cards, so it becomes extremely affordable and very good. If you can get a machine with an iGPU in it, you absolutely want to, though anything you buy these days generally has one. And this isn't limited to Intel; AMD has them, and Apple's chips with Metal do as well. There you go.
We also have VPUs. A VPU stands for visual processing unit: an AI accelerator specialized in machine vision tasks, so think of cases where you need to run convolutional neural networks.
Intel has something called Movidius; I think Intel bought the company in 2016, and it was the known name for VPUs, so now Intel owns it. The idea is that it looks like a USB stick: you just plug it into your computer, and now you have a VPU you can utilize anywhere. Pretty straightforward; there you go.
All right, I want to talk about two things Intel has with AWS: the first is the Intel Xeon Scalable processor, and the second is Intel Gaudi. AWS of course works with, and purchases hardware from, other
companies (they use AMD and NVIDIA), but I think Intel is worth mentioning in a little more detail, because every time I go to re:Invent, Intel has a big giant booth, and if you scour the AWS website, it just looks like AWS works more closely with Intel than with the other providers. Not to say Intel isn't being utilized on GCP and Azure and others, but I've just noticed something more going on with AWS. First, Intel Xeon Scalable processors: these are high-performance CPUs designed for enterprise and server applications, commonly used in AWS instances, and that scalability makes them very good for machine learning, so you're often going to be using Intel Xeon processors on AWS whether you know it or not. Then there is the Intel Habana Gaudi processor, which is specialized for AI training; you could say it's a direct competitor to NVIDIA, or a similar competitor, because they do something very similar. I believe Intel Gaudi has its own SDK, called SynapseAI, that you can use to interact with it, so you'd launch SageMaker and then use that SDK in order to best utilize the hardware. Both of these pieces of hardware are offered on AWS, and I think it's good to know them, at least to be able to name what they are. Okay.
Hey, this is Andrew Brown, and let's talk about GPUs. I'm sure most people here know what GPUs are, but I'm going to talk about them anyway, because I want to talk about CUDA.
A GPU is a graphics processing unit: a processor specialized to quickly render high-resolution images and videos concurrently. If you've ever played video games, you know you need a good GPU, because it's all about those images. However, GPUs can perform parallel operations on multiple sets of data, so they can also be used for non-graphical tasks, and this makes them really good for machine learning and scientific computation. So if you're trying to convince your significant other that you need a better graphics card, you can just tell them it's for work: "I need it for machine learning and scientific computation." It's not your fault you can also play video games with it. We have a graphic there on the right-hand side, I think from NVIDIA, trying to demonstrate the difference between parallelization on a GPU versus serial tasks on a CPU. Reading a little more: CPUs can have on average 2 to 16 processor cores, while GPUs can have thousands of processor cores (how that works, I have no idea, but I know that's how it works), and a multi-GPU setup can provide as many as 40,000 cores. That is a lot. GPUs are best suited for repetitive and highly parallel computing tasks, such as rendering graphics, cryptocurrency mining (if people are even still doing that), and deep learning and machine learning. So there you go; that's GPUs.
All right, let's take a look at CUDA, but before we do, let's talk about NVIDIA. NVIDIA is a company that manufactures graphics processing units for the gaming and professional markets.
If you've ever played video games and built your own rig, a lot of people like to choose NVIDIA, but NVIDIA can do things other than video games, and this is due to CUDA, which stands for Compute Unified Device Architecture. It's a parallel computing platform and API (I said framework, but I guess it's an API) by NVIDIA that allows developers to use CUDA-enabled GPUs for general-purpose computing; you'll see the term GPGPU, meaning general-purpose GPU, and I know that's a mouthful. Over on AWS they have a bunch of instance types that can utilize NVIDIA GPUs. AWS is always changing instances, so these could be old, but you can see the P3 with the Tesla V100, the G3 with the Tesla M60, the G4 with the T4, and the P4 with the A100. These are probably older ones, and there are newer instances with newer NVIDIA cards, but my point is that AWS has GPUs you can utilize.
Another thing I want to point out with CUDA is that all major deep learning frameworks are integrated with NVIDIA's deep learning SDKs. There's a big fight, a war even, among the companies that make GPUs and CPUs, because they all really want theirs to be the one used for machine learning; you can be sure AMD has some kind of similar offering, and definitely Intel as well, but NVIDIA has done a very good job of making sure theirs is the most popular. The NVIDIA Deep Learning SDK is a collection of NVIDIA libraries for deep learning; it's what you can use with CUDA to interact with their API. One of those libraries is the CUDA Deep Neural Network library (cuDNN), and it's tuned for a bunch of stuff. If it looks like this is getting a little too technical, it's because this slide was from my AWS machine learning specialty course and I didn't do a whole lot to change it when I brought it over. You don't really need to know that last part; just understand what CUDA is, that it's very important for working with machine learning, and that AWS has good offerings for instances with it. Okay.
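As a quick practical aside, here's one way to check whether CUDA is actually visible to a framework; this sketch uses PyTorch simply because it's a common way to exercise CUDA from Python, not because the slide mentions it.

```python
# a quick sanity check that a CUDA-capable GPU is visible from Python
import torch

print(torch.cuda.is_available())           # True if CUDA can be used
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "NVIDIA T4"
    x = torch.randn(1024, 1024, device="cuda")
    print((x @ x).sum().item())            # a matmul actually run on the GPU
```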
Hey, this is Andrew Brown from ExamPro, and we are looking at TensorFlow. TensorFlow is a low-level deep learning and machine learning framework created by the Google Brain team. TensorFlow is written in Python, C++, and CUDA, and there are APIs that let you use various other languages. TensorFlow is all based around the idea of a tensor: a tensor is a multi-dimensional array, called tf.Tensor in their API, and it's similar to a NumPy ndarray of objects. tf.Tensors can reside in accelerator memory, like on a GPU, so they're basically a data structure that's very specialized for machine learning, and Google has created its own hardware, the tensor processing unit, specifically optimized for TensorFlow and the tensor data structure. The way you write TensorFlow is in Python.
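To make the tensor idea concrete, here's a tiny sketch of creating and using a tf.Tensor; nothing here is from the slide itself, it's just the minimal demonstration of the data structure.

```python
import tensorflow as tf

# a tf.Tensor is a multi-dimensional array, much like a NumPy ndarray,
# except it can live in accelerator (GPU/TPU) memory
t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(t.shape, t.dtype)    # (2, 2) float32
print(tf.matmul(t, t))     # runs on a GPU automatically if one is available
```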
An example of an ML model in TensorFlow is here on the left-hand side; technically, this is Keras. Keras is a high-level abstraction over TensorFlow, and the difference between Keras and TensorFlow can be confusing initially, but they're essentially the same thing these days, because Keras is packaged with TensorFlow. For the Google Cloud Platform specifically, they offer TensorFlow Enterprise: accelerate and scale ML workloads on the cloud with compatibility-tested and optimized TensorFlow, along with enterprise-ready services and support. Okay.
Here's an interesting technique I saw that I thought was worth mentioning: Medusa, which adds extra decoding heads to an LLM to predict multiple future tokens simultaneously. The idea is that it increases prediction performance, so you get output a lot faster. Very specific models work with Medusa; one is called Zephyr and the other Vicuna, and the technique can even work on setups that are GPU-poor, though again, only with those specific models. I just thought it was very interesting that by adding multiple heads to an LLM, it can predict things a lot faster, because you have multiple heads making predictions at the same time.
All right, let's take a look at FlashAttention, which is a memory-efficient, faster variant of the traditional attention mechanism, optimized for GPUs to handle longer sequences with reduced computation and memory overhead. The one thing I want you to remember is that it is a better version of the traditional attention mechanism. There's a really cool diagram, and if you want to read more about it, you can go to the repo.
It comes in multiple variants: one, two, and three. Three is not necessarily better than two; it depends on what you're doing. If you're utilizing Hopper GPUs (a type of GPU produced by NVIDIA, like the H100), you'd want to take advantage of FlashAttention-3. FlashAttention achieves its efficiency by computing attention scores in small chunks, fusing operations like softmax and matrix multiplications to minimize memory use and speed up computation; they're doing a bunch of big-brain stuff there to make things more efficient.
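In practice you rarely write the kernel yourself; you opt in via your framework. Here's a sketch using PyTorch's scaled_dot_product_attention, which can dispatch to a FlashAttention kernel on supported GPUs. The exact opt-in API has shifted between PyTorch versions, so treat this as illustrative.

```python
# a sketch: asking PyTorch to use its FlashAttention-backed kernel; requires a
# CUDA GPU, and the sdp_kernel context manager API varies across torch versions
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)

with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    # fused attention: no full seq-by-seq score matrix is materialized
    out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 2048, 64])
```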
Just remember that FlashAttention is more efficient, and you will see that term come up again and again, especially when you are fine-tuning.
Let's take a look at LitGPT. This is a CLI tool to pre-train, fine-tune, and deploy LLMs at scale, and you can find it in its GitHub repo. Here are all the CLI commands it offers; as you can see, it can fine-tune. Note that the finetune command defaults to LoRA if you don't specify it, which confused
me initially, but I figured out what was going on there. It has from-scratch implementations with no abstractions, it uses FlashAttention, it utilizes Lightning Fabric, it supports fully sharded data parallel training, and it allows you to do PEFT: LoRA, QLoRA, Adapter, and Adapter v2, with 20+ LLM recipes. LitGPT also has a Python API in the works, so in the future you'll be able to train within a Jupyter notebook; that was something I wanted to know how to do, but I had to use the CLI, which is totally fine as well. Here's an example of training (see the sketch below). Imagine we want to run finetune_lora; if we took off the _lora it would still do LoRA, but I like to be explicit. Say we want to fine-tune Llama 3.2 1B: we can provide it a dataset as JSON, say how many epochs we want to run, set the output directory and the precision we want to utilize, and when the model is done training, it will write the output to the directory we told it to.
And we can use LitGPT to serve the model and do inference; I'm not sure if we included LitGPT in our serving section, but obviously you can serve models with it as well, and there are a lot of servers out there. For this specific example, where we fine-tuned and deployed a Llama 3 model, I used Lightning AI (I think we have a video where I show how to do this) with one NVIDIA L4 Tensor Core GPU, 16 vCPUs, 64 GB of RAM, and 24 GB of VRAM, on a dataset of 6,471 examples, and we were able to train it within 12 minutes. I want you to have an idea of how long it actually takes to fine-tune a model; you can see you can do it for free within 12 minutes, so fine-tuning, at least with LoRAs, is totally within the realm of reality. If you are doing a full fine-tune of a model, that gets a lot more expensive and is going to be a lot harder to do.
If you were to deploy this to EC2, the closest equivalent would be a G6 instance, a g6.4xlarge, and this would cost us about $2; so that would be your closest equivalent there. Okay.
Let's take a look at quantization, which is a compression technique that converts the weights and activations within an LLM to a lower-precision data type; an example might be converting from fp32 to int8. Here we have an example of quantization applied to a signal. Now, this isn't what we would be doing, because we'd be changing the weights and activations within an LLM, but it's a visualization I've seen a lot, and it's how I remember what quantization is. Imagine you have this waveline (see the blue waveline): it carries a lot of data because it's nice and smooth, and as we reduce the precision we get this blocky version, but the idea is that it still has the same shape, so it should perform the same way. Of course, we're not actually working with signals here.
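The same intuition applies to weights. Here's a toy sketch of symmetric fp32-to-int8 quantization on a handful of made-up numbers, just to show where the "blockiness" (rounding error) comes from; real quantization schemes are far more sophisticated.

```python
import numpy as np

# toy symmetric quantization: float32 values -> int8 and back again
w = np.array([0.91, -0.44, 0.06, -1.20, 0.33], dtype=np.float32)

scale = np.abs(w).max() / 127.0                        # one scale per tensor
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale                   # dequantized copy

print(q)      # small int8 values: 4x less storage than fp32
print(w_hat)  # close to w, but with rounding error (the "blocky" signal)
```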
But hopefully that visualization helps you remember what quantization is. Why would we want to quantize our model? Smaller models mean a smaller size footprint and faster inference; though not always, as I learned by actually doing quantization for real: when I quantized a model, I noticed the files didn't get smaller and the inference wasn't faster. However, quantization will sometimes greatly reduce the amount of resources used, like RAM; you might cut RAM use totally in half, and so quantization is worth it. The disadvantage is a potential loss in quality, because you are compressing the data. Just think of a JPEG: if you lower the quality, you get a smaller file that loads quicker, but it's not as faithful as the original. Examples of quantization would be QLoRA and also GGUF files. Often when I see quantization, it looks like a big mathematical formula that I personally couldn't do, only someone who is a data scientist could, and quantization techniques greatly
vary okay so um you know seeing one does not look like the other it is not something for us mere mortals it's something for uh maybe that Rola can do um and I if I had time I'd have a followup video because she knows quantization very well well but um yeah that's [Music] quantization let's take a look at knowledge distillation and the reason I want to talk about this is because you'll come across models that are called distilled models and that implies that they've gone through this knowledge distillation process so this is when you transfer
knowledge from a large model to a smaller model, so that the smaller model performs the same task faster and at a lower resource cost. knowledge distillation's goal is generally to produce a small language model, because you're making a faster, smaller, more efficient model. it is a complicated process, and the greatest simplification I can give you for knowledge distillation is that you have predictions made by the larger model and you have ground-truth data, and between these two things, which we call soft targets and hard targets, we use that information to train the smaller model to do something that looks similar to the teacher model.
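here's a minimal sketch of what that soft-target/hard-target mix can look like as a training loss, in PyTorch with toy tensors; the temperature and mixing weight are illustrative hyperparameters, not values from any particular distilled model.

```python
import torch
import torch.nn.functional as F

teacher_logits = torch.randn(4, 10)                 # pretend: frozen teacher outputs
student_logits = torch.randn(4, 10, requires_grad=True)
labels = torch.randint(0, 10, (4,))                 # ground-truth (hard targets)
T, alpha = 2.0, 0.5                                 # temperature, mixing weight (assumed)

# soft targets: match the teacher's softened prediction distribution
soft_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)

# hard targets: ordinary cross-entropy against the real labels
hard_loss = F.cross_entropy(student_logits, labels)

loss = alpha * soft_loss + (1 - alpha) * hard_loss
loss.backward()                                     # gradients flow into the student only
```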
one example, performance-wise, is Nemotron: Nemotron is a model that has been turned into Minitron, which is the knowledge-distilled version of it, going from the 15-billion-parameter one to an 8-billion-parameter one and then a 4-billion-parameter one, greatly shrinking the size of that model, but the idea is that it's supposed to perform about as well as the original Nemotron. and so Minitron is
something you might want to take a look at; it uses knowledge distillation and pruning, so pruning can also be part of the process along with knowledge distillation to make those smaller models [Music] okay, hey everyone, it's Andrew Brown, and Rola is back, and we're talking about the most popular topic of all, quantization. so we are going to jump right into it, and Rola has prepared some slides as a talking point here. Rola, would you like to kick us off? yeah, so we're talking about quantization, which is the
process of reducing the precision of the model weights, and that helps us reduce the RAM needed for training and the storage, and the reason that's important is because of the size of these models. so we're going to go through this to give you some idea of how big these things are and why it matters to quantize, and yeah, it is a very hard word to say. so Rola has some GenAI size comparisons, or considerations, that are going to help us out here. do I click on through, am I ready to go? and for those who are watching, Rola could not share her screen, so she sent me the slides and I'm controlling them, but it all works out because we're coordinated here. so, first one: ml models are often sized by numbers of parameters, and that equals the model weights. so what are we talking about here? machine learning models are mathematical models; most LLMs are a neural network, and what that is, if you want to think about it abstractly, is a complex mathematical model, and part of that mathematical model are what we call parameters. that's one of the ways we size models: when you talk about a 7-billion-parameter model, that means it has seven billion parameters, and when you talk about a 70-billion-parameter model, that's the number of parameters we can change in the model. and just to word it another way: if it says seven billion, there are seven billion numbers that can be changed. yes, a parameter is a number, a tunable number; think of it as a knob that gets focused or tuned with the data seen, so that's what really encodes the learning. and those tunable parameters, we call them model weights; they are a number, but they change, so if two
models, the same exact mathematical model or model architecture, see two different sets of data, they'll come up with two different sets of parameters. so the next thing we have: size ranges from one parameter to two trillion. is that about two trillion? yeah, the GPTs have about 1.8 trillion, to be exact; I rounded it up to two, but we're right there. how do you run those? because I'm thinking about running local models, like 7-billion-parameter ones, and you have Qwen, which is like 32 billion parameters, and how would you even run a two-trillion one? that must be distributed, right, there's no way. yes, there's distributed compute: you could shard the model itself, you can distribute the compute, there's a whole lot of ways to do this, and this is why things like quantization came up, because these things are really, really large. okay, GenAI models are in the billions and trillions of parameters. well, clearly they go into the trillions of parameters, and so it sounds like we're never going to get our own local GPT-4o or o1 reasoning model, but we'll see. the more parameters the model has, the more data it needs to see. why is that? well, think of it as a system: it's a mathematical model, and each of those numbers has to be fine-tuned to encode knowledge, and the more of them you have, the more data you need for them to be accurately set, if you want to think about it abstractly. wasn't there some kind of white paper? I remember you mentioning some very important white paper, I
just can't remember the name of it. yeah, what's it called? it's called the Chinchilla paper. I was gonna call it the Cassandra paper, I don't know. no, it's the Chinchilla paper; it's a really cool paper. I think it was the work of a Google team, and the paper looked exactly at optimizing LLMs efficiently, so it looked at the models and how much data they need to see versus how much compute it takes, and so on. it's a really cool paper to look at the efficiencies of training LLMs. well, if I have a few seconds left to record one last video, it's going to be on the Chinchilla paper, and I'll give the worst explanation the best way I can. but good paper. well, worst case, people can read it fully themselves, but my summary probably would not be very good. so on the right-hand side here we have a graph: every year the model sizes increase by 10x, and I assume Moore's
law means it's going to slow down at some point, right, and so I would assume that it gets exhausted at one point. but just looking closely here, we have the Transformer at 0.05 billion, so that's 50 million, and that's the Transformer architecture when they first introduced it, which we talked about in the course. then we have OpenAI GPT, that's at 0.11 billion, so 110 million parameters. then we have BERT, which is at 0.34 billion, yes, 340 million. but you notice that this chart stops at 2021, right? so now that we're in 2024, 2025, models are a lot bigger, and I know that the GPTs are roughly about 1.8 trillion, so we passed the 1.6 that's here. Anthropic, I don't think they released the number of parameters openly, but somewhere around there; we are somewhere around the two-trillion mark. okay, well, I just can't imagine
it will keep going at this rate; we must exhaust it at some point. and I think probably one thing that determines whether they can, well, maybe it's more, maybe it's not, I don't know, but all I know is that with hardware they have nanometers, and the smaller they make it, the more circuits they can fit into it, and they're getting down to like one nanometer, and I'm like, what do you do once you get down to one nanometer, can you do 0.5 nanometers? but they're using a lot of things that are different now, right. so we do understand a lot of things; we understand that the bigger the model, the more it can understand. think of the model as a brain, say a brain in a human: the bigger the brain, theoretically, and that's not true in neuroscience terms, but in an abstract way, the bigger the processing system, the mathematical model, the more it can encode data. but we now know that even if you have a big mathematical model, if it doesn't see enough data, or the data is not good, then it doesn't learn as much. so there's the mathematical model on one point, there's the data on the other point, how good the data is, and then there's the underlying hardware and all of the infrastructure that these systems need, right. so there possibly is a lot more to be done to max out the data side and the creative ways of understanding how to make these models perform well. yeah, and quantization is one way, right, the one we're talking about, from a storage perspective. because these models are really, really large, they've really pushed the limit of hardware, and so there's a lot being done at the level of hardware: we have purpose-built chips for these things, and we're thinking about how we distribute the models, how we distribute the processing, how we shard the models. so there's a lot happening there, but there's also a lot happening at the level of software and the models themselves: how do
we, for example, quantize, which we're going to talk about, or how do we efficiently tune? all of these things, the fact that the size is absolutely huge, has really pushed the limit of what we traditionally know, and that's where all of these new optimizations come in. well, let's try to better understand sizing, and here we're starting to get into a sizing ballpark: we have one parameter at a 32-bit float equals four bytes, and I think we said earlier that a parameter is a representation of a number, and I imagine you could represent that number in many different ways. so here we have one parameter, 32 bits. yeah, so usually for models in general, even predictive ml, the most common way to encode a number on a computer is a 32-bit float, and that takes four bytes, so every parameter, every number you want to save on your computer in this format, is going to cost you four bytes. and when we see fp32, is that what we're talking about, is that the standard? yeah, that's 32-bit float precision, a 32-bit float, which, because every byte has eight bits, is going to cost you four bytes. and so if you have 1 billion parameters, that's four gigabytes of RAM for just the parameters. oh, okay, so we can extrapolate generally what the memory requirements are based on the parameters; I assume there must be more than just that, right, but this will get you into a
rough ballpark. yeah, exactly, there's a lot more than that, but just in base parameters: let's say you've got this model, you have to load it into memory, and it has a billion parameters; well, if you encode each number as 32 bits, which is four bytes, then you're going to need four gigabytes of RAM just to load that thing. and one billion parameters, just for folks that want to ground this to something out there: the one that I use a lot in this course is the Llama 3.2 1-billion-parameter model, because that seems to run on everything, and that makes sense: if it's around four to five gigabytes, most machines will have that available. compute is a different story, but memory-wise it seems like people would have that available. but here it says you need 20 times more space for optimizers, gradients, and activations to train, and I kind of learned this when I was utilizing OpenVINO, which is for optimization; it makes models smaller, not necessarily through quantization, but by optimizing for the target hardware. but I learned that when I trained, they're like, oh, you need to have more; I didn't know why I needed more, I just knew I needed it at the training phase. is it because it's more work, I guess? no, it's because you're trying to tune all of these numbers to the right number they need to be for the learning to occur, and there's metadata that comes with that. so what the model does in
a training process is it starts random: let's say you've got one billion parameters, one billion knobs to tune; you initiate them all at random, and then you make a guess, a prediction, then you calculate how far off you are. this is in supervised learning: you calculate how far off you are from the actual real answer, and then you tune the knobs to get closer, and you keep iteratively doing that until you've tuned all of your numbers. and so there's a lot of metadata that comes in calculating all of these things, calculating how far off you are and calculating your gradient and how much you have to shift back or forward, and that's estimated at about 20 times the model size. so is it because there are a lot of operations being performed around adjusting the model weights, but also, do you have to hold in memory multiple iterations of model weights, like if you're going through multiple epochs or multiple iterations, does it have to hold that or write it to disk? you definitely need to hold the current iteration so that you can calculate the gradient and move on; you could choose to keep them all for historic purposes, or for you to understand how the training went, but you don't necessarily need to. okay, yeah, that makes sense. and so we have a 1-billion-parameter model and we are training it, and we're talking about full fine-tuning, so we're adjusting all the parameters.
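as a quick sanity check on those numbers, here's the arithmetic in a few lines of Python; the 20x training multiplier is the rough estimate from this conversation, not an exact figure.

```python
params = 1_000_000_000            # a 1-billion-parameter model
bytes_per_param = 4               # fp32 = 32 bits = 4 bytes

inference_gb = params * bytes_per_param / 1e9
training_gb = inference_gb * 20   # optimizer states, gradients, activations (rough)

print(f"load weights (fp32): ~{inference_gb:.0f} GB")      # ~4 GB
print(f"full fine-tune:      ~{training_gb:.0f} GB")       # ~80 GB
print(f"weights in fp16:     ~{params * 2 / 1e9:.0f} GB")  # ~2 GB, half the footprint
```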
for those that are listening, there are different kinds of fine-tuning, one being, what's it called, parameter-efficient fine-tuning. okay, good; for a minute there I was stumbling on it, and there are so many of these, too, right? yeah, there are too many initialisms. but we have a 1-billion-parameter model, and so that would take 80 gigabytes of RAM, okay, so that's times 20. and then we have an NVIDIA A100 GPU. is the A100 better or the H100 better? I can't remember. I think the H100 is better, I think that's the newer one, so the A100 is definitely not the latest one; I'm just saying I think it's older, and so this one is more cost-effective. but even still, that's a lot of RAM; a few generations ago that was the state of the art, and, like you said, most people are using, let's say, the 7-billion llama model, and that's seven times 80 gigs, so it's just to show that this would have exceeded what the state of the art was a few months or generations ago. and since then there's been the A100, the H100, there's another one and another one; it's unbelievable how fast they're moving with this stuff. but yeah, that's just one 1-billion-parameter model. so, a binary byte encoding example: can you walk us through binary encoding and how it's applicable to quantization? yeah, so quantization is the idea of reducing the precision, and the way you do that is, instead of using 32 bits, which is four bytes for each number or each parameter, a lot of providers try to use 16 bits, which is two bytes, and you can encode different ranges in that binary space on the computer, and
this is an example: we've got eight bits here, which is a single byte, and each bit is a binary zero or one. so if all of those are zeros, that encodes a zero in decimal, we're talking about binary-to-decimal conversion, but if all of those are ones, then you encode 255. so if you choose to encode an unsigned integer in those eight bits, you can encode zero to 255, and unsigned means it counts up; there's no negative number. exactly. now, if you decide you want a signed number, then you can choose that the first bit encodes the sign, and now if they're all zeros it's a zero, but if the first bit flips you change the sign, and the remaining bits get you up to 127, so you can encode fewer magnitudes, but you can encode a sign. okay, so the first bit allows you to change the direction, the sign, of the number.
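here's that same idea checked in a couple of lines of NumPy; note that NumPy's int8 is two's complement, so the signed range is -128 to 127 rather than the symmetric sign-bit range described above.

```python
import numpy as np

print(np.iinfo(np.uint8))    # unsigned 8-bit: min = 0,    max = 255
print(np.iinfo(np.int8))     # signed 8-bit:   min = -128, max = 127

# overflow shows why the range matters when you squeeze numbers down
x = np.array([200], dtype=np.uint8)
print(x.astype(np.int8))     # 200 doesn't fit in int8 -> wraps around to -56
```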
right, and so I wanted to share my screen but I can't, so let me send you this link; I don't know if you can share it. oh, I certainly can. can you open this? this is a nice blog that talks about these things. oh yeah, I'm just pulling it up in the chat here, sorry, okay [Music] okay. so if you go down a little bit, to where the picture is, we're just going to go over the picture, and if you can zoom that out: you can decide, if you've got 32 bits for example, how many of those bits encode a sign, how many are an exponent, and how many are a fraction. so if you put it in a power form, an exponent form, then your range is a lot bigger. so now it becomes up to you as a
person setting up these systems to decide how that goes on hardware; that's the encoding system, that's how these things are different. so if we go back to the slides, sure, okay, and go one more slide: what quantization does, like we said, is it's the process of reducing the precision of a digital signal from a higher precision to a lower precision. what we've noticed is that when you've got, let's say, 70 billion parameters, you don't need a really, really high precision; you don't need it to the nth degree. look at pi in the table for floating point 32, look at how long pi is: you might not need that, especially when you have 70 billion parameters; you can potentially get away with a floating precision of 16, which is a lot shorter, but what that does is, instead of costing you four bytes, it's now costing you two bytes, so you've practically halved the RAM and the storage on that system. so it reduces the required memory to store and train, and that works well for a lot of models. now, it depends: if you want something that's really, really high precision, you might not be able to do this, but for a lot of applications this is a really good way to save on memory and storage. and you can see that there's a difference in the range that the numbers encode: an FP16 is different from a BF16 because they've changed that division, that encoding. now you've got 16 bits, which is two bytes, and the formats differ in how many bits are the exponent and how many are the fraction, so in FP16 you can encode values from about -65504 to +65504, but in BF16 you can encode a much wider range, because more bits go to the exponent.
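a quick way to see that trade-off for yourself, here using PyTorch's finfo (any framework that exposes bfloat16 would do):

```python
import torch

# fp16: 1 sign + 5 exponent + 10 fraction bits -> max ~65504, finer precision
# bf16: 1 sign + 8 exponent + 7 fraction bits  -> fp32-like range, coarser precision
print(torch.finfo(torch.float16))    # max = 65504
print(torch.finfo(torch.bfloat16))   # max ~ 3.39e38

big = torch.tensor(70000.0)
print(big.to(torch.float16))         # inf: out of fp16's range entirely
print(big.to(torch.bfloat16))        # ~70144: fits, but rounded coarsely
```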
one thing I learned from you about quantization, which was a big eye-opener for me, and I'm sure it does increase performance in some cases, is that when I was learning quantization, I was running it and I wasn't seeing any performance gains, like in terms of inference it wasn't going faster, and I'm like, oh, it's not going any faster, what's the point of this? and you're like, well, yeah, but how much memory did you save? and I was like, oh, and I was saving a lot of memory overhead. and then I was thinking, okay, so quantization could improve performance, but it can also be about the underlying resources that you utilize, and so now you have more memory or compute for other things, or you can just utilize models on machines that have less of those resources, which I thought was really interesting. there was also a project that I saw called bitnet.cpp, where it's quantization to the extreme: it's a Microsoft project where a parameter is represented by, literally, a one-bit value. wow. it's called a 1-bit LLM. it's like a switch, right, that's just switching, that's transistors at their base. but I'm like, what could you do with that? so they had models that were like 15-billion-parameter ones that they shrunk to, I don't know, but the point is that I tried to get it running and it almost worked, but then I had some error and I didn't have time to debug it. I just wanted to ask, have you ever seen the 1-bit LLMs, that extreme level of quantization, and what would that be used for? that sounds like going back to switches, right, like complex switch networks. no, I've not seen them; I don't know how useful they are, to be honest
with you. okay, a lot of what LLMs and machine learning do is because of an activation function, which does a lot of abstraction, so I assume they're not as powerful because of the lack of activation. I'm trying to remember what the original machine was called; it was called a perceptron, have you heard of this? oh, so you know this machine. oh yeah, of course you do. I imagine that's maybe what it was like; maybe those were switches back then, I don't know, or physical cables wired with a zero or one. but yeah, I think that answers quite a lot about quantization. now, in terms of implementing it, I find quantization really hard to do, and every time I see it, it looks like a bunch of math and a bunch of very complex stuff, so I'm assuming that's just the nature of it: with quantization you have to know what you're doing to convert these formats and do some rigamarole to get what you want. so the most important thing is that when you see these different encodings, you understand what they are and what they stand for, and you need to understand that the encoding comes with the model. so if the model is looking for a particular type of encoding, it's not for you to change it: if you change an fp16 to a bf16, for example, all of the numbers will be off, and I don't know that
the software would allow you; I think the software is smart enough to tell you that there's a problem. so you don't just change the numbers; it's whatever the factory set, so to speak: they say it was trained with this, so this is what you use. I assume that you can change the encodings; it probably doesn't make sense to make it more precise, but you could probably take a model that is more precise and make it less precise, and I assume there are a lot of systems and software to allow you to do that, but you need to understand, if it comes with a particular encoding, what that is, and the fact that that's kind of part of the model. for those that are listening: when we start working with models at a lower level, you have these inference engines that are able to run multiple models, and you'll have to specify the precision, and you can put another number in there, but it doesn't mean you should. I bet some people know what they're doing and can change the numbers and maybe get a result that they want, but it sounds like you generally want to use the numbers that were recommended based on training or for inference. that's what I always do: I just look it up, and they say use this, and I go okay, and I don't question it, but I don't know how they came to that. and Rola is saying here, they intentionally made it that way and you have to follow the numbers that were put out there, so just look it up; that's how I understand what you're saying there. is there anything else we want to touch on for quantization, or would you say we've got it covered? yeah, no, we've got it covered; it's just a trick, and we're going to do this in the course: we're going to look at a lot of the tricks that came up to optimize these things because of their huge size. this is a RAM and memory trick. cool, and I appreciate your time, and I'm sure we'll see you throughout the course, and we'll see everyone here in the course; I don't know how to end it, but we'll see everyone some other time, ciao [Music] hey, this is Andrew Brown; in this video I want to show off Pinecone, which is one way that we can utilize RAG systems. so let's go ahead and sign
up. I actually might already have an account with Pinecone because I have used this before, so I'm going to go ahead and just log in here and see if I can get started that way; yep, they do have a continue-with-Google, so we'll go ahead and do that. and so Pinecone is one of many different kinds of RAG databases that we can utilize. you can see we have a quickstart; I'm going to go ahead and just delete that here so that we're all starting in the same place, and right away we already have some code which we can start working with. so I'm going to go over here onto the left-hand side; I already had this environment open before, and if you're wondering where this is, it's where it always is, which is in the GenAI Essentials repo, and I already have an environment launched up here in GitHub Codespaces. I'm going to go ahead and make myself a new folder, and we're going to type in pinecone, and we're going to make ourselves a new IPython notebook file, so this one's going to be basic.ipynb, okay, and we will go ahead and begin following their instructions. so the first thing they're telling us to do is install pinecone, so we'll go ahead and do that; I'm going to run that there, and while that is going we'll go to the next part, and here we have some code to initialize Pinecone, so I'm going to go ahead and do that. clearly there's an API key that's being brought in here, and so this is where we're going to bring in our usual .env file, and we'll bring in our .env.example file, and we'll bring in a .gitignore, okay, and in here I'm just going to say ignore the .env file. and I don't know what this should be called, but I'm going to call it PINECONE_API_KEY; that will be a good one there, so we'll go ahead and put that in there as well, and we'll just continue on copying the code here, as it looks like this is a very straightforward way to get started. okay, and so then here we would want to load in our OS environment; I always pull from the Streamlit example, not sure why I like to do that, but we'll grab it from here and we'll paste this in, and this will load our .env file so we can bring in our environment variable, and so we'll say os.environ.get and we'll bring that in; we'll just say PINECONE_API_KEY, and we'll go down to the next line here and we'll
go ahead and copy this here and paste this in, and so we will load that, and that will create ourselves a Pinecone instance. it says passing the OpenAI config is no longer supported, please pass the proxy URL, stuff like that; well, I'm just using whatever you told me to use, so I'm not sure exactly what the problem is. you haven't specified an API key? well, fair enough, I haven't generated one, so I'm going to go here, and I clearly have a key from before, but we're going to go ahead and just delete out this default key; we're going to generate a new one, we'll call it basic, and I'll create that new key. I'm going to go ahead and copy this key, we'll go back over here, and I'm going to paste it in, okay, and then we'll go back over here, I'll hit close, and so now I should have a key. so we go back over here, I'll scroll up, and we'll just run a few things. we'll go ahead and run this, oh sorry, we have to reload that, and then we'll run this, and we shouldn't get an error now because we have a key, right? so PINECONE_API_KEY, all I can think of is I spelled it wrong; nope, it's correct. so, passing the OpenAI config is no longer supported, you haven't specified an API key; sure I have, it's right here. so let's go back over to their code example, getting started, we'll view the guide with Python, oh my goodness, just show me how to do it. okay, so we have install pinecone, we did that; next, set the Pinecone API key, which is what we did, we did set it here, and then they want us to initialize the client, so we go back over here into basic, and is this the same? yep. so all I'm thinking is maybe this key isn't picking up, so sometimes what helps is to pass the API key explicitly, so we say api_key like this and then paste it like that; sometimes that can resolve issues for unknown reasons, and I don't know what's going on here. so I'm going to go ahead and print it out here, I just want to be sure; notice it's not printing anything, well, what the heck, clearly I must be writing something wrong because there's no other way, so I'm going to give this a full restart. okay, and I believe that's been restarted, I'm just going to start from scratch; oh, you know what, I didn't include python-dotenv, that's my problem, and I'm just going to silence that so this is a little bit cleaner, and that's why this wasn't working. so I'm just going to take this out here; I don't want the key to be printed, and we'll run this again, and we're starting to have less of an issue as we go here. I took out the line that I actually wanted; this is the line that I wanted to go away, and now we're able to do that, and now we can initialize our index. so we'll give that a moment here; it's telling us what we'd like to set as our dimensions and cosine; I'm not really worried about that right now, but if we go over here
to database, this is what we did, this is generally what we set up here. now, it's interesting: down below, this one is setting up specifically on AWS in us-east-1, and actually that's where ours is set as well, so it is setting up here; we didn't link our AWS account in any way, it's just that you're getting to choose where you'd like to provision it. so I'm going to go back over here to databases, let's go over to indexes, and I believe that we created it, right? so create: this will create the index named quickstart that performs the nearest-neighbor search using the cosine distance metric. okay, that makes sense, so shouldn't I be able to see this now in the interface? maybe there's a bit of a delay. the next thing we probably would want to do is load a dataset; sure, we could create one down here, but that's not exactly what I'm looking for, so again, just trying to figure this out, we'll go back to the quickstart, and so, I have done this before, but I'm not sure why it's a bit finicky. okay, so we have created ourselves an index, so we'll look at this example here: generate vectors, so it's creating vector data first, and then it's creating an index down below, and this one's specifically 1024, cosine, like the one we created. even if we did create it, there's not much we're going to be able to do with it, and I don't think it's going to appear until we actually have some data in it, so I don't think it really matters. and so this one is not a very good example; I've done this before, and I ran into the same problem where the quickstart that they gave you is not that useful. so I'm going to go up here, just so that if you are following along as well and you want your life to be really easy, as I like mine to be easy as well, we can find this together, so I'll just paste that in here, that quickstart, and so we're
going to adjust this based on this quickstart, because I think this is the one that we actually want. what's interesting is this one is saying to install without gRPC you can do this; I'm not sure why this one in particular is utilizing gRPC, but that's totally fine, it's using it, so maybe we should use it as well. it's kind of annoying that this one is so different. we'll go over here; gRPC is just a communication protocol, so it's totally fine that that's what they want to do; as long as it works, that's all I care about, right. so we'll go ahead and do this, okay, and I'm just going to change this a little bit, we'll grab this here: pip's dependency resolver does not currently take into account all the packages that are installed, the following dependency conflicts: requires protobuf, but you have an incompatible version, you may need to restart the kernel for the update. so let's restart it; this is already kind of annoying because we're having compatibility issues. now it's not throwing an issue, so we'll just assume that fixed our problem. I'm going to paste this in here, and we still want to do this, so that is totally fine, but then we have some data that we might want to embed, so we'll go down to the next line here, okay, and so we have a bunch of data, okay, and then we're giving them IDs and we're passing them through the embedding and then we're returning back embeddings. pc is not defined; I thought it was, it's right here, yes. sure, so run that again, and we run this again, and it's still saying pc is not defined; it's totally defined, it absolutely is defined, there's no way it's not. so we'll go ahead and restart this again, Pinecone's great. so maybe this actually didn't install, so let's take out the -q flag here, and I'm going to bring this one up here, because I'm assuming that's why it's not working, and so we'll just say pip install over here, we'll have that separately, and then we'll do this one; nope, it says it's working, okay, great. then we'll load that again, and then we'll run this one here, and no problem now, and then we'll run this one here
and so now there's no problem; I don't know why, sometimes you break these up and that fixes the issue, maybe, I don't know why that fixes it. but anyway, this is going to produce some embeddings, and we could probably print them out if we wanted to see what they look like; it's not going to be exciting, but we'll go ahead and print it out anyway so we can see it. yeah, so you can see the embeddings, we have their data, it's a bunch of values, right, okay. so we'll go back over here; oh, they printed that as well, okay, well, they were thinking the same thing as me. so then return the objects like this, yeah, we saw that; then we're creating the index, and when you're creating the index, you really need to make sure that the dimensions and the metric match what you want to do, okay, at least that's the way I understand it. and so we'll go here and run this, and this is going to create a new index. has it received the data yet? I don't think so, I don't see it here; I want to go back over to the interface, and I want to see, does this index show up, because, again, I don't think the index shows up until you actually have data loaded into it. so here they're saying upsert the vectors, so this is how you'd insert the data in here, so go ahead and copy that, and we'll paste this in down here; so we have upsert, upserted count: six, okay, so that's fine. and so now if we go back here and we look at indexes, I do believe it will appear now; I don't know why it still refuses to appear. all projects, we have two API keys, the other one didn't get deleted, okay, so here they are; so it looks like our other one was created, but for whatever reason, maybe it just takes a really long time for it to show up. so I'm going to go ahead and delete this one, as I really don't care about it, but that really did confuse me for quite a while. can we click into it and see what we can look at here? so we do actually have data down below, and if we wanted to play around, we could query it, I suppose; so yeah, I guess there's a query thing, and we're getting data back in this kind of format. it's not really useful; I think we really need to attach it to a use case to start making sense of this. but anyway, supposedly we have one here, let's hit connect, and it shows us an example of how we could work with it; this actually looks like a much better example than what we were
seeing before. we'll go back over here and continue on, and here it says Pinecone will eventually become consistent, okay, and then we can print out the information about that. so now we can go ahead and search it, let's go ahead and search it, okay, we'll hit run, and so now we've performed a search. is this useful? I don't know; I think the way this would be useful is if we can see integration with a very particular use case.
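before we move on, here's the basic flow from this walkthrough condensed into one runnable sketch; the index name, dimension, and dummy vectors are illustrative, and the `list_indexes().names()` helper assumes a recent pinecone SDK, so adapt it to whatever version you installed.

```python
import os
from dotenv import load_dotenv
from pinecone import Pinecone, ServerlessSpec

load_dotenv()  # pulls PINECONE_API_KEY from the .env file
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

index_name = "quickstart"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1024,                  # must match your embedding model's output
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(index_name)

# upsert a couple of vectors; real values would come from an embedding model
index.upsert(vectors=[
    {"id": "a", "values": [0.1] * 1024, "metadata": {"text": "hello"}},
    {"id": "b", "values": [0.2] * 1024, "metadata": {"text": "world"}},
])

# query with another vector and get the nearest neighbours plus metadata back
print(index.query(vector=[0.1] * 1024, top_k=2, include_metadata=True))
```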
so I'm going to go take a look and see if we have Cohere and Pinecone, because I feel that Cohere is really, really good at connecting to external data sources, and maybe they have a good example of how to do the integration so we can fully understand. so let's go take a look and see if we can follow this one; we'll call the other one basic for now, and we'll make a new file and call it cohere-and-pinecone, and see if we can get something that actually looks practical, because right now it's like, cool, we created a vector store, we put data into it, but what does that even mean? it doesn't mean a whole lot to me, right. so we'll go over here and we will click on the step-by-step guide, which is over here on the Pinecone website, and it looks like they already have some Colab code. so here we're going to install Cohere, so we'll bring that on over and we'll just do this step by step, okay. so we'll add a new one here; what's the hyphen-U flag for in pip? I really don't know, is it like universal? the answer I got says it's a flag that tells pip to install the library for the current user only instead of globally for everyone (that actually describes --user; -U is short for --upgrade), but I don't think this really matters, so I'm going to go ahead and just take that out, and we'll go ahead and run that. actually, we're missing one, we want python-dotenv in here as well, so I'm going to just stop this here and then we'll run that again, and as that
is running, it's going to have a lot of output. we'll go back to our other one here, and we're going to grab this example code where we can import our .env; we'll probably have to do this twice because we'll need it for Cohere. so I'm just going to be a bit proactive here and go ahead and set up my Cohere API key as well, so I'm going to go here and say COHERE_API_KEY, and, as we learned, Cohere has a very generous free tier, so it is great for us to learn with. so we'll go ahead and say COHERE_API_KEY, and I'm going to go over here, I'm going to sign in; I signed in with my Google account, and on the left-hand side I'm looking for API keys, and I'm going to generate a new, oops, not a production key, a new trial key; this will just be for Pinecone, and we'll generate that key. I'm going to copy that key, make my way back over here, and paste in that Cohere key. now I have the two components that I need to complete this; let's make our way back over to the documentation for this example. so yes, we're going to load the Cohere client; I wonder if this is going to use the old API, I keep coming across the old API more frequently, which is kind of annoying, but if the old API works, that's totally fine, it just might date our code a bit. and I believe that we already set this correctly, so we'll just go ahead and leave that empty; I'm going to reload that so that Cohere will get loaded in, and we'll go back to this example. and so here it's saying load in from datasets; did we import that earlier? so if we go here, did datasets get imported? it did, okay, great. so datasets is where we're pulling data from Hugging Face, and so we're going to go ahead and grab this as our next line and hit plus, okay, and now we're bringing in that dataset, one called trec. I'm not sure what TREC is; let's go take a look at what that is: a Hugging Face dataset, and this is the Text REtrieval Conference question classification dataset, which contains 5,500 labeled questions in the training set; an example is something like, how did serfdom develop in and then leave Russia? okay, so that's an idea of what that dataset looks like. let's go back over here and continue on and see if we can get full integration; this is just, I think, a duplicate, okay, we'll go back over here. so we are importing that dataset,
and the next thing we want to do is create embeddings of the data. before, we were using the embedding model from Pinecone, but this time we're actually using the embedding model from Cohere. if you're going to do embeddings, I guess it doesn't really matter which model: the large language model doesn't care what embeddings you use, but the embedding model matters for what you store, based on what kind of result you want, so there's no correlation in that sense. but anyway, check the dimensionality of the returned vectors; you will need to save the embedding dimensionality to be used when initializing Pinecone later on. that's something I was kind of wondering: how do you know what to set the dimensionality to? and I guess this is the way you would do it, so that's kind of cool. now, I'm not sure how we would know whether this should be set to cosine or not, or whatever other setting we have. now that you have the embeddings, you can move on to indexing them into the Pinecone vector database, so we will grab this part next, okay, we'll hit plus, and so we have our Pinecone API key; I'm just going to take this out because I'm going to assume that it will load it based on its name, maybe it won't. and so it's using cosine, so I guess that must be good; it's going to host it on AWS, which is totally fine. now you can begin populating the index, so we'll go ahead and copy this; this looks like us putting data into it, so we'll go here, and yes, here it is iterating through the data, and I'm assuming at this point it's already been turned into embeddings, right, so we have embeds here, yes, and then this is inserting into the Pinecone database, okay, that makes sense. and then we have our semantic search, okay: what caused the 1929 Great Depression? and here it would return the result; the response from Pinecone includes the original text in the metadata. so we'll go ahead and copy this, and then we'll go back over here, looks good; let's make it harder and replace
depression with the incorrect recession. so here they're purposely matching it a little bit differently; I don't care about that too much, I just want to see that we get this working end to end. now, this is still not integrated with an LLM, but the idea is that the data you get back, you just put in the context window, so when you'd give it a prompt, you'd put it in the prompt, and that would be in-context learning. but let's make sure that we can get this to work. so we ran this line already, I'm going to run this one again; let's run the Cohere one and hopefully it loads, then we'll load in our dataset, then we will wait a moment here, we'll give it a moment. all right, so this is taking a long time because it's trying to load the first 10,000 rows, and this is just tiring me out, so I'm going to hit stop here, just stop, stop, stop, this is ridiculous, and I'm going to reduce this to maybe the first 100, and maybe that will make our lives a lot easier. so just hit cancel here, yes, I want to interrupt it; what a pain in the butt. sometimes the people that write these examples aren't the ones that actually utilize them, and that's where it gets kind of frustrating. so I am hitting stop here, and I'll just hit restart so that the kernel stops, and because it did restart, I'm going to just go from the top here; I'm going to assume that it's lost that context, and so we'll go ahead here, and this time it's going to load 100 as opposed to a thousand, and maybe this one will be a lot more reasonable, so we'll just wait a little bit here, okay. all right, so this one's also having a really hard time with it; oh, hold on, hold on: this repo for trec contains custom code which must be executed to correctly load the data, you can inspect the repository, you can avoid this prompt in the future by... oh, did I just have to press yes up here the entire time and I hadn't been paying attention? did that fix it? why does it look failed now? let's go ahead and try this one more time and I'll type y and then enter. okay, so maybe a thousand wasn't that bad to begin with, but I only want 100, to be honest; no, okay, let's go do a thousand, maybe a thousand wasn't that bad, and it was just me that was the problem this entire time. okay, so that was the problem: I was waiting here, I wasn't reading, I was confused, I do apologize. let's go ahead and continue on. so now we are embedding the text of TREC, so there
we go: we literally did a thousand embeddings. here we're grabbing the shape, so we see (1000, 1024), and it looks like it wants the second number here, so that kind of makes sense. then down below we are connecting to Pinecone, so let's go ahead and see if that works; here it says NameError: ServerlessSpec is not defined, fair enough, can we just do this? I think that's working, and so I believe it is creating the index now, and there we go. so the next thing is we're going to upsert, or insert, the data into the database, upsert as in send it up to the database, and so now the data has been inserted. I'm not sure if we have to wait for that to become eventually consistent, but let's go ahead and see if we can query it. so here it says: what caused the 1929 Great Depression? and here we're using the Cohere embedding model, so we have to embed our query first, right, and then we can query against it. and so, yeah, this is the index that's coming from Pinecone, and then we can query that and see what we get, and then we can print out the results, and here we go, we have some results being returned back to us. and so this is an integration for RAG; it's not a complete integration, because obviously you'd want to take this data back to the LLM, right, and put it in the context window, but I think that's pretty straightforward, so I don't plan to show it here, and we're going to call this one done and dusted.
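for reference, here's the whole Cohere-plus-Pinecone flow from this video condensed into one hedged sketch; the embedding model, index name, and region are illustrative, and the embed call assumes a v3 Cohere model and a recent SDK, so check both sets of docs.

```python
import os
import cohere
from datasets import load_dataset
from dotenv import load_dotenv
from pinecone import Pinecone, ServerlessSpec

load_dotenv()
co = cohere.Client(os.environ.get("COHERE_API_KEY"))
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

# 1. load some questions from the TREC dataset on Hugging Face
trec = load_dataset("trec", split="train[:100]", trust_remote_code=True)
texts = trec["text"]

# 2. embed them with Cohere and note the dimensionality for the index
embeds = co.embed(texts=texts, model="embed-english-v3.0",
                  input_type="search_document").embeddings
dim = len(embeds[0])

# 3. create the index and upsert vectors, keeping the original text as metadata
if "cohere-trec" not in pc.list_indexes().names():
    pc.create_index(name="cohere-trec", dimension=dim, metric="cosine",
                    spec=ServerlessSpec(cloud="aws", region="us-east-1"))
index = pc.Index("cohere-trec")
index.upsert(vectors=[
    {"id": str(i), "values": emb, "metadata": {"text": texts[i]}}
    for i, emb in enumerate(embeds)
])

# 4. embed the query the same way, then run the semantic search
q = co.embed(texts=["What caused the 1929 Great Depression?"],
             model="embed-english-v3.0", input_type="search_query").embeddings
print(index.query(vector=q[0], top_k=3, include_metadata=True))
```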
so we'll go here and just make sure I'm not committing keys, and this will be our example with Pinecone, and that is your first example of utilizing RAG, okay [Music] ciao. hey, it's Andrew Brown, and one solution I would like to look at is Elastic. so Elasticsearch is definitely something that you'll hear again and again when talking about RAG systems, because having a full-text search engine works really, really well for RAG. obviously AWS has a solution, and some other providers do, but I was thinking maybe we could go directly to Elastic, and I wonder if I can make a trial without entering my credit card; I'm not sure if I can, but we'll go take a look here. so, does Elastic have a free trial with no credit card? let's see, because if it does, then we can try to utilize it; no credit card required, it's saying, so if that's the case, we'll go ahead and continue here. I'm going to say ExamPro Co, and we'll continue on, as Elastic, I feel, is something that, if you were to use something in production, would be really good; MongoDB as well would be really good. so I'm going to go ahead and just fill in my information, okay, we'll go ahead and continue; am I new to Elastic? I have some experience with Elastic, but it's been a while; just evaluating it, I suppose. and so we have a few options: we have Elasticsearch, Elastic for Observability, Elastic for Security; we're just going with regular Elasticsearch here today, and we can say where we want to host it; I'm totally fine with GCP, I do not care where it's going here today, but it's nice we have some options. looks like we've got a video here that we can watch while we're hanging out, but we are going to hang out until it's done, which apparently looks like it's going really quick, so it's not even 3 to 5 minutes, unless there is some post-setup stuff that we have to wait for. but let's get into Elastic here, and we'll go ahead and close this out, and so we have an Elasticsearch endpoint that we can utilize. so the question is, how can we start
building with this? let's go take a look at the playground. I haven't used the Elasticsearch interface here, so this is my first time in here, but I have worked with Elastic, so I'm not super worried about it. right away it says: experiment with combining Elasticsearch data with powerful large language models. that sounds perfect, that's exactly what we want, and we do want to connect it to an LLM, so let's go ahead and take a look here. so we have a few options: Amazon Bedrock, Google Gemini, OpenAI. I feel like I'm not using Google Gemini enough, so let's go ahead and take a look: send a request to Google Gemini; so we would, let's see, we would add a connector name, oh, so we'd specify a project with a very specific model. yeah, I'm just curious what the other ones look like; maybe OpenAI is much easier to utilize. yes, here it's so much more straightforward, I think I prefer that, so I'm going to go and use the OpenAI one; sorry Google, you just made it too much work when I have to make a new project, and I don't feel like doing that here today, though I did create a GenAI project, so it isn't really that much of an issue to figure out; I'm being a little bit of a pain in the butt. but we'll go here and just make sure I'm logged in; there's a video where we do this, so you should already have an OpenAI account, and we'll go over, and I already have a project for basic, I'm just going to work with that project, I'm totally fine with it; if you don't have one, go make a new project, but we're going to go over to this project in particular, we'll say manage projects, then we'll go to API keys, and I already have a key here, I'm going to revoke it, and I'm going to generate a new key; this is going to be for Elastic, I'm just going to put it in my basic project, I'm going to go ahead and create that, and now I have a key. we'll copy that, make our way back over here, and I'll paste in my key. now, I do have a bit of spend on my OpenAI account, so this is less of an issue; we'll just say OpenAI connection, and that is our provider. I guess there are other compatible ones, that's kind of cool, so we'll go ahead and hit save; that makes me think that as long as you have an OpenAI-compatible interface, it can work with it. so now I've connected an LLM; let's go ahead and create ourselves an index. this seems like a really easy way to make a RAG setup, and so we have API:
use the API to connect directly to your Elastic interface; discover, extract, and index searchable content from websites, ooh, I like that; and then we also have connectors. let's go ahead and crawl a URL. so this index will hold the content sourced; maybe we'll just type in, I don't know, let's just try ExamPro, because I don't mind scraping ExamPro for content. add a domain to your index, so configure a domain you'd like to crawl; we type in www.exampro.co. this is going to totally mess up my data, I probably shouldn't tell people to do that, but we'll go here, we'll add the domain, and now we have, I guess, a crawler; we'll go ahead and hit crawl, crawl all domains. so we'll go ahead and press that; it's a static website, so you can't really hurt it. don't do app.exampro.co, that'd be really annoying; do www.exampro.co. and so now it's running a web crawler; it's incredible how easy this is. sorry, Amazon Bedrock knowledge bases, which use Elasticsearch or whatever their variant is underneath, but honestly Elasticsearch is the main one here, and this one we're utilizing is probably a lot easier. so I think that it's crawled; what has it done? I'm not sure, but let's go view that in the playground and see what we know about it. so: how many courses are on exampro.co, can it answer that? the context provided does not specify the exact number of courses available, therefore I don't know the answer. okay, so let's go to the exampro.co website, and we have a Boxing Day sale, so obviously you can see the time of this. if we go down below: who is your favorite instructor? okay, so let's go over here and say, on exampro.co, who is the favorite instructor, can I do that? and it says, according to the context provided, Andrew Brown is mentioned as a favorite cloud instructor on ExamPro; he is described as an AWS Community Hero, GCP Champion and Innovator (not anymore, I've got to take that off the site), and previously CTO of multiple startups. so if we go over here, we can view the code for the implementation, and right off
the bat it's giving us the OpenAI and Elasticsearch implementation; that is really awesome. I'm going to go over here, and we're just going to go directly over to the ExamPro GenAI Essentials repo, and I might already have an environment running; I do, so I'm going to go ahead and open that up, and I just want to bring that code over and make sure that we can execute it programmatically. all right, so we now have this open here; I'm going to make a new folder for elasticsearch, and we'll go ahead and make a new file called basic.ipynb. I'm going to make a new .env here, and we'll make a new .env.example, and we'll make a new .gitignore, okay. so what we'll do is we'll go over to Elastic, and we have a bunch of application code; there's a lot. apparently you can use it with LangChain; I would not use LangChain, I'm so sorry, I'm just done with LangChain these days. we'll go all the way down to the ground, we'll copy this, and we'll go back over here and paste it in; no, we've got to go over here, and we'll make a new code block and then another new code block and paste it in. we'll break this up a little bit, so I'm going to bring these up onto another line here, then we'll bring this up into here. I like how they give you the pip install, wow, this is super easy; I do want it to be quiet, but I don't need the -U, and we'll add python-dotenv. so we are installing the three things that we are utilizing, and then we'll do this; we also need to bring in dotenv to load it, so I'm going to bring these three lines in from another project. it's saying there's no module called elasticsearch, could not find a version that satisfies the requirement python-dotenv; it's because I spelled it wrong, it's got to be like this, so we'll adjust that, and that should fix the installation there. we don't need to double-import os, so we'll just take that out here, and we'll run this one, and we don't have a .env file yet, so that will be something we'll need to work on. here we have an endpoint, and then we have the ES API key and this stuff, so I'm going to split the screen here so I can see what I'm doing, and we'll write in OPENAI_API_KEY, that's one; we have ES_API_KEY, which is another one; we have ES_ENDPOINT, which will be this one. we don't really need to load it this way, but I would prefer to do it that way, so I'm going to just take this out of here like that and paste it in; whoops, actually, before we do that, I want to make sure that I copy these and put them into here. let's make sure in our .gitignore we are ignoring the .env, as I do not feel like committing anything sensitive here today. we'll go back over here, and we are going to grab the Elasticsearch endpoint as such, and then we'll need those keys there, so here we'll say os.environ.get, and this is going to be ES_ENDPOINT, and yes
you can do it this way too but I'm going to stick with the doget syntax I just dress it more even though they both work I don't know I think one time I did it and it didn't work and so now I just always have have to do it that way and so I'm just searching here to see what other kind of stuff that we have here this one's sing to GPT 3.5 turbo which is kind of weird because our real code example didn't do that we should break this up a little bit and uh
We should break this up a little bit and make it nicer, so we'll make a few code blocks down below, and I'm just going to keep dragging stuff down here. That'll be one part. Then we have this function, which is clearly out of date; no one's using GPT-3.5 Turbo these days, nobody, there's no reason to. Then we have this other function. This clearly wasn't written to live in a Jupyter notebook, but that's totally fine; we'll adjust it and see how the code works. So we go down below, and this is starting to look pretty good. I don't like how everything is four-space indented, so let's change that; I know it doesn't matter, but it's really bothering me. We don't need to specify the OpenAI key explicitly like this, it'll just get picked up from the environment. I'm not sure about Elasticsearch, so I'll leave that as is. Here we can see some basic search results; I'm just going to simplify this and make our code a little smaller. It says it searches exampro on the body content, which makes sense. Again, I'm just fixing all the crazy indentation; we have crazy, crazy indentation. And we'll go here: the instructions say you're an assistant for question-answering tasks, answer questions truthfully and factually using only the context presented, if you don't know the answer just say you don't know, use markdown format, et cetera. There are still some formatting issues it doesn't like, inconsistent use of tabs and spaces, so that's its problem. I'll bring this back a level, and maybe bring this in a level. Hold on, does that fix it? No. Does that? No. We'll just keep working with it. Again, I'm carefully looking at this code to refit it into a Jupyter notebook, which is a little tricky, because we have these three functions: get the Elasticsearch results, create the OpenAI prompt, and then do the completion.
So what I'm going to do is close off a little bit of this. We're not ready to run it yet, because we don't have everything loaded in, but I want to get these three pieces roughly where they belong. The first one is get_elasticsearch_results, which is this right here, so I'm going to take this variable, do this, then take this out of here; again, just trying to make it more readable. We have our query here: a retriever with a standard query, a multi_match query over fields. We're obviously not covering all those options here; we'd have to read the documentation to figure them out. So we have our Elasticsearch results. Next is creating the OpenAI prompt, create_openai_prompt. The result we return is actually this: we set the context to be empty, then iterate through our results, and it says for semantic text matches we need to extract the text from the inner hits. So that's obviously doing something, but what we're seeing here is in-context learning. Remember we said that if you're using RAG, you're pulling data and putting it into the prompt; that's exactly what it's doing here. And I guess this is like the system prompt (yes, I believe it is), so I would rename context_prompt to system_prompt to make more sense, and then pass it in as the system prompt, because that's what it actually is. Then we have our question, so I'll just clear this out like this. The code was fine, but it wasn't intended for Jupyter notebooks. Then we have our question up here, so we'll do that; we're not returning a response, we're just going to print it. The other thing I don't know about this code is whether it's going to like that Turbo model; I suppose we can just swap it out.
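Stripped of the Elastic-specific retriever options, the shape of what we're rebuilding is roughly this. This is a hedged sketch, with a made-up index name and field, not Elastic's exact sample code:

```python
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch(es_endpoint, api_key=es_api_key)  # values loaded from .env earlier
client = OpenAI()  # picks up OPENAI_API_KEY from the environment

def get_elasticsearch_results(query, size=3):
    # "my-index" and "body_content" are placeholders for your real index/field
    resp = es.search(
        index="my-index",
        query={"multi_match": {"query": query, "fields": ["body_content"]}},
        size=size)
    return resp["hits"]["hits"]

def create_openai_prompt(results):
    # in-context learning: stuff the retrieved passages into the system prompt
    context = "\n".join(hit["_source"].get("body_content", "") for hit in results)
    return ("You are an assistant for question-answering tasks. Answer truthfully "
            "using only the context presented.\n\nContext:\n" + context)

question = "Who is the favorite instructor on exampro.co?"
system_prompt = create_openai_prompt(get_elasticsearch_results(question))
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "system", "content": system_prompt},
              {"role": "user", "content": question}])
print(completion.choices[0].message.content)
```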
The syntax looks very easy, so let's say gpt-4o-mini; hopefully I have that set correctly. Then I'm going to delete out this block, because it's useless. Okay, so now we'll go back over to our .env and populate this information. Oh, you know what, I brought this one into the wrong one; it's okay, I'll close this one out, copy this, and paste it in here. I'll go back over here; this is completely useless, so we'll take it out. I'm going to go back over to Elastic. Oh, I already have an API key from earlier, so we'll grab that one, the one we were using earlier in this interface, and paste it in here. The one thing I still need is the Elastic API key, and I'm not sure where it is. We'll look under Organization; I kind of feel like that's where they would put an API key. It is. We'll go over here and create the new API key. I'll call it example, and I really want this to be dead in a week. Hmm: this API key is for Elastic Cloud's API only, and the key created here grants access for managing organization deployments. That is not what we want. We want an actual data-access API key; I mean, I think that's what we want. Let's go back here for a second, back to Elasticsearch, for querying; yeah, it's just for querying, right? So we go into our deployment. Also, how fast was that? That was crazy; every time you use a knowledge base with Amazon Bedrock you have to wait forever. So I'm looking for the API key. Do you think it's this thing here? Maybe this thing? It doesn't really say, but I'm going to take a wild guess and assume that's the key. So we have these three values set; hopefully that's good. We'll go back over here. We already ran this, but I'm going to run it again. I don't care; I'll run it ten
times if I want to. Then we'll run this, load our .env values, run this, and then run this. And we have a problem: query is not defined. Fair enough, it doesn't know what query is. So in this context, what is query? Is query the question? That seems like what it would be, right? Let's go back to our example: down here we have question, but up here there is no query, so where is this coming from? It looks like their code is incomplete, because it seems to me that if we had a query, that's where it would be. It says multi_match query, and I would think that would be the same as the question. One second... sorry, I'm back. I wanted to go tell someone at Elastic that their code example was out of date. The platform has been really good so far, but that one example is broken, because the query is obviously never being set. So what I'm going to do is pull this up, assuming that the query is the question; that's my assumption. I'll go up here and change this over to query, and then I'll also paste it in down here as well. The other thing is that we asked a very specific question, on exampro.co who's your favorite instructor, so I'm going to grab that, go back over here, and scroll up. You find this a lot with these products: their examples are always out of date. So we'll go ahead and do this; again, I'm just taking a guess at what this is supposed to be. It seems we still have a problem: unable to authenticate with the provided credentials, and anonymous access is not allowed. So clearly the key we're using is not correct, and we need to figure out where the right API key is. I'm going over here, trying to think about
where the key we need actually lives. We're in the Playground right now, just scrolling up. Let's try Dev Tools; nope, that's not it. You'd think the Playground would show it somewhere; nope. Overview, Cloud ID... there are active API keys, and zero are activated. We go ahead and hit new, and it's creating an API key. We'll just call it example, and I want this to expire in one day. Can I go down here? There we go, we'll set it for one day. Actually, I'll give it two days just in case. Security privileges, metadata; I don't know if I need any of those. We'll create the API key. We only see this once, so: save it somewhere else, don't store it, they don't store your API keys. Fair enough. So which part is the API key? I'm not sure, so I'll copy this whole thing. This interface is a little confusing, I'll be honest with you, but that's okay, I can work with this. I'm going to go back into my .env, drag this over, and assume this is the API key. I don't know why we get all this other stuff with it; that's fine, I don't understand it, but we'll paste it in and save. Back over here, we should now supposedly have our key. I'm going to restart the kernel, just because we've made so many changes at this point. I'll restart completely and work our way down, reloading everything. Now it should all be loaded in, and assuming the names are correct (I believe they are), we should be able to do a query. Okay, it says es_client is not defined. This code they gave me is so incomplete. But we have Elasticsearch here, so let's go back to the Playground, and we still
have our previous example; I like that it's still here. So here is es_client up top. Oh yeah, we did define that. Where the heck is it? I remember defining it right here... oh, right here. Did we not run that line? Yes we did... unless it doesn't work. Okay, so now it says authentication failed: unable to authenticate with the provided credentials, and anonymous access is not allowed for this request. So it could be there's something on this API key we have to set. We'll go over to Manage API Keys; that's another spot. Okay, this is where we really wanted to be. Wow, do I have a lot of keys; who knew I had this many? Here's the example key I created: security privileges... I mean, we don't need write, we just need read-only access, so let's do that. Metadata, I don't need any, and we'll update. So maybe we just needed some permissions. We'll go back over here; I'm not sure if this stuff propagates instantaneously, but we'll try again. And: unable to authenticate with the provided credentials, anonymous access is not allowed. Well, I definitely have the correct key now, for sure. We'll click into this again. We gave it permissions, and we updated the key, right? All I can think of is that it doesn't propagate instantly, and I'm not certain how long it takes. I'll give this another restart and try one more time; if it doesn't work we'll just read the code and be like, okay, cool. But we're not going to fiddle around with it all day, because if it's not easy to use, that's their problem, right? We run this and... 401. Okay, I'll spend a little extra time with this Elastic issue and see if we can figure it out; I'll be back in a moment if I find a solution. All right, so here it's saying the docs I referenced are for creating an Elasticsearch Service API key, which is for operating an Elasticsearch
deployment. Those API keys let you create, update, and delete deployments, not necessarily work with data. For the operations I'm trying to do, against the normal Elasticsearch REST API, I need to create a normal API key. Huh. Well, I'm not sure, but we created a key here; let's go over to Manage. So what kind of key is this? "Allow external services to access the Elastic Stack on behalf of the user." Okay, but what kind of key is that? I don't know, but when we go into it, it says all privileges, read, and there are no restrictions, so to me this has to be the right kind of key. There's something missing here that I'm not aware of and can't figure out, so we won't fiddle with it too long. But I'd say that if we could get past this authentication thing, Elasticsearch looks really cool; it gives us a complete example. I'm going to play around a little more; I just want to see if I can get it to work. All right, so here they're suggesting that when you connect via the API there are two parts to the credential (this worked for someone): the API key is apparently an API key ID plus the API key itself. So maybe that's our problem. That's not what the UI showed us, but maybe it will fix our issue. I'm going to paste this in as such... and I'm going to bring this over. Well, we don't really need to do that, do we?
So this is going to be ES_API_KEY. I mean, that wouldn't match their docs though, right? So ES_API_KEY_ID, and now I have this variant; maybe this will fix our problem. Supposedly this one is fine. I hope I kept that key open somewhere... did I? No, I did not. So I'm going to delete this and create a new API key. This will be example-two, and we'll give it full read-only access. Oops, let's go back and create this again. From this interface I'm noticing we have all clusters, all indices, everywhere; this gives us full access everywhere, and that sounds good to me. I'll ignore the expiry date and just manually delete it out. So I've created it, but where's the key information? "Copy this key; you'll not be able to view it again." Oh well. But this key looks different. What is going on? The UI is a little inconsistent here. Hold on, that's way longer, oh my goodness. Okay, so maybe all we need is this one, and the interface was just confusing us. Let's take a look, and I guess I'll restart again. Let me make sure I saved that file... where is it... yeah, here it is. We'll run this, then this one, then this one, and now this one. Please work, please, please work. Okay, there we go! So it's really weird: it seems like there's more than one way to make a key, and the output of each key looks different. That's kind of strange, but anyway, we figured it out; that's totally fine.
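For what it's worth, the confusion maps to the two credential forms the official Python client will accept: some of Elastic's key-creation screens hand you an id plus a key, others hand you a single base64 "encoded" value. A small sketch of both, with a placeholder endpoint:

```python
from elasticsearch import Elasticsearch

# Form 1: the single base64 "encoded" value some UIs give you
es = Elasticsearch("https://my-deployment.es.example.com:443",
                   api_key="BASE64_ENCODED_KEY")

# Form 2: the (id, api_key) pair other UIs give you
es = Elasticsearch("https://my-deployment.es.example.com:443",
                   api_key=("KEY_ID", "KEY_SECRET"))

print(es.info())  # quick smoke test that authentication works
```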
We'll go ahead and run this... I don't know... okay, well, it must have the information, right? Here we have the results, and I'm curious what it's getting back, so for the semantic search text let's print it out. I'm going to separate this into two steps and say print context. We hit print, and we are getting stuff back. So it is grabbing data, but maybe the issue is that the query itself isn't optimal. This is the question we ask, but what is the query? Maybe these should be separate. So we have question, and then we have query, which will be "favorite instructor"; now we're selectively choosing just that text. Did I spell that right? I think I did. I'll go down here and update this to question. One thing I'm curious about: if we go over to the Playground, notice it says "on exampro.co who is our favorite instructor", and the AI searches for "who is the favorite instructor on exampro.co". There's also the question of spelling. I'm Canadian, so "favourite" is my spelling, but do we use the Canadian or the American spelling? We're using the American spelling on the
website, so that could confuse it as well. It's returning three documents here, but can we see the query input? I want to type in "query", because this interface doesn't necessarily show it. It has clearly reworded the query a little, but I can't see what the underlying one is. So I'll go back over to this one. We'll try it with the incorrect spelling first... no, I'm going to take it out and make it the American spelling. Favorite: how do you spell it American style? "Favorote"? I can't spell it the American way; I can't perceive the word without the "u", and that's really hard for me. Let's check against exampro.co: favorite, favorite, favorite. Okay, again, I'm dyslexic, so this stuff is genuinely challenging for me. We'll take out this one here and this one here... that doesn't look right to me... okay, there we go. Truthfully, very dyslexic. We'll paste it in here, and paste it in here, so now we have that. I don't know how many times I've restarted this; we'll restart it again, and maybe this time we'll get the result we wanted. Okay, now we're starting to get different text back. I'm not sure it's the text we want, but we'll run this... there we go. Now we're getting the result we want. So that's the integration with Elasticsearch. Again, the
UI is really good; getting the key was a pain in the butt. It looks like we have other connectors here, which might be interesting; if there are other data sources we want to plug in, they're clearly here, and that could be easy to do. But I'm not sure how good this is cost-wise, so let's look at Elasticsearch pricing; I'd imagine it's on par with AWS. What's the pricing for this thing? $95 a month for Standard, so that sounds about right. I'm pretty certain an Elasticsearch cluster on AWS is going to cost way more than that, so I think this is still more cost-effective. Let's look up AWS Elasticsearch pricing; the AWS equivalent is OpenSearch, though I think there are Elastic deployments onto AWS as well. Let's see what OpenSearch costs. You obviously get a free tier, but scrolling down, maybe AWS OpenSearch is cheaper, I don't know. With the smallest instance, the per-hour rate times roughly 720 hours in a month comes out to about $25 a month. I don't think that's right, because I know that when you run a knowledge base it costs more, and I don't think you can control what size it is. So let's search for Amazon Knowledge Base pricing, because that's what we're really comparing against; we're not comparing against running our own OpenSearch service, though if we wanted to do that indirectly, maybe we could.
This is what I really want to know: what does this cost? So this is Amazon Bedrock, and I'm looking specifically for the Knowledge Bases pricing. Knowledge base, knowledge base... that's the structured query pricing, but what about just running it? Unless they changed it, I'm pretty certain just running it costs quite a bit of money: $2 per 10,000 queries. Did they change their pricing model? Let's search for the Amazon Knowledge Base pricing model, because they might have changed it. Yeah, see, this article is talking about how expensive it gets: "AWS Bedrock Knowledge Base minimum cost, beware". They're talking about what it runs underneath, and it comes out to $720 per month, or $360. That's why I keep thinking it's very expensive, but I couldn't figure it out from that pricing page. Anyway, Elasticsearch is really good; I've always heard good things about it. So other than the API key being a little confusing, we did get a working example. We're going to wrap this one up, make sure we didn't commit anything sensitive, and say that we're done with Elasticsearch. That was great, and we'll see you in the next one. Ciao!

Hey, this is Andrew Brown. In this video we're going to take a look at utilizing MongoDB as the place where we store our data. Let's go ahead and sign in or make a free account; I already have an account. MongoDB is really good. I'm not in here every single day, but I do know my way around
MongoDB, and I know there's a good example out there working with Cohere. "Link Google account: by linking your account we can switch your authentication method." Sure, we'll link our account here. Oh, I do not want to have to put a password in; just let me in here today. I'll give this another go. I think it's because I have a manual account, so I'm going to return to login and just log in manually. Give me one moment... there we go, we're in. "Remind me later": I do not want to do that right now. So now I'm in MongoDB. I probably already have a cluster set up here... actually, it looks like it's deleted. I want to start working with this, and I know Cohere has an example, so let's search "Cohere MongoDB RAG example"; that might be the one we want as a way to do integrations. So: MongoDB and Cohere, MongoDB Atlas Search as a fully managed thing. Let's look at this guide on how to use embeddings. I know I've done one of these; I'm just trying to find it. Is this the one I used before? I'm not certain. I think I'd rather go to the MongoDB website, look at RAG that way, and find an example there, because I think it went like this: when I went to Vector Search and found the example there, that was the one that worked a lot better for me. So again, just give me a second to find it. All right, I got to the docs for Atlas Vector Search RAG, and under here we have AI integrations. I'm looking for a particular integration; I remember having a tutorial using it with MongoDB. Give me a second, I probably have it in my email... found it. It's over here in the Cohere AI guides, and this is one I think we can utilize. They
do have a Colab notebook, so maybe we should just open this up in Colab; I feel like we haven't done enough in Colab, and since there's a button, let's press it and launch it. So this is the integration between Cohere and MongoDB. We'll scroll down a little and see what we have: they're explaining how it all works, and I want to go through the steps and get this installed. Here we're installing datasets (that's obviously Hugging Face) and tqdm, which we see a lot. Is that just for progress? "A fast, extensible progress bar." It is, it's just the progress bar. Then we're bringing in Cohere and PyMongo. So we'll hit run; at this point you should have Colab, so this should be a non-issue for you, and we'll give it a moment. All right, it says the pip dependency resolver does not currently take into account... okay, it's that generic warning that says there could be issues, but I don't believe there's an issue here. Then we need to put in our Cohere API key, so I'll go over to Cohere, sign in, and get an API key. Now that I'm getting really good at loading up my environment variables, I wonder if there's a way to do that here, so we're not directly exposing our key; I'm not really sure if there actually is a way. This key is going to be called mongodb; we'll copy that. I mean, there is a file system right here, so in theory we could make a .env file... but where is this file? Uh oh, now I don't know where I am. Hate when that happens, so I'll just re-collapse that. I really don't know where I am, so I'm just going to do it the lame way and paste the key in directly. It looks like it also wants a Hugging Face token; I'm not sure why. Oh, maybe for the datasets. So we'll make our way
over to Hugging Face (I guess you get really good at grabbing keys from everywhere) and go down to Access Tokens. I've got to sign in, so give me a moment. Okay, I have an older key here that I'm going to delete; I'm always making new keys, every two seconds I'm making keys. We'll make a new one: read-only, just for mongodb. I'm not sure what it wants to pull at this point, but we'll find out in a moment. We'll paste in that key and run this. So we're setting the Cohere API key. I do not like setting it inline like this, but I don't see how else we'd set it in this environment. We're not sharing this with anybody, so it should be a non-issue, but I don't like leaving my keys around; of course, you can just delete your keys afterwards. So: the dataset contains detailed information about multiple
technology companies in the information technology sector (okay, that sounds fine), and the market analysis reports provide in-depth information about each company's performance. So here we have the dataset: we're loading it and then putting it into a DataFrame, I suppose. Yeah, we are, down below. If you know pandas... I know pandas an okay amount, I'm not really good with it per se... we load this in. Then we have data preparation: combine attributes. This seems like we're prepping the data into some kind of format by adding a new column called combined attributes. I think that's a common technique when you're searching stuff in a database: you make a new column specifically with all the combined information so you can do full-text search across it. I'm assuming that's why it's doing that. And now we're displaying the full combined attributes.
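The "combined attributes" trick is easy to picture in pandas. A hedged sketch with invented column names, not the lab's exact schema:

```python
import pandas as pd

df = pd.DataFrame({
    "company": ["Acme Corp", "Globex"],
    "sector": ["IT services", "Cloud infrastructure"],
    "analysis": ["Strong revenue growth...", "Negative sentiment around churn..."],
})

# Collapse the searchable fields into one text column, so a single
# embedding (or full-text index) covers everything about the row
df["combined_attributes"] = df.apply(
    lambda row: f"{row['company']} | {row['sector']} | {row['analysis']}", axis=1)

print(df["combined_attributes"].head())
```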
Here we have the embedding generation. We did embeddings before when we used Pinecone, and this is the same process: we're going to iterate through the data and create the embeddings, and it looks like we're storing them back into the dataset. Here it is; I love the progress bar, this is a really well-written lab. We'll give it a moment to run... I'm not sure why it's complaining; I've never seen this little warning before, but it does seem stopped. Oh, we did get an error: we're limited to 40 API calls per minute, so you can upgrade from your trial or wait. So we'll just need to wait a few minutes; we'll be back in just a minute. All right, I've waited a minute and I'm going to try running this again; I believe the dataset is just very large, so we might hit the issue again. And we hit 64% before being limited to 40 API calls. I feel like if the dataset were a little smaller we could get through this, so there's probably some way to adjust it. Let me see: how could we reduce this dataset? I believe Google Gemini is in here somewhere; don't we have an AI coding assistant built into this platform? We do. I wonder if it would... no, it can only explain code. Again, I'm not really good with pandas, so I'm just trying to think about how to fix this. Let's go up: do we have a combined dataset? I'm not sure how many rows are in our dataset. Let's go up here: dataset
take(100), so it's taking 100 records from the dataset. I'm thinking what we can do is just cut that in half, like this. Let's do that, so I'm kind of cheating. Whoops, I didn't mean to do that; cancel. I tried to save, it's just a habit of mine. So what we're doing instead is loading half that dataset now. We'll run this, and this, and I'm going to wait a minute; I'll be back in a moment. All right, let's try running this again with a much smaller dataset, I'm hoping... oh, we almost got it. Okay, we're going to shrink it a bit more. Of course, we could try to rate-limit this ourselves: work a lot slower, don't make as many calls. But I'm trying to do it in one go, so I'll turn this down to, let's say, 40... whoops, I keep hitting that save button... maybe 35. We don't need the full dataset, and I'm hoping this gets us where we need to go. I'll hit run, then run this one again, this one again, and this one again, wait another minute, and this time we're going to get it.
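If you did want to respect the limit instead of shrinking the data, the usual pattern is to batch the calls and sleep between windows. A rough sketch, assuming a hypothetical embed_one function and the 40-calls-per-minute cap we kept hitting:

```python
import time

CALLS_PER_MINUTE = 40  # the trial limit from the error message

def embed_all(texts, embed_one):
    """embed_one is whatever single-call embedding function you're using."""
    embeddings = []
    for i, text in enumerate(texts):
        if i > 0 and i % CALLS_PER_MINUTE == 0:
            time.sleep(60)  # wait out the current one-minute window
        embeddings.append(embed_one(text))
    return embeddings
```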
All right, I waited a generous two minutes. Every time I'm waiting I just watch this video, the ending video from Legend of the Galactic Heroes; I don't know, I always like to watch that one. We'll run this thing one more time... and we did it. We cheated; we cheated around the API limit. We could have run the full set; I just don't care. So over here we have the MongoDB vector setup: first, create a MongoDB Atlas account (okay, we have that), then, in the Atlas UI, create a database called asset management use case. Let's follow along. We'll go over here and... I mean, where is the Atlas UI? I can't really tell, but we can go ahead and create ourselves a cluster, and we absolutely want a free one. But how do I know? Let's go back to those docs: "select the Atlas UI as the procedure to deploy your first cluster". Okay, but what kind of cluster is this going to deploy? That's what I can't tell, so let's go back for a second. I mean, there's Atlas Search: "you don't have a database; integrate search into your application; once your database is up and running you can create a vector search". Which makes me think that first we have to have a cluster, and then we add Atlas Search, right? So we'll go ahead and build a database anyway. We'll just reset here. I don't care what this is; it looks like it's still going to the same place, it's just going to build a new cluster. Yeah, it
is. Okay, great. We're going to go with the free tier, and the name is going to be this... "cluster names can only contain ASCII letters and numbers". Okay, so that makes me think what we're doing here is different. Let's read this again: first, register an Atlas account and follow the instructions; deploy a free cluster via the Atlas UI; log into Atlas, go over to organizations, create a cluster; M0 clusters are free forever and suitable for learning MongoDB. Okay, I believe that's what we chose; it's not called that here, this one's an M10 and we want the M0. Then it goes on to the name it says we can't use: create the database asset management use case, and within the database create the collection market reports. If it has a problem with the name we'll just change it; not a big deal, maybe something has changed since they wrote the guide. So we do this and this. Okay, great, and I don't care where it goes; North Virginia is fine, automatic security setup. Preload a data sample? I don't think we want a sample loaded in, because we're loading our own data. We'll go to advanced configuration, because I'm not sure if there's something else we need to see. We'll go to free. Wow, there are way more options here. Cluster capacity sounds fine; we don't need termination protection; and here's the cluster name again, so I'll try once more to put the name in. What a little pain this is. We'll go over
to here, fix this, and create the cluster. Now we need to set some things, so we'll go back over here. It's not saying what we have to set, but we need a username, so I'll just say admin, and the password is going to be capital-T Testing123456 with an exclamation mark; if you've watched any of my videos, you know that's how I always do it. It warns that passwords containing special characters will be URL-encoded, so what I'll do is just capital-T Testing123456 and leave off the exclamation mark at the end. It looks like we already have a user down here, but we're going to create this one anyway, so now we have this admin user. Then it asks about my local environment or cloud environment; I'm not really worried about this right now. Where would you like to connect from? Well, technically my cloud environment. "Use this to add a network IP address to the IP access list; you can modify it any time; use this to configure network access between Atlas and your environment." That's going to be kind of hard, because this notebook is not on my local machine; it's running wherever Colab runs it, and I'm not going to know its IP address, so I'm not sure what to do there. But down below it says 0.0.0.0/0, so it looks like it's already open to the internet. I'm going to hit finish, because if this cluster is already open to the internet, I'm not going to have an issue. So now we've created our cluster... wait, it's deploying, zero of three changes, so it's still deploying; it's not 100% done yet, and that's what we're waiting on. I'll go back to Atlas: we have Project 0 (you may have had to create a project; I didn't) and we have our cluster spinning up. Okay, let's
go back over here. So we have our data, we see that information, and then it's going to want to know our URI; that's the next thing we provide. If I click connect... it's not ready yet, but in here, I think through Drivers, is where we'll see our connection string. It's still provisioning, so we're going to wait... oh, here it is. Let's try that again: I'll hit connect, go to Drivers, and down below it says "use this connection string in your application". I actually want the real one; it's showing db_username and db_password placeholders, so it seems I have to replace those myself, which is not a big deal. We said the user was admin, and the password was capital-T Testing123456, right? That's what it should be to get in. I also kind of remember having to fiddle with these at one point, but I'm not sure if that's still an issue. So we'll set that URI, and the rest here should attempt to establish a connection to MongoDB. It's looking for a very specific database name; it's not my fault the guide uses a name that isn't allowed. It doesn't allow underscores, so I'll change them to hyphens. But I think they said we also had to create something else, right? If we go over here: within the database,
we have to create a collection called market reports. So we'll go back over to our cluster, click into it, and look for where to create that. We have our cluster, and I need to make my collection, but the only way I can think of doing that is to literally connect to it; unless we're supposed to go here, add the search, and create it from there. Here's a collection; can I add it here? I don't want to do anything fancy, I just want to put one in. So I'm trying to find a way to insert this. Let's browse collections: yeah, we currently don't have any. Create the collection (I guess that's what we're doing right now), then create the vector search index. I'm just going to skip ahead, and maybe we actually do it through these notebook steps. So here we're establishing a connection; let's try that. It says get database and get connection, so it establishes a connection. Did it get the collection? Because I didn't make one. It says it will delete any data if it already exists; I mean, if it just works, it'll insert. So we'll just say insert into the collection... it says it inserted. So maybe it created the collection implicitly. Let's give this a hard refresh and see... there it is. Okay, great, we didn't have to do anything; it created it for us. Obviously we had to make that name change, but that's what I thought.
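That implicit creation is normal PyMongo behavior: databases and collections come into existence on the first write. A minimal sketch, with placeholder names and a placeholder URI (use the string from the Atlas Drivers screen):

```python
from pymongo import MongoClient

# placeholder URI; substitute your real Atlas connection string
client = MongoClient("mongodb+srv://admin:Testing123456@cluster0.example.mongodb.net")

db = client["asset-management-use-case"]   # doesn't exist yet, and that's fine
collection = db["market-reports"]          # also created lazily

# the first insert is what actually materializes the database and collection
collection.insert_many([{"company": "Acme Corp", "combined_attributes": "..."}])
print(db.list_collection_names())  # now shows ['market-reports']
```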
So we'll go down below, and now it's talking about the MongoDB query language and vector search. We can press this, and then they talk about the Cohere reranker. The idea is that you're querying (we haven't run anything yet, we're just defining functions), but the idea is that we query for something, it returns results, and then Cohere reranks them for relevance, which helps produce more relevant final results. So go ahead and run this next.
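The rerank step on its own looks roughly like this. A sketch assuming Cohere's Python client and one of their rerank models; check their docs for the current model name:

```python
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")  # placeholder key

# candidates as they came back from the MongoDB vector search
docs = ["Acme Corp reported strong growth...",
        "Globex flagged churn risk and negative sentiment...",
        "Initech results were flat year over year..."]

reranked = co.rerank(model="rerank-english-v3.0",
                     query="Which companies have negative sentiment?",
                     documents=docs,
                     top_n=2)

for r in reranked.results:
    # each result carries the original index and a relevance score
    print(r.relevance_score, docs[r.index])
```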
So: what companies have negative reports or negative sentiment that might deter long-term investment? And there we have some output. Handling user queries... okay, "reranked_documents is not defined". Did we not run it? Right here... and then we run it here, and I believe it's working. It's using Cohere down here with the chat endpoint. So it came back, and this one brings in the documents: here we have format documents for chat, then bring in the reranked documents. We've already pulled the data, and then we provide the documents to Cohere, because that's how their API works, at least in this version. Then we can see the final results, and we can cite the sources. Then it covers using MongoDB as a data store for conversation history; that's another thing you can do with MongoDB, though I'm more interested in the RAG side. Hopefully it's clear what's happening: we query MongoDB for the data, pass it into the Cohere reranker, and then we see the results. Now, is this a good example? I don't know. If I understood the data better, or if it were a smaller example, I could probably make better sense of it. But you can see it's pretty similar to what we did with Elasticsearch, except here we're passing documents explicitly and it embeds them into the context window however it wants. It's not that different from the other one. So there you go: we got it working, and that's all I really wanted to show you. We'll call this one done. I'm going to get rid of that cluster, just because sometimes I forget to, and by getting rid of it now I won't have to deal with it later. I'll figure out how to delete this cluster; we'll terminate it, here we go, terminate, terminate, bye-bye cluster. We're all done here, and I'll see you in the
next one. Okay, ciao!

Hey, this is Andrew Brown. In this video I want to take a look at using Postgres as a vector store, because I know we can do that. What I'm going to do is open this environment up; I clearly had an older one here in GitHub Codespaces (I said GitHub Dev, which was an accident), and we'll open this up. I think we'll get a bit of help from Claude; since I'm paying for Claude, I'd better start using it. I'll ask: can you give me a simple code example of using Postgres as a vector store? Because it has an extension we can utilize for this. Yeah, pgvector is what we're talking about here. It sure doesn't look simple, but I think it's something we can manage. So we'll come over to GenAI Essentials, bring this into a dark mode theme, and make a new folder called pgvector. Well, we need to give it permission... there we go, we fixed the permission; good, reload. It's being a bit finicky today. I'll make that new pgvector folder. Next: can you give me code to run Postgres via a Docker container? I have sample code lying around, especially from previous boot camps, but I'm just going to ask Claude to do it for us, with pgvector, and see if we get a nice Docker setup. Great; give me
the code, please. What the heck is vector_db? Oh, it's using that as the database name; okay, great. So here's the file. There's a lot going on in here, and I don't think this is everything. Yeah, this is the Dockerfile, so we'll grab it, go back over here, make a new Dockerfile, and paste it in. I'm not sure what the latest version of Postgres is, but 15 sounds okay to me; as long as it works, that's all I care about. Then we have our docker-compose.yaml file, which I missed; let's do that too. I'll create docker-compose.yaml and paste it in. That sets things up: we're clearly putting our username and password right there inline, which is totally fine; it's going to run on 5432, which makes sense; and we're mounting a volume so the data lives outside the container, mounted into this environment. Sorry, I'm not sure why this one value is left blank; I don't know if that actually matters. Unless it's related to the line below... no, that doesn't explain it. Why is that blank? I'm not sure it's an issue, but I guess we'll find out. Then we have some initial SQL, so I'll grab it and create the file here (I assume we'll have to run this manually) and paste it in. Back over here, we also have some environment variables; I'm not sure why we set them there when they're getting set elsewhere, but maybe we'll need them later. There's nothing sensitive here, so it's not a big deal. So we should be able to do docker compose up. I'm very skeptical of that empty part, but maybe it's fine. We'll go into the pgvector folder and run docker compose up. But how would it know to create the database, and how would it know the extension is enabled? Let's take a look at our init SQL: it enables
pgvector right here as an extension; that's usually what you do with Postgres, you enable extensions there. Then it creates a document_embeddings table if it doesn't exist, and an embeddings index. If we go to the Dockerfile, it's not loading the SQL anywhere; it would have been nice if it set that up, but if it doesn't, that's fine. We'll check the docker-compose.yaml file. Okay, here it is pulling the file in and putting it in a volume. Good that we put it there; I guess that way, when we connect to the Docker container, we can just run it in place. I feel like this isn't really necessary, but it's fine; I don't think it's going to load it initially. So we'll just wait for this to build. All right, our container is running on port 5432. Now we should be able to install some kind of database viewer, so I type in Postgres here; there might be a generic extension we can use. "Not meant for creating or dropping tables; a visual aid for crafting your queries." Sure, I don't care; we don't need it to create things, I just need to see something. Is it the most popular one? It looks very popular, so we'll install it. Right now there are no databases (no surprise), unless we have to create a connection first. It is running on 5432, so we should just be able to hit enter: 127.0.0.1, the user is vector_db... and the password here is... I do not want to save that, thank you. It should be on port 5432 with a standard connection, and the database name is vector_db. I definitely put in the wrong password; the connection failed, so we'll go back a couple of steps. This is supposed to be vector_pass. I can't say I really like this interface; it's not the one I usually use. Here we can see the vector database; we'll hit enter, and now we have an established connection. I'm curious: do we have documents? It looks like the init SQL got picked up. Is that a thing for Postgres? Okay, it actually is; I didn't know that. How did it know to pick it up? Anyway, that's interesting, because now our table is set up with the stuff we need, and that's really good.
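The mystery has a boring answer: the official postgres image runs any .sql files mounted into /docker-entrypoint-initdb.d the first time the data directory is initialized, which is presumably where the generated compose file mounted our init script. You can verify the result with a quick check. A sketch assuming the credentials and table name from the generated files:

```python
import psycopg2

conn = psycopg2.connect("postgresql://vector_db:vector_pass@127.0.0.1:5432/vector_db")
with conn.cursor() as cur:
    # confirm the pgvector extension was enabled by the init script
    cur.execute("SELECT extname FROM pg_extension WHERE extname = 'vector'")
    print(cur.fetchone())  # ('vector',) if the init SQL ran

    # confirm the table the init script was supposed to create exists
    cur.execute("SELECT to_regclass('document_embeddings')")
    print(cur.fetchone())
conn.close()
```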
I guess my thought next is: what can we do? Let's go back over to our vector store example. We do have some code, though I'm not sure how good it is. We'll copy it, and I'm going to make a notebook file, basic.ipynb, make a new code block, and paste it in. We have a lot of stuff here, so I'm going to bring this up to the top... come on, drag up... or maybe drag this one down below, that might be easier. Let's choose the kernel so we get some syntax highlighting, because it's annoying without it. So we have a class that can add documents and do similarity search, and down below we have our example code. I'll cut this out (Python 3.12, that's fine) and paste it in. I imagine we wouldn't have psycopg2 installed, so I'll drag these around; sometimes it's easier to drag the ones below. I think we'll have to pip install psycopg2, and I can't imagine we'd have numpy in here either, so let's install that too, and hopefully both will install. Is it psycopg? Oh, it's psycopg; I've been mispronouncing it for years, but it's psycopg2. There might be a version 3, I'm not sure, but I'll just use whatever works. So we'll run this
and that works. Then we're going to initialize our class. Let's see what's happening: you pass the connection string to it, and it sets up the database. We don't need it to set up the database, so let's go back and ask: can you update the class, because we don't need it to set up the database? Let's see if Claude can refactor the code, because really all it needs to do is add documents and things like that. It seems as simple as deleting those functions, but we'll copy the new version. Claude's pretty good at producing working code, so I'm fairly confident this will work, though we won't know until we try. We'll paste it in. It needs a connection string URL; we'll go down to our implementation below (it's building our string, which looks correct), and we have some sample documents to insert. So let's see if this works; if it does, that'd be awesome. And it doesn't: password authentication failed for user. That's because this isn't the username; it's vector_db, and then the password is vector_pass. We'll try again. "Can't adapt type dict": what line does it have a problem with? parts.append, so somewhere the format's not correct. I'm carefully looking through this to see where the issue could be. Maybe it's the sample document it doesn't like, because that's the only dictionary I'm seeing that could be an issue. Yeah: add_documents, so store.add_documents
is where it's complaining. So we'll go over to Claude, give it back some of that error information, and maybe it can correct it. Clearly it's a formatting problem: we need to convert the dictionary metadata to JSON format before inserting into Postgres. Fair enough; it seems like a formatting issue. So we'll bring this in and go back over here... not this one... yeah, somewhere here it's doing some formatting, because clearly it wasn't formatted correctly before. We'll try again: "json is not defined". Fair enough, we'll just import json. I don't think we have to re-run the class, but we'll run it again anyway, then run this. "json is not defined"? Huh, that's why we imported it, right? I don't see where this line even is... oh, it's right here, from this. Okay, great. So we'll copy this (it really has its own JSON structure here), run this, then that, scroll down, run this. "No operator matches the given name and argument types; you might need to add explicit type casts." Oh my goodness. At least it's getting close to producing what we want. It's done something here; worst case, we can always read the code and make sense of it. It's not that big a deal, but I'd rather get a working version first and then read the code, which is kind of backwards, but that's totally fine. That's our example code; we'll grab this and hit run. I was just hoping it would
work... oh, there we go! Okay, great, we have a working example. Let's now go read our code and see what's going on. The idea is that we have a PGVectorStore class and some sample documents with some text. I guess what it's saying is that it's just randomly generating the embedding, so this is not a real embedding, and if we were storing this in the database for real we'd need an embedding library. What they're showing is: this is the text representation, and this is an example embedding. Down below you can see we're adding the documents, and it says np.random with a 384-length list, "replace with a real query embedding", and then it does a similarity search and so on. So can we use a real open-source embedding library, since it's not doing real embeddings? It suggests all-MiniLM-L6-v2; let's look at that before we use it. Oh, it's by Sentence Transformers. Okay, that's what I was thinking, that something like Sentence Transformers would be the thing to use here. So I'll scroll up and ask for Sentence Transformers... whoops, that's not what we want... we'll go back to our code, grab the Sentence Transformers version, and paste it in. Now I'm carefully looking at what changed: here it's passing the model
name. The code looks more or less the same, so grab this; it looks like it added a lot more documentation, I'm not sure why. But here we see the model: a SentenceTransformer is being loaded, and then we call model.encode using that particular model. Great. Then down below, I imagine we'll see some of that as well... oh, it actually updated the vector store too, which is good, because the connection string wasn't correct before. Going down, we're getting different kinds of data to insert, which is totally fine, so we'll paste this in. Now we have a bunch of texts, each with associated metadata, and we call add the texts. It created a new function up here that encodes each one, returns the documents, and inserts them for us. So they've definitely refactored it another way: instead of add_documents, they renamed it to add_texts.
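The essence of the refactor (real embeddings in, nearest-neighbor query out) fits in a few lines. A sketch assuming the table from our init SQL, with assumed content/embedding column names and a 384-dimension vector column matching all-MiniLM-L6-v2's output size:

```python
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
conn = psycopg2.connect("postgresql://vector_db:vector_pass@127.0.0.1:5432/vector_db")

texts = ["Postgres can act as a vector store", "pgvector adds a vector column type"]
with conn.cursor() as cur:
    for text in texts:
        emb = model.encode(text).tolist()
        # pass the vector in its text form and cast; the pgvector package
        # also ships adapters if you'd rather not cast manually
        cur.execute("INSERT INTO document_embeddings (content, embedding) "
                    "VALUES (%s, %s::vector)", (text, str(emb)))
    conn.commit()

    query_emb = model.encode("what is pgvector?").tolist()
    # <=> is pgvector's cosine-distance operator; smaller means more similar
    cur.execute("SELECT content FROM document_embeddings "
                "ORDER BY embedding <=> %s::vector LIMIT 2", (str(query_emb),))
    print(cur.fetchall())
```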
I don't know why they renamed it. Then we have our similarity search, so let's run this, then that... oh right, we need to go all the way to the top, because we actually need to install sentence-transformers. Let's grab that and run it. It's great that we actually get to use Sentence Transformers for embeddings; we talked about it earlier, with SBERT, in that part of the video, but I wasn't sure we'd actually get to utilize it in this manner. We'll wait for the install to finish... there we go. Then we run this (it's highlighting as if it's not being used, but hopefully it is), run this, and this, and there we go. We've done a similarity search, and we've used Postgres as a vector store. Postgres is okay early on; for production you probably don't want to use Postgres as your vector store, but again it depends on the scope of your project, so it's up to you to figure that out. That's our Postgres example with pgvector. I'll see you in the next one. Ciao!

Hey, this is Andrew Brown. I want to take a look at SerpAPI, as it's something commonly used to search out to the internet. You'll also see a competing one, so there's Serper versus SerpAPI versus... was it Tavily? Yeah, Tavily is the other one. I honestly
rarely ever use these APIs directly, but I'm sure we can figure it out, so let's jump in. Do we have to sign up to utilize this? Let's go to pricing. There's a free tier, so let's sign up. I'm going to connect with my GitHub account... actually, I'll connect with Google, because it'll be easier; GitHub would make a lot more work for me today. Let's see if we can utilize their API. We've successfully signed in with Google. The free plan includes 100 searches per month, non-commercial use only; for free plans you need to verify your account, so check your inbox, verify your email, and verify your phone number. I'll go do that and be back in a moment. All right, I've confirmed both, and we'll hit subscribe. That brings us in, and we have our account; there's a very old photo of me, but that's totally okay. Wow, is there a lot of documentation. I already have a developer environment open in GitHub Codespaces for our GenAI Essentials. I'll make a new folder called serpapi and continue using an ipynb notebook; that's the IPython notebook format that Jupyter uses, which is, to me, silly naming. So we have an API key, and I need to learn how to start working with this. We'll take a look at the Playground; I'm
not sure if there's anything interesting here. I mean, that's kind of cool, but what I want to find out is how to work with this programmatically. So where is the SerpAPI SDK? Okay, here's an example... it seems to be a Ruby example. I love Ruby, but let's switch over to Python. Here we go. We'll close this out (that's my personal email) and add SerpAPI: pip install serpapi, and we'll also get python-dotenv, so we'll grab both of those. Then we have some stuff down below that we'll add here... and, I mean, is that my API key baked into the sample? If it is, it's really silly that they did that. We'll go back to my account; this is my API key. Oh yeah, they put it right in the sample code; that's a very weird interface. We'll place it in here, and I'll also make a .env.example file. It's not like they make hiding your key easy, but we'll add a .gitignore and ignore the .env. So we have an API key, which I'll just call SERP_API_KEY. Then we'll go back to this example, grab it, and paste it in here. I'll need to import a few things; I'll go over to my Streamlit project, because I always pull these from there, at the bottom. We'll grab those three lines and go back
then we'll go back over to this example, go down a line, and load in our .env file here. I'm not sure why this is an issue — cannot import — so I'll bring this down a line so it doesn't conflict with what we're doing. I'm going to run this again: from serpapi import... well, that's what the library is called, okay. well, maybe it's not called that. serpapi python... maybe that's not the name of the library; they didn't particularly tell us what the name of the library was. okay, but how do we install it? so the actual package is called google-search-results, interesting. so we go back over to here, paste this in as such, and we will then install google-search-results. that way, then we'll run this one, then we'll run this one... no, no, that should not be an issue now. I'm going to give this a restart — maybe it requires a hard restart — and then we'll load this in here, and load this in here, and there we go, now we have a non-issue. so then we'll grab this, and I'll just say os.environ.get, and this will be — what did we call it — SERPAPI_API_KEY. so this is literally just searching out to Google, and we'll go down here, and then we'll print the results. AI overview... okay, well, whatever the results were, let's just print them. I'm going to do this in a separate cell just so I'm not calling it a bunch of times, and let's print it here. so you can see we have some kind of results, but this is not easy to read, so I'll search "pretty print JSON python" because I can't read this. I suppose we could just dump it, so let's take a look at that. did that work? yeah — import json, there we go — and then we'll go down to here and run this, and now we're starting to see our results. okay, so what do we have? we have search metadata, search parameters, search information, ads... this is not very useful to me, so I'm not exactly sure what we're expected to do with this, but let's go back and see if there are better examples.
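before moving on, here's that whole flow as one minimal sketch, assuming the package names above (pip install google-search-results python-dotenv) and the SERPAPI_API_KEY variable from the .env:

```python
import json
import os

from dotenv import load_dotenv
from serpapi import GoogleSearch  # exported by the google-search-results package

load_dotenv()  # reads SERPAPI_API_KEY from the .env file

search = GoogleSearch({
    "q": "coffee",  # the search query
    "api_key": os.environ.get("SERPAPI_API_KEY"),
})
results = search.get_dict()

# the raw dict is unreadable, so pretty print it with an indent
print(json.dumps(results, indent=2))
```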
so yeah, we have the Google Search API over here, and the search query, that's fine. here they're showing an example of what's being returned back, okay, great. so here we have a bunch of stuff, and it's a question of what we're looking for. if we go here, we have local results, related questions, organic results — so maybe we're interested in organic results, if that even shows up. so we'll go back over to [Music] here... why doesn't that match? search information... oh, it's right here: organic_results states results for the exact spelling. no, that's still not very useful. let's take a look at what they thought we were going to get back. well, this one's a little bit different: this one's searching coffee, and it's specifically looking at the organic results. but if we go back over to here, we just have search.get_dict(), so I'm going to grab this code and see if we can get a better result. okay, so I want to grab this, and here we're specifically setting the engine to be Google, which maybe will get us better results — I'm not sure, those lines are still the same — and I'm going to go down here and just print this out like this. we'll run that, it's going out and searching, and then we'll come back, run this, and now we are getting the organic results back. okay, so now that is a lot more useful in terms of information.
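pulling out just the organic results looks something like this — the keys match what came back in this run, though responses vary by query and engine:

```python
# each organic result is a dict with position, title, link, snippet, etc.
for item in results.get("organic_results", []):
    print(item.get("position"), item.get("title"), item.get("link"))
```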
I'm kind of curious how it would go and get stuff from a very specific website, because when we were doing a search like this before, it was using SerpAPI, I believe, and I'm not sure how it was crawling the specific contents of a website. you can see there's a lot of stuff in here — I mean, this must be old, there's no way you can get Twitter information now, but clearly at one point you could. it seems like this allows for a lot of different stuff: inline images, buying guides... but I guess if there's a website, maybe they have a cached version of it that you follow through. well, you do have links, right? so I guess what you would do — if it was Ruby you'd use something like Nokogiri, but here you'd use an XML or HTML parser — is just follow the link and then parse those pages. this is giving you a way to explore those pages from there. so what I would probably want to do is look up a grammar rule. let's say I was trying to learn what のは (no wa) was, so I'm going to try to switch my keyboard over here. so there is a のは entry, and I would probably actually search Bunpro, and that would probably be in my results. I'll search "bunpro" like this — I'm trying to make a real example that I would actually use — and then I would go back over to here, replace this in my query like this, and then run it. I want to see the organic search results, and now I'm starting to get it: here we can see we have a link, and the first position is correct — that's what I want. so I might add another cell here, and in this one I want the actual link, okay. I'm not sure what we'd use for Python, so I'm going to go over to Claude, just because again I'm paying for it — you probably see me using OpenAI more than Claude — and ask: using Python, I have a link to a website and I need to scrape the text but also clean out all the HTML; I know there's a cleaning library in Python, I can't remember what it's called. oh yeah, we could use Beautiful Soup to do it, that's right — Beautiful Soup is like the Nokogiri of Python. okay, we'll give it a go; it probably has something built in to scrape it out, so maybe that's what it's doing. I'm going to make a new cell, and we'll do pip install requests and beautifulsoup4 — I guess it's version four we're on right now. that was a really fast install. so here we have an example, so I'm going to grab this here — this defines the function, and we'll bring this onto the next line, okay. I'm just hoping this works: remove script and style — not exactly what I meant — clean the text, okay, so maybe that's what cleans it right there. but what I want is the link from the first result, so we have organic results, and maybe we do zero and then link. so here I'm thinking we paste this in as such, then we do square braces, then zero, and then we'll see if we can scrape the content. oh, we got the content back, cool. okay, so now we have this grammar rule and all the information about it, and now what I can do is bring in an LLM and tell it, hey, help me learn this grammar rule — or maybe even say, extract the grammar examples from this. because if we were to follow this link — we go back and say のは Bunpro — that's not quite the one, but here's the one I was looking for: のは, no wa; I keep saying it a weird way, but we'll go back and turn my Japanese keyboard back on so I can type it in. this is JLPT N5, beginner level stuff, and in here they have examples, right, so maybe I want to extract out the examples here.
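the scraping step, as a sketch — assuming pip install requests beautifulsoup4, and that results still holds the SerpAPI response from above:

```python
import requests
from bs4 import BeautifulSoup

# take the link from the first organic result
url = results["organic_results"][0]["link"]

resp = requests.get(url, timeout=30)
soup = BeautifulSoup(resp.text, "html.parser")

# drop script and style tags so only visible text remains
for tag in soup(["script", "style"]):
    tag.decompose()

clean_text = " ".join(soup.get_text(separator=" ").split())
print(clean_text[:500])  # preview the cleaned page text
```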
so we have the text, the body text, and now I want to feed it over to an LLM. I know we've done examples with a few different ones, and maybe I'd like to use Anthropic, since I'm already paying for it, and in Anthropic we do have a basic example here. so I'm going to do this — I don't need python-dotenv again, so I'm just going to grab anthropic — and go to the next line; I'll probably have to reload the .env here, but I'm going to paste this in. what I want to say is: please extract out the example Japanese sentences and provide them in a bullet list. okay, so that's what I wanted, from this text, and I need to do interpolation, so I'm going to do curly braces here, and this will be clean_text — I think that's what it was called up here, yeah — and then I think I have to put an f in front of the string for it to be interpolated correctly. so this should be what we need. I do need a key in here, so I'm going to go back up to the Anthropic example, copy this, and go down to here, onto the next line — obviously if we want to do this, we do this one as well. I'm going to go back over to the Anthropic API; I think I loaded up some money on this, yeah, because what we found out was that Anthropic, even though it supposedly has a free tier, I could not find it — it is gone. so we're going to go over to my API keys, and I'm going to create a new API key; this one will just be for SerpAPI. I'm going to add this key, and now I have it, so I'll go back over here and paste it in as such, and now I have myself a key. you're not going to be able to do this unless you have a paid version of Anthropic; if you don't, use the free OpenAI one, or go use Cohere — I have code around here you can adapt; we have a Cohere example over here, and we have an OpenAI example. I'm going to reload the .env file here. I don't think I ran this, so I'm going to run it — oh no, we're in the wrong cell. okay, so I'm over here, so I ran that; I need to reload my .env file, which is up here, and then we can go all the way down to the bottom, and hopefully this just works. I'm going to separate this out just so I'm not making multiple calls to the API, but I'm going to run this here, and we'll give it a moment — we probably could have used Haiku for this; Haiku would be good for extraction. here are the example texts that are listed... and so if we look at this, was that atarashii no wa? I'm not sure what that character is, so we'll go back over here and take a look and see if it did a good job.
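the extraction call, sketched out — this assumes a funded Anthropic account with ANTHROPIC_API_KEY in the .env, and the model id here is illustrative, so check the current model list (pip install anthropic):

```python
import os

from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

message = client.messages.create(
    model="claude-3-haiku-20240307",  # Haiku is cheap and well suited to extraction
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": f"Please extract the example Japanese sentences from this "
                   f"text and provide them as a bullet list:\n\n{clean_text}",
    }],
)
print(message.content[0].text)
```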
so this was here, atarashii no wa... so that was the first one. does that match? it does, okay — so it basically achieved exactly what I wanted, and that's how you would integrate SerpAPI as a full example. so this one is completely done. I'm going to make sure I clear out my key, because I don't want any of you folks stealing my credits there. anyway, we're going to grab this, and I'm going to show our SerpAPI example, and so now we have an example of how to work with SerpAPI, so there you go. and actually, before I go, I'm going to see if I can get rid of my SerpAPI key. I'm going to delete it... there we go, it's deleted — oh no, that was copied, I thought I deleted it — so I'm going to regenerate the key, and I'll do that off screen. we're going to call this one done, and I'll see you in the next one, okay [Music] ciao

so our friend George, who's really good at building out agentic workflows, was telling me about All Hands, which is an open source project. I thought it was by GitHub, but maybe it's just on GitHub — I'm not familiar with the company per se — but this project started as OpenDevin and has now been renamed to OpenHands. if you've heard of Devin, that is an agentic coding agent that will write code for you.
OpenDevin is the open source version of it, and they renamed it to OpenHands. I have seen this in the past, but it's been a while since I used it, so it might be fun to give it a go. it looks like we can run this in a Docker container, and if we're going to do that, I think this is where I'll use my Intel machine — if you can run Docker containers on your local machine, then go ahead and do that. last time I tried this in a video it was failing, because this machine is very old — not my Intel machine, but this big recording machine here. so I'm going to open up a remote desktop connection, as I have this other machine on my network; I believe it's still turned on, so we'll try to connect to it. okay, say yes, and here we go. as you can see, I was trying to do something earlier here that failed — I think I was trying to use a VM, and that was unbelievably difficult, at least in the approach I was taking. let's go look for... I think it was called OpenHands — OpenHands by All Hands, yep — and they're suggesting we can start working with this by running a container. I'm not sure how large this container is, so hopefully it's not super large, but we'll go ahead and open up Visual Studio Code. I'm going to run a new terminal and just cd into my sites directory — we're not doing anything in sites, but I want to make sure I have Docker installed. if you don't have Docker installed, we have a tutorial on that in setting up your dev environment, so follow the instructions there. whoops, I did not mean to split that side by side, but actually that might work perfectly for me. I want to first pull the container, so go ahead and do that, and we'll wait for the pull. all right, so we've pulled the image, and now the next question is: can we run it? I'm carefully looking here to see if we need anything additional; it doesn't appear that we do. it's a lot of lines, but let's hope this just works. we'll hit enter and give it a moment. one thing it's saying is that we'll need a model provider and API key, such as Claude Sonnet, and that's something we could probably get set up, so we'll see what configuration it has. I wonder if we can configure that through environment variables — that would be the easiest way — but I imagine it wants an API key directly. let's take a look at that while this is going, and we'll expand this. so let's see what options we do have: the easiest for me to hook up would be GPT-4o, so I might just do that, but we'll have to wait for it to get running first. it seems like it's pulling another image, so maybe there's something else it needed. it says it's now running on port 3000, so we'll take a look, give it a moment to start, and we are in. so now we have some options. I'm going to go with OpenAI — again, you can choose whatever you want; it looks like we can put in Azure, but that's going to be the same thing as OpenAI. I'm going to choose GPT-4o, and now I'll go get an API key, as that again is easy for me to get.
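for reference, the pull-and-run step from earlier was roughly the following — image names and tags change between releases, so copy the current ones from the OpenHands README rather than from here:

```bash
docker pull docker.all-hands.dev/all-hands-ai/openhands:0.14

docker run -it --rm --pull=always \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.14-nikolaik \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
  docker.all-hands.dev/all-hands-ai/openhands:0.14
# then open http://localhost:3000 and pick your provider, model, and API key
```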
so we'll go to OpenAI, and I'm just doing this off screen here, okay. we'll drop down... there we go — I'm not sure why they have all that animation for their stuff — and I'm going to create a new project; this one's going to be called open hands. then we'll go into manage projects, and from there to API keys. I'm going to create a new API key, call it open hands, select the project as open hands, create the key, copy the key, go back over here, and paste in the key. so now we have the key, and I'm going to hit save. "we use tools to understand how our application is used" — sure, that's totally fine; I don't need to save the password, that's fine. so now we're in here, and it's asking what we want to build. understand that we are using OpenAI, but we can plug in whatever we want; we could also import stuff. I wonder if we could use an open source model — it says Llama 3.1, so it would be kind of interesting if we could plug in a local open source model. I mean, if it's OpenAI compatible, I bet you could; maybe there's also OpenRouter, I'm not sure, but we'll just use the API for now. I'll say, you know, I want to build a game of Zork, okay. so now it opens this really interesting interface, and we're just waiting for it to do something — it says waiting for the client to become ready. I've never used this before, so I'm not sure what the experience will be like. so the agent is now running the task, okay: design the game, choose a programming language, set up the development environment, develop the game; if you need help with specific parts of the process, such as setting up the development environment or coding a particular feature, feel free to ask. okay, so we'll download the files. I guess we'll just make a new folder here, call it new game, select this folder, say view files, save changes. I imagine the idea is that we need to open up that workspace here, so I'm carefully looking — I'm not sure what these buttons [Music] do. okay, what if I click this? I'm not sure what that button does; I'm just trying things here. so let's say open code, yes, and I go file, open folder — I think it was documents, okay. so my question is: all right, we have this open, but how do we start working with the code here? give me just a moment. all right, so here they're just suggesting to do something really simple, but they're not really explaining how to do it. so I'm going to write this line here: please write me a bash script. I'm expecting there to be files... oh, here we go, so here's a file — I'm not sure why this one was so much more complicated, but now we have our hello.sh file. so, can you run the hello.sh? let's see if we can do that. look at that — the idea is that it has the ability to write code, and it also has the ability to execute code, which I think is really interesting. it looks like we could also have a Jupyter notebook, so let's say: can you make a simple Jupyter notebook to use BERT to do classification? let's see if we can do that; we'll give it a moment. okay, so it's telling us some things it can do, so continue. all right, but the thing is, what I wanted was for it to actually create the files — oh, it did, there it is, okay, great. so now if I go over to here... why can't I see it over here, though? well, at least we have the file here, so that's fine; if we go over to here, we can see it over here as well. so I can say yes — and now I'm somewhere completely else. let's try this again: open code, there we go, and it has written one for us. can you execute the notebook and make sure it's working? — not exactly what I meant to write, but the question is whether it can execute it. so supposedly it is running the command — oh, over here, okay — and we obviously wouldn't see it here, but it clearly is executing: it's opening the notebook, trying to execute it, and seeing what happens. while that's happening, let's read through it: it installs torch, it brings in bert-base-uncased, and we have a simple data set, so it should be pretty simple and easy to run on my machine. I'm going to pause and see if it actually works.
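for a sense of what the generated notebook was doing, here's a minimal sketch — assuming pip install torch transformers, and note that bert-base-uncased ships with an untrained classification head, so the predictions are meaningless until you fine-tune:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # classification head is random until fine-tuned
)

inputs = tokenizer("I really enjoyed this course", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index (0 or 1)
```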
all right, so it said it ran the jupyter command, and now it's running it again — I'm not sure why it's doing it twice. we have an execution timeout, the Jupyter current workspace is this, and so it might be attempting it more than once to make this work, and here we can see that it did time out. I don't want it retrying multiple times, so I'm just going to hit stop, but you get the idea of what it can do here. so we can say: can you make my bash script into a simple game? let's see if we can do that. great — it says the script can generate a random number between one and ten, so let's see if it does that. obviously you can also work with browser files, which is kind of interesting. we'll give it some time to generate. all right, so it comes back and says: to transform your bash script, we can create a text-based adventure game, and I'll create it in the bash script — I'm not sure why it mentioned the Jupyter notebook, as we were not suggesting it do that. the script has been transformed into a simple number guessing game; you can run the script with the following command. but if we go back over to here, it does not look like it's been transformed, okay, so you know, this thing isn't exactly perfect — oh, maybe we have to continue, do this, and then we start the game. so yeah, I'm not exactly sure how to get this to work, but it is cool that it's available. maybe a different model like Claude Sonnet 3.5 would perform a lot better — I'm not certain, and it's kind of a pain for me to switch out my keys right now, so it's not something I'm going to do. we do have some other advanced options, and it looks like we can switch out the agent — right now we have a CodeAct agent, but there are other ones here. I don't know much about these other agents, but it is interesting that we can do this.
so that's all I really wanted to show you — that there are open source projects out there, and you have to play around with them to see what the results are, but you can see all the code, and if you wanted to modify it and make your own, that would be really cool [Music] okay

hey, this is Andrew Brown. in this video I want to show you CrewAI, which allows you to build AI agents. it is an opinionated framework — George, our friend, showed it to us over at Stack Pack — and wow, is this thing powerful. but when I tried it last time, the library wasn't pulling, so I'm hoping this time around we'll have a bit better luck. I'm going to go over to my GitHub GenAI Essentials, and if I already have a GitHub Codespace running, I'm going to utilize that — as you see with a lot of these open source or, sorry, serverless things, we don't need a lot of compute, so things have been pretty easy so far. I think this environment is already running from last time; you've seen me launch these environments hundreds of times, so I don't think I need to show you that, and these are old keys, so I'm not really worried about them. I'm going to make a new folder here called crewai, and we're going to start working with this, if we can get it to install properly. so I'm going over to CrewAI, and they should have a lot of cool examples we can utilize. I'm going into examples... not exactly what I wanted. I know they have a ton of examples... how-to? nope. tools? nope. where did I see this before? let's go back and look — ah, they have a completely different repo for the examples, and here they are. I'm going to press period on my keyboard to open this up in github.dev and download this one here. so I'm downloading it to my desktop in a folder, so give me a moment. it went to a Quest UI folder — I'm not sure what Quest UI was, but I'm going to download into there, and once that's downloaded, I'll bring it over into this folder. so give me a moment as I look for this Quest UI directory — not sure why it's called that — but I believe it's downloaded, so I'm going into this crewai directory and dragging it in. if you're using my repo, you can just open it up.
so they're talking about how this might work. there is a pyproject.toml in here that shows some configuration — does this use Poetry? I found out there's a thing called Poetry for Python, and it sounds like it's supposed to be basically Bundler for Python — similar to pip, but a full dependency manager. yeah, there's a poetry.lock file; this is totally just like Bundler. so maybe I should install Poetry. we'll go ahead here — I don't think Poetry is installed, and yeah, there's no way it would be, especially if I spell it wrong it's not going to work. so I'm going to do a pip install poetry; I'm going to assume that's how we install it. I haven't really used it before — well, I've seen it before, but when I did this last time it didn't work as expected. I'm going to do a poetry install and see if that works... and it's saying it can't find it because we're in the wrong directory, so we'll cd into it, okay. so we'll do a poetry install, and here it says the current activated Python version 3.12 is not supported by the project — it wants a version of 3.10, and we've run into this issue before. this is where we'll need to create a new environment. do we have conda installed? ah, we do, excellent, and I believe we have instructions in our local dev environment setup on how to create a new Python environment, so let's read through that. create a new environment — here we go. so we'll do conda create --name — this one's going to be called crewai — and we'll say python=3.10.0 -y, and we'll go ahead and install that. this will be our workaround here. once that's set up, we'll say conda activate crewai... conda init before conda activate, okay — I guess we have to do that for the first time, that's fine. conda activate... okay, but we did do that, right? I'm going to try this again — maybe it didn't create it at all. all I'm thinking is that maybe it didn't create it because we didn't do the conda init; that's something I didn't expect we'd have to do. it's still complaining, so let's try conda init --all — maybe that's what it wants. there we go, and now we'll try this. I'm going to close the shell out, make a new shell here, and close this one. I'll try this again... so now we're seeing it, so it is definitely initialized, and maybe that was just part of the problem. we'll go into game-builder and say conda activate crewai — there we go; stopping and starting sometimes just fixes things for us.
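the whole workaround in one place, as a sketch — the env name crewai is just what was picked here:

```bash
# the project pins Python 3.10, so create a fresh conda environment for it
conda create --name crewai python=3.10.0 -y
conda init --all     # needed once so your shell knows about conda
# open a new shell after init, then:
conda activate crewai
```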
so now we'll do a pip install poetry in this environment — I'm just going based on my Bundler experience from Ruby, so hopefully that works — and that's going through. now we'll do poetry install, and it says the pyproject.toml has changed significantly since poetry.lock was generated; run poetry lock to fix the lock file. okay, let's do that: poetry lock. I'm not sure — did it install the dependencies? there's a --no-update option, and I mean, I want to use whatever is in the lock file, but I'm not that familiar with this tool, and I'm surprised it's just giving me red text; I'm like, is that a bad thing? I don't know. so we'll wait for it to do whatever it needs to do, and then we'll see what happens. all right, so it looks like that resolved, and maybe this time I have CrewAI installed correctly. what I'm going to do here is take a look at... well, we do have a .env.example that we have to configure, so I'm going to assume we have to make that. I'll copy it to .env and make sure that file is ignored — it is — and luckily I use OpenAI constantly, so this is going to be easy for us. OpenAI — I really hate where this is, it's so slow loading too — I have a basic project in here that I just keep using, so I'm going into that; if you don't have one, create your project. we're going to go to API keys. I have a key from earlier, which I'm going to revoke, and I'm going to create a new key — I'll call this crewai — and we'll select basic and create that. now we have a key; we'll copy it, make our way back over here, and paste it in. so now we have this OpenAI key, and that's configured, that's good. but let's take a look at our source code so we understand it. we have a main.py, and this loads in the GameBuilderCrew, which is in another file over here — we'll talk about that file in a second — but here we're loading in the game design configuration YAML file from over here, and if you can tell, it basically is a prompt: it tells you how this game works and all the core game mechanics the agent has to follow, and we actually have three different game examples here. so coming back to our main.py, we can see that the input of this run is snake, so it's going to build the snake game, and then it says get the crew, kick it off, and provide the inputs, and here are the results. then down below we have another example: train the crew for a given number of iterations — it goes through here, and this one is specifically for Pac-Man. so we have run and we have train, which are two different things. we'll go over to the actual game builder crew file, and it's importing Agent, Crew, Process, Task, so I assume these are the components of CrewAI we have to work with. what I think is really interesting is that it defines a senior engineer agent, a quality assurance engineer agent, and a chief QA engineer agent, and then we have the code task we want them to perform, a review task, and an evaluate task, and then the crew itself. if we go over to tasks.yaml — this doesn't explicitly get loaded anywhere, so I think it's implicitly being loaded — we're describing the specific tasks it's supposed to do: you will create a game using Python, these are the instructions, your final answer must be the full Python code. then we have a review task: you will create a game using Python, these are your instructions, and using the code you got, check for errors — so this is the one that's reviewing it. then we have an evaluation task: you're helping create a game using Python; you'll look over the code and ensure that it is complete and does the job it's supposed to do. so that correlates, I believe, with our three agents and the goals they need to achieve.
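stripped down to its shapes, the example looks something like the sketch below — the roles, goals, and task text here are placeholders, not the repo's actual YAML content:

```python
from crewai import Agent, Crew, Process, Task

engineer = Agent(
    role="Senior Software Engineer",
    goal="Write a complete, working Python game",
    backstory="You are an experienced game developer.",
)
qa = Agent(
    role="Software Quality Control Engineer",
    goal="Review the generated code for errors and completeness",
    backstory="You catch the bugs other engineers miss.",
)

code_task = Task(
    description="Create a {game} game in Python.",
    expected_output="The full Python code for the game.",
    agent=engineer,
)
review_task = Task(
    description="Check the generated code for errors and fix them.",
    expected_output="Corrected, complete Python code.",
    agent=qa,
)

crew = Crew(
    agents=[engineer, qa],
    tasks=[code_task, review_task],
    process=Process.sequential,  # run the tasks one after another
)
result = crew.kickoff(inputs={"game": "snake"})  # {game} is interpolated into the tasks
print(result)
```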
so that's really interesting. let's figure out how to run it. if we go to the readme — was there a readme in here? there is, okay, great — this one says it's using GPT-4o, which is fine, that's what I'd like it to use; it will use GPT-4o unless you change it to a different model, which is totally fine. you may incur cost — I'm using a paid account; if you're using the free tier, this might fail, I don't know. so we did poetry install, which is fine — we did poetry lock and then poetry install — and we can modify the agents or tasks if we want. to run it, it's poetry run game_builder_crew, okay. I did not know it could be utilized like that, but I guess that makes sense, because if I were using Bundler, I would run in the context of Bundler — I'd do bundle exec ruby whatever — and so this is a library and it knows it can run that module. so we'll go ahead and do this, and we'll see what happens; if it works, that'd be really cool. it says there's no module named game_builder_crew. I think it's because we're in the game-builder directory — I wonder if that's messing us up. I'm not sure; maybe if we go back a directory... let's see what happens. what if we do it here, like this? could not find pyproject.toml, okay, so that's not helping. oops, no, we're still in there. is there a file missing? we have the __init__.py, right, so that's what it needs to know that it is a package. maybe we need to cd into the source directory — but then why is the .env out here, and how would it load it? okay, so game_builder_crew is the entry point defined in the pyproject.toml, but it's not installed as a script, so I'm going to cd back a directory. okay, so now it's not saying that it doesn't exist... oh, it says no module named game_builder_crew, so let's carefully read this. game_builder_crew is an entry point defined in the pyproject.toml — yes. well, first of all, what's a pyproject.toml? it specifies the build system requirements for Python projects, okay. let me go back over to here — for those that know what they're doing, you're probably like, Andrew, this is so easy, why are you getting so confused? let me just read this a little; give me a second. okay, here it's suggesting I need to do poetry install — I mean, we already did it, but let's do it again, I guess. oh — maybe because we resolved the lock file, it actually never installed anything; okay, that's probably what happened, I see. so now we're actually installing things for real, and this looks a lot like Ruby; this is totally a Bundler experience. all right, so that's now installed. let's see if we can get this running — it's going out to, I assume, ChatGPT.
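for reference, the sequence that finally worked, from inside the example's directory — the game_builder_crew entry point name comes from the example's pyproject.toml, so check yours if it differs:

```bash
poetry lock       # regenerate the lock file (add --no-update to keep pinned versions)
poetry install    # this is the step that actually installs everything
poetry run game_builder_crew
```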
what do we have so far? so we have the objective of the game, okay, and we have the agent — you will create the task — so we have the task, okay, that makes sense. we'll go down below; I assume it's just working, so we'll wait... there we go, it came back with the code, so now we're getting code back — final answer — okay, so it's produced the final answer for us. now it's going through the software quality control engineer. this is really interesting, because — this isn't out yet, but in the boot camp we code something, and we're not necessarily using an agent, but we are reviewing it by hand, and I like how this process is very similar — so that's really interesting. so now we have the software quality control engineer, and it looks like it's producing its own code, okay, and then we have the third agent running — okay, cool. and so here's the result: the final code for the game. I can assume this works, because otherwise why would they have it as an example? we'll copy that, and I'm going to make a new file here — let it be snake.py — paste it in, and now we have our game here. I'm not sure what this utilizes... it uses pygame, and I'm not sure if I can run it in this environment, because I think it has a UI, right? so I'm not sure if I could actually run it here; I'll try, but I don't think it will work. so we'll do python snake.py... first pip install pygame — pip install pygame; my typing gets a little off, my office is really cold because it's winter right now — but we'll go ahead and try this. I really don't expect this to work here... yeah, because this would only work on a local machine. so what I'll do is download this file — this looks successful — so I'm going to copy what I can here, say save, but what I really want to do is open up Visual Studio Code locally. here we go, and I was going to load in this single file... actually, I'm in the GenAI Essentials here, and this is the same repo, so I could just pull it. okay, so we're pulling the latest code, and now I'm in my local development environment, obviously. then I'm going to cd into crewai and into the game builder crew, and I'm going to do pip install pygame, because I think that's the only thing we need to run this. so that's going to install, and now I'm going to run it: we'll say python snake.py... no, it's just python, or maybe python3 snake.py. what I'm expecting is for it to launch some kind of interface, but right now we are in WSL, so that might not work, now that I'm thinking about it — how would that work here? it wouldn't. so it is running, but I don't think it has any way for us to see it. so I'm going to have to go over to the terminal here — now I'm on Windows; I have command prompt on this machine — do I have Python installed? okay, so we'll type exit, and I'm going to cd into this directory, then into GenAI Essentials, then into crewai. if you can't get this to run, it's totally fine — it's a lot of work to do this; if you're on a Mac, this would be a lot easier, obviously — but I'm going to go ahead and do dir, and then pip install pygame, so we'll install it now. now it's installing on the Windows side of my machine, not within WSL, because this needs a way of launching a window, and it needs an interface that taps into the Windows API. this should work on Mac, by the way; on Linux, I have no idea. so we'll run it: python snake.py. now it should open up, and there we go — we have our game, and it works, cool, cool. all right, I'm sure you don't want to watch me play this all day,
but let's go back to the source code in the browser here. so we saw that there were two different entry points: in our code example we have one that's train and one that's run. notice that this one says train, and it allows it to go through multiple iterations — I can't remember the term for it... what's it called... reinforcement learning, this is reinforcement learning — and that's really interesting to me, that we can go through that. so I guess the idea is that it would continuously produce the code that produces the best outcome. we're not going to run that here today, because I don't want to use up a bunch of my API credits — I'm not sure how much it would utilize — but this is just one example, and we're obviously looking at the coding example for a game; you could apply this to anything, and it gives you the boilerplate code to do it.
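roughly how the example's main.py separates the two modes — the train() signature here follows recent CrewAI releases, so verify it against your installed version before relying on it:

```python
# single run: build the game once
result = crew.kickoff(inputs={"game": "snake"})

# training mode: iterate several times and collect feedback between runs
crew.train(
    n_iterations=2,
    filename="trained_agents_data.pkl",  # where the training data is stored
    inputs={"game": "pacman"},
)
```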
and so this is something that's super powerful and really awesome. I don't know why I don't use it more, because I know about it, but anyway, there you go, and we will see you in the next one, okay, ciao