MLOps, short for machine learning operations, refers to the practice of applying DevOps principles to machine learning. This MLOps course will guide you through an end-to-end MLOps project, covering everything from data ingestion to deployment, using state-of-the-art tools like ZenML, MLflow, and various ML libraries. Ayush Singh developed this course; he has created many popular machine learning courses on our channel. This course will teach you the fundamentals of MLOps, and we will also build one end-to-end project, going from data ingestion to deployment with several state-of-the-art tools like MLflow and ZenML.
MLOps is a completely new field and there are very few resources around it, so this course is a gold mine and a game changer in the ML community. If you go through it with dedication and patience, you will succeed in learning MLOps, and you will be well placed to land whatever international-level job offer you are after. But wait, who am I? I'm a lead data scientist; I've led several products in the creator economy. Along with that, I have worked as an MLOps engineer at ZenML, one of the fastest-growing MLOps frameworks, and I have experience working as a data scientist at Artifact, building large-scale NLP products before GPT was even launched. Hopefully all of this experience makes me the right person to teach you about MLOps. All the required links and resources are listed in the description box below, so go ahead and check those out.

Hey everyone, welcome to another lecture on MLOps. We'll start with a short introduction to MLOps, and I'll make you aware of the MLOps terminology, which is very important for understanding the later content of this course. We'll also make sure you understand basics like pipelines and steps in the library we'll be using. But before that, we'll cover why there is a need for MLOps in the first place, what the stages of MLOps are, and so on. So let's get started with this lecture. First things first: there is exponential growth of
data, and the importance of artificial intelligence has also increased over time. The data has grown, but we need to make sure we utilize it in the right way, and in a positive way; that's where artificial intelligence comes in. Now you might be thinking: fair enough, we can just build a prediction model on top of it. But you should understand that machine learning is not just about building models. Why do I say this? Because your ML code, whatever machine learning model you build, is only about 20% of your whole machine learning project, of the whole business problem. A lot of other things come into play, and I'll keep proving this 20% figure throughout this video. The idea that ML in industry is much more than training models is validated by Chip Huyen, one of the experts in MLOps, and also by Elon Musk, who said that machine learning engineering is just 10% machine learning and 90% engineering. That's something really worth thinking about. Every other course online teaches only the modeling part, building machine learning models, but almost nobody teaches the engineering part. You might assume the engineering part means data structures and algorithms, or design patterns, and so on. Of course those are factors, but there is a lot more to it than DSA, which we'll explore throughout the course through our project.
explore throughout the course through our project so we uh in a typical ml team uh in our corporate we have the following uh people who are actually responsible for doing x amount of tasks so your data scientists discover the raw data develop features and train models right and data engineer who productionize the data pipeline we'll talk about the term uh productionize in a Bild but data pipeline is like where the data is coming from and then making it on a large scale right and then we have a ml engineer who sits on front to deploy
the model right so that it can be used by users by you we'll talk about what does deployment mean in just some seconds and then we integrate the service into into your website or application and then you have to monitor it we'll talk about each and every steps in grade detail and then you have a lawyer who who can just ask you we we should ask question to them can I use this data for my model yes or no and and I'm pretty much sure that you might not be aware with any of the any
of the red line over here I'll make sure that you understand each of the things like training models productionizing deployment integrating monitoring and all the stuff throughout this int introductory lecture the reason why I'm doing this introductory lecture to make sure that you understand each and every bit in the project which will use which will make use of like terminologies which will make use over there so what data science actually sees you might be thinking about okay fair enough you have pd. read CSV you read the data you fit and then some some happens and
then you simply uh and then you also have the classifier you also fit that and then you do predict and do score right that's what you see right but do you really think that uh by writing three these three lines of code people will get your job of course not right and the main focus 90% focus should be on engineering and what engineering sees is much more uh very scary than what data science is so ml in production you might if you're a bit aware even about how does it goes Etc the first step is
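To make that concrete, here is roughly what those "three lines" look like; a minimal sketch, assuming a generic CSV with a label column (the file name, column name, and model choice are all illustrative):

```python
# The "what data science sees" view: a hypothetical data.csv with a
# "label" column stands in for whatever dataset you are working with.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("data.csv")                      # read the data
X, y = df.drop(columns=["label"]), df["label"]
clf = LogisticRegression().fit(X, y)              # fit a classifier
print(clf.score(X, y))                            # predict and score
```

A few lines of modeling; the rest of this course is about everything around them.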
So what does ML in production look like? The first step is: you collect the data, you train the model, and then you deploy the model. What does deployment mean? Let's talk about it a little so you understand it; if you want to understand deployment in much greater detail, we'll have dedicated sections on it later. Deployment means that once you have a trained model, say you're working on an email spam detection project and the model currently sits on a local machine, you need a way to integrate it into Gmail so the model can make predictions for the users we are building it for. That's where deployment comes in: it is about making your local model available to the many users you are building the model for, deploying the model online. We'll talk
about deployment in much greater detail in some time. But you might be thinking: that's the process, but what does it actually look like? Basically, first you collect the data, you train the model, and you deploy the model. Once your model is deployed, you go back, collect data again, retrain the model, and this loop keeps going in the production environment. You probably have several questions: what is the loop, what is a production environment, and so on. So let's discuss what the loop means and what a production environment is. I'll take a very simple example from this image. Assume you collected the data, trained the model, and deployed the code; your spam detection system is deployed in production and is being used right now.
Now suppose you change your model: you switch your machine learning algorithm from logistic regression to naive Bayes. If you change the ML algorithm, you have to go back, retrain, and deploy the changed model again, in other words, update the model. That's one case: a different model or ML algorithm is needed. Another case: say you trained your spam detection project on a dataset that has become outdated, or new data arrives. For example, hackers or spammers change their strategy for sending spam emails; the data changes, and your model should be able to identify the new patterns the spammers are following. So whenever the data changes, new data comes in, the model gets retrained, and then it is deployed again. That's why we call it a loop: in a production environment it is a never-ending process. You collect the data, train the model, and deployment goes to production; if the model changes, you go back and push it again, and if new data arrives, you go back to data collection and push it again. I hope this makes sense; if it doesn't, don't worry, we'll have lots of examples to study.
I'll give one possible scenario of going back around the loop in a production environment: model performance starts to decay. Once you train and deploy the model, after a certain period of time its performance starts to decay. Take a fraud detection example: assume you trained and deployed a fraud detection model, and you see it is giving incorrect predictions. Most probably, the fraudsters changed their strategies or patterns, so the patterns your machine learning algorithm learned no longer hold. You need to recollect the data and retrain the model, which means going back and doing it again; and after some time the fraudsters may change strategy yet again. That's model performance decay: you go back into the loop, retrain, and redeploy. Another scenario: we might need to reformulate the problem, for example because it is difficult to gather as much data as we need. Another: violation of assumptions we made during training. When you train a model, you make certain assumptions about the input data, for example that inputs will lie in a certain range; many such assumptions come into play. If the assumptions that held for the training data change, we might need to reformulate those assumptions,
or go back and accommodate the assumptions that are being violated. Or simply the business objective changes, and you basically have to restart. So many things can send you back around the loop, and it is a never-ending process: you have to continuously watch and monitor your model. In ML production, the data affects the output of the system, and it is very hard to make it reliable: deploying the model, retraining, collecting data, going around the loop again and again. That's where MLOps comes in. MLOps is a set of practices; it is not a particular library or tool. It is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently: if anything changes in the data, the model is retrained; if the assumptions are violated, you go back again (I'm just taking one or two examples). We have to make production reliable, and at a large scale.
The term MLOps is often defined as an extension of the DevOps methodology that includes machine learning and data science assets as first-class citizens within the DevOps ecosystem. I'm pretty sure that definition feels a bit uncomfortable, so let's think about it another way. I'll give you a very simple example of MLOps. Assume you are given a game where you have to build a beautiful city. If you only build one beautiful building in that city, is that enough? Yes or no? A single beautiful building is not enough, because it needs electrical connectivity, maintenance, security systems, connections to the roads and railways, and a lot more. A single building is like a model: you have to connect it, secure it, monitor it; a lot of things go into making a fully functional city. And companies want full, standalone cities, not a single building. That's part of why people are not getting jobs: they focus only on building the building, not the whole city. MLOps is the way of building that full city. We already talked about deployment, and you might think it's very easy to deploy a model in production, but let me tell you: the trouble begins after deployment. You might be wondering why.
I'll tell you some of the things that need to be taken care of. The first one is accounting for latency. What is latency? You might be shocked by this statistic: 53% of visitors abandon a mobile site if it takes more than 3 seconds to load. And here's why that matters: say you have deployed a 120-billion-parameter model, or any very large model. Do you think that model will return a prediction in under 3 seconds? That's really hard, and latency is one of the biggest problems. If visitors are not viewing your model or website, they most likely won't engage with the brand, and they most likely won't buy or use your product. Another one is fairness. For example, Microsoft created a Twitter bot to learn from users, and after deployment it became racist and started supporting various bad ideologies. They thought it would be great, but Microsoft had to take it down within hours; what it had learned was so toxic that it was pulled in a matter of hours after deployment. It learned very bad things and would have needed retraining, but it never went back into production. Another one is lack of explainability and auditability: it is very hard to explain a model's predictions, and we also have to make sure the system is trustworthy enough. That's why rules and guidelines keep arriving, from the EU among others, to make sure AI systems follow certain principles.
And deployment is painfully slow. There was a survey of data scientists about how much time they spend deploying machine learning models: 36% said they spend up to a quarter of their time on deployment, another large share said a quarter to half of their time, 20% said half to three quarters, and 7% said more than three quarters. It is very, very slow, and you'll see why when we build the project and hit these things ourselves. I'll be completely truthful with you: when I was building the projects for this course, I spent an entire week just deploying models, because it is painfully slow; and you might be shocked to hear that I built the whole ML model in just two days. Two days of modeling, and a freaking whole week of deploying it.

Now let me talk briefly about model-centric versus data-centric. What do they mean? Model-centric means you want to improve the model without changing the data: you fix the data, you have X amount of data, and you iteratively improve your code or model by tweaking its parameters, expecting the model to perform well. In the data-centric approach, you hold the model fixed and iteratively improve the data. Most work today is model-centric and only a little is data-centric, so I suggest you focus more on the data-centric side, improving the data while holding the model still, though it's totally your choice. This is also said by Andrew Ng, a real pioneer in the field and one of my instructors too; I attended one of his webinars where he talked about model-centric versus data-centric, and it matches what we really experience in our day-to-day life as data scientists. So let's get started on the whole process of MLOps and what it includes. The first thing to worry about is: what is the business problem we want to solve? That's the first question.
Any MLOps project, any machine learning product, should start not with which ML technique to use but with which business problem you want to solve. Within that business problem, you have to take care of several things. The first is the cost of wrong predictions. Let's take a basic example: we want to forecast retail sales. In a company, wrong estimates sometimes lead to overstock of a particular product, which wastes resources, or to understock, which again causes revenue loss. In both cases, understock or overstock of your products is the problem for a retail company, and that's what we want to solve. So the first question is: what is the cost of wrong predictions? If our model doesn't estimate the right thing, the cost is quite high. Overstock, having too much of a product, leads to wasted resources and possible write-offs for unsold products; understock means missed sales opportunities and unsatisfied customers, because they can't get what they want on time. Both have quite high costs: on one side wasted resources, on the other missed opportunities.
So we have to worry about this, and if we solve the problem well, we fix both the overstock and understock issues. Let's break down the sales forecasting process: we decompose it into component tasks. Notice that we haven't jumped straight to the ML part; we first talk about the problem we want to solve, and then divide it up. The sales forecasting problem divides into several tasks: first, data gathering; second, historical sales analysis; third, market trend analysis; and finally, the actual forecasting. Data gathering means getting the required data, historical sales analysis means analyzing the past, market trend analysis tells you the trend of the market, and the last one is the actual forecasting. Now let's ask which of these can be solved with ML, and whether it will return high ROI, a high return on the time we devote to it. Of course data gathering and the other tasks are all important, but the task we can really solve with ML is the actual forecasting: estimating how much stock we should hold in a given time period. We can use ML in this forecasting task to analyze past sales data and market trends and predict future sales with higher accuracy than whatever traditional methods are in use. So after dividing the problem
into components, we understand that actual forecasting is the piece we should solve with ML, by utilizing the past data and market trends. The ROI can be estimated from the potential increase in sales and decrease in wastage due to improved forecasts: if your forecasting is good, you will notice a decrease in wasted resources, which means it is genuinely helping. Then you consider what it will cost to develop and maintain the solution; if wasted resources are dropping a lot, it is worth building the ML solution. And then you prioritize the implementation, which in this case is the actual forecasting. Okay, cool. We'll talk about structuring our project shortly; for now I'm leaving this slide up: there is a machine learning canvas you should really work through while building a project or solving a business problem with machine learning. The first element is the value proposition. First of all, we have to define the value proposition:
what the problem is, why it is important, and who our end user will be. You need to understand who needs the product or service and who it will benefit. A value proposition can follow Geoffrey Moore's positioning statement: "For (target customer) who (need), our (product/service) is (product category) that (benefit)." Basically, we have to make sure the problem is important enough to proceed with solving it. The next element is data sources: we should identify potential data sources, which can include internal databases, APIs, open datasets, and so on. We should also consider hidden costs such as data storage, purchasing external data, and the like. That's the second element. The third element is the prediction task: is it a supervised or unsupervised problem, anomaly detection, classification, regression, or ranking? You work out what the input will be, what the output will be, and what
the degree of model complexity will be. This gives you much more clarity before going to the actual coding part. The next element is feature engineering. Here you have to interact with domain experts: you might be building something really good in the healthcare space, but you're not an MBBS doctor; you need an actual doctor to explain the terminology and help you extract more information from the available data sources. That's where feature engineering comes into the picture. Then offline evaluation: you set up metrics to evaluate your system before pushing it to deployment. Pre-deployment evaluation means using the model yourself, understanding the prediction errors, and working out the cost of wrong predictions. Then, using predictions to make decisions: how will the end user interact with the predictions, and will that involve any hidden costs, such as human intervention? And at last, we collect new data: we keep
collecting new data for model retraining, to prevent model performance decay. We also consider the cost of data collection and the role of human intervention in data labeling; good labelers are very important for helping models extract patterns from the data. Then there is deciding the frequency of model retraining and the associated hidden costs: how often will we retrain the model, at what interval, and what if the tech stack changes? Then you set up metrics to track the system, to monitor your model once it is deployed. For example, in spam detection you need metrics that keep checking whether your model is giving wrong predictions; that's the monitoring part. Also identify situations where AI/ML may not be the best solution; some subtasks may be better solved without ML. That's very important to understand, because implementing an ML solution is hard and its cost is pretty big. So that's pretty much what we need to worry about in the whole MLOps procedure. Now, about the workflow of building machine-learning-based software: there are three main artifacts in ML-based software development. The first one is data, the second one is the machine learning model, and the third one is code, and
there are three main engineering phases: data engineering, ML model engineering, and code engineering. Let's talk about each, step by step. Data engineering means you collect and acquire the data and prepare it accordingly, and there are certain things to take care of here. The data engineering pipeline (a pipeline just means a step-by-step procedure) goes like this: first you ingest the data; then you explore and validate the ingested data, checking that it comes from a trusted source and exploring it to understand it; you format and clean the data; you label the data if it is a supervised learning problem; and you divide the data into training, validation, and test sets so they can be used for training models. I'm assuming you already know about training and validation splits and so on; I'm not here to explain ML basics, I'm here to explain the things that really matter. The next phase is model engineering.
The core of the ML workflow is writing and executing ML algorithms. The pipeline here goes like this: you train the model; you evaluate and validate the model pre-deployment, which confirms it is working well; you test the model on unknown, unseen samples the model has never seen; and then you package the model so the business can use it accordingly, for example as a .pkl file or similar.
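As an illustration of that packaging step, here is a minimal sketch using pickle; the digits model and the model.pkl filename are just placeholders for whatever you actually train:

```python
# A minimal sketch of packaging a trained model as a .pkl file so that
# serving code can load it later; "model.pkl" is an illustrative name.
import pickle

from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
model = SVC(gamma=0.001).fit(X, y)

with open("model.pkl", "wb") as f:   # package the trained model
    pickle.dump(model, f)

with open("model.pkl", "rb") as f:   # later, in the serving environment
    loaded_model = pickle.load(f)
print(loaded_model.predict(X[:1]))   # sanity check: predict one sample
```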
And last but not least, you deploy the model: you serve it in a production environment, you monitor it so you know it's going well, and you record and log every prediction it makes so you can go back if anything goes wrong. I hope you understand these three pipelines; we'll go into greater detail when we actually do the project and implement them live, so you can see it all pretty easily. We'll use ZenML to develop, execute, and manage our machine learning systems.
I'll talk about pipelines and steps in fairly brief detail and then move on to the projects, because I know this is a pretty long video; we'll go deeper later on. So what are pipelines? ZenML follows a pipeline-based approach; don't worry about what exactly ZenML is for now, we'll come to that. ZenML organizes machine learning workflows as pipelines, which promotes efficiency, repeatability, and collaboration in your projects. Think of a pipeline like a movie production process: a pipeline is a high-level workflow that organizes a series of tasks to create a final product. In movie production, the tasks might be script writing, casting, filming, editing, and distribution. Casting depends on script writing, filming depends on casting, editing depends on filming, and distribution depends on editing. Everything is interrelated, everything happens step by step; you can't do the scripting and then jump straight to editing.
Similarly, in ZenML your pipeline represents a complete ML workflow, and each step is one task: one step can be data preparation, another feature engineering (which can only run once the previous step has completed), then training the model, evaluating it, and deploying it. Here's a very basic example: you have one function using the step decorator that loads and prepares the data, another step-decorated function that trains the model, one that evaluates the model (the "editing"), and one that deploys it (the "distribution"). Then you combine all these steps using the pipeline decorator: you give the input data to feature engineering, the features to the model training, the model to the evaluation, then deploy the model, and then you run the whole pipeline.
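The slide's code isn't in the transcript, so here is a small sketch in the same spirit. It assumes a recent ZenML where the step and pipeline decorators are importable from the top-level zenml package (older releases import them from zenml.steps and zenml.pipelines), and the function names and toy data are illustrative:

```python
# A conceptual sketch of steps chained into a pipeline, movie-production style.
import pandas as pd
from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression
from zenml import pipeline, step


@step
def prepare_data() -> pd.DataFrame:
    # "Script writing": everything downstream depends on this output.
    return pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0], "label": [0, 0, 1, 1]})


@step
def train_model(data: pd.DataFrame) -> ClassifierMixin:
    # "Filming": can only run after the data step has finished.
    return LogisticRegression().fit(data[["x"]], data["label"])


@step
def evaluate_model(model: ClassifierMixin, data: pd.DataFrame) -> float:
    # "Editing": depends on the trained model.
    return float(model.score(data[["x"]], data["label"]))


@pipeline
def demo_pipeline():
    data = prepare_data()
    model = train_model(data=data)
    evaluate_model(model=model, data=data)


if __name__ == "__main__":
    demo_pipeline()  # steps run in dependency order, like the movie stages
```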
Run it, and the steps execute in order to reach and give you the trained model. There are a lot of benefits to this, which we'll discuss throughout the course. In the next set of lectures I'll introduce you to a couple of Colab notebooks covering the basic functionality of ZenML, so we can use it in our project, and then we'll get into building our first MLOps project. Let's get started.

Hey everyone, welcome back to a new video in the
MLOps course. Today I want to make you familiar with the core fundamentals of ZenML, because understanding its core concepts is very important before we start building projects with it. ZenML is an open-source library for building full-stack MLOps applications. The reason I want to use ZenML is that I personally worked there for about six to seven months with their core team, and it is genuinely super simple to use. You can use several other orchestrators available in the market, but one of the easiest and best is ZenML, so that's what we'll use. You might still face the occasional problem, but there's always a community where you can interact and resolve your doubts. So let's get started with ML pipelines with ZenML. Today we'll use a Colab notebook from ZenBytes, which the ZenML team has already built for us; the whole point of these notebooks is to teach you the core concepts, so I'll use them and record videos on top of them. And don't worry, we will also be doing projects; this is just for core understanding, because as they say, the core is the power. So let's get started with this notebook. First of all, we will
install the ZenML server, which is important for us. This is a command-line command which you would paste in a terminal if you're using VS Code, but you can just get started in Colab; when we get to the projects you'll see how I do it there. We'll also make use of scikit-learn, because I want to show you a demo by training a very simple model, plus a couple of helper packages the Colab setup needs. I have already run the installs, so I don't need to repeat them (they take a bit of time to download), but you will need to. Next, you need an ngrok account if you want to see the visualizations and dashboards, which is pretty easy; you only need the ngrok account for Colab. If you're working in VS Code or plain Python locally, you'll have access to the dashboard without it, but for Colab you need ngrok. I have a token over here, which I'll hide, sorry about that; you can just use your own ngrok token instead. Cool. The next cell is just Colab setup, nothing you have to learn. As the notebook says, you might
be familiar with scikit-learn, PyTorch, or TensorFlow. As I said, an ML pipeline is simply an extension of that work that adds the step-by-step structure, like the movie production example I gave: scripting, casting, editing, and so on are steps that are interconnected. The reasons we use pipelines are the following. First, we can easily rerun all of our work, not just the model: you rerun everything from the start, which helps eliminate bugs and makes our models easier to reproduce. Second, every pipeline run is tracked: normally, if you run your code once and then a second time, you no longer have access to the previous run, but with pipelines every run is recorded and trackable, so you can compare, for example, two different versions of a model. Third, if the entire pipeline is coded up, we can automate many operational tasks, like retraining and redeployment, via CI/CD workflows. Don't worry if that last line didn't land; when we do a simple project, you will see how, if anything changes in the data, the pipeline
helps us redeploy or retrain the model. Okay, cool, let's get started with ZenML. First you need the ZenML library installed. We'll remove any existing files and then initialize a ZenML repository, which is the important first step whenever you use the ZenML library: it initializes ZenML in your current directory. You just run zenml init; the exclamation mark in the notebook shows we are dealing with terminal commands. Now I'll show you what we're going to do: I want to train a scikit-learn SVC, a support vector classifier, to classify images of handwritten digits. So we'll do handwritten digit recognition using a support vector machine: there are many images, each showing a digit from 0 to 9, and we want to classify each handwritten image by its number. If you are not sure what handwritten digit recognition is, take a quick look online. What we'll do is load the data, train the model, test it, and check the accuracy; however, this is not the right way to do modeling, it's just for
practice: a real version can be a thousand times more complex; this is a very basic dummy example just to show you the mechanics. We use load_digits to load the dataset from sklearn.datasets, then reshape it, doing a little bit of processing, and once that's done we divide the dataset into X_train, X_test, y_train, and y_test. We divide it so we can train the model on X_train and y_train and then test it on the held-out data. Again, the machine learning itself can be done far better than this; that's what we teach in our core machine learning course, so don't take modeling inspiration from this example. Then you simply do the train-test split, create the support vector classifier, fit the model, and evaluate it. Pretty simple: you run it and get the test accuracy.
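For reference, here is the baseline as plain scikit-learn, with no pipeline yet; it's a sketch in the spirit of the notebook, and the exact split and hyperparameters (test_size, gamma) are assumptions:

```python
# Baseline handwritten digit recognition with scikit-learn, no pipeline.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
# Each image is 8x8; flatten to a 64-dimensional feature vector.
data = digits.images.reshape((len(digits.images), -1))

X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False
)

model = SVC(gamma=0.001)
model.fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```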
Now, how can we divide this into a pipeline? Since we have run zenml init and have a ZenML repository, we'll create our first pipeline, and it will have three components: it will import the data, train the model, and evaluate the model. So there are three distinct steps in this example: loading the data, training the model, and evaluating the model, and we simply write a separate function for each one. First we make an importer using the @step decorator, which you can import from zenml. The importer takes nothing and returns the data splits, with the return types annotated: it loads the digits, reshapes them, does the train-test split, and returns X_train, X_test, y_train, and y_test. The reason we have to state what each step returns is that several things happen behind the scenes in ZenML: the SVC trainer step needs to know what type of input is coming to it, since it will receive X_train and y_train, so ZenML verifies that the data types the importer sends match what the trainer expects. Stating the types also helps readability, and it helps ZenML's backend. We use Annotated with np.ndarray, so X_train is annotated as a NumPy array, X_test as a NumPy array, and so on; Annotated formally names and types our outputs. Then we have another step, the SVC trainer, again decorated with @step: it takes X_train, a NumPy array, and y_train, also a NumPy array, and returns the classifier, meaning it trains the model and returns the fitted classifier. You can import ClassifierMixin from sklearn.base to use as the return type; ClassifierMixin is the base type that scikit-learn classifiers mix in, and you can easily search online for it if the type confuses you. You could also just write SVC there as the model's type. The next step is the evaluator: it takes X_test, y_test, and the model (typed as ClassifierMixin), returns a float, and is also decorated with the @step operator; it simply computes the accuracy score on the test set.
There can be several other classification metrics; for now we're just taking accuracy as the baseline. Now that we have the steps, we need to connect them, so we use the ZenML pipeline. It goes like this: you call the importer you built, which gives you X_train, X_test, y_train, and y_test; then you call the SVC trainer, passing it X_train and y_train; and then you call the evaluator with the test split and the trained model.
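Wiring it together might look like this; again a sketch consistent with the steps above, with digits_pipeline as an assumed name:

```python
# Connect the three steps into a single ZenML pipeline and run it.
from zenml import pipeline


@pipeline
def digits_pipeline():
    X_train, X_test, y_train, y_test = importer()
    model = svc_trainer(X_train=X_train, y_train=y_train)
    evaluator(X_test=X_test, y_test=y_test, model=model)


if __name__ == "__main__":
    digits_pipeline()  # every call initiates a new tracked run
```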
Once you run it, it initiates a new run of the pipeline. I have already run it once, so this is version number two; in the dashboard you can visit version number one, revisit its accuracy, and compare it with the previous run. In the logs you can see: step importer has started, step importer has finished in 2.73 seconds, the SVC trainer has started and finished, the evaluator has started and finished, and the whole digits run has finished. You can visualize your pipeline runs in the ZenML dashboard: run the cell and you will be prompted with a URL; open it and you'll see your pipelines. The password should be "default". Let me show you: I'll run it quickly, and it trains the second version here, since we reinitialized things. It starts the ZenML server; go to the URL and the ZenML dashboard opens. You'll be prompted with a login screen: type "default" as the username and click login. Once you log in, you land on the pipelines page, where you can visualize your pipelines; you can go to a previous run, see the model score, and so on. I hope you now understand what steps and pipelines mean in ZenML. These are just the basics; in the next lecture, lecture 1.2, I'll show you some of the magic ZenML does, and you'll be surprised by that too. So let's get started with a new lecture.

We'll start off pretty simply: first we'll read and understand what our data looks like, so you have much more clarity about the problem statement we want to tackle.
I'm not going too deep into the business objectives here; we'll mostly focus on the technical aspects of building this MLOps project. Over here you have the data, the Olist customers dataset. It contains a customers table (customer ID, customer unique ID, customer city, customer state), a geolocation dataset, an items dataset, and several more. What we did is build our custom dataset: we combined everything into one table with a lot of features, plus the review score, which is our satisfaction score. I'll quickly show you what it looks like in an Excel sheet, because that's the more familiar view; it's a bit complicated to read in plain Visual Studio Code. So let's get started with showcasing it. Opening it
actually takes a bit of time because the file is fairly large, but no worries. While it opens: next I'll create several folders, which are very important for us and which you can take as a template for starting off. Okay, it opened up. You can see order ID, customer ID, order status, order purchase timestamp, order approved date, and many more features, and finally the review score, which goes from 1 to 5. For now we will not use the review comments, and we'll drop a lot of features; not because they don't hold importance, but because I don't want to make the project complex initially. You can of course tweak it accordingly, run it on the whole data, treat it as a full machine learning setting, and a lot of other things,
which you can do; for now I want to keep it pretty simple. That's very nice. So let's get started: the review score is our target variable, and all the rest are the input features we'll use to predict the customer satisfaction score. But before creating the folders, let's install the libraries listed in README.md. One note: you should perform all installations and all operations inside a virtual environment. I'm currently in a customer-satisfaction virtual environment; I personally use pyenv, but you can use conda, venv, or literally any virtual environment tool. If you're not aware of what a virtual environment is, it isolates all of an application's dependencies into one environment so that dependency conflicts don't happen. If that still doesn't make sense, we have linked a very nice resource in the GitHub repository, just before this section, to help you understand what a virtual environment means; it's very important to work inside one so everything stays on the same page. Cool. First I'll pip install the ZenML server; that part is for anyone who wants to run the
whole project with the dashboard. Let's set that aside and install ZenML itself first. You might see that it gives some errors, so we have to adjust the install command, something like pip install "zenml[server]"; I hope that works, and if it doesn't we'll check the ZenML server docs. Okay, cool, it's working. It will take some time to download ZenML; I'm installing it live here so you can see exactly how it works. I'll also bring in the requirements.txt I have over here so you can follow along in more detail. The requirements include catboost and lightgbm; we won't be teaching those algorithms here (if you want to learn them, enroll in my core machine learning course), they just need to be installed up front. You can totally choose to skim this part; we're coding step by step so you understand it in greater detail. It shows that it's installing, and I'm pretty sure it's installed now. It also says we should upgrade pip, so let's copy that command and run it. I like upgrading pip because the download display
is actually very colorful rather than plain white; that's why I like it. I'll clear the terminal quickly and then run zenml up. What does zenml up do? It brings up (wakes) the ZenML server so you can view your pipelines and a lot of other things. But you can see it is not running, and the reason is that we forgot something important: before that, we have to run zenml init, which initializes the ZenML repository here. You'll see that a .zen folder is created as soon as the run completes, so let's wait a few seconds and let zenml init finish. The reason we want to create the repository is so that all our code lives inside it and can be reused for several other
purposes, which you'll appreciate later on. Okay, cool. It's the first run, which is why it's taking a bit of time, so let's create the folders meanwhile. Now you can see that .zen has been created, and there's a warning: your ZenML client version does not match the server version, and the version mismatch might lead to errors or unexpected behavior, kindly refer to the docs. So let's simply downgrade ZenML so the versions match; that should resolve the warning. From my personal experience, it is very important to fix up warnings: sometimes you'll later hit a completely unexpected error and never realize it traces back to a warning you ignored here. That's why I really want you to resolve all the warnings first. Basically, the warning
says that the ZenML client version doesn't match the server version, and you can either downgrade or take other actions to get them aligned. Leaving that running, let's go ahead and create our folders. The .zen folder is already created; first I'll create a data folder. One thing we will do in a future project in this course is stop using a CSV dataset and instead use PostgreSQL, retrieving the dataset with SQL, because in a real-world setting you don't usually consume CSVs; you use SQL databases in the cloud, or something like PostgreSQL locally, and you retrieve and work with the data from there. We'll do that later; for now let's keep it simple and go with the data folder. Another folder I want is a model folder, containing all the files required for training the model; actually, let's name it source instead, since that's more conventional, so it becomes src. Next I'll create a pipelines folder, which will contain all the pipelines we build; a saved_model folder, in case you want to save the model (you won't strictly need to, but it's there for reference); and a steps folder, which will contain all our components, the tasks that need to be done. Finally, I'll create an __init__.py, there's always a requirements.txt, and I'll create run_pipeline.py so we can run our pipeline from there. Cool. Now I'll start coding the data things
first. As I said, this is not a formal machine learning engineering course; it's an MLOps course, so I'll keep the ML side very simple, even naive. If you want the more advanced machine learning material, the core machine learning course is always available to help you out. The first thing we want to do is ingest the data, so we'll start in the steps folder and create the file ingest_data.py, which will contain the step where we ingest the data. I'll quickly import logging so we can log when things complete, because logging is very important; then import pandas as pd, and from zenml import step, as you saw in the ZenML basics.
then I'll create a class of ingest era I'll create the class of ingest era oh my God you know the the way my keyboard is not working pretty well and over there what I'll do I'm actually using copilot still but um but uh you have to also you know give the good documentation which will write pretty nicely so neat so let's just quickly do it so I'll just in it and you can actually do it like this um which is in it and then when you run it you can actually write the get data and
then ingesting data from the data path pd. read CSP and then self. data path right or what you can do you can simply give the pcsv this direct file to this totally matters on what you want to give okay so then we'll create a step so where we can use that class we can use and then that step will consist of the data path which will take St Str as an input which will the string of course and it will return the data frame right it will return a very nice data frame and then what
I'll do I'll first of all make a try statement I really don't don't want to take the help of um I really want to take take the help of this guy uh co-pilot but if he's helping me I can't do literally anything so what I'll do I'll show the way to write the documentation first of all you know what you write you write the description about that function so use ingesting the data from the data path then we'll write the arcs arcs means the argument it is going to take the data part which is the
part to the data and then what it returns it Returns the Panda's data frame right this is how you actually do the inest data and this is what the very interesting workflow is then we'll write in a try try and accept a workflow which will have something like this where we first of all in in instantiate our class which is ingest data which is ingest data where will first of all ingest data which is the data path and then we will simply say DF ingest data. get data and then return DF this can be easily
done in a three one line of code as well as shown by the co-pilot but I want to make it pretty simple as well for the beginners if you're watching it so for accept exception as e and then it says error while ingesting the data and this is what the error is so this this is this actually helps us to uh the best practices of coding and same goes to over here we have to actually maintain it nicely right um so let's just go ahead and quickly do it uh let me just remove it yeah
so I'll just make use of interesting data from the T data path and then we actually instantiate the method so this is this is this is used that this is used as instantiating the method uh arcs that's it and you can also write instantiation but if not bits is not required eventually over here what what what we do inting from data path and then you simply write nothing and then you just that's it so this is a basic workflow which you have to go ahead and create this step the first thing which you have created
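Here's a minimal sketch of what that file might look like (import paths vary slightly across ZenML versions; this follows the one used on screen):

```python
# steps/ingest_data.py — a minimal sketch; the dataset path is whatever CSV you use.
import logging

import pandas as pd
from zenml import step


class IngestData:
    """Ingests data from the given data path."""

    def __init__(self, data_path: str):
        self.data_path = data_path

    def get_data(self) -> pd.DataFrame:
        logging.info(f"Ingesting data from {self.data_path}")
        return pd.read_csv(self.data_path)


@step
def ingest_df(data_path: str) -> pd.DataFrame:
    """Ingesting the data from the data path.

    Args:
        data_path: path to the data.
    Returns:
        pd.DataFrame: the ingested data.
    """
    try:
        ingest_data = IngestData(data_path)
        df = ingest_data.get_data()
        return df
    except Exception as e:
        logging.error(f"Error while ingesting data: {e}")
        raise e
```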
The first step you've created is of course ingest_data, and we'll reuse it later on. The next step, once we have ingested the data, is that we need to clean the data, so we'll work on creating a step for cleaning. Let's quickly do one thing: first import logging, which is important to do, and then from zenml import step; then I create a step, clean_df, that will clean the data. It takes the DataFrame; I don't know yet what it will return, so let's skip the return annotation for now and just pass. The next step I want to make is the one that trains our model, so I'll write model_train.py and create another step in it. Again the same thing: import logging, import pandas as pd, from zenml import step, and then create train_model, which takes its inputs, returns something, and trains the model; for now it just passes. (My battery is low, I'm so sorry about that.) Now that we have model_train alongside clean_data and ingest_data, the next step should be evaluation: evaluate the model. So I'll write evaluation.py, and it's the same thing again: import logging, from zenml import step, and under @step define evaluate_model, which returns nothing for now. That's it: those are the four steps we want. You might be saying that I haven't implemented anything; I will. The first step is always to create a blueprint, so that everything runs nicely; then you implement the details. Wherever you go, make sure you understand that principle first.
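As a blueprint, the remaining step files can start out as stubs like these (the signatures are placeholders; the return annotations come later):

```python
# steps/clean_data.py, steps/model_train.py, steps/evaluation.py — blueprint stubs.
import logging

import pandas as pd
from zenml import step


@step
def clean_df(df: pd.DataFrame):
    """Cleans the data (to be implemented)."""
    pass


@step
def train_model(df: pd.DataFrame):
    """Trains the model (to be implemented)."""
    pass


@step
def evaluate_model(df: pd.DataFrame):
    """Evaluates the model (to be implemented)."""
    pass
```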
Now what I'll do is create a pipeline: pipelines/training_pipeline.py. From zenml we import pipeline and use the @pipeline decorator, and with it I'll create the training pipeline. The training pipeline will consist of the following: it ingests the data, cleans the data, trains the model, and evaluates the model. (Something was off there for a second; give me a moment. Okay.) Our training pipeline doesn't take much; most probably it just takes the data path as input, and that's pretty much it. First we import everything: from steps.ingest_data import ingest_df (let's make sure we follow the naming conventions, so yes, ingest_df, cool), then from steps.clean_data import clean_df (again keeping good naming conventions), and after that from steps.model_train import train_model and from steps.evaluation import evaluate_model; let me just make sure evaluate_model is there. Cool. Once we have all of these steps, I'll wire them up nicely and show you the pipeline: df = ingest_df(data_path), then clean_df takes the df as input (fair enough, it returns nothing for now), then after cleaning we call train_model, and after that evaluate_model. I'm assuming for the moment that everything takes df as input, so that it all makes sense.
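Here's a sketch of the blueprint pipeline (newer ZenML versions import the decorator as from zenml import pipeline; older ones used from zenml.pipelines import pipeline):

```python
# pipelines/training_pipeline.py — blueprint version.
from zenml import pipeline

from steps.ingest_data import ingest_df
from steps.clean_data import clean_df
from steps.model_train import train_model
from steps.evaluation import evaluate_model


@pipeline
def train_pipeline(data_path: str):
    df = ingest_df(data_path)
    clean_df(df)
    train_model(df)
    evaluate_model(df)
```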
Okay, so how do we run this pipeline? Through the run_pipeline.py we created earlier, so let's fill that in right away. We write from pipelines.training_pipeline import train_pipeline (maybe just name it train_pipeline, to keep the naming conventions), then an if __name__ == "__main__": guard, and inside it we run the pipeline, passing the data path.
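A sketch of the entry point (the CSV path is a placeholder for wherever your data lives):

```python
# run_pipeline.py — entry point; the CSV path below is illustrative.
from pipelines.training_pipeline import train_pipeline

if __name__ == "__main__":
    train_pipeline(data_path="data/olist_customers_dataset.csv")
```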
For the data path I'll just copy the whole path from the data folder and drop it in. There's a lot more you could do here, like uploading the data to the cloud and ingesting from there, which we'll do later in the course. Okay, cool, let's run it. Are you ready? If you are, give me a thumbs up, and I'll get started: clear the terminal and run python run_pipeline.py. (When I code I usually listen to music, but not right now, sorry about that.) First error: pandas isn't installed, so pip install pandas, because that's kind of important. Next error: "module object is not callable". Let me go to where the error points, the training pipeline... right, it's pipeline, not pipelines. Fixed; cool enough, let's go ahead. Okay, one more really interesting error: wrong type for the output of step clean_df. Why does it say that? Because we told ZenML the step returns a pandas DataFrame, but it's giving None, since the body is empty. So for now we annotate the stub steps as returning None (the same change in each file); ZenML then only raises warnings, which we can go ahead with. That's a pretty nice outcome. Let me explain what just happened; you can completely ignore terms like stack, orchestrator, and artifact store for now, we'll explain them later. In the logs you'll see ingest started, clean_df started and finished, evaluate_model started and finished, train_model started and finished; nothing substantial happens in them yet, and that's it.
Now I'll show you the very simple dashboard. Go to the dashboard, log in with the username default, and once you're in, go to Pipelines (it might look super new to you). There's our train_pipeline; open the first run. Okay, fair enough: ingest_data gives an output, which is the DataFrame. If you open it you see the output itself, plus some visualizations and its data types; you can see the data was imported. This stored thing is called an artifact. For every step you also get metadata: ingest_df shows its docstring, which is the documentation we wrote, the start time, the run time, and all those things, and then the outputs. An artifact is whatever is returned by a step; it's stored in some store (a local one for now) from which it can be retrieved later. You can see exactly where it's stored, in this URI, and if you go to that particular location you'll find a very nice output there. The logs are simple too, except for one phrase: "using cached version of step ...". You might have noticed it and wondered what that means; it's pretty interesting to understand, and we'll look at it in way greater detail with a very nice example, just wait. Then you have clean_df, which finished, plus evaluate_model and train_model; ingest_data returns the DataFrame and the others don't return anything yet, which you can see for sure in the visualization. And that's pretty much it: the dashboard is working, our pipeline is running, and we're good to go.
One thing I just want to make sure you're aware of is something known as caching. What if I toggle enable_cache on the pipeline and run it again? Let's run it, then see this on our dashboard: Pipelines, the latest run. You can see another version is there, run number four. Ingest data started, ingested from the path, and finished; but look carefully, because something really interesting happened: "using cached version of ingest_df". ZenML has an amazing, super useful feature here. What does "uses the cached version" mean? If nothing changes in the data, nothing changes in the code, and nothing changes in that step, it reuses the step's result from the previous run instead of recomputing it. You can see how interesting that is.
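Caching is toggled on the decorator (it's on by default in ZenML); a quick sketch:

```python
# Caching can be switched per pipeline (or per step) via the decorator.
from zenml import pipeline


@pipeline(enable_cache=False)  # force every step to re-run from scratch
def train_pipeline(data_path: str):
    ...
```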
In those earlier runs nothing had changed, so the cached versions were used, because caching was enabled. Then I set enable_cache=False, telling ZenML: don't cache, don't reuse the previous run. And indeed the log states that step ingest_data started, ingested the data, and finished from scratch. Now if you make it True again (let me just make it True and run it), you'll see "using cached version of ingest_df"; clean_df hasn't changed either, nor has evaluate_model, so it only re-runs what it must and the whole thing completes in a matter of seconds. You see how good this is. Say you're training a large language model: with a feature like this you'll be super happy that your cached version was used. Sometimes caching causes errors, but most of the time it works like a charm. Okay, that's pretty much it; I hope you understood most of the things here. In the next video I'll implement all these steps and run them step by step, and afterwards I'll deploy the model using MLflow. I'll also show you how we can use the MLflow experiment tracker, and then we'll make a very basic Streamlit application that calls the deployed model to make predictions; we'll integrate the MLflow deployment and tracking libraries into ZenML and use them. Currently we have the blueprint ready. What you've learned in this lecture: first, how to write and structure the code; second, a very important lesson, that it's always good to start by preparing a blueprint and then start coding. I hope it really made sense to you. I'll catch you in the next video, bye-bye.

Hey everyone, welcome back to another video. What I'm going to achieve through this video is to implement all the steps listed out here: clean data, ingest data, evaluation, and so on.
We'll do it in a nice way: I'll show you how to write this code using design patterns. I hope you're already aware of design patterns before starting this project; if not, good resources will be linked before these lectures, covering the basics of patterns like the strategy pattern, factory pattern, and singleton pattern (we also teach them in our core machine learning course, which you can consider enrolling in). So let's get started with the actual implementation of data cleaning. In src we'll implement the classes behind these steps, and then use those classes from within the steps. The first thing I really want to develop is data cleaning, since that's the obvious place to start.
Let's get started with creating the data cleaning classes in src. I'm going to start off by importing logging, in case we need to log anything at all, then from abc import ABC, abstractmethod, and from typing import Union. I'll just import the basic libraries first and add more as we need them: import pandas as pd, and from sklearn.model_selection we'll need the splitter, because we're going to split our data as well. Now I'll create an abstract class for defining a strategy for handling data; you're probably already aware of the strategy pattern from various examples. This is DataStrategy: an abstract class defining the strategy for handling data. In it we create an abstract method, and the reason we do this should already be known to you: it guarantees that every concrete data cleaning strategy exposes the same handle_data interface, and when we work on the other strategies you'll see how handy that is. handle_data takes a DataFrame, df: pd.DataFrame, because I expect a DataFrame, and it should return either a pandas DataFrame or a Series; that's why we import Union, and the return annotation is Union[pd.DataFrame, pd.Series]. This is just an abstract class, just a blueprint, meaning it's what we have to implement in our strategies; they override this method to implement their own custom solutions.
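A sketch of that base class:

```python
# src/data_cleaning.py — the strategy interface; a minimal sketch.
from abc import ABC, abstractmethod
from typing import Union

import pandas as pd


class DataStrategy(ABC):
    """Abstract class defining a strategy for handling data."""

    @abstractmethod
    def handle_data(self, df: pd.DataFrame) -> Union[pd.DataFrame, pd.Series]:
        pass
```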
Let's first start building the data pre-processing strategy: DataPreProcessStrategy (let's make the name a little nicer). It inherits DataStrategy, the abstract class, so that we can override handle_data. It takes the DataFrame as input and returns a DataFrame as output. We don't need logging just yet; we'll have try and exceptions. Basically, I'll first drop certain columns from the data. As I already told you, we want to keep this super simple, which is why I'll drop several columns; it's not that they're unimportant (they're actually very important), but for the simplicity of this project I'm going to delete some of the columns from the data. I've already written down the names of the columns to delete, so I'll just copy them in quickly: you can see order_approved_at, the order delivery timestamps, and the other ones needed here. Very cool. Then data.drop those columns, and you simply go ahead with it.

Next: there are certain columns which have null values, so I'll quickly fill them up. You can handle this in two or three ways, and which columns need it is the kind of thing you analyze when you do the EDA; I've already done the EDA on my side, just for simplicity, so it makes sense for you to get started directly with the project. The columns with nulls we fill with the median of that column, with inplace=True, meaning we take the median and apply it permanently to our data. And there's review_comment_message, which has several nulls in the data; we fill those with the string "No review". That's very cool.

Now we'll drop the columns that are non-numeric; basically, we only take columns that are numbers to train the model. To be clear about the reason: I'm selecting only numeric columns on purpose, not because it's the right thing in general, but because I want to keep this project simple, so that I don't need to apply a lot of processing steps. So: data = data.select_dtypes(include=[np.number]). Now the data contains only numeric columns, and we don't need to worry about categorical encoding, ordinal encoding, or tokenizing the review comment messages; all of that is out of scope here. But you could do a lot of things: you don't need to remove these columns, you could implement additional processing strategies that encode the categorical data, tokenize the review comment messages, and much more. We'll also drop a couple more columns, namely customer_zip_code_prefix and order_item_id; the reason is simply that they're not important for us at all. Then data = data.drop(those columns), return data, and in the except branch log the error and raise e.

You might be wondering what I did here, so to recap: first, I dropped certain columns that we don't require for now, to keep the project simple; second, I filled up the null values in the columns that have them; third, I selected only the numeric data, skipping the categorical data purely for the simplicity of the project; and finally I dropped a couple more columns and returned the data. That's pretty much it, and it's pretty nice.
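A sketch of that strategy, continuing in src/data_cleaning.py; the exact column names are assumptions based on the Olist-style dataset mentioned on screen, so adjust them to your own data:

```python
import logging
from typing import Union

import numpy as np
import pandas as pd


class DataPreProcessStrategy(DataStrategy):
    """Strategy which preprocesses the data."""

    def handle_data(self, data: pd.DataFrame) -> pd.DataFrame:
        try:
            # Dropped purely to keep the project simple.
            data = data.drop(
                [
                    "order_approved_at",
                    "order_delivered_carrier_date",
                    "order_delivered_customer_date",
                    "order_estimated_delivery_date",
                    "order_purchase_timestamp",
                ],
                axis=1,
            )
            # Fill numeric nulls with the column median, text nulls with a token.
            data["product_weight_g"].fillna(data["product_weight_g"].median(), inplace=True)
            data["review_comment_message"].fillna("No review", inplace=True)

            # Keep only numeric columns so we can skip categorical encoding.
            data = data.select_dtypes(include=[np.number])
            data = data.drop(["customer_zip_code_prefix", "order_item_id"], axis=1)
            return data
        except Exception as e:
            logging.error(f"Error in preprocessing data: {e}")
            raise e
```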
Now I'll create another strategy: the data divide strategy, DataDivideStrategy, which also inherits DataStrategy. It's the strategy for dividing the data into training and test sets. Again we implement handle_data: it takes a pd.DataFrame and returns the Union of pandas DataFrame and Series; you'll notice in a second why I say Union, meaning both of them appear. Here's what it does (this is all Copilot, which is why I love it): X = data.drop the target variable, y = the target variable, then train_test_split(X, y) with test_size=0.2 and random_state=42 gives X_train, X_test, y_train, y_test. X_train and X_test are pandas DataFrames, while y_train and y_test are Series; that's why the output is the combination of both types. I hope that makes very good sense. Now that we have that, we'll make a final class that utilizes both of these strategies.
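A sketch of the divide strategy, still in the same file; the target column name ("review_score") is my assumption based on the dataset used in the course:

```python
import logging
from typing import Union

import pandas as pd
from sklearn.model_selection import train_test_split


class DataDivideStrategy(DataStrategy):
    """Strategy which divides the data into train and test sets."""

    def handle_data(self, data: pd.DataFrame) -> Union[pd.DataFrame, pd.Series]:
        try:
            X = data.drop(["review_score"], axis=1)  # features
            y = data["review_score"]                 # target (assumed name)
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=0.2, random_state=42
            )
            return X_train, X_test, y_train, y_test
        except Exception as e:
            logging.error(f"Error in dividing data: {e}")
            raise e
```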
That class is DataCleaning: a class which preprocesses the data and divides it into training and testing sets. I'll quickly define its __init__: it takes self, the DataFrame, and the strategy you want to apply, typed as DataStrategy, the abstract class, so it accepts either the pre-process strategy or the divide strategy. We store it: self.strategy = strategy. Then we add another method (not class, sorry) called handle_data, which returns either the Union type or a simple pandas DataFrame; it just returns self.strategy.handle_data(self.data). So if someone chooses the data divide strategy, for example, that's the one that gets dispatched. Using the class is simple. Someone can just run DataCleaning like this: under if __name__ == "__main__": (oops, sorry for that), assume we're reading a CSV file (we're not actually going to do it right now), then instantiate DataCleaning with the data and the DataPreProcessStrategy, so that data cleaning uses the pre-process strategy in this case, and call its handle_data method. In the same way you can give it another strategy, DataDivideStrategy, and it works just as nicely. I hope you really understood the way we did this. It's called the strategy pattern: first you create the abstract class, then the several strategies in it, here the pre-process and divide strategies, and then a final class which makes use of those strategies.
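The context class and a usage sketch (the CSV path is illustrative):

```python
import logging
from typing import Union

import pandas as pd


class DataCleaning:
    """Preprocesses the data and divides it into train and test sets."""

    def __init__(self, data: pd.DataFrame, strategy: DataStrategy):
        self.data = data
        self.strategy = strategy

    def handle_data(self) -> Union[pd.DataFrame, pd.Series]:
        try:
            return self.strategy.handle_data(self.data)
        except Exception as e:
            logging.error(f"Error in handling data: {e}")
            raise e


if __name__ == "__main__":
    # Example usage; swap in DataDivideStrategy to split instead.
    data = pd.read_csv("data/olist_customers_dataset.csv")
    data_cleaning = DataCleaning(data, DataPreProcessStrategy())
    data_cleaning.handle_data()
```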
All of this is very helpful for writing flexible, readable code without so many if/else statements. So now let's quickly implement this in the clean_data step, because that's what matters for us. clean_df takes the DataFrame as input, and then we go ahead with try and except.
First the imports: from src.data_cleaning import DataCleaning, DataDivideStrategy, DataPreProcessStrategy. Once we've imported these, we use them inside try and except. Basically, first we create the pre-process strategy, then instantiate the DataCleaning class with the data and that strategy, and then processed_data = data_cleaning.handle_data() (handle_data, not process_data, sorry). What is that eventually doing? We have the class, we give it the strategy we want to use (the pre-process strategy), and we call that strategy's method, handle_data, which handles the data. Then we create the divide strategy, build DataCleaning again, this time with the processed data rather than the raw data, and since it returns pandas DataFrames and Series we can simply write X_train, X_test, y_train, y_test = data_cleaning.handle_data(). Then we log that data cleaning completed, and in the except branch we log the error and raise e.

Now, one thing is missing: the step isn't returning None, it's returning X_train, X_test, y_train, and y_test, so we have to annotate that. We'll use Annotated, Python's built-in way of attaching names to type-hinted parameters: from typing_extensions import Annotated (the formal one), and we also have to import Tuple, so from typing import Tuple. The return type then becomes a Tuple whose entries are Annotated[pd.DataFrame, "X_train"], Annotated[pd.DataFrame, "X_test"], Annotated[pd.Series, "y_train"], and Annotated[pd.Series, "y_test"]; and we're mostly done. Basically, the step now declares that it returns four things: DataFrame, DataFrame, Series, and Series, annotated using Annotated from typing_extensions. I hope that makes sense. Let me see what type of error it was giving; it was mostly because of this, and I hope this fixes it. Now we're done with this step; we just add a very basic docstring: "Cleans the data and divides it into train and test". Args: raw data. Returns: training data, testing data, training labels, and testing labels. So we have this step ready for us, a step that uses several strategies under the hood.
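Putting it together, the cleaning step might look like this:

```python
# steps/clean_data.py — a sketch of the finished step.
import logging
from typing import Tuple

import pandas as pd
from typing_extensions import Annotated
from zenml import step

from src.data_cleaning import DataCleaning, DataDivideStrategy, DataPreProcessStrategy


@step
def clean_df(df: pd.DataFrame) -> Tuple[
    Annotated[pd.DataFrame, "X_train"],
    Annotated[pd.DataFrame, "X_test"],
    Annotated[pd.Series, "y_train"],
    Annotated[pd.Series, "y_test"],
]:
    """Cleans the data and divides it into train and test.

    Args:
        df: raw data.
    Returns:
        Training data, testing data, training labels, testing labels.
    """
    try:
        data_cleaning = DataCleaning(df, DataPreProcessStrategy())
        processed_data = data_cleaning.handle_data()

        data_cleaning = DataCleaning(processed_data, DataDivideStrategy())
        X_train, X_test, y_train, y_test = data_cleaning.handle_data()
        logging.info("Data cleaning completed")
        return X_train, X_test, y_train, y_test
    except Exception as e:
        logging.error(f"Error in cleaning data: {e}")
        raise e
```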
I hope this actually makes sense to you all. The next thing we'll work on is model development, which is pretty important. We'll make use of linear regression, which we'll implement right away, so that it makes sense for you to get started. We'll implement a basic linear regression model; there's a lot more you could implement (in the repository you'll find room for models like random forest, XGBoost, or CatBoost), and after that we'll evaluate our model. We're currently not focusing on core machine learning; we're focusing on building a full MLOps project, and I can build it out for more complex situations later. The other piece we have is evaluation, where we'll build the evaluation measures and then the steps for them, and then we're mostly done; except there's one thing left, something known as the deployment pipeline. We'll deploy the pipeline too, and you'll be amazed to see the way we deploy it and the way we run it; we'll also put a Streamlit application in front of the deployment. I hope that really makes sense; let's catch up in the next video.

So now, everyone, what I'll do is
go to the next step, which is model development, which is pretty important as well. Let's get started with it quickly and try to complete this project as soon as possible. Create model_dev.py, and in it I'll again create an abstract class and then extend that abstract class: from abc import ABC, abstractmethod. We create a class Model(ABC); this is the abstract class for all models. Then we create an abstract method called train, and train takes X_train, the training data, and y_train, the training labels (not testing, sorry). We could also create a method known as optimize, but it's not required as of now, so let's just leave it. Then I'll create a very simple concrete class. See my point again, I keep emphasizing it: first focus on learning about MLOps, and only then on implementing complex models and such. So I'll just make a simple linear regression model on top of this: LinearRegressionModel, whose train takes X_train and y_train plus **kwargs and fits them together. Inside, we import from sklearn.linear_model import LinearRegression (and most probably let's name things sensibly, it makes much more sense): reg = LinearRegression(**kwargs), then reg.fit(X_train, y_train), and return the regression. We can also put this in a try/except. Okay, that's very nice: the model class is ready.
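A sketch of src/model_dev.py:

```python
# src/model_dev.py — abstract model plus a simple linear regression.
import logging
from abc import ABC, abstractmethod

from sklearn.linear_model import LinearRegression


class Model(ABC):
    """Abstract class for all models."""

    @abstractmethod
    def train(self, X_train, y_train):
        pass


class LinearRegressionModel(Model):
    """Linear regression model."""

    def train(self, X_train, y_train, **kwargs):
        try:
            reg = LinearRegression(**kwargs)
            reg.fit(X_train, y_train)
            logging.info("Model training completed")
            return reg
        except Exception as e:
            logging.error(f"Error in training model: {e}")
            raise e
```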
You'll see in the next project that real model development is much more complex, because you first train the model, validate the assumptions, test whether things are working, tweak the data, and do feature engineering and cleaning; we'll do that in the next project, so no need to worry about it here. For now, our model development is a linear regression model which we simply fit, and that completes the training of the model. So let's quickly go to model_train.py and implement the step. I'll import from src.model_dev import LinearRegressionModel. The step takes several inputs: X_train and X_test as pandas DataFrames, plus y_train and y_test. And what does it return? It actually returns the trained linear regression model, but more generally there's something known as RegressorMixin: from sklearn.base import RegressorMixin. RegressorMixin is the base type of scikit-learn regression estimators, and of course we're going to output a regression algorithm; the step trains the model, that's pretty much it. We start with model = None, and we'll also make a config.py. That config.py will have from zenml.steps import BaseParameters, and we create something called ModelNameConfig extending BaseParameters; it will contain the model configuration we want to add, starting with model_name, what model we want to use. Back in model_train, we import from .config import ModelNameConfig, and the step also takes config: ModelNameConfig, so the config carries the settings. Then: if config.model_name is "LinearRegression", we use that model, the LinearRegressionModel, and train it on X_train and y_train; it returns the trained linear regression model.
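Sketches of the config and the training step (BaseParameters follows the ZenML version used on screen; newer releases have replaced it):

```python
# steps/config.py
from zenml.steps import BaseParameters


class ModelNameConfig(BaseParameters):
    """Model configuration."""
    model_name: str = "LinearRegression"


# steps/model_train.py
import logging

import pandas as pd
from sklearn.base import RegressorMixin
from zenml import step

from src.model_dev import LinearRegressionModel
from .config import ModelNameConfig


@step
def train_model(
    X_train: pd.DataFrame,
    X_test: pd.DataFrame,
    y_train: pd.Series,
    y_test: pd.Series,
    config: ModelNameConfig,
) -> RegressorMixin:
    """Trains the model on the ingested data."""
    try:
        if config.model_name == "LinearRegression":
            model = LinearRegressionModel()
            trained_model = model.train(X_train, y_train)
            return trained_model
        raise ValueError(f"Model {config.model_name} not supported")
    except Exception as e:
        logging.error(f"Error in training model: {e}")
        raise e
```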
Concretely, trained_model = model.train(X_train, y_train) and the step returns the trained model; else, we raise a ValueError saying something like "model name not supported". The reason I structure it this way is that you might implement other models as well: you could add a RandomForestModel class, and without changing anything else you just add a branch: if config.model_name says "RandomForestRegressor", you train that model instead. That's how it works; you don't need to worry about a lot here, it's very simple to understand. Just wrap it in try and except Exception as e, log the error, and raise e. Okay, cool, that's it for training the models. Next let's quickly create the evaluation part
in src as well: evaluation.py. Here we again create a very basic abstract class and then extend it with the other strategies we're going to use. From abc import ABC, abstractmethod, then class Evaluation(ABC): an abstract class defining the strategy for evaluating our models. Its abstract method is calculate_scores, which calculates the scores; it takes y_true and y_pred, both NumPy arrays, so we import numpy as np and annotate them np.ndarray. y_true is the ground truth and y_pred is the model prediction.

Now I'll create several strategies for it. The first strategy I'll create is MSE, and MSE inherits the abstract Evaluation class: this is the evaluation strategy that uses mean squared error. Its calculate_scores again takes self, y_true, and y_pred, both NumPy arrays (I'll just copy the signature from above). We start by logging that we've entered the MSE calculation, then we can simply use from sklearn.metrics import mean_squared_error, r2_score. So: compute the MSE from y_true and y_pred, log that it's done, and return the MSE; otherwise we log "error in calculating scores" and raise. It returns the MSE, so that's one strategy done. We'll go and create another strategy: R2. The R2 class is the evaluation strategy that uses the R2 score; again we just calculate the scores and return them, and Copilot fills it in almost automatically. Of course, you should add your own documentation here; I'm not adding it right now, so please add it yourself, the way I've taught you. Then we'll have one more evaluation strategy, RMSE: the evaluation strategy that uses the root mean squared error. It's again mean_squared_error, but with squared=False, so basically what you're calculating here is the root mean squared error. Now RMSE is done too, so we have several evaluation strategies completely done, and next we'll implement them in the evaluation step.
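A sketch of src/evaluation.py (squared=False was the standard way to get RMSE in the scikit-learn versions of this era; newer releases offer root_mean_squared_error):

```python
# src/evaluation.py — evaluation strategies.
import logging
from abc import ABC, abstractmethod

import numpy as np
from sklearn.metrics import mean_squared_error, r2_score


class Evaluation(ABC):
    """Abstract class defining the strategy for evaluating our models."""

    @abstractmethod
    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        pass


class MSE(Evaluation):
    """Evaluation strategy that uses Mean Squared Error."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            logging.info("Calculating MSE")
            mse = mean_squared_error(y_true, y_pred)
            logging.info(f"MSE: {mse}")
            return mse
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e


class R2(Evaluation):
    """Evaluation strategy that uses the R2 score."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            r2 = r2_score(y_true, y_pred)
            logging.info(f"R2: {r2}")
            return r2
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e


class RMSE(Evaluation):
    """Evaluation strategy that uses Root Mean Squared Error."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            rmse = mean_squared_error(y_true, y_pred, squared=False)
            logging.info(f"RMSE: {rmse}")
            return rmse
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e
```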
The last thing to do is very simple: actually use these classes. First I'll import from src.evaluation (not model_dev, sorry): MSE, RMSE, and R2 (is R2 there? Yes, R2 is there, cool). The evaluate_model step takes a few things. First the model, which will be a RegressorMixin, so we have to import that, since it's the type of the model (it's a regression model, so import RegressorMixin). Then we get X_test, which is a pandas DataFrame, and then y_test. Now let's try to quickly implement the solution. First we get the prediction: prediction = model.predict(X_test), so the model predicts on X_test. Then we create the MSE class instance (mse_class = MSE(), sorry for the stumble) and use it: mse = mse_class.calculate_scores(y_test, prediction), and we're done with that one. Then another: the R2 class, and after that you calculate its score, and that's pretty much it; then another, the RMSE class, calculate its score, and that's pretty much it too. Cool. We'll return at least two things: let's return the R2 score and the RMSE, because those are the most useful metrics to look at. We're done, and we can put this into try and except, with "error evaluating the models" in the except branch.

So we're mostly done; one thing is left here, and you might be thinking, what is left? Guess: we're returning the R2 score and RMSE, so we also have to indicate what the step returns. I'll import from typing import Tuple and from typing_extensions import Annotated. The return type becomes a Tuple of two entries: the annotated float r2_score and likewise the rmse. I hope it makes a little more sense now; I really hope so. Cool, so evaluate_model is done too, which means we're pretty much finished with ingesting the data, cleaning the data, training the model, and evaluating the model. Everything is completed; now, what is left?
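A sketch of the evaluation step:

```python
# steps/evaluation.py — a sketch.
import logging
from typing import Tuple

import pandas as pd
from sklearn.base import RegressorMixin
from typing_extensions import Annotated
from zenml import step

from src.evaluation import MSE, R2, RMSE


@step
def evaluate_model(
    model: RegressorMixin,
    X_test: pd.DataFrame,
    y_test: pd.Series,
) -> Tuple[
    Annotated[float, "r2_score"],
    Annotated[float, "rmse"],
]:
    """Evaluates the model on the test data."""
    try:
        prediction = model.predict(X_test)
        mse = MSE().calculate_scores(y_test, prediction)
        r2 = R2().calculate_scores(y_test, prediction)
        rmse = RMSE().calculate_scores(y_test, prediction)
        return r2, rmse
    except Exception as e:
        logging.error(f"Error in evaluating model: {e}")
        raise e
```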
Let's worry about that. We have run_pipeline and the training pipeline, so let's go into the pipeline, wire it up, and try to run it right away. I'll enable the cache (enable_cache=True), and the pipeline takes the data path; that's good, it takes the data path. Then clean_df: what does clean_df take, and what does it return? It returns X_train, X_test, y_train, and y_test, so let's quickly write X_train, X_test, y_train, y_test = clean_df(df); it takes the data. Now that's done; next we have train_model. What does train_model do? It takes X_train, X_test, y_train, y_test, and the config as well. So let's quickly do that too: model = train_model(X_train, X_test, y_train, y_test). Then we simply go ahead with the metrics (MSE, sorry, I mean the R2 score and RMSE): r2_score, rmse = evaluate_model(model, X_test, y_test); you evaluate the model by giving it these things, the model, X_test, and y_test. I hope that really is X_test and y_test... yes, that's true, and we're mostly done. So now we have the pipeline ready; we have everything ready.
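The fully wired pipeline now looks roughly like this (in the ZenML version used here, config falls back to its defaults when not passed explicitly):

```python
# pipelines/training_pipeline.py — wired-up version.
from zenml import pipeline

from steps.ingest_data import ingest_df
from steps.clean_data import clean_df
from steps.model_train import train_model
from steps.evaluation import evaluate_model


@pipeline(enable_cache=True)
def train_pipeline(data_path: str):
    df = ingest_df(data_path)
    X_train, X_test, y_train, y_test = clean_df(df)
    model = train_model(X_train, X_test, y_train, y_test)  # config uses defaults
    r2_score, rmse = evaluate_model(model, X_test, y_test)
```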
Now let's run the whole pipeline and see the magic. I'm pretty sure it will give some sort of error, but let's always be on the positive side; let's quickly run it. Okay: "No module named sklearn", so let's quickly pip install scikit-learn (I'll use this package name because it's the easy one to install). Let's wait for it to install, because I'm actually using a new environment; that's why, please, activate your environment before working on the project. That's my request to all of you. Now let's wait and see the magic. It's running, so just wait a few seconds, and after that we're mostly done with the MLOps pipeline and you'll see the dashboard. The next things left are the integration of experiment tracking using MLflow, and the deployment of our model using MLflow deployment; those two things remain, and then we're mostly done with the project. I really hope you're seeing the way we do the project: the caching, the way we write the code. That will be even more visible in the next set of projects, where you'll see much more challenging code and topics, which you'll eventually learn by yourself. Hmm, I don't know why it's not running... okay: "cannot unpack non-iterable StepArtifact object". Something really interesting is going on here, so let's
quickly go and see what error it has given. The step expects the DataFrame, and clean_df should return X_train, X_test, y_train, and y_test; let's check whether it's actually returning them. Oh yes, you see: it's not returning anything. We have to actually return the values; that's why it says the step returns None when an output is being asked for, and why it couldn't unpack. Fixed. Cool. Now ingest starts, data cleaning completes, training completes, and then something fails in the pipeline: "R2Score is not defined". Let's go and quickly fix that too: in the evaluation step the class is called R2, not R2Score, so correct the name. Now you'll see how quickly it runs: first it uses the cached versions of the earlier steps, then it just does the evaluate_model step, and you're done. You see the magic; you just see the magic. (Also, simply pip install pyarrow to remove the warning it prints about that.) Let's go to zenml up, please, and it opens the dashboard. Okay, I'll give it the default username; go to Pipelines, the train pipeline, this run. You see it ingests the data and produces the output; cleaning the data, clean_df, returns the four splits; these go into training the model, which returns the model; the model goes into the evaluation step, which returns the R2 score and RMSE. You see how magical this is, how really interesting these things have become. I genuinely think this is the power, the future of ML; if you don't know about this, you're missing a great deal, and I hope you come to understand it in much greater detail. We're mostly done with the project, but there are two things left: deployment, along with
tracking our experiments. I hope this makes sense; I'll catch up with you in the next lecture, bye.

Hey everyone, let's come back to our project. The project has two things left: the first, of course, is our favorite, the experiment tracker, and the second is the deployment of our model. I'll talk about what the experiment tracker means, and I'll also talk about the deployment pipeline; but first, the experiment tracker. When you do real-world data science or machine learning engineering work, you'll most probably find that you want to track every run you do, because you have to tweak the parameters, rerun, check the score against the previous run, compare several metrics, and see how well it performed in the 30th run, or even in the first. So we need to track every experiment we do here. Where should we
implement our experiment tracker? It will be implemented over the train_model step, so let's quickly implement it there. Go to model_train (and I'm so sorry for the background noise; this is India, you'll keep hearing these sounds). I'll simply import mlflow. Once we've imported MLflow, I'll initiate an experiment tracker object (not class, sorry); for that we have to import something known as Client: from zenml.client import Client, and then experiment_tracker = Client().active_stack.experiment_tracker (wait a second... yes: active_stack, then experiment_tracker). Once we have this we can easily use it. Basically, we have to use this MLflow tracker, so in the @step decorator we pass experiment_tracker=experiment_tracker.name, the name of it, so that ZenML is notified that this step has the experiment tracker. Now what we have to do here is actually log our models. For scikit-learn there's autologging: I'll use mlflow.sklearn.autolog(), which will automatically log your model, scores, and everything else, and in the same way MLflow has this for several other libraries. Then I'll do the same kind of thing for evaluation: let me go to the evaluation part, where I have to copy over the same couple of things (the import and the decorator), and there I'll use mlflow.log_metric to log the MSE; the same goes for the R2 with mlflow.log_metric, and likewise the RMSE. So I've logged all three metrics, and it's mostly done; we can use that decorator statement here too, but before that, don't forget the import (sorry, sorry): import mlflow. Cool. Now guess what we have to do: simply go to run_pipeline and run the same pipeline again.
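Here's a sketch of how the evaluation step changes (the decorator signature follows the ZenML version used on screen; train_model gets the same decorator plus mlflow.sklearn.autolog() before fitting):

```python
# steps/evaluation.py — tracked version; assumes the active stack has an
# MLflow experiment tracker registered (stack setup is covered below).
import mlflow
import pandas as pd
from sklearn.base import RegressorMixin
from zenml import step
from zenml.client import Client

from src.evaluation import MSE, R2, RMSE

experiment_tracker = Client().active_stack.experiment_tracker


@step(experiment_tracker=experiment_tracker.name)
def evaluate_model(model: RegressorMixin, X_test: pd.DataFrame, y_test: pd.Series):
    """Evaluates the model and logs all metrics to MLflow."""
    prediction = model.predict(X_test)
    mse = MSE().calculate_scores(y_test, prediction)
    r2 = R2().calculate_scores(y_test, prediction)
    rmse = RMSE().calculate_scores(y_test, prediction)
    # Log all three metrics to the active MLflow run.
    mlflow.log_metric("mse", mse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("rmse", rmse)
    return r2, rmse
```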
Let's quickly run it (I've already done this once). When we run the pipeline it should say that we're using the MLflow tracker, but first it will say something like "No module named mlflow". What you can do is simply run zenml integration install mlflow; you can search for the command and paste it in. It will take some time to install MLflow. Before running again, I should first explain what the stack means. A stack is the bundled environment in which your project runs, and I'll show you what it contains: an artifact store, which is default, and an orchestrator, which is default; you don't need to worry about what these terminologies mean yet. Basically, the stack you're working on is, I'd say, the environment you're working in, and you also need to tell ZenML: I'm going to use MLflow, please register this experiment tracker. And just like that, in the stack you have the orchestrator (we'll talk about what an orchestrator means; roughly, it's the thing that runs your pipelines) and you have the artifact store (where the artifacts go). We'll talk about all of these terminologies in very great detail; however, you don't need to know a lot more for now, since you all have
plenty of theoretical material elsewhere. Basically, we first install our MLflow integration. Once it's installed, as you can see here, we can simply go and register our experiment tracker: zenml experiment-tracker register mlflow_tracker --flavor=mlflow. But before that, let me show you zenml stack list, which shows the set of stacks we have here... it's taking too much time, I guess. Okay, so this is a very common error you might get if you're using a Mac, and you just have to do a couple of things. The first is zenml disconnect, and then run another command, zenml up. When you disconnect, it gives yet another error, "error initializing SQL store" or whatever; that's something new, but you can totally choose to ignore it. Just run zenml up again. Okay, something interesting is going on here, so maybe zenml down first, or, fair enough, down and then up; and if it gives the same error, run zenml disconnect again, maybe, and let's see if that works. Wait a few seconds and it should work. Okay, cool, it's working pretty fine, very nice, so we've fixed it. Let me now show you zenml stack describe. The current stack we have as of
now says that everything is default: your orchestrator is default and your artifact store is default; the orchestrator is where things run, and the artifact store is where, you can assume, the artifacts are being stored. Now let's quickly register the experiment tracker: you can just go to the README, copy the command, and paste it; it's zenml experiment-tracker register mlflow_tracker, which has the flavor mlflow. It says it's unable to register mlflow_tracker because one with that name already exists in the same workspace, so let's quickly change the name to a "customer" variant, and now it should work fine, because I guess I've already used the original name somewhere. Cool. Then there's the model deployer; we could quickly ignore it for now and come back to it, but let's do one thing and register it right along, because it's important. We'll come back to what this model deployer means. So, the same story: register it with the customer suffix as well (just copy it, so that it makes sense), and it sets the deployer. Fair enough; then it says it's unable to register the stack, because the stack name is already registered (again, I've used it in the past), so I'll simply make that name customer-specific too. Let's quickly do it and wait a few seconds. Most probably it's done, and now when you do zenml stack describe, you'll most probably see your new stack there: your model deployer is the mlflow customer one and your experiment tracker is the mlflow tracker. So we've done this. Now let's run the pipeline and see what it leads to: run_pipeline.py, and let's wait a few seconds for the run to complete. It initiates a new run, and then says something interesting:
that you're using an unsupported version: "if you encounter errors, downgrade or upgrade MLflow", blah blah blah. You can totally choose to ignore this, but then something interesting does come in: an actual error, and I'm not sure what it means (I'm so sorry), so let me just search it quickly. Okay, it mentions the scorer; let's try upgrading, what do you think? The warning says to try upgrading or downgrading the scikit-learn version to a supported version, or to upgrade your MLflow. Okay, so pip install --upgrade scikit-learn; done. Then we can also do pip install --upgrade mlflow; we might need a slightly newer version of MLflow. Let's see whether the error persists while it processes; if it does, I have one other solution in mind. Most such problems get fixed by upgrading, reinstalling, disconnecting and reconnecting, or restarting your laptop, because sometimes you don't know what's happening behind the scenes, so you have to be very careful while setting things up, please. Okay, that was a simple one: after the upgrade, it runs completely fine. Cool, that's nice. Now let's quickly go to the dashboard: default
under Pipelines — this one, here we go. If you open the run's configuration you'll see the experiment tracker attached, and at this point you're probably wondering two things: first, where can I find my tracking URI, and second, how can I actually view the MLflow experiments and all my own runs? Let me show you both — how to get the URI, and how to view the experiments. There's a small snippet for this (ZenML provides a very nice bit of code for it); let me search for it... got it. You just drop it into run_pipeline.py,
paste it in, and it prints your tracking URI. Wait a few seconds while it runs — we've already initiated the cache, so it reuses the cached version of every step — and it tells you where your MLflow file store lives. Now we start the MLflow UI pointed at that location: mlflow ui --backend-store-uri followed by the URI that snippet printed (the ml_runs path) — you can find this on the official ZenML page as well. I hit an "unexpected argument" error at first, because the URI needs to be passed as a file path, but once that's fixed it runs. All right, it works.
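For reference, here's roughly the snippet I pasted — a minimal sketch, assuming ZenML's Client API and that the active stack has an MLflow experiment tracker attached:

```python
# Print the tracking URI of the active stack's experiment tracker so we can
# point the MLflow UI at the same backing store.
from zenml.client import Client

tracking_uri = Client().active_stack.experiment_tracker.get_tracking_uri()
print(tracking_uri)

# Then, in a shell:
#   mlflow ui --backend-store-uri "<the printed URI>"
```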
Open the MLflow UI in the browser and there's our run from three minutes ago. You can see the metrics listed, the parameters, and the model itself logged as an MLflow model — you can load that model with MLflow, or even feed it a pandas frame, to make predictions. See how interesting this is? It logged each and every thing automatically, which is exactly what I wanted to showcase; I hope it makes sense. I'll cancel the UI now — all of these commands are in the repo, so don't worry about writing them down. Cool, that's the experiment tracker done. In the next video we'll move on to something known as deployment of our model; I hope that sounds good too. Catch you in the next one. Hey everyone, welcome back to this video — in it I'm going to cover the very last piece,
which is the deployment pipeline. We'll use the MLflow model deployer to deploy our model locally so that you can use it and make predictions. You may have seen the pattern where you just save the model, load it with joblib, and serve it from some FastAPI application — that's really not what happens in production. In production you use something like MLflow deployment or Seldon deployment pipelines. So I'm going to use MLflow deployment, which is mostly meant for local deployment; for deployment on AWS or Google Cloud you'd want Seldon Core, which is a much more advanced deployment tool, but for now let's go with the MLflow deployment tool to get started. The first thing I'll do is create something known as a deployment pipeline.
Let's go make it. I'll open run_deployment.py, scroll through, and first of all remove all the boilerplate — the reason I like clearing it out is that starting from an empty file gives me a certain pleasure; it feels like a load off. Okay, cool. In run_deployment we'll use click (you may have used it before), and we'll create two pipelines. So let's first create the pipeline module itself, pipelines/deployment_pipeline.py. In that file we'll define two pipelines: a continuous deployment pipeline, and an inference pipeline — I'll explain later what each of them means. For now, think of the continuous deployment pipeline as essentially the traditional training pipeline we built earlier, extended with deployment.
explain what do it mean as of now let's just go in quickly as of now assume that continuous P pipeline it's like a traditional pipeline which we have built prior so let's just go from pipelines do deployment pipeline we we going to import some certain things so as of now let let's just go with this which is deployment Pipeline and inference pipeline okay so inference pipeline okay and then what I'll do I'll simply go ahead and create some do do one thing is I'm going to use click okay I'm I'm going to use click so
that we can just state in a command that okay we want to deploy or we want to predict or whatsoever so I'll just go ahead and then create a click command which is click. command and click option click option would be click option would be uh so let me just quickly copy it because that's something is easy ra rather than I write whole set of thing so let me just copy it over here okay so click man is conf right config and then it will say okay in config what do you want to choose you
want to choose deploy or you want to choose predict or you want to choose deploy predict so let me just quickly write over here deploy predict and deploy and predict okay so you can actually state in like this python run deployment pipe run deployment. py do SL sorry uh Das Dash config and then you want to deploy or predict you can simply write it over there and then you have minimum accuracy we'll come to come come to what this minimum accuracy means in some time so uh then what then what I'll do I'll create something
known as this which is run deployment and that config is Str Str and minimum accuracy is of course number we we'll come to what this minimum accuracy means uh okay so we'll just go ahead and write float okay and over here what I'll do if it says deploy if it says deploy what I'll do I'll run the deploy Pipeline and if it says predict I'll run the inference pipeline I'll run the inference pipeline okay um so let's just quickly get started with it so now now you might be thinking hey is it done no of
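Here's a hedged sketch of what that entrypoint ends up looking like — the choice constants and the default threshold are illustrative, not fixed by the project:

```python
# run_deployment.py — CLI entrypoint that chooses which pipeline(s) to run.
import click

from pipelines.deployment_pipeline import (
    continuous_deployment_pipeline,
    inference_pipeline,
)

DEPLOY = "deploy"
PREDICT = "predict"
DEPLOY_AND_PREDICT = "deploy_and_predict"

@click.command()
@click.option(
    "--config",
    type=click.Choice([DEPLOY, PREDICT, DEPLOY_AND_PREDICT]),
    default=DEPLOY_AND_PREDICT,
    help="Run the deployment pipeline, the inference pipeline, or both.",
)
@click.option(
    "--min-accuracy",
    default=0.92,  # illustrative threshold; tuned later in the video
    type=float,
    help="Minimum R2 score required before the model gets deployed.",
)
def run_deployment(config: str, min_accuracy: float):
    deploy = config in (DEPLOY, DEPLOY_AND_PREDICT)
    predict = config in (PREDICT, DEPLOY_AND_PREDICT)
    if deploy:
        continuous_deployment_pipeline(min_accuracy=min_accuracy)
    if predict:
        inference_pipeline()

if __name__ == "__main__":
    run_deployment()
```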
Now, you might be thinking: is that it? Of course not — we still have to build the deployment pipeline and the inference pipeline themselves, and explain that minimum accuracy. So let's go create the deployment pipeline right away. Imports first: import numpy as np, import pandas as pd, from zenml import pipeline and step, and from zenml.config import DockerSettings — you'll see where the Docker settings get used. Then I'll import the MLflow deployer pieces; I'll just copy and paste all of those so it's easier, and you can ignore what they mean for now — we'll come back to them later. We also import from zenml.constants (we'll make use of that too, don't worry). And then our own steps: from the clean-data step we import clean_df; from evaluation, evaluate_model; from data ingestion, ingest_df; and from model training, train_model, which is already there.
The overall plan: train the model, and deploy it only if its accuracy is good enough. First, the Docker settings — these declare the libraries and tools the pipeline needs. In our Docker settings the required integrations are just MLflow, because that's the only integration library we want in this pipeline. Then the basic pipeline itself, the continuous deployment pipeline (I'll explain what the name means as we go): we decorate it with @pipeline, set enable_cache=True since we want caching, and pass the Docker settings through the settings argument. That's pretty much the scaffolding — and yes, I double-checked the spelling of "continuous".
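As a sketch, the scaffolding looks something like this, assuming ZenML's DockerSettings and its MLFLOW integration constant:

```python
# deployment_pipeline.py — scaffolding for the continuous deployment pipeline.
from zenml import pipeline
from zenml.config import DockerSettings
from zenml.integrations.constants import MLFLOW

# Only the MLflow integration needs to be available inside the pipeline image.
docker_settings = DockerSettings(required_integrations=[MLFLOW])

@pipeline(enable_cache=True, settings={"docker": docker_settings})
def continuous_deployment_pipeline():
    ...  # parameters and steps get wired in below
```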
The pipeline takes a few parameters. There's min_accuracy (we'll come to what it means), then workers — the number of workers we need — and then timeout: if starting the service hangs, this says how long to wait before stopping the run. That's exactly why we imported DEFAULT_SERVICE_START_STOP_TIMEOUT from zenml.constants — it lets us stop the pipeline if the service start or stop takes too long. Inside the body, we first call the ingest_df step we imported, and then we need the training splits — x_train and the rest — so go over to the training pipeline and bring in everything it does. Done: everything's imported, including the R2 score from evaluation.
So now we have the R2 score and the RMSE, and we have the trained model; next we have to actually deploy it — but there should be some criterion for deploying. The criterion is: if the model's accuracy is greater than the minimum accuracy required to deploy, then and only then deploy the model. That's where min_accuracy comes into the picture. So let's create something known as a deployment trigger; the deployment decision will depend on this trigger step. First a config class, DeploymentTriggerConfig, which extends ZenML's base parameters and holds the minimum accuracy (I'm putting in a placeholder number — don't worry, we'll change it). Then we create the step itself: deployment_trigger takes the accuracy, a float, plus that config. What does it do? As its docstring says, it implements a simple model deployment trigger that looks at the input model accuracy and decides whether it is good enough to deploy. It's very basic: it simply returns whether the accuracy is greater than or equal to config.min_accuracy, and in the pipeline the deployment decision will come from calling this trigger.
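A minimal sketch of the trigger, assuming ZenML's BaseParameters step-config base class; the 0.92 default is a placeholder:

```python
# Deployment trigger: a step that gates deployment on model quality.
from zenml import step
from zenml.steps import BaseParameters

class DeploymentTriggerConfig(BaseParameters):
    """Parameters controlling when a trained model may be deployed."""
    min_accuracy: float = 0.92  # placeholder; tuned later

@step
def deployment_trigger(accuracy: float, config: DeploymentTriggerConfig) -> bool:
    """Look at the input model accuracy and decide if it is good enough to deploy."""
    return accuracy >= config.min_accuracy
```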
For the metric, we could go with MSE or RMSE, but let's use the R2 score, which suits this better — if you haven't met R2 before, it's a goodness-of-fit measure, so a quick online search will fill you in. So the deployment decision takes the R2 score and compares it against the required minimum (0.992 for now): the model gets deployed if and only if the trigger comes back true. If it does, we move to the next step: mlflow_model_deployer_step. What is that? Let me show you — we import it from ZenML's MLflow integration steps; it's a prebuilt step, so we can use it to deploy our model without writing the serving logic ourselves. We just hand it a few parameters: the model, the deployment decision, workers=workers, and timeout=timeout. And cool — that's the deployment pipeline done; we can now use it for serving, that is, for running our deployment.
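Putting it together, the finished pipeline ends up looking roughly like this — a hedged sketch that already includes the data_path parameter we only add later, when the run complains about a missing entrypoint input. The module paths under steps/ are this project's layout, not a library API:

```python
# Continuous deployment pipeline: train, evaluate, and conditionally deploy.
from zenml.constants import DEFAULT_SERVICE_START_STOP_TIMEOUT
from zenml.integrations.mlflow.steps import mlflow_model_deployer_step

from steps.clean_data import clean_df
from steps.evaluation import evaluate_model
from steps.ingest_data import ingest_df
from steps.model_train import train_model

@pipeline(enable_cache=True, settings={"docker": docker_settings})
def continuous_deployment_pipeline(
    data_path: str,
    min_accuracy: float = 0.92,
    workers: int = 1,
    timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT,
):
    df = ingest_df(data_path)
    x_train, x_test, y_train, y_test = clean_df(df)
    model = train_model(x_train, x_test, y_train, y_test)
    r2, rmse = evaluate_model(model, x_test, y_test)
    deployment_decision = deployment_trigger(accuracy=r2)
    mlflow_model_deployer_step(
        model=model,
        deploy_decision=deployment_decision,  # "deploy_decision", as we find out the hard way below
        workers=workers,
        timeout=timeout,
    )
```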
We'll come back to building the inference pipeline; first, over in run_deployment.py, let's hook this up. We have the config choice and the minimum accuracy required for us to deploy our model. We need the MLflow model deployer stack component, so let me import each library required — I'll paste the imports from my repository. The key one is the MLFlowModelDeployer component; calling its get_active_model_deployer method fetches the model deployer that's active in the current stack. Then we translate the config flag into two booleans: deploy is true if the config is deploy or deploy_and_predict, and predict is true if it's predict or deploy_and_predict — the first runs the deployment, the second runs the prediction. Next: from pipelines.deployment_pipeline import continuous_deployment_pipeline, and we call it with the required min_accuracy, workers=3 (let's name it that), and a timeout of maybe 60 seconds. So that's our continuous deployment pipeline wired up; now let's make use of it.
Let me copy the rest from the ZenML repository — we'll build the predict path very soon, but let's lay this out. The first bit prints how to launch the MLflow UI for this run, so you can see a visual representation of your models. The next bit fetches any existing services with the same pipeline name, step name, and model name — in other words, it checks whether a prediction server from a previous run is still around. I've copied this from their own MLflow examples repository, because the logic is basically identical: if an existing service is found, it reports that the MLflow prediction server is running locally as a daemon and shows how to stop it; if the service failed, it says the MLflow service has failed; and if no MLflow prediction server is running at all, it tells you to use the deploy config to start one. So this is a very basic run_deployment; since we don't have the inference pipeline yet, we don't need to worry about much more.
We have the minimum accuracy set, so nothing else to fuss over — let's run it: python run_deployment.py --config deploy, and if it gives errors we'll solve them right away. First error: no module named materializer, raised from the deployment pipeline. Are we even using the materializer there? No — so remove that import; you don't need to worry about it, on to the next one. Now it says: invalid settings key — settings can either refer to general settings or stack component settings; available keys are "resources" or "docker". Interesting. Where is it failing? The deployment pipeline again: we have to pass the settings key as "docker", not "docker_settings". (This is why I keep saying most of our time goes into solving exactly these pretty little errors.) Fixed, and mostly done — run it again. Now it complains about main: our entrypoint is named run_deployment rather than main, so let's invoke the right one. Sorry for that — let's wait... whoops, now a config naming error; some naming slip on my part, I'm
so sorry for that again. No problem — these very silly errors are the kind I keep making; I genuinely want a tool that catches all these naming and import mistakes for us. The next one: wrong arguments — the MLflow deployer step got an unexpected argument called deployment_decision. Where does it come from? The continuous deployment pipeline, where we pass the decision into the MLflow deployer step. Looking closer: the parameter is actually called deploy_decision, not deployment_decision. Okay — rename it and run; I hope it works. Now "deploy_decision is not defined" — why? Fair enough, another blunder: the rename should have gone over here, in the step call, not over there. I'm so sorry for these mistakes; this is what happens when too many things are going on at once. Anyway, it initiates a new run,
and it says: missing entrypoint input, data_path. We never gave the pipeline its data. Let's do this properly: add a data_path: str parameter to the pipeline, then also pass the data path through in run_deployment, copying it straight from the training pipeline. I hope it works now; if not, we'll fix something again — and inference is still left, so stay tuned; after that we're mostly done. Still "missing entrypoint input: data_path" — where now? Okay, okay, my mistake again: we also have to actually pass data_path=data_path in the call. Sorry for that — these little errors keep happening; you just have to debug and see where your code is falling over. Now it runs, reusing the cached versions of the earlier steps, and it reaches the MLflow deployer.
Then: "an MLflow model with this name was not logged in the current pipeline and no running MLflow server was found; please ensure the pipeline includes a step with an MLflow experiment configured that trains a model and logs it." Most probably, I feel, we have to set the caching flag to false so the training step actually re-runs and logs the model — let's do that and run again. If this doesn't deploy the model, we're going to be in big trouble. Honestly, this is the step where I just pray it works: it's where most of the errors show up, and if some unknown error strikes you can spend a ton of time on it — plenty of people have — because it's not a simple thing; you actually have to go inside your system and see what's going on. Now watch what happens: it says "no materializer is registered for type LinearRegression, so the default pickle materializer was used — this is not production-ready", and so on. We'll have to build a proper materializer; I'll show you how later. For now I'm just waiting for the mlflow_model_deployer_step to deploy the model... and something just happened: "failed to start the MLflow deployment service — service daemon is not running; for more information on the status, please see the log file." So essentially a timeout happened there. Let's go and
see what we can do in this case. So: failed to start the service, the MLflow deployment service daemon is not running. I try zenml up and zenml stack describe — is there anything we got wrong in the deployer setup? We have everything in place, so why this error? Something interesting is happening here. Running the deployment again, there's a warning that might be causing the problem... and then, there it is: "skipping model deployment because the model quality does not match the criteria; using the last model deployed by step ... for the model." So that's what's really happening: the model isn't meeting the deployment criterion. Let's set the minimum accuracy to 0.5 — hmm, it says the same thing again; did I actually reduce it? Okay, minimum accuracy 0.5, run again, and now all we can do is wait and see whether it works. I'll tell you what matters in these situations: you have to concentrate, really look at what's going wrong and what might go wrong, down to the smallest thing — even restarting your laptop genuinely works sometimes. I was once solving an error for two days, restarted the laptop, and it worked like a charm. So let's just wait a few seconds
and see whether it works or not. Deployment trigger starts, then the MLflow deployment service... "skipping model deployment because the model does not meet the criteria." My goodness — again! Really interesting things happening here. Let me walk through the code while this settles. We have the fetching of existing deployment services — and of course the run itself finishes quickly, since nothing new is deploying; I'll keep checking whether it behaves differently on your side. Going into the deployment pipeline: the MLflow deployment service gets its parameters from the step, and there's that "service daemon is not working" message — fair enough, now I see what the error is about. In the step parameters, the trigger checks whether your accuracy is greater than config.min_accuracy, and the minimum accuracy is indeed the 0.5 we set — fine. Then there are the MLflow model-loader step parameters — pipeline name, step name, running — so while we're here, let's also copy over the MLflow deployment loader pieces: they fetch everything around the MLflow deployment, namely the prediction_service_loader and the predictor, which we'll come to in detail shortly (I've actually written this code out before).
I'll try to run it one more time and see whether it works, but before that let me check that everything is fine on the run_deployment side: we fetch the MLflow model deployer from ZenML's model deployers — yes, we have it — and we have get_tracking_uri; fair enough. So let's run it once more, nicely, and see. If it still fails, we may actually have to go and talk to the ZenML team — it's worth having a continuous conversation with them, because this might be a common issue on their side with a known solution — or open a GitHub issue, and most probably the error gets solved that way. That's honestly how these things are solved: we're not the experts here; we go to the people who are and talk to them. But let's try one more solution first: look at the model itself, the linear regression. And again — it just doesn't match the deployment criteria. So I want to see the R2 score... and the R2 score is so bad, frankly. Okay, that explains it: the model is simply bad, which is why the trigger kept refusing it.
So I'll set the minimum accuracy to zero — this might run; zero just means I want to showcase that the deploy path actually deploys the model, since the R2 score is of course greater than that. (Why the R2 is this bad is its own strange question — really interesting things keep happening to me — and we'll fix that separately if this doesn't sort itself out.) Please run... The "no materializer is registered" warning appears — that one's routine by now — and then: "updating an existing MLflow deployment service." It met the criteria this time! Let's see whether this works; if it doesn't, we'll come back and dig into what's going on. It's updating the existing MLflow deployment service... though at this stage I think it will mostly not work, because it isn't taking as long as a real service start should. It will probably
say the daemon is not working, and we'll need to try one more solution I have in mind. And indeed: "service daemon is not running — for more information please see the following log file." Let me check the log and come back very soon... Okay, everyone, I'm back, and it was a very simple error: I'd already tried this MLflow setup in a couple of environments, so an old service was still registered and couldn't be reused — the current service was colliding with it. So I deleted that old service. Also, if you're working on something new you'll likely be on a new stack anyway — we had to actually switch to the new stack, which was the other thing giving me the error. Now it's working totally fine. The only thing that still isn't is one import, so let's go and fix that: we need to import cast.
So: from typing import cast. Fair enough — now run the deploy again, and you can totally choose to ignore the warnings, or go and resolve them if you want. Watch it go: first it ingests the data; after ingesting, it cleans it — data cleaning completed; then it moves to the next step and trains the model, printing some warnings you can again ignore; model training completes — the trained model is finished — and it reports the error scores; then the deployment trigger starts, the MLflow model deployer step starts, it updates the existing MLflow deployment service with the latest model, and it starts the service. Let's see — hopefully it works this time...
It does. Your model is now available: it's already running, you can make predictions against it, and you can also delete the service if you want. So the model is successfully deployed; what we need next is to actually make predictions from it. Here's what I'll do — back in deployment_pipeline.py, we already have the MLflow deployment loader step imported, which will help us load that deployed model; now let's start building on it.
We'll define the prediction_service_loader. I'll copy and paste the code — it's pre-written — but let's walk through it. It's a step with caching disabled, enable_cache=False, because caching a service lookup is sometimes not a good idea. The step takes a pipeline_name (a str), a pipeline_step_name (also a str), a running flag defaulting to true, and a model_name defaulting to "model". And what does it return? An MLFlowDeploymentService — the prediction service started by the deployment pipeline. It takes all of those arguments and works like this: first, get the MLflow model deployer stack component, which is as simple as calling get_active_model_deployer. Then fetch existing services with the same pipeline name and model name: existing_services comes from the MLflow model deployer component's find_model_server, passing pipeline_name, pipeline_step_name, model_name, and running straight through. If there are no existing services, we raise a RuntimeError saying that no MLflow prediction service deployed by that step in that pipeline for that model is currently running — I just copied these fairly traditional error messages from the ZenML examples; you can do the same, it's not a big thing. Otherwise, we return the existing service.
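Here's a hedged sketch of that step, assuming ZenML's MLflow integration classes:

```python
# prediction_service_loader — fetch the server the deployment pipeline started.
from zenml import step
from zenml.integrations.mlflow.model_deployers.mlflow_model_deployer import (
    MLFlowModelDeployer,
)
from zenml.integrations.mlflow.services import MLFlowDeploymentService

@step(enable_cache=False)
def prediction_service_loader(
    pipeline_name: str,
    pipeline_step_name: str,
    running: bool = True,
    model_name: str = "model",
) -> MLFlowDeploymentService:
    """Return the prediction service started by the deployment pipeline."""
    mlflow_model_deployer_component = MLFlowModelDeployer.get_active_model_deployer()
    existing_services = mlflow_model_deployer_component.find_model_server(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
        model_name=model_name,
        running=running,
    )
    if not existing_services:
        raise RuntimeError(
            f"No MLflow prediction service deployed by step "
            f"{pipeline_step_name} in pipeline {pipeline_name} for model "
            f"{model_name} is currently running."
        )
    return existing_services[0]
```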
So that's the prediction_service_loader: it loads the current prediction service so we can use it for model predictions. Next I'll create the predictor. The predictor step takes the service, typed as MLFlowDeploymentService, plus the input data, and it returns an np.ndarray — the array of predictions. But what will it predict on? We need one more step for that: a dynamic data importer, again with enable_cache=False. The dynamic_importer returns a str: it downloads the data from, say, a mock API — or, simpler, just does data = get_data_for_test() and returns that data. We have to build that helper quickly, so let's go create utils.py and write it.
In utils.py: import logging, import pandas as pd, and from our data-cleaning module in the source tree import DataCleaning and the preprocessing strategy, DataPreProcessStrategy. Then the function itself, get_data_for_test: it loads the data, takes a sample of 100 rows, actually cleans them with the same preprocessing strategy as training, drops the review_score label, and converts the frame to JSON — which is exactly why it returns a str. Now let's quickly put this together.
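A hedged sketch of that helper and the step that wraps it — the CSV path, the sample size, and the DataCleaning/DataPreProcessStrategy interfaces are this project's own, so treat them as illustrative:

```python
# utils.py — sample raw rows, apply training-time preprocessing, drop the label.
import logging

import pandas as pd
from zenml import step

from src.data_cleaning import DataCleaning, DataPreProcessStrategy  # project module

def get_data_for_test() -> str:
    try:
        df = pd.read_csv("./data/olist_customers_dataset.csv")  # illustrative path
        df = df.sample(n=100)
        preprocess_strategy = DataPreProcessStrategy()
        data_cleaning = DataCleaning(df, preprocess_strategy)
        df = data_cleaning.handle_data()
        df.drop(["review_score"], axis=1, inplace=True)
        return df.to_json(orient="split")
    except Exception as e:
        logging.error(e)
        raise

@step(enable_cache=False)
def dynamic_importer() -> str:
    """Import a fresh batch of test data as a JSON string."""
    return get_data_for_test()
```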
As for the predictor, I've already written it, so let me copy and paste it in — it's pretty simple to understand. First it starts the service, then it loads the JSON data, strips out the metadata entries, keeps the columns we want from the data, converts it into a pandas DataFrame, converts that into a list, finally turns the JSON list into a NumPy array, and makes the prediction from the service. I hope that makes sense now.
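Here's a hedged sketch; the real project hard-codes its expected feature columns, but to keep this self-contained I reuse the column names shipped inside the JSON itself:

```python
# predictor — send a batch of rows to the deployed MLflow prediction server.
import json

import numpy as np
import pandas as pd
from zenml import step
from zenml.integrations.mlflow.services import MLFlowDeploymentService

@step
def predictor(service: MLFlowDeploymentService, data: str) -> np.ndarray:
    """Run an inference request against the running prediction service."""
    service.start(timeout=10)  # should be a no-op if the daemon is already up
    payload = json.loads(data)
    columns = payload.pop("columns")  # the project hard-codes these; we reuse the JSON's
    payload.pop("index", None)
    df = pd.DataFrame(payload["data"], columns=columns)
    json_list = json.loads(json.dumps(list(df.T.to_dict().values())))
    return service.predict(np.array(json_list))
```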
Now we're mostly done: we have the prediction service loader, so at last we can create the inference pipeline. It's decorated with @pipeline, caching disabled, and the same Docker settings as before. The inference_pipeline takes the pipeline_name we want and the pipeline_step_name, both str. First it uses the dynamic_importer; then it gets the service from the prediction_service_loader, passing the pipeline details and, for now, running=False; and the final production step is the predictor, with service=service and data passed in. That's mostly it: we have the data from the importer and the service from the loader, and the predictor uses that service to make predictions on that data — the predictor you've just seen above.
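As a sketch, with everything defined earlier in the file:

```python
# inference_pipeline — batch predictions against the deployed service.
@pipeline(enable_cache=False, settings={"docker": docker_settings})
def inference_pipeline(pipeline_name: str, pipeline_step_name: str):
    batch_data = dynamic_importer()
    model_deployment_service = prediction_service_loader(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
        running=False,
    )
    predictor(service=model_deployment_service, data=batch_data)
```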
Now back to run_deployment.py to run the pipelines: import the inference_pipeline there, and once it's imported, call it with the pipeline name, which should be "continuous_deployment_pipeline", and the pipeline step name, which is "mlflow_model_deployer_step", I believe. It should work now, most probably — so let's run the predict config. It's tiring, I know, but please fix these errors with me —
this first one is very basic: a version mismatch that a zenml downgrade resolves. Fair enough — next we get exactly the error I expected: from utils import get_data_for_test was missing, so add the import (I'm genuinely happy to see the error I predicted). One thing I'll say here: it's completely normal if you don't follow all of this the first time — it's very conceptual, very technical material — so be strategic about how you absorb it, and if something doesn't land, that's totally all right. Next error: DataPreProcessStrategy is not defined; what does that mean? Just a missing import again — fix it, and only worry if it still complains. Then something really interesting, which I'd hoped wouldn't appear: "the built-in materializer cannot handle numpy.ndarray; built-in materializers can only handle artifacts of the following types..." — this is so tiring. Let's go and fix it: where is it coming from? The predictor, in the deployment pipeline: we annotated the incoming data as np.ndarray, but the data arriving from the dynamic importer is a str, so change the annotation to str and see whether it works. Next: "json is not defined" — import json. Anything else? Please give me the errors fast; I'm worried enough as it is. One more version complaint — again, a zenml downgrade should fix it — please... yes!
We're done — the whole thing actually completes, so let's go and enjoy it. Look at the run: you have the dynamic importer and the prediction service loader; the dynamic importer outputs the test data, the service loader outputs the service, and the predictor uses both of those outputs to produce the predictions. If you open the visualization you'll see something really interesting: your predictions have been made right there. (Mine isn't rendering the full visualization because of an error on my side, but you can still see the mean and standard deviation of the predictions.) So it genuinely made predictions out there. I hope this makes sense now: we've deployed the model and made predictions too — deployment and inference are both done, and it's actually inferring properly.
Now, you might be wondering how to make single, hand-entered predictions — and you might still be a bit confused overall — so let's not leave any confusion in your head and fix that too. I've made a simple Streamlit app, streamlit_app.py; let me paste it in and hope it mostly works. From the deployment pipeline it imports the prediction_service_loader, and from run_deployment it imports the entrypoint — let's rename that to main rather than run_deployment and reference it here as main. Everything else is the same; the only new part is that when the user clicks the Predict button, the app goes to the prediction service, and if the service is there it creates the DataFrame from the form inputs and does the same steps we did in the predictor — data in, predict out. Let's run it: streamlit run streamlit_app.py. First complaint: the "high level overview" image isn't there, so I'll make sure to remove the image references — any other images? Okay, run it again and wait... it's running. Cool. I'll fill every input with some values and hit Predict.
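A hedged sketch of that app — the single input field stands in for the full feature form, and calling a ZenML step eagerly like this mirrors how the app is described here, though behavior may vary by ZenML version:

```python
# streamlit_app.py — minimal front end over the deployed prediction service.
import json

import numpy as np
import pandas as pd
import streamlit as st

from pipelines.deployment_pipeline import prediction_service_loader
from run_deployment import main as run_main

def main():
    st.title("Customer Satisfaction Prediction")
    payment_value = st.number_input("Payment value")  # one of several feature inputs

    if st.button("Predict"):
        service = prediction_service_loader(
            pipeline_name="continuous_deployment_pipeline",
            pipeline_step_name="mlflow_model_deployer_step",
            running=False,
        )
        if service is None:
            st.write("No prediction service found; running the deployment pipeline first.")
            run_main()
        df = pd.DataFrame({"payment_value": [payment_value]})  # plus the other features
        data = np.array(json.loads(json.dumps(list(df.T.to_dict().values()))))
        prediction = service.predict(data)
        st.success(f"Predicted review score: {prediction[0]:.2f}")

if __name__ == "__main__":
    main()
```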
And there's the prediction: the predicted score comes out around 4.22, and it's genuinely using the deployed model for that. Notice that we never saved a model file anywhere — there's no saved model sitting on disk; the app is talking to the model served by our pipelines. Go look at the pipelines view: open the continuous deployment pipeline and you'll see the inference pipeline run is done, and going back, the continuous deployment run is there too. So the whole pipeline is complete, which means we've finished one full project — I hope it was a really good project for you and that you understood it. Cool. That's a wrap on this one; in the next project we'll tackle something new — customer churn, perhaps. Catch you later, bye-bye!