Welcome, students, to this course on data analytics with Python. Today is the introduction class, and this lecture is an introduction to data analytics. The objective of this course is to build conceptual understanding using simple and practical examples, rather than a repetitive, point-and-click mentality. Most students use software for data analytics in just that way: they want to click and get the result, and they do not bother about what exactly is happening inside the software. This course should make you
comfortable using analytics in your career and your life. You will know how to work with real data. You might have learned many different methodologies, but choosing the right methodology is important, and this course will help you choose the right data analytics tools. Look at this picture and how this person is using the tool: there is a ladder, but he does not know how to use it correctly for the purpose it is intended. So the danger in using quantitative methods does not generally lie in the inability
to perform the calculations; because of developments in computer technology, many packages are available for doing data analytics. The real threat is a lack of fundamental understanding of why to use a particular technique or procedure, how to use it correctly, and how to interpret the result correctly. This course will focus on how to choose the right technique, how to use it correctly, and how to interpret the result.

So what are the learning objectives of this class? After completing this lecture, first, you will be able to define what
data is and why it is important. You will be able to define data analytics and its types, and to explain why analytics is so important in today's business environment. Then we will see how statistics, analytics, and data science are interrelated; there seems to be some overlap among them, so we will clarify how they differ and how they overlap. In this course we are going to use Python, and I will explain why it is important to use Python here. At the end of this session we will explain
the four important levels of data: nominal, ordinal, interval, and ratio.

Now we will go to the content. First we will define data and its importance. There are three terms: variable, measurement, and data. Next we will see what is generating so much data, then how data adds value to a business, and then why data is important. Variable, measurement, and data are terms we are going to use frequently in this course. So what is a variable? A variable is a characteristic of any entity being studied that is
capable of taking on different values. For example, X is a variable; it can take any value: it may be 1, it may be 2, it may be 0, and so on. Measurement is when a standard process is used to assign numbers to particular attributes or characteristics of a variable. For example, to substitute a value for X, you have to measure the characteristic of the variable; that is your measurement. Then what is data? Data are recorded measurements. So there is a variable; you measure
the phenomenon, and after measuring it you substitute some value for the variable, so the variable takes a particular value. That value is your data. If X is the variable, the number 5 is, for example, the data, and how you arrived at that 5 is the measurement.

Then what is generating so much data? Data can be generated in different ways: by humans, by machines, and by humans and machines combined. Nowadays everybody has a Facebook account, we have LinkedIn accounts, we are in
various social networking sites. So the availability of data is not the problem; it can be generated anywhere information is generated and stored, in a structured or unstructured format.

So how does data add value to a business? Assume that the data, after being collected from various sources, is stored in the form of a data warehouse. From the data warehouse, the data can be used for the development of a data product. I am using the term data product here, and in the coming slides I will explain exactly what
a data product is, with some examples. The same data, if you look at the right-hand side, can also be used to get more insights. What do we mean by a data product? For example, algorithmic solutions in production, marketing, and sales. One example of a data product is a recommendation engine. Suppose you go to Flipkart or Amazon to buy a particular product: the software itself will recommend the next possible product that you can buy. That
is nothing but a recommendation engine. Even if you watch some YouTube videos on a particular topic, YouTube itself will suggest other relevant videos that are available. That is a recommendation engine, and it is one example of a data product. So with the help of data you can either form a data product or get insight from the data, and that adds business value for you. Here is an example of a data product: this is the driverless car, the Google car. So
the whole concept of the Google car is that, with the help of data, it detects everything required for driving the car. The next example is the recommendation engine: as I told you previously, when you buy any product, it will suggest other products that can be purchased along with it. Another very common example of a data product comes from Google, which has a lot of applications: Google Maps. Google Maps helps you find the right route: on which road
there is traffic, on which road there is a toll booth. This kind of information we can get from Google Maps, so Google Maps is one example of a data product.

Now, why is data important? Data helps in making better decisions. Data helps solve problems by finding the reason for underperformance: suppose some company is not performing properly; by collecting data we can identify the reason for this underperformance. Data helps one to evaluate performance: what is the current
performance? Data can also be used for benchmarking the performance of your business organization, and after benchmarking, data helps in improving the performance as well. Data can also help you understand consumers and markets, especially in the marketing context: you can understand who the right consumers are and what kind of preferences they have.

Next we will define data analytics and its types. In the coming two or three slides we will define what data analytics is, then we will see why analytics
is important, then we will see what data analysis is, then how data analytics is different from data analysis, and at the end we will see the types of data analytics. We define data analytics as the scientific process of transforming data into insights for making better decisions. Note that it is a scientific process of transforming data into insight for making better decisions. Even without data, even without doing analytics, you can make a decision, but you cannot make a better decision without analytics. By virtue of your experience and intuition you can
take decisions, and those may sometimes be correct; but if you make the decision with the help of data, that will enable you to make better decisions. Professor James Evans has defined data analytics in this way: it is the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions. You see that many terms appear here: one is IT, the next one is statistical analysis, the next one
is quantitative methods, and then mathematical and computer-based models. We will see how these are interrelated in the coming slides. Generally there is also confusion among students about whether analysis and analytics are the same or different.

Why is analytics important? Opportunities abound for the use of analytics and big data, such as determining credit risk; developing new medicines, especially in healthcare, where healthcare analytics is an emerging area that helps identify the correct medicines; and finding more efficient ways to deliver products and services. For example, in
the banking context, data analytics is used for preventing fraud and for uncovering cyber threats: with the help of data analytics you can find possible cyber crimes, detect them, and prevent them. Data analytics is also important for retaining the most valuable customers: we can identify which customers are valuable and which are not, so we can focus more on our valuable customers.

Now, what is data analysis? It is the process of examining, transforming, and arranging raw data in a specific way to generate useful information
from it. Data analysis allows for the evaluation of data through analytical and logical reasoning to lead to some sort of outcome or conclusion in some context. Data analysis is a multi-faceted process that involves a number of steps, approaches, and diverse techniques, which we will see in the coming lectures.

So now we will see the difference between data analysis and data analytics. When you say data analysis, it is about what has happened in the past: we will explain how it has happened;
we can explain why it has happened. Data analysis is nothing but studying what has happened; it is a kind of post-mortem analysis of the past. On the contrary, analytics is studying what will happen in the future: with the help of analytics we can predict and explore potential future events. Analytics may be qualitative or quantitative. In qualitative analytics the decision is mostly based on intuition, but in quantitative analytics, with
the help of formulas and algorithms, we make the decisions. In data analysis we can also go for qualitative analysis, where we explain how and why a story ended the way it did, or quantitative analysis, where we can say how the sales decreased last summer. So, to repeat, analysis is studying what has happened in the past. Analysis is not exactly equal to analytics: data analysis is different from data analytics. Similarly, business
analysis is different from business analytics. Analytics is nothing but studying future events with the help of past data.

Next we will go to the classification of data analytics. Based on the phase of the workflow and the kind of analysis required, there are four major types of data analytics: descriptive analytics, diagnostic analytics, predictive analytics, and prescriptive analytics. We will see these four types in detail in the coming classes. The difficulty and the value that we can get from the different types of analytics are shown in this picture. For example, when
you look at descriptive analytics, it answers the question: what happened? Diagnostic analytics helps you answer: why did it happen? Predictive analytics helps you answer: what will happen? Prescriptive analytics helps you answer: how can we make it happen? Now look at the level of difficulty: for descriptive analytics the level of difficulty is low, whereas for prescriptive analytics the difficulty is higher, and the value, meaning the business value it adds for you, is also greater. So
when there is more difficulty, there is more value.

Now we will see what descriptive analytics is. Descriptive analytics is the conventional form of business intelligence or data analysis. It seeks to provide a depiction or summary view of facts and figures in an understandable format, which either informs or prepares data for further analysis. Descriptive analytics, or put another way descriptive statistics, can summarize raw data and convert it into a form that can be easily understood by humans. It can describe in detail an event that has occurred in the past.
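
As a minimal sketch of what summarizing raw data can look like in Python, the snippet below reduces a list of numbers to a few descriptive statistics using only the standard library. The sales figures and the helper name `describe` are made up for illustration; they are not from the lecture.

```python
import statistics

# Hypothetical monthly sales figures (illustrative values only).
monthly_sales = [120, 135, 128, 150, 142, 165]


def describe(values):
    """Summarize raw numeric data with a few descriptive statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = describe(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary only describes what has already been recorded; it makes no statement about why the values look this way or what they will be next month.
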
Common examples of descriptive analytics are company reports that simply provide a historical review, such as data queries, reports, descriptive statistics, data visualization, and data dashboards.

Next we will go to diagnostic analytics. Diagnostic analytics is a form of advanced analytics which examines data or content to answer the question: why did it happen? We are diagnosing. Suppose you meet a doctor for a consultation: he will try to understand why this has happened. That kind of analytics is nothing but diagnostic analytics.
Diagnostic analytical tools aid an analyst in digging deeper into an issue so that they can arrive at the source of the problem. Just as a doctor identifies the source of the problem when somebody has a disease, diagnostic analytics helps you identify the core reason when, for example, a company is not performing well. In a structured business environment, tools for both descriptive and diagnostic analytics go in parallel. So whether it is descriptive or
diagnostic analytics, the analytical tools we use can be the same; only the purpose may be different. For example, data discovery, data mining, and correlations can be used for descriptive analytics as well.

Now we will go to predictive analytics. Predictive analytics helps to forecast trends based on current events. When you say predicting, obviously it is about what will happen in the future. Predicting the probability of an event happening in the future, or estimating the time at which it will happen, can be done with the help of predictive analytical models. Many different but
co-dependent variables are analyzed to predict a trend in this type of analysis. One of the tools of predictive analytics is regression analysis: there may be some independent variables and some dependent variables, sometimes more than one dependent variable, and we study how these variables are interrelated. That is predictive analytics. Looking at this picture, you see that from historical data, using different predictive algorithms, you can come up with a model. Once the model is developed, new data can be fed into
this model, so we can get predictions about future events. Examples are linear regression, time series analysis and forecasting, and data mining; these are the techniques for predictive analytics.

The last one is prescriptive analytics: a set of techniques to indicate the best course of action. It tells what decision to make to optimize the outcome. The goal of prescriptive analytics is to enable quality improvements, service enhancements, cost reductions, and productivity increases. Some of the tools we can use are optimization models, simulation models, and decision analysis; these are the tools under
prescriptive analytics.

Next we are going to see why analytics is so important. In this section we will look at the demand for data analytics and at the different elements of data analytics. This picture shows Google Trends data up to 2017. The blue line represents data scientist, and the orange represents statistician and operations researcher. You see the trend is increasing: people are searching for the term data scientist in Google more and more often. The search count is increasing, and that means
there is demand for that particular job. This is a newspaper clipping from the Times of India. A lot of news is coming out about data scientists and the future requirement for data scientists; you see, data scientists are earning more than CAs and engineers. You can look at this link for more. Here is the demand for data analytics, again in a newspaper clipping: with companies across industries striving to bring their research and analysis departments up to speed, demand for qualified data scientists is rising. So
this is an emerging field, and many companies are looking for qualified data scientists. If you take this course and complete it, you may be qualified for getting into these companies.

Many times you hear the terms data analytics, statistics, data mining, and optimization, and students have different understandings of them. Data analytics has different elements: one is statistics, the next is business intelligence and information systems, then modeling and optimization, then simulation and risk. If you are able to do what-if analysis, that is nothing
but sensitivity analysis. Then there are visualization and data mining. These are the components of data analytics, and these different domains are interrelated.

Next we will see what kind of skill set is required to become a data analyst, and then the small difference between a data analyst and a data scientist. To become a data analyst, the basic requirement is knowledge of mathematics. Next you need knowledge of technology, which is nothing but hacking skills; hacking skills in the sense that, when data is given, you look at hacking in the
positive way: how to use the data to get more information. The next skill is business and strategy acumen: you should have knowledge of the business domain and strategic acumen. These three skills are required for a good data scientist. It is very difficult for one person to have all three, and that is why good data analysts are hard to find: somebody may be very good at mathematics but may not have very good knowledge of business. Some people may be very good
at technology, meaning information technology, but may not have good business knowledge, and so on. So we need the combination of all three skills in one person; otherwise a group of people, some from mathematics, some from computer science, and some with the domain knowledge, have to work together to form a good data science team. Together these form data science.

Now, what is the difference between data analysts and data scientists? The difference is in the kind of role they perform.
Take the role of a data analyst. In a business context he may have knowledge of the business domain: if he is good at doing analytics in the area of marketing, he can be called a marketing analyst; if he is from the finance area, he can be called a finance analyst. So he is a data analyst. But the role of a data scientist is a little bigger, because the data scientist needs to have knowledge of advanced algorithms and machine learning and be able to come out with a data product,
which I described previously. So the data scientist can come out with a data product.

In this course we are going to use Python. In the next lecture I will give you a basic introduction to Python; here we will see why we are going to use it. Python is very simple and easy to learn. Most importantly, it is free and open source. It is interpreted rather than compiled. What do I mean by compiler and interpreter? A compiler has to process the whole
program, but an interpreter need not work that way: it can execute even a single line of the program. It is dynamically typed. Dynamically typed means that, whereas in some other languages you have to declare the nature of every variable, whether it is an integer or a float, here you need not do so; the type is taken dynamically from the value. It is extensible. Extensible means that code written in some other language can be extended with the help of Python.
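
The dynamic typing described above can be sketched in a few lines. This is only an illustration; the variable name `x` and the values assigned to it are made up for the example.

```python
# No type declaration is needed: the type follows the value that is assigned.
observed_types = []

x = 5            # x refers to an integer
observed_types.append(type(x).__name__)

x = 2.5          # the same name can now refer to a float
observed_types.append(type(x).__name__)

x = "five"       # ... or to a string
observed_types.append(type(x).__name__)

print(observed_types)  # ['int', 'float', 'str']
```

Because Python is interpreted, each of these lines could also be typed and run one at a time in an interactive session.
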
And it can be embedded: a program you have written in Python can be embedded in other platforms. It also has an extensive library. As for usability, Python can be used for desktop and web applications, for data applications, for networking applications, and, most importantly, for data analysis and data science; it can be used for machine learning, for IoT (Internet of Things) and artificial intelligence applications, and for games. Another reason for choosing Python is that many of
the companies, for example Google, Facebook, NASA, Yahoo, and eBay, use Python as a programming language. With Python we are also going to use Jupyter Notebook, which I will explain in the next class: it is a client-server application, you edit code in a web browser, it makes documentation and demonstration easy, and it has a user-friendly interface.

This is the last section of this lecture. We will explain the four different levels of data: the types of variables, the levels of data
measurement, a comparison of the four levels, nominal, ordinal, interval, and ratio, and the use of knowing these different levels. One way of classifying data is into categorical data and numerical data. Marital status, political party, and eye color are categorical data. Numerical data can be discrete or continuous. Discrete data may be the number of children or defects per hour. Continuous data may be weight or voltage; these are examples of continuous
data. So what is the difference between discrete and continuous? For the number of children you may say two children or three children; 2.5 children is not possible. But for a continuous variable, if you look between 0 and 1, the numbers continue: there are infinitely many values between 0 and 1.

Next we will see the different levels of data measurement. Already we have seen one classification of data, into categorical and numerical data. Another way of classifying is into
nominal data, ordinal data, interval data, and ratio data. First, what is nominal data? A nominal scale classifies data into distinct categories in which no ranking is implied. Examples of nominal data are gender and marital status. Suppose you are conducting a questionnaire and you capture gender as male 0, female 1. This 0 and 1 just represent the gender; you cannot do any arithmetic operations with them. For example, you cannot find the average: software will give you some number, but there is no meaning for
that. Similarly marital status, whether married or unmarried, is an example of nominal data.

The next level of data is the ordinal scale. It classifies data into distinct categories in which a ranking is implied; here the numbers are the ranks. For example, you may ask customers to rate their level of satisfaction: satisfied, neutral, unsatisfied. Or take faculty rank: professor, associate professor, assistant professor. You see that a rank is followed: 1 professor, 2 associate professor, 3 assistant professor. Student grades A, B, C, D, E, F are also rankings, because
the numbers 1, 2, 3 represent the rank.

The next level of data is the interval scale. The interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point. An example of an interval scale is the year. Say this year is 2019: you can add and subtract; you can add five years to get 2024, or subtract nine years to get 2010. But you cannot multiply: if you multiply, for example, 2019 by 2020, you will end up with a big number;
there is no meaning for that, because the zero on this scale is not a true zero. Another example of an interval scale is Fahrenheit temperature. On the Fahrenheit scale, zero is only a reference point on the scale; it is not the absence of heat. By contrast, 0 on the Kelvin scale (about minus 273 degrees Celsius) is the absence of heat, so Kelvin belongs to another scale, which you will see next.

The ratio scale is an ordered scale in which the difference between measurements is a meaningful quantity and
the measurements have a true zero point. Weight, age, salary, and Kelvin temperature come under the ratio scale, because 0 kelvin means the absence of heat. With ratio data you can do all kinds of arithmetic operations. To compare: with nominal data you cannot do any arithmetic operation, and with ordinal data you cannot either; with interval data you can add and subtract but you cannot multiply; but with ratio data you can add, subtract, multiply, and divide. Now look at the usage potential
of the various levels of data. The usage potential of nominal data is not that high; next comes ordinal, then interval, then ratio. So ratio data has the highest usage potential and nominal data the least. More importantly, why do we need to know the different types of data? Because the type of data helps us choose the right analytical tool. For example, if the data is nominal, you can do only nonparametric tests. If the data
is ordinal, here also you can do only nonparametric tests. But if the data is interval, you can do parametric tests: with interval data you can do all of the above plus addition and subtraction, and with ratio data all of the above plus multiplication and division, and for the statistical methods you can go for parametric methods. So the purpose of classifying data into nominal, ordinal, interval, and ratio is to choose the right analytical tool, whether it is parametric or nonparametric. The other reason is that sometimes, when we want to do
a nonparametric analysis, that is what applies to nominal data. Sometimes the data may be nominal but students still go for a parametric test; that should not be done. That is the purpose of knowing the nature of the data.

So in this class we have seen an introduction to data analytics. We have seen the importance of data analytics and the classification of data analytics, we have seen the difference between analysis and analytics, and we have seen the different types of data. In the next
class we will learn what Python is, how to install it, and what kind of descriptive analysis we can do with the help of Python. I will meet you in the next class with another lesson. Thank you very much.