Hello and welcome. In this video I explain what an analysis of variance, a so-called ANOVA, is and how you can calculate it. There are different types of analysis of variance. This video is about the one-way, or single-factor, analysis of variance without repeated measures, and that's where we start.

The first question is: why do you need an analysis of variance at all? What does an analysis of variance do? An analysis of variance checks whether there are statistically significant differences between more than two groups. The analysis of variance is therefore the extension of the
t-test for independent samples to more than two groups. When calculating an independent t-test, we looked at whether there is a difference, or more precisely a difference in means, between two independent groups, for example whether there is a difference in salary between men and women. In this case we have two groups, the men's group and the women's group. If we want to compare more than two independent groups, we use the analysis of variance. In the case of the t-test, we used an independent t-test if the two groups or samples were independent. This is the case if
one person in the first group has nothing to do with a person from the second group. Exactly the same now applies to the analysis of variance without repeated measures, except that here we have at least three independent samples. If we have more than two dependent samples, we would use an analysis of variance with repeated measures.

Now let's look at an example. Let's say that, as the founder of DATAtab, I am interested in whether there are differences in age between people who use DATAtab, SPSS, or R. In order to do this, I take
a sample of people who use statistical software and ask them which statistical software they use and how old they are. I've only compared three groups in this example; of course, there could also be more groups. In order to analyze this example, I would now use an ANOVA.

So the next question is: which research question can I answer using an ANOVA? The research question is: is there a difference in the population between the different groups of the independent variable with respect to the dependent variable? The independent variable is the variable with the different
categories; in our example it is the statistics software used. Here we have the three groups DATAtab, SPSS, and R. The dependent variable in our example is the age of the software users. We would like to know whether the groups of the independent variable have an influence on the dependent variable. Of course, the analysis of variance does not give us any information about the direction of the causal relationship.

But why is our research question about the population? Don't we just have a sample? Actually, we want to make a statement about the population. Unfortunately, in most cases
it is not possible to survey the whole population, and we can only draw a sample. The aim is to make a statement about the population based on our sample with the help of the analysis of variance. For our example, the question would be: is there a difference between the users of different statistical software solutions in terms of age?

But what about the hypotheses? In the case of the analysis of variance, the null hypothesis is that there are no differences between the means of the individual groups. We have our individual groups, of which we can calculate
the mean in each case, and our null hypothesis H0 is that there is no difference between these means in the population. The alternative hypothesis H1 is that there is a difference between at least two group means. So our null hypothesis assumes that there is no difference, and the alternative hypothesis says that there is a difference.

All well and good, now we know what the null hypothesis is. But what does this mean graphically? How can one picture that vividly? Let's say we want to test whether there is a difference in salary between three groups: group 1,
group 2, and group 3. The salary has some dispersion: some people earn 400 euros a month, some 2,600, and others 6,000 euros a month. Thus, both in the population and in our sample, the salary is broadly distributed. Now the question is: where does this variation come from, and can we explain some of the variation by these three groups? So how much of the variation in salary can we explain by dividing the people into these three groups? In the extreme case, the result could be that the salary in group 1 has this distribution, in
group 2 that distribution, and in group 3 the distribution would look like this. In this case, the division into groups could explain a lot of the variance in the variable salary. The result would be different in this other case here: we could explain almost no variance by forming the three groups. Within the groups, the variance is almost the same as in the whole sample. Therefore it does not matter whether we form the groups or not; the three groups have nearly no influence on the salary.

If we now look at the variance within the groups, we can
see that in the first case we have very small variances within the groups. Within this group we have a very small variance, within that group we have a small variance, and also in the last group. On the other hand, the variance between the groups is very large, because the mean values of the individual groups are very far apart. In the other case, we have a very large variance within the groups, but the variance between the groups is very small, because the mean values of the groups are very close together.

How can we calculate an ANOVA?
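As a first, rough answer, the comparison of the variance between the groups with the variance within the groups can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the ages are made-up numbers for illustration, not data from the video, and the function name `one_way_anova` is hypothetical. It computes the sum of squares between the groups, the sum of squares within the groups, and the resulting F statistic (mean square between divided by mean square within).

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a dict of samples.

    `groups` maps a group label to a list of measurements.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean, weighted by group size
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of each observation around its own group mean
    ss_within = sum(
        (x - mean(values)) ** 2
        for values in groups.values()
        for x in values
    )

    df_between = len(groups) - 1                # number of groups minus 1
    df_within = len(all_values) - len(groups)   # observations minus groups

    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic

# Hypothetical ages of software users, invented purely for illustration
ages = {
    "DATAtab": [25, 30, 35],
    "SPSS": [40, 45, 50],
    "R": [30, 35, 40],
}

ss_between, ss_within, f_stat = one_way_anova(ages)
print(f"SS between: {ss_between:.1f}")
print(f"SS within:  {ss_within:.1f}")
print(f"F:          {f_stat:.2f}")
```

A large F value means the group means are far apart relative to the spread inside the groups. The sketch stops at the F statistic; to get a p-value, a statistics library is needed, for example `scipy.stats.f_oneway`, assuming SciPy is installed.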
There are two possibilities for the calculation: either you use statistics software like DATAtab, or you calculate the analysis of variance by hand. Admittedly, no one will calculate the analysis of variance by hand in practice, but the knowledge is very helpful for understanding more precisely how an analysis of variance works. In this video I show you how you can easily calculate an analysis of variance online with DATAtab.

To calculate an analysis of variance with DATAtab, just visit datatab.net; you can find the link in the video description below. Then you copy your own data into this
table and click on this tab. Under this tab you will find a variety of hypothesis tests. Here below you can see the variables you copied into the table. Depending on which variables you select, DATAtab will calculate the appropriate hypothesis test. If you click on a metric variable and a nominal variable with at least three characteristics (categories), DATAtab calculates an analysis of variance. Here you can read the p-value. If you don't know exactly how to interpret the p-value, just look at the summary in words above. Furthermore, you can check the assumptions of the analysis of variance here.

Thanks
for watching, and I hope you enjoyed the video.