This video is about everything you need to know about the t-test. After this video, you will know what a t-test is and when to use it, what types of t-tests there are, what the hypotheses and the assumptions are, how a t-test is calculated, and how to interpret the results.

Let's start with the first question: what is a t-test? The t-test is a statistical test procedure. And what does the t-test do? It analyzes whether there is a significant difference between the means of two groups. For example, the two groups may be patients who received drug A and patients who received drug B, and we would like to know if there is a difference in blood pressure between these two groups.

Now, there are three different types of t-tests: the one sample t-test, the independent samples t-test, and the paired samples t-test.

When do we use a one sample t-test? We use the one sample t-test when we want to compare the mean of a sample with a known reference mean. Example: a chocolate bar manufacturer claims that its chocolate bars weigh an average of 50 grams. To check this, we take a sample of 30 bars and weigh them. The mean value of this sample is 48 grams. Now we can use a one sample t-test to check if the mean of 48 grams is significantly different from the claimed 50 grams.

When do we use the independent samples t-test? We use the t-test for independent samples when we want to compare the means of two independent groups or samples. We want to know if there is a significant difference between these means. Example: we would like to compare the effectiveness of two painkillers. We randomly divide 60 people into two groups. The first group receives drug A and the second group receives drug B. Using an independent t-test, we can now test whether there is a significant difference in pain relief between the two drugs.

When do we use the paired samples t-test? We use the paired samples t-test to compare the means of two dependent groups. Example: we want to know how effective a diet is. To do this, we weigh 30 people before the diet and
then weigh exactly the same people after the diet. Now we can look at the difference in weight between before and after for each subject, and use a paired samples t-test to test whether there is a significant difference.

In a paired sample, the measurements are available in pairs. The pairs result, for example, from repeated measurements with the same people. Independent samples are made up of people and measurements that are independent of each other.

Here's an interesting note: the paired samples t-test is very similar to the one sample t-test. We can also think of the paired samples t-test as having one sample that was measured at two different times. We then calculate the difference between the paired values, giving us a value for one sample. The difference is once minus five, once plus two, once minus one, and so on. Now we want to test whether the mean value of the differences just calculated deviates from a reference value, in this case 0. This is exactly what the one sample t-test does.

What are the assumptions for a t-test? Of course, we first need a suitable sample. In the one sample t-test, we need a sample and a reference value. In the independent t-test, we need two independent samples, and in the case of a paired t-test, a paired sample. The variable for which we want to test whether there is a difference between the means must be metric. Examples of metric variables are age, body weight, and income. A person's level of education, for example, is not a metric variable. In addition, the metric variable must be normally distributed in all three t-test variants. To learn how to test if your data is normally distributed, watch my video "Test for normal distribution". In the case of an independent t-test, the variances in the two groups must be approximately equal. You can check whether the variances are equal using Levene's test. For more information, watch my video on Levene's test.

So what are the hypotheses of the t-test? Let's start with the one sample t-test. In the one sample t-test, the
null hypothesis is: the sample mean is equal to the given reference value, so there is no difference. And the alternative hypothesis is: the sample mean is not equal to the given reference value.

What about the independent samples t-test? In the independent t-test, the null hypothesis is: the mean values in both groups are the same, so there is no difference between the two groups. And the alternative hypothesis is: the mean values in both groups are not equal, so there is a difference between the two groups.

And finally, the paired samples t-test. In the paired t-test, the null hypothesis is: the mean of the difference between the pairs is zero. And the alternative hypothesis is: the mean of the difference between the pairs is not zero.

So now we know what the hypotheses are. Before we look at how the t-test is calculated, let us look at an example of why we actually need a t-test. Let's say we want to know whether there is a difference in the length of study for a bachelor's degree between men and women in Germany. Our population is therefore made up of all bachelor's graduates who have studied in Germany. However, as we cannot survey all bachelor's graduates, we draw a sample that is as representative as possible. We now use the t-test to test the null hypothesis that there is no difference in the population. Even if there is no difference in the population, we will almost certainly still see a difference in study duration in the sample. It would be very unlikely that we drew a sample where the difference is exactly zero. In simple terms, we now want to know at what difference, measured in a sample, we can say that the duration of study of men and women is significantly different. And this is exactly what the t-test answers.

But how do we calculate a t-test? To do this, we first calculate the t-value. To calculate the t-value, we need two values: first, the difference between the means, and then the standard deviation of the mean, which is also known as the standard error. In a one sample t-test, we calculate the
difference between the sample mean and the known reference mean. s is the standard deviation of the collected data and n is the number of cases. s divided by the square root of n is then the standard deviation of the mean, which is the standard error.

In an independent samples t-test, we simply calculate the difference between the two sample means. To calculate the standard error, we need the standard deviation and the number of cases from the first and second sample. Depending on whether we can assume equal or unequal variance for our data, there are different formulas for the standard error. Read more about this in our tutorial on datatab.net.

In a paired samples t-test, we only need to calculate the difference between the paired values and calculate the mean from that. The standard error is then the same as for a one sample t-test.

So what have we learned so far about the t-value? No matter which t-test we calculate, the t-value will be greater if we have a greater difference between the means, and the t-value will be smaller if the difference between the means is smaller. Furthermore, the t-value becomes smaller when we have a larger dispersion around the mean. So the more scattered the data, the less meaningful a given mean difference is.

Now we want to use the t-test to see if we can reject the null hypothesis or not. To do this, we can use the t-value in two ways: either we read the critical t-value from a table, or we simply calculate the p-value from the t-value. We'll go through both in a moment. But what is the p-value? A t-test always tests the null hypothesis that there is no difference. So first we assume that there is no difference in the population. When we draw a sample, this sample deviates from the null hypothesis by a certain amount. The p-value tells us how likely it is, assuming the null hypothesis holds, that we would draw a sample that deviates by the same amount or more than the sample we drew. Thus, the more the sample deviates from the null hypothesis, the smaller the p-value becomes.
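To make the t-value calculation described above concrete, here is a minimal sketch in Python using only the standard library. The function names and all sample numbers are made up for illustration and do not come from the video. The two standard error formulas for the independent case (pooled for equal variances, Welch for unequal variances) are the usual textbook forms, which the video only points to.

```python
import math
import statistics


def one_sample_t(values, reference_mean):
    """t = (sample mean - reference mean) / (s / sqrt(n))."""
    n = len(values)
    s = statistics.stdev(values)           # sample standard deviation (n - 1)
    standard_error = s / math.sqrt(n)      # standard deviation of the mean
    return (statistics.mean(values) - reference_mean) / standard_error


def paired_t(before, after):
    """Paired t-test as a one sample t-test on the differences against 0."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0.0)


def independent_t(group_a, group_b, equal_variances=True):
    """t = (mean A - mean B) / standard error of the difference."""
    n1, n2 = len(group_a), len(group_b)
    v1 = statistics.variance(group_a)
    v2 = statistics.variance(group_b)
    if equal_variances:
        # Pooled variance (textbook form, assumed here).
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # Welch form for unequal variances (textbook form, assumed here).
        standard_error = math.sqrt(v1 / n1 + v2 / n2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# Hypothetical data, chosen only so the arithmetic is easy to follow.
bar_weights = [47.0, 49.0, 48.0, 50.0, 46.0]        # mean 48 g, claimed 50 g
weight_before = [80.0, 75.0, 90.0, 68.0, 85.0]
weight_after = [75.0, 77.0, 89.0, 66.0, 81.0]       # differences: -5, +2, -1, -2, -4
pain_drug_a = [3.0, 5.0, 4.0, 6.0, 2.0]
pain_drug_b = [6.0, 8.0, 7.0, 9.0, 5.0]

print(round(one_sample_t(bar_weights, 50.0), 2))         # -2.83
print(round(paired_t(weight_before, weight_after), 2))   # -1.63
print(round(independent_t(pain_drug_a, pain_drug_b), 2)) # -3.0
```

Note how `paired_t` does nothing but hand the differences to `one_sample_t` with a reference value of 0, which mirrors the remark that the paired test is a one sample test on the differences. A larger spread in the data makes the standard error larger and the t-value smaller in magnitude, as described above.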
If this probability is very, very small, we can of course ask whether the null hypothesis holds for the population. Perhaps there is a difference. But at what point can we reject the null hypothesis? This threshold is called the significance level, which is usually set at five percent. So if there is at most a five percent chance that we draw such a sample, or one that is more different, then we have enough evidence to reject the null hypothesis. To put it simply, we assume that there is a difference, that the alternative hypothesis is true.

Now that we know what the p-value is, we can finally look at how the t-value is used to determine whether or not the null hypothesis is rejected. Let's start with the path through the critical t-value, which you can read from a table. To do this, we first need a table of critical t-values, which we can find on datatab.net under tutorials and t-distribution. Let's start with the two-tailed case. We'll briefly look at the one-tailed case at the end of this video.

Here below we see the table. First, we need to decide what level of significance we want to use. Let's choose a significance level of 0.05, or 5 percent. Then we look in this column at 1 minus 0.05, which is 0.95. Now we need the degrees of freedom. In the one sample t-test and the paired samples t-test, the degrees of freedom are simply the number of cases minus 1. So if we have a sample of 10 people, there are 9 degrees of freedom. In the independent samples t-test, we add the number of people from both samples and subtract 2, because we have two samples. Note that the degrees of freedom can be determined in a different way depending on whether we assume equal or unequal variance. So if we have a five percent significance level and 9 degrees of freedom, we get a critical t-value of 2.262.

Now, on the one hand we have calculated a t-value with the t-test, and on the other we have the critical t-value. If our calculated t-value, in absolute terms, is greater than the critical t-value, we reject the null hypothesis. For example, suppose we calculate a t-value of 2.5. This value is greater than 2.262, and therefore the two means are so different that we can reject the null hypothesis.

On the other hand, we can also calculate the p-value for the t-value we have calculated. If we enter 2.5 for the t-value and 9 for the degrees of freedom, we get a p-value of 0.034. The p-value is less than 0.05, and we therefore reject the null hypothesis. As a check, we copy the t-value of 2.262 here and get a p-value of exactly 0.05, which is exactly the limit.

If you want to calculate a t-test with DATAtab, you just need to copy your own data into this table, click on hypothesis test, and then select the variables of interest. For example, if you want to test whether gender has an effect on income, you simply click on the two variables and automatically get a t-test calculated for independent samples. Here below you can read the p-value. If you're still unsure about the interpretation of the results, you can simply click on "Interpretation in words": "A two-tailed t-test for independent samples (equal variances assumed) showed that the difference between female and male with respect to the dependent variable salary was not statistically significant. Thus, the null hypothesis is retained."

The final question now is: what is the difference between directed and undirected hypotheses? In the undirected case, the alternative hypothesis is that there is a difference. For example, there is a difference between the salary of men and women in Germany. We don't care who earns more, we just want to know if there is a difference or not. In a directed hypothesis, we are also interested in the direction of the difference. For example, the alternative hypothesis might be that men earn more than women, or that women earn more than men.

If we look at the t-distribution graphically, we can see that in the two-sided case we have a range on the left and a range on the right. We want to reject the null hypothesis if we are either here or there. With a five percent significance level, both ranges have a probability of 2.