
# Two-sample t-test for the difference between means

last edited by 5 years, 3 months ago

A two-sample t-test for the difference between means, or simply a two-sample t-test, is a hypothesis test used to determine whether the population means of two groups could be the same. Because we are comparing means rather than proportions, and the population standard deviations are unknown and must be estimated from the samples, the sampling distribution model is a Student's t and not a Normal model. Likewise, the standard error (SE), which is computed from the sample standard deviations, is used to estimate the standard deviation (SD) of the difference between the independent sample means. The test statistic and SE equations below are then used to obtain a P-value in SPSS.

The assumptions and conditions of the two-sample t-test for the difference between the means of two independent groups are the same as for the two-sample t-interval. Since no statistical test can verify the independence assumption, we must look at how the data were collected. The Randomization Condition is satisfied when the data in each group are drawn independently and at random from a population or generated by a randomized comparative experiment. The Nearly Normal Condition must be checked for both samples because we must assume that each sample comes from a Normally distributed population. If no histogram or Q-Q plot is available to show whether the data are approximately Normal, this condition can be considered satisfied when $n > 40$. Finally, the Independent Groups Assumption is met when the two samples are randomized and independent of each other. Once these conditions are met, we can proceed to perform a hypothesis test.

The null hypothesis can be written as $H_0: \mu_1 - \mu_2 = \Delta_0$. The hypothesized difference $\Delta_0$ is called "delta naught," and because it is so common for the hypothesized difference between two means to be 0, it is usually just assumed that $\Delta_0 = 0$. The null hypothesis is therefore usually written as $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$, although the latter is more popular. The null hypothesis is tested using the test statistic equation below, which takes the difference between the observed group means and compares it with the hypothesized difference, measured in standard errors.

$$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{SE(\bar{y}_1 - \bar{y}_2)}$$

Here $\mu_1 - \mu_2$ is the hypothesized difference $\Delta_0$ from the null hypothesis.

The following data set is taken from Statistical Analysis Quick Reference Guidebook (2007). The research question is whether the heights of plants depend on the type of fertilizer used. A researcher sets up a controlled experiment in which he grows 7 plants using Fertilizer 1 and 6 plants using Fertilizer 2 under identical conditions. The null hypothesis is that the fertilizer type does not matter and the mean growth heights of the two groups of plants will be the same. The alternative hypothesis is that the fertilizer type does matter and the two mean growth heights will be different. Because we are trying to determine whether one type of fertilizer is better or worse than the other, the test on the Student's t-model will be two-tailed.

$H_0: \mu_1 - \mu_2 = 0$ (This reflects our null hypothesis that the mean growth heights of the plants will be the same regardless of fertilizer type.)

$H_A: \mu_1 - \mu_2 \neq 0$ (This reflects our alternative hypothesis that the mean growth heights of the plants will be different because fertilizer type matters.)
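The Nearly Normal Condition mentioned above is usually judged by eye from a Q-Q plot. As a rough illustration of what such a plot contains, the sketch below pairs each sorted observation with the standard Normal quantile it would be expected to match. The function name `normal_qq_points`, the plotting positions $(i - 0.5)/n$, and the heights are all assumptions made for illustration; statistical packages may use a different plotting-position convention, and the heights are not the guidebook's data.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a Normal Q-Q plot.

    Assumes plotting positions (i - 0.5) / n, one common convention.
    """
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical plant heights, for illustration only.
heights = [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1]
for z, y in normal_qq_points(heights):
    print(f"{z:6.3f}  {y:5.1f}")
```

If the printed pairs fall roughly along a straight line when plotted, the Nearly Normal Condition is plausible for that group. The same check would be repeated for the second group.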

The sample of interest is the 13 plants in the randomized comparative experiment, and the population is all plants of the kind grown in the experiment. The Randomization Condition is satisfied because the plants were part of a randomized comparative experiment in which all other conditions were controlled, making fertilizer type the only independent variable and growth height the dependent variable. The 10% Condition is satisfied because we can safely assume that the 13 plants are fewer than 10% of all plants that will be grown using these fertilizer types. The Independent Groups Assumption is also satisfied because we can reasonably assume that the two groups of plants are independent of each other and will not affect one another's growth. Finally, because neither sample size is greater than 40, the Nearly Normal Condition must be checked by looking at a histogram and Q-Q plot generated by SPSS.

After checking that the conditions are satisfied, we can proceed with a two-sample t-test using the equations posted above. In the SE and t equations, $\bar{y}_1$ and $\bar{y}_2$ are the observed sample means of the growth heights of the two sets of plants; $n_1$ and $n_2$ are the sample sizes of the two sets, which are 7 and 6 in this case; $\mu_1 - \mu_2$ is the hypothesized difference, which is 0 as established in the null hypothesis; and $s_1$ and $s_2$ are the sample standard deviations of the two groups.

SPSS can generate these same numbers, which are shown in the tables below. The output shows the same t-score of -1.174, which means that the observed difference lies 1.174 SE away from 0, the difference stated in our null hypothesis. On the Student's t-model, both tails beyond 1.174 SE from 0 are shaded since our alternative hypothesis is two-tailed. Because equal variances are not assumed, we read the P-value from the "Equal variances not assumed" row of the output, which gives .265. Because our P-value is greater than alpha (which is usually assumed to be 0.05 unless stated otherwise), we fail to reject the null hypothesis.

In context, this means that there is insufficient evidence of a difference in the mean growth heights of the two sets of plants. We cannot conclude that fertilizer type matters.
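The arithmetic behind a two-sample t-test without the equal-variance assumption can be sketched in Python. This is a minimal sketch, not SPSS's implementation. It assumes the unpooled SE with Welch-Satterthwaite degrees of freedom, and it obtains the two-sided P-value by numerically integrating the Student's t density. The function names are illustrative, and the means and standard deviations are hypothetical, because the guidebook's summary statistics are not reproduced in the text. Only the sample sizes (7 and 6), the reported P-value (.265), and alpha (0.05) come from the example.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """Return (SE, t, df) for a two-sample t-test without pooling variances."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = ((mean1 - mean2) - delta0) / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-tailed P-value via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


# Hypothetical summary statistics; only n1 = 7 and n2 = 6 come from the example.
se, t, df = welch_from_summary(50.0, 3.0, 7, 52.0, 3.5, 6)
print(f"SE = {se:.3f}, t = {t:.3f}, df = {df:.2f}, P = {two_sided_p(t, df):.3f}")

# Decision rule applied to the P-value reported in the example.
alpha, p_reported = 0.05, 0.265
print("reject H0" if p_reported < alpha else "fail to reject H0")
```

The decision step is the same whatever software produces the P-value: compare it with alpha and reject the null hypothesis only when the P-value is smaller.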

# Two-sample t-test for the difference between means in SPSS

• Go to the "Analyze" menu and select "Compare Means". Then select the "Independent-Samples T Test" option.
• Drag the grouping variable into "Grouping Variable", and then click "Define Groups".
• Click on "Use Specified Values" and then, for Group 1 and Group 2, type the values that code the two groups, as defined in the "Variable View" tab.
• Click "Continue".
• Drag the quantitative variable you are using to compare your groups into the "Test Variable(s):" box.
• Select "Options", make sure the confidence interval is set at 95%, and click "Continue".
• Select "OK" and two tables will be generated in the output window.
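The test table that SPSS produces has one row for "Equal variances assumed" and one for "Equal variances not assumed". For readers without SPSS, the sketch below shows roughly how the t, df, and SE values in those two rows differ when computed from raw data. It is an illustration under stated assumptions, not a reproduction of SPSS: the function name `independent_samples_rows` is made up, the pooled and Welch-Satterthwaite formulas are the standard textbook ones, and the heights are hypothetical rather than the guidebook's data.

```python
from math import sqrt
from statistics import mean, variance


def independent_samples_rows(group1, group2):
    """Return t, df, and SE with and without the equal-variance assumption."""
    n1, n2 = len(group1), len(group2)
    diff = mean(group1) - mean(group2)
    var1, var2 = variance(group1), variance(group2)  # sample variances (n - 1)

    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_pooled = sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed: unpooled SE, Welch-Satterthwaite df.
    v1, v2 = var1 / n1, var2 / n2
    se_welch = sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    return {
        "equal variances assumed":
            {"t": diff / se_pooled, "df": n1 + n2 - 2, "se": se_pooled},
        "equal variances not assumed":
            {"t": diff / se_welch, "df": df_welch, "se": se_welch},
    }


# Hypothetical heights for 7 and 6 plants, for illustration only.
fertilizer_1 = [51.0, 53.3, 55.6, 51.0, 55.5, 53.0, 52.1]
fertilizer_2 = [54.0, 56.1, 52.1, 56.4, 54.0, 52.9]
for label, row in independent_samples_rows(fertilizer_1, fertilizer_2).items():
    print(f"{label}: t = {row['t']:.3f}, df = {row['df']:.2f}, SE = {row['se']:.3f}")
```

When the two sample sizes are equal, both rows give the same t and differ only in df. With unequal sizes such as 7 and 6, the t values generally differ as well.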
