When we take a sample from a population, we use descriptive statistics to summarize that sample data.
We can also use sample data to draw conclusions about the population.
Hypothesis testing is one way we make inferences about the population.
What is a hypothesis? A claim about a characteristic of a population, such as a proportion (p) or a mean (µ).
p | Population proportion | 95% of students pass stats class (QM) the first time they take it. |
µ | Population average | The average grade in your QM class is 74%. |
What is a hypothesis test? A standard procedure for testing a claim about a population characteristic.
People, companies and organizations make claims about all sorts of things. We perform hypothesis tests to evaluate if the claims are likely to be true.
1. Teacher | “My students score higher on standardized tests.” | Are test scores significantly higher? |
2. Mechanic | “I’ll charge you less than the other guys.” | Does this mechanic charge less? |
3. University | “Our graduating students’ starting salaries are different from the average starting salary.” | Are salaries significantly different? |
Step 1. Formulate the null and alternative hypotheses
Null hypothesis (H0): The starting assumption for a hypothesis test: that there is no difference, meaning the population mean (µ) or proportion (p) for the group being tested equals the comparison value.
Examples of null hypotheses (H0):
1. Teacher | Test scores are not significantly different from the average. |
2. Mechanic | This mechanic charges about the same amount as everyone else. |
3. University | Starting salaries for graduating students from this institution are the same as for students from elsewhere. |
Alternative hypothesis (Ha): The claim that the population mean (µ) or proportion (p) has a value that differs from the one stated in the null hypothesis.
Deciding if your test is left, right or two-tailed
1. Teacher | Test scores in this class are higher than the average for all classes. | Ha: µ > average test score | Right-tailed test |
2. Mechanic | This mechanic charges less than everyone else. | Ha: µ < average cost | Left-tailed test |
3. University | Students’ starting salaries are significantly different from the average. | Ha: µ ≠ average starting salary | Two-tailed test |
https://dataz4s.com/statistics/one-tailed-test/
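The direction of Ha decides which tail (or tails) of the distribution counts as evidence against H0. The sketch below illustrates this for a test statistic that is assumed to follow a standard normal distribution (a z-statistic). The function name `tail_probability` and the example statistic are illustrative assumptions, not part of the original material.

```python
from statistics import NormalDist

def tail_probability(z, tail):
    """Return the probability in the tail(s) that count against H0.

    tail: "left"  (Ha: µ < value),
          "right" (Ha: µ > value),
          "two"   (Ha: µ ≠ value).
    """
    cdf = NormalDist().cdf(z)  # P(Z <= z) when H0 is true
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'left', 'right' or 'two'")

# Hypothetical test statistic, for illustration only.
z = 1.5
for tail in ("left", "right", "two"):
    print(tail, round(tail_probability(z, tail), 4))
```

The same statistic gives a small probability for a right-tailed test and a large one for a left-tailed test, which is why the tail has to be chosen from Ha before looking at the data.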
Step 2. Collect Data
Draw a sample and measure the characteristic of interest.
Step 3. Perform a statistical test
https://www.scribbr.com/statistics/statistical-tests/
Common hypothesis tests include t-tests and chi-squared tests.
T-tests
The t-test compares a sample mean to a population mean (or the means of two samples) and asks how different they are. We use t-tests with quantitative data.
The larger the sample size, the fewer values fall in the tails of the t-distribution, and the closer it comes to the normal distribution.
https://statisticsbyjim.com/hypothesis-testing/t-tests-t-values-t-distributions-probabilities/
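As a rough illustration of what a one-sample t-test computes, the sketch below calculates the t-statistic by hand: the difference between the sample mean and the claimed population mean, divided by the standard error. The function name `one_sample_t` and the grades are made-up assumptions; only the claimed average of 74 comes from the example earlier in this section.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t-test of H0: µ = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

# Hypothetical grades from one class, tested against the claimed average of 74.
grades = [78, 82, 69, 75, 80, 71, 77, 84]
t, df = one_sample_t(grades, mu0=74)
print(f"t = {t:.2f}, df = {df}")
```

The t-value is then compared with a t-distribution with n - 1 degrees of freedom, which is the step statistical software such as SPSS carries out for you.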
SPSS T-test “How-To”
Chi-squared (χ²)
Chi-squared tests ask: Are there differences between the observed frequencies of an event or characteristic and the frequencies we expect to see?
We use chi-squared tests to look at differences between groups, usually for categorical / qualitative data.
SPSS Chi-Squared “How-To”
See the example at the end of this section to learn how to work through a chi-square hypothesis test by hand.
Step 4. Possible outcomes and drawing conclusions
Reject H0 | In which case we have evidence to support Ha |
Fail to reject H0 | In which case we do not have evidence to support Ha |
Interpreting Hypothesis Tests
The p-value is the probability of obtaining a result at least as extreme as the one you observed, assuming the null hypothesis is true.
It is often used in conjunction with a pre-determined level of statistical significance.
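The comparison between a p-value and the significance level can be written as a simple decision rule. This is a minimal sketch: the function name `decide`, the default level of 0.05, and the example p-values are assumptions for illustration, and the rule uses the common convention of rejecting H0 when the p-value is less than or equal to the chosen level.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level chosen before the test."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "Reject H0: the data support Ha"
    return "Do not reject H0: the data do not support Ha"

# Hypothetical p-values, for illustration only.
print(decide(0.03))              # tested at the 0.05 level
print(decide(0.20))              # tested at the 0.05 level
print(decide(0.03, alpha=0.01))  # same p-value, stricter level
```

Note that the same p-value can lead to different decisions at different significance levels, which is why the level should be fixed before the data are analysed.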
https://www.statisticshowto.com/probability-and-statistics/find-critical-values/
Critical Values
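Critical values are cut-off points beyond which a test statistic is unlikely to lie if the null hypothesis is true.
Each critical value corresponds to a chosen significance level. Common significance levels in the social sciences include 0.05 and 0.01.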
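To see how the levels 0.05 and 0.01 translate into cut-off points, the sketch below computes the cut-offs for a test statistic that is assumed to follow a standard normal distribution (a z-test). The function name `z_critical` is an illustrative assumption. Other tests, such as t-tests and chi-squared tests, use their own distributions and therefore have different cut-offs.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Return the positive cut-off for a z-test at significance level alpha."""
    # A two-tailed test splits alpha between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    print(
        alpha,
        round(z_critical(alpha), 2),                    # two-tailed cut-off
        round(z_critical(alpha, two_tailed=False), 2),  # one-tailed cut-off
    )
```

A smaller significance level pushes the cut-off further into the tail, so stronger evidence is needed before H0 is rejected.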
But be careful! Errors can happen
Errors are divided into two categories, type I and type II.
| Test rejects null | Test fails to reject null |
Null (H0) is true | Type I error, false positive | Correct decision, no effect |
Null (H0) is false | Correct decision, effect exists | Type II error, false negative |
https://pregnancymb.blogspot.com/2019/10/type-1-error-pregnancy-test.html
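A small simulation can make the idea of a type I error concrete. In the sketch below the null hypothesis is true by construction, yet a two-tailed z-test at the 0.05 level still rejects it in roughly 5% of samples. Everything here is an illustrative assumption: the function name `type_i_error_rate`, the population (mean 0, standard deviation 1), the sample size, and the number of trials.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Estimate how often a two-tailed z-test rejects a null that is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The population really has mean 0 and sd 1, so H0: µ = 0 is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))  # known sd of 1, so this is a z-test
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null is a false positive
    return rejections / trials

print(type_i_error_rate())
```

The proportion of false positives should land close to the significance level, which is what the level means: the type I error rate you agree to accept.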
Chi-square (χ²)
Chi-square tests ask: Are there differences between the observed frequencies of an event or characteristic and the frequencies we expect to see?
Variable 1 | Variable 2 |
Soccer Team (Team A, Team B) | Height (Short, Tall) |
Regular junk food consumption (yes / no) | Weight (underweight, average, overweight) |
Calculating the Chi Square value
Using the soccer team example from above, we input the observed and expected values for height on each team.
Observed values: Categorize and count the number of short and tall players.
Expected values: Calculate the number of short and tall players you would expect to see if the two variables (Team and Height) are independent (don’t affect each other).
How to calculate expected values: (Σ row × Σ column) / n
Σ = sum (Σ row is the row total, Σ column is the column total)
n = total number of observations
Observed counts, with expected counts in parentheses:
| Short | Tall | Total |
Team A | 7 (6) | 4 (5) | 11 |
Team B | 5 (6) | 6 (5) | 11 |
Total | 12 | 10 | 22 |
*Note: expected values must be at least 5 for this test to work.
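The expected values in parentheses can be reproduced with a few lines of code. This is a minimal sketch of the row total × column total / n rule, using the observed counts from the soccer example; the function name `expected_counts` is an illustrative assumption.

```python
def expected_counts(observed):
    """Return expected counts (row total * column total / n) for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Observed counts from the soccer example:
# rows are Team A and Team B, columns are Short and Tall.
observed = [[7, 4], [5, 6]]
expected = expected_counts(observed)
print(expected)

# Check the note above: every expected count should be at least 5.
print(all(value >= 5 for row in expected for value in row))
```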
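To calculate the chi-square (χ²) value, work through each cell of the table: take the difference between the observed and expected frequency, square it, divide by the expected frequency, then add up the results.
χ² = Σ (Fo − Fe)² / Fe
Fo = Frequency observed
Fe = Frequency expected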
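The chi-square calculation for the soccer example can be sketched in code, using Fo and Fe as defined above. The function name `chi_square` is an illustrative assumption. The unrounded result is about 0.733; rounding each cell's contribution to two decimal places before adding them up gives the 0.74 used in this example.

```python
def chi_square(observed, expected):
    """Sum (Fo - Fe)^2 / Fe over every cell of the table."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

observed = [[7, 4], [5, 6]]  # Team A and Team B, by Short and Tall
expected = [[6, 5], [6, 5]]  # the values in parentheses in the table
print(round(chi_square(observed, expected), 3))
```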
Understanding the results
Using your pre-determined level of significance, compare your chi-square value to the critical value. In this example the critical value is 3.84 (significance level 0.05, 1 degree of freedom).
If the calculated chi-square value is…
< critical value | Random |
> critical value | Significant |
In this case?
Calculated chi-square value | Greater or less than | Critical value |
0.74 | < | 3.84 |
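Conclusion: The calculated value is below the critical value, so we do not reject H0. Team and height appear to be independent, and the height difference between the teams is consistent with random variation.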
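The critical value of 3.84 and the final decision can be checked with a short sketch. It relies on the fact that a 2 × 2 table has (2 − 1) × (2 − 1) = 1 degree of freedom, and that a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The function names are illustrative assumptions, and the shortcut applies only to 1 degree of freedom; larger tables need a chi-square table or statistical software.

```python
from math import erfc, sqrt
from statistics import NormalDist

def chi_square_critical_df1(alpha=0.05):
    """Critical value for 1 degree of freedom: the squared two-tailed z cut-off."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2

def chi_square_p_value_df1(chi_sq):
    """P(X >= chi_sq) for a chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(chi_sq / 2))

chi_sq = 0.74  # the value calculated in the example
critical = chi_square_critical_df1(0.05)
print(round(critical, 2), round(chi_square_p_value_df1(chi_sq), 2))
print("Significant" if chi_sq > critical else "Random")
```

Both routes lead to the same decision: the calculated value is below the critical value, and the corresponding p-value is well above 0.05.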