Introduction

Learning Objectives

By the end of this module, you should be able to:

Explain the logic of null hypothesis significance testing (NHST)
Correctly interpret a p-value and avoid common misconceptions
Choose between independent-samples and paired t-tests based on study design
Check assumptions (normality, equal variance) before running a t-test
Run t-tests in R, compute effect sizes, and report results in APA style
Conduct a power analysis to determine sample size before running a study
Apply non-parametric alternatives when assumptions are violated

The Logic of Null Hypothesis Testing

In HCI research, we frequently compare conditions: Does a new keyboard layout reduce typing errors? Is voice input faster than touch input? To answer these questions rigorously, we use null hypothesis significance testing (NHST).

The reasoning works by contradiction. We start by assuming there is no difference between conditions; this is the null hypothesis (H₀). We then collect data and ask: "If H₀ were true, how surprising would our observed data be?" If the data would be very unlikely under H₀, we reject H₀ in favor of the alternative hypothesis (H₁), which states that a real difference exists.

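This is loosely analogous to a proof by contradiction in logic. We do not directly prove that our new interface is better; instead, we show that the data are inconsistent with the assumption that it is not. Unlike a logical proof, the conclusion is probabilistic: rejecting H₀ means the data would be unlikely under H₀, not that H₀ is impossible.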
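The "assume H₀, then ask how surprising the data are" logic can be made concrete with a small simulation. The sketch below uses a permutation test rather than the t-test this module focuses on, because it shows the reasoning directly: if the layout makes no difference, the condition labels are interchangeable, so shuffling them shows which differences arise by chance alone. The data, the variable names, and the helper `mean_diff` are hypothetical and exist only for illustration.

```r
# Hypothetical data: typing errors per participant for two keyboard layouts.
# These numbers are invented for illustration only.
set.seed(42)  # make the shuffles reproducible
old_layout <- c(12, 15, 11, 14, 13, 16, 12, 15)
new_layout <- c(10, 11, 9, 13, 10, 12, 11, 10)

errors <- c(old_layout, new_layout)
layout <- rep(c("old", "new"), each = length(old_layout))

# Difference in mean errors (old minus new) for a given labelling.
mean_diff <- function(values, labels) {
  mean(values[labels == "old"]) - mean(values[labels == "new"])
}
observed <- mean_diff(errors, layout)

# Under H0 the layout labels carry no information, so shuffling them
# shows what differences arise by chance alone.
n_shuffles <- 5000
null_diffs <- replicate(n_shuffles, mean_diff(errors, sample(layout)))

# Share of shuffled differences at least as extreme as the observed one
# (two-sided): a Monte Carlo estimate of how surprising the data are under H0.
p_estimate <- mean(abs(null_diffs) >= abs(observed))

cat("Observed difference in mean errors:", observed, "\n")
cat("Share of shuffles at least this extreme:", p_estimate, "\n")
```

If only a small share of the shuffles produce a difference as large as the observed one, the data are hard to reconcile with H₀, which is the situation in which NHST rejects it.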

What a p-Value Actually Means

The p-value is the probability of observing data as extreme as (or more extreme than) what we collected, assuming H₀ is true.

$$p = P(\text{data this extreme or more} \mid H_0 \text{ is true})$$

A small p-value (conventionally below a significance level of α = 0.05) indicates that data this extreme would be unlikely if H₀ were true, which leads us to reject H₀.
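As a minimal sketch of this definition in R, the example below runs an independent-samples t-test on hypothetical task-completion times and then recomputes the two-sided p-value by hand as a tail probability of the t distribution. The data and variable names are invented for illustration, and the test is Welch's version, which `t.test()` uses by default when given two vectors.

```r
# Hypothetical task-completion times in seconds (invented for illustration).
voice <- c(8.1, 7.4, 9.0, 6.8, 7.9, 8.3, 7.1, 7.6)
touch <- c(9.2, 8.8, 10.1, 8.5, 9.6, 9.9, 8.7, 9.4)

# Welch's independent-samples t-test (R's default for two vectors).
result <- t.test(voice, touch)

t_obs <- unname(result$statistic)  # observed t statistic
df    <- unname(result$parameter)  # Welch-adjusted degrees of freedom

# Two-sided p-value: probability, under H0, of a t statistic at least
# as far from zero as the one observed.
p_manual <- 2 * pt(-abs(t_obs), df)

alpha <- 0.05  # conventional significance level
cat("t =", round(t_obs, 2), " df =", round(df, 1), "\n")
cat("p from t.test():", result$p.value, "\n")
cat("p from pt():    ", p_manual, "\n")
cat("Reject H0 at alpha = 0.05?", p_manual < alpha, "\n")
```

The two p-values agree because `t.test()` performs the same tail-area calculation internally. The comparison against `alpha` is the decision rule described above.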