# Hypothesis Testing

Rahul's Noteblog: Notes on Biostatistics

## Null hypothesis:

• Null hypothesis = Ho = the claim that is tested directly = a statement of "no effect" or "no difference".

## Alternate hypothesis:

• Alternate hypothesis = Ha = the claim that is accepted if the null hypothesis is rejected.

## Sample vs Population Mean:

• The greater the difference between sample mean and the population mean specified by the null hypothesis, the less probable it is that the sample really does come from the specified population.

• If a very large number of random samples of the same size are taken from a population, their means form an approximately normal distribution (the random sampling distribution of means), which has a mean equal to the population mean. The approximation improves as the sample size grows.

• Critical values mark the boundaries between the area of acceptance and the area of rejection of Ho in this sampling distribution.
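The behavior of the sampling distribution of means can be checked with a small simulation. This is a minimal sketch, not part of the original notes: the exponential population, the sample size of 50, and the 2000 repetitions are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POPULATION_MEAN = 1.0  # mean of an exponential population with rate 1 (assumed)
SAMPLE_SIZE = 50       # observations per random sample (assumed)
NUM_SAMPLES = 2000     # number of random samples drawn (assumed)


def sample_mean(n):
    """Draw n values from a skewed (exponential) population and return their mean."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
# The population SD is 1.0 here, so the expected SD of the means is 1 / sqrt(n).
expected_se = 1.0 / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POPULATION_MEAN})")
print(f"SD of sample means:   {sd_of_means:.3f} (expected about {expected_se:.3f})")
```

Even though the population is skewed, the sample means cluster around the population mean, and their spread is close to the population standard deviation divided by the square root of the sample size.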

## Z-tests:

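• Z-tests differ from t-tests in that z-tests are used when the sample size is large (n > 100 in these notes) or the population standard deviation is known; t-tests are used for smaller samples where the standard deviation is estimated from the sample. Z-tests are used less often than t-tests.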
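A one-sample z-test can be sketched in a few lines. The function name `one_sample_z_test` and the numbers in the example are hypothetical, and the sketch assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p-value) for Ho: population mean == mu0.

    Assumes sigma, the population standard deviation, is known.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical numbers: 144 subjects, sample mean 103, Ho mean 100, sigma 15.
z, p = one_sample_z_test(sample_mean=103, mu0=100, sigma=15, n=144)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```

With these made-up numbers z is 2.4 and the two-tailed p-value is about 0.016, so Ho would be rejected at alpha = 0.05.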

## Type I and Type II Errors:

• Type I: Accepting Ha (rejecting Ho) when Ho is true. Probability = alpha = level of significance. In essence, alpha sets how much evidence the experimenter demands before abandoning the null hypothesis: the smaller the alpha, the stronger the evidence required. The value of alpha is preset by the experimenter.

• Type II: Accepting Ho when Ha is true. Probability = beta. Unlike alpha, beta is not set in advance by the experimenter. Each hypothesis test has an infinite number of Type II error probabilities, one for each possible true value of the parameter under Ha.

• It is possible to guard against Type I error by choosing a small alpha; preventing Type II error is harder because beta is not set directly.
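The two error rates can be estimated by simulation. The following is a rough sketch under assumed conditions (a two-tailed z-test, known standard deviation, sample size 25, and two arbitrary "true" means); none of these values come from the notes.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is repeatable

ALPHA = 0.05   # level of significance chosen by the experimenter
MU0 = 0.0      # mean claimed by Ho (assumed)
SIGMA = 1.0    # population SD, assumed known
N = 25         # sample size (assumed)
TRIALS = 4000  # simulated experiments per scenario

# Two-tailed critical value for the chosen alpha (about 1.96 for alpha = 0.05).
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejection_rate(true_mean):
    """Fraction of simulated experiments in which Ho is rejected."""
    rejections = 0
    for _ in range(TRIALS):
        sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
        z = (fmean(sample) - MU0) / (SIGMA / sqrt(N))
        if abs(z) > Z_CRIT:
            rejections += 1
    return rejections / TRIALS


type_1_rate = rejection_rate(MU0)     # Ho true: every rejection is a Type I error
beta_small = 1 - rejection_rate(0.3)  # Ho false, small true difference
beta_large = 1 - rejection_rate(0.6)  # Ho false, larger true difference

print(f"estimated Type I error rate: {type_1_rate:.3f} (alpha = {ALPHA})")
print(f"estimated beta, true mean 0.3: {beta_small:.3f}")
print(f"estimated beta, true mean 0.6: {beta_large:.3f}")
```

The estimated Type I rate stays near the preset alpha, while beta changes with the true mean, which is why a single test has many possible Type II error probabilities.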

## Type I and Type II Errors in Tabulated Form:

| | H1 (+Reality) | H0 (-Reality) |
|---|---|---|
| H1 (+Tests) | (a) 1 - beta: Power. Some difference exists in reality and can be confirmed by statistical testing. | (b) alpha: Type I error. No difference exists in reality but is falsely assumed by testing. |
| H0 (-Tests) | (c) beta: Type II error. Some difference exists in reality but cannot be confirmed by testing. | (d) No difference exists in reality and cannot be confirmed by testing. |

With the four cells labeled a to d as above:

alpha = b/(b+d)

beta = c/(a+c)
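The two ratios above read as alpha and beta if the four cells of the table are labeled a to d, row by row. That labeling is an assumption made here, and the counts in the example are invented for illustration.

```python
def error_rates(a, b, c, d):
    """Return (alpha, beta, power) from the counts in a 2x2 test-vs-reality table.

    Assumed cell layout:
        a = test finds a difference, a difference exists (correct decision)
        b = test finds a difference, no difference exists (Type I error)
        c = test finds no difference, a difference exists (Type II error)
        d = test finds no difference, no difference exists (correct decision)
    """
    alpha = b / (b + d)  # share of "no difference" realities that were wrongly rejected
    beta = c / (a + c)   # share of "difference" realities that were missed
    return alpha, beta, 1 - beta


# Invented counts from 200 hypothetical experiments.
alpha, beta, power = error_rates(a=80, b=5, c=20, d=95)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}, power = {power:.2f}")
```

With these counts alpha is 0.05, beta is 0.20, and power is 0.80.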

## Power of Statistical Tests:

• The ability to avoid Type II error depends on our ability to detect a null hypothesis that is false. This ability is called the power of a statistical test, and it equals 1 - beta.

• A test's power increases as: alpha increases, beta decreases, the size of the difference between the sample mean and the population mean increases, sampling error decreases, and sample size increases.

• Very powerful statistical tests run a greater risk of making Type I errors when the extra power comes from raising alpha.
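The factors listed above can be explored numerically. The sketch below assumes a right-tailed z-test with a known standard deviation; the function name `power_right_tailed_z` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_right_tailed_z(alpha, true_difference, sigma, n):
    """Power of a right-tailed z-test when the true mean exceeds the Ho mean
    by true_difference. Assumes sigma (population SD) is known."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)       # rejection boundary under Ho
    shift = true_difference / (sigma / sqrt(n))  # true difference in standard-error units
    beta = STD_NORMAL.cdf(z_crit - shift)        # chance of landing below the boundary
    return 1 - beta


baseline = power_right_tailed_z(alpha=0.05, true_difference=0.5, sigma=2.0, n=30)
larger_alpha = power_right_tailed_z(alpha=0.10, true_difference=0.5, sigma=2.0, n=30)
larger_difference = power_right_tailed_z(alpha=0.05, true_difference=1.0, sigma=2.0, n=30)
smaller_sigma = power_right_tailed_z(alpha=0.05, true_difference=0.5, sigma=1.0, n=30)
larger_n = power_right_tailed_z(alpha=0.05, true_difference=0.5, sigma=2.0, n=120)

for label, value in [
    ("baseline", baseline),
    ("larger alpha", larger_alpha),
    ("larger true difference", larger_difference),
    ("smaller sampling error", smaller_sigma),
    ("larger sample size", larger_n),
]:
    print(f"{label:>24}: power = {value:.3f}")
```

Each change raises power relative to the baseline, but only the larger alpha does so at the cost of more Type I errors.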

## Directional Hypotheses:

• If Ha claims that mu lies either above or below the mu claimed by Ho, the hypothesis is bidirectional (two tailed).

• If Ha claims mu lies above the mu claimed by Ho, the hypothesis is right tailed.

• If Ha claims mu lies below the mu claimed by Ho, the hypothesis is left tailed.

• One tailed tests are more powerful than two tailed tests at the same alpha, provided the true difference lies in the predicted direction.
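The difference between the tails can be seen by computing p-values for the same test statistic. This is an illustrative sketch; the helper `p_value` and the observed z of 1.8 are made up.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value(z, tail):
    """p-value for an observed z statistic; tail is 'two', 'right', or 'left'."""
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "left":
        return STD_NORMAL.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


z_observed = 1.8  # hypothetical test statistic
p_two = p_value(z_observed, "two")
p_right = p_value(z_observed, "right")
p_left = p_value(z_observed, "left")

print(f"two tailed:   p = {p_two:.4f}")
print(f"right tailed: p = {p_right:.4f}")
print(f"left tailed:  p = {p_left:.4f}")
```

At alpha = 0.05 the right tailed test rejects Ho (p is about 0.036) while the two tailed test does not (p is about 0.072). The left tailed test, which predicted the wrong direction, comes nowhere near rejection.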

## Analysis of Variance (ANOVA):

• A t-test is used to test a hypothesis about a single mean (or the difference between two means).

• Some studies, however, require comparisons among several group means at once; this is what ANOVA is for.

• For example, Ho states that the population means of groups A and B are equal: mean A = mean B.

• And Ha states that the group means are not all equal.

• ANOVA involves a few groups written in columns, and their means written below them.

• One-Way ANOVA: Used when testing the variability between and within groups for only one factor at a time, e.g. treatment only.

• ANOVA assumes that the groups have equal variances: variance of A = variance of B.

• Two-Way ANOVA: Used when testing the variability between and within groups for more than one factor, e.g. treatment and gender.

• The ANOVA only shows whether there is variability in the results, and whether it is attributable to one factor, to the other factor, or to the two factors in combination.

• If a single factor is significant, then it is called a main effect.

• If a combination of factors is significant, then it is an interaction effect: the two factors together produce an effect that differs from the sum of their individual effects.
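The between-group and within-group variability used by a one-way ANOVA can be computed by hand. This is a minimal sketch with invented data for three groups; it returns only the F statistic and its degrees of freedom, which would then be compared with a tabulated critical value.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    k = len(groups)
    n_total = len(all_values)

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Invented observations for three treatment groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

For these data F(2, 6) = 12.00, which exceeds the tabulated 0.05 critical value of about 5.14, so the group means would be judged not all equal.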

## Parametric Tests:

• Hypotheses refer to population parameters: the population mean (t- and z-tests) or the population variance (F-tests).

• Hypotheses concern interval or ratio scale (continuous) data such as weight, blood pressure, IQ, per capita income, measures of clinical improvement, and so on.

• They assume that the population is normally distributed, or that the random sample is large enough for the sampling distribution of the mean to be approximately normal.

## Non-parametric Tests:

• Do not test hypotheses concerning parameters.

• Do not assume that the population is normally distributed, so they are called distribution-free tests.

• Used to test nominal and ordinal scale data.

• Generally less powerful than parametric tests.

## Chi-Square Test:

• This test is used for testing hypotheses about nominal scale data.

• Tests proportions, telling us whether proportions of observations falling in different categories differ significantly from the proportions that would be expected by chance.

### Chi-Square Test Example:

• Suppose you toss a coin 100 times and expect 50% of tosses to be heads and 50% to be tails. However, the actual result is 59 heads and 41 tails. Chi-square can show whether the difference between the observed and expected proportions is too large to be expected by chance, i.e., "significant."
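The coin example can be worked through directly. The sketch below uses the usual chi-square statistic, the sum of (observed - expected)^2 / expected over the categories. The p-value shortcut through the normal distribution is valid only because there is a single degree of freedom.

```python
from math import sqrt
from statistics import NormalDist

observed = {"heads": 59, "tails": 41}
expected = {"heads": 50, "tails": 50}

# Sum of (observed - expected)^2 / expected over both categories.
chi_square = sum(
    (observed[side] - expected[side]) ** 2 / expected[side] for side in observed
)

# Two categories give 1 degree of freedom. A chi-square variable with 1 df is the
# square of a standard normal variable, so the p-value can be taken from the
# normal distribution.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_square)))

print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

The statistic is 3.24, below the 0.05 critical value of 3.84 for 1 degree of freedom (p is about 0.072), so 59 heads in 100 tosses is not a significant departure from a fair coin at that level.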