# Correlational Techniques


## What is Correlation?

• Medical research often seeks to establish relationships between two variables, e.g., smoking and cancer.

• The methods used are called correlational techniques.

• Correlation: establishes and quantifies the strength and direction of the relationship between two variables. It is expressed as the correlation coefficient r, which ranges from -1 to +1.

• The relation between two correlated variables forms a bivariate distribution, which is commonly presented graphically in the form of a scattergram.

• Coefficient of determination = r × r = r², i.e., the proportion of the variance in one variable that is accounted for by the other.

• Correlation doesn't establish a causal relation between two variables, but merely a statistical association.
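As a rough illustration of the correlation coefficient and the coefficient of determination described above, here is a minimal sketch in plain Python. The data values and the names `pearson_r`, `hours`, and `score` are hypothetical and chosen only for illustration; they do not come from these notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (not from the notes)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
r_squared = r * r  # coefficient of determination

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")  # r = 0.775, r^2 = 0.600
```

With these made-up numbers, r is about 0.775, so roughly 60% of the variance in one variable is accounted for by the other.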

## Scatter Plots:

• A scatterplot or scattergram shows the relationship between two quantitative variables (continuous: interval or ratio data) measured on the same individuals. The value of one variable appears on the horizontal axis and the value of the other variable appears on the vertical axis.

• Values of r near 0 indicate a weak linear relationship. r = -1 indicates a perfect negative linear relationship and r = +1 a perfect positive one.

## Types of Correlation:

### Pearson product-moment correlation:

• Used for interval or ratio scale data.

### Spearman rank-order correlation:

• Used for ordinal scale data.

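Pearson's coefficient measures linear association, while Spearman's measures monotonic association (the variables consistently rise or fall together). Neither is suitable for non-monotonic relationships, such as a U-shaped curve.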
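To compare how the two coefficients behave on the same data, here is a minimal sketch in plain Python. It assumes the usual definition of Spearman's coefficient as Pearson's coefficient applied to ranks (tied values share their average rank). The function names and the data (y = x³) are hypothetical and used only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

pearson = pearson_r(x, y)
spearman = spearman_rho(x, y)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```

Because the ranks of x and y agree exactly, the rank-order coefficient is 1, while the product-moment coefficient falls below 1 because the points do not lie on a straight line.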

## Regression:

• Regression: expresses the functional relationship between two variables, so that the value of one variable (Y) can be predicted from knowledge of the other (X).

• When two variables are correlated, it is possible to predict the value of one of them if the other variable is numerically known.

• A simple linear regression equation may be: Y = a + bX, where X and Y are the two variables.

• X = independent/explanatory variable; Y = dependent variable/response variable.

• Slope (b): the slope of the regression line, known as the regression coefficient. It gives the rate of change in Y per unit change in X.

• X is the known value of the independent variable.

• Intercept (a): known as the "intercept constant". It is the point at which the regression line intercepts the Y axis, i.e., the value of Y when X = 0.
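The notes do not say how a and b are obtained. A common choice is ordinary least squares, which is assumed in the minimal sketch below. The function name `fit_line` and the data are hypothetical, for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope (regression coefficient)
    a = mean_y - b * mean_x  # intercept: the line passes through the two means
    return a, b

# Hypothetical illustrative data (not from the notes)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
predicted = a + b * 6  # predict Y for a new value X = 6

print(f"Y = {a:.2f} + {b:.2f}X")              # Y = 2.20 + 0.60X
print(f"Predicted Y at X = 6: {predicted:.2f}")  # 5.80
```

Here b = 0.6 means Y is expected to rise by 0.6 for each one-unit increase in X, and a = 2.2 is the expected Y when X = 0.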

## Multiple Regression:

• More than one variable is used to predict the expected value of Y, thus increasing the overall percentage of variance in Y that can be accounted for.

• Example: the birth weight of a baby (Y, in grams) can be partly predicted from the number of cigarettes smoked daily by the baby's mother (x1) and the baby's father (x2): Y = 3385 - 9x1 - 6x2.
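The birth-weight equation above can be evaluated directly. The short sketch below uses the coefficients as given in the notes. The function name and the cigarette counts passed in are hypothetical inputs chosen for illustration.

```python
def predicted_birth_weight(mother_cigs, father_cigs):
    """Predicted birth weight in grams from Y = 3385 - 9*x1 - 6*x2."""
    return 3385 - 9 * mother_cigs - 6 * father_cigs

# Neither parent smokes: the prediction equals the intercept
print(predicted_birth_weight(0, 0))    # 3385

# Hypothetical case: mother smokes 20 per day, father smokes 10 per day
print(predicted_birth_weight(20, 10))  # 3145
```

Each additional cigarette smoked daily by the mother lowers the predicted weight by 9 g, and each one smoked by the father lowers it by 6 g.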

## Z-Test for Correlation:

• If n > 100, or if the standard deviation of the population is known, a z-test is used.
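The notes do not specify which z-test is meant. One widely used approach for testing whether a correlation differs from zero is Fisher's z-transformation, and the sketch below assumes that approach. The function name `fisher_z_test` and the example values of r and n are hypothetical.

```python
import math

def fisher_z_test(r, n):
    """Two-sided z-test of H0: population correlation = 0 (Fisher's method).

    Assumes a sample correlation r with -1 < r < 1 and a sample size n > 3.
    """
    z_r = math.atanh(r)             # Fisher transformation of r
    se = 1 / math.sqrt(n - 3)       # approximate standard error of z_r
    z = z_r / se                    # test statistic
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical example: r = 0.25 observed in a sample of n = 150
z, p = fisher_z_test(0.25, 150)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A value of |z| above 1.96 corresponds to p < 0.05 for a two-sided test, the conventional threshold for statistical significance.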


Please Do Not Reproduce This Page

This page is written by Rahul Gladwin. Please do not duplicate the contents of this page in whole or part, in any form, without prior written permission.