Lesson 8 - Bias, Confounding, Random Error, & Effect modification
Lesson 8 Objectives:
- Identify different types of bias
- Describe 2 main sources of random error
- List the pros and cons of using p-values and confidence intervals for hypothesis testing
- Describe methods for assessing the presence of confounding and effect modification
- Distinguish between confounding and effect modification
- Describe methods for controlling for confounding
- Describe methods for presenting results when effect modification is present
8.1 - Bias
Bias is a systematic error in the design, recruitment, data collection, or analysis that results in a mistaken estimate of the true association between the exposure and the outcome.
Bias occurs when the method used to select subjects or collect data results in an incorrect association. Bias is something to consider while designing the study - it usually cannot simply be “corrected” in the analysis stage of the study.
Two main types of bias are selection and information bias.
- Selection Bias
- Selection bias is the systematic error in the selection or retention of participants.
Examples of selection bias:
- Suppose you are selecting cases of rotator cuff tears (a shoulder injury). Many older people have experienced this injury to some degree, but have never been treated for it. Persons who are treated by a physician are far more likely to be diagnosed (and identified as cases) than persons who are not treated by a physician. If a study only recruits cases among patients receiving medical care, there will be selection bias.
- Some investigators may identify cases predicated upon previous exposure. Suppose a new outbreak is related to a particular exposure, for example, a particular pain reliever. If a press release encourages people taking this pain reliever to report to a clinic to be checked to determine if they are a case and these people then become the cases for the study, a bias has been created in sample selection. Only those taking the medication were assessed for the problem. Ascertaining a case based upon previous exposure creates a bias that cannot be removed once the sample is selected.
- Exposure may affect the selection of controls – e.g., hospitalized patients are more likely to have been smokers than the general population. If controls are selected among hospitalized patients, the relationship between an outcome and smoking may be underestimated because of the increased prevalence of smoking in the control population.
- In a cohort study, people who share similar characteristics may be lost to follow-up. For example, people who are mobile are more likely to change their residence and be lost to follow-up. If the length of residence is related to the exposure then our sample is biased toward subjects with less exposure.
- In a cross-sectional study, the sample may have been non-representative of the general population. This leads to bias. For example, suppose the study population includes multiple racial groups but members of one race participate less frequently in this type of study.
- Information Bias
- Information bias (also known as misclassification bias) is the systematic error due to inaccurate measurement or classification of disease, exposure, or other variables.
Examples of information bias:
- Instrumentation - an inaccurately calibrated instrument creating a systematic error
- Misdiagnosis - if a diagnostic test is consistently inaccurate, then information bias would occur
- Recall bias - if individuals can't remember exposures accurately, then information bias would occur
- Missing data - if certain individuals consistently have missing data, then information bias would occur
- Socially desirable response - if study participants consistently give the answer that the investigator wants to hear, then information bias would occur
Misclassification can be differential or non-differential.
- Differential misclassification
- The probability of misclassification varies for the different study groups, i.e., misclassification is conditional upon exposure or disease status.
Are we more likely to misclassify cases than controls? For example, if you interview cases in person for a long period of time, extracting exact information while the controls are interviewed over the phone for a shorter period of time using standard questions, this can lead to differential misclassification of exposure status between controls and cases.
- Nondifferential misclassification
- The probability of misclassification does not vary for the different study groups; it is not conditional upon exposure or disease status, but appears random. Using the previous example, if half the subjects (cases and controls) were randomly selected to be interviewed by phone and the other half were interviewed in person, the misclassification would be nondifferential.
Either type of misclassification can produce misleading results.
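To see how nondifferential misclassification can distort results, consider a small numeric sketch (the counts, sensitivity, and specificity below are hypothetical, not from this lesson): applying the same imperfect exposure measurement to cases and controls pulls the odds ratio toward the null.

```python
# Hypothetical illustration: nondifferential misclassification of a binary
# exposure (same sensitivity/specificity in cases and controls) tends to
# bias the odds ratio toward the null (OR = 1).

def odds_ratio(a, b, c, d):
    """OR for a 2x2 table laid out as (exposed cases, exposed controls,
    unexposed cases, unexposed controls)."""
    return (a * d) / (b * c)

def misclassify(exposed, unexposed, sens, spec):
    """Expected counts classified as 'exposed'/'unexposed' after imperfect
    measurement with the given sensitivity and specificity."""
    obs_exposed = sens * exposed + (1 - spec) * unexposed
    obs_unexposed = (1 - sens) * exposed + spec * unexposed
    return obs_exposed, obs_unexposed

# Hypothetical true table -> true OR = (100 * 200) / (100 * 50) = 4.0
a, c = 100, 50    # cases: exposed, unexposed
b, d = 100, 200   # controls: exposed, unexposed
true_or = odds_ratio(a, b, c, d)

# Apply the SAME sensitivity (0.8) and specificity (0.9) to both groups
a_obs, c_obs = misclassify(a, c, 0.8, 0.9)
b_obs, d_obs = misclassify(b, d, 0.8, 0.9)
obs_or = odds_ratio(a_obs, b_obs, c_obs, d_obs)

print(true_or)  # 4.0
print(obs_or)   # smaller than 4.0: attenuated toward the null
```

In this sketch the observed odds ratio drops from 4.0 to roughly 2.6, even though the measurement error treats cases and controls identically.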
8.2 - Random Error
- Random Error
- Random error is the false association between exposure and disease that arises from chance and can arise from two sources - measurement error and sampling variability.
Measurement error
Measurement error occurs when there is a mistake in assessing the exposure or the outcome.
Consider the figure below. If the true value is the center of the target, the first target shows the goal: negligible random error (high precision) and accurate (valid) measurements. The second target has negligible random error, but the measurements are not accurate. The third has random error, but the measurements are still accurate on average. The fourth has both random error and inaccurate measurements.
Methods to increase precision and reduce random error
- Increase the sample size of the study
- Repeat measurements within a study
Sampling variability
Sampling variability refers to the fact that there are a huge number of possible samples that can be drawn from any single population, and there will be variation among the different possible samples that could be selected. We take a sample because it is not feasible to measure the entire population and we hope that the sample we select is representative of the population. It is best to choose a random sample rather than a non-random one, but a random sample can still be unrepresentative of the population simply by chance. Selecting a large enough sample size can help minimize the chance of selecting an unrepresentative sample.
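The effect of sampling variability can be illustrated with a short simulation (a sketch with an arbitrary, hypothetical population; only the Python standard library is used): repeated samples from one population yield different sample means, and the scatter of those means shrinks as the sample size grows.

```python
import random
import statistics

random.seed(42)  # for reproducibility

# A hypothetical population of 10,000 measurements
population = [random.gauss(mu=120, sigma=15) for _ in range(10_000)]

def sample_means(n, draws=200):
    """Mean of each of `draws` random samples of size n."""
    return [statistics.mean(random.sample(population, n))
            for _ in range(draws)]

means_small = sample_means(n=10)
means_large = sample_means(n=500)

# Sample means scatter around the population mean...
print(statistics.mean(population))
# ...and the scatter (sampling variability) shrinks as n grows
print(statistics.stdev(means_small))
print(statistics.stdev(means_large))
```

Each draw is one "possible sample"; no single sample is guaranteed to be representative, but larger samples are less likely to stray far from the population value.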
P-values and confidence intervals
Epidemiologists use hypothesis testing to assess the role of random error and to make statistical inferences. P-values can be useful in understanding relationships, but they cannot be the only tools used to make inferences. It is very important to provide good estimates along with confidence intervals in order to make good scientific conclusions.
- P-value
- P-value: the probability of obtaining the test statistic you got, or one more extreme, assuming that the null hypothesis is true.
- Probability between 0-1
- p<0.05 is a typical cut-off for significance
- Small – evidence to suggest a difference in groups
- Large – no evidence to suggest a difference in groups
Issues with p-values
- Dependent on both the magnitude of the association and the sample size
- Can be viewed as providing black/white conclusions: if p < 0.05 we claim significance, but if p > 0.05 we do not. How different, really, is a p-value of 0.04 versus 0.06?
- Statistical significance does not imply clinical significance
Read The ASA Statement on p-Values: Context, Process, and Purpose (tandfonline.com), in which the ASA (American Statistical Association) lays out principles for the proper use and interpretation of p-values.
Confidence Intervals
Confidence Intervals provide a way to quantify the amount of random error in an estimate. Once the estimate of interest is calculated (e.g., cumulative incidence, incidence rate, etc.), there are many formulas (depending on the measure) that can be used to calculate the confidence interval.
The general formula for a 95% CI is: estimate ± 1.96 × s/√n, where s is the standard deviation and n is the sample size. The multiplier 1.96 comes from the normal distribution and corresponds to a 95% CI.
You can see that the width of the CI will decrease as the sample size increases, and as the standard deviation decreases.
The true parameter from the population is unknown (because we can’t measure the entire population), so we calculate our estimate from the sample we selected. Once we put a CI around our estimate, it either does or does not contain the true value - we don’t know which. The idea of the confidence interval is that if we repeated the exercise (select a sample, calculate the estimate, calculate the CI), then 95% of the CIs we constructed would contain the true value. It does NOT mean that we are 95% confident that the CI contains the true value. As stated above, it either does or does not - we can’t know which.
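As a minimal sketch, the CI formula above can be applied to a sample of measurements (the readings below are hypothetical, and this mean-based formula is only one of the many measure-specific formulas mentioned above):

```python
import math
import statistics

def ci_95(sample):
    """Approximate 95% CI for a mean: estimate +/- 1.96 * s / sqrt(n)."""
    n = len(sample)
    est = statistics.mean(sample)
    s = statistics.stdev(sample)        # sample standard deviation
    margin = 1.96 * s / math.sqrt(n)
    return est - margin, est + margin

# Hypothetical systolic blood pressure readings
sample = [118, 125, 130, 121, 135, 128, 119, 124, 131, 127]
low, high = ci_95(sample)
print(round(low, 1), round(high, 1))
```

Note how the margin shrinks as n grows (through √n) and as s shrinks, exactly as described above.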
8.3 - Confounding
Confounding is a situation in which the effect or association between an exposure and an outcome is distorted by the presence of another variable. Both positive confounding (when the observed association is biased away from the null) and negative confounding (when the observed association is biased toward the null) occur.
If an observed association is not correct because a different (lurking) variable is associated with both the potential risk factor and the outcome, but it is not a causal factor itself, confounding has occurred. This variable is referred to as a confounder. A confounder is an extraneous variable that wholly or partially accounts for the observed effect of a risk factor on disease status. The presence of a confounder can lead to inaccurate conclusions.
Confounders
A confounder meets each of the following three criteria:
- It is a risk factor for the disease, independent of the putative risk factor.
- It is associated with the putative risk factor.
- It is not in the causal pathway between exposure and disease.
The first two of these conditions can be tested with data. The third is more biological and conceptual.
Confounding masks the true effect of a risk factor on a disease or outcome due to the presence of another variable. We identify potential confounders from our:
- Knowledge
- Prior experience with data
- Three criteria for confounders
We will talk more about this later, but briefly here are some methods to control for a confounding variable (if the confounder is suspected a priori):
- randomize individuals into different groups (use an experimental approach)
- restrict/filter for certain groups
- match in case-control studies
- analysis (stratify, adjust)
Controlling potential confounding starts with a good study design including anticipating potential confounders.
Example: Coronary Heart Disease and Diabetes
Suppose as part of the cross-sectional study we survey patients to find out whether they have coronary heart disease (CHD) and if they are diabetic. We generate a 2 × 2 table (below):
Category | CHD | No CHD | Total |
---|---|---|---|
Diabetes | 26 (12%) | 190 | 216 |
No Diabetes | 90 (3.9%) | 2241 | 2331 |
Total | 116 | 2431 | 2547 |
The prevalence of coronary heart disease among people without diabetes is 90 divided by 2331, or 3.9%; by a similar calculation, the prevalence among those with diabetes is 12%. A chi-squared test shows that the p-value for this table is p<0.001. The large sample size results in a significant p-value, and the magnitude of the difference is fairly large (12% vs. 3.9%).
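The chi-squared test quoted above can be reproduced from the table (a minimal sketch using only the standard library; for a 2 × 2 table the statistic has 1 degree of freedom, whose upper-tail probability equals erfc(√(x/2))):

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for a 2x2 table
    laid out as [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared(1 df) tail prob.
    return stat, p_value

# Diabetes / CHD table from above
stat, p = chi_squared_2x2(26, 190, 90, 2241)
print(round(stat, 1))   # about 30
print(p < 0.001)        # True
```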
- Prevalence Ratio (PR):
- The prevalence ratio, considering whether diabetes is a risk factor for coronary heart disease is 12 / 3.9 = 3.1. Thus, people with diabetes are 3.1 times as likely to have CHD than those without diabetes.
- Odds Ratio (OR):
- The odds ratio, considering whether the odds of having CHD is higher for those with versus without diabetes is ( 2241 × 26) / ( 90 × 190) = 3.41. The odds of having CHD among those with diabetes is 3.41 times as high as the odds of having CHD among those who do not have diabetes.
Which of these do you use? They come up with slightly different estimates.
It depends upon your primary purpose. Is your purpose to compare prevalences? Or, do you wish to address the odds of CHD as related to diabetes?
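Either measure can be computed directly from the 2 × 2 table; a quick sketch (cell labels follow the table above):

```python
def prevalence_ratio(a, b, c, d):
    """PR for a 2x2 table: rows = exposed/unexposed, cols = disease/no disease."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """OR for the same table layout: the cross-product ratio."""
    return (a * d) / (b * c)

# Diabetes (exposure) vs. CHD (outcome) table from above
a, b = 26, 190      # diabetes: CHD, no CHD
c, d = 90, 2241     # no diabetes: CHD, no CHD

print(round(prevalence_ratio(a, b, c, d), 1))  # 3.1
print(round(odds_ratio(a, b, c, d), 2))        # 3.41
```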
Now, let's add hypertension as a potential confounder. There are 3 criteria to evaluate to assess if hypertension is a confounder.
- "Is hypertension (confounder) associated with CHD (outcome)?" This could also be thought of as "Is hypertension a risk factor for CHD, independent of diabetes?"
First of all, prior knowledge tells us that hypertension is related to many heart related diseases. Prior knowledge is an important first step but let's test this with data. We look at this relationship just among the non-diabetics, so as to not complicate the relationship between the confounder and the outcome.
Consider the 2 × 2 table below:
Category | CHD | No CHD | Total |
---|---|---|---|
Hypertension | 39 (5.5%) | 669 | 708 |
No Hypertension | 51 (3.1%) | 1572 | 1623 |
Total | 90 | 2241 | 2331 |

PR = 1.75
OR = 1.80

The prevalence of coronary heart disease among people without hypertension is 51 divided by 1623, or 3.1%; by a similar calculation, the prevalence among those with hypertension is 5.5%. A chi-squared test shows that the p-value for this table is p = 0.006. The large sample size results in a significant p-value, even if the magnitude of the difference is not large. But yes, we see that hypertension is associated with higher rates of CHD.
- This leads us to our next question, "Is hypertension (confounder) associated with diabetes (exposure)?"
Category | Diabetes | No Diabetes | Total |
---|---|---|---|
Hypertension | 133 (15.8%) | 708 | 841 |
No Hypertension | 83 (4.9%) | 1623 | 1706 |
Total | 216 | 2331 | 2547 |

PR = 3.25
OR = 3.67

The prevalence of diabetes among people without hypertension is 83 divided by 1706, or 4.9%; by a similar calculation, the prevalence among those with hypertension is 15.8%. A chi-squared test shows that the p-value for this table is p<0.001. The large sample size results in a significant p-value, and the magnitude of the difference is fairly large.
- A final question: "Is hypertension an intermediate step in the causal pathway between diabetes (exposure) and development of CHD?"
– or, vice versa, does diabetes cause hypertension which then causes coronary heart disease? Based on biology, that is not the case. Diabetes in and of itself can cause coronary heart disease. Using the data and our prior knowledge, we conclude that hypertension is a major confounder in the diabetes-CHD relationship.
What do we do now that we know that hypertension is a confounder?
Stratify... let's consider some stratified assessments.

Among hypertensives:

Category | CHD | No CHD | Total |
---|---|---|---|
Diabetes | 20 (15%) | 113 | 133 |
No Diabetes | 39 (5.5%) | 669 | 708 |
Total | 59 | 782 | 841 |

PR = 2.73
OR = 3.04

Among non-hypertensives:

Category | CHD | No CHD | Total |
---|---|---|---|
Diabetes | 6 (7%) | 77 | 83 |
No Diabetes | 51 (3.1%) | 1572 | 1623 |
Total | 57 | 1649 | 1706 |

PR = 2.30
OR = 2.40

Both estimates of the odds ratio (hypertensives OR = 3.04, non-hypertensives OR = 2.40) are lower than the odds ratio based on the entire sample (OR = 3.41). If you stratify a sample without losing any data, wouldn't you expect the crude odds ratio to be a weighted average of the stratified odds ratios? A similar phenomenon occurs with the prevalence ratios (hypertensives PR = 2.73, non-hypertensives PR = 2.30), when the PR for the entire sample was 3.1.
This is an example of confounding: the stratified results are both on the same side of the crude odds ratio. This is positive confounding because the unstratified estimate is biased away from the null hypothesis (the null is 1.0). The true odds ratio, accounting for the effect of hypertension, is 2.84 from the Mantel-Haenszel method. The crude odds ratio of 3.41 was biased away from the null of 1.0. (In some studies you are looking for a positive association; in others, a negative association, i.e., a protective effect; either way, differing from the null of 1.0.) The adjusted prevalence ratio is 2.60.
This is one way to demonstrate the presence of confounding. You may have a priori knowledge of confounded effects, or you may examine the data and determine whether confounding exists. Either way, when confounding is present, as, in this example, the adjusted odds ratio should be reported. In this example, we report the odds ratio for the association of diabetes with CHD = 2.84, adjusted for hypertension. Accordingly, the prevalence ratio for the association of diabetes with CHD is 2.60, adjusted for hypertension.
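The adjusted estimates quoted above can be reproduced from the two hypertension strata with the standard Mantel-Haenszel estimators (a sketch; cell labels follow the stratified tables above, and the pooled prevalence ratio comes out at about 2.61, matching the quoted 2.60 up to rounding):

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel pooled OR; each stratum is (a, b, c, d) with
    rows = exposed/unexposed and columns = disease/no disease."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def mh_prevalence_ratio(strata):
    """Mantel-Haenszel pooled prevalence (risk) ratio, same table layout."""
    num = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    den = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# (diabetes CHD, diabetes no-CHD, no-diabetes CHD, no-diabetes no-CHD)
hypertensives = (20, 113, 39, 669)
non_hypertensives = (6, 77, 51, 1572)
strata = [hypertensives, non_hypertensives]

print(round(mh_odds_ratio(strata), 2))        # 2.84
print(round(mh_prevalence_ratio(strata), 2))  # about 2.6
```

The pooled OR of 2.84 lies between the two stratum-specific ORs (3.04 and 2.40), as an adjusted estimate should.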
8.4 - Effect Modification
Effect modification is not a problem that investigators need to protect against; instead, it is a natural phenomenon that the investigators wish to describe and understand. Different groups may have different risk estimates when effect modification is present.
Effect modification occurs when the effect of a factor differs for different groups. We see evidence of this when the stratum-specific estimates of the association (odds ratio, rate ratio, risk ratio) differ from each other, while the crude estimate is close to a weighted average of the group-specific estimates. Effect modification is similar to statistical interaction, but in epidemiology, effect modification is related to the biology of disease, not just a data observation.
In the hypertension example, we saw both stratum-specific estimates of the odds ratio fall to one side of the crude odds ratio. With effect modification, we instead expect the crude odds ratio to lie between the stratum-specific estimates.
Why study effect modification? Why do we care?
- to define high-risk subgroups for preventive actions,
- to increase the precision of effect estimation by taking into account groups that may be affected differently,
- to increase the ability to compare across studies that have different proportions of effect-modifying groups, and
- to aid in developing a causal hypothesis for the disease
If you do not identify and properly handle an effect modifier, you will get an incorrect crude estimate. The (incorrect) crude estimator (e.g., RR, OR) is a weighted average of the (correct) stratum-specific estimators. If you do not sort out the stratum-specific results, you miss an opportunity to understand the biological or psychosocial nature of the relationship between risk factors and outcomes.
Planning for effect modification investigation
To consider effect modification in the design and conduct of a study:
- Collect information on potential effect modifiers.
- Power the study to test potential effect modifiers - if a priori you think that the effect may differ depending on the stratum, power the study to detect a difference.
- Don't match on a potentially important effect modifier - if you do, you can't examine its effect.
To consider effect modification in the analysis of data:
- Again, consider what potential effect modifiers might be.
- Stratify the data by potential effect modifiers and calculate stratum-specific estimates of the effect of the risk on the outcome; determine if effect modification is present. If so, present stratum-specific estimates.
Example
Continuing with our confounding example, part of our research hypothesis may be that the relationship between diabetes and CHD is different for males and females. Stratifying results by sex (females in the first table, males in the second) shows:
Category | CHD | No CHD | Total |
---|---|---|---|
Diabetes | 13 (12.3%) | 93 | 106 |
No Diabetes | 25 (2.1%) | 1191 | 1216 |
Total | 38 (2.9%) | 1284 | 1322 |
PR = 5.97
OR = 6.66
Category | CHD | No CHD | Total |
---|---|---|---|
Diabetes | 13 (11.8%) | 97 | 110 |
No Diabetes | 65 (5.8%) | 1050 | 1115 |
Total | 78 (6.3%) | 1147 | 1225 |
PR = 2.03
OR = 2.16
The prevalence ratio for females is 5.97, while it is only 2.03 for males. The overall estimate is close to a weighted average of the two stratum-specific estimates, and thus sex does not appear to be a confounder. Sex does, however, modify the effect of diabetes on coronary heart disease.
Both groups have an increased risk of CHD for those with diabetes, but among females, those with diabetes are almost 6 times as likely to have CHD. This is in comparison to males, where those with diabetes are only about 2 times as likely to have CHD. Notice that the overall rates of CHD differ by sex as well. Overall, males have a higher prevalence of CHD (6.3%), but the differential risk for those with and without diabetes is not as large as in the females. For females, the overall prevalence of CHD is lower, at 2.9%, but the differential risk for those with and without diabetes is larger.
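The sex-specific estimates can be verified the same way (a sketch; the helper functions simply encode the standard 2 × 2 formulas):

```python
def prevalence_ratio(a, b, c, d):
    """PR for a 2x2 table: rows = exposed/unexposed, cols = disease/no disease."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """OR for the same table layout: the cross-product ratio."""
    return (a * d) / (b * c)

# (diabetes CHD, diabetes no-CHD, no-diabetes CHD, no-diabetes no-CHD)
females = (13, 93, 25, 1191)
males = (13, 97, 65, 1050)

for label, (a, b, c, d) in [("females", females), ("males", males)]:
    print(label,
          round(prevalence_ratio(a, b, c, d), 2),
          round(odds_ratio(a, b, c, d), 2))
# females: PR about 5.97, OR about 6.66
# males:   PR about 2.03, OR about 2.16
```

Unlike the hypertension example, here the stratum-specific estimates straddle the crude estimates rather than sitting on one side of them.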
Summary of confounding vs. effect modification
To review, confounders mask a true effect, and effect modifiers mean that there is a different effect of the exposure on the outcome for different groups.
In summary, the process is as follows:
- Estimate a crude (unadjusted) estimate between exposure and outcome.
- Stratify the analysis by any potential major confounders to produce stratum-specific estimates.
- Compare the crude estimator with stratum-specific estimates and examine the kind of relationships exhibited.
With a Confounder:
- The crude estimator (e.g., RR, OR) is outside the range of the two stratum-specific estimators (in the hypertension example, the crude odds ratio was higher than both of the stratum-specific ratios).
- If the adjusted estimator is importantly (not necessarily statistically) different (often 10%) from the crude estimator, the “adjusted variable” is a confounder. In other words, if including the potential confounder changes the estimate of the risk by 10% or more, we consider it important and leave it in the model.
- Do not report the crude overall estimate (RR, OR). Instead an adjusted estimator should be reported. This can be done using the Mantel-Haenszel method or statistical modeling.
With Effect modifiers:
- The crude estimator (e.g. RR, OR) is closer to a weighted average of the stratum-specific estimators.
- The two stratum-specific estimators differ from each other.
- Report separate stratified models or report an interaction term.