7.2 - Ecological studies
Rationale and Ecological Variables
An ecological study is an observational study in which at least one variable is measured at the group level. An ecological study is especially appropriate for the initial investigation of causal hypotheses.
So...why conduct an ecological study? Several reasons support using an ecological study design.
- The hypothesis is relatively new
- Adequate measurement of individual-level variables is not possible
- Adequate design of an individual-level study is not possible (e.g., it would be unethical)
- We are interested in the effect of ecological variables, for which there is no analog at the individual level
- We have limited funds or limited time to do the study
Three types of ecological variables:
- Aggregate Variables
  - A summary or composite measure derived from values collected from individual members of a population. Aggregate variables can measure exposures (e.g., mean blood pressure) or outcomes (e.g., rate of disease). One limitation of aggregate measures is that there is variation within the population: not all individuals in the population have the average blood pressure.
- Environmental Variables
  - A measure of the physical characteristics of the environment in which people reside, work, recreate, or attend school. For example, we might hypothesize that rainfall is a risk factor for a fungal disease or that the mineral content of drinking water is protective against a certain disease. The environmental variables would then be the mean rainfall in a geographic area or the mean level of minerals in drinking water. Environmental variables measure exposure, not outcomes. One limitation of an environmental variable is that exposure levels vary among individuals in the population.
- Global Variables
  - A measure of the attributes of groups, organizations, or places for which there is no analog at the individual level. For example, the procedures or treatments that are covered by a health insurance plan might affect the rate of disease or adverse health outcomes. Population density is another global variable, because crowding might be an important exposure. There is no individual population density! Global variables are used to measure exposures, not outcomes.
Advantages & Disadvantages of Ecological Studies
Advantages
- Can be done quickly and inexpensively because they rely on pre-existing data
- Analysis and presentation are relatively simple
- Can achieve a wider range of exposure levels than can be expected from an individual-level study
- Help explain population-level associations
Disadvantages
- Ecological fallacy - the possibility of making incorrect conclusions about individual-level associations when only using group-level data
- Lack of information on important variables
Analysis of Ecologic Studies
Analytic models in ecologic studies are of different forms:
- Completely Ecologic
  - All variables (outcome, exposure, and covariates) are ecological.
- Partially Ecologic
  - Some, but not all, variables are ecological.
- Multilevel
  - Analyses may simultaneously include individual and ecological variables on the same construct (e.g., income). This could be called multilevel modeling, hierarchical regression, or mixed-effects modeling.
7.2.1 - Sample Ecological Data and Analysis
The following data illustrate a problem with the interpretation of ecological studies. The data include the numbers in the exposed and unexposed categories and the disease rate per 100,000 person-years (PY) within each of three groups.
With the data given, we can calculate the exposure rates per group as:
| Exposure | Group 1: Cases | Group 1: PY | Group 1: Rate/100,000 PY | Group 2: Cases | Group 2: PY | Group 2: Rate/100,000 PY | Group 3: Cases | Group 3: PY | Group 3: Rate/100,000 PY |
|---|---|---|---|---|---|---|---|---|---|
| Exposed (x=1) | 20 | 7000 | | 20 | 10000 | | 20 | 13000 | |
| Unexposed (x=0) | 13 | 13000 | | 10 | 10000 | | 7 | 7000 | |
| Total | 33 | 20000 | 165 | 30 | 20000 | 150 | 27 | 20000 | 135 |
| Exposure Rate (% of PY exposed) | 35% | | | 50% | | | 65% | | |
What is the relationship between exposure level and disease rate per 100,000 person-years?
Once we calculate the exposure rate in each group, we see that as the exposure rate increases (35%, 50%, 65%), the disease rate decreases (165, 150, 135 per 100,000 PY).
The natural conclusion would seem to be that exposure protects individuals from the disease by decreasing the rate of disease.
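One way to make this group-level pattern concrete is to fit a straight line to the crude rate as a function of the exposed fraction across the three groups. The sketch below is an illustration that is not part of the original lesson: it assumes a simple least-squares line through the three group totals, and the names `groups`, `ecological_summary`, and `least_squares_line` are hypothetical.

```python
# Group-level (ecological) data: person-years by exposure and total cases.
groups = {
    "Group 1": {"exposed_py": 7000, "unexposed_py": 13000, "total_cases": 33},
    "Group 2": {"exposed_py": 10000, "unexposed_py": 10000, "total_cases": 30},
    "Group 3": {"exposed_py": 13000, "unexposed_py": 7000, "total_cases": 27},
}


def ecological_summary(group):
    """Return (fraction of PY exposed, crude rate per 100,000 PY)."""
    total_py = group["exposed_py"] + group["unexposed_py"]
    exposure_fraction = group["exposed_py"] / total_py
    crude_rate = group["total_cases"] / total_py * 100_000
    return exposure_fraction, crude_rate


def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


summaries = [ecological_summary(g) for g in groups.values()]
exposure_fractions = [s[0] for s in summaries]
crude_rates = [s[1] for s in summaries]

slope, intercept = least_squares_line(exposure_fractions, crude_rates)

# Extrapolate the fitted line to a fully unexposed (x=0) and fully exposed (x=1) group.
rate_all_unexposed = intercept
rate_all_exposed = intercept + slope
ecological_rate_ratio = rate_all_exposed / rate_all_unexposed

for name, (fraction, rate) in zip(groups, summaries):
    print(f"{name}: {fraction:.0%} exposed, {rate:.0f} per 100,000 PY")
print(f"slope = {slope:.1f}, ecological rate ratio = {ecological_rate_ratio:.2f}")
```

With these three data points the fitted slope is about -100 per 100,000 PY, so extrapolating the line to a fully exposed group versus a fully unexposed group gives an apparent rate ratio of about 0.5. That is the "protective" reading the group-level data invite.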
So...would you want to be exposed to this factor in order to cut your disease risk? Or would you like to ask further questions?
What about the fact that we have no data measured at the individual level? For example, do we know the exposure level and the disease outcome for each person in the study? NO! In fact, all the cases could have actually occurred among the exposed individuals. This would be a problem if our hypothesis was that a biological process was responsible for the increased risk.
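To see how little the group totals constrain the individual-level association, we can list every way the cases in one group could be split between exposed and unexposed person-years. This is a minimal sketch, assuming only the Group 1 totals (33 cases, 7,000 exposed PY, 13,000 unexposed PY); the function name `possible_rate_ratios` is hypothetical.

```python
def possible_rate_ratios(total_cases, exposed_py, unexposed_py):
    """List (exposed_cases, rate_ratio) for every split of cases consistent with the totals."""
    results = []
    for exposed_cases in range(total_cases + 1):
        unexposed_cases = total_cases - exposed_cases
        rate_exposed = exposed_cases / exposed_py
        rate_unexposed = unexposed_cases / unexposed_py
        # If no cases are unexposed, the rate ratio is unbounded.
        if unexposed_cases == 0:
            ratio = float("inf")
        else:
            ratio = rate_exposed / rate_unexposed
        results.append((exposed_cases, ratio))
    return results


# Group 1 totals from the ecological table.
allocations = possible_rate_ratios(total_cases=33, exposed_py=7000, unexposed_py=13000)

lowest = min(ratio for _, ratio in allocations)
highest = max(ratio for _, ratio in allocations)
print(f"Rate ratio can range from {lowest} to {highest}")
```

Every split from "all cases unexposed" (rate ratio 0) to "all cases exposed" (rate ratio unbounded) is compatible with the same group totals, which is why the totals alone cannot establish the direction of an individual-level effect.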
Consider these tables:
Stratum 1 and Stratum 2 play the same role as the three groups in the previous example. We do not know the count in any cell within a stratum, nor do we know A, B, C, or D for the combined data. Only the marginal counts are known: the numbers exposed and unexposed, and the numbers of cases and non-cases, within each stratum. So, if our hypothesis for the risk pathway is biological, we run the risk of an ecological fallacy. An ecological fallacy is possible whenever we use group-level data as evidence for risk pathways that operate at the individual level, because we are ascribing group observations to the individual! (Note: Group-level data are appropriate if our hypothesis is that the disease pathway runs from a group-level exposure. Group-level exposures are recognized as important in disease causation models that include both individual and group processes.)
Individual-level Data and Analysis
To demonstrate the ecological fallacy, let's look at the individual-level data from the same example. We will fill in the number of cases within each cell for each group. For instance, among the exposed in group 1, there were 20 cases in 7,000 person-years at risk.
Then we can calculate the rates per 100,000 PY for each exposure level in each group as:
| Exposure | Group 1: Cases | Group 1: PY | Group 1: Rate/100,000 PY | Group 2: Cases | Group 2: PY | Group 2: Rate/100,000 PY | Group 3: Cases | Group 3: PY | Group 3: Rate/100,000 PY |
|---|---|---|---|---|---|---|---|---|---|
| Exposed (x=1) | 20 | 7000 | 286 | 20 | 10000 | 200 | 20 | 13000 | 154 |
| Unexposed (x=0) | 13 | 13000 | 100 | 10 | 10000 | 100 | 7 | 7000 | 100 |
| Total | 33 | 20000 | 165 | 30 | 20000 | 150 | 27 | 20000 | 135 |
| Exposure Rate (% of PY exposed) | 35% | | | 50% | | | 65% | | |
Next, we can calculate the Rate Difference and Rate Ratio within each group as
\(\text { Rate Difference }=\text { Rate }_{\text {Exposed }}-\text { Rate }_{\text {Unexposed }}\)
\(\text { Rate Ratio }=\dfrac{\text { Rate }_{\text {Exposed }}}{\text { Rate }_{\text {Unexposed }}}\)
| Exposure | Group 1: Cases | Group 1: PY | Group 1: Rate/100,000 PY | Group 2: Cases | Group 2: PY | Group 2: Rate/100,000 PY | Group 3: Cases | Group 3: PY | Group 3: Rate/100,000 PY |
|---|---|---|---|---|---|---|---|---|---|
| Exposed (x=1) | 20 | 7000 | 286 | 20 | 10000 | 200 | 20 | 13000 | 154 |
| Unexposed (x=0) | 13 | 13000 | 100 | 10 | 10000 | 100 | 7 | 7000 | 100 |
| Total | 33 | 20000 | 165 | 30 | 20000 | 150 | 27 | 20000 | 135 |
| Exposure Rate (% of PY exposed) | 35% | | | 50% | | | 65% | | |
| Rate Difference (per 100,000 PY) | 186 | | | 100 | | | 54 | | |
| Rate Ratio | 2.86 | | | 2.00 | | | 1.54 | | |
When we look at each group separately, we see that exposure is related to a higher rate of disease!
So, we would conclude that exposure increases the risk of this outcome, which is the opposite of what we concluded previously! We also observe that the rate of disease among the unexposed was the same (100 per 100,000 PY) in all groups. In every group, the rate of disease among the exposed was higher than among the unexposed, but the rate among the exposed varied from group to group (286, 200, and 154 per 100,000 PY).
Recall that when we used the group-level (ecological) data, this exposure appeared to be protective. However, given the individual-level data, exposure appears to increase the risk of disease! This is an example of an ecological fallacy (or ecological bias): using group-level data to support an individual-level pathway.
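The within-group comparison can be reproduced directly from the cell counts and person-years. The following is a minimal sketch using the rate difference and rate ratio definitions given earlier; the names `cells`, `rate_per_100k`, and `compare_exposure` are hypothetical and chosen only for illustration.

```python
# Individual-level cells: (cases, person-years) by exposure status within each group.
cells = {
    "Group 1": {"exposed": (20, 7000), "unexposed": (13, 13000)},
    "Group 2": {"exposed": (20, 10000), "unexposed": (10, 10000)},
    "Group 3": {"exposed": (20, 13000), "unexposed": (7, 7000)},
}


def rate_per_100k(cases, person_years):
    """Incidence rate per 100,000 person-years."""
    return cases / person_years * 100_000


def compare_exposure(group):
    """Rates by exposure status, plus rate difference and rate ratio, for one group."""
    rate_exposed = rate_per_100k(*group["exposed"])
    rate_unexposed = rate_per_100k(*group["unexposed"])
    return {
        "rate_exposed": rate_exposed,
        "rate_unexposed": rate_unexposed,
        "rate_difference": rate_exposed - rate_unexposed,
        "rate_ratio": rate_exposed / rate_unexposed,
    }


results = {name: compare_exposure(group) for name, group in cells.items()}

for name, r in results.items():
    print(
        f"{name}: exposed {r['rate_exposed']:.0f}, unexposed {r['rate_unexposed']:.0f}, "
        f"RD {r['rate_difference']:.0f}, RR {r['rate_ratio']:.2f}"
    )
```

In every group the rate ratio is above 1, so the individual-level data point in the opposite direction from the trend seen across the group totals.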
Can an ecological study produce results without ecological bias? Yes, under certain conditions...