Dependency between variables can complicate causal inference.
If a treatment variable is correlated with other covariates in the model, overlap decreases and the uncertainty in the estimated treatment effect increases. This is the correct way to model things if we want to ask what would happen if we intervened on that variable while holding the correlated ones fixed. For example, suppose attendance in a paid music program at a school is strongly correlated with socio-economic status. It still makes sense to ask about the impact of rolling out that program to all students.
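As a rough illustration of how correlation inflates uncertainty, the sketch below simulates a continuous treatment with correlation `rho` to a single covariate and reports the ordinary least squares standard error of the treatment coefficient. The data-generating process, the function name `treatment_se`, and the choice of a linear model are all assumptions made for the example, not part of any existing code.

```python
import numpy as np


def treatment_se(rho, n=2000, seed=0):
    """OLS standard error of the treatment coefficient on simulated data
    where the treatment has correlation `rho` with one covariate."""
    rng = np.random.default_rng(seed)
    covariate = rng.normal(size=n)
    # Unit-variance treatment whose correlation with the covariate is rho.
    treatment = rho * covariate + np.sqrt(1 - rho**2) * rng.normal(size=n)
    # Hypothetical outcome model: both coefficients are 1, unit noise.
    outcome = treatment + covariate + rng.normal(size=n)

    X = np.column_stack([np.ones(n), treatment, covariate])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ beta
    sigma2 = residuals @ residuals / (n - X.shape[1])
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return float(np.sqrt(cov_beta[1, 1]))


se_by_rho = {rho: treatment_se(rho) for rho in (0.0, 0.5, 0.9, 0.99)}
for rho, se in se_by_rho.items():
    print(f"rho={rho:<5} standard error of treatment effect: {se:.4f}")
```

In this linear setting the standard error grows roughly like 1 / sqrt(1 - rho^2), so a treatment that is almost determined by the covariates leaves very little independent variation from which to estimate its effect.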
On the other hand, a strong correlation may indicate that the 'treatment concept' is spread across multiple variables. For example, the quality of school leadership could be measured by teachers' views of leadership across multiple dimensions and by principal experience. The goal of observational causal inference in this setting could be to get a rough indication of the magnitude of any impact of school leadership, so as to decide whether to prioritize developing interventions that might improve leadership. It probably does not make sense to ask what the impact would be of changing teachers' views of a particular aspect of leadership whilst holding the remaining aspects fixed. In this case, we need to select (or construct) one variable to represent the concept.
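One possible way to construct a single variable from several correlated measures is to take their first principal component. The sketch below does this with NumPy on simulated survey items; the data, the names `latent_quality` and `leadership_index`, and the choice of PCA over alternatives (a simple average, a factor model) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Simulated stand-in for an unobserved "leadership quality" concept.
latent_quality = rng.normal(size=n)
# Four noisy survey items that all measure the same concept.
items = np.column_stack(
    [latent_quality + 0.5 * rng.normal(size=n) for _ in range(4)]
)


def first_principal_component(X):
    """Return first-component scores and the share of variance explained."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each column
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    if loadings.sum() < 0:  # fix the arbitrary sign so higher means "more"
        loadings = -loadings
    scores = Z @ loadings
    explained = S[0] ** 2 / np.sum(S**2)
    return scores, float(explained)


leadership_index, explained = first_principal_component(items)
print(f"variance explained by first component: {explained:.2f}")
print(
    "correlation with simulated concept:",
    round(float(np.corrcoef(leadership_index, latent_quality)[0, 1]), 2),
)
```

The resulting index can then play the role of the single treatment variable, with the caveat that the estimated effect refers to the constructed index rather than to any one survey item.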
The other case where correlation can introduce complexity is when estimating the conditional average treatment effect (CATE), where you need to marginalise out the variables you do not care about, conditional on the ones you do.
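To make the distinction concrete, the sketch below compares two ways of averaging out a nuisance variable `x2` when reporting CATE as a function of `x1`: averaging over the distribution of `x2` among units with similar `x1` (conditional), versus averaging over the distribution of `x2` in the whole sample (marginal). The effect function `unit_effect`, the correlation `rho`, and the binning width are hypothetical choices for a simulated example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
rho = 0.8
x1 = rng.normal(size=n)
# x2 is correlated with x1 (correlation rho).
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)


def unit_effect(a, b):
    """Hypothetical treatment effect that depends on both variables."""
    return a + b


grid = np.linspace(-1.0, 1.0, 5)


def cate_conditional(x, width=0.1):
    """Average effect over units whose x1 lies within `width` of x,
    so x2 is averaged over its distribution given x1."""
    near = np.abs(x1 - x) < width
    return unit_effect(x1[near], x2[near]).mean()


def cate_marginal(x):
    """Average effect with x1 set to x and x2 drawn from the whole sample,
    ignoring the dependence between x1 and x2."""
    return unit_effect(np.full(n, x), x2).mean()


conditional = np.array([cate_conditional(x) for x in grid])
marginal = np.array([cate_marginal(x) for x in grid])

slope_conditional = np.polyfit(grid, conditional, 1)[0]
slope_marginal = np.polyfit(grid, marginal, 1)[0]
print(f"slope of CATE in x1, conditional averaging: {slope_conditional:.2f}")
print(f"slope of CATE in x1, marginal averaging:    {slope_marginal:.2f}")
```

In this simulation the conditional version has a slope close to 1 + rho, because units with high `x1` also tend to have high `x2`, while the marginal version has a slope of 1. Which one is wanted depends on the question being asked, and the two diverge more as the correlation grows.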
Identifying clusters of variables that it makes sense to treat as a single concept can help, both in constructing treatment variables and in choosing the variables with respect to which CATE is estimated.
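A minimal sketch of one way to find such clusters: link any two variables whose absolute correlation exceeds a threshold and take the connected groups. The function name `correlation_clusters`, the threshold of 0.7, and the simulated columns are assumptions for illustration; hierarchical clustering on a correlation-based distance would be a reasonable alternative, and domain knowledge should decide whether a cluster really is one concept.

```python
import numpy as np


def correlation_clusters(X, threshold=0.7):
    """Group column indices of X whose absolute pairwise correlation is at
    least `threshold`, linking groups transitively (single linkage)."""
    corr = np.corrcoef(X, rowvar=False)
    p = corr.shape[0]
    parent = list(range(p))  # union-find forest over column indices

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if abs(corr[i, j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the root of the merged group.
                    parent[max(ri, rj)] = min(ri, rj)

    clusters = {}
    for col in range(p):
        clusters.setdefault(find(col), []).append(col)
    return list(clusters.values())


# Simulated data: columns 0-2 share one latent factor, columns 3-4 share
# another, and column 5 is independent of everything else.
rng = np.random.default_rng(0)
n = 1000
leadership = rng.normal(size=n)
ses = rng.normal(size=n)
X = np.column_stack(
    [leadership + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [ses + 0.5 * rng.normal(size=n) for _ in range(2)]
    + [rng.normal(size=n)]
)

clusters = correlation_clusters(X)
print(clusters)
```

Each multi-variable cluster is then a candidate for being collapsed into a single constructed variable, either as a treatment or as a dimension along which to report CATE.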