Correlation between outcome and exposure GWAS estimation errors can also bias the Rao-Blackwellization #1

@harryyiheyang

Dear Dr. Wu,

It is very common for the estimation errors of the outcome and exposure GWASs to be correlated, especially because of sample overlap. This introduces a second source of bias in Mendelian randomization beyond the usual winner’s curse, and it also affects the Rao-Blackwellization procedure. I would like to raise the following points:

  1. Joint distribution of hat(Gamma_j) and hat(gamma_j)

Suppose for a given SNP j, the GWAS estimators for outcome and exposure follow a bivariate normal distribution:

 (hat(Gamma_j), hat(gamma_j)) ~ N( (Gamma_j, gamma_j), Σ )

where the covariance matrix Σ is:

 Σ = [ Var(hat(Gamma_j)),                 Cov(hat(Gamma_j), hat(gamma_j)) ]
     [ Cov(hat(Gamma_j), hat(gamma_j)),   Var(hat(gamma_j))               ]

That is,

 Σ = [ sigma_Gamma^2,                     rho * sigma_Gamma * sigma_gamma ]
     [ rho * sigma_Gamma * sigma_gamma,   sigma_gamma^2                   ]

Here, rho represents the correlation between the estimation errors, typically induced by sample overlap.
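
As a minimal sketch of this error model, the snippet below builds Σ from the two standard errors and rho, then draws correlated GWAS estimates for one SNP. The helper name `error_covariance` and all numeric values are hypothetical and chosen only for illustration; they are not taken from the simulation in this issue.

```python
import numpy as np

def error_covariance(sigma_Gamma, sigma_gamma, rho):
    """2x2 covariance matrix of (hat(Gamma_j), hat(gamma_j)) for one SNP."""
    cov = rho * sigma_Gamma * sigma_gamma
    return np.array([[sigma_Gamma**2, cov],
                     [cov, sigma_gamma**2]])

# Hypothetical standard errors and overlap-induced correlation.
Sigma = error_covariance(sigma_Gamma=0.02, sigma_gamma=0.01, rho=0.5)

# Hypothetical true effects (Gamma_j, gamma_j).
true_effects = np.array([0.02, 0.10])

rng = np.random.default_rng(0)
draws = rng.multivariate_normal(true_effects, Sigma, size=100_000)

# The empirical correlation of the two estimators should be close to rho.
empirical_rho = np.corrcoef(draws[:, 0], draws[:, 1])[0, 1]
print(f"empirical correlation: {empirical_rho:.3f}")
```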

  2. Selection region S is implicitly conditioned on hat(Gamma_j)

In the Rao-Blackwell procedure, the selection of instruments is based on randomized values of hat(gamma_j). However, since hat(Gamma_j) is correlated with hat(gamma_j), conditioning only on hat(gamma_j) (and simulating based on it) does not remove the selection bias in estimating Gamma_j. This violates the key assumption that hat(Gamma_j) is independent of the selection event S, which holds only when rho = 0.
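
To make the size of this bias explicit, here is a short derivation under the bivariate normal model from point 1. It assumes (my assumption, not something confirmed for the package) that the selection rule looks only at hat(gamma_j) plus independent randomization noise, so that the selection event S carries no information about hat(Gamma_j) beyond what hat(gamma_j) already provides.

```latex
\begin{align*}
\mathbb{E}\big[\hat\Gamma_j \mid \hat\gamma_j\big]
  &= \Gamma_j + \rho\,\frac{\sigma_\Gamma}{\sigma_\gamma}\,\big(\hat\gamma_j - \gamma_j\big), \\
\mathbb{E}\big[\hat\Gamma_j \mid S\big]
  &= \Gamma_j + \rho\,\frac{\sigma_\Gamma}{\sigma_\gamma}\,\Big(\mathbb{E}\big[\hat\gamma_j \mid S\big] - \gamma_j\Big).
\end{align*}
```

The second line follows from the first by averaging over hat(gamma_j) within S. The winner's curse in hat(gamma_j) is therefore passed on to hat(Gamma_j), scaled by rho * sigma_Gamma / sigma_gamma, and it vanishes only when rho = 0.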

  3. Empirical illustration

[Image: three boxplots of causal effect estimates, described below]

The first boxplot shows causal effect estimates using all true IVs (true value = 0.2).
The second boxplot uses only IVs selected at p < 0.05.
The third boxplot applies Rao-Blackwellization to the selected IVs.

The estimator used is MRBEE (a debiased univariable/multivariable MR method). In this simulation, the correlation between exposure and outcome GWAS estimation errors is rho = 0.5, which introduces substantial bias. Only when this correlation is removed does the RB-corrected estimate become unbiased.
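
The mechanism can be checked with a much smaller Monte Carlo than the one behind the figure. The sketch below is not the MRBEE simulation above: it repeatedly draws a single SNP in standardized units (both standard errors equal to 1), applies a plain p < 0.05 cut on hat(gamma_j), and reports the average error in hat(Gamma_j) among the selected draws. The function name `selected_outcome_bias` and the values gamma = 1.0 and Gamma = 0.2 are hypothetical.

```python
import numpy as np

def selected_outcome_bias(rho, gamma=1.0, Gamma=0.2, z_cut=1.96,
                          n_rep=200_000, seed=0):
    """Mean of hat(Gamma_j) - Gamma_j over replicates where SNP j is selected.

    Assumes unit standard errors for both GWAS estimates (z-score scale).
    """
    rng = np.random.default_rng(seed)
    e_gamma = rng.standard_normal(n_rep)
    # Outcome error with correlation rho to the exposure error.
    e_Gamma = rho * e_gamma + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_rep)
    hat_gamma = gamma + e_gamma
    hat_Gamma = Gamma + e_Gamma
    selected = np.abs(hat_gamma) > z_cut   # two-sided p < 0.05 on the exposure
    return hat_Gamma[selected].mean() - Gamma

bias_no_overlap = selected_outcome_bias(rho=0.0)
bias_overlap = selected_outcome_bias(rho=0.5)
print(f"rho = 0.0: bias in hat(Gamma_j) after selection = {bias_no_overlap:.3f}")
print(f"rho = 0.5: bias in hat(Gamma_j) after selection = {bias_overlap:.3f}")
```

With rho = 0 the selected outcome estimates stay centered on Gamma_j up to Monte Carlo noise, whereas with rho = 0.5 they are shifted in the same direction as the selected exposure estimates.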

  4. Implications for Rao-Blackwellization in practice

One possible solution is to correct both hat(gamma_j) and hat(Gamma_j) simultaneously under their joint conditional distribution given the selection event. However, this becomes challenging in multivariable MR, because:

- Each exposure selects its own set of instruments;
- The outcome model is fitted on the union of all selected instruments;
- There is no unique way to determine how to conditionally adjust hat(Gamma_j) unless we know which exposure’s selection caused its inclusion.
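
For the univariable case, where the ambiguity above does not arise, one way such a joint correction could look is sketched below. Everything here is an assumption made for illustration and may differ from what the package does: the selection rule is taken to be |hat(gamma_j)/sigma_gamma + W| > lam with W ~ N(0, eta^2) independent pseudo-noise, rho is treated as known, and the helper names `rb_exposure` and `adjust_outcome` are hypothetical. The outcome estimate is shifted by rho * (sigma_Gamma / sigma_gamma) * (hat(gamma_j) - RB estimate of gamma_j), which cancels the selection-induced shift in expectation.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)

def norm_pdf(x):
    return np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)

def upper_tail(x):
    """P(U > x) for U ~ N(0, 1)."""
    return 0.5 * _erfc(x / math.sqrt(2.0))

def rb_exposure(hat_gamma, sigma_gamma, lam, eta):
    """Rao-Blackwellized exposure estimate under the assumed selection rule
    |hat_gamma / sigma_gamma + W| > lam, with W ~ N(0, eta^2)."""
    t = hat_gamma / sigma_gamma
    a = (-lam - t) / eta
    b = (lam - t) / eta
    # E[W | hat_gamma, selected] = eta * (pdf(b) - pdf(a)) / P(selected | hat_gamma)
    ratio = (norm_pdf(b) - norm_pdf(a)) / (upper_tail(b) + upper_tail(-a))
    return hat_gamma - (sigma_gamma / eta) * ratio

def adjust_outcome(hat_Gamma, hat_gamma, gamma_rb, sigma_Gamma, sigma_gamma, rho):
    """Remove the part of the exposure's selection bias inherited through rho."""
    return hat_Gamma - rho * (sigma_Gamma / sigma_gamma) * (hat_gamma - gamma_rb)

# Monte Carlo check on one SNP with hypothetical values (z-score scale).
rng = np.random.default_rng(1)
n, gamma, Gamma = 400_000, 1.0, 0.2
sigma_gamma = sigma_Gamma = 1.0
rho, lam, eta = 0.5, 1.96, 0.5

e_gamma = rng.standard_normal(n)
e_Gamma = rho * e_gamma + math.sqrt(1.0 - rho**2) * rng.standard_normal(n)
hat_gamma = gamma + sigma_gamma * e_gamma
hat_Gamma = Gamma + sigma_Gamma * e_Gamma
W = eta * rng.standard_normal(n)
selected = np.abs(hat_gamma / sigma_gamma + W) > lam

hg, hG = hat_gamma[selected], hat_Gamma[selected]
gamma_rb = rb_exposure(hg, sigma_gamma, lam, eta)
Gamma_adj = adjust_outcome(hG, hg, gamma_rb, sigma_Gamma, sigma_gamma, rho)

rb_bias = gamma_rb.mean() - gamma          # exposure bias after RB
naive_bias = hG.mean() - Gamma             # outcome bias without adjustment
adjusted_bias = Gamma_adj.mean() - Gamma   # outcome bias after adjustment
print(f"RB exposure bias: {rb_bias:.3f}")
print(f"naive outcome bias: {naive_bias:.3f}")
print(f"adjusted outcome bias: {adjusted_bias:.3f}")
```

This only illustrates the single-exposure case. It does not resolve the multivariable problem listed above, since the adjustment needs to know which exposure's selection event the SNP passed through.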

Best regards,
Yihe Yang
