A simulated dataset with one binary exogenous regressor X, one continuous endogenous regressor P, and a dependent variable y. The exogenous regressor X takes the values 0 or 1, is derived from a latent standard normal variable, and is correlated with P (r = 0.5). The endogenous regressor P follows a bounded continuous distribution (Phi(P*) + 0.5). The strength of the endogeneity is rho = 0.5. This dataset illustrates that the IMA estimator does not require the exogenous regressor to be continuous or normally distributed. The data-generating process has no intercept. The true parameter values are alpha = 1 for P and beta = 1 for X.

data("dataCopIMABinExo")

Format

A data frame with 1000 observations on 3 variables:

y

a numeric vector representing the dependent variable.

X

a numeric vector, binary (0 or 1) and exogenous, correlated with P.

P

a numeric vector, continuous and endogenous, following a bounded distribution Phi(P*) + 0.5 with values in (0.5, 1.5).
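
The data-generating process described above can be sketched in a few lines of R. This is only an illustration, not the code used to create dataCopIMABinExo. The following choices are assumptions: the seed, a Gaussian dependence structure among the latent variables, a threshold of 0 for turning the latent variable into the binary X, a standard normal error, reading Phi as the standard normal distribution function (pnorm), and applying r and rho to the latent variables rather than to the observed ones. The object names Sigma, Z, e and sim are hypothetical.

```r
# Illustrative simulation only; NOT the code that produced dataCopIMABinExo.
set.seed(1)    # arbitrary seed (assumption)
n   <- 1000    # number of observations, as in the dataset
r   <- 0.5     # assumed correlation between the latent X* and P*
rho <- 0.5     # assumed correlation between the latent P* and the error

# Correlation matrix of the latent (X*, P*, e). X* and e are uncorrelated,
# so X stays exogenous, while P* is correlated with both.
Sigma <- matrix(c(1, r,   0,
                  r, 1,   rho,
                  0, rho, 1), nrow = 3, byrow = TRUE)

# Draw correlated standard normals using the Cholesky factor of Sigma
Z <- matrix(rnorm(n * 3), ncol = 3) %*% chol(Sigma)

X <- as.numeric(Z[, 1] > 0)   # binary exogenous regressor (threshold 0 assumed)
P <- pnorm(Z[, 2]) + 0.5      # bounded endogenous regressor in (0.5, 1.5)
e <- Z[, 3]                   # structural error, correlated with P*

# Outcome with alpha = 1 for P, beta = 1 for X, and no intercept
y <- 1 * P + 1 * X + e

sim <- data.frame(y = y, X = X, P = P)
```

Because P is correlated with the error in this sketch, an ordinary least squares fit such as lm(y ~ P + X - 1, data = sim) would be expected to give a biased coefficient on P, which is the situation the dataset is meant to illustrate.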

References

Haschka, R. E. (2025). Robustness of copula-correction models in causal analysis: Exploiting between-regressor correlation. IMA Journal of Management Mathematics, 36, 161-180. doi:10.1093/imaman/dpae018

Author

Kimberly Lew kimberlylew12@gmail.com