misclass()
and probsens()
allow to provide adjusted measures
of association corrected for misclassification of the exposure or the outcome.
Usage
misclass(
case,
exposed,
type = c("exposure", "exposure_pv", "outcome"),
bias_parms = NULL,
alpha = 0.05
)
probsens(
case,
exposed,
type = c("exposure", "exposure_pv", "outcome"),
reps = 1000,
seca = list(dist = c("constant", "uniform", "triangular", "trapezoidal", "normal",
"beta"), parms = NULL),
seexp = NULL,
spca = list(dist = c("constant", "uniform", "triangular", "trapezoidal", "normal",
"beta"), parms = NULL),
spexp = NULL,
corr_se = NULL,
corr_sp = NULL,
alpha = 0.05
)
Arguments
- case
Outcome variable. If a variable, this variable is tabulated against.
- exposed
Exposure variable.
- type
Choice of misclassification:
exposure: bias analysis for exposure misclassification; corrections using sensitivity and specificity: nondifferential and independent errors,
exposure_pv: bias analysis for exposure misclassification; corrections using PPV/NPV: nondifferential and independent errors,
outcome: bias analysis for outcome misclassification.
- bias_parms
Vector defining the bias parameters. This vector has 4 elements between 0 and 1, in the following order:
Sensitivity of exposure (when
type = "exposure"
) or outcome (whentype = "outcome"
) classification among those with the outcome (whentype = "exposure"
) or exposure (whentype = "outcome"
),Sensitivity of exposure (or outcome) classification among those without the outcome (or exposure),
Specificity of exposure (or outcome) classification among those with the outcome (or exposure), and
Specificity of exposure (or outcome) classification among those without the outcome (or exposure).
If PPV/NPV is chosen in case of exposure misclassification, this vector is the following:
Positive predictive value among those with the outcome,
Positive predictive value among those without the outcome,
Negative predictive value among those with the outcome,
Negative predictive value among those without the outcome.
- alpha
Significance level.
- reps
Number of replications to run.
- seca
List defining sensitivity among cases:
The sensitivity of exposure classification among those with the outcome (when
type = "exposure"
), orThe sensitivity of outcome classification among those with the exposure (when
type = "outcome"
).
The first argument provides the probability distribution function (constant, uniform, triangular, trapezoidal, truncated normal, or beta) and the second its parameters as a vector. Lower and upper bounds of the truncated normal have to be between 0 and 1.
constant: constant value,
uniform: min, max,
triangular: lower limit, upper limit, mode,
trapezoidal: min, lower mode, upper mode, max,
normal: lower bound, upper bound, mean, sd.
beta: alpha, beta.
If PPV/NPV is chosen in case of exposure misclassification, the same four (4) parameters
seca
,seexp
,spca
,spexp
as for Se/Sp have to be used but with the following meaning, and only for a beta distributions and no correlation between distributions:Positive predictive value among those with the outcome,
Positive predictive value among those without the outcome,
Negative predictive value among those with the outcome,
Negative predictive value among those without the outcome.
- seexp
List defining sensitivity among controls:
The sensitivity of exposure classification among those without the outcome (when
type = "exposure"
), orThe sensitivity of outcome classification among those without the exposure (when
type = "outcome"
).
- spca
List as above for
seca
but for specificity.- spexp
List as above for
seexp
but for specificity.- corr_se
Correlation between case and non-case sensitivities. If PPV/NPV is chosen in case of exposure misclassification, correlations are set to NULL.
- corr_sp
Correlation between case and non-case specificities.
Value
A list with elements (for misclass()
):
- obs_data
The analyzed 2 x 2 table from the observed data.
- corr_data
The expected observed data given the true data assuming misclassification.
- obs_measures
A table of observed relative risk and odds ratio with confidence intervals.
- adj_measures
A table of corrected relative risks and odds ratios.
- bias_parms
Input bias parameters.
A list with elements (for probsens()
):
- obs_data
The analyzed 2 x 2 table from the observed data.
- obs_measures
A table of observed relative risk and odds ratio with confidence intervals.
- adj_measures
A table of corrected relative risks and odds ratios.
- sim_df
Data frame of random parameters and computed values.
- reps
Number of replications.
Simple bias analysis with misclass()
misclass()
allows you to run a simple sensitivity analysis for disease or
exposure misclassification. Confidence interval for odds ratio adjusted using
sensitivity and specificity is computed as in Chu et al. (2006), for exposure
misclassification.
For exposure misclassification, bias-adjusted measures are available using sensitivity and specificity, or using predictive values.
Probabilistic sensitivity analysis with probsens()
probsens()
performs a summary-level probabilistic sensitivity analysis to
correct for exposure misclassification or outcome misclassification and random
error. Non-differential misclassification is assumed when only the two bias
parameters seca
and spca
are provided. Adding the 2 parameters
seexp
and spexp
(i.e. providing the 4 bias parameters) evaluates
a differential misclassification.
For exposure misclassification, bias-adjusted measures are available using sensitivity and specificity, or using predictive values. However, only a beta distribution is available for predictive values.
Correlations between sensitivity (or specificity) of exposure classification among cases and controls can be specified and use the NORmal To Anything (NORTA) transformation (Li & Hammond, 1975).
In case of negative (<=0) adjusted count in the 2-by-2 table following given prior Se/Sp distribution(s), draws are discarded.
Updated calculations, probabilistic bias analysis
episensr 2.0.0 introduced updated calculations of probabilistic bias analyses
by (1) using the NORTA transformation to define a correlation between
distributions, and (2) sampling true prevalences and then sampling the
adjusted cell counts rather than just using the expected cell counts from a
simple quantitative bias analysis. This updated version should be preferred
but if you need to run an old analysis, you can easily revert to the
computation using probsens_legacy()
as follows:
References
Fox, M.P, MacLehose, R.F., Lash, T.L., 2021 Applying Quantitative Bias Analysis to Epidemiologic Data, pp.141–176, 233–256, 293–308, Springer.
Li, S.T., Hammond, J.L., 1975. Generation of Pseudorandom Numbers with Specified Univariate Distributions and Correlation Coefficients. IEEE Trans Syst Man Cybern 5:557-561.
Chu, H., Zhaojie, W., Cole, S.R., Greenland, S., Sensitivity analysis of misclassification: A graphical and a Bayesian approach, Annals of Epidemiology 2006;16:834-841.
Barros, A. & Hirakata, V.N., 2003. Alternatives for Logistic Regression in Cross-sectional Studies: An Empirical Comparison of Models that Directly Estimate the Prevalence Ratio. BMC Medical Research Methodology 3:21.
McNutt, L-A, Wu, C., Xue, X., Hafner J.P., 2003. Estimating the Relative Risk in Cohort Studies and Clinical Trials of Common Outcomes. American Journal of Epidemiology 157(10):940-943.
Greenland, S. (2004). Model-based Estimation of Relative Risks and Other Epidemiologic Measures in Studies of Common Outcomes and in Case-Control Studies. American Journal of Epidemiology 160(4):301-305.
Zhou, G. (2004). A Modified Poisson Regression Approach to Prospective Studies with Binary Data. American Journal of Epidemiology 159(7):702-706.
See also
Other misclassification:
misclass_cov()
,
probsens.irr()
Examples
# The data for this example come from:
# Fink, A.K., Lash, T.L. A null association between smoking during pregnancy
# and breast cancer using Massachusetts registry data (United States).
# Cancer Causes Control 2003;14:497-503.
misclass(matrix(c(215, 1449, 668, 4296),
dimnames = list(c("Breast cancer+", "Breast cancer-"),
c("Smoker+", "Smoker-")),
nrow = 2, byrow = TRUE),
type = "exposure",
bias_parms = c(.78, .78, .99, .99))
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: Breast cancer+
#> • Comparing: Smoker+ vs. Smoker-
#>
#> Smoker+ Smoker-
#> Breast cancer+ 215 1449
#> Breast cancer- 668 4296
#> 2.5% 97.5%
#> Observed Relative Risk: 0.9653825 0.8523766 1.0933704
#> Observed Odds Ratio: 0.9542406 0.8092461 1.1252141
#> ── Bias-adjusted measures ──
#>
#> 2.5% 97.5%
#> Misclassification Bias Corrected Relative Risk: 0.9614392
#> Misclassification Bias Corrected Odds Ratio: 0.9490695 0.7895687 1.1407909
misclass(matrix(c(4558, 3428, 46305, 46085),
dimnames = list(c("AMI death+", "AMI death-"),
c("Male+", "Male-")),
nrow = 2, byrow = TRUE),
type = "outcome",
bias_parms = c(.53, .53, .99, .99))
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: AMI death+
#> • Comparing: Male+ vs. Male-
#>
#> Male+ Male-
#> AMI death+ 4558 3428
#> AMI death- 46305 46085
#> 2.5% 97.5%
#> Observed Relative Risk: 1.294347 1.240431 1.350607
#> Observed Odds Ratio: 1.323321 1.263639 1.385822
#> ── Bias-adjusted measures ──
#>
#>
#> Misclassification Bias Corrected Relative Risk: 1.344039
#> Misclassification Bias Corrected Odds Ratio: 1.406235
# The following example comes from Chu et al. Sensitivity analysis of
# misclassification: A graphical and a Bayesian approach.
# Annals of Epidemiology 2006;16:834-841.
misclass(matrix(c(126, 92, 71, 224),
dimnames = list(c("Case", "Control"), c("Smoker +", "Smoker -")),
nrow = 2, byrow = TRUE),
type = "exposure",
bias_parms = c(.94, .94, .97, .97))
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: Case
#> • Comparing: Smoker + vs. Smoker -
#>
#> Smoker + Smoker -
#> Case 126 92
#> Control 71 224
#> 2.5% 97.5%
#> Observed Relative Risk: 2.196866 1.796016 2.687181
#> Observed Odds Ratio: 4.320882 2.958402 6.310846
#> ── Bias-adjusted measures ──
#>
#> 2.5% 97.5%
#> Misclassification Bias Corrected Relative Risk: 2.377254
#> Misclassification Bias Corrected Odds Ratio: 5.024508 3.282534 7.690912
# The next example, using PPV/NPV, comes from Bodnar et al. Validity of birth
# certificate-derived maternal weight data.
# Paediatric and Perinatal Epidemiology 2014;28:203-212.
misclass(matrix(c(599, 4978, 31175, 391851),
dimnames = list(c("Preterm", "Term"), c("Underweight", "Normal weight")),
nrow = 2, byrow = TRUE),
type = "exposure_pv",
bias_parms = c(0.65, 0.74, 1, 0.98))
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: Preterm
#> • Comparing: Underweight vs. Normal weight
#>
#> Underweight Normal weight
#> Preterm 599 4978
#> Term 31175 391851
#> 2.5% 97.5%
#> Observed Relative Risk: 1.502808 1.381743 1.634480
#> Observed Odds Ratio: 1.512469 1.388465 1.647547
#> ── Bias-adjusted measures ──
#>
#> 2.5% 97.5%
#> Misclassification Bias Corrected Relative Risk: 0.9528156
#> Misclassification Bias Corrected Odds Ratio: 0.9522211
#
# The data for this example come from:
# Greenland S., Salvan A., Wegman D.H., Hallock M.F., Smith T.J.
# A case-control study of cancer mortality at a transformer-assembly facility.
# Int Arch Occup Environ Health 1994; 66(1):49-54.
greenland <- matrix(c(45, 94, 257, 945), dimnames = list(c("BC+", "BC-"),
c("Smoke+", "Smoke-")), nrow = 2, byrow = TRUE)
set.seed(123)
# Exposure misclassification, non-differential
probsens(greenland, type = "exposure", reps = 20000,
seca = list("trapezoidal", c(.75, .85, .95, 1)),
spca = list("trapezoidal", c(.75, .85, .95, 1)))
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [10ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [31ms]
#>
#> ⠙ Incorporating random error
#> ! Chosen Se/Sp distributions lead to 821 impossible values which were discarded.
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [155ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 45 94
#> BC- 257 945
#> 2.5% 97.5%
#> Observed Relative Risk: 1.646999 1.182429 2.294094
#> Observed Odds Ratio: 1.760286 1.202457 2.576898
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 2.187211 1.738514 6.478489
#> total error: 2.245067 1.317778 6.903311
#> Odds Ratio -- systematic error: 2.469835 1.872963 13.670279
#> total error: 2.543051 1.361831 15.167288
# Exposure misclassification, differential
probsens(greenland, type = "exposure", reps = 20000,
seca = list("trapezoidal", c(.75, .85, .95, 1)),
seexp = list("trapezoidal", c(.7, .8, .9, .95)),
spca = list("trapezoidal", c(.75, .85, .95, 1)),
spexp = list("trapezoidal", c(.7, .8, .9, .95)),
corr_se = .8,
corr_sp = .8)
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [191ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [40ms]
#>
#> ⠙ Incorporating random error
#> ! Chosen Se/Sp distributions lead to 4221 impossible values which were discarded.
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [120ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 45 94
#> BC- 257 945
#> 2.5% 97.5%
#> Observed Relative Risk: 1.646999 1.182429 2.294094
#> Observed Odds Ratio: 1.760286 1.202457 2.576898
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 2.911013 1.808631 9.832892
#> total error: 2.983846 1.490802 10.412820
#> Odds Ratio -- systematic error: 3.526097 1.966118 44.123981
#> total error: 3.624326 1.568020 52.205437
probsens(greenland, type = "exposure", reps = 20000,
seca = list("beta", c(908, 16)),
seexp = list("beta", c(156, 56)),
spca = list("beta", c(153, 6)),
spexp = list("beta", c(205, 18)),
corr_se = .8,
corr_sp = .8)
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [151ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [32ms]
#>
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [94ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 45 94
#> BC- 257 945
#> 2.5% 97.5%
#> Observed Relative Risk: 1.646999 1.182429 2.294094
#> Observed Odds Ratio: 1.760286 1.202457 2.576898
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 1.595231 1.348804 2.016217
#> total error: 1.604960 1.051014 2.468737
#> Odds Ratio -- systematic error: 1.697448 1.400317 2.237852
#> total error: 1.709025 1.054424 2.834296
probsens(matrix(c(338, 490, 17984, 32024),
dimnames = list(c("BC+", "BC-"), c("Smoke+", "Smoke-")), nrow = 2, byrow = TRUE),
type = "exposure",
reps = 1000,
seca = list("trapezoidal", c(.8, .9, .9, 1)),
spca = list("trapezoidal", c(.8, .9, .9, 1)))
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [7ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [23ms]
#>
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [33ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 338 490
#> BC- 17984 32024
#> 2.5% 97.5%
#> Observed Relative Risk: 1.224104 1.066961 1.404390
#> Observed Odds Ratio: 1.228315 1.068081 1.412589
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 1.298977 1.251494 1.383416
#> total error: 1.305208 1.093808 1.552964
#> Odds Ratio -- systematic error: 1.304850 1.256300 1.391444
#> total error: 1.311283 1.095413 1.564852
# Disease misclassification
probsens(matrix(c(173, 602, 134, 663),
dimnames = list(c("BC+", "BC-"), c("Smoke+", "Smoke-")), nrow = 2, byrow = TRUE),
type = "outcome",
reps = 20000,
seca = list("uniform", c(.8, 1)),
spca = list("uniform", c(.8, 1)))
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [10ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [46ms]
#>
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [127ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 173 602
#> BC- 134 663
#> 2.5% 97.5%
#> Observed Relative Risk: 1.184136 1.056368 1.327359
#> Observed Odds Ratio: 1.421865 1.106140 1.827708
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 1.232712 1.186131 1.311966
#> total error: 1.236527 1.072703 1.450525
#> Odds Ratio -- systematic error: 1.566668 1.446974 1.733297
#> total error: 1.568530 1.146551 2.206620
probsens(matrix(c(338, 490, 17984, 32024),
dimnames = list(c("BC+", "BC-"), c("Smoke+", "Smoke-")), nrow = 2, byrow = TRUE),
type = "outcome",
reps = 20000,
seca = list("uniform", c(.2, .6)),
seexp = list("uniform", c(.1, .5)),
spca = list("uniform", c(.99, 1)),
spexp = list("uniform", c(.99, 1)),
corr_se = .8,
corr_sp = .8)
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [23ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [45ms]
#>
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [115ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 338 490
#> BC- 17984 32024
#> 2.5% 97.5%
#> Observed Relative Risk: 1.224104 1.066961 1.404390
#> Observed Odds Ratio: 1.228315 1.068081 1.412589
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 0.9891868 0.4879325 1.8220256
#> total error: 0.9835122 0.4834684 1.8594789
#> Odds Ratio -- systematic error: 0.9887910 0.4673115 1.8539193
#> total error: 0.9826212 0.4622968 1.8910876
probsens(matrix(c(173, 602, 134, 663),
dimnames = list(c("BC+", "BC-"), c("Smoke+", "Smoke-")), nrow = 2, byrow = TRUE),
type = "outcome",
reps = 20000,
seca = list("beta", c(100, 5)),
seexp = list("beta", c(110, 10)),
spca = list("beta", c(120, 15)),
spexp = list("beta", c(130, 30)),
corr_se = .8,
corr_sp = .8)
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [142ms]
#>
#> ⠙ Simple bias analysis
#> ✔ Simple bias analysis [205ms]
#>
#> ⠙ Incorporating random error
#> ⠹ Incorporating random error
#> ✔ Incorporating random error [105ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: BC+
#> • Comparing: Smoke+ vs. Smoke-
#>
#> Smoke+ Smoke-
#> BC+ 173 602
#> BC- 134 663
#> 2.5% 97.5%
#> Observed Relative Risk: 1.184136 1.056368 1.327359
#> Observed Odds Ratio: 1.421865 1.106140 1.827708
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Relative Risk -- systematic error: 1.358313 1.246587 1.532278
#> total error: 1.361514 1.143605 1.650434
#> Odds Ratio -- systematic error: 1.776187 1.546615 2.124443
#> total error: 1.785509 1.277801 2.524935
# Fox M.P., MacLehose R.F., Lash T.L.
# SAS and R code for probabilistic quantitative bias analysis for
# misclassified binary variables and binary unmeasured confounders
# Int J Epidemiol 2023:1624-1633.
if (FALSE) { # \dontrun{
fox <- matrix(c(40, 20, 60, 80),
dimnames = list(c("Diseased", "Non-diseased"), c("Exposed", "Unexposed")),
nrow = 2, byrow = TRUE)
set.seed(1234)
probsens(fox, type = "exposure", reps = 10^6,
seca = list("beta", c(25, 3)),
spca = list("trapezoidal", c(.9, .93, .97, 1)),
seexp = list("beta", c(47, 7)),
spexp = list("trapezoidal", c(.8, .83, .87, .9)),
corr_se = .8,
corr_sp = .8)
} # }
# Using PPV/NPV, from Bodnar et al. Validity of birth certificate-derived maternal
# weight data. Paediatric and Perinatal Epidemiology 2014;28:203-212.
set.seed(1234)
probsens(matrix(c(599, 4978, 31175, 391851),
dimnames = list(c("Preterm", "Term"), c("Underweight", "Normal weight")),
nrow = 2, byrow = TRUE),
type = "exposure_pv", reps = 10^6,
seca = list("beta", c(50, 27)), ## PPV_case
spca = list("beta", c(120, .5)), ## NPV_case
seexp = list("beta", c(132, 47)), ## PPV_ctrl
spexp = list("beta", c(115, 2))) ## NPV_ctrl
#> ℹ Calculating observed measures
#> ⠙ Assign probability distributions
#> ✔ Assign probability distributions [791ms]
#>
#> ⠙ Simple bias analysis
#> ⠹ Simple bias analysis
#> ✔ Simple bias analysis [673ms]
#>
#>
#> ── Observed data ───────────────────────────────────────────────────────────────
#> • Outcome: Preterm
#> • Comparing: Underweight vs. Normal weight
#>
#> Underweight Normal weight
#> Preterm 599 4978
#> Term 31175 391851
#> 2.5% 97.5%
#> Observed Relative Risk: 1.502808 1.381743 1.634480
#> Observed Odds Ratio: 1.512469 1.388465 1.647547
#> ── Bias-adjusted measures ──
#>
#> Median p2.5 p97.5
#> Odds Ratio -- systematic error: 1.0712427 0.6834697 1.5144332