Epidemiologic studies can suffer from more than one bias. Bias
functions in episensr
can be applied sequentially to
quantify bias resulting from multiple biases.
Following the example in Lash et al., we can use the study by Chien et al.. It is a case-control study looking at the association between antidepressant use and the occurrence of breast cancer. The observed OR was 1.2 [0.9–1.6].
chien <- matrix(c(118, 832, 103, 884),
dimnames = list(c("BC+", "BC-"), c("AD+", "AD-")),
nrow = 2, byrow = TRUE)
AD+ | AD- | |
---|---|---|
BC+ | 118 | 832 |
BC- | 103 | 884 |
Records on medication use differed between participants, from pharmacy records and self-reported use, leading to misclassification:
## Loading required package: ggplot2
## Thank you for using episensr!
## This is version 2.0.0 of episensr
## Type 'citation("episensr")' for citing this R package in publications.
seq_bias1 <- chien %>%
misclass(., type = "exposure",
bias_parms = c(24/(24+19), 18/(18+13), 144/(144+2), 130/(130+4)))
seq_bias1
##
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
##
## AD+ AD-
## BC+ 118 832
## BC- 103 884
## 2.5% 97.5%
## Observed Relative Risk: 1.1012443 0.9646019 1.2572431
## Observed Odds Ratio: 1.2172330 0.9192874 1.6117443
## ── Bias-adjusted measures ──
## 2.5% 97.5%
## Misclassification Bias Corrected Relative Risk: 1.256946
## Misclassification Bias Corrected Odds Ratio: 1.628058 1.113458 2.380487
Controls and cases also enrolled into the study at different rates.
By storing the result of misclass()
in an object named here
seq_bias1
, we have access to the various elements returned
by the function (see the help()
for a given function to
know what is returned). The bias-adjusted cell frequencies can be
accessed as corr_data
.
seq_bias1$corr_data
## AD+ AD-
## BC+ 192.8332 757.1668
## BC- 133.5114 853.4886
This 2-by-2 table will be used as starting values for the next
sequential bias adjustment, selection()
.
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
##
## AD+ AD-
## BC+ 192.8332 757.1668
## BC- 133.5114 853.4886
## 2.5% 97.5%
## Observed Relative Risk: 1.256946 1.132669 1.394858
## Observed Odds Ratio: 1.628058 1.278900 2.072541
## ── Bias-adjusted measures ──
##
## Selection Bias Corrected Relative Risk: 1.172097
## Selection Bias Corrected Odds Ratio: 1.448430
The adjusted OR is now 1.45 Again, the bias-adjusted cell frequencies
can be used in the next bias analysis, confounders()
.
seq_bias2$corr_data
## AD+ AD-
## BC+ 262.7156 1251.515
## BC- 163.6169 1128.953
The association between antidepressant use and breast cancer was adjusted for various confounders (race/ethnicity, income, etc.). None of these confounders were found to change the association by more than 10%. However, for illustration, we can add the effect of a potential confounder (e.g. physical activity):
seq_bias3 <- seq_bias2$corr_data %>%
confounders(., type = "OR", bias_parms = c(.8, .299, .436))
seq_bias3
## ── Observed data ───────────────────────────────────────────────────────────────
## • Outcome: BC+
## • Comparing: AD+ vs. AD-
##
## AD+ AD-
## BC+ 262.7156 1251.5153
## BC- 163.6169 1128.9532
## 2.5% 97.5%
## Crude Odds Ratio: 1.448430 1.172758 1.788903
## Odds Ratio, Confounder +: 1.406219
## Odds Ratio, Confounder -: 1.406219
## ── Bias-adjusted measures ──
## OR due to confounding
## Standardized Morbidity Ratio: 1.406219 1.030018
## Mantel-Haenszel: 1.406219 1.030018
The serially bias-adjusted OR is 1.406. And the adjusted cells by confounders, for regular exercise:
seq_bias3$cfder_data
## AD+ AD-
## BC+ 66.83851 478.2302
## BC- 48.92144 492.2236
And for no regular exercise:
seq_bias3$nocfder_data
## AD+ AD-
## BC+ 195.8771 773.2851
## BC- 114.6954 636.7296
The same process can be realized in a probabilistic framework, as
each probsens()
, probsens.sel()
and
probsens_conf()
provides the adjusted cell frequencies as
A1, B1, C1, D1 in sim_df
.
Exposed | Not exposed | |
---|---|---|
Cases | A1 | B1 |
Controls | C1 | D1 |
mod1 <- chien %>%
probsens(., type = "exposure",
seca = list("trapezoidal", c(.45, .5, .6, .65)),
seexp = list("trapezoidal", c(.4, .48, .58, .63)),
spca = list("trapezoidal", c(.95, .97, .99, 1)),
spexp = list("trapezoidal", c(.96, .98, .99, 1)),
corr_se = .8, corr_sp = .8)
## ℹ Calculating observed measures
## ⠙ Assign probability distributions
## ✔ Assign probability distributions [15ms]
##
## ⠙ Simple bias analysis
## ✔ Simple bias analysis [21ms]
##
## ⠙ Incorporating random error
## ✔ Incorporating random error [42ms]
##
str(mod1)
## List of 7
## $ obs_data : num [1:2, 1:2] 118 103 832 884
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:2] "BC+" "BC-"
## .. ..$ : chr [1:2] "AD+" "AD-"
## $ obs_measures: num [1:2, 1:3] 1.101 1.217 0.965 0.919 1.257 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:2] " Observed Relative Risk:" " Observed Odds Ratio:"
## .. ..$ : chr [1:3] " " "2.5%" "97.5%"
## $ adj_measures: num [1:4, 1:3] 1.071 1.076 1.147 1.155 0.963 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:4] "Relative Risk -- systematic error:" " total error:" " Odds Ratio -- systematic error:" " total error:"
## .. ..$ : chr [1:3] "Median" "p2.5" "p97.5"
## $ sim_df :'data.frame': 1000 obs. of 27 variables:
## ..$ seca : num [1:1000] 0.491 0.568 0.46 0.573 0.601 ...
## ..$ seexp : num [1:1000] 0.445 0.534 0.421 0.496 0.534 ...
## ..$ spca : num [1:1000] 0.971 0.977 0.973 0.973 0.97 ...
## ..$ spexp : num [1:1000] 0.984 0.985 0.973 0.976 0.983 ...
## ..$ A1 : num [1:1000] 195 177 214 169 156 ...
## ..$ B1 : num [1:1000] 755 773 736 781 794 ...
## ..$ C1 : num [1:1000] 204 170 194 168 166 ...
## ..$ D1 : num [1:1000] 783 817 793 819 821 ...
## ..$ prevca : num [1:1000] 0.194 0.177 0.233 0.186 0.152 ...
## ..$ prevexp: num [1:1000] 0.208 0.176 0.187 0.157 0.154 ...
## ..$ ppvca : num [1:1000] 0.801 0.844 0.839 0.828 0.78 ...
## ..$ ppvexp : num [1:1000] 0.882 0.883 0.783 0.794 0.85 ...
## ..$ npvca : num [1:1000] 0.888 0.913 0.856 0.909 0.931 ...
## ..$ npvexp : num [1:1000] 0.871 0.908 0.88 0.912 0.92 ...
## ..$ ab : num [1:1000] 175 159 254 172 149 176 212 182 199 202 ...
## ..$ bb : num [1:1000] 775 791 696 778 801 774 738 768 751 748 ...
## ..$ cb : num [1:1000] 207 177 199 172 166 162 223 195 181 216 ...
## ..$ db : num [1:1000] 780 810 788 815 821 825 764 792 806 771 ...
## ..$ corr_RR: num [1:1000] 0.919 0.958 1.196 1.024 0.958 ...
## ..$ corr_OR: num [1:1000] 0.851 0.92 1.445 1.048 0.92 ...
## ..$ rr_se_b: num [1:1000] 0.0612 0.0629 0.0499 0.0597 0.0646 ...
## ..$ or_se_b: num [1:1000] 0.115 0.12 0.108 0.119 0.123 ...
## ..$ z : num [1:1000] 0.397 -0.718 -2.71 2.796 0.379 ...
## ..$ tot_RR : num [1:1000] 0.897 1.002 1.369 0.866 0.935 ...
## ..$ tot_OR : num [1:1000] 0.813 1.003 1.937 0.751 0.878 ...
## ..$ syst_RR: num [1:1000] 0.996 1.049 1.089 1.027 0.985 ...
## ..$ syst_OR: num [1:1000] 0.992 1.101 1.187 1.054 0.97 ...
## $ reps : num 1000
## $ fun : chr "probsens"
## $ warnings : NULL
## - attr(*, "class")= chr [1:3] "episensr" "episensr.probsens" "list"
Each of the lines from mod1$sim_df[, 5:8]
can then be
tabulated and fed to the next bias function. Be careful, as the number
of observations can quickly become unmanageable.
head(mod1$sim_df[, 5:8])
## A1 B1 C1 D1
## 1 195.2980 754.7020 204.2431 782.7569
## 2 176.7932 773.2068 169.7718 817.2282
## 3 213.7715 736.2285 194.0093 792.9907
## 4 168.7198 781.2802 167.8590 819.1410
## 5 156.1432 793.8568 166.3320 820.6680
## 6 143.5092 806.4908 154.8496 832.1504