Software

Diagnostic Test Se and Sp Estimation

  • 2 independent tests, 2 populations, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.2.2.
    Two independent tests, two populations
    Estimate Se and Sp of microscopic examination and PCR.
    Hui-Walter Model for N. salmonis in trout.
    Data source: Enøe C, Georgiadis MP, Johnson WO. Estimation of sensitivity and specificity of diagnostic tests and disease prevalence when the true disease state is unknown. Preventive Veterinary Medicine 2000; 45(1-2):61-81. DOI: 10.1016/S0167-5877(00)00117-3.

    model{
    y[1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q], n1)
    z[1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q], n2)
    p1[1,1] <- pi1*Seme*Sepcr + (1-pi1)*(1-Spme)*(1-Sppcr)
    p1[1,2] <- pi1*Seme*(1-Sepcr) + (1-pi1)*(1-Spme)*Sppcr
    p1[2,1] <- pi1*(1-Seme)*Sepcr + (1-pi1)*Spme*(1-Sppcr)
    p1[2,2] <- pi1*(1-Seme)*(1-Sepcr) + (1-pi1)*Spme*Sppcr
    p2[1,1] <- pi2*Seme*Sepcr + (1-pi2)*(1-Spme)*(1-Sppcr)
    p2[1,2] <- pi2*Seme*(1-Sepcr) + (1-pi2)*(1-Spme)*Sppcr
    p2[2,1] <- pi2*(1-Seme)*Sepcr + (1-pi2)*Spme*(1-Sppcr)
    p2[2,2] <- pi2*(1-Seme)*(1-Sepcr) + (1-pi2)*Spme*Sppcr
    Seme ~ dbeta(2.82, 2.49) ## Mode=0.55, 95% sure Seme < 0.85
    Spme ~ dbeta(15.7, 1.30) ## Mode=0.98, 95% sure Spme > 0.80
    pi2 ~ dbeta(1.73, 2.71) ## Mode=0.30, 95% sure pi2 > 0.08
    Sepcr ~ dbeta(8.29, 1.81) ## Mode=0.90, 95% sure Sepcr > 0.60
    Sppcr ~ dbeta(10.69, 2.71) ## Mode=0.85, 95% sure Sppcr > 0.60
    Z ~ dbern(tau1)
    pi1star ~ dbeta(1.27, 9.65) ## Mode=0.03, 95% sure pi1star < 0.30
    pi1 <- Z*pi1star
    }

    list(n2=30, n1=132, z=structure(.Data=c(3,0,24,3),.Dim=c(2,2)),
    y=structure(.Data=c(0,0,3,129),.Dim=c(2,2)), Q=2, tau1=0.95)
    list(Z=1, pi1star=0.03, pi2=0.30, Seme=0.55, Spme=0.98, Sepcr=0.90, Sppcr=0.85)

    node  mean     sd       MC error 2.50%   median   97.50%  start sample
    Seme  0.1745   0.0652   2.32E-04 0.06738 0.1678   0.3195  10001 100000
    Sepcr 0.9301   0.04626  1.86E-04 0.8173  0.9391   0.9919  10001 100000
    Spme  0.9914   0.007498 3.93E-05 0.9717  0.9935   0.9996  10001 100000
    Sppcr 0.963    0.01671  6.88E-05 0.9247  0.9651   0.9893  10001 100000
    pi1   0.009058 0.01219  5.79E-05 0       0.003916 0.04176 10001 100000
    pi2   0.8524   0.06658  2.56E-04 0.7027  0.8596   0.9596  10001 100000
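
    As a numerical cross-check of the prior elicitation (a minimal Python sketch using scipy, not part of the original WinBUGS program), each beta prior's mode and tail probability can be verified against the stated "Mode" and "95% sure" comments:

    from scipy.stats import beta

    # (name, a, b, stated mode, direction, stated 95% bound), taken from the
    # prior comments in the WinBUGS model above
    priors = [
        ("Seme",  2.82,  2.49, 0.55, "<", 0.85),
        ("Spme",  15.7,  1.30, 0.98, ">", 0.80),
        ("pi2",   1.73,  2.71, 0.30, ">", 0.08),
        ("Sepcr", 8.29,  1.81, 0.90, ">", 0.60),
        ("Sppcr", 10.69, 2.71, 0.85, ">", 0.60),
    ]
    for name, a, b, mode, op, bound in priors:
        m = (a - 1) / (a + b - 2)  # mode of Beta(a, b) when a, b > 1
        p = beta.cdf(bound, a, b) if op == "<" else beta.sf(bound, a, b)
        print(f"{name}: mode {m:.3f} (stated {mode}), P(x {op} {bound}) = {p:.3f}")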
  • 2 independent tests, 2 populations, no gold standard (TAGS) - frequentist approach
  • TAGS Instructions
    Worked Example

    Dubey et al. 1995 (Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036) compared 5 serologic tests for the diagnosis of toxoplasmosis in 1000 naturally exposed sows, using bioassay methods as the gold standard (definitive test). Bioassays were done in mice (all sows) and cats (183 sows) using cardiac muscle from the sampled sows. Samples in the study were collected in two batches: nos. 1-463 and 464-1000.

    If Toxoplasma gondii were isolated from either mice or cats, the sow was considered infected. A sow was considered non-infected if the bioassay results were negative. To demonstrate use of TAGS, we will use results of the bioassay test and 2 of the 5 serologic tests: the modified agglutination test (MAT) and the enzyme-linked immunosorbent assay (ELISA). These 2 serologic tests were the most accurate of the tests evaluated and are commonly used in screening of pigs for toxoplasmosis. The MAT was considered positive if the titer was >= 20 and the ELISA was positive if the OD value was > 0.36. The sensitivity (Se) and specificity (Sp) of the MAT using bioassay as the gold standard were calculated to be 82.9% and 90.2%, respectively. This represents the traditional approach that assumes that the combined bioassay (cat + mouse) has Se = Sp = 1.

    Calculations 
    The TAGS method requires a minimum of 2 populations with 2 conditionally independent tests. Use of the batch data (batch 1 = 463; batch 2 = 537) provides a logical way to create 2 populations of similar size in which the sensitivity and specificity of the tests should be equivalent. Cross-classified test results for the batches are in the following table:

    Batch  MAT+/      MAT+/      MAT-/      MAT-/
           Bioassay+  Bioassay-  Bioassay+  Bioassay-
     1     37         55         7          364
     2     104        26         22         385
    What are the sensitivity and specificity of the MAT using “TAGS” when the MAT and bioassay are the tests under consideration?

    Estimates for the MAT using TAGS are exactly the same as those from the traditional approach, i.e., Se = 82.9% and Sp = 90.2%. Also, TAGS estimated that the bioassay was perfectly sensitive and specific, exactly the same as the traditional approach. Hence, this provides evidence in support of the bioassay as a true gold standard.
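
    These traditional estimates can be reproduced directly from the table above; the following Python sketch (not part of TAGS) pools the two batches and treats the combined bioassay as a perfect gold standard:

    # MAT Se/Sp with the combined (cat + mouse) bioassay as the gold standard,
    # pooling batches 1 and 2 from the table above
    tp = 37 + 104   # MAT+ / Bioassay+
    fp = 55 + 26    # MAT+ / Bioassay-
    fn = 7 + 22     # MAT- / Bioassay+
    tn = 364 + 385  # MAT- / Bioassay-
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    print(f"Se = {se:.1%}, Sp = {sp:.1%}")  # Se = 82.9%, Sp = 90.2%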

    Now let's ignore the bioassay results and use the cross-classified data from the MAT and ELISA by batch as follows (1 pig had a missing ELISA value):
    Batch  MAT+/   MAT+/   MAT-/   MAT-/
           ELISA+  ELISA-  ELISA+  ELISA-
     1     67      25      41      329
     2     97      33      36      371

    What are the sensitivity and specificity of the MAT now, using “TAGS”, when the MAT and the ELISA are the tests under comparison?

    The MAT is estimated to have Se = 100% and Sp = 97.3%, which clearly are overestimates compared with the correct values obtained when the bioassay is used as the comparison.

    What is the most likely reason for the change in MAT estimates?

    Results of the two tests (MAT and ELISA) are positively correlated (dependent), conditional on infection status (see Gardner IA, Stryhn H, Lind P, Collins MT. Conditional dependence between tests affects the diagnosis and surveillance of animal diseases. Preventive Veterinary Medicine 2000; 45(1-2):107-122. DOI: 10.1016/S0167-5877(00)00119-7). This positive dependence between the tests results in overestimation of MAT accuracy with TAGS.

    The main conclusion is that it is inappropriate to use the TAGS software to compare 2 serologic tests without a gold standard, because the tests are likely to be dependent. Other methods (and software) that account for this dependence (2 conditionally dependent tests, 2 populations) are necessary to obtain unbiased estimates.
  • 2 independent tests, 2 populations, no gold standard - spreadsheet workbook
  • Excel workbook to estimate Se and Sp, and for frequentist sample size calculations.

  • 2 dependent tests, 1 population, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.3.3.1.
    Two dependent tests, one population.
    Estimation of the Se and Sp of two FAT tests.
    Classical swine fever virus.
    Data source: Bouma A, Stegeman JA, Engel B, de Kluijver EP, Elbers AR, De Jong MC. Evaluation of diagnostic tests for the detection of classical swine fever in the field without a gold standard. Journal of Veterinary Diagnostic Investigation 2001; 13(5):383-388. DOI: 10.1177/104063870101300503.

    model{
    x[1:4] ~ dmulti(p[1:4], n)
    p[1] <- pi*(Sefat1*Sefat2+covDp) + (1-pi)*((1-Spfat1)*(1-Spfat2)+covDn)
    p[2] <- pi*(Sefat1*(1-Sefat2)-covDp) + (1-pi)*((1-Spfat1)*Spfat2-covDn)
    p[3] <- pi*((1-Sefat1)*Sefat2-covDp) + (1-pi)*(Spfat1*(1-Spfat2)-covDn)
    p[4] <- pi*((1-Sefat1)*(1-Sefat2)+covDp) + (1-pi)*(Spfat1*Spfat2+covDn)
    ls <- (Sefat1-1)*(1-Sefat2)
    us <- min(Sefat1,Sefat2) - Sefat1*Sefat2
    lc <- (Spfat1-1)*(1-Spfat2)
    uc <- min(Spfat1,Spfat2) - Spfat1*Spfat2
    pi ~ dbeta(13.322, 6.281) ### Mode=0.70, 95% sure > 0.50
    Sefat1 ~ dbeta(9.628,3.876) ### Mode=0.75, 95% sure > 0.50
    Spfat1 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70
    Sefat2 ~ dbeta(9.628, 3.876) ### Mode=0.75, 95% sure > 0.50
    Spfat2 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70
    covDn ~ dunif(lc, uc)
    covDp ~ dunif(ls, us)
    rhoD <- covDp / sqrt(Sefat1*(1-Sefat1)*Sefat2*(1-Sefat2))
    rhoDc <- covDn / sqrt(Spfat1*(1-Spfat1)*Spfat2*(1-Spfat2))
    }

    list(n=214, x=c(121,6,16,71))
    list(pi=0.7, Sefat1=0.75, Spfat1=0.90, Sefat2=0.75, Spfat2=0.90)
     

    node   mean   sd      MC error 2.50%    median 97.50% start sample
    Sefat1 0.7691 0.06151 9.16E-04 0.6497   0.7693 0.887  10001 100000
    Sefat2 0.8147 0.06045 9.83E-04 0.6959   0.8156 0.9281 10001 100000
    Spfat1 0.8851 0.0581  5.35E-04 0.7533   0.8924 0.975  10001 100000
    Spfat2 0.8485 0.07411 5.83E-04 0.6856   0.8566 0.9667 10001 100000
    rhoD   0.6814 0.1484  0.001845 0.3135   0.7049 0.9071 10001 100000
    rhoDc  0.3629 0.2655  0.002054 -0.09339 0.361  0.858  10001 100000
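
    The uniform priors on covDp and covDn respect the range a covariance between two conditionally dependent binary tests can take. The Python sketch below (illustrative only; the Se values are rounded posterior medians, not fixed quantities in the model) shows the same bound and correlation algebra as the ls/us and rhoD lines above:

    import math

    def cov_bounds(p1, p2):
        # Valid covariance range for two Bernoulli variables with success
        # probabilities p1 and p2; the WinBUGS code uses (p1-1)*(1-p2) as the
        # lower bound, which is the binding term for the Se/Sp values here.
        lower = max(-p1 * p2, (p1 - 1) * (1 - p2))
        upper = min(p1, p2) - p1 * p2
        return lower, upper

    def corr(cov, p1, p2):
        return cov / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))

    se1, se2 = 0.77, 0.82  # roughly the posterior medians of Sefat1, Sefat2
    lo, hi = cov_bounds(se1, se2)
    print(f"covDp in [{lo:.4f}, {hi:.4f}]")
    print(f"rhoD in [{corr(lo, se1, se2):.3f}, {corr(hi, se1, se2):.3f}]")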

  • 2 dependent tests, 2 populations, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.3.1.
    Two dependent tests, two populations.
    Estimation of Se and Sp of ELISA and MAT.
    Toxoplasmosis in pigs.
    Data source: Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036.

    model{
      y1[1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q], n1)
      y2[1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q], n2)
      p1[1,1] <- pi1*eta11 + (1-pi1)*theta11
      p1[1,2] <- pi1*eta12 + (1-pi1)*theta12
      p1[2,1] <- pi1*eta21 + (1-pi1)*theta21
      p1[2,2] <- pi1*eta22 + (1-pi1)*theta22
      p2[1,1] <- pi2*eta11 + (1-pi2)*theta11
      p2[1,2] <- pi2*eta12 + (1-pi2)*theta12
      p2[2,1] <- pi2*eta21 + (1-pi2)*theta21
      p2[2,2] <- pi2*eta22 + (1-pi2)*theta22
      eta11 <- lambdaD*eta1
      eta12 <- eta1 - eta11
      eta21 <- gammaD*(1-eta1)
      eta22 <- 1 - eta11 - eta12 - eta21
      theta11 <- 1 - theta12 - theta21 - theta22
      theta12 <- gammaDc*(1-theta1)
      theta21 <- theta1 - theta22
      theta22 <- lambdaDc* theta1
      eta2 <- eta11 + eta21
      theta2 <- theta22 + theta12
      rhoD <- (eta11 - eta1*eta2) / sqrt(eta1*(1-eta1)*eta2*(1-eta2))
      rhoDc <- (theta22 - theta1*theta2) / sqrt(theta1*(1-theta1)*theta2*(1-theta2))
      pi1 ~ dbeta(1.3, 5)
      pi2 ~ dbeta(1.5, 3)
      eta1 ~ dbeta(24.09, 5.73)
      theta1 ~ dbeta(23.05, 3.45)
      lambdaD ~ dbeta(1.376, 1.161) ## Mode=0.70, 5th %tile=0.10
      gammaD ~ dbeta(1.376, 1.161)
      lambdaDc ~ dbeta(2, 1.176) ## Mode=0.85, 5th %tile=0.20
      gammaDc ~ dbeta(2, 1.176)
    }

    list(n1=462, n2=537, Q=2,
    y1=structure(.Data=c(
    67,25,41,329),.Dim=c(2,2)),
    y2=structure(.Data=c(
    97,33,36,371),.Dim=c(2,2)))
    list(pi1=0.07, pi2=0.20, eta1=0.83, theta1=0.90, lambdaD=0.50, lambdaDc=0.50, gammaD=0.50, gammaDc=0.50)

    node   mean   sd      MC error 2.50%   median 97.50% start sample
    eta1   0.8075 0.07051 4.54E-04 0.6516  0.8143 0.9246 10001 100000
    eta2   0.7403 0.1159  0.00108  0.4863  0.7498 0.9326 10001 100000
    theta1 0.897  0.04269 6.07E-04 0.809   0.9016 0.9688 10001 100000
    theta2 0.8629 0.04344 6.18E-04 0.7756  0.867  0.9387 10001 100000
    rhoD   0.3121 0.2676  0.00196  -0.1906 0.3249 0.7918 10001 100000
    rhoDc  0.4015 0.1916  0.002407 -0.004  0.4337 0.6932 10001 100000
    pi1    0.1438 0.05738 7.88E-04 0.0269  0.1483 0.2484 10001 100000
    pi2    0.1921 0.05861 7.95E-04 0.06753 0.1966 0.2982 10001 100000
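
    The eta/theta parameterization above builds the joint probabilities of the two test results from the marginal Se (or Sp) of test 1 and the conditional probabilities lambdaD, gammaD (lambdaDc, gammaDc). A small Python sketch with illustrative values (not the posteriors above) shows that the four cells automatically form a valid probability table and how rhoD follows:

    import math

    eta1, lambdaD, gammaD = 0.81, 0.90, 0.30  # illustrative values only

    eta11 = lambdaD * eta1              # T1+, T2+ among infected
    eta12 = eta1 - eta11                # T1+, T2-
    eta21 = gammaD * (1 - eta1)         # T1-, T2+
    eta22 = 1 - eta11 - eta12 - eta21   # T1-, T2-
    eta2 = eta11 + eta21                # implied marginal Se of test 2

    rhoD = (eta11 - eta1 * eta2) / math.sqrt(
        eta1 * (1 - eta1) * eta2 * (1 - eta2))
    print(f"cells sum to {eta11 + eta12 + eta21 + eta22:.3f}, "
          f"eta2 = {eta2:.3f}, rhoD = {rhoD:.3f}")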

  • 3 tests (2 dependent and 1 independent), 1 population, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".
     

    Example from section 3.4.2.
    Three tests (two dependent tests, one test independent of these two), one population.
    Estimation of Se and Sp of two FAT tests given a third conditionally independent test, virus isolation (VI).
    Classical swine fever virus.
    Data source: Bouma A, Stegeman JA, Engel B, de Kluijver EP, Elbers AR, De Jong MC. Evaluation of diagnostic tests for the detection of classical swine fever in the field without a gold standard. Journal of Veterinary Diagnostic Investigation 2001; 13(5):383-388. DOI: 10.1177/104063870101300503.

    model{
      y[1:K, 1:K, 1:K] ~ dmulti(p[1:K, 1:K, 1:K], n)
      p[1,1,1] <- pi*SeVI*(Sefat1*Sefat2+covDp) + (1-pi)*(1-SpVI)*((1-Spfat1)*(1-Spfat2)+covDn)
      p[1,2,1] <- pi*SeVI*(Sefat1*(1-Sefat2)-covDp) + (1-pi)*(1-SpVI)*((1-Spfat1)*Spfat2-covDn)
      p[1,1,2] <- pi*(1-SeVI)*(Sefat1*Sefat2+covDp) + (1-pi)*SpVI*((1-Spfat1)*(1-Spfat2)+covDn)
      p[1,2,2] <- pi*(1-SeVI)*(Sefat1*(1-Sefat2)-covDp) + (1-pi)*SpVI*((1-Spfat1)*Spfat2-covDn)
      p[2,1,1] <- pi*SeVI*((1-Sefat1)*Sefat2-covDp) + (1-pi)*(1-SpVI)*(Spfat1*(1-Spfat2)-covDn)
      p[2,2,1] <- pi*SeVI*((1-Sefat1)*(1-Sefat2)+covDp) + (1-pi)*(1-SpVI)*(Spfat1*Spfat2+covDn)
      p[2,1,2] <- pi*(1-SeVI)*((1-Sefat1)*Sefat2-covDp) + (1-pi)*SpVI*(Spfat1*(1-Spfat2)-covDn)
      p[2,2,2] <- pi*(1-SeVI)*((1-Sefat1)*(1-Sefat2)+covDp) + (1-pi)*SpVI*(Spfat1*Spfat2+covDn)
      ls <- (Sefat1-1)*(1-Sefat2)
      us <- min(Sefat1,Sefat2) - Sefat1*Sefat2
      lc <- (Spfat1-1)*(1-Spfat2)
      uc <- min(Spfat1,Spfat2) - Spfat1*Spfat2
      rhoD <- covDp / sqrt(Sefat1*(1-Sefat1)*Sefat2*(1-Sefat2))
      rhoDc <- covDn / sqrt(Spfat1*(1-Spfat1)*Spfat2*(1-Spfat2))
      pi ~ dbeta(13.322, 6.281) ## Mode=0.70, 95% sure > 0.50
      Sefat1 ~ dbeta(9.628, 3.876) ## Mode=0.75, 95% sure > 0.50
      Spfat1 ~ dbeta(15.034, 2.559) ## Mode=0.90, 95% sure > 0.70
      Sefat2 ~ dbeta(9.628, 3.876) ## Mode=0.75, 95% sure > 0.50
      Spfat2 ~ dbeta(15.034, 2.559) ## Mode=0.90, 95% sure > 0.70
      SeVI ~ dbeta(15.034, 2.559) ## Mode=0.90, 95% sure > 0.70
      SpVI ~ dbeta(151.769, 4.077) ## Mode=0.98, 95% sure > 0.95
      covDn ~ dunif(lc, uc)
      covDp ~ dunif(ls, us)
    }

     

    list(n=214, K=2)
    list(pi=0.70, Sefat1=0.75, Spfat1=0.90, Sefat2=0.75, Spfat2=0.90, SeVI=0.90, SpVI=0.98)

    y[,1,1] y[,1,2] y[,2,1] y[,2,2]
    96 25 3 3
    12 4 4 67
    END

     

    node   mean   sd      MC error 2.50%   median 97.50% start sample
    SeVI   0.8339 0.042   3.80E-04 0.7538  0.8328 0.9204 10001 100000
    Sefat1 0.8476 0.03171 1.42E-04 0.7817  0.849  0.9054 10001 100000
    Sefat2 0.9228 0.02507 1.53E-04 0.8681  0.925  0.9656 10001 100000
    SpVI   0.9764 0.01124 4.44E-05 0.95    0.9781 0.9933 10001 100000
    Spfat1 0.8896 0.04907 6.49E-04 0.7806  0.895  0.969  10001 100000
    Spfat2 0.8944 0.05052 6.81E-04 0.7812  0.9004 0.9747 10001 100000
    rhoD   0.2827 0.1449  7.42E-04 -0.0085 0.2862 0.5535 10001 100000
    rhoDc  0.5822 0.2195  0.002261 0.05866 0.6257 0.8979 10001 100000

  • 3 tests (2 dependent and 1 independent), 2 populations, no gold standard
  • WinBUGS 1.4 code for the "three tests (two dependent tests, one test independent of these two), two populations" scenario, based on a modification of the example from section 3.3.1 of "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Estimation of Se and Sp of ELISA and MAT for toxoplasmosis in pigs.

    The code uses the parameterization of Dendukuri and Joseph based on Se and Sp covariances, and includes a third independent test, the combined results of cat and mouse bioassay, which was used as the reference standard in the original test evaluation study.

    Data source: Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036.

     

    model {
      y1[1:Q, 1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q, 1:Q], n1)
      y2[1:Q, 1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q, 1:Q], n2)
      p1[1,1,1] <- Prev1*(SeT1*SeT2+GSe)*SeT3 + (1-Prev1)*((1-SpT1)*(1-SpT2)+GSp)*(1-SpT3)
      p1[1,1,2] <- Prev1*(SeT1*SeT2+GSe)*(1-SeT3) + (1-Prev1)*((1-SpT1)*(1-SpT2)+GSp)*SpT3
      p1[1,2,1] <- Prev1*(SeT1*(1-SeT2)-GSe)*SeT3 + (1-Prev1)*((1-SpT1)*SpT2-GSp)*(1-SpT3)
      p1[1,2,2] <- Prev1*(SeT1*(1-SeT2)-GSe)*(1-SeT3) + (1-Prev1)*((1-SpT1)*SpT2-GSp)*SpT3
      p1[2,1,1] <- Prev1*((1-SeT1)*SeT2-GSe)*SeT3 + (1-Prev1)*(SpT1*(1-SpT2)-GSp)*(1-SpT3)
      p1[2,1,2] <- Prev1*((1-SeT1)*SeT2-GSe)*(1-SeT3) + (1-Prev1)*(SpT1*(1-SpT2)-GSp)*SpT3
      p1[2,2,1] <- Prev1*((1-SeT1)*(1-SeT2)+GSe)*SeT3 + (1-Prev1)*(SpT1*SpT2+GSp)*(1-SpT3)
      p1[2,2,2] <- Prev1*((1-SeT1)*(1-SeT2)+GSe)*(1-SeT3) + (1-Prev1)*(SpT1*SpT2+GSp)*SpT3
      p2[1,1,1] <- Prev2*(SeT1*SeT2+GSe)*SeT3 + (1-Prev2)*((1-SpT1)*(1-SpT2)+GSp)*(1-SpT3)
      p2[1,1,2] <- Prev2*(SeT1*SeT2+GSe)*(1-SeT3) + (1-Prev2)*((1-SpT1)*(1-SpT2)+GSp)*SpT3
      p2[1,2,1] <- Prev2*(SeT1*(1-SeT2)-GSe)*SeT3 + (1-Prev2)*((1-SpT1)*SpT2-GSp)*(1-SpT3)
      p2[1,2,2] <- Prev2*(SeT1*(1-SeT2)-GSe)*(1-SeT3) + (1-Prev2)*((1-SpT1)*SpT2-GSp)*SpT3
      p2[2,1,1] <- Prev2*((1-SeT1)*SeT2-GSe)*SeT3 + (1-Prev2)*(SpT1*(1-SpT2)-GSp)*(1-SpT3)
      p2[2,1,2] <- Prev2*((1-SeT1)*SeT2-GSe)*(1-SeT3) + (1-Prev2)*(SpT1*(1-SpT2)-GSp)*SpT3
      p2[2,2,1] <- Prev2*((1-SeT1)*(1-SeT2)+GSe)*SeT3 + (1-Prev2)*(SpT1*SpT2+GSp)*(1-SpT3)
      p2[2,2,2] <- Prev2*((1-SeT1)*(1-SeT2)+GSe)*(1-SeT3) + (1-Prev2)*(SpT1*SpT2+GSp)*SpT3
      LowerGSe <- (SeT1 - 1)*(1 - SeT2)
      UpperGSe <- min(SeT1,SeT2) - SeT1*SeT2
      LowerGSp <- (SpT1 - 1)*(1 - SpT2)
      UpperGSp <- min(SpT1,SpT2) - SpT1*SpT2
      SeT1 ~ dbeta(24.05, 5.72) # mode = 0.83, 95% sure > 0.68
      SpT1 ~ dbeta(22.99, 3.44) # mode = 0.90, 95% sure > 0.75
      SeT2 ~ dbeta(1,1)
      SpT2 ~ dbeta(1,1)
      SeT3 ~ dbeta(1,1)
      SpT3 ~ dbeta(1,1)
      Prev1 ~ dbeta(1,1)
      Prev2 ~ dbeta(1,1)
      # Equal probability of any value between min and max allowed for the Se covariance
      GSe ~ dunif(LowerGSe, UpperGSe)
      # Equal probability of any value between min and max allowed for the Sp covariance
      GSp ~ dunif(LowerGSp, UpperGSp)
    }

    # data
    # T1 = MAT; T2 = ELISA; T3 = Cat/Mouse Bioassay
    # n1 = batch 1; n2 = batch 2
    list(n1=461, n2=537, Q=2, y1=structure(.Data=c(22,45,6,19,2,39,1,327),.Dim=c(2,2,2)),
    y2=structure(.Data=c(51,46,11,22,2,34,12,359),.Dim=c(2,2,2)))

    # initials 1
    list(SeT1=0.7, SpT1=0.7, SeT2=0.7, SpT2=0.7, SeT3=0.7, SpT3=0.7, Prev1=0.2, Prev2=0.2, GSe=0.1, GSp=0.1)

    # initials 2 (alternative)
    list(SeT1=0.5, SpT1=0.5, SeT2=0.5, SpT2=0.5, SeT3=0.5, SpT3=0.5, Prev1=0.1, Prev2=0.4, GSe=0.05, GSp=0.05)

     

    node  mean    sd       MC error 2.50%   median  97.50%  start sample
    GSe   0.06076 0.02339  1.57E-04 0.0124  0.06158 0.105   20001 100000
    GSp   0.05713 0.01561  2.90E-04 0.01853 0.05972 0.08102 20001 100000
    Prev1 0.09667 0.03023  5.09E-04 0.05455 0.09059 0.1732  20001 100000
    Prev2 0.1789  0.03437  5.45E-04 0.1225  0.1749  0.2545  20001 100000
    SeT1  0.8581  0.03721  2.51E-04 0.7834  0.8585  0.9298  20001 100000
    SeT2  0.7318  0.04657  2.51E-04 0.639   0.7323  0.822   20001 100000
    SeT3  0.7578  0.1371   0.002402 0.4864  0.7639  0.9839  20001 100000
    SpT1  0.8826  0.0259   4.83E-04 0.8421  0.8785  0.9447  20001 100000
    SpT2  0.8387  0.02217  3.82E-04 0.8014  0.8362  0.8899  20001 100000
    SpT3  0.9935  0.005286 3.42E-05 0.9804  0.9948  0.9998  20001 100000

Disease Freedom

  • Bayesfreecalc: Posterior probability of herd freedom from disease and sample size calculation
  • The self-extracting Bayesfreecalc software for Microsoft Windows includes a pair of programs: Bayesfreecalc1, which estimates the posterior probability of disease freedom, and Bayesfreecalc2, which performs sample size calculations. Bayesfreecalc is a GUI program that does not require other ancillary statistical software (such as S-plus or WinBUGS). Follow the steps below to install Bayesfreecalc1 and Bayesfreecalc2 on your computer.

      Step 1: Download the FreeBS self-extracting installation file (FBSinst.zip). We suggest saving the file to your computer (e.g., into "C:\My Downloads\", or "C:\My Documents\", or onto the Desktop) and running the installation program from your local drive.

      Step 2: You should now be able to double-click the "FBSinst.exe" copy you just downloaded to your computer and install Bayesfreecalc into "C:\FreeBS".

     

    Worked examples are given in the Help file supplied with the program Bayesfreecalc1.

    Download FBSinst.zip: posterior probability of herd freedom from disease 

    Download DBFree1.zip: sample size calculation to estimate disease freedom 

     

    Program: Bayesfreecalc1

    The software program Bayesfreecalc1 determines the posterior probability that the prevalence in a population of size N, given a number of positive reactors (test-positive individuals) out of a sample of size n, truly exceeds some predetermined threshold/cut-off value. This is hypergeometric sampling. The calculator can also be used for binomial sampling, where N is assumed large relative to n. Each of the n individuals sampled is tested for a given agent and classified as test positive or test negative. The screening test used is usually imperfect, and there may be false-positive and false-negative results. Prior distributions reflecting the uncertainty about the diagnostic-test sensitivity and specificity are incorporated into the calculation. Analytic results provide the posterior probability that prevalence in the overall population sampled exceeds a given cutoff/threshold value, and the number of positive reactors expected out of the n individuals sampled.

    Expanded Description

    The basic situation involves collecting a binomial or hypergeometric-like sample from a population of individuals. Each individual sampled is tested for a given agent and classified as test positive or test negative. The screening test used is usually imperfect and there may be false positive and negative results. Our goal is to determine whether the prevalence of the agent in the population sampled exceeds a given cutoff/threshold value. If the prevalence is below the threshold, the population is considered to be "disease free" and no action is regarded as necessary. Alternatively, if the prevalence does exceed this value, an intervention may be considered imperative. Precise details associated with the calculations discussed below for both Bayesfreecalc1 and Bayesfreecalc2 are provided in the article "Johnson WO, Su CL, Gardner IA, Christensen R. Sample size calculations for surveys to substantiate freedom of populations from infectious agents. Biometrics 2004; 60(1):165-171. DOI: 10.1111/j.0006-341X.2004.00143.x".

     

    The program Bayesfreecalc1 is designed to calculate the posterior probability of "herd infection status," given (a) prior knowledge about the prevalence of the population and the sensitivity and specificity of the diagnostic test used to sample the population, and (b) information about the number of positive reactors (T+), the number of individuals sampled (n), and the overall population size (N). This program consists of 3 "tabs": the "Determine Priors" tab, the "Options and Output" tab, and the "Extra Options" tab. The "Determine Priors" tab is used to specify prior knowledge about prevalence and the sensitivity and specificity of the diagnostic test; ultimately, beta prior distributions are elicited from probability statements provided from either subjective expert opinion or previous studies. The "Options and Output" tab is used to supply information about the number sampled, the number in the population, and the number of reactors, and to obtain a posterior probability of "herd infection status." The "Extra Options" tab is used to specify information about the algorithm itself (i.e., Monte Carlo sample size, maximum sample size, and the "burn-in").
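
    For intuition, a simplified Python sketch of the binomial version of the calculation follows. This is not the Bayesfreecalc1 algorithm itself (see Johnson et al. 2004 for the exact approach); it approximates the posterior by importance sampling, weighting prior draws of (prevalence, Se, Sp) by the binomial likelihood of observing t reactors out of n. All prior parameters below are illustrative placeholders:

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(1)
    M = 200_000
    n, t, threshold = 60, 1, 0.02          # sampled, reactors, prevalence cutoff

    prev = rng.beta(1, 9, size=M)          # illustrative priors
    se = rng.beta(20, 4, size=M)
    sp = rng.beta(40, 2, size=M)

    ap = prev * se + (1 - prev) * (1 - sp) # probability an animal tests positive
    w = binom.pmf(t, n, ap)                # likelihood weight per prior draw
    post = np.sum(w * (prev > threshold)) / np.sum(w)
    print(f"posterior P(prevalence > {threshold}) = {post:.3f}")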

     

Logistic Regression

  • Outcome measured perfectly - ordinary logistic model
  • We illustrate the model and methods developed in "McInturff P, Johnson WO, Cowling D, Gardner IA. Modelling risk when binary outcomes are subject to error. Statistics in Medicine 2004; 23(7):1095-1109. DOI: 10.1002/sim.1656" using the smoking cessation example.

     

    The following WinBUGS code is presented for the logistic regression model when the outcome is measured perfectly. It illustrates standard logistic regression modeling and inference, and uses a so-called “noninformative” prior on the regression coefficients. Assume that Se = 1 and Sp = 1, and that the prior for the vector of regression coefficients beta is multivariate normal (MVN) with mean 0 and independent components, each with precision 1.0E-06 (variance 10^6), as coded below. We begin by assuming that the outcome variable y is not subject to misclassification error.

     

    Caveat: This prior works when sample sizes are large, as in this instance. With smaller sample sizes, more care is needed to construct a noninformative prior. This is accomplished by standardizing all continuous covariates: subtract off the sample mean and divide by the sample standard deviation for each variable, as in the sketch below. Then, standard normal priors on the regression coefficients can be shown to be reasonably noninformative.
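
    A minimal Python sketch of that standardization (with a hypothetical continuous covariate; the covariates in this example are all binary indicators, so no standardization is applied to them):

    import numpy as np

    age = np.array([23.0, 31.0, 45.0, 27.0, 52.0])   # hypothetical ages
    age_std = (age - age.mean()) / age.std(ddof=1)   # center, then scale by SD
    print(age_std.round(2))  # a standard normal prior on this coefficient is weak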

     

    # N is the number of individuals in the data set,
    # y is a 0,1 binary (or Bernoulli) outcome,
    # pi is the true probability of D (“Disease or Infection”),
    # beta[1] through beta[6] are the regression coefficients,
    # x2 through x6 are covariates
    # y = success in cessation program
    # beta[1] = intercept (x1 = 1)
    # x2 = Age level 2: 20-29 years
    # x3 = Age level 3: >= 30 years
    # x4 = Education level 2: HS grad
    # x5 = Education level 3: some college
    # x6 = smoking history < 1 pack/day

    model {
      for (i in 1:N) {
        y[i] ~ dbin(pi[i], n[i])
        logit(pi[i]) <- beta[1] + beta[2]*x2[i] + beta[3]*x3[i]
                        + beta[4]*x4[i] + beta[5]*x5[i] + beta[6]*x6[i]
      }
      beta[1] ~ dnorm(0, 1.0E-06)
      beta[2] ~ dnorm(0, 1.0E-06)
      beta[3] ~ dnorm(0, 1.0E-06)
      beta[4] ~ dnorm(0, 1.0E-06)
      beta[5] ~ dnorm(0, 1.0E-06)
      beta[6] ~ dnorm(0, 1.0E-06)
    }
    # inits
    list(beta=c(0,0,0,0,0,0))

     

    # data

    list(

    y=c(

      0,1,0,1,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,

      0,0,1,0,0,0,1,0,1,1,1,0,1,0,1,0,0,0,1,0,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,

      1,0,1,0,1,0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,0,0,1,0,1,1,0,1,1,0,1,0,0,0,0,

      1,0,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,

      0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,

      0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,1,1,1,0,0,0,0,0,1,1,0,0,1,

      0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,

      1,0,1,0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,1,1,0,0,0,0,1,0,1,1,0,0,0,1,1,

      0,1,0,1,1,0,0,1,0,0,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,0,1,0,

      1,0,0,0,0,1,0,0,0,1,0,1,1,0,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,1,0,0,0,1,0,

      0,0,0,0,0,0,0,1,0,1,1),

    x2=c(

      0,1,1,1,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,

      1,0,1,1,1,1,1,1,0,0,1,1,0,1,1,1,1,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,1,1,1,

      1,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,0,0,0,

      0,1,0,1,1,1,1,1,1,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,0,0,1,

      1,1,0,1,1,1,1,0,1,1,1,1,0,0,1,1,0,1,1,1,1,0,0,1,1,0,0,1,1,0,0,1,1,1,1,

      0,0,1,1,0,1,0,1,1,0,1,1,1,1,0,1,1,1,0,1,0,1,0,0,1,1,1,1,1,1,0,1,1,1,1,

      0,1,0,0,0,1,1,0,1,1,0,1,1,1,0,1,0,1,1,0,1,0,1,0,0,0,1,1,1,1,1,1,1,1,1,

      1,0,1,1,1,1,0,1,1,1,0,0,1,1,1,1,1,1,1,0,1,1,0,1,1,0,1,0,0,1,0,1,0,0,0,

      0,0,1,0,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,1,1,1,1,0,1,1,1,0,0,0,0,0,0,1,0,

      0,1,1,0,1,0,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,

      1,1,0,1,1,1,0,0,0,0,1),

    x3=c(

      1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,

      0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,

      0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,1,

      0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,

      0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,

      1,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,

      0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,

      0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,1,0,

      1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,1,

      1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,

      0,0,0,0,0,0,0,0,1,0,0),

    x4=c(

      0,1,1,1,0,1,1,1,0,1,1,0,1,0,0,1,0,0,0,0,1,1,1,0,0,0,1,1,1,1,1,0,0,0,1,

      0,0,0,1,1,0,1,0,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,0,1,0,1,0,0,1,1,1,0,1,1,

      0,0,0,1,1,1,1,1,0,1,0,0,0,1,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,1,0,0,0,

      1,0,1,1,1,0,1,0,0,1,1,0,0,0,0,0,1,1,1,0,0,0,1,1,0,0,1,1,0,0,0,1,1,1,1,

      1,0,0,0,1,1,0,1,0,0,0,1,1,0,0,1,0,0,1,0,0,0,0,0,0,0,1,1,1,0,1,0,0,1,1,

      0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0,1,1,0,0,1,1,1,1,1,0,0,0,0,1,0,1,

      0,0,0,0,0,1,0,0,0,1,0,0,1,1,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,1,0,0,

      1,0,1,1,0,1,0,1,1,1,1,0,1,0,0,1,0,0,1,1,1,1,1,0,1,1,0,1,1,0,0,1,1,0,0,

      1,1,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,0,1,1,0,0,1,1,0,

      0,1,1,0,1,0,1,1,1,1,0,1,1,0,1,1,0,0,1,1,0,1,1,1,0,1,1,1,0,1,1,0,1,0,1,

      1,1,1,1,1,0,0,0,1,0,1),

    x5=c(

      0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,

      1,0,1,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,

      0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,1,0,1,

      0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,1,0,0,1,1,1,0,0,0,0,

      0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,1,1,0,1,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,

      1,0,1,0,1,0,0,0,0,0,0,1,0,1,0,0,1,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,

      0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,

      0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,

      0,0,0,1,1,1,0,0,0,1,1,1,0,1,0,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,0,0,1,

      0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,1,0,

      0,0,0,0,0,0,0,0,0,0,0),

    x6=c(

      0,1,1,1,0,1,1,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,1,1,0,0,0,1,

      1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,1,0,0,1,0,0,0,1,1,0,1,0,1,0,1,

      1,0,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,

      0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,

      1,0,0,1,1,0,1,1,1,0,0,1,1,1,1,1,1,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,1,

      1,1,1,0,0,1,0,0,1,1,0,1,1,0,0,1,1,1,1,1,0,0,0,1,1,1,0,1,0,1,1,1,1,0,1,

      1,1,1,0,1,1,0,1,1,1,0,0,0,1,0,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,0,1,1,

      1,0,1,1,1,1,1,0,1,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,0,1,0,0,1,0,0,1,1,1,

      0,1,1,1,1,1,0,1,0,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,

      1,0,0,1,0,1,1,0,0,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,0,1,1,0,0,0,1,1,

      1,1,0,0,1,1,1,1,1,1,1),

    n=c(

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,1,1,1,1,1,1,1,1,1,1),

    N=361)

     

    node    mean    sd     MC error 2.50%   median  97.50%  start sample
    beta[1] 1.488   0.3097 0.009837 0.8892  1.484   2.104   1001  10000
    beta[2] -0.4751 0.3762 0.0134   -1.222  -0.4717 0.2483  1001  10000
    beta[3] -0.1799 0.4521 0.01439  -1.071  -0.1702 0.699   1001  10000
    beta[4] 0.0934  0.3271 0.007282 -0.5429 0.08938 0.7477  1001  10000
    beta[5] -0.4274 0.3923 0.008005 -1.185  -0.4263 0.3433  1001  10000
    beta[6] -1.502  0.391  0.01609  -2.275  -1.499  -0.7378 1001  10000

  • Outcome based on imperfect test results, and measured imperfectly or with “error”
  • We illustrate the model and methods developed in "McInturff P, Johnson WO, Cowling D, Gardner IA. Modelling risk when binary outcomes are subject to error. Statistics in Medicine 2004; 23(7):1095-1109. DOI: 10.1002/sim.1656" using the smoking cessation example.

    The following WinBUGS code is presented for the logistic regression model when the outcome is measured imperfectly or with “error”. It illustrates logistic regression modeling and inference where the binary outcome indicating “success” may be incorrect, and uses an informative prior on the regression coefficients.

    The binary outcome may be 1, indicating, for example, the presence of infection when in fact the infection is not present, and vice versa. In this instance, we incorporate information about the proportion of the time that the outcome indicates the infection is there when in fact it is (i.e., the Se), and also the proportion of the time that the outcome indicates the infection is absent when it is indeed absent (i.e., the Sp). Independent beta priors, reflecting the available scientific knowledge, are placed on these proportions. Replacing the priors for Se and Sp with Se = Sp = 1.0 results in a standard binomial regression analysis.

    The latent or unobserved true infection status is unknown but is still modeled with a logistic regression (or possibly a probit or complementary log-log regression, as desired). Thus, globally changing logit to cloglog in the WinBUGS code below will convert the model to a cloglog regression model. Changing logit to g(·), where g(·) is a link function supported by WinBUGS, will give an analysis corresponding to that link.

     

    We use “Conditional Means Priors” (see Bedrick EJ, Christensen R, Johnson WO. A new perspective on priors for generalized linear models. Journal of the American Statistical Association 1996; 91(436):1450-1460. DOI: 10.1080/01621459.1996.10476713) for the regression coefficients. The principle behind these priors is to specify a prior distribution on parameters that are easy for scientists to think about, rather than on regression coefficients, which are not easy to think about directly. For example, consider a situation with a single covariate, say age. The logistic regression model relates the logit of the probability of an event, say infection, to a linear function of age. There is an intercept and a slope corresponding to age, and we require a joint prior distribution on both. We ask the experimenter to think about the prevalence of infection for two types of individuals: one of average age, and one who is, say, one standard deviation above the mean. We elicit independent beta distributions on these two probabilities. Because there is a one-to-one correspondence between these two probabilities and the regression intercept and slope, we are able to induce an informative prior distribution on the regression coefficients. In general, this is accomplished for p-1 covariates by specifying independent beta priors on p probabilities of infection corresponding to distinct covariate combinations. These are specified in WinBUGS; all that remains is to relate the regression coefficients to these probabilities in WinBUGS. Again, see Bedrick et al. for details. In the case of logistic regression, these priors result in data augmentation priors.

    There are N = 361 observations in the data below. The prior is a BCJ (Bedrick-Christensen-Johnson) prior, elicited based on p-1 = 5 covariates, so there are 6 probabilities used to induce a prior on the 6 regression coefficients; these probabilities are indexed as p[1] to p[6]. The data are given below and are taken from McInturff et al. The results here differ slightly from those in the paper because this was a separate Monte Carlo run.
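
    The induced-prior computation itself is just a linear map: beta = xinv %*% logit(p). The Python sketch below reproduces it using the xinv matrix and the initial p values supplied with the model (WinBUGS fills .Data in row-major order); inverting xinv recovers the 0/1 covariate-combination matrix, up to rounding of the near-zero entries:

    import numpy as np

    def logit(p):
        return np.log(p / (1 - p))

    # xinv exactly as in the WinBUGS data list below (6x6, row-major)
    xinv = np.array([
        -1.0, 1.94e-16, 3.33e-16, 2.57e-16, 1.0, 1.0,
        -7.77e-16, 2.50e-16, 1.0, -4.86e-17, 3.33e-16, 1.0,
        -1.0, 1.0, 1.0, -3.75e-16, -3.33e-16, 1.0,
        1.0, -5.83e-16, -5.55e-16, -9.02e-17, -1.0, 8.88e-16,
        4.16e-16, -6.11e-16, -3.89e-16, 1.0, -1.0, 5.41e-16,
        1.0, 1.67e-16, -1.0, 3.33e-16, 5.55e-17, 7.77e-16,
    ]).reshape(6, 6)

    p = np.array([0.33, 0.39, 0.14, 0.44, 0.18, 0.26])  # the initial p values used below
    print((xinv @ logit(p)).round(3))   # induced beta[1..6]
    print(np.linalg.inv(xinv).round())  # recovers the covariate-combination matrix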

     

    # N is the number of individuals in the data set,
    # y is a 0,1 binary (or Bernoulli) outcome,
    # q is the probability that y = 1,
    # pi is the true probability of D (“Disease or Infection”),
    # beta[1] through beta[6] are the regression coefficients,
    # x2 through x6 are covariates,
    # xinv is the inverse of the covariate combination matrix, which
    # must be calculated elsewhere and entered as data.
    # y = success in cessation program
    # x1 = 1 (intercept)
    # x2 = Age level 2: 20-29 years
    # x3 = Age level 3: >= 30 years
    # x4 = Education level 2: HS grad
    # x5 = Education level 3: some college
    # x6 = smoking history < 1 pack/day

     

    model {
      for (i in 1:N) {
        y[i] ~ dbern(q[i]) # specification of the LR model with error
        q[i] <- pi[i]*Se + (1-pi[i])*(1-Sp)
        logit(pi[i]) <- beta[1] + x2[i]*beta[2] + x3[i]*beta[3]
                        + x4[i]*beta[4] + x5[i]*beta[5] + x6[i]*beta[6]
      }
      Se ~ dbeta(99, 1) # the priors are specified for Se and Sp
      Sp ~ dbeta(14, 2)
      p[1] ~ dbeta(8, 15) # specification of the prior on 6 probabilities of smoking
      p[2] ~ dbeta(10, 15) # cessation for 6 covariate combinations
      p[3] ~ dbeta(3, 13)
      p[4] ~ dbeta(8, 10)
      p[5] ~ dbeta(4, 15)
      p[6] ~ dbeta(6, 15)
      # relate the regression coefficients to the probabilities on which the prior
      # was specified; this induces an informative prior on the coefficients
      beta[1] <- xinv[1,1]*logit(p[1]) + xinv[1,2]*logit(p[2]) + xinv[1,3]*logit(p[3])
                 + xinv[1,4]*logit(p[4]) + xinv[1,5]*logit(p[5]) + xinv[1,6]*logit(p[6])
      beta[2] <- xinv[2,1]*logit(p[1]) + xinv[2,2]*logit(p[2]) + xinv[2,3]*logit(p[3])
                 + xinv[2,4]*logit(p[4]) + xinv[2,5]*logit(p[5]) + xinv[2,6]*logit(p[6])
      beta[3] <- xinv[3,1]*logit(p[1]) + xinv[3,2]*logit(p[2]) + xinv[3,3]*logit(p[3])
                 + xinv[3,4]*logit(p[4]) + xinv[3,5]*logit(p[5]) + xinv[3,6]*logit(p[6])
      beta[4] <- xinv[4,1]*logit(p[1]) + xinv[4,2]*logit(p[2]) + xinv[4,3]*logit(p[3])
                 + xinv[4,4]*logit(p[4]) + xinv[4,5]*logit(p[5]) + xinv[4,6]*logit(p[6])
      beta[5] <- xinv[5,1]*logit(p[1]) + xinv[5,2]*logit(p[2]) + xinv[5,3]*logit(p[3])
                 + xinv[5,4]*logit(p[4]) + xinv[5,5]*logit(p[5]) + xinv[5,6]*logit(p[6])
      beta[6] <- xinv[6,1]*logit(p[1]) + xinv[6,2]*logit(p[2]) + xinv[6,3]*logit(p[3])
                 + xinv[6,4]*logit(p[4]) + xinv[6,5]*logit(p[5]) + xinv[6,6]*logit(p[6])
    }
    # inits
    list(Se = 0.95, Sp = 0.93, p = c(0.33, 0.39, 0.14, 0.44, 0.18, 0.26))

    # data

    list(

    y=c(

      0,1,0,1,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,

      0,0,0,1,0,0,1,0,0,0,1,0,1,1,1,0,1,0,1,0,0,0,1,0,0,0,0,1,0,1,0,1,1,0,0,

      0,0,0,0,1,0,1,0,1,0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,0,0,1,0,1,1,0,1,1,0,1,

      0,0,0,0,1,0,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,1,

      0,0,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,1,0,0,0,0,

      0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,1,1,1,0,0,0,0,0,1,

      1,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,

      1,0,0,0,1,0,1,0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,1,1,0,0,0,0,1,0,1,1,0,

      0,0,1,1,0,1,0,1,1,0,0,1,0,0,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,

      0,0,1,0,1,0,0,0,0,1,0,0,0,1,0,1,1,0,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,1,0,

      0,0,1,0,0,0,0,0,0,0,0,1,0,1,1),

    x2 = c(

      0,1,1,1,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,0,0,

      0,1,1,1,0,1,1,1,1,1,1,0,0,1,1,0,1,1,1,1,0,0,1,1,0,0,1,0,1,1,0,1,1,1,1,

      1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,

      0,0,0,0,1,0,1,1,1,1,1,1,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,

      0,0,1,1,1,0,1,1,1,1,0,1,1,1,1,0,0,1,1,0,1,1,1,1,0,0,1,1,0,0,1,1,0,0,1,

      1,1,1,0,0,1,1,0,1,0,1,1,0,1,1,1,1,0,1,1,1,0,1,0,1,0,0,1,1,1,1,1,1,0,1,

      1,1,1,0,1,0,0,0,1,1,0,1,1,0,1,1,1,0,1,0,1,1,0,1,0,1,0,0,0,1,1,1,1,1,1,

      1,1,1,1,0,1,1,1,1,0,1,1,1,0,0,1,1,1,1,1,1,1,0,1,1,0,1,1,0,1,0,0,1,0,1,

      0,0,0,0,0,1,0,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,1,1,1,1,0,1,1,1,0,0,0,0,0,

      0,1,0,0,1,1,0,1,0,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,

      1,0,0,1,1,0,1,1,1,0,0,0,0,1),

    x3 = c(

      1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,

      1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,1,0,0,0,0,0,0,0,

      0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,

      1,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,

      1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,1,1,0,0,1,1,0,

      0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,

      0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,

      0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,

      1,1,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,

      1,0,1,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

      0,1,0,0,0,0,0,0,0,0,0,1,0,0),

    x4 = c(

      0,1,1,1,0,1,1,1,0,1,1,0,1,0,0,1,0,0,0,0,1,1,1,0,0,0,1,1,1,1,1,0,

      0,0,1,0,0,0,1,1,0,1,0,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,0,1,0,1,0,0,1,1,1,

      0,1,1,0,0,0,1,1,1,1,1,0,1,0,0,0,1,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,1,

      0,0,0,1,0,1,1,1,0,1,0,0,1,1,0,0,0,0,0,1,1,1,0,0,0,1,1,0,0,1,1,0,0,0,1,

      1,1,1,1,0,0,0,1,1,0,1,0,0,0,1,1,0,0,1,0,0,1,0,0,0,0,0,0,0,1,1,1,0,1,0,

      0,1,1,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,1,0,1,1,0,0,1,1,1,1,1,0,0,0,0,

      1,0,1,0,0,0,0,0,1,0,0,0,1,0,0,1,1,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,

      1,0,0,1,0,1,1,0,1,0,1,1,1,1,0,1,0,0,1,0,0,1,1,1,1,1,0,1,1,0,1,1,0,0,1,

      1,0,0,1,1,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,0,1,1,0,0,

      1,1,0,0,1,1,0,1,0,1,1,1,1,0,1,1,0,1,1,0,0,1,1,0,1,1,1,0,1,1,1,0,1,1,0,

      1,0,1,1,1,1,1,1,0,0,0,1,0,1),

    x5 = c(

      0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,0,

      1,0,0,1,0,1,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,

      1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,

      1,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,1,0,0,1,1,1,0,

      0,0,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,1,1,0,1,0,0,1,1,1,1,0,0,0,0,0,0,

      0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,1,0,0,1,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,

      0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,

      0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,

      0,1,0,0,0,0,1,1,1,0,0,0,1,1,1,0,1,0,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,

      0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,0,1,

      0,1,0,0,0,0,0,0,0,0,0,0,0,0),

    x6 = c(

      0,1,1,1,0,1,1,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1,0,1,1,1,1,1,1,1,0,

      0,0,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,1,0,0,1,0,0,0,1,1,0,1,0,

      1,0,1,1,0,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,

      1,1,0,0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,

      0,1,1,1,0,0,1,1,0,1,1,1,0,0,1,1,1,1,1,1,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,

      0,0,1,1,1,1,0,0,1,0,0,1,1,0,1,1,0,0,1,1,1,1,1,0,0,0,1,1,1,0,1,0,1,1,1,

      1,0,1,1,1,1,0,1,1,0,1,1,1,0,0,0,1,0,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,

      0,1,1,1,0,1,1,1,1,1,0,1,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,1,0,1,0,0,1,0,0,

      1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,

      1,1,1,1,0,0,1,0,1,1,0,0,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,0,1,1,0,0,

      0,1,1,1,1,0,0,1,1,1,1,1,1,1),

    xinv = structure(.Data=c(
      -1.0,      1.94E-16,  3.33E-16,  2.57E-16,  1.0,       1.0,
      -7.77E-16, 2.50E-16,  1.0,      -4.86E-17,  3.33E-16,  1.0,
      -1.0,      1.0,       1.0,      -3.75E-16, -3.33E-16,  1.0,
       1.0,     -5.83E-16, -5.55E-16, -9.02E-17, -1.0,       8.88E-16,
       4.16E-16, -6.11E-16, -3.89E-16, 1.0,      -1.0,       5.41E-16,
       1.0,      1.67E-16, -1.0,       3.33E-16,  5.55E-17,  7.77E-16),.Dim=c(6,6)),

    N = 361)

     

    node    mean   sd      MC error 2.50%   median 97.50%  start sample
    Se      0.9889 0.0112  1.87E-04 0.9581  0.9923 0.9997  1001  10000
    Sp      0.8846 0.05832 0.001576 0.7586  0.8904 0.98    1001  10000
    beta[1] -1.28  0.4824  0.009809 -2.378  -1.224 -0.4945 1001  10000
    beta[2] -1.794 0.4204  0.008314 -2.759  -1.746 -1.109  1001  10000
    beta[3] -1.814 0.5734  0.01031  -3.17   -1.732 -0.8912 1001  10000
    beta[4] 0.8832 0.3738  0.004783 0.2234  0.8552 1.688   1001  10000
    beta[5] 0.4718 0.4658  0.006418 -0.3917 0.4457 1.445   1001  10000
    beta[6] 1.225  0.352   0.004054 0.5862  1.209  1.989   1001  10000
    p[1]    0.4578 0.06514 0.001163 0.3356  0.4555 0.5909  1001  10000
    p[2]    0.4537 0.08422 0.001083 0.2842  0.4554 0.6153  1001  10000
    p[3]    0.2036 0.05354 8.07E-04 0.1106  0.1994 0.3207  1001  10000
    p[4]    0.3621 0.08063 0.001202 0.2122  0.3593 0.5298  1001  10000
    p[5]    0.2646 0.071   0.001258 0.1368  0.2614 0.4161  1001  10000
    p[6]    0.4039 0.05669 0.00158  0.2743  0.411  0.4932  1001  10000

Prevalence Estimation

  • 1 test, 1 population - binomial sampling
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Bayesian modeling of animal- and herd-level prevalences. Preventive Veterinary Medicine 2004; 66(1-4):101-112. DOI: 10.1016/j.prevetmed.2004.09.009".

    Example from section 2.1.1.
    Binomial sampling: 1 test, 1 population.
    Estimate the prevalence of T. gondii in pigs.
    Data source: Davies PR, Morrow WE, Deen J, Gamble HR, Patton S. Seroprevalence of Toxoplasma gondii and Trichinella spiralis in finishing swine raised in different production systems in North Carolina, USA. Preventive Veterinary Medicine 1998; 36(1):67-76. DOI: 10.1016/S0167-5877(98)00072-5.

     

    model {
      Tpos ~ dbin(p, n)
      p <- pi * Se + (1 - pi) * (1 - Sp)
      Z ~ dbern(tau0)
      Se ~ dbeta(22.5, 10.22) ## Mode=0.70, 95% sure Se > 0.55
      Sp ~ dbeta(88.28, 1.882) ## Mode=0.99, 95% sure Sp > 0.95
      pistar ~ dbeta(1.80, 26.74) ## Mode=0.03, 95% sure pistar < 0.15
      pi <- Z * pistar
      piequal0 <- equals(pi, 0)
    }
    list(n=91, Tpos=1, tau0=0.10)
    list(pistar=0.03, Se=0.70, Sp=0.99, Z=0)

     

    node     mean     sd       MC error 2.50% median 97.50%   start sample
    pi       7.82E-04 0.005399 2.62E-05 0     0      0.009687 5001  50000
    piequal0 0.9701   0.1702   8.64E-04 0     1      1        5001  50000

     

  • 1 test, 1 population - hypergeometric sampling
  • WinBUGS 1.4 code for prevalence estimation under hypergeometric sampling for the "1 test, 1 population" scenario. Because this likelihood is not built into WinBUGS, the code computes it explicitly and brings it into the model with the "zeros trick" (a Poisson observation of zero whose mean is the negative log-likelihood plus a constant C).

    model{

      for (i in 1:k){

         n[i]<-n1[i]-1

         zeros[i]<-0

         zeros[i]~dpois(phi[i])

         phi[i]<--log(pdfx[i])+C

         Uy[i]<-min(n[i],d[i])

         Ly[i]<-max(0,n[i]-N[i]+d[i])

         logNn[i]<-logfact(N[i])-logfact(n[i])-logfact(N[i]-n[i])

         for (y in 1:Ud[i]){

           # (y-1) from 0 to (Ud[i]-1) Ud[i]=n[i]+1

           yI[i,y]<-step(Uy[i]-y+1)*step(y-1-Ly[i])

           temy1[i,y]<-yI[i,y]*(logfact(d[i])-logfact(y-1)-logfact(yI[i,y]*(d[i]-y+1)))

           temy2[i,y]<-yI[i,y]*(logfact(N[i]-d[i])-logfact(n[i]-y+1)-logfact(yI[i,y]*(N[i]-d[i]-n[i]+y-1)))

           hypdf[i,y]<-yI[i,y]*exp(temy1[i,y]+temy2[i,y]-yI[i,y]*logNn[i])

           Uj[i,y]<-min(x[i],y-1)

           Lj[i,y]<-max(0,x[i]-n[i]+y-1)

           for (j in 1:x1[i]){

             jI[i,y,j]<-step(Uj[i,y]-j+1)*step(j-1-Lj[i,y])

             tem1[i,y,j]<-jI[i,y,j]*(logfact(y-1)-logfact(j-1)-logfact(jI[i,y,j]*(y-j))+(j-1)*log(se)+(y-j)*log(1-se))

             tem2[i,y,j]<-jI[i,y,j]*(logfact(n[i]-y+1)-logfact(x[i]-j+1)-logfact(jI[i,y,j]*(n[i]-y-x[i]+j))+(n[i]-y-x[i]+j)*log(sp)+(x[i]-j+1)*log(1-sp))

             sj[i,y,j]<-jI[i,y,j]*exp(tem1[i,y,j]+tem2[i,y,j])

           }

           sumj[i,y]<-sum(sj[i,y,])

         }

         pdfx[i]<-inprod(hypdf[i,],sumj[i,])

         #d[i]~dunif(Ld[i],Ud[i])

         d[i]<-z[i]*dI[i]

         dI[i]~dcat(pi0[i,])

         pi[i]<-d[i]/N[i]

         for (id in 1:N[i]){

           p0[i,id]<-pow(id/N[i],api[i]-1)*pow(1-id/N[i],bpi[i]-1)

         }

          p0.sum[i]<-sum(p0[i,])

          for (id in 1:N[i]){

            pi0[i,id]<-p0[i,id]/p0.sum[i]

          }

          z[i]~dbern(tau)

          z0[i]<-1-z[i]

      }

      se~dbeta(ase,bse)

      sp~dbeta(asp,bsp)

    }

    # data 1, k = 1, N = n; the prevalence estimate from the hypergeometric model was similar to that from the binomial model for these data (because of the extremely low prevalence)

    list(tau=0.1,api=c(1.8),bpi=c(26.74),ase=22.5,bse=10.22,asp=88.28,bsp=1.882,Ud=c(92),x=c(1),x1=c(2),n1=c(92),N=c(91),C=1000,k=1)

    # initials 1

    list(z=c(0),se=0.7,sp=0.99,dI=c(3))

    # data 2, k = 1, N = 20n; the prevalence estimate from the hypergeometric model was similar to that from the binomial model for these data (because N >> n)

    list(tau=0.1,api=c(1.8),bpi=c(26.74),ase=22.5,bse=10.22,asp=88.28,bsp=1.882,Ud=c(92),x=c(1),x1=c(2),n1=c(92),N=c(1820),C=1000,k=1)

    # initials 2

    list(z=c(0),se=0.7,sp=0.99,dI=c(60))

    Output for data 1:

    node  mean     sd       MC error 2.50% median 97.50% start sample

    pi[1] 5.30E-04 0.003893 3.98E-05 0     0      0      5001  10000

    z0[1] 0.9757   0.154    0.001628 0     1      1      5001  10000

  • 1 test, multiple populations - estimation of herd-level prevalence
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Bayesian modeling of animal- and herd-level prevalences. Preventive Veterinary Medicine 2004; 66(1-4):101-112. DOI: 10.1016/j.prevetmed.2004.09.009".

    Example from section 3.2.1.
    Estimation of the herd-level prevalence (tau) and the prevalence distribution of Johne's disease in California, based on screening with the IDEXX ELISA.

     

    model{
      Se ~ dbeta(58.8, 174.5) ## Mode=0.25; 95% sure < 0.30
      Sp ~ dbeta(272.4, 6.5) ## Mode=0.98, 95% sure > 0.96
      tau ~ dbeta(4.8, 3.6) ## Mode=0.60, 95% sure < 0.827
      alpha <- mu*psi
      beta <- psi*(1-mu)
      mu ~ dbeta(3.283, 17.744) ## Mode=0.12, 95% sure < 0.30
      psi ~ dgamma(4.524, 0.387) ## uses median of 95th percentile of prevalence
      ## distribution = 0.30 and 99% sure this number is < 0.50
      for(i in 1:k){
        prob.tpos[i] <- pi[i]*Se + (1-pi[i])*(1-Sp)
        y[i] ~ dbin(prob.tpos[i], n)
        inf[i] ~ dbern(tau)
        pi.star[i] ~ dbeta(alpha, beta)
        pi[i] <- pi.star[i] * inf[i]
      }
      Z30 ~ dbern(tau)
      pistar30 ~ dbeta(alpha, beta)
      pi30 <- Z30*pistar30
      a1 <- 1-step(pi30 - 0.05)
      a2 <- 1-step(pi30 - 0.5)
      a3 <- equals(pi30, 0)
      b1 <- step(tau - 0.50)
    }

    list(k=29, n=60, y=c(2,1,2,2,3,6,0,6,3,13,2,3,1,7,2,2,0,4,1,2,6,1,4,0,5,4,2,0,13))
    list(Se=0.25, Sp=0.98, mu=0.12, psi=11.69, tau=0.60,
    pi.star=c(0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05),
    inf=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1), Z30=0, pistar30=0.05)

     

    node  mean   sd      MC error 2.50%  median  97.50% start sample
    Se    0.2642 0.02716 2.03E-04 0.2133 0.2634  0.3188 5001  50000
    Sp    0.9756 0.00618 6.57E-05 0.9629 0.9758  0.987  5001  50000
    a1    0.5146 0.4998  0.002515 0      1       1      5001  50000
    a2    0.9702 0.1701  8.35E-04 0      1       1      5001  50000
    a3    0.4201 0.4936  0.002906 0      0       1      5001  50000
    alpha 1.758  1.155   0.02111  0.4283 1.47    4.756  5001  50000
    b1    0.7001 0.4582  0.005842 0      1       1      5001  50000
    beta  6.462  3.168   0.04273  2.081  5.863   14.25  5001  50000
    mu    0.2084 0.06054 0.001253 0.1068 0.2026  0.3428 5001  50000
    pi30  0.1175 0.1551  8.69E-04 0      0.04166 0.5196 5001  50000
    psi   8.22   4.128   0.06068  2.638  7.415   18.43  5001  50000
    tau   0.5792 0.1428  0.002234 0.3005 0.5792  0.847  5001  50000
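
    The (mu, psi) parameterization above sets alpha = mu*psi and beta = psi*(1-mu), so the within-infected-herd prevalence distribution Beta(alpha, beta) has mean mu, with psi controlling how concentrated it is. A quick Python check at the posterior means from the table above:

    from scipy.stats import beta as beta_dist

    mu, psi = 0.2084, 8.22                   # posterior means of mu and psi
    a, b = mu * psi, psi * (1 - mu)
    print(f"alpha = {a:.3f}, beta = {b:.3f}, mean = {a / (a + b):.4f}")
    print(f"95th percentile of within-herd prevalence: "
          f"{beta_dist.ppf(0.95, a, b):.3f}")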

  • 2 tests, multiple populations - estimation of prevalence distribution
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Bayesian modeling of animal- and herd-level prevalences. Preventive Veterinary Medicine 2004; 66(1-4):101-112. DOI: 10.1016/j.prevetmed.2004.09.009".

     

    Example from section 3.1.1.
    Estimate the prevalence distribution for brucellosis in cattle in Mexico.
    Data source: Hanson TE, Johnson WO, Gardner IA. Hierarchical models for estimating herd prevalence and test accuracy in the absence of a gold standard. Journal of Agricultural, Biological, and Environmental Statistics 2003; 8(2):223-239. DOI: 10.1198/1085711031526.

    An example of a joint sequential test. Test 1 is the BAPA, so Sebapa and Spbapa are the sensitivity and specificity of the BAPA; test 2 is Rivanol, so Seriv and Spriv are the sensitivity and specificity of Rivanol.

    model{
      for(i in 1:K){
        x[i, 1:3] ~ dmulti(p[i, 1:3], n[i])
        p[i,1] <- pi[i]*(Sebapa*Seriv+covDp) + (1-pi[i])*((1-Spbapa)*(1-Spriv)+covDn)
        p[i,2] <- pi[i]*(Sebapa*(1-Seriv)-covDp) + (1-pi[i])*((1-Spbapa)*Spriv-covDn)
        p[i,3] <- 1-p[i,1]-p[i,2]
        pi[i] ~ dbeta(alpha, beta)
      }
      Sebapa ~ dbeta(88.3, 1.9) ## Mode=0.99, 95% sure > 0.95
      Spbapa ~ dbeta(13.3, 6.3) ## Mode=0.70, 95% sure > 0.50
      Seriv ~ dbeta(130.7, 15.4) ## Mode=0.90, 95% sure > 0.85
      Spriv ~ dbeta(99.7, 6.2) ## Mode=0.95, 95% sure > 0.90
      alpha <- mu*psi
      beta <- psi*(1-mu)
      mu ~ dbeta(16.9, 48.6) ## Mode=0.25; 95% sure < 0.35
      psi ~ dgamma(7.23, 1.28)
      ls <- (Sebapa-1)*(1-Seriv)
      lc <- (Spbapa-1)*(1-Spriv)
      us <- min(Sebapa,Seriv) - Sebapa*Seriv
      uc <- min(Spbapa,Spriv) - Spbapa*Spriv
      covDn ~ dunif(lc, uc)
      covDp ~ dunif(ls, us)
      rhoDp <- covDp / sqrt(Sebapa*(1-Sebapa)*Seriv*(1-Seriv))
      rhoDn <- covDn / sqrt(Spbapa*(1-Spbapa)*Spriv*(1-Spriv))
      pi21 ~ dbeta(alpha, beta)
      a[1] <- 1-step(pi21-0.50)
      a[2] <- 1-step(pi21-0.10)
      a[3] <- step(pi21-0.90)
      a[4] <- step(pi21-0.75)
    }

    list(Sebapa=0.99, Spbapa=0.70, Seriv=0.90, Spriv=0.95, mu=0.25, psi=7)
    list(K=20, n=c(67,12,22,71,18,80,147,45,37,118,12,62,11,20,56,60,26,54,17,17),
    x=structure(.Data=c(
    4,2,61,
    1,1,10,
    2,2,18,
    2,0,69,
    2,0,16,
    2,0,78,
    80,5,62,
    4,5,36,
    10,0,27,
    10,12,96,
    1,0,11,
    1,1,60,
    3,0,8,
    3,7,10,
    38,0,18,
    1,0,59,
    3,0,23,
    13,3,38,
    7,0,10,
    1,1,15),.Dim=c(20,3)))

     

    node   mean    sd      MC error 2.50%    median 97.50% start sample
    Sebapa 0.972   0.01888 2.77E-04 0.9255   0.9761 0.9966 5001  50000
    Seriv  0.9176  0.01896 1.19E-04 0.8772   0.9188 0.9513 5001  50000
    Spbapa 0.9296  0.01477 1.92E-04 0.8996   0.93   0.9574 5001  50000
    Spriv  0.9426  0.02129 1.43E-04 0.8926   0.946  0.9754 5001  50000
    a[1]   0.9148  0.2792  0.001232 0        1      1      5001  50000
    a[2]   0.4314  0.4953  0.002647 0        0      1      5001  50000
    a[3]   0.00208 0.04556 2.10E-04 0        0      0      5001  50000
    a[4]   0.01478 0.1207  5.01E-04 0        0      0      5001  50000
    alpha  0.6935  0.2334  0.003044 0.3263   0.6641 1.23   5001  50000
    beta   2.961   0.9713  0.01053  1.393    2.849  5.177  5001  50000
    mu     0.1918  0.03431 2.81E-04 0.1294   0.1902 0.2636 5001  50000
    psi    3.654   1.161   0.01339  1.77     3.526  6.284  5001  50000
    rhoDn  0.3981  0.168   0.00198  0.05055  0.4086 0.6859 5001  50000
    rhoDp  0.3119  0.2146  0.002489 -0.01402 0.2886 0.774  5001  50000
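
    As a rough consistency check on the parametrization in the model (alpha <- mu*psi, beta <- psi*(1-mu)), the posterior means in the table above can be plugged back in; a minimal R sketch (plug-in values only, not a fully Bayesian calculation):

    alpha <- 0.6935           # posterior mean of alpha (table above)
    beta  <- 2.961            # posterior mean of beta
    alpha / (alpha + beta)    # ~0.19, close to the posterior mean of mu (0.1918)
    pbeta(0.10, alpha, beta)  # ~0.43, close to the posterior mean of a[2], the
                              # probability that a herd's prevalence is below 0.10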

  • Posterior distribution of true prevalence given apparent prevalence
  • Obtaining the posterior distribution of true prevalence given the apparent prevalence

    This module presents sample WinBUGS code that can be used to obtain the posterior distribution of true prevalence when a "binomial experiment" has been performed to estimate apparent (or true) prevalence. The user specifies prior distributions for test sensitivity and specificity, which may be based either on expert opinion or on previous diagnostic test validation studies. The necessary observed data are a sample of size N, of which y are test positive.

    Expanded Description

    Diagnostic tests are often imperfect, and when a population is sampled and the individuals' disease status is determined with a diagnostic test, the quantity estimated--apparent prevalence--may be noticeably different from the true prevalence of disease in the population. The degree to which apparent prevalence and true prevalence differ is a function of the sensitivity and specificity of the diagnostic test.

    The conventional approach to calculating true prevalence (TP) based on known sensitivity, specificity, and apparent prevalence is based on the following relationship:

      AP = Se * TP + (1 - Sp) * (1 - TP)

    Rearranging terms, it follows that TP can be estimated by

      TP = (AP + Sp - 1) / (Se + Sp - 1)

    Some algebra shows that when AP is less than 1 - Sp, TP will be estimated to be negative; likewise, when AP is greater than Se, TP will be estimated to be greater than 1.
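
    For illustration, a minimal R sketch of this plug-in estimator (the Rogan-Gladen estimator); the input values are hypothetical:

    # TP = (AP + Sp - 1) / (Se + Sp - 1)
    rogan.gladen <- function(ap, se, sp) (ap + sp - 1) / (se + sp - 1)
    rogan.gladen(ap = 0.20, se = 0.90, sp = 0.95)  # 0.176, a sensible estimate
    rogan.gladen(ap = 0.02, se = 0.90, sp = 0.95)  # negative, because AP < 1 - Sp
    rogan.gladen(ap = 0.95, se = 0.90, sp = 0.95)  # > 1, because AP > Se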

    By applying priors for TP, Sp, and Se to observed binomial data (say, y individuals in a population test positive out of n tested), WinBUGS can be used to obtain posterior distributions for TP, Sp, and Se. Moreover, because these posterior distributions obey the rules of probability, they remain in the interval from 0 to 1, preventing nonsensical estimates of TP that are negative or greater than 1.

     

    The AP-to-Prev module contains the WinBUGS code for obtaining the posterior distribution of prevalence, given prior distributions for sensitivity, specificity, and prevalence, combined with observed binomial data for apparent prevalence.

  • Monte Carlo simulated NPV, PPV, LR+, and LR- given distributions for prevalence, Se, and Sp
  • WinBUGS 1.4 code that takes the simple one test parameters (prevalence, Se and Sp) and calculates positive predictive value (PPV) and negative predictive value (NPV) along with likelihood ratio positive (LR+) and likelihood ratio negative (LR-). Note that the posterior distributions for LR+ and LR- depend only on the prior distributions for Se and Sp since the apparent prevalence data contain no information to update the priors.
     

    model{

      x ~ dbin(AP, n)

      AP <- pi*se + (1-pi)*(1-sp)

      se ~ dbeta(ase,bse)

      sp ~ dbeta(asp,bsp)

      pi ~ dbeta(api,bpi)

      PPV <- pi*se/AP

      NPV <- (1-pi)*sp/(1-AP)

      LRP <- se/(1-sp)

      LRN <- (1-se)/sp

    }

     

    ## Data

    list(n=91,x=1,ase=22.5,bse=10.22,asp=88.28,bsp=1.882,api=1.80,bpi=26.74)

     

    ## Use BetaBuster to obtain the following results (ase,...)

    ## Mode=0.70, 95% sure se > 0.55

    ## Mode=0.99, 95% sure sp > 0.95

    ## Mode=0.03, 95% sure pi < 0.15

     

    ## Initial values

    list(pi=0.03,se=0.70,sp=0.99)

     

    # Positive predictive value (PPV) = The proportion of all individuals with a

    # positive test who actually have the disease

     

    # Negative predictive value (NPV) = The proportion of all individuals with a

    # negative test who do not have the disease
     

    # The likelihood ratio for a positive test result (LR+):

    # LR+ = sensitivity/(1-specificity)

    # The likelihood ratio for a negative test result (LR-):

    # LR- = (1-sensitivity)/specificity
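
    Because LR+ and LR- depend only on the Se and Sp priors, they can also be approximated by plain Monte Carlo outside WinBUGS; a minimal R sketch using the same Beta priors as in the data list above:

    set.seed(1)
    se <- rbeta(100000, 22.5, 10.22)     # prior: mode 0.70, 95% sure se > 0.55
    sp <- rbeta(100000, 88.28, 1.882)    # prior: mode 0.99, 95% sure sp > 0.95
    LRP <- se / (1 - sp)
    LRN <- (1 - se) / sp
    quantile(LRP, c(0.025, 0.5, 0.975))  # compare with the LRP row in the table below
    quantile(LRN, c(0.025, 0.5, 0.975))  # compare with the LRN row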

     

    node mean    sd       MC error 2.50%    median  97.50%  start sample
    LRN  0.327   0.0829   8.63E-04 0.1769   0.3239  0.4986  10001 10000
    LRP  106.9   202.2    1.878    19.05    61.87   473.2   10001 10000
    NPV  0.9908  0.007312 6.46E-05 0.9722   0.9927  0.999   10001 10000
    PPV  0.5748  0.227    0.002149 0.1198   0.5956  0.94    10001 10000
    pi   0.02688 0.01812  1.68E-04 0.003491 0.02318 0.07178 10001 10000
    se   0.6772  0.08172  8.50E-04 0.5082   0.68    0.8261  10001 10000
    sp   0.9873  0.008698 7.95E-05 0.9659   0.9892  0.9985  10001 10000

  • A Bayesian approach to estimate OJD prevalence from pooled faecal samples of variable pool size
  • WinBUGS file: OJDPrev_wo_data.doc

Prior Elicitation

  • BetaBuster
    BetaBuster is a GUI that was designed by Dr. Chun-Lung Su for the purpose of obtaining specific Beta prior distributions based on scientific input for quantities like the sensitivity or specificity of a diagnostic test, or prevalence. For example, assume that a scientist is asked for their best guess for the Se of a particular test and the answer is 0.9. Moreover, if the scientist is also willing to assert that he/she is 95% sure that the Se is greater than 0.75, BetaBuster will take this information and obtain the unique Beta distribution that has mode (the value with the greatest density/plausibility) equal to 0.9 and 95% of its area to the right of 0.75. In fact, the Beta(22.99, 3.44) distribution is obtained by BetaBuster, and is also plotted in the process. This is accomplished by entering the above input information and then clicking on "Set Priors".

    In addition, it is possible to use BetaBuster to specify the two parameters of the Beta(a, b) distribution directly, and to obtain a plot of that distribution. This is accomplished by first double clicking on the word "Density", and then typing in a and b and clicking on "Set Priors".

    Download BetaBuster


    Note: Please use "English (United States) - US" as your Input Language when using BetaBuster. BetaBuster does not accept a comma instead of a dot as the decimal separator.

     

    Expanded Description

    Beta distributions can be used to model binomial probabilities (e.g. prevalence and test accuracy values) in a Bayesian analysis. The Beta distribution is characterized by 2 shape parameters, often designated a and b, that can be manipulated to change the shape of the distribution, e.g. to make it more or less peaked, or to change the location of its most likely value.

     

    (1) What types of shapes are possible for Beta distributions?

    The values given to the parameters a and b determine the final shape of the distribution. There are 5 common scenarios that users will mostly be interested in:

    If a > 1 and b > 1, a unimodal distribution is obtained.

    If a = 1 and b = 1, a “flat” prior is obtained. This is equivalent to a Uniform(0, 1) distribution.

    If a < 1 and b < 1, the distribution is bimodal with modes at 0 and 1.

    If a = 1 and b > 1, the mode is 0 and high values are unlikely.

    If a > 1 and b = 1, the mode is 1 and low values are unlikely.

     

    It is also possible to have 2 other scenarios: a = 1 and b < 1, and a < 1 and b = 1.

    (2) How are the mean, mode and variance calculated for a beta distribution?

    Mean = a / (a + b)

    Mode = (a - 1) / (a + b - 2), valid when a > 1 and b > 1

    Variance = ab / [(a + b + 1)(a + b)^2]
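
    These formulas are easy to check numerically; for example, in R, for the Beta(22.99, 3.44) prior mentioned above:

    a <- 22.99; b <- 3.44
    a / (a + b)                        # mean ~0.87
    (a - 1) / (a + b - 2)              # mode ~0.90, as elicited
    a * b / ((a + b + 1) * (a + b)^2)  # variance ~0.0041 (sd ~0.064)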

     

    (3) How do I construct an appropriate beta prior distribution when I don't know a and b?

    Two approaches are possible. The first is based on expert opinion and the second uses data from comparable experiments. BetaBuster was primarily designed to address the first situation and allow easy implementation by users without access to specialized software such as S-plus. The program allows calculation of Beta distributions for scenarios 1, 2, 4 and 5 that are described above. In the second situation, the parameters of the distribution can be directly calculated from the observed data as described in a subsequent section. For this situation, it is possible to draw the corresponding Beta density with BetaBuster using the calculated values of a and b.

     

    (4) What information do I need from an expert to allow me to use BetaBuster?

    Two inputs are necessary before using the program. The expert should initially be asked for the most likely value of the parameter of interest – this is set to the mode of the corresponding Beta prior.

    If the modal value is between 0 and 0.5, ask the expert for the 95th percentile of possible values for the parameter (e.g. you are 95% certain that the parameter is below what value?)

    If the modal value is between 0.5 and 1, ask the expert for the 5th percentile of possible values for the parameter (e.g. you are 95% certain that the parameter exceeds what value?)

    If the modal value is designated by the expert to be 0.5, then ask for either the 5th or 95th percentile since the distribution is symmetric.

     

    Example: Suppose an expert indicates that the most likely value for the sensitivity of an ELISA test is 0.9 and he is 95% sure that the value exceeds 0.7.

    Step 1: Enter the values of 0.7 and 0.9 into the top line of the screen either by typing in the values or using the spindles to toggle to the appropriate numbers. The data entry line should indicate that the expert is “95% sure that x greater than 0.7 and Mode at 0.9”. Values cannot be entered in the other blank spaces elsewhere on the screen – blank spaces are for values generated by the program.

    Step 2: Click on the “Set Priors” button and the appropriate beta distribution is generated: a = 15.0342 and b = 2.5594, along with the mean, variance, and the 2.5th and 97.5th percentiles. Should you wish percentiles other than the 2.5th and 97.5th, you can double click on the corresponding label, e.g. “2.5%”, and adjust it to, say, 5%. The percentile is automatically adjusted.

    The corresponding Beta density is generated in the graph window and this can be verified as appropriate by the expert. If it is considered inappropriate, the limits and mode can be adjusted until they generate a picture that is suitable and correctly represents scientific input.
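
    The output can also be verified numerically; a quick R check that Beta(15.0342, 2.5594) reproduces the elicited inputs:

    a <- 15.0342; b <- 2.5594
    (a - 1) / (a + b - 2)  # ~0.90, the expert's most likely value (mode)
    qbeta(0.05, a, b)      # ~0.70, so 95% of the density lies above 0.70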

     

    (5) How do I obtain a Beta distribution using data from a prior study?

    Suppose a previous comparable and well-designed study involved the testing of 100 infected animals, of which 90 tested positive and 10 tested negative. Then a Beta(91, 11) distribution would be appropriate for modeling uncertainty about test sensitivity, provided the animals tested are similar in both studies.

    In general, if an experiment resulted in “s” successes (e.g. no. test-positive animals) recorded in “n” trials (e.g. number of truly infected animals), use of a Beta(a, b) distribution with a = s + 1 and b = n - s + 1 is an appropriate choice to model the uncertainty in that parameter.
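
    Applied to the study just described (s = 90 test-positive out of n = 100 infected animals), the calculation is a one-liner in R:

    s <- 90; n <- 100
    a <- s + 1; b <- n - s + 1                           # Beta(91, 11)
    c(mean = a / (a + b), mode = (a - 1) / (a + b - 2))  # mean ~0.892; mode = 0.90 = s/n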

     

    (6) Can I draw a beta density and obtain means, medians and percentiles that correspond to known values of a and b?

    Yes, double click on the label “Density” on the left hand side of the input window. The font color will change from Black to Bold Red. Input the known values for a and b and click on the “Set Priors” button. Double clicking on the “Density” label will return the original setting.
     

    (7) How can I obtain a copy of the beta distribution in the figure window?

    In the directory in which you installed the program, there is a bitmap file, "betabuster.bmp", that is automatically updated when you click on the “Set Priors” button. The file is overwritten with each new Beta distribution that you create so that it only contains the figure from the current calculation.

    A second option is to do a screen capture (with the Print Screen key) and paste the screen into a program such as Microsoft Paint. The editing tool can then be used to select the part that you wish to keep, e.g. the Beta density only or the entire BetaBuster GUI that includes a and b values, mean, variance, median and percentiles.

Receiver Operating Characteristic (ROC) Curve Estimation

  • Estimating ROC curves and corresponding AUC’s based on a gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Choi YK, Johnson WO, Collins MT, Gardner IA. Bayesian estimation of ROC curves in the absence of a gold standard. Journal of Agricultural, Biological, and Environmental Statistics 2006; 11(2):210-229. DOI: 10.1198/108571106X110883".

    Example from section 3.2

    Two tests, one population

    Estimate parameters for binormal distributions, ROC curves using sensitivity and specificity, corresponding AUC’s, and the difference between AUC’s for two tests.

    Two ELISA tests, HerdChek® (IDEXX Laboratories Inc., Westbrook, Maine) and Parachek™ (Biocor Animal Health, Omaha, Nebraska), were performed to detect serum antibodies to Mycobacterium avium subsp. paratuberculosis, the cause of Johne’s disease in cattle. ELISA results were compared with fecal culture for the organism, as the gold standard. There were 88 cattle that were considered infected (fecal culture scores of at least 3) and 393 cattle that were considered non-infected (negative fecal cultures and from herds in which Johne’s disease was not suspected).

     

    # Gold Standard case

    model {

      for (i in 1:88) {

        S1[i] ~ dnorm(mu1[i],tau1[i]) # test 1

        S2[i] ~ dnorm(condmu[i],condtau[i]) # test 2

        mu1[i] <- lambda1[1]

        tau1[i] <- gamma1[1]

        condmu[i] <- lambda2[1]+rho[1]*sqrt(gamma1[1]/gamma2[1])*(S1[i]-lambda1[1])

        condtau[i] <- (gamma2[1])/(1-pow(rho[1],2))

      }

      for (i in 89:481) {

        S1[i] ~ dnorm(mu1[i],tau1[i]) # test 1

        S2[i] ~ dnorm(condmu[i],condtau[i]) # test 2

        mu1[i] <- lambda1[2]

        tau1[i] <- gamma1[2]

        condmu[i] <- lambda2[2]+rho[2]*sqrt(gamma1[2]/gamma2[2])*(S1[i]-lambda1[2])

        condtau[i] <- (gamma2[2])/(1-pow(rho[2],2))

      }

      lambda1[1] ~ dnorm(0,0.001)I(lambda1[2],) # prior for the mean of disease group (test 1)

      lambda2[1] ~ dnorm(0,0.001)I(lambda2[2],) # prior for the mean of disease group (test 2)

      lambda1[2] ~ dnorm(0,0.001) # prior for the mean of non-disease group (test 1)

      lambda2[2] ~ dnorm(0,0.001) # prior for the mean of non-disease group (test 2)

      rho[1] ~ dunif(-1,1) # correlation coefficient for disease group

      rho[2] ~ dunif(-1,1) # correlation coefficient for non-disease group

      gamma1[1] ~ dgamma(0.001,0.001) # prior for the precision of disease group (test 1)

      gamma2[1] ~ dgamma(0.001,0.001) # prior for the precision of disease group (test 2)

      gamma1[2] ~ dgamma(0.001,0.001) # prior for precision of non-disease group (test 1)

      gamma2[2] ~ dgamma(0.001,0.001) # prior for precision of non-disease group (test 2)

      sigma1[1] <- 1/gamma1[1] # define the variance for disease group (test 1)

      sigma1[2] <- 1/gamma1[2] # define the variance for non-disease group (test 1)

      sigma2[1] <- 1/gamma2[1] # define the variance for disease group (test 2)

      sigma2[2] <- 1/gamma2[2] # define the variance for non-disease group (test 2)

      delta1 <- phi((-1.357-lambda1[1])/sqrt(sigma1[1])) # P(test 1 score < cutoff -1.357 | diseased), i.e. 1 - Se at that cutoff

      delta2 <- phi((-2.326-lambda2[1])/sqrt(sigma2[1])) # P(test 2 score < cutoff -2.326 | diseased), i.e. 1 - Se at that cutoff

      # AUC

      AUC1 <- phi(-(lambda1[2]-lambda1[1])/sqrt(sigma1[2]+sigma1[1]))

      AUC2 <- phi(-(lambda2[2]-lambda2[1])/sqrt(sigma2[2]+sigma2[1]))

      diff <- AUC1-AUC2 # difference between two AUCs

      # ROC curve

      for(i in 1:111) {

        c1[i] <-  ((-8.1+0.1*i)-lambda1[1])/sqrt(sigma1[1]) # grid runs from -8 to 3

        se1[i] <- 1-phi(c1[i])

        c2[i] <-  ((-8.1+0.1*i)-lambda1[2])/sqrt(sigma1[2])

        sp1[i] <- phi(c2[i])

        c3[i] <-  ((-8.1+0.1*i)-lambda2[1])/sqrt(sigma2[1])

        se2[i] <- 1-phi(c3[i])

        c4[i] <-  ((-8.1+0.1*i)-lambda2[2])/sqrt(sigma2[2])

        sp2[i] <- phi(c4[i])

      }

    }

     

    list(lambda1=c(0,-2),lambda2=c(0,-2),gamma1=c(0.5,1),gamma2=c(0.5,1),rho=c(0,0))

    list(

    S1=c(

      -2.30,0.66,-0.45,-1.11,-1.11,0.52,0.93,-1.71,0.80,0.07,0.76,0.43,-2.66,0.46,

      -0.34,0.53,0.17,-0.65,-0.04,0.17,0.35,-2.81,0.79,-1.17,0.40,0.13,0.88,0.83,

      -2.30,-2.53,-1.35,-1.71,0.63,0.55,-3.22,0.13,0.07,0.60,-2.21,-3.51,-0.07,-1.56,

      1.02,0.61,0.52,-3.51,0.56,0.03,0.60,0.46,-2.04,0.76,-1.39,0.51,0.15,-0.26,

      0.98,0.80,0.24,0.12,0.67,-0.29,0.18,-0.34,-0.07,-0.31,0.41,-1.05,-1.35,0.39,

      0.25,-1.71,0.92,-1.20,-2.30,-1.27,0.49,-0.26,-2.81,-2.81,-0.82,-0.73,-0.20,0.59,

      0.58,-2.41,-1.77,-2.04,

      -3.51,-2.66,-2.30,-1.77,-3.00,-3.22,-2.30,-2.21,-2.53,-3.22,-2.21,-3.22,-3.22,-3.22,

      -3.00,-3.22,-2.41,-2.21,-3.51,-2.12,-3.91,-2.04,-2.53,-2.81,-1.56,-1.77,-1.90,-2.81,

      -1.90,-1.05,-3.22,-1.90,-2.41,-3.51,-2.81,-2.41,-3.00,-2.66,-3.22,-1.27,-2.66,-1.66,

      -1.56,-2.81,-2.04,-3.00,-3.91,-4.61,-1.90,-2.04,-2.21,-2.21,-1.61,-1.24,-1.90,-3.91,

      -3.00,-2.81,-2.12,-2.12,-3.51,-2.12,-2.21,-2.81,-2.66,-4.61,-2.12,-2.21,-3.91,-3.51,

      -1.71,-2.81,-3.91,-4.61,-2.53,-3.22,-3.00,-2.53,-3.00,-4.61,-2.66,-4.61,-1.90,-2.53,

      -2.81,-2.66,-2.66,-2.53,-2.81,-3.00,-2.53,-3.91,-1.56,-3.91,-2.53,-2.81,-3.51,-3.22,

      -2.53,-3.22,-2.53,-1.71,-4.61,-3.22,-3.22,-3.51,-2.53,-2.81,-2.66,-3.00,-2.30,-1.83,

      -2.66,-3.22,-2.81,-4.61,-2.30,-2.53,-2.81,-3.91,-4.61,-1.71,-2.53,-3.00,-3.51,-2.66,

      -1.39,-2.30,-3.51,-2.81,-2.41,-3.00,-2.81,-2.53,-2.53,-3.91,-1.11,-2.41,0.17,-1.24,

      -2.04,-2.04,-1.20,-3.51,-2.41,-2.12,-3.22,-0.42,-1.11,-1.83,-2.21,-2.66,-3.51,-3.22,

      -2.12,-1.77,-0.54,-2.30,-2.12,-2.30,-4.61,-2.66,-1.71,-1.97,-1.51,-1.61,-2.41,-2.53,

      -2.66,-3.91,-1.61,-3.00,-2.41,-2.04,-0.99,-2.12,-3.00,-1.97,-1.56,-3.51,-2.81,-2.41,

      -1.77,-2.53,-2.12,-2.66,-3.51,-4.61,-3.22,-3.22,-4.61,-3.22,-1.71,-2.41,-2.04,-3.00,

      -3.22,-3.51,-2.53,-3.91,-2.30,-2.04,-3.00,-2.30,-3.51,-2.81,-0.97,-2.30,-2.66,-3.91,

      -2.66,-3.22,-3.00,-3.00,-2.81,-3.91,-3.00,-3.22,-3.22,-3.91,-3.51,-2.81,-3.00,-2.81,

      -3.51,-1.83,-4.61,-3.51,-2.53,-3.22,-2.12,-2.81,-2.66,-1.47,-2.81,-2.41,-1.77,-3.22,

      -2.81,-2.41,-2.41,-2.41,-1.35,-3.00,-3.22,-3.51,-2.53,-3.22,-2.53,-3.22,-2.41,-2.04,

      -4.61,-2.53,-2.30,-2.66,-2.41,-2.66,-3.22,-0.97,-3.51,-3.51,-3.51,-3.22,-2.30,-1.83,

      -2.81,-3.22,-3.22,-1.39,-3.00,-2.53,-2.53,-2.04,-3.51,-2.66,-2.66,-2.66,-3.22,-3.00,

      -2.41,-2.53,-2.41,-1.08,-2.81,-3.22,-1.66,-4.61,-3.91,-3.51,-3.91,-4.61,-2.41,-3.00,

      -3.91,-2.66,-3.00,-3.00,-2.81,-3.91,-2.66,-3.00,-3.51,-2.66,-3.91,-3.22,-1.27,-2.81,

      -1.31,-3.51,-2.21,-2.41,-2.04,-2.81,-2.81,-4.61,-3.51,-1.97,-4.61,-2.53,-3.00,-1.71,

      -3.91,-4.61,-3.22,-3.51,-0.60,-3.91,-1.39,-3.22,-2.41,-1.61,-1.90,-3.22,-3.22,-2.81,

      -2.30,-2.04,-1.83,-2.21,-1.51,-2.21,-2.21,-2.66,-2.21,-2.66,-2.66,-3.22,-2.66,-1.83,

      -2.66,-1.83,-3.22,-3.91,-2.21,-3.22,-3.22,-2.66,-2.66,-2.04,-2.41,-1.61,-3.91,-2.66,

      -2.21,-2.41,-1.71,-3.22,-1.83,-2.41,-2.66,-2.66,-2.04,-1.83,-3.22,-1.83,-3.22,-2.21,

      -2.41,-2.41,-2.66,-2.41,-3.91,-3.22,-2.41,-1.83,-2.04,-2.21,-2.66,-2.04,-3.22,-3.91,

      -2.66),

    S2=c(

      -1.09,-0.69,-1.26,-0.54,-2.09,-0.76,0.13,-1.86,-0.21,-1.48,-0.02,-0.17,-1.19,-1.51,

      -0.92,0.21,-1.01,-1.97,-1.02,-0.06,-1.26,-2.34,-0.29,-2.44,-1.03,-1.97,-0.22,-1.45,

      -2.53,-2.03,-1.90,-2.27,0.81,0.17,-1.45,-0.56,-0.32,-0.02,-2.03,-2.21,-1.29,-1.31,

      0.22,0.17,0.37,-3.06,-0.33,-0.97,0.84,-0.03,-2.66,0.01,-1.39,0.81,-0.45,-0.87,

      0.21,0.46,-0.53,-0.29,0.51,-0.56,-0.67,-0.54,-0.51,-0.99,-0.21,-2.04,-1.97,-0.16,

      0.43,-1.77,0.19,-1.08,-2.30,-1.66,-0.78,-0.69,-2.41,-2.41,-1.51,-2.53,-0.33,0.20,

      -0.24,-2.53,-0.80,-2.12,

      -2.98,-2.90,-2.66,-2.75,-2.83,-2.83,-2.92,-2.98,-2.88,-2.55,-2.78,-2.80,-2.94,-2.98,

      -3.02,-2.83,-2.86,-2.98,-2.88,-2.63,-2.94,-2.48,-2.86,-2.88,-2.78,-2.98,-2.80,-2.94,

      -2.60,-2.63,-2.92,-2.81,-2.62,-2.86,-2.83,-2.81,-2.78,-2.76,-2.63,-1.98,-2.81,-2.28,

      -2.69,-2.60,-2.31,-2.63,-2.70,-2.88,-2.55,-2.60,-2.45,-2.73,-2.78,-1.81,-2.69,-2.51,

      -2.72,-2.67,-1.53,-2.78,-2.65,-2.17,-2.36,-2.75,-2.49,-2.50,-2.59,-2.69,-2.20,-2.40,

      -2.23,-2.12,-2.65,-2.44,-2.67,-2.55,-2.56,-2.23,-2.70,-2.24,-2.44,-1.93,-2.28,-2.72,

      -2.25,-2.44,-2.66,-2.65,-2.70,-2.55,-2.30,-2.67,-2.73,-2.40,-3.06,-2.92,-3.06,-3.06,

      -3.02,-2.86,-2.94,-3.00,-2.88,-2.88,-3.08,-2.92,-3.06,-2.90,-2.92,-3.04,-3.10,-2.96,

      -3.02,-3.08,-2.88,-2.92,-2.96,-3.08,-2.94,-3.00,-3.04,-2.67,-2.98,-3.02,-3.00,-3.04,

      -2.44,-3.08,-3.02,-2.98,-2.94,-2.90,-2.47,-3.06,-3.06,-3.04,-1.87,-2.14,-2.45,-2.62,

      -2.63,-2.19,-2.54,-2.67,-2.54,-2.27,-2.73,-2.50,-2.60,-2.62,-2.62,-2.40,-2.59,-2.35,

      -2.44,-2.48,-2.58,-2.55,-2.75,-2.73,-2.80,-2.49,-2.51,-2.66,-2.60,-2.60,-2.67,-2.43,

      -2.73,-2.59,-2.67,-2.49,-2.54,-2.23,-2.49,-2.11,-2.59,-2.41,-2.51,-2.75,-2.81,-2.38,

      -2.47,-2.45,-2.75,-2.96,-2.83,-2.98,-2.90,-2.90,-2.92,-2.98,-2.92,-2.60,-2.98,-2.96,

      -2.92,-2.94,-2.81,-3.00,-2.54,-2.67,-2.96,-2.83,-2.90,-2.02,-2.70,-2.81,-1.97,-2.69,

      -2.30,-2.80,-2.90,-2.67,-2.86,-2.88,-2.75,-2.65,-2.80,-2.73,-2.76,-2.73,-2.78,-2.76,

      -2.83,-2.53,-2.85,-2.78,-2.85,-3.08,-3.10,-3.08,-3.08,-3.02,-3.08,-2.80,-2.98,-3.04,

      -3.02,-3.04,-3.08,-3.02,-2.86,-2.98,-3.00,-3.08,-3.08,-2.98,-3.06,-3.04,-3.06,-3.02,

      -3.02,-3.08,-2.83,-3.08,-3.08,-2.90,-2.98,-3.06,-2.94,-2.98,-3.06,-2.85,-3.02,-3.00,

      -3.06,-3.06,-3.08,-2.92,-3.10,-3.04,-3.08,-2.94,-3.10,-3.00,-2.98,-3.06,-2.98,-3.06,

      -3.04,-3.08,-2.65,-2.85,-3.00,-3.04,-2.51,-2.85,-2.63,-2.90,-2.56,-2.67,-2.78,-2.65,

      -2.67,-2.76,-2.78,-2.50,-2.32,-2.54,-2.56,-2.47,-2.73,-2.44,-2.76,-2.76,-2.54,-2.62,

      -2.54,-2.83,-2.65,-2.55,-2.59,-2.75,-2.66,-2.86,-2.69,-2.75,-2.75,-2.72,-2.15,-2.63,

      -2.80,-2.63,-2.48,-2.47,-2.34,-2.63,-2.50,-2.36,-2.75,-2.51,-2.30,-2.70,-2.69,-2.65,

      -2.40,-2.81,-3.00,-2.83,-2.81,-2.90,-2.86,-2.81,-2.25,-2.88,-3.00,-2.90,-2.81,-2.88,

      -2.80,-2.85,-2.98,-2.80,-2.83,-2.63,-2.58,-2.85,-2.76,-2.75,-2.85,-2.86,-2.81,-2.78,

      -2.75,-2.69,-2.62,-2.81,-2.92,-2.88,-2.92,-2.85,-2.94,-2.86,-2.96,-2.67,-2.63,-2.83,

      -2.81,-2.70,-2.69,-2.90,-2.80,-2.85,-2.67,-2.43,-2.85,-2.86,-2.98,-2.86,-2.94,-2.94,

      -2.83))

     

    node       mean     sd       MC error 2.50%    median   97.50%    start sample
    AUC1       0.9318   0.0162   3.24E-04 0.8959   0.9334   0.9593    501   9500
    AUC2       0.9625   0.01316  2.49E-04 0.9318   0.9641   0.9833    501   9500
    delta1     0.238    0.03691  8.02E-04 0.1693   0.2366   0.3152    501   9500
    delta2     0.07813  0.02163  4.24E-04 0.04171  0.07615  0.1268    501   9500
    diff       -0.03064 0.01228  1.34E-04 -0.05641 -0.03022 -0.007731 501   9500
    lambda1[1] -0.4688  0.1338   0.00308  -0.7292  -0.469   -0.2062   501   9500
    lambda1[2] -2.699   0.04106  4.09E-04 -2.78    -2.699   -2.619    501   9500
    lambda2[1] -0.9385  0.1041   0.002333 -1.14    -0.9393  -0.7317   501   9500
    lambda2[2] -2.744   0.01289  1.43E-04 -2.77    -2.744   -2.719    501   9500
    rho[1]     0.7868   0.04121  6.81E-04 0.6965   0.7905   0.8571    501   9500
    rho[2]     0.1921   0.04841  4.68E-04 0.09728  0.1932   0.2848    501   9500
    sigma1[1]  1.557    0.2398   0.00395  1.159    1.533    2.102     501   9500
    sigma1[2]  0.6712   0.04826  4.54E-04 0.5829   0.6687   0.771     501   9500
    sigma2[1]  0.9526   0.1441   0.002362 0.7113   0.9383   1.273     501   9500
    sigma2[2]  0.06541  0.004792 4.72E-05 0.05666  0.06516  0.0754    501   9500
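
    As a quick plug-in check, the binormal AUC formula used in the model can be evaluated at the posterior means in the table above; a small R sketch (a plug-in approximation, not the fully Bayesian AUC):

    lambda1 <- c(-0.4688, -2.699)  # posterior means for test 1 (disease, non-disease)
    sigma1  <- c(1.557, 0.6712)    # posterior mean variances for test 1
    pnorm(-(lambda1[2] - lambda1[1]) / sqrt(sigma1[2] + sigma1[1]))  # ~0.93, cf. AUC1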

  • Estimating ROC curves and corresponding AUC’s in the absence of a gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Choi YK, Johnson WO, Collins MT, Gardner IA. Bayesian estimation of ROC curves in the absence of a gold standard. Journal of Agricultural, Biological, and Environmental Statistics 2006; 11(2):210-229. DOI: 10.1198/108571106X110883".

     

    Example from section 3.2

    Two tests, one population

    Estimate parameters for binormal distributions, the prevalence of infection, ROC curves using sensitivity and specificity, corresponding AUC’s, and the difference between AUC’s for two tests.

    Two ELISA tests, HerdChek® (IDEXX Laboratories Inc., Westbrook, Maine) and Parachek™ (Biocor Animal Health, Omaha, Nebraska), were performed to detect serum antibodies to Mycobacterium avium subsp. paratuberculosis, the cause of Johne’s disease in cattle.

     

    # No Gold Standard case

    model {

      for (i in 1:481) {

        S1[i] ~ dnorm(mu1[i],tau1[i]) # test 1

        S2[i] ~ dnorm(condmu[i],condtau[i]) # test 2

        mu1[i] <- lambda1[T[i]]

        tau1[i] <- gamma1[T[i]]

        condmu[i] <- lambda2[T[i]]+rho[T[i]]*sqrt(gamma1[T[i]]/gamma2[T[i]])*(S1[i]-lambda1[T[i]])

        condtau[i] <- (gamma2[T[i]])/(1-pow(rho[T[i]],2))

        T[i] ~ dcat(P[]) # disease group if T[i] =1, non-disease if T[i]=2

      }

      P[1:2] ~ ddirch(alpha[])

      lambda1[1] ~ dnorm(0,0.001)I(lambda1[2],) # prior for the mean of disease group (test 1)

      lambda2[1] ~ dnorm(0,0.001)I(lambda2[2],) # prior for the mean of disease group (test 2)

      lambda1[2] ~ dnorm(0,0.001) # prior for the mean of non-disease group (test 1)

      lambda2[2] ~ dnorm(0,0.001) # prior for the mean of non-disease group (test 2)

      rho[1] ~ dunif(-1,1) # correlation coefficient for disease group

      rho[2] ~ dunif(-1,1) # correlation coefficient for non-disease group

      gamma1[1] ~ dgamma(0.001,0.001) # prior for the precision of disease group (test 1)

      gamma2[1] ~ dgamma(0.001,0.001) # prior for the precision of disease group (test 2)

      gamma1[2] ~ dgamma(0.001,0.001) # prior for the precision of non-disease group (test 1)

      gamma2[2] ~ dgamma(0.001,0.001) # prior for the precision of non-disease group (test 2)

      sigma1[1] <- 1/gamma1[1] # define the variance for disease group (test1)

      sigma1[2] <- 1/gamma1[2] # define the variance for non-disease group (test1)

      sigma2[1] <- 1/gamma2[1] # define the variance for disease group (test2)

      sigma2[2] <- 1/gamma2[2] # define the variance for non-disease group (test2)

      delta1 <- phi((-1.357-lambda1[1])/sqrt(sigma1[1])) # P(test 1 score < cutoff -1.357 | diseased), i.e. 1 - Se at that cutoff

      delta2 <- phi((-2.326-lambda2[1])/sqrt(sigma2[1])) # P(test 2 score < cutoff -2.326 | diseased), i.e. 1 - Se at that cutoff

      # AUC

      AUC1 <- phi(-(lambda1[2]-lambda1[1])/sqrt(sigma1[2]+sigma1[1]))

      AUC2 <- phi(-(lambda2[2]-lambda2[1])/sqrt(sigma2[2]+sigma2[1]))

      diff <- AUC1-AUC2 # difference between two AUCs

      # ROC curve

      for(i in 1:111) {

        c1[i] <-  ((-8.1+0.1*i)-lambda1[1])/sqrt(sigma1[1]) # grid runs from -8 to 3

        se1[i] <- 1-phi(c1[i])

        c2[i] <-  ((-8.1+0.1*i)-lambda1[2])/sqrt(sigma1[2])

        sp1[i] <- phi(c2[i])

        c3[i] <-  ((-8.1+0.1*i)-lambda2[1])/sqrt(sigma2[1])

        se2[i] <- 1-phi(c3[i])

        c4[i] <-  ((-8.1+0.1*i)-lambda2[2])/sqrt(sigma2[2])

        sp2[i] <- phi(c4[i])

      }

    }

    list(lambda1=c(0,-2),lambda2=c(0,-2),gamma1=c(0.5,1),gamma2=c(0.5,1),rho=c(0,0))

    list(alpha=c(1,1),

    S1=c(

      -2.30,0.66,-0.45,-1.11,-1.11,0.52,0.93,-1.71,0.80,0.07,0.76,0.43,-2.66,0.46,

      -0.34,0.53,0.17,-0.65,-0.04,0.17,0.35,-2.81,0.79,-1.17,0.40,0.13,0.88,0.83,

      -2.30,-2.53,-1.35,-1.71,0.63,0.55,-3.22,0.13,0.07,0.60,-2.21,-3.51,-0.07,-1.56,

      1.02,0.61,0.52,-3.51,0.56,0.03,0.60,0.46,-2.04,0.76,-1.39,0.51,0.15,-0.26,

      0.98,0.80,0.24,0.12,0.67,-0.29,0.18,-0.34,-0.07,-0.31,0.41,-1.05,-1.35,0.39,

      0.25,-1.71,0.92,-1.20,-2.30,-1.27,0.49,-0.26,-2.81,-2.81,-0.82,-0.73,-0.20,0.59,

      0.58,-2.41,-1.77,-2.04,

      -3.51,-2.66,-2.30,-1.77,-3.00,-3.22,-2.30,-2.21,-2.53,-3.22,-2.21,-3.22,-3.22,-3.22,

      -3.00,-3.22,-2.41,-2.21,-3.51,-2.12,-3.91,-2.04,-2.53,-2.81,-1.56,-1.77,-1.90,-2.81,

      -1.90,-1.05,-3.22,-1.90,-2.41,-3.51,-2.81,-2.41,-3.00,-2.66,-3.22,-1.27,-2.66,-1.66,

      -1.56,-2.81,-2.04,-3.00,-3.91,-4.61,-1.90,-2.04,-2.21,-2.21,-1.61,-1.24,-1.90,-3.91,

      -3.00,-2.81,-2.12,-2.12,-3.51,-2.12,-2.21,-2.81,-2.66,-4.61,-2.12,-2.21,-3.91,-3.51,

      -1.71,-2.81,-3.91,-4.61,-2.53,-3.22,-3.00,-2.53,-3.00,-4.61,-2.66,-4.61,-1.90,-2.53,

      -2.81,-2.66,-2.66,-2.53,-2.81,-3.00,-2.53,-3.91,-1.56,-3.91,-2.53,-2.81,-3.51,-3.22,

      -2.53,-3.22,-2.53,-1.71,-4.61,-3.22,-3.22,-3.51,-2.53,-2.81,-2.66,-3.00,-2.30,-1.83,

      -2.66,-3.22,-2.81,-4.61,-2.30,-2.53,-2.81,-3.91,-4.61,-1.71,-2.53,-3.00,-3.51,-2.66,

      -1.39,-2.30,-3.51,-2.81,-2.41,-3.00,-2.81,-2.53,-2.53,-3.91,-1.11,-2.41,0.17,-1.24,

      -2.04,-2.04,-1.20,-3.51,-2.41,-2.12,-3.22,-0.42,-1.11,-1.83,-2.21,-2.66,-3.51,-3.22,

      -2.12,-1.77,-0.54,-2.30,-2.12,-2.30,-4.61,-2.66,-1.71,-1.97,-1.51,-1.61,-2.41,-2.53,

      -2.66,-3.91,-1.61,-3.00,-2.41,-2.04,-0.99,-2.12,-3.00,-1.97,-1.56,-3.51,-2.81,-2.41,

      -1.77,-2.53,-2.12,-2.66,-3.51,-4.61,-3.22,-3.22,-4.61,-3.22,-1.71,-2.41,-2.04,-3.00,

      -3.22,-3.51,-2.53,-3.91,-2.30,-2.04,-3.00,-2.30,-3.51,-2.81,-0.97,-2.30,-2.66,-3.91,

      -2.66,-3.22,-3.00,-3.00,-2.81,-3.91,-3.00,-3.22,-3.22,-3.91,-3.51,-2.81,-3.00,-2.81,

      -3.51,-1.83,-4.61,-3.51,-2.53,-3.22,-2.12,-2.81,-2.66,-1.47,-2.81,-2.41,-1.77,-3.22,

      -2.81,-2.41,-2.41,-2.41,-1.35,-3.00,-3.22,-3.51,-2.53,-3.22,-2.53,-3.22,-2.41,-2.04,

      -4.61,-2.53,-2.30,-2.66,-2.41,-2.66,-3.22,-0.97,-3.51,-3.51,-3.51,-3.22,-2.30,-1.83,

      -2.81,-3.22,-3.22,-1.39,-3.00,-2.53,-2.53,-2.04,-3.51,-2.66,-2.66,-2.66,-3.22,-3.00,

      -2.41,-2.53,-2.41,-1.08,-2.81,-3.22,-1.66,-4.61,-3.91,-3.51,-3.91,-4.61,-2.41,-3.00,

      -3.91,-2.66,-3.00,-3.00,-2.81,-3.91,-2.66,-3.00,-3.51,-2.66,-3.91,-3.22,-1.27,-2.81,

      -1.31,-3.51,-2.21,-2.41,-2.04,-2.81,-2.81,-4.61,-3.51,-1.97,-4.61,-2.53,-3.00,-1.71,

      -3.91,-4.61,-3.22,-3.51,-0.60,-3.91,-1.39,-3.22,-2.41,-1.61,-1.90,-3.22,-3.22,-2.81,

      -2.30,-2.04,-1.83,-2.21,-1.51,-2.21,-2.21,-2.66,-2.21,-2.66,-2.66,-3.22,-2.66,-1.83,

      -2.66,-1.83,-3.22,-3.91,-2.21,-3.22,-3.22,-2.66,-2.66,-2.04,-2.41,-1.61,-3.91,-2.66,

      -2.21,-2.41,-1.71,-3.22,-1.83,-2.41,-2.66,-2.66,-2.04,-1.83,-3.22,-1.83,-3.22,-2.21,

      -2.41,-2.41,-2.66,-2.41,-3.91,-3.22,-2.41,-1.83,-2.04,-2.21,-2.66,-2.04,-3.22,-3.91,

      -2.66),

    S2=c(

      -1.09,-0.69,-1.26,-0.54,-2.09,-0.76,0.13,-1.86,-0.21,-1.48,-0.02,-0.17,-1.19,-1.51,

      -0.92,0.21,-1.01,-1.97,-1.02,-0.06,-1.26,-2.34,-0.29,-2.44,-1.03,-1.97,-0.22,-1.45,

      -2.53,-2.03,-1.90,-2.27,0.81,0.17,-1.45,-0.56,-0.32,-0.02,-2.03,-2.21,-1.29,-1.31,

      0.22,0.17,0.37,-3.06,-0.33,-0.97,0.84,-0.03,-2.66,0.01,-1.39,0.81,-0.45,-0.87,

      0.21,0.46,-0.53,-0.29,0.51,-0.56,-0.67,-0.54,-0.51,-0.99,-0.21,-2.04,-1.97,-0.16,

      0.43,-1.77,0.19,-1.08,-2.30,-1.66,-0.78,-0.69,-2.41,-2.41,-1.51,-2.53,-0.33,0.20,

      -0.24,-2.53,-0.80,-2.12,

      -2.98,-2.90,-2.66,-2.75,-2.83,-2.83,-2.92,-2.98,-2.88,-2.55,-2.78,-2.80,-2.94,-2.98,

      -3.02,-2.83,-2.86,-2.98,-2.88,-2.63,-2.94,-2.48,-2.86,-2.88,-2.78,-2.98,-2.80,-2.94,

      -2.60,-2.63,-2.92,-2.81,-2.62,-2.86,-2.83,-2.81,-2.78,-2.76,-2.63,-1.98,-2.81,-2.28,

      -2.69,-2.60,-2.31,-2.63,-2.70,-2.88,-2.55,-2.60,-2.45,-2.73,-2.78,-1.81,-2.69,-2.51,

      -2.72,-2.67,-1.53,-2.78,-2.65,-2.17,-2.36,-2.75,-2.49,-2.50,-2.59,-2.69,-2.20,-2.40,

      -2.23,-2.12,-2.65,-2.44,-2.67,-2.55,-2.56,-2.23,-2.70,-2.24,-2.44,-1.93,-2.28,-2.72,

      -2.25,-2.44,-2.66,-2.65,-2.70,-2.55,-2.30,-2.67,-2.73,-2.40,-3.06,-2.92,-3.06,-3.06,

      -3.02,-2.86,-2.94,-3.00,-2.88,-2.88,-3.08,-2.92,-3.06,-2.90,-2.92,-3.04,-3.10,-2.96,

      -3.02,-3.08,-2.88,-2.92,-2.96,-3.08,-2.94,-3.00,-3.04,-2.67,-2.98,-3.02,-3.00,-3.04,

      -2.44,-3.08,-3.02,-2.98,-2.94,-2.90,-2.47,-3.06,-3.06,-3.04,-1.87,-2.14,-2.45,-2.62,

      -2.63,-2.19,-2.54,-2.67,-2.54,-2.27,-2.73,-2.50,-2.60,-2.62,-2.62,-2.40,-2.59,-2.35,

      -2.44,-2.48,-2.58,-2.55,-2.75,-2.73,-2.80,-2.49,-2.51,-2.66,-2.60,-2.60,-2.67,-2.43,

      -2.73,-2.59,-2.67,-2.49,-2.54,-2.23,-2.49,-2.11,-2.59,-2.41,-2.51,-2.75,-2.81,-2.38,

      -2.47,-2.45,-2.75,-2.96,-2.83,-2.98,-2.90,-2.90,-2.92,-2.98,-2.92,-2.60,-2.98,-2.96,

      -2.92,-2.94,-2.81,-3.00,-2.54,-2.67,-2.96,-2.83,-2.90,-2.02,-2.70,-2.81,-1.97,-2.69,

      -2.30,-2.80,-2.90,-2.67,-2.86,-2.88,-2.75,-2.65,-2.80,-2.73,-2.76,-2.73,-2.78,-2.76,

      -2.83,-2.53,-2.85,-2.78,-2.85,-3.08,-3.10,-3.08,-3.08,-3.02,-3.08,-2.80,-2.98,-3.04,

      -3.02,-3.04,-3.08,-3.02,-2.86,-2.98,-3.00,-3.08,-3.08,-2.98,-3.06,-3.04,-3.06,-3.02,

      -3.02,-3.08,-2.83,-3.08,-3.08,-2.90,-2.98,-3.06,-2.94,-2.98,-3.06,-2.85,-3.02,-3.00,

      -3.06,-3.06,-3.08,-2.92,-3.10,-3.04,-3.08,-2.94,-3.10,-3.00,-2.98,-3.06,-2.98,-3.06,

      -3.04,-3.08,-2.65,-2.85,-3.00,-3.04,-2.51,-2.85,-2.63,-2.90,-2.56,-2.67,-2.78,-2.65,

      -2.67,-2.76,-2.78,-2.50,-2.32,-2.54,-2.56,-2.47,-2.73,-2.44,-2.76,-2.76,-2.54,-2.62,

      -2.54,-2.83,-2.65,-2.55,-2.59,-2.75,-2.66,-2.86,-2.69,-2.75,-2.75,-2.72,-2.15,-2.63,

      -2.80,-2.63,-2.48,-2.47,-2.34,-2.63,-2.50,-2.36,-2.75,-2.51,-2.30,-2.70,-2.69,-2.65,

      -2.40,-2.81,-3.00,-2.83,-2.81,-2.90,-2.86,-2.81,-2.25,-2.88,-3.00,-2.90,-2.81,-2.88,

      -2.80,-2.85,-2.98,-2.80,-2.83,-2.63,-2.58,-2.85,-2.76,-2.75,-2.85,-2.86,-2.81,-2.78,

      -2.75,-2.69,-2.62,-2.81,-2.92,-2.88,-2.92,-2.85,-2.94,-2.86,-2.96,-2.67,-2.63,-2.83,

      -2.81,-2.70,-2.69,-2.90,-2.80,-2.85,-2.67,-2.43,-2.85,-2.86,-2.98,-2.86,-2.94,-2.94,

      -2.83))

     

    node       mean     sd       MC error 2.50%    median   97.50%   start sample
    AUC1       0.943    0.02681  0.001064 0.878    0.9477   0.9806   501   9500
    AUC2       0.9691   0.01819  6.61E-04 0.9244   0.9726   0.9938   501   9500
    P[1]       0.185    0.02216  5.45E-04 0.1442   0.1839   0.2315   501   9500
    P[2]       0.815    0.02216  5.45E-04 0.7685   0.8161   0.8558   501   9500
    delta1     0.2119   0.06481  0.00261  0.09918  0.2073   0.3482   501   9500
    delta2     0.06812  0.03248  0.001226 0.01842  0.06366  0.1423   501   9500
    diff       -0.02602 0.01865  4.96E-04 -0.06952 -0.02332 0.004911 501   9500
    lambda1[1] -0.422   0.1916   0.007302 -0.8298  -0.4124  -0.06829 501   9500
    lambda1[2] -2.71    0.04259  5.44E-04 -2.793   -2.711   -2.628   501   9500
    lambda2[1] -0.9113  0.1459   0.005447 -1.202   -0.908   -0.6317  501   9500
    lambda2[2] -2.751   0.01361  2.92E-04 -2.778   -2.751   -2.725   501   9500
    rho[1]     0.7485   0.05997  0.001617 0.6134   0.7548   0.8469   501   9500
    rho[2]     0.1643   0.05411  6.68E-04 0.06006  0.1643   0.2694   501   9500
    sigma1[1]  1.388    0.3378   0.01171  0.8661   1.341    2.185    501   9500
    sigma1[2]  0.6539   0.04901  6.52E-04 0.5618   0.6518   0.7567   501   9500
    sigma2[1]  0.8766   0.1709   0.00485  0.5913   0.8592   1.253    501   9500
    sigma2[2]  0.05655  0.005542 1.55E-04 0.04604  0.05643  0.06796  501   9500

  • ROC curve estimation for qPCR assay in the absence of a gold standard using a mixture model with point mass and truncation
  • JAGS 1.0.3 code to accompany the paper entitled "Jafarzadeh SR, Johnson WO, Utts JM, Gardner IA. Bayesian estimation of the receiver operating characteristic curve for a diagnostic test with a limit of detection in the absence of a gold standard. Statistics in Medicine 2010; 29(20):2090-2106. DOI: 10.1002/sim.3975".

     

    A mixture model with point mass and truncation was developed to model quantitative real-time polymerase chain reaction (qPCR) data without a reference standard (i.e. gold standard). Test scores for qPCR, which are reported as cycle-to-threshold values, are subject to a limit of detection that is the maximum number of PCR amplification cycles run (e.g. 40 or 50 cycles). Therefore, qPCR test scores are censored at the maximum number of cycles run.

     

    Denote by d the maximum number of cycles for a qPCR. Most non-infected subjects have test scores equal to d, although some may have scores slightly less than d. Thus, all subjects with test scores of d are assumed to be non-infected. Test scores for infected subjects are assumed to be much less than d because they should be detected at earlier cycles. Therefore, the distribution of test scores for non-infected subjects is assumed to be a mixture of a point mass at d and a truncated distribution for scores less than d. The distribution of test scores for infected subjects is assumed to be a truncated distribution (with truncation point d), or a mixture of truncated distributions when test scores for the infected population are assumed to come from distinct subgroups (e.g. high- and regular-shedders in bovine Johne's disease).

     

    We modeled the qPCR data in our example (Section 4 in the paper) assuming a mixture of a point mass at 50 and a truncated gamma distribution (with truncation point 50) for non-infected subjects, and a mixture of two truncated gamma distributions (with truncation point 50) for infected subjects.
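
    Under these assumptions, data can be simulated from the mixture directly, which may help in understanding the model; a minimal R sketch in which the helper rtgamma and all parameter values are illustrative, loosely based on the posterior summaries reported below:

    # Draw from a gamma distribution truncated to (0, upper) by inverse-CDF sampling
    rtgamma <- function(n, shape, rate, upper = 50) {
      qgamma(runif(n, 0, pgamma(upper, shape, rate)), shape, rate)
    }
    set.seed(42)
    N <- 575
    q <- 0.65; w <- 0.05; pp50 <- 0.86    # illustrative values of P[3], w, and pp50
    grp <- sample(1:4, N, replace = TRUE,
                  prob = c((1 - q) * w,          # 1: infected, high shedder
                           (1 - q) * (1 - w),    # 2: infected, regular shedder
                           q * pp50,             # 3: non-infected, censored at 50
                           q * (1 - pp50)))      # 4: non-infected, score below 50
    ct <- ifelse(grp == 1, rtgamma(N, 30, 1.35),       # untruncated mean ~22 cycles
          ifelse(grp == 2, rtgamma(N, 30, 0.87),       # untruncated mean ~34 cycles
          ifelse(grp == 3, 50, rtgamma(N, 29, 0.66)))) # untruncated mean ~44 cycles
    table(ct == 50)   # censored vs. observed test scores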

     

    model {

      # n = 253, number of subjects with test scores '< 50'

      for (i in 1:n) {

        CowAvgCT[i] ~ dgamma(aa[i], bb[i])T(0, 50)

        # Shape

        aa[i] <- a[1] * equals(Z[i], 1) + a[2] * equals(Z[i], 2) + a[3] * equals(Z[i], 3)

        # Rate

        bb[i] <- b[1] * equals(Z[i], 1) + b[2] * equals(Z[i], 2) + b[3] * equals(Z[i], 3)

        # Latent infection status Z[i] = 1 if high-shedder, Z[i] = 2 if regular-shedder, and Z[i] = 3 if non-infected

        Z[i] ~ dcat(P[])

      }

      # n.censored = 322, number of subjects with test score '= 50'

      # N = 575

      n.censored ~ dbin(p50, N)

      # Infected, high-shedder prevalence, mode = 0.05, 95 per cent sure that prev. < 0.10

      P[1] <- w * (1 - P[3])

      w ~ dbeta(6.1946, 99.6983)

      # Infected, regular-shedder prevalence

      P[2] <- (1 - w) * (1 - P[3])

      # Non-infected prevalence, mode = 0.55, 95 per cent sure that prev. < 0.70

      P[3] <- q

      q ~ dbeta(14.6145, 12.1391)

      # p50 = Pr(50) = Pr(50 and non-infected)

      # = Pr(non-infected) * Pr(50 | non-infected)

      # = q * Pr(50 | non-infected)

      p50 <- P[3] * pp50

      # pp50 = Pr(50 | non-infected)

      pp50 ~ dbeta(2, 2)

      # Re-parametrization, shape = rate * mean

      a[1] <- bbb[1] * mu[1]

      a[2] <- bbb[2] * mu[2]

      a[3] <- bbb[3] * mu[3]

      b[1] <- bbb[1]

      b[2] <- bbb[2]

      b[3] <- bbb[3]

      # High-shedder group mean = 20, 95 per cent sure that mean > 15

      a.mu[1] <- 35.840

      b.mu[1] <- 1.792

      # Regular-shedder group mean = 35, 95 per cent sure that mean < 39

      a.mu[2] <- 222.730

      b.mu[2] <- 6.363

      # Non-infected population mean = 47, 95 per cent sure that mean > 42

      a.mu[3] <- 227.996

      b.mu[3] <- 4.851

      # To achieve identifiability, the priors on the means of the infected populations are set to be lower than the mean of the non-infected population, and the high-shedder group is forced to have a lower mean than the regular-shedder group

      for (k in 1:3) {

        mu.unsorted[k] ~ dgamma(a.mu[k], b.mu[k])

      }

      mu[1:3] <- sort(mu.unsorted)

      bbb[1] ~ dgamma(1, 1)

      bbb[2] ~ dgamma(1, 1)

      bbb[3] ~ dgamma(1, 1)

      # Standard deviation

      sigma[1] <- sqrt(a[1]) / b[1]

      sigma[2] <- sqrt(a[2]) / b[2]

      sigma[3] <- sqrt(a[3]) / b[3]

      # Numerical computation of AUC

      for (j in 1:1000) {

        X1[j] ~ dgamma(a[1], b[1])T(0, 50)

        X2[j] ~ dgamma(a[2], b[2])T(0, 50)

        Y[j] ~ dgamma(a[3], b[3])T(0, 50)

        AUC.temp[j] <- p50 + (1 - p50) * (w * (1 - step(X1[j] - Y[j])) + (1 - w)* (1 - step(X2[j] - Y[j])))

      }

      AUC <- mean(AUC.temp[])

      # Numerical computation of Se and Sp for cutoffs = 18, 19, . . . , 50 to plot ROC

      for (C in 1:33) {

        for(CC in 1:1000) {

          XX1[C, CC] ~ dgamma(a[1], b[1])T(0, 50)

          XX2[C, CC] ~ dgamma(a[2], b[2])T(0, 50)

          YY[C, CC] ~ dgamma(a[3], b[3])T(0, 50)

          Se.temp[C, CC] <- w * (1 - step(XX1[C, CC] - (17 + C))) + (1 - w) * (1 - step(XX2[C, CC] - (17 + C)))

          Sp.temp[C, CC] <- p50 + (1 - p50) * step(YY[C, CC] - (17 + C))

        }

        Se[C] <- mean(Se.temp[C, 1:1000])

        Sp[C] <- mean(Sp.temp[C, 1:1000])

      }

    }

     

    # How to run the JAGS code via R:

    # You need JAGS, R, and the rjags package for R.

    # Install JAGS, if not already installed, from http://mcmc-jags.sourceforge.net/

    # Install R, if not already installed, from http://www.r-project.org/

    # Install the "rjags" package within R, if not already installed: > install.packages("rjags")

     

    # save your JAGS (BUGS) code in a text file, e.g. "model_mixtrgamma.txt", and

    # place it in your R working directory (folder)

    # save your data in a text file, e.g. "data_qpcr.txt" and place it in your R working directory

     

    # open R software and load the rjags library in R

    > library(rjags)

    # load your data file from your R working directory into R environment

    > source("data_qpcr.txt")

    # compile

    > mixtrgamma.model <- jags.model(file = "model_mixtrgamma.txt", n.chains = 1)

    # burn-in

    > update(mixtrgamma.model, n.iter = 500000)

     

    # sample and monitor

    > mixtrgamma.samples <- coda.samples(model = mixtrgamma.model, variable.names = c("AUC", "P", "p50", "pp50", "w", "mu", "sigma", "a", "b"), n.iter = 1000000, thin = 10)

    # posteriors

    > summary(mixtrgamma.samples)

    Iterations = 501010:1501000
    Thinning interval = 10
    Number of chains = 1
    Sample size per chain = 1e+05

    1. Empirical mean and standard deviation for each variable,
       plus standard error of the mean:

                 Mean        SD  Naive SE Time-series SE
    AUC       0.89637  0.046043 1.456e-04      3.383e-04
    P[1]      0.01848  0.007903 2.499e-05      2.382e-05
    P[2]      0.33501  0.052246 1.652e-04      2.622e-04
    P[3]      0.64651  0.054443 1.722e-04      2.741e-04
    a[1]     30.06037 23.530303 7.441e-02      9.005e-02
    a[2]     29.53650 10.802924 3.416e-02      5.442e-02
    a[3]     28.79491 15.055470 4.761e-02      1.243e-01
    b[1]      1.35189  1.048753 3.316e-03      3.882e-03
    b[2]      0.86601  0.337705 1.068e-03      1.826e-03
    b[3]      0.66246  0.357613 1.131e-03      2.942e-03
    mu[1]    22.24581  2.979304 9.421e-03      8.817e-03
    mu[2]    34.47396  2.032121 6.426e-03      1.364e-02
    mu[3]    43.97850  2.025455 6.405e-03      8.727e-03
    p50       0.55280  0.020426 6.459e-05      6.901e-05
    pp50      0.86024  0.068525 2.167e-04      3.289e-04
    sigma[1]  5.37876  3.998725 1.265e-02      1.214e-02
    sigma[2]  6.66371  1.363037 4.310e-03      7.642e-03
    sigma[3]  8.98547  2.353363 7.442e-03      1.775e-02
    w         0.05233  0.020765 6.567e-05      5.690e-05

     

    2. Quantiles for each variable:

                 2.5%      25%      50%      75%    97.5%
    AUC       0.80141  0.86443  0.89836  0.93240  0.97273
    P[1]      0.00653  0.01271  0.01729  0.02302  0.03712
    P[2]      0.22094  0.30153  0.34039  0.37349  0.42206
    P[3]      0.55688  0.60626  0.64060  0.68117  0.76618
    a[1]      2.41568 13.02490 24.37827 40.71251 90.97702
    a[2]     14.25610 21.93573 27.63882 35.04748 55.26001
    a[3]     11.51352 17.78824 24.64292 35.76988 67.92466
    b[1]      0.11649  0.59157  1.09787  1.82802  4.08015
    b[2]      0.39108  0.62651  0.80440  1.03924  1.67686
    b[3]      0.24260  0.39959  0.56625  0.83254  1.58522
    mu[1]    16.47456 20.29745 22.18623 24.11231 28.43715
    mu[2]    31.14931 32.92800 34.25774 35.83093 38.83756
    mu[3]    40.89891 42.58488 43.61417 45.02471 48.93764
    p50       0.51244  0.53894  0.55294  0.56677  0.59231
    pp50      0.71803  0.81278  0.86542  0.91288  0.97483
    sigma[1]  2.29197  3.46637  4.50324  6.14079 13.57972
    sigma[2]  4.39712  5.68394  6.53947  7.49462  9.69202
    sigma[3]  5.19212  7.17967  8.75371 10.55969 14.05263
    w         0.01980  0.03716  0.04957  0.06457  0.09993

UC Legal Disclaimer

The Regents of the University of California disclaim all warranties with regard to these software, including all implied warranties of merchantability, non-infringement, and fitness for a particular use. In no event shall the Regents of the University of California be liable for any direct, special, indirect, or consequential damage or any damage whatsoever resulting from the loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of these software. Use at your own risk. If you do not agree to this, do not use these software.