Software

Diagnostic Test Se and Sp Estimation

  • 2 independent tests, 2 populations, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.2.2.
    Two independent tests, two populations
    Estimate Se and Sp of microscopic examination and PCR.
    Hui-Walter Model for N. salmonis in trout.
    Data source: Enøe C, Georgiadis MP, Johnson WO. Estimation of sensitivity and specificity of diagnostic tests and disease prevalence when the true disease state is unknown. Preventive Veterinary Medicine 2000; 45(1-2):61-81. DOI: 10.1016/S0167-5877(00)00117-3.

    model{
    y[1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q], n1)
    z[1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q], n2)
    p1[1,1] <- pi1*Seme*Sepcr + (1-pi1)*(1-Spme)*(1-Sppcr)
    p1[1,2] <- pi1*Seme*(1-Sepcr) + (1-pi1)*(1-Spme)*Sppcr
    p1[2,1] <- pi1*(1-Seme)*Sepcr + (1-pi1)*Spme*(1-Sppcr)
    p1[2,2] <- pi1*(1-Seme)*(1-Sepcr) + (1-pi1)*Spme*Sppcr
    p2[1,1] <- pi2*Seme*Sepcr + (1-pi2)*(1-Spme)*(1-Sppcr)
    p2[1,2] <- pi2*Seme*(1-Sepcr) + (1-pi2)*(1-Spme)*Sppcr
    p2[2,1] <- pi2*(1-Seme)*Sepcr + (1-pi2)*Spme*(1-Sppcr)
    p2[2,2] <- pi2*(1-Seme)*(1-Sepcr) + (1-pi2)*Spme*Sppcr
    Seme ~ dbeta(2.82, 2.49) ## Mode=0.55, 95% sure Seme < 0.85
    Spme ~ dbeta(15.7, 1.30) ## Mode=0.98, 95% sure Spme > 0.80
    pi2 ~ dbeta(1.73, 2.71) ## Mode=0.30, 95% sure pi2 > 0.08
    Sepcr ~ dbeta(8.29, 1.81) ## Mode=0.90, 95% sure Sepcr > 0.60
    Sppcr ~ dbeta(10.69, 2.71) ## Mode=0.85, 95% sure Sppcr > 0.60
    Z ~ dbern(tau1)
    pi1star ~ dbeta(1.27, 9.65) ## Mode=0.03, 95% sure pi1star < 0.30
    pi1 <- Z*pi1star
    }

    list(n2=30, n1=132, z=structure(.Data=c(3,0,24,3),.Dim=c(2,2)),
    y=structure(.Data=c(0,0,3,129),.Dim=c(2,2)), Q=2, tau1=0.95)
    list(Z=1, pi1star=0.03, pi2=0.30, Seme=0.55, Spme=0.98, Sepcr=0.90, Sppcr=0.85)

    node   mean     sd       MC error 2.50%   median   97.50%  start sample
    Seme   0.1745   0.0652   2.32E-04 0.06738 0.1678   0.3195  10001 100000
    Sepcr  0.9301   0.04626  1.86E-04 0.8173  0.9391   0.9919  10001 100000
    Spme   0.9914   0.007498 3.93E-05 0.9717  0.9935   0.9996  10001 100000
    Sppcr  0.963    0.01671  6.88E-05 0.9247  0.9651   0.9893  10001 100000
    pi1    0.009058 0.01219  5.79E-05 0       0.003916 0.04176 10001 100000
    pi2    0.8524   0.06658  2.56E-04 0.7027  0.8596   0.9596  10001 100000
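
The model listing can be sanity-checked outside WinBUGS. The Python sketch below is an illustration only, not part of the published code: it recomputes the mode of each Beta prior from its parameters to compare with the value stated in the comments, and it evaluates the four cell probabilities for two conditionally independent tests. The function names `beta_mode` and `hui_walter_cells` are hypothetical, and plugging posterior means into the cell formulas is an assumption made purely to show that the probabilities sum to 1.

```python
def beta_mode(a, b):
    """Mode of a Beta(a, b) density; this formula holds only for a > 1 and b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


def hui_walter_cells(pi, se1, sp1, se2, sp2):
    """2x2 cell probabilities for two conditionally independent tests.

    Rows index test 1 (+, -) and columns index test 2 (+, -),
    in the same layout as p1 and p2 in the model listing.
    """
    return [
        [pi * se1 * se2 + (1 - pi) * (1 - sp1) * (1 - sp2),
         pi * se1 * (1 - se2) + (1 - pi) * (1 - sp1) * sp2],
        [pi * (1 - se1) * se2 + (1 - pi) * sp1 * (1 - sp2),
         pi * (1 - se1) * (1 - se2) + (1 - pi) * sp1 * sp2],
    ]


# Priors copied from the model listing: name -> (a, b, mode stated in the comment)
priors = {
    "Seme": (2.82, 2.49, 0.55),
    "Spme": (15.7, 1.30, 0.98),
    "pi2": (1.73, 2.71, 0.30),
    "Sepcr": (8.29, 1.81, 0.90),
    "Sppcr": (10.69, 2.71, 0.85),
    "pi1star": (1.27, 9.65, 0.03),
}
for name, (a, b, stated) in priors.items():
    print(f"{name}: computed mode {beta_mode(a, b):.3f}, stated {stated}")

# Plug-in of the posterior means for population 2 (illustration only;
# this is not how the posterior itself is computed).
cells_pop2 = hui_walter_cells(pi=0.8524, se1=0.1745, sp1=0.9914,
                              se2=0.9301, sp2=0.963)
total = sum(sum(row) for row in cells_pop2)
print("cell probabilities:", cells_pop2, "sum:", total)
```

Each computed mode should agree with the stated mode to two decimal places, which is a quick way to catch a mistyped prior parameter.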
  • 2 independent tests, 2 populations, no gold standard (TAGS) - frequentist approach
  • TAGS Instructions
    Worked Example

    Dubey et al. 1995 (Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036) compared 5 serologic tests for the diagnosis of toxoplasmosis in 1000 naturally-exposed sows using bioassay methods as the gold standard (definitive test). Bioassays were done in mice (all sows) and cats (183 sows) using cardiac muscle from sampled sows. Samples in the study were collected in two batches: nos. 1-463 and 464-1000.

    If Toxoplasma gondii were isolated from either mice or cats, the sow was considered infected. A sow was considered non-infected if the bioassay results were negative. To demonstrate use of TAGS, we will use results of the bioassay test and 2 of the 5 serologic tests: the modified agglutination test (MAT) and the enzyme-linked immunosorbent assay (ELISA). These 2 serologic tests were the most accurate of the tests evaluated and are commonly used in screening of pigs for toxoplasmosis. The MAT was considered positive if the titer was >= 20 and the ELISA was positive if the OD value was > 0.36. The sensitivity (Se) and specificity (Sp) of the MAT using bioassay as the gold standard were calculated to be 82.9% and 90.2%, respectively. This represents the traditional approach that assumes that the combined bioassay (cat + mouse) has Se = Sp = 1.

    Calculations 
    The TAGS method requires a minimum of 2 populations with 2 conditionally independent tests. Use of the batch data (batch 1 = 463; batch 2 = 537) provides a logical way to create 2 populations of similar size in which the sensitivity and specificity of the tests should be equivalent. Cross-classified test results for the batches are in the following table:

    Batch  MAT+/      MAT+/      MAT-/      MAT-/
           Bioassay+  Bioassay-  Bioassay+  Bioassay-
     1     37         55         7          364
     2     104        26         22         385
    What are the sensitivity and specificity of the MAT using “TAGS” when the MAT and bioassay are the tests under consideration?

    Estimates for the MAT using TAGS are exactly the same as those from the traditional approach, i.e., Se = 82.9% and Sp = 90.2%. TAGS also estimated that the bioassay was perfectly sensitive and specific, which matches the assumption made in the traditional approach. This agreement supports treating the bioassay as a true gold standard.
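
The traditional (gold-standard) estimates quoted above can be reproduced directly from the batch table. The short Python sketch below pools the two batches and treats the bioassay result as the true infection status; the variable names are illustrative, and the per-batch prevalence is computed only to show that the two populations differ in prevalence.

```python
# Counts from the table: (MAT+/Bioassay+, MAT+/Bioassay-, MAT-/Bioassay+, MAT-/Bioassay-)
batches = {1: (37, 55, 7, 364), 2: (104, 26, 22, 385)}

# Pool both batches, treating bioassay as the gold standard.
tp = sum(b[0] for b in batches.values())  # MAT+ among bioassay-positive sows
fp = sum(b[1] for b in batches.values())  # MAT+ among bioassay-negative sows
fn = sum(b[2] for b in batches.values())  # MAT- among bioassay-positive sows
tn = sum(b[3] for b in batches.values())  # MAT- among bioassay-negative sows

se = tp / (tp + fn)
sp = tn / (tn + fp)
print(f"Se = {se:.1%}, Sp = {sp:.1%}")  # Se = 82.9%, Sp = 90.2%

# Apparent prevalence by batch, again using bioassay as the reference.
prevalence = {k: (b[0] + b[2]) / sum(b) for k, b in batches.items()}
print(prevalence)
```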

    Now let’s ignore the bioassay results and use the cross-classified data from MAT and ELISA by batch, as follows (1 pig had a missing ELISA value):

    Batch  MAT+/   MAT+/   MAT-/   MAT-/
           ELISA+  ELISA-  ELISA+  ELISA-
     1     67      25      41      329
     2     97      33      36      371

    What are the sensitivity and specificity of the MAT now using “TAGS” when the MAT and the ELISA are the tests under comparison?

    MAT is estimated to have Se = 100% and Sp = 97.3%, both of which are clearly overestimated compared with the correct values obtained when bioassay is used as the comparison.

    What is the most likely reason for the change in MAT estimates?

    Results of the two tests (MAT and ELISA) are positively correlated (dependent), conditional on infection status (see Gardner IA, Stryhn H, Lind P, Collins MT. Conditional dependence between tests affects the diagnosis and surveillance of animal diseases. Preventive Veterinary Medicine 2000; 45(1-2):107-122. DOI: 10.1016/S0167-5877(00)00119-7). This positive dependence between the tests results in overestimation of MAT accuracy with TAGS.

    The main conclusion is that it is inappropriate to use the TAGS software to compare 2 serologic tests without a gold standard because the tests are likely to be dependent. Other methods (and software) that account for this dependence (2 conditionally dependent tests, 2 populations) are necessary to obtain unbiased estimates.
  • 2 independent tests, 2 populations, no gold standard - spreadsheet workbook
  • Excel workbook to estimate Se and Sp, and for frequentist sample size calculations.

  • 2 dependent tests, 1 population, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.3.3.1.
    Two dependent tests, one population.
    Estimation of the Se and Sp of two FAT tests.
    Classical swine fever virus.
    Data source: Bouma A, Stegeman JA, Engel B, de Kluijver EP, Elbers AR, De Jong MC. Evaluation of diagnostic tests for the detection of classical swine fever in the field without a gold standard. Journal of Veterinary Diagnostic Investigation 2001; 13(5):383-388. DOI: 10.1177/104063870101300503.

    model{
    x[1:4] ~ dmulti(p[1:4], n)
    p[1] <- pi*(Sefat1*Sefat2+covDp) + (1-pi)*((1-Spfat1)*(1-Spfat2)+covDn)
    p[2] <- pi*(Sefat1*(1-Sefat2)-covDp) + (1-pi)*((1-Spfat1)*Spfat2-covDn)
    p[3] <- pi*((1-Sefat1)*Sefat2-covDp) + (1-pi)*(Spfat1*(1-Spfat2)-covDn)
    p[4] <- pi*((1-Sefat1)*(1-Sefat2)+covDp) + (1-pi)*(Spfat1*Spfat2+covDn)
    ls <- (Sefat1-1)*(1-Sefat2)
    us <- min(Sefat1,Sefat2) - Sefat1*Sefat2
    lc <- (Spfat1-1)*(1-Spfat2)
    uc <- min(Spfat1,Spfat2) - Spfat1*Spfat2
    pi ~ dbeta(13.322, 6.281) ### Mode=0.70, 95% sure > 0.50
    Sefat1 ~ dbeta(9.628,3.876) ### Mode=0.75, 95% sure > 0.50
    Spfat1 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70
    Sefat2 ~ dbeta(9.628, 3.876) ### Mode=0.75, 95% sure > 0.50
    Spfat2 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70
    covDn ~ dunif(lc, uc)
    covDp ~ dunif(ls, us)
    rhoD <- covDp / sqrt(Sefat1*(1-Sefat1)*Sefat2*(1-Sefat2))
    rhoDc <- covDn / sqrt(Spfat1*(1-Spfat1)*Spfat2*(1-Spfat2))
    }

    list(n=214, x=c(121,6,16,71))
    list(pi=0.7, Sefat1=0.75, Spfat1=0.90, Sefat2=0.75, Spfat2=0.90)
     

    node   mean   sd      MC error 2.50%    median 97.50% start sample
    Sefat1 0.7691 0.06151 9.16E-04 0.6497   0.7693 0.887  10001 100000
    Sefat2 0.8147 0.06045 9.83E-04 0.6959   0.8156 0.9281 10001 100000
    Spfat1 0.8851 0.0581  5.35E-04 0.7533   0.8924 0.975  10001 100000
    Spfat2 0.8485 0.07411 5.83E-04 0.6856   0.8566 0.9667 10001 100000
    rhoD   0.6814 0.1484  0.001845 0.3135   0.7049 0.9071 10001 100000
    rhoDc  0.3629 0.2655  0.002054 -0.09339 0.361  0.858  10001 100000
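
The role of the covariance terms in the model can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than part of the published code: the function names are hypothetical, the accuracy and prevalence values are the initial values from the inits list, and setting each covariance to half of its upper bound is an arbitrary choice. The sketch mirrors the bounds `ls`, `us`, `lc`, `uc`, the four cell probabilities `p[1]` to `p[4]`, and the correlations `rhoD` and `rhoDc`.

```python
from math import sqrt


def cov_bounds(a, b):
    """Lower and upper limits for the covariance of two binary tests whose
    marginal probabilities are a and b (ls/us and lc/uc in the model)."""
    return (a - 1) * (1 - b), min(a, b) - a * b


def dependent_cells(pi, se1, sp1, se2, sp2, cov_dp, cov_dn):
    """Cell probabilities in the order (+,+), (+,-), (-,+), (-,-), as p[1:4]."""
    return [
        pi * (se1 * se2 + cov_dp) + (1 - pi) * ((1 - sp1) * (1 - sp2) + cov_dn),
        pi * (se1 * (1 - se2) - cov_dp) + (1 - pi) * ((1 - sp1) * sp2 - cov_dn),
        pi * ((1 - se1) * se2 - cov_dp) + (1 - pi) * (sp1 * (1 - sp2) - cov_dn),
        pi * ((1 - se1) * (1 - se2) + cov_dp) + (1 - pi) * (sp1 * sp2 + cov_dn),
    ]


def correlation(cov, a, b):
    """Correlation implied by a covariance, as in rhoD and rhoDc."""
    return cov / sqrt(a * (1 - a) * b * (1 - b))


# Initial values from the inits list (used here only as example inputs).
pi, se1, sp1, se2, sp2 = 0.7, 0.75, 0.90, 0.75, 0.90

ls, us = cov_bounds(se1, se2)
lc, uc = cov_bounds(sp1, sp2)

# Arbitrary illustrative covariances: half of each upper bound.
cov_dp, cov_dn = 0.5 * us, 0.5 * uc
cells = dependent_cells(pi, se1, sp1, se2, sp2, cov_dp, cov_dn)

print("Se covariance range:", (ls, us))
print("rhoD at the upper bound:", correlation(us, se1, se2))
print("cells:", cells, "sum:", sum(cells))
```

With equal sensitivities, the upper covariance bound corresponds to a correlation of 1, and the covariance terms cancel across the four cells so the probabilities still sum to 1.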

  • 2 dependent tests, 2 populations, no gold standard
  • WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

    Example from section 3.3.1.
    Two dependent tests, two populations.
    Estimation of Se and Sp of ELISA and MAT.
    Toxoplasmosis in pigs.
    Data source: Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036.

    model{
      y1[1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q], n1)
      y2[1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q], n2)
      p1[1,1] <- pi1*eta11 + (1-pi1)*theta11
      p1[1,2] <- pi1*eta12 + (1-pi1)*theta12
      p1[2,1] <- pi1*eta21 + (1-pi1)*theta21
      p1[2,2] <- pi1*eta22 + (1-pi1)*theta22
      p2[1,1] <- pi2*eta11 + (1-pi2)*theta11
      p2[1,2] <- pi2*eta12 + (1-pi2)*theta12
      p2[2,1] <- pi2*eta21 + (1-pi2)*theta21
      p2[2,2] <- pi2*eta22 + (1-pi2)*theta22
      eta11 <- lambdaD*eta1
      eta12 <- eta1 - eta11
      eta21 <- gammaD*(1-eta1)
      eta22 <- 1 - eta11 - eta12 - eta21
      theta11 <- 1 - theta12 - theta21 - theta22
      theta12 <- gammaDc*(1-theta1)
      theta21 <- theta1 - theta22
      theta22 <- lambdaDc* theta1
      eta2 <- eta11 + eta21
      theta2 <- theta22 + theta12
      rhoD <- (eta11 - eta1*eta2) / sqrt(eta1*(1-eta1)*eta2*(1-eta2))
      rhoDc <- (theta22 - theta1*theta2) / sqrt(theta1*(1-theta1)*theta2*(1-theta2))
      pi1 ~ dbeta(1.3, 5)
      pi2 ~ dbeta(1.5, 3)
      eta1 ~ dbeta(24.09, 5.73)
      theta1 ~ dbeta(23.05, 3.45)
      lambdaD ~ dbeta(1.376, 1.161) ## Mode=0.70, 5th %tile=0.10
      gammaD ~ dbeta(1.376, 1.161)
      lambdaDc ~ dbeta(2, 1.176) ## Mode=0.85, 5th %tile=0.20
      gammaDc ~ dbeta(2, 1.176)
    }

    list(n1=462, n2=537, Q=2,
    y1=structure(.Data=c(
    67,25,41,329),.Dim=c(2,2)),
    y2=structure(.Data=c(
    97,33,36,371),.Dim=c(2,2)))
    list(pi1=0.07, pi2=0.20, eta1=0.83, theta1=0.90, lambdaD=0.50, lambdaDc=0.50, gammaD=0.50, gammaDc=0.50)

    node   mean   sd      MC error 2.50%   median 97.50% start sample
    eta1   0.8075 0.07051 4.54E-04 0.6516  0.8143 0.9246 10001 100000
    eta2   0.7403 0.1159  0.00108  0.4863  0.7498 0.9326 10001 100000
    theta1 0.897  0.04269 6.07E-04 0.809   0.9016 0.9688 10001 100000
    theta2 0.8629 0.04344 6.18E-04 0.7756  0.867  0.9387 10001 100000
    rhoD   0.3121 0.2676  0.00196  -0.1906 0.3249 0.7918 10001 100000
    rhoDc  0.4015 0.1916  0.002407 -0.004  0.4337 0.6932 10001 100000
    pi1    0.1438 0.05738 7.88E-04 0.0269  0.1483 0.2484 10001 100000
    pi2    0.1921 0.05861 7.95E-04 0.06753 0.1966 0.2982 10001 100000
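
This model expresses dependence through conditional probabilities rather than covariances. The Python sketch below is an illustration under stated assumptions, not part of the published code: the function names are hypothetical, and the input values are the initial values from the inits list plus one arbitrary alternative for `lambdaD`. It rebuilds `eta11` to `eta22`, `eta2`, and `rhoD` for the infected group and shows that the tests are uncorrelated when `lambdaD` equals `gammaD`.

```python
from math import sqrt


def infected_cells(eta1, lambda_d, gamma_d):
    """Joint test outcomes among infected animals, following the model listing.

    eta1     : sensitivity of test 1
    lambda_d : P(test 2 positive | test 1 positive, infected)
    gamma_d  : P(test 2 positive | test 1 negative, infected)
    """
    eta11 = lambda_d * eta1
    eta12 = eta1 - eta11
    eta21 = gamma_d * (1 - eta1)
    eta22 = 1 - eta11 - eta12 - eta21
    return eta11, eta12, eta21, eta22


def rho_d(eta1, lambda_d, gamma_d):
    """Correlation between the two tests among infected animals (rhoD)."""
    eta11, _, eta21, _ = infected_cells(eta1, lambda_d, gamma_d)
    eta2 = eta11 + eta21  # sensitivity of test 2
    return (eta11 - eta1 * eta2) / sqrt(eta1 * (1 - eta1) * eta2 * (1 - eta2))


# Initial values from the inits list: lambdaD = gammaD = 0.50.
cells_equal = infected_cells(0.83, 0.50, 0.50)
rho_equal = rho_d(0.83, 0.50, 0.50)

# Arbitrary alternative: test 2 is more often positive when test 1 is positive.
rho_positive = rho_d(0.83, 0.90, 0.50)

print("cells:", cells_equal, "sum:", sum(cells_equal))
print("rhoD with lambdaD == gammaD:", rho_equal)
print("rhoD with lambdaD > gammaD:", rho_positive)
```

The same construction applies to the non-infected group with `theta1`, `lambdaDc`, and `gammaDc`, where the conditional probabilities refer to negative results.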