F
Simulating False Match Probabilities Based on Normal Theory^{1}
WHY THE FALSE MATCH PROBABILITY DEPENDS ONLY ON THE RATIO δ / σ
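As a function of δ, we are interested in P{match on one element is declared | |µ_{x} − µ_{y}| > δ}, where µ_{x} and µ_{y} are the true means of one of the seven elements in the melts of the CS and PS bullets, respectively. The within-replicate variance is generally small, so we assume that the sample means of the three replicates are normally distributed; that is,
\[ \bar{x} \sim N(\mu_{x}, \sigma^{2}/3), \qquad \bar{y} \sim N(\mu_{y}, \sigma^{2}/3), \qquad \mu_{x} - \mu_{y} = \delta, \]
where “~” stands for “is distributed as.” Thus, the difference in the means is δ. We further assume that the errors in the measurements leading to \bar{x} and \bar{y} are independent. Based on these assumptions, statistical theory asserts that
\[ \bar{x} - \bar{y} \sim N(\delta, 2\sigma^{2}/3), \qquad 4 s_{p}^{2}/\sigma^{2} \sim \chi^{2}_{4}, \]
where s_{p}^{2} = (s_{x}^{2} + s_{y}^{2})/2 and χ^{2}_{4} denotes the chi-squared distribution on four degrees of freedom. If σ^{2} is estimated from a pooled variance on B (more than 2) samples, then 2B s_{p}^{2}/σ^{2} ~ χ^{2}_{2B}. Let v equal the number of degrees of freedom used to estimate σ; for example, v = 4 if s_{p}^{2} = (s_{x}^{2} + s_{y}^{2})/2, and v = 2B if s_{p}^{2} = (s^{2}_{1} + … + s^{2}_{B})/B. The ratio of (\bar{x} − \bar{y} − δ)/(σ√(2/3)) to s_{p}/σ has the same distribution as Z/√(χ^{2}_{v}/v), where Z is a standard normal variable independent of χ^{2}_{v}, namely a Student’s t_{v} (v degrees of freedom), so the two-sample t statistic is distributed as a (central) Student’s t on v degrees of freedom:
\[ \frac{\bar{x} - \bar{y} - \delta}{s_{p}\sqrt{2/3}} \sim t_{v}. \]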
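The FBI criterion for a match on this one element can be written
\[ |\bar{x} - \bar{y}| < 2(s_{x} + s_{y}), \quad \text{that is,} \quad \frac{|\bar{x} - \bar{y}|}{s_{p}\sqrt{2/3}} < \frac{2(s_{x} + s_{y})}{s_{p}\sqrt{2/3}}. \]
Because E(s_{x}) = E(s_{y}) = 0.8862σ, and E(s_{p}) ≈ σ if v > 60, this reduces very roughly to
\[ P\left\{ \frac{|\bar{x} - \bar{y}|}{s_{p}\sqrt{2/3}} < \frac{2(0.8862 + 0.8862)}{\sqrt{2/3}} \approx 4.34 \right\}. \]
The approximation is very rough because E(P{t < S}) ≠ P{t < E(S)}, where t stands for the two-sample t statistic and S stands for 2(s_{x} + s_{y})/(s_{p}√(2/3)). But it does show that if δ is very large, this probability is virtually zero: the false match probability is very small because the sample means are very unlikely to end up close together by chance. However, if δ is small, the probability is quite close to 1.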
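The behavior described here can be checked by simulation. The sketch below is illustrative only: it takes the match criterion to be the 2-SD overlap rule |x̄ − ȳ| < 2(s_x + s_y) (an assumption of this sketch), draws n = 3 normal replicates per bullet with σ fixed at 1, and estimates the match probability for a given δ / σ. The function names, the trial count, and the seed are hypothetical choices, not part of the report.

```python
import math
import random


def mean_and_sd(values):
    """Return the sample mean and the (n - 1)-denominator sample SD."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var)


def two_sd_match_rate(ratio, n=3, trials=20000, seed=0):
    """Monte Carlo estimate of P{match declared} when the true means
    differ by ratio * sigma (assumed 2-SD overlap criterion)."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(trials):
        # sigma is set to 1, so the true mean difference equals delta / sigma.
        x_mean, x_sd = mean_and_sd([rng.gauss(0.0, 1.0) for _ in range(n)])
        y_mean, y_sd = mean_and_sd([rng.gauss(ratio, 1.0) for _ in range(n)])
        if abs(x_mean - y_mean) < 2.0 * (x_sd + y_sd):
            matches += 1
    return matches / trials


if __name__ == "__main__":
    for ratio in (0.0, 1.0, 3.0, 10.0):
        print(ratio, two_sd_match_rate(ratio))
```

Because only δ / σ enters the simulation, rescaling δ and σ by the same factor leaves the estimate unchanged, which is the point of this section.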
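The equivalence t test proceeds as follows. Assume
\[ H_{0}: |\mu_{x} - \mu_{y}| \ge \delta \qquad \text{versus} \qquad H_{1}: |\mu_{x} - \mu_{y}| < \delta, \]
where H_{0} is the null hypothesis that the true population means differ by at least δ, and the alternative hypothesis H_{1} is that they are within δ of each other. The two-sample t test would reject H_{0} in favor of H_{1} if the sample means are too close, that is, if |\bar{x} − \bar{y}|/(s_{p}√(2/n)) < K_{α}(n,δ), where K_{α}(n,δ) is chosen so that P{|\bar{x} − \bar{y}|/(s_{p}√(2/n)) < K_{α}(n,δ) | H_{0}} does not exceed a preset per-element risk level of α (in Chapter 3, we used α = 0.30). Rewriting that equation for a true difference of δ, and writing K_{α} for K_{α}(n,δ),
\[ P\left\{ -K_{\alpha} - \frac{\delta}{s_{p}\sqrt{2/n}} < \frac{\bar{x} - \bar{y} - \delta}{s_{p}\sqrt{2/n}} < K_{\alpha} - \frac{\delta}{s_{p}\sqrt{2/n}} \right\} \le \alpha. \]
When v is large, s_{p} ≈ σ, and therefore the quantity δ/(s_{p}√(2/n)) ≈ (δ/σ)√(n/2). That shows that the false match probability depends on δ and σ only through the ratio δ/σ. (The argument is a little more complicated when v is small, because the ratio δ/s_{p} is a random quantity, but the conclusion will be the same.) Also, when v is large, the quantity (\bar{x} − \bar{y} − δ)/(s_{p}√(2/n)) ≈ (\bar{x} − \bar{y} − δ)/(σ√(2/n)), which has a standard normal distribution. So the probability can be written
\[ \Phi\left(K_{\alpha} - (\delta/\sigma)\sqrt{n/2}\right) - \Phi\left(-K_{\alpha} - (\delta/\sigma)\sqrt{n/2}\right) \le \alpha, \]
where Φ(·) denotes the standard cumulative normal distribution function (for example, Φ(1.645) = 0.95). So, for large values of v, this nonlinear equation can be solved for K_{α}, so that the probability of interest does not exceed α. For small values of v, K_{α} is the 100(1 − α)% point of the noncentral t distribution with v degrees of freedom and noncentrality parameter (δ/σ)√(n/2) (Ref. 14).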
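For the large-v case, the nonlinear equation can be solved numerically. The following is a minimal sketch, assuming the large-v form Φ(K − c) − Φ(−K − c) = α with c = (δ/σ)√(n/2); it uses bisection on K and the standard library only. The function names and the tolerance are hypothetical, and the small-v case (noncentral t or F) is deliberately not covered.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def false_match_prob(k, n, ratio):
    """Large-v probability that |t| < k when the true difference is delta,
    where ratio = delta / sigma and n is the number of replicates."""
    shift = ratio * math.sqrt(n / 2.0)
    return phi(k - shift) - phi(-k - shift)


def solve_k(alpha, n, ratio, tol=1e-10):
    """Find K such that false_match_prob(K, n, ratio) = alpha by bisection.
    The probability increases monotonically in K, so bisection is safe."""
    lo, hi = 0.0, 1.0
    while false_match_prob(hi, n, ratio) < alpha:
        hi *= 2.0  # expand the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if false_match_prob(mid, n, ratio) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(solve_k(0.30, 3, 0.25))
```

Under these assumptions the result for a given α, n, and δ / σ should be close to, but not identical with, the v = 200 row of the corresponding block of Table F.1, since the table values use a finite number of degrees of freedom.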
Values of K_{α} are given in Table F.1 below, for various values of α (0.30, 0.25, 0.20, 0.10, 0.05, 0.01, and 0.0004), degrees of freedom (4, 40, 100, and 200), and δ / σ (0.25, 0.33, 0.50, 1, 1.5, 2, and 3). The theory for Hotelling’s T^{2}
TABLE F.1 Values of K_{α}(n,v) Used in Equivalence t Test (Need to Multiply by s_{p}√(2/n))
α = 0.30, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         1.5       2         3
4      0.43397   0.44918   0.49809   0.81095   1.35161   1.94726   3.12279
40     0.40683   0.42113   0.46725   0.77043   1.31802   1.92530   3.13875
100    0.40495   0.41919   0.46511   0.76783   1.31622   1.92511   3.14500
200    0.40435   0.41857   0.46443   0.76697   1.31563   1.92510   3.14734
α = 0.30, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.44761   0.47385   0.56076   1.11014   2.63496   4.12933
40     0.41965   0.44436   0.52681   1.07231   2.63226   4.19067
100    0.41771   0.44232   0.52445   1.06984   2.63546   4.20685
200    0.41710   0.44167   0.52370   1.06906   2.63664   4.21278
α = 0.25, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         1.5       2         3
4      0.35772   0.37030   0.41092   0.68143   1.19242   1.77413   2.91548
40     0.33633   0.34818   0.38655   0.64811   1.16900   1.77305   2.98156
100    0.33484   0.34664   0.38484   0.64578   1.16765   1.77420   2.99223
200    0.33437   0.34615   0.38430   0.64503   1.16722   1.77461   2.99595
α = 0.25, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         1.5       2         3
4      0.36900   0.39075   0.46350   0.95953   1.70024   2.44328   3.88533
40     0.34696   0.36748   0.43648   0.92903   1.69596   2.47772   4.02810
100    0.34542   0.36586   0.43459   0.92698   1.69672   2.48365   4.05178
200    0.34493   0.36534   0.43399   0.92633   1.69700   2.48570   4.06021
α = 0.222, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         1.5       2         3
4      0.31603   0.32716   0.36318   0.60827   1.09914   1.67316   2.79619
40     0.29754   0.30804   0.34207   0.57848   1.07949   1.68119   2.88735
100    0.29625   0.30670   0.34060   0.57638   1.07834   1.68290   2.90000
200    0.29584   0.30627   0.34013   0.57571   1.07798   1.68350   2.90436
α = 0.222, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         1.5       2         3
4      0.32601   0.34528   0.41003   0.87198   1.60019   2.33249   3.74571
40     0.30695   0.32514   0.38655   0.84440   1.60422   2.38467   3.93060
100    0.30562   0.32374   0.38490   0.84252   1.60548   2.39187   3.95822
200    0.30520   0.32329   0.38438   0.84192   1.60592   2.39434   3.96795
α = 0.20, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.28370   0.29370   0.32612   0.55032   1.59066   2.69968
40     0.26736   0.27680   0.30744   0.52321   1.60451   2.80887
100    0.26622   0.27561   0.30613   0.52129   1.60656   2.82294
200    0.26585   0.27523   0.30571   0.52068   1.60725   2.82774
α = 0.20, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.29266   0.30999   0.36844   0.80094   2.24256   3.63322
40     0.27582   0.29219   0.34759   0.77521   2.30710   3.84954
100    0.27464   0.29094   0.34612   0.77341   2.31517   3.88010
200    0.27426   0.29054   0.34566   0.77285   2.31790   3.89081
α = 0.10, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.14025   0.14521   0.16138   0.28009   1.14311   2.19312
40     0.13257   0.13726   0.15256   0.26552   1.16523   2.36203
100    0.13203   0.13670   0.15193   0.26449   1.16738   2.38036
200    0.13186   0.13653   0.15174   0.26416   1.16808   2.38652
α = 0.10, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.14470   0.15332   0.18272   0.44037   1.76516   3.05121
40     0.13678   0.14493   0.17277   0.42178   1.86406   3.39055
100    0.13622   0.14434   0.17207   0.42044   1.87408   3.43264
200    0.13604   0.14416   0.17184   0.42001   1.87741   3.44712
α = 0.05, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.07000   0.07241   0.08048   0.14085   0.80000   1.82564
40     0.06614   0.06847   0.07612   0.13329   0.80877   2.00110
100    0.06580   0.06812   0.07584   0.13280   0.80951   2.01774
200    0.06588   0.06822   0.07573   0.13263   0.80976   2.02351
α = 0.05, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.07215   0.07645   0.09118   0.22900   1.41106   2.64066
40     0.06825   0.07232   0.08626   0.21748   1.50372   3.02532
100    0.06798   0.07203   0.08591   0.21672   1.51184   3.06786
200    0.06789   0.07194   0.08580   0.21647   1.51462   3.08296
α = 0.01, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.01397   0.01447   0.01608   0.02823   0.25124   1.21164
40     0.01322   0.01369   0.01522   0.02671   0.24129   1.33049
100    0.01317   0.01364   0.01516   0.02660   0.24062   1.34080
200    0.01315   0.01352   0.01514   0.02656   0.24040   1.34432
α = 0.01, n = 5
       (δ / σ)
v      0.25      0.33      0.50      1         2         3
4      0.01442   0.01528   0.01823   0.04651   0.79664   1.98837
40     0.01364   0.01446   0.01724   0.04400   0.83240   2.35173
100    0.01359   0.01440   0.01717   0.04383   0.83521   2.38989
200    0.01357   0.01438   0.01715   0.04378   0.83616   2.40330
α = 0.0004, n = 3
       (δ / σ)
v      0.25      0.33      0.50      1         2         3         4.4
4      0.00056   0.00058   0.00064   0.00113   0.01071   0.34213   1.5877
40     0.00053   0.00055   0.00061   0.00107   0.01013   0.34139   1.9668
100    0.00053   0.00055   0.00061   0.00107   0.01009   0.34133   2.0072
200    0.00053   0.00055   0.00060   0.00106   0.01008   0.34131   2.0215
is similar (it uses vectors and matrices instead of scalars), and the resulting critical value comes from a noncentral F distribution (Ref. 15).^{2}
ESTIMATING MEASUREMENT UNCERTAINTY WITH POOLED STANDARD DEVIATIONS
Chapter 3 states that a pooled estimate of the measurement uncertainty σ, s_{p}, is more accurate and precise than an estimate based on only s_{x}, the sample SD based on only three normally distributed measurements. That statement follows from the fact that a squared sample SD has a scaled chi-squared distribution; specifically, (n − 1)s^{2} / σ^{2} has a chi-squared distribution on (n − 1) degrees of freedom, where s is based on n observations. The mean of the square root of a chi-squared random variable based on v = (n − 1) degrees of freedom is √2 Γ((v + 1)/2)/Γ(v/2), where Γ(·) is the gamma function, so E(s) = σ √(2/v) Γ((v + 1)/2)/Γ(v/2). For v = (n − 1) = 2, E(s) = 0.8862σ; for v = 4 (that is, estimating σ by s_{p} = [(s_{x}^{2} + s_{y}^{2})/2]^{1/2}), E(s) = 0.9400σ; for v = 200 (that is, estimating σ by the square root of the mean of the squared SDs from 100 bullets), E(s) ≈ σ. In addition, the probability that s exceeds 1.25σ when v = 2 (that is, using only one bullet) is 0.21 but falls to 0.00028 when v = 200. For those reasons, s_{p} based on many bullets is preferable to estimating σ by using only three measurements on a single bullet.

^{2} These values were determined by using a simple binary search algorithm for the value α and the R function pf(x, 1, dof, 0.5*n*E), where n = 3 or 5 and E = (δ / σ)^{2}. R is a statistical-analysis software program that is downloadable from http://www.r-project.org.
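The bias and tail behavior of s discussed in this section can be evaluated directly. The sketch below is a hedged illustration, not the report's own computation: it evaluates E(s)/σ from the standard gamma-function expression for the mean of the square root of a chi-squared variable, and P{s > cσ} from the closed-form chi-squared tail that holds for even degrees of freedom. Function names are hypothetical.

```python
import math


def expected_sd_ratio(v):
    """E(s) / sigma when v * s^2 / sigma^2 is chi-squared on v degrees of
    freedom: sqrt(2 / v) * Gamma((v + 1) / 2) / Gamma(v / 2)."""
    log_gamma_ratio = math.lgamma((v + 1) / 2.0) - math.lgamma(v / 2.0)
    return math.sqrt(2.0 / v) * math.exp(log_gamma_ratio)


def prob_sd_exceeds(c, v):
    """P{s > c * sigma} for even v, using
    P{chi2_v > x} = exp(-x / 2) * sum_{k < v / 2} (x / 2)^k / k!."""
    if v % 2 != 0:
        raise ValueError("closed form used here requires even v")
    half_x = v * c * c / 2.0  # x / 2, where x = v * c^2
    term = math.exp(-half_x)  # k = 0 term of the sum
    total = 0.0
    for k in range(v // 2):
        total += term
        term *= half_x / (k + 1)
    return total


if __name__ == "__main__":
    for v in (2, 4, 200):
        print(v, expected_sd_ratio(v))
    print(prob_sd_exceeds(1.25, 2))
```

The even-v restriction is a simplification chosen to keep the sketch within the standard library; a general chi-squared tail would need an incomplete gamma function.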
WITHIN-BULLET VARIANCES, COVARIANCES, AND CORRELATIONS FOR FEDERAL BULLET DATA SET
The data on the Federal bullets contained measurements on six of the seven elements (all but Cd) with ICP-OES. They allowed estimation of within-bullet variances, covariances, and correlations among the six elements. According to the formula in Appendix K, now applied to the six elements, the estimated within-bullet variance matrix is given below. The correlation matrix is found in the usual way (for example, Cor(Ag, Sb) = Covariance(Ag, Sb)/[SD(Ag) SD(Sb)]). Covariances and correlations between Cd and all other elements are assumed to be zero. The correlation matrix was used to demonstrate the use of the equivalence Hotelling’s T^{2} test. Because it is based on 200 bullets measured in 1991, it is presented here for illustrative purposes only.
Within-Bullet Variances and Covariances ×10^{5}, log(Federal Data)
          ICP-As  ICP-Sb  ICP-Sn  ICP-Bi  ICP-Cu  ICP-Ag
ICP-As    187     27      31      31      37      77
ICP-Sb    27      37      25      18      25      39
ICP-Sn    31      25      106     16      29      41
ICP-Bi    31      18      16      90      14      44
ICP-Cu    37      25      29      14      40      42
ICP-Ag    77      39      41      44      42      681
Within-Bullet Correlations, Federal Data
          ICP-As  ICP-Sb  ICP-Sn  ICP-Bi  ICP-Cu  ICP-Ag  (Cd)
ICP-As    1.000   0.320   0.222   0.236   0.420   0.215   0.000
ICP-Sb    0.320   1.000   0.390   0.304   0.635   0.242   0.000
ICP-Sn    0.222   0.390   1.000   0.163   0.440   0.154   0.000
ICP-Bi    0.236   0.304   0.163   1.000   0.240   0.179   0.000
ICP-Cu    0.420   0.635   0.440   0.240   1.000   0.251   0.000
ICP-Ag    0.215   0.242   0.154   0.179   0.251   1.000   0.000
(Cd)      0.000   0.000   0.000   0.000   0.000   0.000   1.000
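The conversion from a covariance matrix to a correlation matrix, Cor(A, B) = Cov(A, B)/[SD(A) SD(B)], can be sketched as follows. This is an illustration only: the matrix holds the rounded ×10^{5} entries from the variance-covariance table, entered symmetrically (the As row value 27 is used for both As-Sb cells, which is an assumption of this sketch), so the results agree with the printed correlations only to within rounding. The names COV, ELEMENTS, and cov_to_corr are hypothetical.

```python
import math

ELEMENTS = ["As", "Sb", "Sn", "Bi", "Cu", "Ag"]

# Within-bullet variances and covariances (x 10^5), rounded table values,
# entered as a symmetric matrix (assumption: Cov(As, Sb) = 27 in both cells).
COV = [
    [187, 27, 31, 31, 37, 77],
    [27, 37, 25, 18, 25, 39],
    [31, 25, 106, 16, 29, 41],
    [31, 18, 16, 90, 14, 44],
    [37, 25, 29, 14, 40, 42],
    [77, 39, 41, 44, 42, 681],
]


def cov_to_corr(cov):
    """Divide each covariance by the product of the two standard deviations."""
    sds = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (sds[i] * sds[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]


if __name__ == "__main__":
    corr = cov_to_corr(COV)
    for name, row in zip(ELEMENTS, corr):
        print(name, " ".join(f"{value:.3f}" for value in row))
```

The common ×10^{5} scale factor cancels in the ratio, so the scaled table values can be used directly.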
BETWEEN-ELEMENT CORRELATIONS
In Chapter 3, correlations between mean concentrations of bullets were estimated by using the Pearson correlation coefficient (see Equation 2). One reviewer suggested that Spearman’s rank correlation may be more appropriate, as it provides a nonparametric estimate of the monotonic association between two variables. Spearman’s rank correlation coefficient takes the same form as Equation 2, but with the ranks of the values (numbers 1, 2, 3, …, n = number of data pairs) rather than the values themselves. The table below consists of 49 entries, corresponding to all possible pairs of the seven elements. The value 1.000 on the diagonal confirms a correlation of 1.000 for an element with itself. The values in the cells on either side of the diagonal are the same because the correlation between, say, As and Sb is the same as that between Sb and As. For these off-diagonal cells, the first line reflects the conventional Pearson correlation coefficient based on the 1,373-bullet subset of the 1,837-bullet subset (bullets with all seven measured elements or with six measured and one imputed for Cd). The second line is Spearman’s rank correlation coefficient on rank(data), again for

                       As      Sb      Sn      Bi      Cu      Ag      Cd
As  Pearson            1.000   0.556   0.624   0.148   0.388   0.186   0.242
    Spearman (1,373)           0.697   0.666   0.165   0.386   0.211   0.166
    Spearman (1,837)           0.678   0.667   0.178   0.392   0.216   0.279
    Pairs                      1,750   1,381   1,742   1,743   1,750   856
Sb  Pearson            0.556   1.000   0.455   0.157   0.358   0.180   0.132
    Spearman (1,373)   0.697           0.556   0.058   0.241   0.194   0.081
    Spearman (1,837)   0.678           0.560   0.054   0.233   0.190   0.173
    Pairs              1,750           1,387   1,829   1,826   1,837   857
Sn  Pearson            0.624   0.455   1.000   0.176   0.200   0.258   0.178
    Spearman (1,373)   0.666   0.556           0.153   0.207   0.168   0.218
    Spearman (1,837)   0.667   0.560           0.152   0.208   0.165   0.385
    Pairs              1,381   1,387           1,385   1,380   1,387   857
Bi  Pearson            0.148   0.157   0.176   1.000   0.116   0.560   0.030
    Spearman (1,373)   0.165   0.058   0.153           0.081   0.499   0.103
    Spearman (1,837)   0.178   0.054   0.152           0.099   0.522   0.165
    Pairs              1,742   1,829   1,385           1,818   1,829   857
Cu  Pearson            0.388   0.358   0.200   0.116   1.000   0.258   0.111
    Spearman (1,373)   0.386   0.241   0.207   0.081           0.206   0.151
    Spearman (1,837)   0.392   0.233   0.208   0.099           0.260   0.115
    Pairs              1,743   1,826   1,380   1,818           1,826   855
Ag  Pearson            0.186   0.180   0.258   0.560   0.258   1.000   0.077
    Spearman (1,373)   0.211   0.194   0.168   0.499   0.206           0.063
    Spearman (1,837)   0.216   0.190   0.165   0.522   0.260           0.115
    Pairs              1,750   1,837   1,387   1,829   1,826           857
Cd  Pearson            0.242   0.132   0.178   0.030   0.111   0.077   1.000
    Spearman (1,373)   0.166   0.081   0.218   0.103   0.151   0.063
    Spearman (1,837)   0.279   0.173   0.385   0.165   0.251   0.115
    Pairs              857     857     857     857     855     857
the 1,373-bullet subset. The third line is Spearman’s rank correlation coefficient on the entire 1,837-bullet subset (some bullets had only three, four, five, or six elements measured). The fourth line gives the number of pairs in Spearman’s rank correlation coefficient calculation. All three sets of correlation coefficients are highly consistent with each other. Because the 1,837-bullet subset is not a random sample from any population, we refrain from stating a level of “significance” for these values, noting only that, regardless of the method used to estimate the association between elements, the associations between As and Sb, between As and Sn, between Sb and Sn, and between Ag and Bi are higher than those for the other 17 pairs of elements.
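The relationship between the two coefficients compared in this section can be illustrated with a short sketch. Equation 2 is not reproduced in this appendix, so the Pearson formula below is the standard textbook form and is an assumption about what that equation contains; Spearman’s coefficient is then obtained by applying the same formula to ranks, with tied values given their average rank. The function names and the example data are hypothetical and are not bullet measurements.

```python
import math


def pearson(x, y):
    """Standard product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's formula applied to the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]  # monotonic but not linear in x
    print(pearson(x, y), spearman(x, y))
```

For a monotonic but nonlinear relationship such as this one, the rank-based coefficient reaches 1 while the Pearson coefficient stays below 1, which is the distinction the reviewer's suggestion rests on.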