INCORPORATING INVARIANTS IN MAHALANOBIS DISTANCE-BASED CLASSIFIERS: APPLICATIONS TO FACE RECOGNITION

Given H and Cw, one can calculate the required covariances using Equations (3), (4), and (5). Then, using the determinant |Cθ| to quantify the goal Big (allow θ to be large) and a complementary measure to quantify the goal Small (keep it small), we obtain the constrained optimization problem:

Maximize the determinant |Cθ|
subject to (6),

where γ is a constant. The solution to the problem is given by Eqn. (7), where α, which is a function of γ, is a constant that balances the competing goals. To verify that Eqn. (7) indeed solves the optimization problem, note that in the coordinates that diagonalize the constraint, Eqn. (6) constrains only the diagonal entries of Cθ. Of the symmetric positive definite matrices with specified diagonal entries, the one with the largest determinant is simply the diagonal matrix (Hadamard's inequality), so Cθ and the constraint matrix must be simultaneously diagonalizable. The problem therefore reduces to a maximization over the diagonal entries, and the method of Lagrange multipliers yields Eqn. (7).

Summary: Given a new image Y, we estimate its class with the resulting minimum-Mahalanobis-distance rule. We have derived the parameters of this classifier by synthesizing statistics from training data with analytic knowledge about transformations we wish to ignore.

III. FACE RECOGNITION RESULTS

We tested our techniques by applying them to a face recognition task and found that they reduce the error rate by more than 20% (from an error rate of 26.7% to an error rate of 20.6%). We used an analytic expression for transformations in image space and developed procedures for evaluating the first and second derivatives of the transformations. The transformations have the following five degrees of freedom:

• Horizontal translation
• Vertical translation
• Horizontal scaling
• Vertical scaling
• Rotation

To implement the test, we relied on the FERET data set [5] and a source code package from Beveridge et al. [6], [7] at CSU for evaluating face recognition algorithms.
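The diagonalization argument above can be spot-checked numerically. The sketch below is only illustrative: since the exact form of the constraint in Eqn. (6) is not reproduced here, a hypothetical weighted sum of the diagonal entries (weights w_i, bound γ) stands in for it, and the Lagrange-multiplier stationary point d_i = γ/(n·w_i) plays the role of Eqn. (7).

```python
import numpy as np

# Hedged sketch: in diagonalizing coordinates the text reduces the problem to
# maximizing prod(d_i) over positive diagonal entries d_i subject to a
# constraint of the form sum(w_i * d_i) = gamma. The weights w_i and gamma
# below are stand-ins, not the chapter's actual Eqn. (6).
rng = np.random.default_rng(0)
n = 4
w = rng.uniform(0.5, 2.0, size=n)   # hypothetical constraint weights
gamma = 3.0

# Lagrange multipliers: d/d(d_i)[sum(log d_i) - lam*(w.d - gamma)] = 0
# gives d_i = 1/(lam * w_i); the constraint then fixes lam = n / gamma.
d_opt = gamma / (n * w)
assert np.isclose(w @ d_opt, gamma)  # the stationary point is feasible

def log_det(d):
    # log-determinant of the diagonal matrix diag(d)
    return np.sum(np.log(d))

# Spot-check optimality: random feasible diagonals never beat d_opt.
for _ in range(1000):
    d = rng.uniform(0.01, 1.0, size=n)
    d *= gamma / (w @ d)             # rescale onto the constraint surface
    assert log_det(d) <= log_det(d_opt) + 1e-12
print("closed form maximizes the determinant on the constraint set")
```

The restriction to diagonal matrices is justified by Hadamard's inequality: among symmetric positive definite matrices with fixed diagonal, the diagonal matrix has the largest determinant.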
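The classification rule summarized above can be sketched as follows. The combined covariance that the chapter derives (synthesizing Cw with the transformation term) is not legible in this excerpt, so a single positive definite matrix C stands in for it; the decision rule itself — assign Y to the class whose mean is nearest in Mahalanobis distance — is standard.

```python
import numpy as np

def mahalanobis_classify(Y, class_means, C):
    """Return the index of the class mean minimizing (Y-mu)^T C^{-1} (Y-mu).

    C is a stand-in for the classifier covariance derived in the text.
    """
    C_inv = np.linalg.inv(C)
    dists = [float((Y - mu) @ C_inv @ (Y - mu)) for mu in class_means]
    return int(np.argmin(dists))

# Toy usage with 2-D "images" and two classes.
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
C = np.array([[2.0, 0.5], [0.5, 1.0]])
print(mahalanobis_classify(np.array([2.5, 2.8]), means, C))  # → 1
```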
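The five-degree-of-freedom transformation family listed above can be sketched as a parameterized affine map on pixel coordinates. The composition order below (rotation applied after axis-aligned scaling) is an assumption for illustration; the chapter's Equations (3)–(5) define the authors' exact parameterization and its derivatives.

```python
import numpy as np

def transform_matrix(tx, ty, sx, sy, theta):
    """2x3 affine matrix for the five-parameter family: horizontal/vertical
    translation (tx, ty), horizontal/vertical scaling (sx, sy), rotation theta.

    The ordering R @ S is a hypothetical choice, not necessarily the text's.
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])      # rotation
    S = np.diag([sx, sy])                # axis-aligned scaling
    A = R @ S
    return np.hstack([A, np.array([[tx], [ty]])])

# Identity parameters leave homogeneous coordinates (x, y, 1) unchanged.
M = transform_matrix(0.0, 0.0, 1.0, 1.0, 0.0)
print(M @ np.array([10.0, 20.0, 1.0]))  # [10. 20.]
```

Derivatives of the transformed image with respect to these five parameters (the columns of H) could then be approximated by finite differences, although the text evaluates them analytically.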
Version 4.0 (October 2002) of the CSU package contains source code implementing 13 different face recognition algorithms, scripts for applying those algorithms to images from the FERET data set, and source code for Monte Carlo studies of the distribution of the recognition algorithms' performance. Following Turk and Pentland [8], all of the CSU algorithms use principal component analysis as a first step; those with the best recognition rates also follow Zhao et al. [9] and apply a discriminant analysis. For each algorithm tested, the CSU evaluation procedure reports a distribution of performance levels. The specific task is defined in terms of a single probe image and a gallery of NG images. The images in the gallery are photographs of NG distinct individuals, and the gallery contains a single target image: another photograph of the individual represented in the probe image. Using the distances reported by the algorithm under test, the evaluation procedure sorts the gallery into a list, placing the target image as close to the top as it can. The algorithm scores a success at rank n if the target is among the first n entries of the sorted list. The CSU evaluation procedure randomly selects NG × 10,000 gallery-probe pairs and reports the distribution of successful recognition rates as a function of rank.

Restricting the test data set to those images in the FERET data that satisfy the following criteria:

• Coordinates of the eyes have been measured and are part of the FERET data.
• There are at least four images of each individual.
• The photographs of each individual were taken on at least two separate occasions.

yields a set of 640 images: 160 individuals with 4 images of each individual. Thus we use NG = 160. Of the remaining images for which eye coordinates are given, we used a training set of 591 images, consisting of 3 images per individual for 197 individuals. The testing and training images were uniformly preprocessed by code from the CSU package.
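The rank-n scoring rule described above can be sketched in a few lines. The distances here are synthetic placeholders; in the CSU evaluation they come from the face recognition algorithm under test.

```python
import numpy as np

def rank_n_success(distances, target_index, n):
    """Score one gallery-probe trial.

    distances[i] is the algorithm's distance from the probe to gallery
    image i; the trial succeeds at rank n if the target image lands among
    the first n entries of the gallery sorted best-first.
    """
    order = np.argsort(distances)        # gallery indices, best-first
    return bool(target_index in order[:n])

# Toy gallery of 4 images; the target (index 2) is second-closest... no,
# third-closest: sorted order is [3, 1, 2, 0].
distances = np.array([0.9, 0.2, 0.5, 0.1])
print(rank_n_success(distances, 2, 1), rank_n_success(distances, 2, 3))  # False True
```

Repeating this over many randomly drawn gallery-probe pairs, as the CSU procedure does, gives the distribution of recognition rates as a function of rank.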
In [6] the authors describe the preprocessing as follows: "All our FERET imagery has been preprocessed using code originally developed at NIST and used in the FERET evaluations. We have taken this code and converted it… Spatial normalization rotates, translates and scales the images so that the eyes are placed at fixed points in the imagery based on a ground truth file of eye coordinates supplied with the FERET data. The images are cropped to a standard size, 150 by 130 pixels. The NIST code also masks out pixels not lying within an oval shaped face region and scales the pixel data range of each image within the face region. In the source imagery, grey level values are integers in the range 0 to 255. These pixel values