This project team will develop a generic authentication system, preferably in Java, using the dichotomy model. The dichotomy model was used in Dr. Cha's dissertation (see the key paper for this project) and in an on-line fingerprint verification study. See also the subset of dichotomy slides from a conference paper.
Your generic dichotomy-model authentication system will accept feature-vector data in the specified format
(see data format link on main Projects page).
For example, take the stylometry example data given to the class and repeated below.
This example feature data file contains 6 feature vectors,
2 from each of 3 subjects (so we have 3 pattern classes), with each vector having only 2 measurements
so we are operating in 2D feature space.
Stylometry biometric data example created September 2007
MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.13668, 0.53375
MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.14378, 0.56275
JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.53628, 0.43865
JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.43628, 0.53865
ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.39734, 0.92862
ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.49924, 0.98861
The above three-class example file can easily be converted into a two-class dichotomy-model authentication data file. The two authentication classes are the within-class (same person) and the between-class (different people) categories. We perform the conversion by taking all possible difference vectors.
In this case, there are only 3 within-class vector pairs, one for each of the three people, and 12 between-class vector pairs (6*4/2: each of the 6 instances can be compared with the 4 instances from other people, and dividing by 2 eliminates duplicates). In general, if n people provide m biometric samples each, there are m*(m-1)*n/2 within-class pairs and m*m*n*(n-1)/2 between-class pairs (see the key paper reference above). The number of between-class pairs usually far exceeds the number of within-class pairs. When the numbers of within-class and between-class pairs are both large (possibly in the millions), the training and test samples can be generated at random rather than enumerated exhaustively as shown here.
For each pair, a difference vector is computed by taking the absolute difference between corresponding vector components. Because our biometric features are in the range 0-1, the difference vector features will also be in the range 0-1.
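As a quick check of the pair-count formulas above, the following Java sketch evaluates them for the example file (n = 3 people, m = 2 samples each) and for a larger, purely hypothetical data set of 30 people with 10 samples each. The class and method names are illustrative assumptions, not part of the project specification.

```java
// Minimal sketch: evaluates the within-class and between-class pair counts.
// Class and method names are hypothetical, chosen only for illustration.
class PairCounts {

    /** Within-class pairs: m*(m-1)*n/2 for n people with m samples each. */
    static long withinPairs(long n, long m) {
        return m * (m - 1) * n / 2;
    }

    /** Between-class pairs: m*m*n*(n-1)/2 for n people with m samples each. */
    static long betweenPairs(long n, long m) {
        return m * m * n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // Example file: 3 people, 2 samples each.
        System.out.println(withinPairs(3, 2));    // prints 3
        System.out.println(betweenPairs(3, 2));   // prints 12

        // Hypothetical larger data set: 30 people, 10 samples each.
        System.out.println(withinPairs(30, 10));  // prints 1350
        System.out.println(betweenPairs(30, 10)); // prints 43500
    }
}
```

The two counts always sum to the total number of unordered pairs among the n*m samples (15 for the example file), which is a convenient sanity check on a conversion program.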
For the illustrative file above, the feature vector records for the first within-class (same person) pair
and for the first between-class (different people) pair would be:
same, ?, ?, ?, 2, |0.13668-0.14378|, |0.53375-0.56275|
different, ?, ?, ?, 2, |0.13668-0.53628|, |0.53375-0.43865|
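One way the conversion could look in Java is sketched below. It is only an illustration under stated assumptions: the first comma-separated field is taken to identify the subject, the fifth field is taken to be the number of features, the title line of the file has already been skipped, and both vectors in a pair have the same length. The class name, method names, and five-decimal output format are hypothetical choices; consult the data format link for the authoritative layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Minimal sketch of the dichotomy-model conversion (names are hypothetical).
class DichotomyConverter {

    // The six example records from the stylometry file (title line omitted).
    static final List<String> SAMPLE = List.of(
        "MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.13668, 0.53375",
        "MaryJones/F/26, bachelors degree, Dell laptop, structured email task, 2, 0.14378, 0.56275",
        "JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.53628, 0.43865",
        "JohnSmith/M/27, masters degree, Compaq handheld, free email task, 2, 0.43628, 0.53865",
        "ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.39734, 0.92862",
        "ChrisHill/F/02-04-1983, PhD degree, Dell desktop, free email task, 2, 0.49924, 0.98861");

    /** Assumption: the first field identifies the subject. */
    static String subject(String record) {
        return record.split(",")[0].trim();
    }

    /** Assumption: field 4 holds the feature count d, followed by d feature values. */
    static double[] features(String record) {
        String[] fields = record.split(",");
        int d = Integer.parseInt(fields[4].trim());
        double[] v = new double[d];
        for (int i = 0; i < d; i++) {
            v[i] = Double.parseDouble(fields[5 + i].trim());
        }
        return v;
    }

    /** Builds one "same"/"different" difference record for every unordered pair. */
    static List<String> convert(List<String> records) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            for (int j = i + 1; j < records.size(); j++) {
                String a = records.get(i);
                String b = records.get(j);
                String label = subject(a).equals(subject(b)) ? "same" : "different";
                double[] x = features(a);
                double[] y = features(b);
                StringBuilder sb = new StringBuilder(label).append(", ?, ?, ?, ").append(x.length);
                for (int k = 0; k < x.length; k++) {
                    // Component-wise absolute difference.
                    sb.append(String.format(Locale.ROOT, ", %.5f", Math.abs(x[k] - y[k])));
                }
                out.add(sb.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pairs = convert(SAMPLE);
        System.out.println(pairs.get(0)); // prints: same, ?, ?, ?, 2, 0.00710, 0.02900
        System.out.println(pairs.get(1)); // prints: different, ?, ?, ?, 2, 0.39960, 0.09510
        System.out.println(pairs.size()); // prints: 15
    }
}
```

The nested loop enumerates every pair explicitly, which is fine for small files. For data sets with millions of pairs, the same difference computation could be applied to randomly sampled pairs instead.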
This conversion procedure can easily be implemented, and we recommend using Java for coding. You are to use this procedure to convert the feature data that you receive from the mouse movement, stylometry, and keystroke teams into dichotomy-model authentication data for further processing by your team and by the Data Mining team.
After implementing the dichotomy-model conversion, authentication system performance results will be obtained on data from the various biometric front-end teams. A textbook (Guide to Biometrics, by Bolle et al., Springer 2004, ISBN 0387400893), which was provided to the team and must be returned at the end of the semester, describes the performance statistics, namely False Accept Rate (FAR) and False Reject Rate (FRR), that should be obtained on the mouse movement, stylometry, and keystroke data.
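The sketch below shows how FAR and FRR might be computed from classifier decisions on dichotomy data, using the commonly used definitions: FAR is the fraction of between-class (different people) pairs wrongly accepted as "same", and FRR is the fraction of within-class (same person) pairs wrongly rejected as "different". The class name, method signatures, and the small decision arrays are illustrative assumptions, not results from any team's data; the textbook should be consulted for the exact definitions to report.

```java
// Minimal sketch of FAR/FRR computation (names and sample data are hypothetical).
class ErrorRates {

    /** FAR: share of truly different-person pairs that were accepted as "same". */
    static double far(boolean[] trulySame, boolean[] acceptedAsSame) {
        int impostorPairs = 0;
        int falseAccepts = 0;
        for (int i = 0; i < trulySame.length; i++) {
            if (!trulySame[i]) {
                impostorPairs++;
                if (acceptedAsSame[i]) {
                    falseAccepts++;
                }
            }
        }
        return impostorPairs == 0 ? 0.0 : (double) falseAccepts / impostorPairs;
    }

    /** FRR: share of truly same-person pairs that were rejected as "different". */
    static double frr(boolean[] trulySame, boolean[] acceptedAsSame) {
        int genuinePairs = 0;
        int falseRejects = 0;
        for (int i = 0; i < trulySame.length; i++) {
            if (trulySame[i]) {
                genuinePairs++;
                if (!acceptedAsSame[i]) {
                    falseRejects++;
                }
            }
        }
        return genuinePairs == 0 ? 0.0 : (double) falseRejects / genuinePairs;
    }

    public static void main(String[] args) {
        // Made-up decisions: 4 within-class pairs followed by 5 between-class pairs.
        boolean[] truth    = {true, true, true, true,  false, false, false, false, false};
        boolean[] decision = {true, true, true, false, false, false, true,  false, false};
        System.out.println("FAR = " + far(truth, decision)); // 1 of 5 impostor pairs accepted
        System.out.println("FRR = " + frr(truth, decision)); // 1 of 4 genuine pairs rejected
    }
}
```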
|Mouse||10 Fixed Buttons||50-50||12.0%||16.0%||86.0%|