D. Accuracy

“With an identification rate above 95% as measured by U.S. government-sponsored Face Recognition Vendor Tests, our technology is the industry’s finest.”

- FaceFirst website

“FaceFirst makes no representations or warranties as to the accuracy and reliability of the product in the performance of its facial recognition capabilities.”

- FaceFirst contract with the San Diego Association of Governments

In police face recognition, there are high stakes to accuracy. An accurate algorithm correctly identifies a face in an ATM photo and leads police to a robber’s door. An inaccurate algorithm sends them to the wrong house—and could send an innocent person to jail.175

Face recognition companies understand this, and promise police departments seemingly sky-high accuracy standards. The website of FaceFirst, which uses Cognitec’s algorithm in face recognition software that it sells to police, states that “[w]ith an identification rate above 95% as measured by U.S. government-sponsored Face Recognition Vendor Tests, our technology is the industry’s finest.”176 This is misleading: the 95% figure is a decade old, and it compresses the many nuances of accuracy into a single number from a single test.177 Since 2006, Cognitec’s algorithm has doubtless changed dramatically—and the tests have grown markedly harder as well.178

In fact, FaceFirst has made sure that it will not be held to this high standard. A 2015 contract with one of the largest police face recognition systems in the country, the San Diego Association of Governments, includes the following disclaimer: “FaceFirst makes no representations or warranties as to the accuracy and reliability of the product in the performance of its facial recognition capabilities.”179

Compared to fingerprinting, state-of-the-art face recognition is far less reliable and well-tested. Yet other than instructing the recipients of potential face recognition matches that search results are only investigative leads—not conclusive evidence—jurisdictions and other stakeholders take too few steps to protect against false positives and other errors.

  • 175. See Simson Garfinkel, Future Tech, Discover, 23.9 (2002): 17–20 (reporting false positive error generated by face recognition technology in use at the Fresno Yosemite International Airport), http://simson.net/clips/2002/2002.Discover.09.FaceID.pdf; cf. Eric Lichtblau, U.S. Will Pay $2 Million to Lawyer Wrongly Jailed, N.Y. Times (Nov. 30, 2006) (describing the case of Brandon Mayfield, who was wrongly linked to the 2004 Madrid train bombings as the result of a faulty fingerprint identification); Office of the Inspector General, U.S. Department of Justice, A Review of the FBI’s Handling of the Brandon Mayfield Case (Jan. 2006) at 1, https://oig.justice.gov/special/s0601/exec.pdf (describing process by which automated fingerprint matching system and FBI human examiner incorrectly matched Mayfield’s prints to Madrid bomber’s).
  • 176. FaceFirst, Frequently Asked Questions, http://www.facefirst.com/faq (last visited Sept. 1, 2016).
  • 177. See FaceFirst, Frequently Asked Questions, http://www.facefirst.com/faq (last visited Sept. 1, 2016) (archived copy available at https://web.archive.org/web/20160119232512/http://www.facefirst.com/faq and on file with authors) (acknowledging 95% figure is drawn from a 2006 accuracy test).
  • 178. See Patrick Grother & Mei Ngan, Face Recognition Vendor Test: Performance of Face Identification Algorithms, NIST Interagency Report 8009 (May 26, 2014), http://biometrics.nist.gov/cs_links/face/frvt/frvt2013/NIST_8009.pdf.
  • 179. SANDAG, ARJIS Contract with Facefirst, LLC, Document p. 008358.

1. Accuracy remains a work in progress. Real-time systems and systems with large databases are especially error-prone.

Face recognition is widely considered to be less accurate than fingerprint identification.180 Age changes faces, as do cosmetics, inebriation, and obstructions like glasses or hair.181 Fingerprints, in contrast, are relatively consistent over time, although they can be altered by accidents or prolonged manual labor.182 Fingerprinting has over a century-long track record in law enforcement; the first, primitive face recognition algorithms were developed only in the early 1990s.183

When face recognition is run on photos captured at a distance, it must contend with a far wider range of conditions. In the “wild,” photos rarely contain the frontal views that face recognition algorithms prefer. Poor and uneven lighting can confuse algorithms that rely on facial features or skin textures. Algorithms have an especially tough time comparing photos taken in different circumstances, like mug shots and surveillance camera stills.184

Real-time, continuous video surveillance systems tend to combine the worst of these traits, rendering them less accurate than many other deployments. Unlike mug shot-based systems, which use photos captured in controlled settings according to strict standards,185 real-time systems must contend with people going about their daily lives. Subjects rarely face the camera straight on, and video stills are often poorly or unevenly lit. The security cameras themselves vary in quality and, because they are often mounted on ceilings, frequently capture only the tops of people’s heads.

In a real-time experiment conducted in a train station in Mainz, Germany from 2006 to 2007, lighting was a major problem. Accuracy reached 60% during the day but fell to 10–20% at night.186 A report on the experiment notes the challenge of uncontrolled image capture: “cooperative behavior must be attained from the wanted person.”187 Overall recognition rates averaged between 17% and 29%.188

Accuracy also drops as databases become larger.189 Larger databases are more likely to contain lookalikes that mislead face recognition algorithms into picking the wrong matches. As a database size rises to a national scale, an algorithm will inevitably encounter highly similar faces. Larger databases may also be more likely to contain older images, which can drive down accuracy.190 (See Sidebar 6 for an explanation of accuracy measurements.)
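To see why scale matters, consider a back-of-the-envelope model (ours, not any vendor’s): if each non-matching face in a database independently exceeds the match threshold with some small, fixed probability, the chance that a search surfaces at least one lookalike climbs rapidly with database size. A minimal sketch in Python, using a purely hypothetical per-comparison error rate:

```python
# Back-of-the-envelope sketch: assume each non-matching face independently
# exceeds the match threshold with a small, fixed probability p (hypothetical).
# The chance a search returns at least one lookalike is then 1 - (1 - p)^N.

p = 1e-6  # hypothetical per-comparison false match probability

for n in (10_000, 1_000_000, 10_000_000, 100_000_000):
    at_least_one = 1 - (1 - p) ** n
    print(f"database of {n:>11,}: P(at least one false match) = {at_least_one:.4f}")
```

Under these illustrative numbers, a 10,000-face database produces a lookalike in roughly 1% of searches, while a national-scale database of 100 million faces produces one almost every time—even though the algorithm itself has not changed.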

  • 180. See, e.g., Patrick J. Grother et al., Multiple-Biometric Evaluation, Report on the Evaluation of 2D Still-Image Face Recognition Algorithms, NIST Interagency Report 7709 at 2, National Institute of Standards and Technology (Aug. 24, 2011), http://ws680.nist.gov/publication/get_pdf.cfm?pub_id=905968 (“Face images have been collected in law enforcement for more than a century, but their value for automated identification remains secondary to fingerprints.”).
  • 181. See Anil Jain & Brendan Klare, Face Matching and Retrieval in Forensics Applications, 19 IEEE MultiMedia 1, 20 (“The face recognition community has recognized four key factors that significantly compromise recognition accuracy: pose, illumination, expression, and aging.”).
  • 182. See Mark Hawthorne, Fingerprints: Analysis and Understanding 21 (2008) (“Friction skin is permanent. That is, the skin does not change under normal conditions from the time of formation until decomposition after death. . . Friction skin will deteriorate with age as well as all skin, but classification and identification normally will not be affected.”).
  • 183. See Turk & Pentland, Eigenfaces for Recognition, 3 J. Cognitive Neurosci. 1, 71 (1991).
  • 184. See Patrick Grother & Mei Ngan, Face Recognition Vendor Test: Performance of Face Identification Algorithms, NIST Interagency Report 8009, 25 (May 26, 2014), http://biometrics.nist.gov/cs_links/face/frvt/frvt2013/NIST_8009.pdf.
  • 185. See National Institute of Standards and Technology, U.S. Department of Commerce, American National Standard for Information Systems: Data Format for the Interchange of Fingerprint, Facial & Other Biometric Information, ANSI/NIST-ITL 1-2011 (Dec. 2013), http://biometrics.nist.gov/cs_links/standard/ansi_2012/Update-Final_Approved_Version.pdf.
  • 186. See Federal Criminal Police Office of Germany, Federal Ministry of the Interior, Face recognition as a search tool—foto-fahndung, https://www.bka.de/SharedDocs/Downloads/EN/Research/PhotographBasedSearches/fotofahndungAbschlussberichtEnglisch.pdf;jsessionid=8A41E1E76C3A9180114D9669DE618B34.live0612?__blob=publicationFile&v=1 (English version).
  • 187. See Federal Criminal Police Office of Germany, Federal Ministry of the Interior, Face recognition as a search tool—foto-fahndung at 6.
  • 188. See Federal Criminal Police Office of Germany, Federal Ministry of the Interior, Face recognition as a search tool—foto-fahndung at 25.
  • 189. Patrick J. Grother et al., Multiple-Biometric Evaluation, Report on the Evaluation of 2D Still-Image Face Recognition Algorithms, NIST Interagency Report 7709, 2, National Institute of Standards and Technology (Aug. 24, 2011), http://ws680.nist.gov/publication/get_pdf.cfm?pub_id=905968; Patrick Grother & Mei Ngan, Face Recognition Vendor Test: Performance of Face Identification Algorithms, NIST Interagency Report 8009, 58 (May 26, 2014), http://biometrics.nist.gov/cs_links/face/frvt/frvt2013/NIST_8009.pdf.
  • 190. See generally Lacey Best-Rowden & Anil Jain, A Longitudinal Study of Automatic Face Recognition, Proc. of the IEEE International Conference on Biometrics (May 19-22, 2015), http://www.cse.msu.edu/rgroups/biometrics/Publications/Face/BestRowdenJain_LongitudinalStudyFaceRecognition_ICB15.pdf.

Sidebar 6: Understanding Face Recognition Accuracy

The accuracy of a face recognition algorithm cannot be reduced to a single number. Algorithms make mistakes in a variety of different ways, some of which are more problematic than others. Algorithms use a photo of a subject (a probe photo) to search for matching faces in a database of identified face images. An algorithm can return one of two responses: an accept—a photo that it thinks is a possible match—or a reject—a concession that no matching photos were found. This yields four possible outcomes, illustrated in the code sketch after this list:

  • If the algorithm finds a match that indeed contains the subject, it has achieved a true accept—it correctly made a match.
  • If the subject isn’t in the database of images and the algorithm correctly returns nothing, it has achieved a true reject—it correctly found that there was no match.
  • If the subject isn’t in the database of images but the algorithm mistakenly suggests a match with the image of someone else, it has produced a false accept—it matched to the wrong person.
  • If the subject is in the image database, but the algorithm fails to find a match or mistakenly suggests a match containing someone else, it has produced a false reject—it should have found the right person but didn’t.191
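These four outcomes can be expressed compactly in code. The sketch below is our own simplification: it assumes the algorithm returns a single best candidate or nothing, whereas real systems typically return a ranked list.

```python
# Minimal sketch of the four outcomes above (a simplification; real systems
# return ranked candidate lists, not a single answer).

def classify_outcome(best_match, subject_is_enrolled):
    """Label one search, given the algorithm's top candidate (None means the
    algorithm returned a reject) and whether the subject is in the database."""
    if best_match is None:                  # the algorithm found no match
        return "true reject" if not subject_is_enrolled else "false reject"
    if not subject_is_enrolled:             # any returned match is wrong
        return "false accept"
    # Subject is enrolled: the match is correct only if it is the subject.
    return "true accept" if best_match == "subject" else "false reject"

print(classify_outcome("subject", subject_is_enrolled=True))     # true accept
print(classify_outcome(None, subject_is_enrolled=False))         # true reject
print(classify_outcome("lookalike", subject_is_enrolled=False))  # false accept
print(classify_outcome("lookalike", subject_is_enrolled=True))   # false reject
```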

In figuring out how to handle false accepts and rejects, a law enforcement agency has to make a difficult choice. If the goal is to identify as many leads as possible, it might prefer to be over-inclusive and err on the side of false accepts, giving more possible leads from which to choose (assuming the correct lead could eventually be identified from this set). This is the approach of the FBI face recognition database (NGI-IPS), which returns between two and 50 candidate mug shots for any given search—most of which necessarily will be false accepts.192 

Yet a false accept could be devastating to someone mistakenly implicated by face recognition. To avoid these errors, an agency might prefer false rejects instead, and stipulate that searches will return a small number of candidate images. Doing so, however, risks failing to find the right person at all.
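The tradeoff can be made concrete with made-up numbers. In the sketch below—our own illustration, with hypothetical similarity scores—widening the candidate list makes it more likely that the right person is returned, but every extra candidate is, by definition, a false accept:

```python
# Hypothetical similarity scores for one probe photo. A wide candidate list
# is more likely to include the right person but pads the results with false
# accepts; a narrow list risks a false reject.

scores = {"right person": 0.71, "lookalike A": 0.83,
          "lookalike B": 0.74, "lookalike C": 0.69}

ranked = sorted(scores, key=scores.get, reverse=True)  # best scores first

for k in (2, 4):
    candidates = ranked[:k]
    found = "right person" in candidates
    false_accepts = sum(1 for c in candidates if c != "right person")
    print(f"top {k}: subject returned={found}, false accepts={false_accepts}")

# top 2: subject returned=False, false accepts=2   (a false reject)
# top 4: subject returned=True, false accepts=3    (a usable lead, plus noise)
```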

  • 191. See Patrick J. Grother et al., Multiple-Biometric Evaluation (MBE) 2010, Report on the Evaluation of 2D Still-Image Face Recognition Algorithms, NIST Interagency Report 7709 at 15, National Institute of Standards and Technology (Aug. 24, 2011), http://ws680.nist.gov/publication/get_pdf.cfm?pub_id=905968.
  • 192. U.S. Gov’t Accountability Office, GAO-16-267, Face Recognition Technology: FBI Should Better Ensure Privacy and Accuracy 14 (May 2016).

2. Law enforcement agencies use too few protections for accuracy.

Most agencies that provided a face recognition use policy included some form of disclaimer stating that potential matches were investigative leads only and could not form the sole basis for arrest. Beyond this, however, agencies appear to take remarkably few steps to protect against errors in their face recognition systems.

a. Agencies do not consistently consider accuracy when purchasing systems.

The contracting process gives agencies a chance to ensure system accuracy by requiring that systems meet certain accuracy thresholds, or that algorithms be submitted to accuracy tests, both before and after purchase.

Few agencies provided a full set of contracting documents in response to our records requests. Of the nine contract-related responses we did receive, four were sole source contracts, meaning that there was no competitive process for selecting or upgrading the face recognition system.193 The other agencies providing responses demonstrated very different approaches to face recognition accuracy.

On one end of the spectrum, the Los Angeles County Sheriff’s Department and Ohio Bureau of Criminal Investigation did not require any demonstration or testing for face recognition accuracy. When one company asked whether there were accuracy expectations for face and iris recognition, L.A. County responded: “There are no expectations, as we are requesting vendors to enlighten us as to [the] accuracy capability for standalone face and iris.”194 These vague standards contrast sharply with both agencies’ strong accuracy requirements for fingerprint algorithms.195

  • 193. The nine agencies that provided contract documents, including RFPs, responses to RFPs, sole source purchasing documents, and contracts, are: Maricopa County Sheriff’s Office; Los Angeles County Sheriff’s Department; SANDAG; San Francisco Police Department; Pinellas County Sheriff’s Office; Michigan State Police; Virginia State Police; South Sound 911; and the West Virginia Intelligence Fusion Center (WVI/FC). Agencies with sole source contracts, either for the initial system purchase or for the latest system upgrade are: Maricopa County Sheriff’s Office; Pinellas County Sheriff’s Office; Virginia State Police; and WVI/FC.
  • 194. See Los Angeles County Sheriff’s Office, Bulletin Number 1: Questions and Responses Release, Multi-Biometric Identification System (MBIS) Request for Information Number 414-SH (Feb. 2, 2010), Document p. 000205.
  • 195. Both agencies required specific accuracy rates for the systems’ fingerprint algorithms, broken down by true match rates, failure to match rates, and by probe image type such as mobile searches and latent-to-criminal comparisons (analogous to the remote biometric identification application of face recognition). See LA County Sheriff’s Office, Request for Proposals for Multimodal Biometric Identification System (MBIS) Solution (July 2013), Document pp. 000935–000937; Ohio Bureau of Criminal Investigation, Response to Ohio Attorney General’s Office Request for Proposals No. RFP-BCI-ITS-AB01 from 3M Cogent, Document p. 016396.

“[W]e are requesting vendors to enlighten us as to [the] accuracy capability for standalone face and iris.”

- L.A. County Contracting Bulletin

On the other end, for the face recognition component to its multi-biometric system, the San Francisco Police Department required that bidding companies:

  • Meet specific target accuracy levels—an error rate of 1% or better;
  • Provide copies of the results from all prior accuracy tests conducted by NIST in which their algorithm was evaluated;
  • Upon acceptance, submit to verification tests to ensure the system “achieves the same or better accuracies than what has been achieved by relevant NIST and/or other independent and authoritative 3rd party testing;” and
  • Submit to regular future accuracy testing “to reconfirm system performance and detect any degradation.”196

South Sound 911 also considered accuracy in its request for face recognition proposals, requiring that: “The search results must meet a match rate of a 96% confidence rating,” and “[t]he system must have high threshold facial recognition search capability for both in-car and booking officer queries.”197

b. Few agencies use trained human reviewers to bolster accuracy.

Since face recognition accuracy remains far from perfect, experts agree that a human must double-check the results of face recognition searches to ensure that they are correct. As the architect of a leading face recognition algorithm put it, “I wouldn’t like my algorithm to take someone to jail as a single source” of identifying evidence.198

Simple human review of results is not enough, however. Without specialized training, human reviewers make so many mistakes that overall face recognition accuracy could actually drop when their input is taken into account. Humans instinctively match faces using a number of psychological heuristics that can become liabilities for police deployments of face recognition. For example, studies show that humans are better at recognizing people they already know199 and people of the same race.200

As evidence of the benefits of training, one study tested the performance of Australian passport personnel, who use Cognitec’s algorithm to check for duplicate passport applications.201 Facial reviewers, who receive limited instruction in face matching, identified the correct match or correctly concluded there was no match only half the time; they did no better than college students. Specially trained facial examiners, however, did about 20% better.

Unfortunately, while other agencies may conduct such training, the documents we received identified only eight systems that employed human gatekeepers to systematically review matches before forwarding them to officers: the FBI face recognition unit (FACE Services), the Albuquerque Police Department, the Honolulu Police Department, the Maricopa County Sheriff’s Office, the Michigan State Police, the Palm Beach County Sheriff’s Office, the Seattle Police Department, and the West Virginia Intelligence Fusion Center.202

Even these systems are still not ideal. For all but two of these systems—the FBI face recognition unit and the Michigan State Police—the level of training required for these human gatekeepers is unclear. Some searches evade review altogether. When a Michigan State Police officer conducts a face recognition search from a mobile phone (such as for a field identification during a traffic stop), the algorithm’s results are forwarded directly to the officer without any human review.203 Similarly, while the FBI subjects its own searches of its database to trained human review, states requesting FBI searches of that same database are returned up to 50 candidate images without any kind of human review.204

c. Human reviewer training regimes are still in their infancy.

Agencies that are eager to implement human training may encounter yet another difficulty: the techniques for manually comparing photos of faces for similarity—techniques that would inform this sort of training—are still in their infancy. The FBI’s Facial Identification Scientific Working Group (FISWG), whose members include academic institutions and law enforcement agencies at all levels of government, has developed training and standardization materials for human facial comparison.

  • 196. See San Francisco Police Department, SFPD Request for Proposal, Automated Biometric Identification System Section 02—Technical Specifications (Mar. 31, 2009), Document pp. 005555–005558.
  • 197. Law Enforcement Support Agency (South Sound 911), Request for Proposal: Mug Shot Booking Capture Solution, Specification No. 3002-12-05 (2012), Document p. 009432.
  • 198. See Interview with Face Recognition Company Engineer (Anonymous) (Mar. 9, 2016) (notes on file with authors).
  • 199. See Ritchie et al., Viewers base estimates of face matching accuracy on their own familiarity: Explaining the photo-ID paradox, 141 Cognition 161–169 (2015).
  • 200. See Christian Meissner & John Brigham, Thirty Years of Investigating the Own-race Bias in Memory for Faces: A meta-analytic review, 7 Psychology, Public Policy, and Law 3–35 (2001).
  • 201. See White et al., Error Rates in Users of Automatic Face Recognition Software, PLoS ONE 10(10) (2015). The study presented the images to subjects for only 18 seconds, however, so it is possible that results might have improved if the subjects had more time. Documents from the Facial Identification Scientific Working Group suggest that “review” should take 45 seconds and “examination” longer than two hours. See Facial Identification Scientific Working Group, Guidelines for Facial Comparison Models, Version 1.0 (Feb. 2, 2012), https://www.fiswg.org/document/viewDocument?id=25.
  • 202. FBI Face Services, U.S. Gov’t Accountability Office, GAO-16-267, Face Recognition Technology: FBI Should Better Ensure Privacy and Accuracy 17 (May 2016) (searches are manually reviewed and only “the top one or two” candidates are returned to the FBI agent); Albuquerque Police Department, Procedural Order – Facial Recognition Technology, Document p. 009203 (“When trained RTCC personnel identify a possible match, they will notify the officer or case agent and supply them with possible names and images of known offenders.”); Honolulu Police Department, Policy: Facial Recognition Program (Sept. 14, 2015), Document p. 014705 (“If the facial recognition system detects a viable candidate, the CAU shall complete a follow-up report for the assigned detective. The CAU analyst’s follow-up report shall contain the steps taken to compare the known and unknown photographs and how the CAU analyst came to his or her conclusion(s).”); Maricopa County Sheriff’s Office, MCSO/ACTIC Facial Recognition Procedures: Image Records Request, Document pp. 014963–014965 (describing a process where searches are reviewed and approved by facial recognition supervisors twice, and results are accompanied by an explanatory narrative); Michigan State Police, Statewide Network of Agency Photos (SNAP) Unit: Overview and Workflow, Document pp. 011467–011468 (latent (investigate and identify) searches go through a team of trained examiners who narrow down candidates to a single match or none at all, which is peer reviewed by a second examiner to confirm the result); Palm Beach County Sheriff’s Office, SOPICS Facial Recognition Program Policy, Document p. 008651 (at least two analysts review candidate lists before the search results are returned to the requestor); Seattle Police Department, Booking Photo Comparison Software Manual (Feb. 19, 2014), Document p. 009907 (“Only Department-Trained Photo Personnel Will Use BPCS”); West Virginia Intelligence Fusion Center, Letter from Thomas Kirk, General Counsel for the Office of the Secretary, West Virginia Department of Military Affairs and Public Safety, to Clare Garvie (Jan. 25, 2016), Document p. 009911 (“When an image has been checked against the facial recognition database and results are shown, a visual check by the analyst is performed to check the probability of the match against the target image.”).
  • 203. Interview with Peter Langenfeld, Program Manager, Digital Analysis and Identification Section (May 25, 2016) (notes on file with authors).
  • 204. See U.S. Gov’t Accountability Office, GAO-16-267, Face Recognition Technology: FBI Should Better Ensure Privacy and Accuracy 14 (May 2016) (“The search of NGI-IPS is a completely automated process…”).
“It’s not science at this point—it’s more of an art.”

- Anonymous face recognition company engineer

FISWG’s preferred approach is “morphological comparison,” which examines the similarity of different facial features depending on their “permanence.” However, the science behind this approach is murky—as FISWG’s materials report, “only limited studies have been done on accuracy or reproducibility.”205 An engineer at one company was more direct: “it’s not science at this point—it’s more of an art.”206

  • 205. See Facial Identification Scientific Working Group, Guidelines for Facial Comparison Models, Version 1.0 at 5 (Feb. 2, 2012), https://www.fiswg.org/document/viewDocuments;jsessionid=6A11990853BB99B8EBA42E6C03883543.
  • 206. Interview with anonymous engineer (June 22, 2016) (notes on file with authors).

3. Testing regimes are voluntary, sporadic, and resource-limited.

There is only one public, independent benchmark for comparing the accuracy of these algorithms—a face recognition competition offered by the National Institute of Standards and Technology (NIST) every three or four years. All leading manufacturers currently submit to these tests, but participation in the competition is entirely voluntary, and manufacturers are under no obligation to submit to NIST tests before selling their algorithms to law enforcement agencies.

In 2010, NIST observed that accuracy had improved by “an order of magnitude in each four-year period” between tests, a dramatic pace of technological innovation.207 However, the last round of testing was in 2013, a lifetime ago at the pace at which face recognition technology moves. Thus, state and local law enforcement agencies seeking to purchase face recognition in 2016 have only years-old test results to rely on when contracting with face recognition vendors.

  • 207. See Patrick J. Grother et al., Multiple-Biometric Evaluation (MBE) 2010, Report on the Evaluation of 2D Still-Image Face Recognition Algorithms, NIST Interagency Report 7709 at 3, National Institute of Standards and Technology (Aug. 24, 2011), http://ws680.nist.gov/publication/get_pdf.cfm?pub_id=905968.

4. Publicly available photo sets do not reflect the size or diversity of the human population.

Outside of NIST’s accuracy tests, several publicly available, academic collections of facial photos provide a limited basis for making independent accuracy comparisons between algorithms. The most prominent of the collections is called “Labeled Faces in the Wild” and dates to 2007.208 These datasets typically feature celebrities. This makes it easier to label thousands of individual faces, but fails to capture the full range of human diversity.

These datasets are also small, on the order of a few thousand photos. By contrast, a single state may have millions of photos in its face recognition databases. (For example, Pennsylvania has over 34 million and Michigan has over 40 million.)209 The difference in size matters: as explained above, as a dataset grows in size, the likelihood of similar faces increases, challenging accuracy.210

A public dataset becomes less useful over time as researchers calibrate their design decisions to the specific photos it contains rather than to face recognition in general. As a consequence, it is common to see algorithms that perform flawlessly on one dataset but struggle in other contexts.211 The Intelligence Advanced Research Projects Activity (IARPA), a U.S. intelligence organization that funds intelligence-related research,212 is sponsoring an initiative called the Janus project that will generate a wave of new, more difficult datasets.213

  • 208. See University of Massachusetts — Amherst, Labeled Faces in the Wild, http://vis-www.cs.umass.edu/lfw/ (last visited Sept. 22, 2016).
  • 209. Pennsylvania JNET, JNET Facial Recognition Presentation Slides (2014), Document p. 010750; Michigan State Police, Interview with Peter Langenfeld, Program Manager, Digital Analysis and Identification Section (May 25, 2016) (notes on file with authors).
  • 210. See above Section 1: Accuracy remains a work in progress. Real-time systems and systems with large databases are especially error-prone.
  • 211. Brendan F. Klare et al., Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A, 28 IEEE Conference on Computer Vision and Pattern Recognition 1 (June 2015) (“ . . . performance has begun to saturate on LFW, YTW, and other unconstrained datasets. At the same time, unconstrained face recognition is hardly considered a solved problem.”).
  • 212. Office of the Director of National Intelligence, https://www.iarpa.gov/ (last visited Sept. 22, 2016).
  • 213. Brendan F. Klare et al., Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A, 28 IEEE Conference on Computer Vision and Pattern Recognition (June 2015); Office of the Director of National Intelligence, Janus, https://www.iarpa.gov/index.php/research-programs/janus/baa?highlight=WyJqYW51cyJd (last visited Sept. 22, 2016).

Sidebar 7: Scoring Accuracy Protections

As this section explains, an agency can take a variety of steps to safeguard against errors in its face recognition system. Our accuracy score considers a range of different measures, with particular weight given to the use of trained human examiners as a backstop to accuracy. (A code sketch of the rubric follows the criteria below.)

  • + Agency demonstrates four or five of the criteria listed below.
  • 0 Agency demonstrates three of the criteria.
  • - Agency demonstrates two or fewer of the criteria.

The criteria are:

  • Algorithms have been tested by the National Institute of Standards and Technology;
  • Contract with vendor company contains provisions that require face recognition algorithms to have been tested for accuracy and to be tested at all future opportunities;
  • Most or all face recognition queries are validated by trained human examiners or agencies have a unit or designated personnel that perform a review and screening function of the candidate lists (weighted as two criteria);
  • Face recognition results or candidate lists are treated as investigative leads only.
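For concreteness, the rubric can be written as a small scoring function. This sketch is ours alone; the criterion names are shorthand, and the double weight for trained human review follows the description above.

```python
# Minimal sketch of the scoring rubric above. Criterion names are our own
# shorthand; trained human review is weighted as two criteria.

CRITERIA_WEIGHTS = {
    "nist_tested": 1,               # algorithm tested by NIST
    "contract_requires_testing": 1, # contract mandates past and future tests
    "trained_human_review": 2,      # trained examiners screen candidate lists
    "leads_only": 1,                # results treated as investigative leads only
}

def accuracy_score(demonstrated):
    """Map the set of criteria an agency demonstrates to +, 0, or -."""
    total = sum(CRITERIA_WEIGHTS[c] for c in demonstrated)
    if total >= 4:
        return "+"
    return "0" if total == 3 else "-"

# Trained review (2) + NIST testing (1) + leads-only policy (1) = 4 criteria.
print(accuracy_score({"trained_human_review", "nist_tested", "leads_only"}))  # +
```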