Facial recognition technologies use algorithms, often trained on copyrighted source material, to create a “faceprint” that identifies or verifies an individual’s identity. The use of facial recognition has become increasingly prevalent, such as on Facebook to “tag” friends, at airports for easy check-in, and on cell phones for authentication purposes. Until recently, facial recognition was also commonly used by law enforcement for general surveillance and to identify wanted or suspected persons.
On June 8, 2020, IBM CEO Arvind Krishna announced in a letter to Congress that IBM will no longer research, develop, market, or sell any of its facial recognition technologies to any new or existing clients. Krishna stated that IBM “firmly opposes and will not condone uses of any technology, including facial recognition, offered by other vendors for mass surveillance, racial profiling, and violations of basic human rights and freedoms.” The same week, both Amazon and Microsoft followed suit, announcing moratoriums on their facial recognition systems, but only as applied to police departments. These decisions came just two weeks after the death of George Floyd and amid worldwide protests for basic equality and against police brutality.
Since the technology’s inception, research reports have revealed that facial recognition performs significantly worse for some genders and races than for others. For example, a widely cited study of IBM’s facial recognition system revealed that results for darker-skinned women were 34.4% less accurate than those for lighter-skinned men. Such findings highlight a fundamental human rights concern: facial recognition systems prone to bias and misidentification can be (and oftentimes are) used by police in ways that inadvertently violate citizens’ constitutional rights to privacy. But from where is this inherently biased data derived?
Data used to train facial recognition technologies is often derived from copyrighted datasets, such as the largest publicly available dataset of roughly 99.2 million images, known as the YFCC100M. The YFCC100M dataset was compiled by Yahoo and is subject to Creative Commons licenses. Creative Commons is a public licensing framework that “permits the creator of a work to retain copyright while allowing others to copy, distribute, and make some uses of their work.” Because Section 107 of the Copyright Act permits the “fair use” of copyrighted works for research purposes, IBM and other vendors may lawfully use the Creative Commons–licensed YFCC100M dataset to train their facial recognition systems. But how could this be “fair” under the fair use doctrine when considered unfair from a human rights perspective?
While some may take issue with the ethics of this area of copyright law as applied to facial recognition, it is clear that “copyright law was not designed to protect individual privacy, because, at its core, copyright protects creators’ rights in and to original works and prevents unauthorized uses of such works.” Although copyright law may not be the solution to addressing research ethics or dismantling the systemic oppression of minorities in America, that oppression certainly appears to be perpetuated by the misidentifications inherent in facial recognition systems whose training copyright law makes permissible.
It is far too early to say whether facial recognition will be widely reintroduced for law enforcement purposes, whether lawmakers will step in to regulate it, or whether we can kiss it goodbye forever. But for now, its denunciation by major producers of the technology is certainly a win to be celebrated in this face-off between copyright and human rights.
Jaren is a third-year law student at Wake Forest University School of Law. Jaren earned her Bachelor of Science degree in Criminal Justice from Virginia Commonwealth University in 2018. Jaren is interested in corporate law, intellectual property law, and privacy/cybersecurity issues.