What Is Redundant Encoding And Should I Care?
The short answer: “yep.”
For us old-school lawyers, things like big data, algorithms, and Silicon Valley and Alley can be difficult to understand sometimes. But it’s a new world: we must do our best.
In employment law we have learned to use statistics. Good for us! But there’s a whole lot more happening in the field as a result of artificial intelligence, such as machines replacing humans in HR recruiting and hiring decisions.
At first glance, it would appear that using a machine to recruit and hire would eliminate bias and discrimination in the process. What great news! A computer has no inherent bias, right? And won’t discriminate, right?
Not so fast.
Software Can Discriminate
Can algorithms learn to discriminate in employment decisions? Tough concept to grasp.
As I noted, the Times reporter wrote that “Algorithms have become one of the most powerful arbiters in our lives. They make decisions about the news we read, the jobs we get, the people we meet, the schools we attend and the ads we see. Yet there is growing evidence that algorithms and other types of software can discriminate. The people who write them incorporate their biases, and algorithms often learn from human behavior, so they reflect the biases we hold.”
Makes sense: the bias of the programmer is reflected in the bias of the computer. As we learned about computers eons ago, “garbage in, garbage out.”
But now there’s another concern about discriminatory possibilities called “redundant encoding.”
According to an article in The Stack, “even if a specific data marker is not included in the data set, it may be included by proxy in a combination of other, relevant data.”
’Splain that, please!
The article states what we initially (naively) presumed: “On the surface, discrimination by machines seems like a flawed concept. An algorithm is a mathematical construct, and as such, should not logically be subject to discriminatory outcomes.”
So far so good.
“However, algorithms may potentially rely on flawed input, logic and probabilities, as well as the unintentional biases of their human creators.”
Again, so far so good. We now know that.
Here’s where redundant encoding comes in. The Stack gives an example: “If the information that a loan applicant is female is not included in the data set, the application should be judged without including that information. However, this is a flawed answer, as gender can be inferred from other data factors which are included: for example, if the applicant is a single parent, and 82% of single parents are female, there is a high probability that the applicant is female (emphasis added).”
So gender, or even age or race, can be inferred from the accumulation of other big data? I see problems.
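For readers who want to see the single-parent example in action, here is a minimal sketch in Python. Only the 82% figure comes from The Stack; the pool sizes, the 50/50 split among other applicants, and the decline-single-parents rule are all made up for illustration. The point is that the rule never looks at gender, yet the approval rates for women and men still come out different.

```python
# Hypothetical applicant pool. Only the 82% figure comes from the article;
# every other number here is an assumption for illustration.
SINGLE_PARENTS = 200
OTHER_APPLICANTS = 800
SHARE_FEMALE_SINGLE_PARENTS = 0.82  # from The Stack's example
SHARE_FEMALE_OTHERS = 0.50          # assumed


def build_pool():
    """Return a list of (gender, is_single_parent) tuples."""
    pool = []
    for count, is_single, share_female in (
        (SINGLE_PARENTS, True, SHARE_FEMALE_SINGLE_PARENTS),
        (OTHER_APPLICANTS, False, SHARE_FEMALE_OTHERS),
    ):
        n_female = round(count * share_female)
        pool += [("F", is_single)] * n_female
        pool += [("M", is_single)] * (count - n_female)
    return pool


def gender_blind_rule(is_single_parent):
    """A made-up lending rule that never sees gender: decline single parents."""
    return not is_single_parent


def approval_rate(pool, gender):
    """Share of applicants of the given gender that the rule approves."""
    decisions = [gender_blind_rule(sp) for g, sp in pool if g == gender]
    return sum(decisions) / len(decisions)


pool = build_pool()
rate_f = approval_rate(pool, "F")
rate_m = approval_rate(pool, "M")
print(f"approval rate, women: {rate_f:.2f}")
print(f"approval rate, men:   {rate_m:.2f}")
```

Gender was never an input, but single-parent status carried it in by proxy. That is redundant encoding in miniature.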
The article explains that Google wizards published a paper entitled “Equality of Opportunity in Supervised Learning.” The paper “provides a detailed, step-by-step framework to test existing algorithms for problematic, discriminatory outcomes, as well as how to adjust a machine learning algorithm to prevent those outcomes, with the result of equal opportunity in supervised learning.”
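One idea from that paper can be sketched in a few lines. As I understand it, the paper’s “equal opportunity” test asks whether people who actually deserve a favorable outcome receive it at the same rate in every group (the “true positive rate”). The sketch below is my own illustration, not the paper’s code: the function name, the record layout, and all the numbers are hypothetical.

```python
from collections import defaultdict


def true_positive_rates(records):
    """Compute, per group, the share of qualified people who were selected.

    records: iterable of (group, actually_qualified, selected) tuples.
    This layout is an assumption made for this sketch.
    """
    qualified = defaultdict(int)
    selected = defaultdict(int)
    for group, is_qualified, is_selected in records:
        if is_qualified:
            qualified[group] += 1
            if is_selected:
                selected[group] += 1
    return {group: selected[group] / qualified[group] for group in qualified}


# Made-up hiring outcomes: 50 qualified and 50 unqualified people per group.
records = (
    [("men", True, True)] * 45
    + [("men", True, False)] * 5
    + [("men", False, False)] * 50
    + [("women", True, True)] * 30
    + [("women", True, False)] * 20
    + [("women", False, False)] * 50
)

rates = true_positive_rates(records)
gap = abs(rates["men"] - rates["women"])
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} of qualified candidates selected")
print(f"gap between groups: {gap:.0%}")
```

A gap near zero would suggest qualified candidates are treated alike across groups; a large gap is the kind of outcome the paper’s framework is meant to flag and then adjust for.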
OK, now you got me again. I was almost there. …
You may want to read the Google paper if for no other reason than to confirm that you made the right choice majoring in English Literature.
If not, just understand that the concept that we call “disparate impact” is being actively studied in all of its permutations (or names) by mathematicians and computer scientists to make sure that “employers [or their computers] may not ask a potential employee’s gender, religion or race, but should instead evaluate them on their relevant skills for the post.”
“Fairness through unawareness” is how The Stack puts it.
OK. ’Nuff computer stuff.
PS. As I noted last year, the Times article provided an example of computer bias: “[it] came from Carnegie Mellon University, where researchers found that Google’s advertising system showed an ad for a career coaching service for ‘$200k+’ executive jobs to men much more often than to women.”
So a Google ad reflected this bias?