Fairness and Philosophy in the Age of Artificial Intelligence
The Cairo Review speaks with philosopher Aaron Wolf about the complexities of algorithmic fairness

Artificial intelligence has been implemented in a variety of fields to help humans make faster, smarter decisions, but this technology is not a magic wand. It is built on algorithms designed to make decisions based on data collected from our human world, one that is plagued by systemic inequalities. If these algorithms are trained on data that reflect an unfair reality, then how can we expect them to help us build a better world? And, perhaps more importantly, do we really know what a ‘better’ world looks like?
Philosophers have wrestled with what it means to be ‘fair’ or ‘just’ for centuries. In recent years, these debates have escaped the dusty tomes of the library and found their way into the realm of computer science. While software engineers tinker with ensuring the algorithm works correctly, philosophers ask what ‘correctly’ really means. As AI is now being used to decide which defendants are denied parole and which students are allowed into university, examining exactly what we expect from a ‘fair’ and ‘just’ world is as important as exploring how we can use AI to achieve it.
To examine these topics, the Cairo Review’s Senior Editor Abigail Flynn spoke with Aaron Wolf, senior lecturer in university studies and research affiliate in philosophy at Colgate University, who shared his insight on the nuances of algorithmic fairness.
Cairo Review: As someone who researches moral theory, what made you interested in studying AI?
Aaron Wolf: In the last handful of years, I’ve come to feel like the more pressing issue, a thing that’s way more interesting to me, is something called the ‘value alignment problem’.
It’s the question of how we get autonomous systems to behave on their own in ways that we would want. This is a somewhat more difficult problem than you might think, because it’s difficult to specify exactly what it is that we want. There are lots of interesting cases where an automated system thinks you want one thing and gives you that thing, only for it to turn out to be very different from what you actually wanted. Before we allow machines to, you know, take over the world, it’s fairly important that we get them to behave in predictable and acceptable ways.
So, you’re worried about what might happen in the future as AI nears superintelligence?
Actually, my little slice of the value alignment problem is a more near-term thing. Lots of people working on value alignment have future AI applications in mind. But there’s also something that is happening now, and has been happening for some time, which is that we have these automated systems that are ubiquitous and make a lot of morally significant decisions about people here and now.
Artificial intelligence is already being used to make decisions in real life?
Sure. From job applications to insurance and loan applications, healthcare, finance, education. Even criminal justice, like in the case of COMPAS.
What’s COMPAS?
‘COMPAS’ is an algorithm; it stands for Correctional Offender Management Profiling for Alternative Sanctions. It takes about 115 data points about a criminal defendant and predicts the likelihood that the defendant will be rearrested in the next handful of years if they are released on bond or parole now. The judge uses the system to make decisions about who gets parole and who doesn’t.
Now, this system isn’t technically AI in the sense that we talk about today, because it’s a hand-coded algorithm. But it’s still an important example of potential bias. ProPublica wrote a research white paper on it and alleged that the way the algorithm makes its decisions is racially biased against Black defendants.
What did the designers of the algorithm say?
They wrote up a response saying something like, ‘No, it’s not like that, we went out of our way to make this algorithm fair. Here’s the metric which we used, and this metric is the industry standard’.
At that point, the computer scientists got involved and pointed out that there were two different metrics of fairness at play: first, the ‘industry standard’ one being used by the designers of COMPAS, and second, our intuitive sense of fairness and unfairness that the ProPublica researchers were using. The computer scientists said that these two metrics were at odds with each other and can’t both be satisfied at the same time.
Wait—what does ‘metric’ mean here? Does each ‘metric’ mean a specific definition for fairness?
Well, COMPAS uses a metric that is mathematically defined; we can express it in probability. ProPublica used a more intuitive conception of fairness, but it can also be expressed mathematically. That’s how the computer scientists showed that both metrics can’t be satisfied at the same time.
The computer scientists concluded that fairness is impossible, or at least, total fairness is impossible. They say that there are different flavors of fairness and you pick the one that’s best suited for your case and run with it. Most of the industry agrees; most of the AI ethics desks at major U.S. consulting firms have statements like that on their landing pages.
For them, that’s the end of the story.
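To make the tension concrete, here is a minimal Python sketch with invented numbers, not drawn from COMPAS or ProPublica’s data. It computes the ‘industry standard’ metric (predictive parity: people flagged as high risk are rearrested at the same rate in both groups) and the intuitive metric ProPublica focused on (equal false-positive rates) for two groups with different base rates of re-arrest.

```python
# Invented confusion counts for two groups with different base rates of re-arrest.
# Nothing here is real COMPAS or ProPublica data; the point is only to show how
# one fairness metric can be satisfied while another is violated.

def rates(tp, fp, fn, tn):
    """Return (positive predictive value, false-positive rate) for one group."""
    ppv = tp / (tp + fp)   # of those flagged 'high risk', the share who were rearrested
    fpr = fp / (fp + tn)   # of those never rearrested, the share who were flagged anyway
    return ppv, fpr

# (true positives, false positives, false negatives, true negatives)
group_a = (40, 10, 20, 130)   # lower base rate of re-arrest (30%)
group_b = (80, 20, 40, 60)    # higher base rate of re-arrest (60%)

for name, counts in [("Group A", group_a), ("Group B", group_b)]:
    ppv, fpr = rates(*counts)
    print(f"{name}: PPV = {ppv:.2f}, false-positive rate = {fpr:.2f}")
```

Both groups get the same positive predictive value of 0.80, so the ‘industry standard’ metric is satisfied, yet the false-positive rate for the higher-base-rate group is more than three times higher (0.25 versus 0.07), which is the kind of disparity ProPublica pointed to. The computer scientists’ result says that, for an imperfect predictor, this trade-off is unavoidable whenever the groups’ base rates differ.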
But it’s not the end of the story, is it?
In my opinion, no. It strikes me as a bit too quick and also a bit dangerous. Imagine a software engineer trying to defend in court the decisions that an AI product has made, with the opposing lawyer grilling them, asking ‘What makes this decision fair?’.
And the software engineer says, ‘Well, I don’t know. We just picked a metric.’
That seems like a really unsatisfying defense of the choices you’ve made. And that’s where people like me get involved. This kind of nihilistic approach, ‘just pick and choose’, rubs philosophers the wrong way.
So, as a philosopher, how do you approach the question?
The field of research that I’m in these days is trying to look at the mathematical possibilities and ask, ‘Which one of these is best at capturing the ordinary, humanistic, intuitive ideas about fairness? Which matches the concept of fairness that ordinary people on the street have, but also the concepts that philosophers have been generating over many centuries?’.
As philosophers, we’re really interested in what ‘just’ or ‘fair’ means, and whether the two are actually compatible.
How could something be fair but not just?
Let’s say that we’re defining fair as ‘everyone gets treated equally’. We could make an argument that as long as we treat people equally, the outcome is automatically just. But that doesn’t always reflect reality. Sometimes treating people exactly the same can produce unjust outcomes, because the mechanisms that got people to where they are today are often deeply unjust.
If you want to de-bias your algorithm in a way that’s going to undo past injustice, you have to, in some way or another, put your thumb on the scale, so to speak. This goes against the idea of treating people exactly equally.
So you basically have two approaches to fairness here, one that assumes a very narrow definition of treating people exactly the same, and another that has a more proactive lens.
Is there a situation, somewhere in the future, where an algorithm can be de-biased and trained in a way that makes it perfectly just?
People’s sense of what is and is not morally acceptable from the perspective of fairness is always changing over time. So we’re going to have to keep coming up with new ways of re-tinkering, reorganizing, or reweighting the algorithm, putting our thumb back on the scale again, in order to build a data set that tells the algorithm how to make decisions in a way that gets us the outcomes we want.
But we haven’t reached the point of a perfect algorithm yet. So how do we use these programs responsibly in the here and now?
I wrote a paper about this that just came out in May. It talks specifically about how algorithmic fairness plays out from a philosophical perspective in higher education.
Do you mean in the classroom, like AI education?
Not exactly. In this case, AI is being used by academic advisers to decrease the rate of students dropping out of university by steering them toward certain majors. Institutions do this to save money: if you can keep track of which types of students are most likely to drop out of a certain major, then you can advise similar students to pursue different options. It helps advisers who have several hundred students under their guidance to make decisions more efficiently. Some of these universities include the student’s race as a variable.
So if students from a certain ethnic background usually drop out of a specific major, the program will tell the adviser to guide other students from the same background away from that program. That sounds like it could get tricky fast.
Yeah, unfortunately. I’m interpolating a bit here from the original piece, but if the primary goal is to keep the student enrolled, then the easiest thing to do is steer them into the ‘safer major’. Usually this is something like area studies, such as African American or Latin American studies, or sociology, and so on.
There are a few different ways to spin this, but these are fields that have a reputation for being more welcoming and, for lack of a better term, less competitive for grades. Compare this to economics, international relations, or pre-med, where the programs are actively trying to weed out students. This is already an issue for advisers, even without AI. When you combine this existing problem with an AI advising tool that makes recommendations primarily to increase retention rates, you can end up exacerbating inequality.
Black and brown college students might be pushed toward less competitive and less lucrative fields, because the system is designed to give the most weight to protecting retention, at the expense of allowing the student the freedom to choose the course of study that they think is best for them. And most of the time, the students aren’t aware that these programs are being used, and, by extension, why the adviser might be pushing them toward a specific field. The goals of the institution don’t necessarily overlap with the student’s own best interest.
It sounds like these programs are very biased. Should we throw them out altogether, or is there any way to use them responsibly?
I think we can use them responsibly. I’m not involved in this specific type of research, but there’s some very clever and interesting work being done about how we can take historical data and clean it up to train the algorithm to produce a more just world. There’s all kinds of ways for bias to creep into the data set, and lots of people are out there working on ways to de-bias data.
This is one of the fundamental tensions when it comes to automated or algorithmic decision making. We rely on historical data from a world that is very non-ideal in terms of fairness.
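Wolf doesn’t name a specific technique here, but one standard example from the fairness literature of what ‘cleaning up’ historical data can mean is ‘reweighing’: a pre-processing step that weights each historical record so that group membership and the recorded outcome become statistically independent before a model is trained. The sketch below is a hypothetical illustration with invented records, not something drawn from the interview.

```python
# A hypothetical illustration of 'reweighing': assign each historical record a weight
# so that group membership and the recorded outcome become statistically independent
# in the weighted data. The records below are invented.

from collections import Counter

# Each record: (group, favorable_outcome), e.g. (demographic group, retained in major or not)
data = [("A", 1), ("A", 1), ("A", 1), ("A", 0),
        ("B", 1), ("B", 0), ("B", 0), ("B", 0)]

n = len(data)
group_counts = Counter(g for g, _ in data)
outcome_counts = Counter(y for _, y in data)
joint_counts = Counter(data)

# weight(g, y) = P(g) * P(y) / P(g, y): under-represented combinations (here, group B
# with the favorable outcome) are weighted up, over-represented ones are weighted down.
weights = {
    (g, y): (group_counts[g] / n) * (outcome_counts[y] / n) / (joint_counts[(g, y)] / n)
    for (g, y) in joint_counts
}

for (g, y), w in sorted(weights.items()):
    print(f"group={g}, outcome={y}: weight={w:.2f}")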
If the current data comes from a biased world, what’s the best way for humans to use this type of technology?
This relates to the concept of the ‘human in the loop’, or how humans actually implement their AI programs. A problem that shows up all over the place is that the human just pushes a button and lets the program function by itself.
People who use these machines should have a better-than-average understanding of what the tool is doing, how its results should be interpreted, and what action should be taken. To let the program decide without any meaningful human oversight is a disservice to the person whom we are making these decisions about.
As an adviser, maybe the program tells me that the student in front of me is likely to drop out of the econ program. But I know that the student is from a disadvantaged background, so maybe I’ll discount the algorithm’s decision because I know it’s based on historical patterns, not on the individual sitting in front of me. Maybe this student is highly motivated, so I’ll advise him to pursue econ anyway.
Algorithms are tools, and they can often catch things that the human in the loop might miss. But like all tools, they have strengths and limitations, and they need to be used well.