Can We Force AIs to Be Fair Towards People?
Artificial intelligence, it seems, can figure out how to do just about anything. It can simulate the Universe, learn to solve a Rubik’s Cube with just one hand, and even find ghosts hidden in our past.
Advances like these are meant to be for our own good. But what about when they’re not? In recent years, algorithmic systems that already affect people’s lives have shown alarming levels of bias, doing things like predicting criminality along racial lines and setting credit limits based on gender.
Against this backdrop, how can scientists ensure that advanced thinking systems can be fair, or even safe?
A new study led by researchers from the University of Massachusetts Amherst looks to offer an answer, describing a framework to prevent what the team calls “undesirable behaviour” in intelligent machines.
“When someone applies a machine learning algorithm, it’s hard to control its behaviour,” says machine learning researcher Philip Thomas.
“Making it easier to ensure fairness and avoid harm is becoming increasingly important as machine learning algorithms impact our lives more and more.”
The framework – which could help AI researchers to develop new kinds of machine learning (ML) algorithms – doesn’t imbue AIs with any inherent understanding of morality or fairness, but rather makes it easier for ML researchers to specify and regulate undesirable behaviour when they are designing their core algorithms.
At the heart of the new system are what the team calls ‘Seldonian’ algorithms, named after Hari Seldon, the central character of Isaac Asimov’s famous Foundation series of sci-fi novels. These algorithms aren’t limited to ensuring ethical operation; they can constrain any kind of behaviour, such as enforcing complex safety requirements in medical systems.
“If I use a Seldonian algorithm for diabetes treatment, I can specify that undesirable behaviour means dangerously low blood sugar, or hypoglycaemia,” Thomas says.
“I can say to the machine, ‘While you’re trying to improve the controller in the insulin pump, don’t make changes that would increase the frequency of hypoglycaemia.’ Most algorithms don’t give you a way to put this type of constraint on behaviour; it wasn’t included in early designs.”
As part of their research, the team developed just such a Seldonian algorithm to control an automated insulin pump, identifying a tailored way to safely predict doses for a person based on their blood glucose reading.
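To make the idea of a behavioural constraint concrete, here is a minimal Python sketch of a safety check in this spirit. It is not the team's implementation: the function names, the 0/1 encoding of hypoglycaemia events, the baseline rate and the choice of a Hoeffding bound are all assumptions made for illustration. The sketch accepts a candidate controller only when the data support, with high confidence, that its rate of undesirable events does not exceed the baseline.

```python
import math


def hoeffding_upper_bound(samples, delta):
    """Upper bound on the true mean of samples in [0, 1].

    By Hoeffding's inequality, the bound holds with probability
    at least 1 - delta.
    """
    n = len(samples)
    mean = sum(samples) / n
    return mean + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def safety_test(candidate_events, baseline_rate, delta=0.05):
    """Hypothetical safety check for a candidate controller.

    candidate_events: 0/1 indicators (assumed encoding), one per
        evaluation episode, where 1 means hypoglycaemia occurred.
    baseline_rate: event rate of the current controller (assumed known).
    Returns True only if the high-confidence upper bound on the
    candidate's event rate is no worse than the baseline.
    """
    if not candidate_events:
        return False  # no evidence, so refuse to change anything
    return hoeffding_upper_bound(candidate_events, delta) <= baseline_rate


# 1,000 event-free episodes are enough evidence to accept the change.
print(safety_test([0] * 1000, baseline_rate=0.05))  # True
# 10 event-free episodes are not, so the check declines.
print(safety_test([0] * 10, baseline_rate=0.05))    # False
```

The design choice worth noting is that the check can decline to return a new controller when there is too little data, rather than guessing.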
In another experiment, they developed an algorithm to predict student GPAs, while avoiding gender bias found in commonly used regression algorithms.
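One way to quantify bias in such a predictor is to compare how far predictions are off, on average, for each group. The Python sketch below illustrates that idea only; the function name, the group labels and the example numbers are hypothetical, and the measure shown is an assumption rather than a description of the study's exact definition.

```python
def mean_error_gap(predictions, targets, groups):
    """Absolute gap between two groups' mean signed prediction errors.

    A gap near zero means the model does not systematically
    over-predict for one group relative to the other.
    """
    errors_by_group = {}
    for pred, target, group in zip(predictions, targets, groups):
        errors_by_group.setdefault(group, []).append(pred - target)
    if len(errors_by_group) != 2:
        raise ValueError("expected exactly two groups")
    first, second = (sum(e) / len(e) for e in errors_by_group.values())
    return abs(first - second)


# Made-up GPAs: group "a" is over-predicted, group "b" under-predicted.
gap = mean_error_gap(
    predictions=[3.0, 3.2, 2.8, 3.0],
    targets=[3.0, 3.0, 3.0, 3.0],
    groups=["a", "a", "b", "b"],
)
print(round(gap, 3))  # 0.2
```

This is a plain point estimate. A constraint in the spirit of the framework would also need a confidence bound on the gap before a model is accepted.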
The researchers emphasise that these experiments only serve as a proof of principle of what Seldonian algorithms are capable of, and that the primary focus of the work is the framework itself, which other scientists can use as a guide to build future AI systems.
“We believe there’s massive room for improvement in this area,” Thomas says.
“Even with our algorithms made of simple components, we obtained impressive results. We hope that machine learning researchers will go on to develop new and more sophisticated algorithms using our framework, which can be used responsibly for applications where machine learning used to be considered too risky.”
The findings are reported in Science.