> a fair, merit-based AI reviewer
That's a dream which is unlikely to come true.
One reason is that the training data is biased, and it's very hard to make it less biased, let alone unbiased.
The other issue is that the AI companies behind the models are not interested in this. You can see the Grok saga playing out in plain sight, but the competitors are not much better. They patch a few things over but don't solve it at the root, and they have no incentive to.
Wouldn't that just require a robust, predefined ruleset we could all agree on? Let's make the dream come true!
The rule set is simple: "Don't be biased." But what does that mean? That is exactly the problem: it's hard (read: impossible) to define in technical, formal terms. That's because bias is at the root a social problem, not a technical one. Therefore you won't be able to solve it with technology alone, just like poverty, world peace, or racism.
The best you can hope for is to provide technical means to point out indicators of bias. But anything beyond that could, at worst, do more harm than good. ("The tool said this result is unbiased now! Keep your skepticism to yourself and let me publish!")
> That's because bias is at the root a social problem, not a technical one.
Bias is systematic error.
Maybe your thermometer just always reads 5° high.
Maybe it reads high on sunny days and low on rainy days.
Bias is distinct from random error, say if it's an electronic thermometer with a loose wire.
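The distinction matters because the two errors respond differently to more data. A toy simulation (the offset of 5°, the noise level, and the sample size are all made up for illustration) shows that averaging many readings cancels random error but leaves systematic error untouched:

```python
import random

random.seed(0)
TRUE_TEMP = 20.0
N = 10_000

# Random error only: readings are noisy but centered on the true value.
noisy = [TRUE_TEMP + random.gauss(0, 0.5) for _ in range(N)]

# Systematic error (bias): every reading carries the same +5° offset,
# on top of the same random noise.
biased = [TRUE_TEMP + 5.0 + random.gauss(0, 0.5) for _ in range(N)]

print(round(sum(noisy) / N, 1))   # averaging cancels the random error
print(round(sum(biased) / N, 1))  # the +5° bias survives averaging
```

No amount of extra samples fixes the second thermometer; you have to diagnose and correct the offset itself, which is the analogy being drawn to training data.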
For classification problems, there's also this impossibility result: https://www.marcellodibello.com/algorithmicfairness/handout/...
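One form of that impossibility result is an algebraic identity: a classifier's false-positive rate is pinned down by the base rate, PPV, and TPR, so two groups with different base rates cannot share all three. A toy numeric sketch (the base rates, PPV, and TPR values are invented for illustration):

```python
def fpr(base_rate: float, ppv: float, tpr: float) -> float:
    """False-positive rate implied by the identity
    FPR = p/(1-p) * (1-PPV)/PPV * TPR,
    where p is the base rate of the positive class."""
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr

# Two groups, identical PPV (0.8) and TPR (0.6), different base rates:
print(round(fpr(0.5, 0.8, 0.6), 4))  # group A, base rate 0.5
print(round(fpr(0.2, 0.8, 0.6), 4))  # group B, base rate 0.2
```

The two printed false-positive rates differ (0.15 vs. 0.0375), so equalizing predictive value and true-positive rate across groups forces unequal false-positive rates whenever base rates differ; which fairness criterion to sacrifice is a value judgment, not a technical one.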
Then let's try to be as little biased as possible, and fully transparent (which should also help with bias).