This looks really interesting. Maybe someone with more knowledge of the subject could compare this with other frameworks and say how far these guarantees can go? (e.g. fairness seems ambiguous to even define objectively)
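For example (standard textbook definitions, nothing specific to this repo), two of the most common formalizations of fairness already pull in different directions. Here Ŷ is the model's prediction, Y the true label, and A a protected attribute:

```
P(Ŷ = 1 | A = a) = P(Ŷ = 1 | A = b)                   (demographic parity)
P(Ŷ = 1 | A = a, Y = y) = P(Ŷ = 1 | A = b, Y = y)     (equalized odds, y ∈ {0,1})
```

If the base rates P(Y = 1 | A = a) differ between groups, a classifier can satisfy both at once only when its prediction is independent of Y, i.e. when it is useless. So even before you get to proving anything, you have to pick which definition of "fair" you mean.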
The repository is extremely disappointing:
>This repository provides a framework for specifying and proving properties—such as robustness, fairness, and interpretability—of machine learning models using Lean 4
FV of AI systems is hard, and this repo massively oversells itself. The Lean code itself is weird (AI generated?), and the "fairness" example proves that if a (linear) classifier always returns "yes", then the percentage of people classified as "yes" is the same across all demographics. That isn't meaningful, since the percentage is 100% for every possible group of people.
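To make that concrete, here is a minimal Lean 4 reconstruction of the shape of that claim; this is my own sketch with hypothetical names like `EqualRates`, not the repo's actual code. Stating "equal acceptance rates" without division (cross-multiplied), the theorem goes through for an always-yes classifier on any two groups whatsoever:

```lean
variable {α : Type}

-- "Equal acceptance rates" across two groups, cross-multiplied to avoid
-- division: |accepted g1| / |g1| = |accepted g2| / |g2|.
def EqualRates (f : α → Bool) (g1 g2 : List α) : Prop :=
  (g1.filter f).length * g2.length = (g2.filter f).length * g1.length

-- Filtering with a constant-true predicate keeps every element.
theorem filter_true_eq (l : List α) : l.filter (fun _ => true) = l := by
  induction l with
  | nil => rfl
  | cons a t ih => simp [List.filter, ih]

-- The vacuous "fairness" theorem: an always-yes classifier has equal
-- acceptance rates across *any* two groups, because the rate is 100%
-- everywhere. Nothing about the classifier is actually constrained.
theorem always_yes_fair (g1 g2 : List α) :
    EqualRates (fun _ => true) g1 g2 := by
  unfold EqualRates
  rw [filter_true_eq, filter_true_eq, Nat.mul_comm]
```

The whole proof is a couple of rewrites, which is the tell: the problem isn't the proof, it's that the property holds for any two groups regardless of what the classifier does.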
I was watching a video [1] the other day about the challenges FV faces for AI, in case that's something you'd be interested in. I'm not really familiar with FV for neural networks, and I'm not sure what progress looks like in the field, but I know ANSR [2] and TIAMAT [3] are doing related things.
[1]: https://www.youtube.com/watch?v=bs5snugP1VA
[2]: https://www.darpa.mil/research/programs/assured-neuro-symbol...
[3]: https://www.darpa.mil/research/programs/transfer-from-imprec...
> this repo massively oversells itself
that's literally every single paper in academic CS (except theory). every claim is an exaggeration or a lie by omission ("this only works on a small subset of ..."). every title is clickbait. every impl is worse than junior-level code. you become so jaded to it that you start to evaluate papers/projects based on a premonition of what they could be rather than what they are.
At least the author[0] seems to have some clout behind him. However, given that his code doesn't even compile and the premise seems massively overstated, I wonder whether his credentials (Stanford, etc.) are even genuine.
Yeah, that was my thought too. Is this system formally verifying low-level stuff like "if neurons A and B have a connection to neuron C, and A and B both fire, then..."?
It seems impossible to prove high-level things like "this computer vision system will never misidentify a cat as a dog."
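To illustrate the gap, here's a toy Lean 4 sketch of my own (nothing to do with this repo; `relu`, `relu_nonneg`, and `relu_mono` are made-up names) showing the kind of low-level, mathematically crisp property a proof assistant can actually check:

```lean
-- A single ReLU activation over the integers.
def relu (x : Int) : Int := if x < 0 then 0 else x

-- A ReLU unit never outputs a negative activation.
theorem relu_nonneg (x : Int) : 0 ≤ relu x := by
  unfold relu; split <;> omega

-- It is monotone: a larger input never lowers the output.
theorem relu_mono {a b : Int} (h : a ≤ b) : relu a ≤ relu b := by
  unfold relu; split <;> split <;> omega
```

Facts like these can be composed into input-output bound proofs for small networks, which is roughly where verification tools in this space operate today. "Never misidentifies a cat" is different in kind: "cat" has no mathematical definition, so there is no formal statement to prove in the first place.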