I'm not talking about GPT-4o here - in every benchmark I've seen, the new models from the past ~12 months outperform the March 2023 GPT-4 model.
To pick just the most popular one, https://lmarena.ai/?leaderboard= has GPT-4-0314 ranked 83rd now.
How have you been able to tie benchmark results to better results?
Vibes and intuition. Not much more than that.
Don't you think that presenting this as learning or knowledge is unethical?