semiinfinitely 2 days ago

I think log-sum-exp should actually be the function that gets the name "softmax", because it's actually a soft maximum over a set of values. And what we call "softmax" should be called "grad softmax" (since the gradient of logsumexp is softmax).
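
A quick way to check both claims numerically, as a rough sketch in plain Python. The helper names (`logsumexp`, `softmax`, `numerical_grad`) and the sample values are just picked for illustration; the gradient is approximated with central finite differences rather than computed exactly.

```python
import math

def logsumexp(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def softmax(xs):
    # The function conventionally called "softmax".
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def numerical_grad(f, xs, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(xs)):
        hi = xs[:i] + [xs[i] + eps] + xs[i + 1:]
        lo = xs[:i] + [xs[i] - eps] + xs[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

xs = [1.0, 2.0, 5.0]           # arbitrary sample values
lse = logsumexp(xs)            # a smooth upper bound on max(xs)
grad = numerical_grad(logsumexp, xs)
sm = softmax(xs)               # should match grad up to numerical error
```

For any input, logsumexp lands between `max(xs)` and `max(xs) + log(n)`, which is the sense in which it is a "soft" maximum.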

GistNoesis 1 day ago

softmax is badly named and should rather be called softargmax.
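
To illustrate why "softargmax" fits, here is a small sketch in plain Python. The `temperature` parameter and the name `softmax_t` are assumptions added for the demo, not part of the usual definition: dividing the inputs by a small temperature pushes the output toward a one-hot vector at the argmax position.

```python
import math

def softmax_t(xs, temperature):
    # Softmax of xs / temperature; lower temperature gives a sharper output.
    scaled = [x / temperature for x in xs]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

xs = [1.0, 2.0, 5.0]           # arbitrary sample values
warm = softmax_t(xs, 1.0)      # ordinary softmax: weight spread over all entries
cold = softmax_t(xs, 0.01)     # nearly one-hot at the position of the maximum
argmax = xs.index(max(xs))
```

The output is a soft indicator of where the maximum is, not an estimate of the maximum value itself.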