regularfry 8 days ago

That's not quite true. The system prompt is state that you can use for "training" in a way that fits the problem here. It's not differentiable, so you're in slightly alien territory, but it's also more comprehensible than gradient-descending a bunch of weights.
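
To make the "training without gradients" idea concrete, here is a minimal sketch of hill-climbing over a system prompt. Everything in it is an assumption for illustration: the phrase list is made up, and `score` is a stand-in for what would really be an eval run against a model.

```python
import random

# Hypothetical building blocks for a system prompt (illustrative only).
PHRASES = [
    "Be concise.",
    "Cite sources.",
    "Answer in JSON.",
    "Think step by step.",
    "Refuse unsafe requests.",
]

def score(prompt_phrases):
    """Stand-in for an eval suite. Assumption: a real version would run
    the model with this system prompt and grade its outputs."""
    wanted = {"Be concise.", "Cite sources."}
    chosen = set(prompt_phrases)
    return len(chosen & wanted) - 0.5 * len(chosen - wanted)

def hill_climb(steps=200, seed=0):
    """Derivative-free search: toggle one phrase, keep it if the score improves."""
    rng = random.Random(seed)
    current, best = [], score([])
    for _ in range(steps):
        phrase = rng.choice(PHRASES)
        if phrase in current:
            candidate = [p for p in current if p != phrase]
        else:
            candidate = current + [phrase]
        s = score(candidate)
        if s > best:  # discrete accept/reject step, no gradient involved
            current, best = candidate, s
    return current, best

prompt, best = hill_climb()
print(" ".join(prompt), best)
```

The state being "trained" is the readable phrase list itself, so you can inspect exactly what changed at every step, which is the comprehensibility point above.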

immibis 7 days ago

If you treat it as vectors instead of words, it might be differentiable.
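
A toy sketch of what that could look like, assuming a frozen linear "model" in place of a real LLM (the weights `W`, the averaging step, and the data are all made up for illustration). The prompt is a continuous vector, so plain gradient descent applies while the model weights stay fixed.

```python
# Frozen "model" weights (hypothetical); these are never updated.
W = [0.5, -1.0, 2.0]

def model(prompt_vec, input_vec):
    """Toy frozen model: average prompt and input vectors, then a linear readout."""
    mixed = [(p + x) / 2 for p, x in zip(prompt_vec, input_vec)]
    return sum(w * m for w, m in zip(W, mixed))

def tune_prompt(data, lr=0.1, steps=500):
    """Gradient descent on the prompt vector only, using half the mean squared error."""
    prompt = [0.0] * len(W)
    for _ in range(steps):
        grad = [0.0] * len(W)
        for x, target in data:
            err = model(prompt, x) - target
            # d(0.5 * err^2) / d(prompt[i]) = err * W[i] / 2
            for i, w in enumerate(W):
                grad[i] += err * w / 2 / len(data)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

# Made-up task: the prompt should shift every output up by 1.0.
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 1.0]]
data = [(x, model([0.0] * len(W), x) + 1.0) for x in inputs]

prompt = tune_prompt(data)
for x, target in data:
    print(round(model(prompt, x), 6), round(target, 6))
```

The trade-off is that the learned vector no longer corresponds to any words, so you gain differentiability and lose the readability of a text prompt.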