Item 43674280

freeone3000 • 8 days ago

Continual feedback means continual training. No way around it. So you’d have to scope down the functional unit to a fairly small lora in order to get reasonable re-training costs here.

regularfry • 8 days ago

That's not quite true. The system prompt is state that you can use for "training" in a way that fits the problem here. It's not differentiable so you're in slightly alien territory, but it's also more comprehensible than gradient-descending a bunch of weights.

1 reply

immibis • 7 days ago

If you treat it as vectors instead of words, it might be differentiable.

nico • 8 days ago

Or maybe figure out a different architecture

Either way, the end user experience would be vastly improved