Played with the demo a bit and I got confused.
1. The chat context is always provided, and that introduces a confound - when the chat history mentions something, the model is always inclined to connect back to it anyway.
2. When I set each context to an empty string, the model showed no evidence of remembering concepts. I told it 5 times that I love cats, and when asked about its favorite animal, its output remained "honeybee" and "octopus".
I can’t decide whether I’m skeptical of the entire concept. I can believe that folding this EMA of vectors into the network does *something*, which is why I’m surprised you didn’t get at least a shift in animals after talking about cats. But I’m not convinced that reweighting logits at the end is all that useful. I gather this is meant to be something like a realtime LoRA, but then what do you have except a severely undertrained LoRA, trained only on whatever conversations you’ve had?
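For what it's worth, here is a rough sketch of what I'm picturing the mechanism to be - purely my guess, not the demo's actual code, and the decay and scale values are made up:

```python
# Hypothetical sketch: keep an EMA over embeddings of concepts mentioned in
# chat, then bias the output logits by each token's similarity to that EMA.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 1000, 64
token_embeddings = rng.normal(size=(vocab_size, d_model))  # stand-in for the model's embedding matrix

decay = 0.99        # EMA decay (assumed value)
bias_scale = 2.0    # how hard the memory pushes on the logits (assumed value)
memory = np.zeros(d_model)  # the "memory" vector accumulated across turns

def update_memory(mentioned_token_ids):
    """Fold embeddings of mentioned concepts into the running EMA."""
    global memory
    for tok in mentioned_token_ids:
        memory = decay * memory + (1 - decay) * token_embeddings[tok]

def reweight_logits(logits):
    """Add a bias proportional to each token's similarity to the memory vector."""
    direction = memory / (np.linalg.norm(memory) + 1e-8)
    return logits + bias_scale * (token_embeddings @ direction)
```

If it's anything like this, the behavior I saw would make some sense: with a decay of 0.99, mentioning "cats" five times only moves the memory vector about 5% of the way toward the cat embedding, which may be nowhere near enough to flip the favorite-animal answer.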