This is a really, really good point.
Playing devil's advocate for conversation's sake: at the end of the day, the user and client app want very little persistent data coming from the server. If nothing else, the client expects to store chats as text, with external links or Potemkin placeholders for assets like files.
I agree with the devil's advocacy you've posed, and in retrospect I probably should have said "I bet these folks have a plan for binary data". These are clearly very serious people, so it might be more accurate to say that I strongly suspect a subsequent revision of the protocol will bake in efficient, transport-level handling of arbitrary tensors by default.