The big challenge with this approach, which the post doesn't touch on, is version skew. During a deploy, some new clients will talk to old servers and some old clients will talk to new servers. The ViewModel is a minimal representation of the data, and you can constrain it with backwards-compatibility guarantees (e.g. Protos or Thrift), while the UI component JSON and its associated JS must be compatible with the running client.
Vercel fixes this for a fee: https://vercel.com/docs/skew-protection
I do wonder how many people will use the new React features and then have short outages during deploys, like the FOUC of the past. Even their Pro plan has only 12 hours of protection, so if you leave a tab open for 24 hours and then click a button, it might hit a server whose server components and functions are incompatible.
Wouldn't this be easy to fix by injecting a version number field into every JSON payload and, if the expected version doesn't match the received one, just forcing a redirect/reload?
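To make the idea concrete, here is a minimal sketch of that check. Everything in it is an assumption for illustration: the `Envelope` shape, the `CLIENT_VERSION` constant, and the `checkVersion` helper are hypothetical names, not part of React or any framework.

```typescript
// Hypothetical build identifier baked into the client bundle at build time.
const CLIENT_VERSION = "build-42";

// Hypothetical envelope: every server response carries the version that produced it.
interface Envelope<T> {
  version: string;
  data: T;
}

// Either the payload is usable, or the client should reload itself.
type Outcome<T> = { kind: "ok"; data: T } | { kind: "reload" };

// Pure decision function: accept the payload only if the versions match exactly.
function checkVersion<T>(
  payload: Envelope<T>,
  expected: string = CLIENT_VERSION,
): Outcome<T> {
  if (payload.version !== expected) {
    return { kind: "reload" };
  }
  return { kind: "ok", data: payload.data };
}

// In a browser, the caller would act on the outcome, for example:
//   if (outcome.kind === "reload") window.location.reload();
```

Keeping the decision separate from the reload itself makes it easy to test, and leaves room to do something gentler than an immediate reload.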
Forcing a reload is a regression compared to the "standard" method proposed at the start of the article. If you have a REST API that requests attributes about a model, and the client is responsible for the presentation of that model, then it is much easier to support outdated clients (perhaps outdated by weeks or months, in the case of mobile apps) without interruption, because their pre-existing logic continues to work.
Arguable that it's a 'regression'...loading pages is kinda the normal behaviour in a web browser. You can try to paper over that basic truth but you can't abstract it away forever. Also, the original comment I replied to said it would be a 'big challenge', but if you accept that the web is the web and sometimes pages can load or even reload, then it's not really a 'challenge' any more at all.
Vercel's skew protection feature keeps old versions alive for a while and routes requests that come from an old client to that old version, with some API endpoints to forcibly kill old versions if need be, etc. I find it works reasonably well.
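For readers unfamiliar with the general idea, here is a rough sketch of version-pinned routing. It is not Vercel's implementation or API; the `Deployment` type, the `pickDeployment` function, and the rule that the protection window is measured from deploy time are all assumptions made for illustration.

```typescript
// Hypothetical record of a deployment that is still kept alive.
interface Deployment {
  id: string;
  deployedAt: number; // epoch milliseconds
}

// Pick the deployment that should serve a request.
// `requestedId` is the deployment the client was loaded from (assumed to be sent with each request).
// `maxAgeMs` is how long an old deployment stays routable (assumed to be measured from deploy time).
function pickDeployment(
  live: Deployment[],
  latest: Deployment,
  requestedId: string | undefined,
  now: number,
  maxAgeMs: number,
): Deployment {
  // Clients that do not declare a version get the latest deployment.
  if (requestedId === undefined) return latest;
  for (const candidate of live) {
    if (candidate.id === requestedId) {
      // Outside the protection window the old deployment is no longer served.
      return now - candidate.deployedAt > maxAgeMs ? latest : candidate;
    }
  }
  // The requested deployment was retired or forcibly killed.
  return latest;
}
```

Under these assumptions, a tab loaded from an old deployment keeps hitting that deployment for up to `maxAgeMs`, and only sees the new version (and possible skew) after the window closes.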
Wouldn't a solution that works perfectly be better than one that works 'reasonably well'?
Your solution doesn't work perfectly. It works perfectly in the sense that your engineers won't see errors related to this situation, but it does not work perfectly in that your users have a crappy experience. For example, if you have some long form and, after a user inputs a ton of stuff, you just refresh their browser for them and wipe it all out, then that is a crappy experience. Or you refresh their browser when their internet connection is bad and then prevent them from using your app until the whole thing reloads.
Maybe that doesn't matter for your use case, or you're willing to do a lot more legwork to prevent issues like that from occurring, but there will always be tradeoffs.
Thrashing is why
Sorry what do you mean by 'thrashing' in this context?
Reload causes skew causes reload
How does reload cause skew? Reload will just load the latest version of the webapp. That's the point.
If you force a reload before the rollout is complete, the user will still experience skew, because you haven't finished the rollout. The website will be completely unusable for a significant fraction of users. You might as well turn off the website during the rollout. This is the main concern of skew - how to keep the website usable at all times for all users across versions.
If your rollout times are very short then skew is not a big concern for you, because it will impact very few users. If it lasts hours, then you have to solve it.
After the rollout is complete, then reload is fine. It's a bit user hostile but they will reload into a usable state.
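A back-of-the-envelope sketch of why reloading mid-rollout thrashes. It assumes a simplified model that is not from the thread: a fraction `p` of servers runs the new build, and both the client bundle and each later request are served by independently, uniformly chosen servers with no session pinning. The function names are hypothetical.

```typescript
// Probability that a request crosses versions while a fraction `p` of servers runs the new build.
// Two ways to mismatch: old client + new server, or new client + old server.
function skewProbability(p: number): number {
  if (p < 0 || p > 1) throw new RangeError("p must be between 0 and 1");
  return 2 * p * (1 - p);
}

// Expected number of forced reloads before a request finally matches, when every
// mismatch triggers a reload that fetches the client from a random server again.
// Each attempt fails independently with probability q, so this is q / (1 - q).
function expectedReloads(p: number): number {
  const q = skewProbability(p);
  return q / (1 - q);
}
```

In this model, at the midpoint of a rollout (`p = 0.5`) half of all requests cross versions and a user can expect one forced reload per request on average. Once the rollout completes (`p = 1`), both values drop to zero, which is why reloading after the rollout is fine.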
If a webapp rollout lasts hours, you have a much bigger problem than skew which needs to be addressed urgently.
For most large-scale apps (web or native), rollouts take multiple hours or even days. Ramps are slow to avoid widespread incidents and to allow canary analysis to detect issues.