So the system design puts the burden on what seems to be synchronous, not queued, writes to get easy reads. I usually prefer simpler cheaper writes at the cost of more complicated reads as the reads scale and parallelize better.
you're underestimating the read load, by a lot