Thanks! I'm curious:
> I’d want friction on breaking this boundary
Why do you want friction?
> implications of joining across shards are not the same
That's usually well understood and can be tracked with real-time metrics. Ultimately, both are necessary, and alternative solutions, like joining in the app code, are not great.
> Why do you want friction?
Because 99% of the time, breaking the tenancy boundary is not the right thing to do. Most likely it's a sign that the tenant ID has been lost along the way, and that this should be fixed. Or that the use case is shady and should be thought about more carefully ("what are you _actually_ trying to do" type of thing).
A tenet I try to stick to is "make the right thing look different (and be easier) than the wrong thing": in this case I think that breaking the tenancy boundary should be explicit and more difficult than respecting it (i.e. sticking to one shard).
That's of course on the assumption that cross-shard queries mean (potentially) cross-tenancy, and that this isn't something that's usually desirable. That's the case in the apps I tend to work on (SaaS) but isn't always the case.
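To make that tenet concrete, here is a minimal Python sketch of what "explicit and more difficult" could look like at the application layer. Everything in it is hypothetical: `TenantDb`, `query`, `query_across_tenants` and the shard naming are made up for illustration and are not part of pgdog or pgcat. It only records where a query would be routed instead of running it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class TenantScopeError(Exception):
    """Raised when a query is issued without a valid tenancy scope."""


@dataclass
class TenantDb:
    """Hypothetical wrapper that records routing decisions instead of executing SQL."""

    log: List[tuple] = field(default_factory=list)

    def query(self, sql: str, *, tenant_id: Optional[int]) -> str:
        # The easy path: exactly one tenant, therefore exactly one shard.
        if tenant_id is None:
            raise TenantScopeError(
                "tenant_id is required; use query_across_tenants() to opt out"
            )
        route = f"shard-for-tenant-{tenant_id}"
        self.log.append((route, sql))
        return route

    def query_across_tenants(self, sql: str, *, reason: str) -> str:
        # The deliberate path: a different name and a mandatory justification,
        # so it stands out in code review and in logs.
        if not reason.strip():
            raise TenantScopeError("a non-empty reason is required")
        self.log.append(("all-shards", sql, reason))
        return "all-shards"
```

With something like this, a lost tenant ID fails loudly at the call site, and every cross-tenant query is greppable by method name.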
> That's usually well understood
By whom? It certainly wouldn't be well understood by the average dev at the average SaaS company, I don't think! Especially if normal joins and cross-shard joins look exactly the same, I don't think 90% of devs would even think about it (or know they should think about it).
---
This may sound like negative feedback: it's not! I fully believe that this is a really good tool, I'm really happy it exists and I'll absolutely keep it in my back pocket. I'm saying that the ergonomics of it aren't what I'd (ideally) want for the projects I work on professionally.
It sounds like we just need to add a config:
cross_shard_queries = off
What else am I missing?
>> I’d want friction on breaking this boundary
> Why do you want friction?
Probably because it makes accidental or malicious attempts to leak data across tenants harder, and therefore less likely.
Check this out and let me know what you think: https://pgdog.dev/blog/multi-tenant-pg-can-be-easy
I think there are a few good solutions for multi-tenant safety. We just need ergonomic wrappers at the DB layer to make them easy to use.
It’s an interesting idea, but how would such a system handle queries that should cross tenant boundaries? (E.g. system-level reporting)
1. Go around pgcat/pgdog?
2. I have had good luck using pragma comments for that kind of thing: a way to communicate to the infrastructure without the target system seeing it
3. From the "malicious compliance department," I would also accept "include it but in a tautological way" (tenant_id = :ten_id or tenant_id <> :ten_id)
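As a rough sketch of option 2, a proxy could look for a leading pragma comment, use it for its own routing decision, and strip it before forwarding the query. The pragma syntax here (`/* tenancy: cross_tenant */`) and the function names are assumptions made up for this example, not something pgcat or pgdog is known to support.

```python
import re
from typing import Optional, Tuple

# Hypothetical pragma syntax: /* tenancy: <value> */ at the very start of the query.
PRAGMA_RE = re.compile(r"^\s*/\*\s*tenancy:\s*(\w+)\s*\*/\s*", re.IGNORECASE)


def split_pragma(sql: str) -> Tuple[Optional[str], str]:
    """Return (pragma value or None, SQL with the pragma comment removed)."""
    match = PRAGMA_RE.match(sql)
    if match is None:
        return None, sql
    # Strip the comment so the target database never sees it.
    return match.group(1).lower(), sql[match.end():]


def allows_cross_tenant(sql: str) -> bool:
    """Only queries that carry the explicit pragma may fan out across tenants."""
    pragma, _ = split_pragma(sql)
    return pragma == "cross_tenant"
```

A system-level report would then be sent as `/* tenancy: cross_tenant */ SELECT ...`, while everything else keeps the default single-tenant behaviour.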
Echoing the comment below (above), since we can fingerprint queries using the Postgres parser, we can create an allow list and a more fine-grained ruleset.
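A minimal sketch of the allow-list idea, with one big caveat: it does not use the Postgres parser. As a stand-in it normalizes queries with regular expressions (literals replaced by `?`, whitespace collapsed, lowercased) and hashes the result, which is much cruder than a parser-based fingerprint. The function names and the example query are hypothetical.

```python
import hashlib
import re


def fingerprint(sql: str) -> str:
    """Crude regex-based stand-in for a parser-based query fingerprint."""
    normalized = re.sub(r"'(?:[^']|'')*'", "?", sql)             # string literals
    normalized = re.sub(r"\b\d+(?:\.\d+)?\b", "?", normalized)   # numeric literals
    normalized = re.sub(r"\s+", " ", normalized).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Fingerprints of queries that were reviewed and approved to cross tenants.
ALLOWED_CROSS_TENANT = {
    fingerprint(
        "SELECT tenant_id, count(*) FROM orders "
        "WHERE created_at > '2024-01-01' GROUP BY tenant_id"
    ),
}


def is_allowed(sql: str) -> bool:
    """True if the query has the same shape as an approved cross-tenant query."""
    return fingerprint(sql) in ALLOWED_CROSS_TENANT
```

The same approved report keeps matching when only its literal values change, while any query with a different shape is rejected.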