I enjoyed reading this. Admittedly I'm very new to the ActivityPub protocol, but it's hard to grasp at first how this leak actually occurs.
I read this part of the ActivityPub spec and I think I get it, but not completely. So is it really up to the ActivityPub server implementation to strip the bto/bcc audience fields and do the "right thing" in order to preserve privacy? Could anyone shed some light on this?
https://www.w3.org/TR/activitypub/#remove-bto-bcc-before-del...
@bob@pixelfed.social follows @alice@mastodon.social. pixelfed.social now receives all posts Alice makes which Bob can see. It has to, because it has to show these posts to Bob.
@mallory@pixelfed.social does not follow Alice. pixelfed.social has a bug where it's not correctly filtering out Alice's posts when Mallory visits her profile.
The root cause is that pixelfed.social thinks Mallory is following Alice, while mastodon.social thinks Mallory does not. This is because pixelfed.social accepts follow requests slightly too early in the handshake between pixelfed.social and mastodon.social.
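A rough sketch of what that timing bug looks like, as I understand it. This is a hypothetical simplification, not Pixelfed's actual code; every name and helper below is made up for illustration:

```python
# Hedged sketch, not Pixelfed's actual code: a minimal model of the
# Follow/Accept handshake, seen from the follower's server (pixelfed.social).
# All names and helpers here are invented for illustration.

def deliver(inbox_url, activity):
    """Stand-in for a signed POST to a remote actor's inbox."""
    print(f"POST {inbox_url}: {activity['type']}")

class FollowerSideServer:
    def __init__(self):
        # (local_user, remote_actor) pairs this server believes are active follows
        self.follows = set()

    def send_follow(self, local_user, remote_actor, remote_inbox):
        deliver(remote_inbox, {"type": "Follow", "actor": local_user, "object": remote_actor})
        # Roughly the bug described above: recording the follow as live now,
        # before the remote server has approved it with an Accept activity.
        self.follows.add((local_user, remote_actor))  # too early

    def on_accept(self, local_user, remote_actor):
        # The relationship should only be recorded here, once the followee's
        # server has explicitly sent back an Accept.
        self.follows.add((local_user, remote_actor))

    def can_view_followers_only(self, local_user, remote_actor):
        # Profile views of followers-only posts are gated on this table, so one
        # premature entry is enough to show Alice's posts to a non-follower.
        return (local_user, remote_actor) in self.follows
```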
There was an interesting follow-up to this post that adds more context to the incident and problem space: https://lemmy.world/post/27522773
This really sounds like a problem with ActivityPub if it doesn't have a protocol-level mechanism for this. The idea that an incomplete AP implementation is less secure than a complete one is worrisome to say the least.
> the release dropped. While the version increment (v0.12.4 to v0.12.5) implies a minor update, it’s a huge leap. We’re totalling more than 450 commits, including the requirement of a new version of PHP
yeah this is not a great way of doing things (even for solo devs)
ActivityPub just hands out "private" posts and trusts the foreign server implicitly to only show them to the right users.
But it's Pixelfed's fault.
The post states clearly that the foreign server only gets private posts if one of that server’s users is authorized to read the posts. How else do you expect it to work?
Either encrypt it such that only the authorized user can read it, or require the authorized user to retrieve it directly. Or don't lead users to expect that you're offering any privacy.
From the user's point of view, authorizing some other user to read something doesn't mean authorizing whoever runs that user's instance to read it. If your protocol has an architectural problem with that, it means you designed your protocol wrong.
> Either encrypt it such that only the authorized user can read it, or require the authorized user to retrieve it directly.
> From the user's point of view, authorizing some other user to read something
You keep using the singular "user", but the posts in question are not messages between one user and one other user but rather "followers only", such that anyone who follows the account is authorized to read the post.
> doesn't mean authorizing whoever runs that user's instance to read it.
That wasn't even the issue.
I would expect that only authorized users can access authenticated data and that we don't blindly assume a foreign server is 110% trustworthy at all times.
Maybe instead of just propagating authenticated cleartext data to unknown servers, users should get that data directly from the authenticating server?
We replicate public posts to reduce server load. Your server can rebroadcast my message to your 10k users instead of my server handling those 10k requests. But doing this for private data you need to be logged in to access is unnecessary and dumb. I have a perfectly fine server that is trustworthy. I want to send some private data to a single individual. So obviously I send that data off in plaintext to some random third server, which I must assume is as trustworthy as my server, so it can relay that data to (hopefully) only the recipient.
I expect that if I send private data to one user, it goes to that user and no one else. The only thing stopping a foreign server from publicly posting my private data for everyone to see is a "please don't" flag on the packet. Does this sound like a well-designed and robust protocol?
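For what it's worth, here is a sketch of the alternative being described: the origin server keeps followers-only posts and makes the access decision on every fetch, instead of pushing cleartext to remote servers. This is purely illustrative; the names and data structures are invented, and authenticating the requester (e.g. via signed fetches) is hand-waved:

```python
# Sketch only, with invented names: pull model (authorization enforced where
# the data lives) versus push model (cleartext leaves the origin server and
# enforcement becomes the remote servers' problem).

FOLLOWERS = {"alice": {"bob@pixelfed.social"}}  # followee -> approved followers
POSTS = {42: {"author": "alice", "visibility": "followers", "body": "private stuff"}}

def serve_post(post_id, requester):
    """Pull model: the origin server decides, at fetch time, who may read the post."""
    post = POSTS[post_id]
    if post["visibility"] == "followers" and requester not in FOLLOWERS[post["author"]]:
        return None
    return post

def push_post(post_id, follower_inboxes):
    """Push model: the cleartext is handed out up front, with only a
    'please don't show this to anyone else' expectation attached."""
    for inbox in follower_inboxes:
        print(f"POST {inbox}: {POSTS[post_id]['body']}")
```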
The real meat:
> The problem only becomes apparent when you have at least one legit accepted follower from a Pixelfed server. Now that server is allowed to fetch all your private posts. And when it knows the posts, it has to decide who to show them. When you accept a follower, you not only place your trust to keep a secret on them, but also on their admin and the software they are running.
Like I get it, compromises had to be made because of caching: it would be untenable if the same server had to fetch a single post hundreds of thousands of times. But this makes ActivityPub an extremely high-trust protocol between servers.
So wait. You have a federated protocol that trusts and expects every instance to enforce a user privacy setting?
That is, put simply, utterly incompetent shitty design.
How is that different from email?
Most email is not E2EE. Thus, it's up to the receiving email server to ensure that only authorized accounts on that server can read the received emails. The sender's email server has no control over the receiver's email server.
It would be a scandal if a popular mail server implementation allowed any account on the server to read private emails.
> How is that different from email?
Expectations have evolved just a bit since 1982. We've learned things. Competent protocol designers don't ignore decades of improvement in the state of the art.
Also, and/or as part of that:
1. Email was expected to be mostly business email, and the operator of the instance was expected to be the business involved... meaning they were fairly likely to "own" the content anyway. Insofar as people were thinking about personal email, the expectation was that, in the long term, you'd be running your own server. Which you should still be doing.
2. Even if that broke down, you at least expected that the person you were sending mail to had some much closer relationship with their server operator than you usually do with a social media operator. Or with GMail.
3. Users were much more sophisticated and could reasonably be expected to understand the risks.
4. Nobody actually expected there to be all that much "embarrassing" content that people wanted to keep private.
5. Cryptography was far less widely understood, and most people thought it was legally risky to use it because of export controls and various government threats.
6. Nobody was offering a security setting and then failing to deliver on the obvious expectation it created.
> It would be a scandal if a popular mail server implementation allowed any account on the server to read private emails.
Indeed. And what Pixelfed is doing is also bad. That doesn't change the fact that the protocol is a bad design... unforgivably bad any time after about 2000.
One would hope that a protocol designed in 2018 would have more security than email.
This would be solved with encrypted messages. I'm sure dansup can figure this one out; we just need keypairs at the user level.
wait, so if you have 10k followers you're proposing to encrypt every post 10k times? (we're talking about posts, not DMs)
That's not how it should work, though. The post should get encrypted only once with a symmetric key (e.g. AES), and then that key gets encrypted with each follower's public key. So it's not the post itself but the encryption key that has to scale with the audience. This is how PGP-encrypted email works.
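A minimal sketch of that hybrid scheme in Python, using the `cryptography` package. This only illustrates the idea; the follower keys are generated in place, and key distribution, storage, and signing are all left out:

```python
# Hedged sketch of the hybrid scheme described above: encrypt the post once
# with a symmetric key, then wrap that key for each follower's public key.
# Not Pixelfed/ActivityPub code; all names here are illustrative.

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

followers = {name: rsa.generate_private_key(public_exponent=65537, key_size=2048)
             for name in ["bob", "carol"]}

post = b"followers-only post body"

# 1. One symmetric encryption of the post, regardless of follower count.
content_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(content_key).encrypt(nonce, post, None)

# 2. One small key-wrap per follower; only this part scales with audience size.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_keys = {name: key.public_key().encrypt(content_key, oaep)
                for name, key in followers.items()}

# A follower unwraps the content key with their private key and decrypts.
bob_key = followers["bob"].decrypt(wrapped_keys["bob"], oaep)
assert AESGCM(bob_key).decrypt(nonce, ciphertext, None) == post
```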
you're still encrypting n times, the post needs to be stored n times, and that's even before thinking about key rotation and server portability
this is not practical
> if you have 10k followers you’re proposing to encrypt every post 10k times?
I mean, yes. Why you’re sending a “private” post to ten thousand people is another question.
That sounds like a UX nightmare. What happens when you approve a new follower? Do you encrypt your entire post history for them? How long would that take?
> do you encrypt your entire post history for them?
Sure.
> how long would that take ?
Shouldn’t take THAT long. This is the cost of privacy.
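Under the hybrid scheme sketched earlier, "encrypting the history for a new follower" wouldn't mean re-encrypting the post bodies at all, only wrapping each stored post's content key for the new follower's public key. Again just a sketch, with assumed names and structures:

```python
# Sketch only, continuing the hybrid example above: approving a new follower
# re-wraps each stored post's content key for their public key; the already
# encrypted post bodies are never touched or re-encrypted.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

def approve_follower(post_store, follower, follower_public_key):
    for post in post_store.values():
        # One small RSA encryption per old post: the per-post symmetric key
        # (which the author keeps) is wrapped for the new follower.
        post["wrapped_keys"][follower] = follower_public_key.encrypt(post["content_key"], OAEP)
```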