NavinF 8 days ago

Last time I saw someone complaining about scrapers, they were talking about 100 GiB/month. That's roughly 300 kbps: less than $1/month in IP transit and ~$0 in compute. Personally, I've never noticed bots show up on a resource graph. As long as you don't block them, they won't bother using more than a few IPs, and they'll back off when they're throttled.
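
For anyone who wants to check the arithmetic behind the 100 GiB/month figure, here is a rough sketch. It assumes a 30-day month and traffic spread evenly over it (real scraper traffic is burstier, so peaks would be higher); the function name is made up for illustration.

```python
def average_bitrate_kbps(gib_per_month: float, days: float = 30.0) -> float:
    """Average bitrate in kbps (1 kbps = 1000 bit/s) for a monthly transfer volume.

    Assumes the traffic is spread evenly over `days` days.
    """
    bits = gib_per_month * 2**30 * 8   # GiB -> bytes -> bits
    seconds = days * 24 * 60 * 60      # days -> seconds
    return bits / seconds / 1000       # bit/s -> kbps


rate = average_bitrate_kbps(100)
print(f"{rate:.0f} kbps")  # prints "331 kbps"
```

So "300 kbps" is a fair round number for the monthly average. It says nothing about peak load.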

marcusb 8 days ago

For some sites, things are a lot worse. See, for example, Jonathan Corbet's report[0].

0 - https://social.kernel.org/notice/AqJkUigsjad3gQc664

NavinF 6 days ago

He provides no info. Req/s? 95th-percentile Mbps? How does he know the requests come from an "AI scraper" as opposed to a normal L7 DDoS? LWN is a pretty simple site; it should be easy to saturate 10G ports.

nonrandomstring 8 days ago

Didn't rachelbythebay post recently that her blog was being swamped? I've heard that from a few self-hosting bloggers now. And Wikipedia has recently said more than half of its traffic is now bots. Are you claiming this isn't a real problem?

NavinF 6 days ago

How exactly can a blog get swamped? It takes ~0 compute per request. Yes, I'm claiming this is a fake problem.

lmz 8 days ago

How can you say it's $0 in compute without knowing if the data returned required any computation?

NavinF 6 days ago

Look at the sibling replies. All the kvetching comes from blogs and simple websites, not from sites that consume compute per request.