Item 42258330

orf • 17 hours ago

My experience: I put parquet files on R2, but HTTP Range requests were failing. 50% of the time it would work, and 50% of the time it would return all of the content and not the subset requested. That’s a nightmare to debug, given that software expects it to work consistently or not work at all.

Seems like a bug. Had to crawl through documentation to find out the only support is on Discord (??), so I had to sign up.

Went through some more hoops and eventually got to a channel where I received a prompt reply: it’s not an R2 issue, it’s “expected behaviour” due to an issue with “the CDN service”.

I mean, sure. On a technical level. But I shoved some data into your service and basic standard HTTP semantics were intermittently not respected: that’s a bug in your service, even if the root cause is another team.

None of this is documented anywhere, even if it is “expected”. Searching for “r2 http range” [1] shows I’m not the only one surprised.

Not impressed, especially as R2 seems ideal for serving Parquet data for small projects. This and the janky UI plus weird restrictions make the entire product feel distinctly half finished and not a serious competitor.

1. https://www.google.com/search?q=r2+http+range&ie=UTF-8&oe=UT...

saurik • 15 hours ago

> given that software expects it to work consistently or not work at all

I mean... that's wrong? If you come across such software, do you at least file a bug?


orf • 14 hours ago

Of course not, and it’s completely correct behaviour: if a server advertises it supports Range requests for a given URL, it’s expected to support it. Garbage in, garbage out.

It’s not clear how you’d expect to handle a webserver trying to send you 1 GB of data after you asked for a specific 10 KB range other than aborting.


saurik • 13 hours ago

"Conversely, a client MUST NOT assume that receiving an Accept-Ranges field means that future range requests will return partial responses. The content might change, the server might only support range requests at certain times or under certain conditions, or a different intermediary might process the next request." -- RFC 9110


orf • 13 hours ago

Sure, but that’s utterly useless in practice because there is no way to handle that gracefully.

To be clear: most software does handle it, because it detects this case and aborts.
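For anyone wondering what "detects this case and aborts" looks like, here is a rough sketch using only the Python standard library. The names `RangeNotHonored`, `check_range_response` and `fetch_range` are made up for illustration and are not from any particular Parquet or HTTP client; real libraries will differ in the details.

```python
import urllib.request


class RangeNotHonored(Exception):
    """Raised when the server ignores or mangles a Range request."""


def check_range_response(status, content_range, start, end):
    """Validate a response to `Range: bytes=start-end` (end inclusive).

    `content_range` is the raw Content-Range header value, or None.
    """
    if status == 200:
        # The server ignored the Range header and is sending everything.
        raise RangeNotHonored("got 200 with the full body instead of 206")
    if status != 206:
        raise RangeNotHonored(f"unexpected status {status}")
    # Expected form: "bytes 0-1023/146515" (the total may be "*").
    # Only the start is checked, since a server may legitimately return
    # a shorter range when `end` is past the end of the object.
    if content_range is None or not content_range.startswith(f"bytes {start}-"):
        raise RangeNotHonored(f"unexpected Content-Range: {content_range!r}")


def fetch_range(url, start, end):
    """Fetch bytes [start, end] of `url`, aborting if the range is not honored."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    # Status and headers arrive before the body is read, so raising here
    # leaves the `with` block and closes the response without reading
    # the rest of the body.
    with urllib.request.urlopen(req) as resp:
        check_range_response(
            resp.status, resp.headers.get("Content-Range"), start, end
        )
        return resp.read()
```

The check is split out from the network call so it can be exercised without a server. From the caller's side, a 200 here is still a hard failure, which is the point being made above.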

But to a user who is explicitly asking to read a parquet file without buffering the entire file into memory, there is no distinction between a server that cannot handle any range requests and a server that can occasionally handle range requests.
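To make that concrete: the only alternative to aborting that I can think of is to accept the full 200 body and throw away everything outside the requested range as it streams past. A minimal sketch, assuming the response body is exposed as a file-like object starting at offset 0 (`slice_from_stream` is a hypothetical helper, not something an existing library provides under that name):

```python
import io


def slice_from_stream(stream, start, end, chunk_size=64 * 1024):
    """Return bytes [start, end] (inclusive) from a file-like `stream`
    that begins at offset 0, without holding the whole body in memory."""
    # Read and discard everything before the requested range.
    to_skip = start
    while to_skip > 0:
        chunk = stream.read(min(chunk_size, to_skip))
        if not chunk:
            return b""  # body ended before the requested range
        to_skip -= len(chunk)

    # Collect only the bytes inside the range, then stop reading.
    wanted = end - start + 1
    parts = []
    while wanted > 0:
        chunk = stream.read(min(chunk_size, wanted))
        if not chunk:
            break
        parts.append(chunk)
        wanted -= len(chunk)
    return b"".join(parts)


print(slice_from_stream(io.BytesIO(b"0123456789"), 2, 5))  # b'2345'
```

This keeps memory bounded, but every byte up to `end` still has to cross the network. Parquet keeps its metadata in a footer at the end of the file, so the very first read would pull down nearly the entire object.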

Other than one being much, much more annoying.