Item 42241624

That will probably never happen because of the fundamental nature of blob storage.

Individual objects are split into multiple blocks, each of which can be stored independently on different underlying servers. Each can see its own block, but not any other block.

Calculating a hash like SHA256 would require a sequential scan through all blocks. This could be done with a minimum of network traffic if instead of streaming the bytes to a central server to hash, the hash state is forwarded from block server to block server in sequence. Still though, it would be a very slow serial operation that could be fairly chatty too if there are many tiny blocks.

What could work would be to use a Merkle tree hash construction where some of subdivision boundaries match the block sizes.

texthompson • 3 days ago

Why would you PUT an object, then download it again to a central server in the first place? If a service is accepting an upload of the bytes, it is already doing a pass over all the bytes anyway. It doesn't seem like a ton of overhead to calculate SHA256 in the 4092-byte chunks as the upload progresses. I suspect that sort of calculation would happen anyways.

2 replies

willglynn • 3 days ago

You're right, and in fact S3 does this with the `ETag:` header… in the simple case.

S3 also supports more complicated cases where the entire object may not be visible to any single component while it is being written, and in those cases, `ETag:` works differently.

> * Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-S3 or plaintext, have ETags that are an MD5 digest of their object data.

> * Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-C or SSE-KMS, have ETags that are not an MD5 digest of their object data.

> * If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption. If an object is larger than 16 MB, the AWS Management Console will upload or copy that object as a Multipart Upload, and therefore the ETag will not be an MD5 digest.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.h...

danielheath • 3 days ago

S3 supports multipart uploads which don’t necessarily send all the parts to the same server.

1 reply

texthompson • 3 days ago

Why does it matter where the bytes are stored at rest? Isn't everything you need for SHA-256 just the results of the SHA-256 algorithm on every 4096-byte block? I think you could just calculate that as the data is streamed in.

2 replies

jiggawatts • 3 days ago

The data is not necessarily "streamed" in! That's a significant design feature to allow parallel uploads of a single object using many parts ("blocks"). See: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMu...

Dylan16807 • 2 days ago

> Isn't everything you need for SHA-256 just the results of the SHA-256 algorithm on every 4096-byte block?

No, you need the hash of the previous block before you can start processing the next block.

flakes • 3 days ago

You have just re-invented IPFS! https://en.m.wikipedia.org/wiki/InterPlanetary_File_System

losteric • 3 days ago

Why does the architect of blob storage matter? The hash can be calculated as data streams in for the first write, before data gets dispersed into multiple physically stored blocks.

1 reply

willglynn • 3 days ago

It is common to use multipart uploads for large objects, since this both increases throughput and decreases latency. Individual part uploads can happen in parallel and complete in any sequence. There's no architectural requirement that an entire object pass through a single system on either S3's side or on the client's side.

Salgat • 3 days ago

Isn't that the point of the metadata? Calculate the hash ahead of time and store it in the metadata as part of the atomic commit for the blob (at least for S3).