moefh 4 days ago

> Passing pointers to the middle of a data structure. For example, free takes a pointer to the start of an allocation. The management structure appears just before that in memory; computing the address of which appears to be undefined behavior to the compiler.

To clarify, the undefined behavior here is that the sanitizer sees `free` trying to access memory outside the bounds of what was returned by `malloc`.

It's perfectly valid to compute the address of a struct just before memory pointed to by a pointer you have, as long as the result points to valid memory:

    void not_free(void *p) {
      struct header *h = (struct header *) (((char *)p) - sizeof(struct header));
      // ...
    }

In the case of `free`, that resulting pointer is technically "invalid" because it's outside what was returned by `malloc`, even though the implementation of `malloc` presumably returned a pointer to memory just past the header.
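
For a concrete picture, here is a minimal sketch of a wrapper that keeps its own header just before the pointer it hands out. The names `toy_alloc`, `toy_size`, `toy_free` and the contents of `struct header` are made up for illustration; real `malloc` implementations keep different bookkeeping. Stepping backwards is fine here because the header and the payload live in the same underlying allocation:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header; a real allocator stores different fields. */
struct header {
    size_t size;   /* number of payload bytes the caller asked for */
};

/* Allocate header + payload as one block and return the payload.
 * Note: the payload is only aligned for size_t here; a real
 * allocator would pad the header to the strictest alignment. */
void *toy_alloc(size_t size) {
    struct header *h = malloc(sizeof(struct header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;   /* first byte past the header */
}

/* Step back from the payload to the header. Both are inside the
 * same malloc'd block, so the computed pointer is valid. */
size_t toy_size(void *p) {
    struct header *h = (struct header *)((char *)p - sizeof(struct header));
    return h->size;
}

/* Release the whole block by passing its true start to free. */
void toy_free(void *p) {
    if (p == NULL) return;
    free((char *)p - sizeof(struct header));
}
```

From the caller's point of view, `toy_alloc(32)` only "gave out" 32 bytes, which is the same situation a sanitizer sees with the real `free`.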

josephg 4 days ago

Yeah; I used to enjoy poking through C code in well-written programs. It’s amazing what gems you can find by people who really know their stuff.

I saw a string library once which took advantage of this. The library passed around classic C style char* pointers. They work in printf, and basically all C code that expects a string. But the strings had extra metadata stored before the string content. That metadata contained the string’s current length and the total allocation size. As a result, you could efficiently get a string length without scanning, append to a string, and do all sorts of other useful things that are tricky to do with bare allocations. All while maintaining support for the rest of the C ecosystem. It’s a very cool trick!
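
A rough sketch of how such a library could work, assuming a metadata struct placed immediately before the characters. All names here (`str_meta`, `str_new`, `str_len`, `str_append`, `str_free`) are hypothetical and not taken from the library described above:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical metadata stored right before the string content. */
struct str_meta {
    size_t len;   /* current length, not counting the NUL */
    size_t cap;   /* room for characters, not counting the NUL */
};

/* Recover the metadata from the char* the caller holds. */
static struct str_meta *meta_of(char *s) {
    return (struct str_meta *)(s - sizeof(struct str_meta));
}

/* Create a string; the returned char* works with printf, strcmp, etc. */
char *str_new(const char *init) {
    size_t len = strlen(init);
    struct str_meta *m = malloc(sizeof *m + len + 1);
    if (m == NULL) return NULL;
    m->len = len;
    m->cap = len;
    char *s = (char *)(m + 1);
    memcpy(s, init, len + 1);   /* copies the NUL too */
    return s;
}

/* O(1) length lookup, no scanning. */
size_t str_len(char *s) { return meta_of(s)->len; }

/* Append tail (which must not point into s). Returns the possibly
 * moved string, or NULL on failure with the original left intact. */
char *str_append(char *s, const char *tail) {
    struct str_meta *m = meta_of(s);
    size_t add = strlen(tail);
    size_t need = m->len + add;
    if (need > m->cap) {
        size_t cap = need * 2;   /* simple doubling growth policy */
        struct str_meta *grown = realloc(m, sizeof *m + cap + 1);
        if (grown == NULL) return NULL;
        m = grown;
        m->cap = cap;
        s = (char *)(m + 1);
    }
    memcpy(s + m->len, tail, add + 1);
    m->len = need;
    return s;
}

void str_free(char *s) { if (s != NULL) free(meta_of(s)); }
```

The catch is that these strings must be released with `str_free`, never plain `free`, and nothing stops you from handing `str_len` an ordinary `char *` that has no metadata in front of it.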

alexvitkov 4 days ago

The Windows API uses this scheme for one of its 50 string types [1].

I'm not very fond of this design, as it's easy to pass a "normal" C string where a BSTR is expected; that compiles because BSTR is just a typedef for it.

You can allocate the exact same data structure, but store a pointer to the size prefix, instead of the first byte - you avoid that issue, and can still pass the data field to anything expecting a zero-terminated string:

  struct WeirdString { int size; char data[]; };  // flexible array member
  struct WeirdString* ws = ...;
  fopen(ws->data, "r");
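
To flesh that out a little, here is one way the allocation could look. This is only a sketch: `weird_new` is a made-up helper, `size` is a `size_t` rather than an `int`, and the array is spelled as a C99 flexible array member (`char data[]`), which is the standard form of the zero-length-array idiom:

```c
#include <stdlib.h>
#include <string.h>

struct WeirdString {
    size_t size;    /* length of data, not counting the NUL */
    char data[];    /* C99 flexible array member */
};

/* Hypothetical constructor: one allocation holds prefix + characters. */
struct WeirdString *weird_new(const char *text) {
    size_t n = strlen(text);
    struct WeirdString *ws = malloc(sizeof *ws + n + 1);
    if (ws == NULL) return NULL;
    ws->size = n;
    memcpy(ws->data, text, n + 1);   /* copies the NUL too */
    return ws;
}
```

Since the handle is a `struct WeirdString *`, passing a bare `char *` to a function that expects one gets a compiler diagnostic instead of compiling silently, and the whole thing is released with an ordinary `free(ws)`.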

[1] BSTR - https://learn.microsoft.com/en-us/previous-versions/windows/...