He doesn't mention how he calibrates the camera extrinsics or intrinsics. This isn't new; it's basic structure from motion. He is going to be amazed when he "discovers" photogrammetry and bundle adjustment next.
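For intrinsics at least, the stock OpenCV checkerboard routine is usually enough. A minimal sketch of what that looks like (board size and image paths are invented; none of this is from his repo):

```python
import glob
import cv2
import numpy as np

# Hypothetical setup: a 9x6 inner-corner checkerboard shot from many angles,
# with the images saved under calib/*.jpg.
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points, image_size = [], [], None
for path in glob.glob("calib/*.jpg"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
        image_size = gray.shape[::-1]   # (width, height)

# Standard OpenCV calibration: recovers the camera matrix K and lens distortion.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("RMS reprojection error (px):", rms)
print("K =\n", K)
```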
I checked out the code, and it seems there is no consideration of camera parameters. It's a neat demo, but impractical given the need for precise camera calibration over distance. Long-baseline stereo has problems unless you can figure out how to keep the cameras aligned to within fractions of a millimetre over great distances.
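Rough back-of-envelope for why that matters (numbers purely illustrative): with small-angle parallax, range ≈ baseline / parallax angle, so an orientation error of Δθ turns into a range error of roughly range² / baseline × Δθ.

```python
import math

# Illustrative only: 1 km baseline, 1 arcsecond of unmodelled camera misalignment.
baseline_m = 1_000.0
pointing_error_rad = math.radians(1.0 / 3600.0)

for target_range_m in (10_000.0, 1_000_000.0):   # a nearby target vs. a distant one
    # range ~ baseline / parallax  =>  d(range)/d(parallax) ~ -range**2 / baseline
    range_error_m = target_range_m ** 2 / baseline_m * pointing_error_rad
    print(f"{target_range_m / 1000:7.0f} km target -> ~{range_error_m:,.1f} m range error")
```

The quadratic dependence on range is what makes the "great distances" part so unforgiving.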
At 2:39 he talks about camera orientation and says that he has a solution.
Is what he's doing even long-baseline stereo imagery? He is clearly not aligning the cameras: I'm guessing he is ensuring they don't wobble and solving for the camera orientation outside of the code you can see (maybe using reference objects like the sun or the daytime moon?).
I enjoyed his ponderings on bird-strike avoidance; that part might even be commercialisable. The asteroid detection seemed much more far-fetched (light gathering is hard).
In order to project the rays into the voxel grid and have them intersect when they are observing the same object, the cameras do need to be calibrated; this is very much multi-view stereo CV. I'd guess that he's taking nighttime photos, comparing them against a ground-truth star map, and choosing correspondences to compute a rotation/translation matrix for each camera.
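If that's what he's doing, the per-camera orientation fit is a fairly standard Wahba/Kabsch problem: turn the matched star pixels into unit rays via the intrinsics, look up the same stars' catalog directions, and solve for the rotation that best maps one set onto the other. A minimal NumPy sketch with made-up data (my guess at the shape of the approach, not his actual code):

```python
import numpy as np

def rotation_from_star_matches(cam_dirs, sky_dirs):
    """Kabsch/Wahba solve: find the rotation R minimising sum ||R @ cam_i - sky_i||^2.
    cam_dirs / sky_dirs are N x 3 arrays of matched unit direction vectors:
    star rays in the camera frame vs. the same stars' catalog directions."""
    H = cam_dirs.T @ sky_dirs                  # 3x3 cross-covariance of the matches
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against a reflection
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

# Toy check with invented numbers: five fake "catalog" directions,
# a known 10-degree camera rotation, and a recovery of it from the matches.
rng = np.random.default_rng(0)
sky_dirs = rng.normal(size=(5, 3))
sky_dirs /= np.linalg.norm(sky_dirs, axis=1, keepdims=True)

a = np.radians(10.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
cam_dirs = sky_dirs @ R_true                   # each row is R_true.T @ sky_i

R_est = rotation_from_star_matches(cam_dirs, sky_dirs)
print(np.allclose(R_est, R_true))              # True: orientation recovered
```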