He didn’t go into detail on how the cameras are calibrated, but I wonder if he’s using stars to do that. Standard SfM would struggle with such featureless images and such a wide baseline.
I thought: airplanes, the sun, and the moon could all work.
You need a lot more than one reference though. At minimum I think you need 8 correspondences to solve for the camera’s pose. Things like airplanes could work if you had enough of them, but since they move, the imagery in all your cameras would have to be shutter-synced, which is impractical with a bunch of webcams.
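
For anyone wondering where the 8 comes from: each correspondence gives one linear equation on the 9 entries of the fundamental matrix F (via x2ᵀ F x1 = 0), and F is only defined up to scale, so 8 equations pin it down linearly. Here is a rough sketch of the normalized eight-point algorithm in NumPy. The function names `normalize_points` and `eight_point` are my own, and this is only the textbook method, not necessarily what he used. If the intrinsics are already known, fewer points can suffice (the five-point algorithm), so treat 8 as the number for the simple linear solve.

```python
import numpy as np

def normalize_points(pts):
    """Shift points to zero mean and scale so the mean distance is sqrt(2).

    pts is an (N, 2) array. Returns (N, 3) homogeneous points and the
    3x3 transform T that was applied.
    """
    centroid = pts.mean(axis=0)
    mean_dist = np.mean(np.linalg.norm(pts - centroid, axis=1))
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return homog @ T.T, T

def eight_point(x1, x2):
    """Estimate F such that x2^T F x1 = 0 from N >= 8 correspondences.

    x1, x2 are (N, 2) arrays of matching image points in camera 1 and 2.
    """
    if len(x1) < 8 or len(x1) != len(x2):
        raise ValueError("need at least 8 matching points")
    p1, T1 = normalize_points(np.asarray(x1, dtype=float))
    p2, T2 = normalize_points(np.asarray(x2, dtype=float))

    # One row per correspondence; columns follow F flattened row by row.
    A = np.column_stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1)),
    ])

    # The null vector of A (last right singular vector) is F up to scale.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)

    # A valid fundamental matrix has rank 2, so zero the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt

    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

Note that this only gets you the two-view geometry. Recovering the actual rotation and translation from F still needs the intrinsics, and with noisy matches you would want many more than 8 points plus some outlier rejection.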