I haven't installed this yet, but does it require camera access? i.e. does it "transform" your own image to the target image while maintaining facial expression, pose, etc.? Based on the animations, I'd assume it doesn't use the camera since there are techniques that can lipsync from audio.
no camera access needed! it generates the animation directly from audio. this is more than just lip sync btw, it's animating the head in the image.