I wonder what implementing this for, e.g., live streaming would require.

On Linux there are techniques involving a loopback video device, such as v4l2loopback: https://github.com/umlaeute/v4l2loopback

Effectively, these let an app (e.g. some VToonify tool) generate content that, from the perspective of your live streaming app, looks like it comes from a webcam.
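
To make that concrete, here is a rough sketch of the producer side, not a tested integration. It assumes the v4l2loopback module is already loaded (for example with `sudo modprobe v4l2loopback video_nr=10 exclusive_caps=1`), that the loopback device shows up as `/dev/video10`, and that `ffmpeg` is installed with v4l2 output support. The helper names (`build_ffmpeg_command`, `solid_frame`, `stream`) are made up for illustration, and the solid-colour frames are only a stand-in for whatever a VToonify-style tool would actually produce.

```python
import subprocess
import time


def build_ffmpeg_command(device, width, height, fps):
    """Build an ffmpeg command that reads raw RGB frames from stdin
    and writes them to a V4L2 device (here: a v4l2loopback device)."""
    return [
        "ffmpeg", "-loglevel", "error",
        # Input: headerless RGB24 frames arriving on stdin.
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",
        # Output: the loopback device, in a pixel format most apps accept.
        "-f", "v4l2", "-pix_fmt", "yuv420p",
        device,
    ]


def solid_frame(width, height, rgb):
    """Return one RGB24 frame filled with a single colour.
    Placeholder for a real stylised frame from the generator."""
    return bytes(rgb) * (width * height)


def stream(device="/dev/video10", width=640, height=480, fps=30, num_frames=90):
    """Push num_frames placeholder frames to the (assumed) loopback device."""
    proc = subprocess.Popen(
        build_ffmpeg_command(device, width, height, fps),
        stdin=subprocess.PIPE,
    )
    try:
        for i in range(num_frames):
            # Fade from blue to red so the output is visibly changing.
            shade = (i * 255) // max(num_frames - 1, 1)
            proc.stdin.write(solid_frame(width, height, (shade, 0, 255 - shade)))
            # Pace writes roughly at the target frame rate.
            time.sleep(1.0 / fps)
    finally:
        proc.stdin.close()
        proc.wait()
```

Calling `stream()` while the streaming app has `/dev/video10` selected as its camera should show the fading colour. In a real setup the loop body would write each generated frame instead of `solid_frame(...)`, and the frame size and rate would have to match what the generator can sustain.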