End-to-end encryption has been named as a required feature for telehealth in Australia. Interest in telehealth has gone from zero to infinity over the past two weeks for obvious reasons. So I've been trying really hard to work out if Zoom is E2E encrypted, and reached the same conclusions as the article. First, it isn't, and second, Zoom are really going out of their way to obscure that fact.
It's great that The Intercept is taking a look at this, because it's absolutely beyond the capabilities of healthcare practitioners and the professional bodies to get to the bottom of. There's a ridiculous amount of confusion here, compounded by "you need to get the HIPAA version because HIPAA means privacy".
How do you E2E encrypt a video stream and still allow adaptive bit rates?
If the server can't read (decrypt) the video, it cannot re-encode the video at different bitrates for different clients.
Or the Zoom client has to encode multiple streams locally and upload them all... or it just downgrades to the bitrate of the slowest client...
You get shitty video and E2E encryption or good video and transport encryption.
For group calls, it depends on how it's implemented, but many group calls are implemented using what's called a Selective Forwarding Unit (SFU) and the sending clients send multiple resolutions (either independent, called "Simulcast" or dependent, called "SVC"). In that case, the adaptation is done by the server in selecting which resolution to forward at any given time. This is fairly common practice in the industry. For example: https://github.com/jitsi/jitsi-videobridge and https://tools.ietf.org/html/draft-aboba-avtcore-sfu-rtp-00 and https://www.w3.org/TR/webrtc-svc/.
For those types of group calls, the server only needs to know the sizes of the various streams and which packet is for what stream. It does not need to see the decrypted media, so one can implement e2e encryption for such types of group calls. This is less common in the industry, but is possible. For example: https://support.google.com/duo/answer/9280240?hl=en
(I used to work at Google on WebRTC, Duo, and Hangouts, but now work on video calling at Signal).
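To make the SFU point above concrete, here is a minimal Python sketch of selective forwarding over simulcast layers. It is only an illustration under stated assumptions, not how Zoom, Duo, Jitsi, or Signal actually implement it: the `Packet` fields, the layer names, and the `select_layer` / `forward` helpers are all hypothetical. The thing to notice is that the server only looks at the layer label and its advertised bitrate; the payload stays as opaque encrypted bytes and is forwarded untouched.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical packet layout, for illustration only.
    layer: str         # simulcast layer label, visible to the server
    bitrate_kbps: int  # advertised bitrate of that layer, visible to the server
    payload: bytes     # end-to-end encrypted media, opaque to the server


def select_layer(layers: dict[str, int], receiver_kbps: int) -> str:
    """Pick the highest-bitrate layer that fits the receiver's bandwidth.

    Falls back to the lowest-bitrate layer if none fits.
    """
    fitting = [name for name, kbps in layers.items() if kbps <= receiver_kbps]
    if fitting:
        return max(fitting, key=lambda name: layers[name])
    return min(layers, key=lambda name: layers[name])


def forward(packets: list[Packet], receiver_kbps: int) -> list[Packet]:
    """Return the packets the server would forward to one receiver.

    The server never reads or modifies `payload`; it only filters by layer.
    """
    if not packets:
        return []
    layers = {p.layer: p.bitrate_kbps for p in packets}
    chosen = select_layer(layers, receiver_kbps)
    return [p for p in packets if p.layer == chosen]


# One sender uploads three independently encoded (simulcast) layers.
packets = [
    Packet("low", 150, b"<ciphertext-low>"),
    Packet("mid", 500, b"<ciphertext-mid>"),
    Packet("high", 1500, b"<ciphertext-high>"),
]

for bandwidth in (100, 600, 5000):
    sent = forward(packets, bandwidth)
    print(bandwidth, [p.layer for p in sent])
```

The adaptation here is a filtering decision rather than a re-encode, which is why it can coexist with end-to-end encryption; the cost is that the sender has to encode and upload every layer.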