1. It is impossible to architect end-to-end encryption for a multi-participant conference built on a client-server architecture. End-to-end encryption is possible only between two endpoints, with no intervening server. For a conference on Zoom, Skype, Google Hangouts, Google Meet, Microsoft Teams, or whatever, each client can have an encrypted channel to the server, but the server must decrypt all of the incoming streams and then re-send them to each participant as a single re-encrypted stream. To have true E2E encryption, the server would have to tunnel every encrypted stream to every participant, and each client endpoint would have to spend resources decrypting inbound and encrypting outbound traffic in a separate crypto session for each participant. Thus, if I am in a meeting with 100 participants, my computer would have to run 99 crypto sessions at the same time instead of only one.
2. The article once more says Yuan claimed AES-256 encryption. A separate thread on Zoom here links to an article reporting that research on Zoom showed the AES-256 claim was not true, and that they were using AES-128 in a reduced-security mode.
3. Big deal if the hosting server for a group meeting is not in a Zoom data center. The enterprise host must still be running the proprietary, licensed Zoom host software. If that software has a monitor-and-stream feature tucked into it, then it could easily backchannel monitored streams and keys to Zoom data centers during "routine software patch and update" sessions. Yeah, not much for me to trust there.
@CraginS wrote: It is impossible to architect end-to-end encryption in a multi-participant, client-server architecture...
Impossible is a very strong word. All that is needed is for every participant to share the same symmetric "session" key.
@CraginS wrote: ...the server must decrypt all of them and then re-send the stream back out to each participant in a single re-encrypted stream.
Or, it could simply repeat the encrypted data and let the endpoints deal with it. No different than how Internet routers handle VPN traffic; they don't know the details of the payload; they just pass it on.
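To make the "just pass it on" idea concrete, here is a minimal Python sketch. Everything in it is my own illustration, not anything Zoom or any other vendor ships: `seal`, `open_sealed`, and `relay` are hypothetical names, and the cipher is a toy (SHA-256 keystream plus an HMAC tag) standing in for a real AEAD such as AES-GCM so the example runs on the standard library alone.

```python
import hashlib
import hmac
import secrets

# TOY cipher for illustration only (assumption): a SHA-256 counter keystream
# plus an HMAC-SHA256 tag. A real client would use a vetted AEAD like AES-GCM.

def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from the shared session key.
    return hashlib.sha256(label + key).digest()

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(session_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate; returns nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(session_key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(session_key, b"mac"), nonce + ciphertext,
                   hashlib.sha256).digest()
    return nonce + ciphertext + tag

def open_sealed(session_key: bytes, packet: bytes) -> bytes:
    """Verify the tag, then decrypt; raises ValueError on a wrong key or tampering."""
    nonce, ciphertext, tag = packet[:16], packet[16:-32], packet[-32:]
    expected = hmac.new(_subkey(session_key, b"mac"), nonce + ciphertext,
                        hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(_subkey(session_key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))

def relay(packet: bytes, recipients: list) -> dict:
    """Hypothetical server: holds no key and copies the same bytes to everyone."""
    return {name: packet for name in recipients}

# One sender encrypts once; the server fans the identical ciphertext out.
session_key = secrets.token_bytes(32)
packet = seal(session_key, b"audio frame 1")
delivered = relay(packet, ["bob", "carol"])
assert open_sealed(session_key, delivered["carol"]) == b"audio frame 1"
```

Note that `relay` never touches the key, and the sender performs one encryption per frame no matter how many people are in the meeting.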
@CraginS wrote: Thus, If I am in a meeting with 100 in the meeting, my computer would have to run 99 crypto stream sessions at the same time instead of only one.
Imagine a two-phase approach. In phase one, each computer negotiates a secure channel with the meeting initiator and downloads the meeting's session key. In phase two, all computers use the common session key for encryption and decryption. For enhanced security, the initiator could even send out a new session key over the phase-one channels every hour or so.
Although phase one is expensive from both computational and bandwidth standpoints, it would transfer only about 1 KB per participant over the course of a meeting. The computationally much cheaper phase two would carry the megabytes of conference payload (video, audio, etc.), and the same encrypted payload could be duplicated to all participants.
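Phase one could be sketched roughly like this in Python. It is a toy under loud assumptions, not a real protocol: `P` and `G` are illustrative Diffie-Hellman parameters rather than a vetted group, the XOR key wrap stands in for a proper key-wrapping scheme, and `dh_keypair`, `wrapping_key`, and `distribute_session_key` are names I made up. Both sides of each exchange run inside one function purely to keep the example short.

```python
import hashlib
import secrets

# TOY Diffie-Hellman parameters for illustration only (assumption).
P = 2**521 - 1   # a Mersenne prime, NOT a standardized DH group
G = 3

def dh_keypair():
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    return private, pow(G, private, P)

def wrapping_key(my_private: int, their_public: int) -> bytes:
    # Both sides compute the same shared secret, then hash it to 32 bytes.
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(66, "big")).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def distribute_session_key(participants):
    """Phase one: the initiator wraps one common session key per participant."""
    session_key = secrets.token_bytes(32)
    init_private, init_public = dh_keypair()
    received = {}
    for name in participants:
        part_private, part_public = dh_keypair()
        # Initiator side: wrap the session key under the pairwise secret.
        wrapped = xor(session_key, wrapping_key(init_private, part_public))
        # Participant side: unwrap with the same pairwise secret.
        received[name] = xor(wrapped, wrapping_key(part_private, init_public))
    return session_key, received

# Phase two would then use this one key for all media, whatever the head count.
key, keys_by_participant = distribute_session_key(["bob", "carol", "dave"])
assert all(k == key for k in keys_by_participant.values())
```

The expensive public-key work happens once per participant and moves only a few dozen bytes each. One caveat: if these exchanges pass through the conference server, they must be authenticated somehow (certificates, verified fingerprints, or similar), or the server could sit in the middle of every exchange and learn the session key.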
Full disclosure, I might have leaned upon some IETF prior art and perhaps a fairly famous use of broadcast encryption.