Here's how it could work for users with limited upload bandwidth ...
First, each participant generates its own public-private key pair and sends its public key to all the other participants. Next, each participant generates its own random symmetric encryption key (e.g. AES) and sends that key to all the other participants, encrypted separately with each recipient's public key. Each participant will thus receive the PK-encrypted encryption keys of all the other participants, which it can decrypt using its own private key and store internally. Note that all these communications can go via a public server, because the server cannot decrypt any key as it does not have any of the private keys (this assumes the server relays the public keys unmodified).
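To make the handshake concrete, here is a rough Python sketch of the message flow only. It contains no real cryptography: `KeyPair` and `Sealed` are hypothetical placeholders that just model "only the holder of the matching private key can open this", and `Participant`, `handshake` and `relay` are names made up for illustration. In a real client you would swap in an actual public-key scheme and AES.

```python
import secrets


class KeyPair:
    """Placeholder for a real public/private key pair. Holding this object
    stands in for holding the private key; only `public` is ever shared."""

    def __init__(self):
        self.public = "pub-" + secrets.token_hex(8)


class Sealed:
    """Placeholder for 'encrypted to a public key'. Models who may open the
    blob; it is NOT real encryption."""

    def __init__(self, recipient_public, payload):
        self.recipient_public = recipient_public
        self._payload = payload

    def open(self, keypair):
        if keypair.public != self.recipient_public:
            raise PermissionError("not the intended recipient")
        return self._payload


class Participant:
    def __init__(self, name):
        self.name = name
        self.keypair = KeyPair()
        # Stands in for this participant's random AES-256 stream key.
        self.stream_key = secrets.token_bytes(32)
        self.peer_keys = {}  # sender name -> that sender's stream key


def handshake(participants):
    # Step 1: everyone publishes a public key via the server.
    directory = {p.name: p.keypair.public for p in participants}

    # Step 2: every sender wraps its stream key once per peer.
    # `relay` is everything the server gets to see: opaque blobs.
    relay = []
    for sender in participants:
        for name, public in directory.items():
            if name != sender.name:
                relay.append((sender.name, name, Sealed(public, sender.stream_key)))

    # Step 3: each recipient opens the blobs addressed to it and stores the keys.
    by_name = {p.name: p for p in participants}
    for sender_name, recipient_name, blob in relay:
        recipient = by_name[recipient_name]
        recipient.peer_keys[sender_name] = blob.open(recipient.keypair)
    return relay
```

With N participants the server relays N public keys and N*(N-1) wrapped keys, and each participant ends up holding N-1 peer keys.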
After that initial handshake, the video streaming can take place, with each outgoing video stream encrypted with the AES key of the participant sending it. It can be securely sent & distributed via the public server because the server cannot decrypt any of the streams. Each incoming video stream (relayed/distributed via the server) can be decrypted by each participant using the stored key of the participant from which the stream originated. Thus Tx bandwidth is just a single video stream, which is then distributed by the server, but the Rx bandwidth is that of all the streams from the other participants.
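As a quick back-of-the-envelope check of the bandwidth claim, here is a small Python sketch. The function names and the 1.5 Mbps figure are just assumptions for illustration, and the full-mesh comparison (every participant uploading directly to every other) is my own baseline, not something the scheme itself requires.

```python
def relay_bandwidth(n_participants, stream_mbps):
    """Per-participant bandwidth when the server fans each stream out."""
    return {
        "tx": stream_mbps,                         # upload one stream to the server
        "rx": (n_participants - 1) * stream_mbps,  # download everyone else's stream
    }


def mesh_bandwidth(n_participants, stream_mbps):
    """Per-participant bandwidth if everyone sent directly to everyone (assumed baseline)."""
    peers = n_participants - 1
    return {"tx": peers * stream_mbps, "rx": peers * stream_mbps}


if __name__ == "__main__":
    # Hypothetical call: 5 participants, 1.5 Mbps per video stream.
    print("relay:", relay_bandwidth(5, 1.5))
    print("mesh: ", mesh_bandwidth(5, 1.5))
```

For a 5-person call, upload stays at one stream via the relay but would be four streams in a full mesh; download is four streams either way.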
To be more secure, AES keys can be changed at regular intervals and all participants updated with the new keys using PK encryption as before.
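One way the rotation could be sketched in Python is below. Several things here are my assumptions rather than part of the scheme as described: rotating after a fixed number of frames instead of on a timer, tagging each key with a `key_id` so receivers can tell which key a frame was encrypted with, and the `RotatingKey` class name itself. The random bytes stand in for a fresh AES-256 key.

```python
import secrets


class RotatingKey:
    """Sender-side stream key that is replaced every `frames_per_key` frames."""

    def __init__(self, frames_per_key):
        self.frames_per_key = frames_per_key
        self.key_id = 0
        self.key = secrets.token_bytes(32)  # stands in for an AES-256 key
        self._frames_used = 0

    def key_for_next_frame(self):
        """Return (key_id, key, rotated) for the next outgoing frame.

        When `rotated` is True, the caller should send the new key to every
        peer, encrypted with each peer's public key, as in the handshake.
        """
        rotated = False
        if self._frames_used >= self.frames_per_key:
            self.key_id += 1
            self.key = secrets.token_bytes(32)
            self._frames_used = 0
            rotated = True
        self._frames_used += 1
        return self.key_id, self.key, rotated
```

Receivers would keep the previous key around briefly so that frames already in flight when the rotation happens can still be decrypted.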
It's very easy to implement (easier, in fact, than SSL), and I don't understand why Zoom did not do this from the start.