Let me be honest: I really dislike marketing in general. A public researcher by training, I value truth, reproducible results, and claims backed by data and processes I can access to reproduce the results and conclusions myself. Marketing is very often the opposite of the search for truth: it aims at making the market buy your product(s) and service(s). Very often, the end justifies the means, and FUD, deception, and/or unsubstantiated self-serving claims become the norm.
Let’s be clear: if you don’t have a killer feature, you might as well pretend that either a/ you have it, b/ it is not something important, or c/ what you do have is so much better. Eventually, as all lawyers know, if you can’t argue the facts, go after the credibility of those bringing them to light. Microsoft made this approach famous, and it even has a name: Fear, Uncertainty and Doubt: FUD.
When I read some Wowza marketing pieces I cannot help noticing a strange correlation with those techniques, and it’s frankly upsetting. We just ran a “fact checking” session about WebRTC at Streaming Media West in New York. I think it’s time to do it again here, so at least people have access to enough verifiable information to make up their own minds.
This is not the first time that astonishing claims in a Wowza blog post drive me to write my own posts. For those interested, you can read my pieces about “Streaming protocols and ultra-low latency“ and “Using WebRTC as a replacement for RTMP.“
So let’s start and revisit some of the claims here:
“Low-latency CMAF is the new kid on the streaming block. Much like WebRTC, it aims to overcome a key stumbling block in the industry: reducing the delay between video capture and playback.”
CMAF is not a streaming protocol in itself; it is part of both the HLS and MPEG-DASH stacks. CMAF was not designed to reduce latency. It was mainly aimed at unifying the file format of the chunks used by file-based streaming protocols. Before CMAF, CDNs had to store / cache the same content at least twice to support both HLS and MPEG-DASH. With a common storage denominator, CDNs can now reduce their storage drastically, and possibly reduce their cost to serve a given content. That is recognised in the Wowza blog post, but much farther down, in the “What is CMAF” section.
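To make the deduplication concrete, here is a sketch (all file names are hypothetical) of how the same CMAF init segment and media chunks can be referenced from both an HLS playlist and a DASH manifest, so the CDN only has to cache each file once:

```
# HLS media playlist, pointing at CMAF (fMP4) segments
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:4
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.0,
chunk_1.m4s
#EXTINF:4.0,
chunk_2.m4s

<!-- DASH MPD excerpt, pointing at the SAME files on the CDN -->
<SegmentTemplate initialization="init.mp4"
                 media="chunk_$Number$.m4s"
                 duration="4" startNumber="1"/>
```

One set of segments on disk, two manifests: that is the storage (and cost) win CMAF was actually designed for.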
While they were at it, they took the opportunity to implement an optimisation that most people were implementing in a proprietary way anyway: the reduction of the chunk size, which translates into lower latency. This is only a by-product of the effort, and only an incremental improvement in a design that CANNOT go below one second of delay.
The inventor of MPEG-DASH and founder of Bitmovin was at least honest during his NAB talk: “CMAF is a great improvement for people already using HLS and MPEG-DASH, but if you’re looking for real-time, other protocols like WebRTC are better suited.” He is also an ex-academic with a PhD; go figure.
“CMAF is capable of sub-three-seconds latency and WebRTC is capable of sub-Second latency“
This is not a wrong statement, just a deceptive one. WebRTC is capable of sub-300ms latency under normal conditions, which indeed makes it sub-second, or even sub-minute, sub-hour, sub-month … latency. WebRTC latency is, by the way, referred to as “sub-500ms” later in the text of the same blog post. Go figure.
“CDN scales, at cost, with great quality, WebRTC does not scale, is costly and quality limited.“
Last month, we listed WebRTC broadcast server vendors for our Streaming Media West presentation:
Red5, Ant Media, Janus, Jitsi, Medooze, Intel’s Open WebRTC Toolkit, …
and the service providers:
MilliCast, LimeLight, PhenixRTS, Agora.io, …
All of them propose solutions that reach millions of viewers, with a cost of ownership close to CDN prices today.
Their only citation (setting aside self-citations) is a blog post by Tsahi, a very knowledgeable WebRTC consultant, but it dates from early 2018 and is completely obsolete today. Moreover, they ignored, or did not read all the way through to, the comments, including one from August 2018 which already pointed to a more accurate answer to the question, namely that WebRTC DOES scale, with scientific publications proving it provided therein.
“You need transcoding to scale, which adds latency“.
Server-side transcoding is a known limitation of WebRTC … pre-2017. All the services that adopted WebRTC before then, like PhenixRTS, Red5 / LimeLight, or Wowza itself, had to transcode server-side, since WebRTC did not yet support ABR (simulcast) or an SVC codec. That is no longer the case.
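For the record, a modern browser negotiates simulcast directly in the SDP offer (standardised as RFC 8853). A sketch of the relevant attribute lines looks like this (the `rid` names are arbitrary):

```
a=rid:hi send
a=rid:mid send
a=rid:lo send
a=simulcast:send hi;mid;lo
```

The sender pushes three encodings of the same track at different resolutions and bitrates, and the media server simply forwards whichever layer fits each viewer’s bandwidth: no server-side transcoding involved.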
Funny enough, even with this transcoding, which could as much as double the latency, one would go from 300ms to 600ms and still achieve sub-second streaming. Yes, it added latency, but it was still 10 times faster than what HLS and MPEG-DASH could do at the time. Go figure.
“it is browser only“,
Have you ever used the native apps of Facebook Messenger, Cisco Webex, or Microsoft Skype? Then you have used WebRTC. It is definitely not browser-only. Actually, if you’re using any app that includes real-time audio or video communication today, there is an 80+% chance it is using WebRTC under the hood. Before being bought by Google, the GIPS media engine was the best engine on the market. It is still the best (arguably after Dolby’s), and it is open source.
WebRTC is a technology that has two parts: JS APIs for browsers, and media, codec, and security protocols defined at the IETF for everything that connects to the internet. The latter part allows ANY software or hardware capable of connecting to the internet to be WebRTC-compliant.
There is an open-source native implementation in C++, another one in Python, and multiple SDKs out there, including implementations that run on IoT hardware. There are also multiple open-source solutions available, including signalling servers and media servers. The latest to be open-sourced was Intel’s complete WebRTC communication suite, the “Open WebRTC Toolkit”.
“it employees three HTML5 APIs built into Chrome, Firefox, and Safari to allowing direct browser-based communication“
The official “Use Cases” document (RFC 7478, published in 2015) lists many other use cases, including gaming, as well as non-browser-to-browser use cases.
In 2017, Jan-Ivar Bruaroey, a Mozilla employee and now an editor of the WebRTC specification at the W3C, published this article explaining the evolution of the WebRTC use cases and APIs since its inception in 2011.
While in 2011 the official WebRTC use case was the one Wowza describes in its May 2019 blog post, that has not really been the case since at least 2015.
Nowadays the use cases have been extended, with a dedicated specification for non-browser peers, and they take into account more than P2P. The original three HTML APIs have morphed into 224 pages of APIs today (HTML5 APIs here), and that’s without accounting for media capture (98 extra pages here), media recording, or the extension of the permission APIs and corresponding testing (chapter 10.5 here).
“The highest resolution when streaming with WebRTC is 720p“
This is wrong in many ways.
First, browser-based WebRTC has NO SIZE LIMIT when it comes to screen sharing. It will stream up to your screen’s, or dual screens’, resolution.
Finally, when using the native code, there is no limit to the size of the frames you can pass to the WebRTC streaming engine. You can take a look at our OBS-Studio fork with WebRTC support: we reuse the OBS capturers, of any size, and pass the corresponding frames to the webrtc.org stack.
“Here is the table of our recommendations“
This will act as a conclusion to this blog post. Everything Wowza describes as WebRTC in their blog post is not WebRTC as it is today; it is WebRTC as it was designed almost 10 years ago, and apparently as Wowza implements it in their product. It would be unreasonable to ask everyone to attend the standards committee meetings and be a world expert (there are really only a dozen of us around the table), and I understand simplification is necessary to bring a complicated concept to a larger audience. Still, repeating over and over the same 5-to-10-year-old cliches about a technology you compete with seems counterproductive at best.
Here is the table the way it should be:
Low-latency CMAF (HLS or MPEG-DASH) should be used for:
1. Existing use cases that are file-based and HTTP-based. You will likely enjoy lower prices, less storage, and lower latency. It’s a great incremental improvement, with a low cost to upgrade.
2. New use cases that can accommodate multi-second delays without loss of value. The current tech stack is arguably more mature than most WebRTC offers today, and some business cases (ad-based, …) are not yet supported by any WebRTC vendor.
Wowza’s WebRTC Implementation should be used for:
1. One-to-few interactive streaming
2. Video conferencing for small groups
3. Audio/video calling
4. Small-scale product demos
5. Cases where you need neither TURN nor WebRTC egress.
6. Cases where you have a huge number of existing Wowza servers with a lot of ad-hoc logic and additional extensions, making transitioning away a huge pain.
WebRTC 2019 (a.k.a. WebRTC 1.0) should be used for:
1. One-to-many interactive streaming
2. Live sports and e-gaming
3. Online gambling/auctions
4. Large-scale product demos
5. In general, any service or product that cannot compromise on latency or interaction, at scale, with quality.
As mentioned in the conclusion of our previous blog post, big players like LimeLight or Verizon Digital Media Services are not trying to differentiate on latency, scale, or quality against WebRTC; they have understood it’s already a lost war. Why fight the obvious, at the risk of appearing incompetent or losing hard-won credibility? They are already trying to differentiate on advanced features like ad insertion, watermarking, and recording.