It’s been clear for some time now that SVC codecs are the future. Most of the biggest players (IBM Sametime, Cisco, MS Skype, …) already use an SVC codec in their products, but what about webRTC? Well, things are coming together to support SVC in webRTC, and while everybody was waiting for VP9, it looks like VP8 will actually be one of the first SVC codecs available!
The first quarter of 2016 has seen a lot of improvements in the video codec part of webRTC:
- Capacity to support more than one video codec at a time,
- Capacity to support external codecs,
- Capacity to support FEC/RED/RTX for several codecs at a time (Cr50),
- Simulcast signaling support,
- Better (faster) ramping of bandwidth usage.
All of those are under-the-hood modifications that do not impact the JS application per se, as they do not modify the JS API (except for the simulcast signaling), but they greatly improve the user experience: faster call setup, fewer and shorter video freezes, more interoperability, and so on.
However, as far as video is concerned, everybody was waiting for three things: H.264 for interoperability with legacy VoIP, vanilla VP9 for bandwidth savings, and SVC for, well, many reasons I already laid out in previous talks (e.g. here).
The specs and the implementation for VP9 SVC are not finished yet, and while Google and Vidyo are working hard on this, it looks like it’s gonna take some time. No problem: VP8 actually includes a limited SVC capability called temporal scalability.
SVC can theoretically handle different kinds of scalability, the most usual being spatial (frame resolution), temporal (frame rate / frame drops), and quality (quantization). Quality manipulation is what non-SVC codecs usually rely on for bandwidth adaptation.
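To make the temporal case a bit more concrete, here is a minimal sketch of how frames can be assigned to temporal layers. It assumes a dyadic pattern (each extra layer doubles the frame rate), which is one common arrangement and not necessarily what any given encoder does. The function names are made up for illustration.

```python
def dyadic_temporal_layer(frame_index: int, num_layers: int) -> int:
    """Temporal layer of a frame in an assumed dyadic hierarchy.

    Layer 0 is the base layer; higher layers only add frame rate.
    """
    period = 1 << (num_layers - 1)      # pattern repeats every 2^(N-1) frames
    phase = frame_index % period
    if phase == 0:
        return 0                        # base-layer frame
    # Count trailing zero bits of the phase: more zeros means a lower layer.
    trailing_zeros = (phase & -phase).bit_length() - 1
    return num_layers - 1 - trailing_zeros


def frame_rate_at_layer(full_fps: float, num_layers: int, max_layer: int) -> float:
    """Frame rate left when only layers 0..max_layer are kept."""
    return full_fps / (1 << (num_layers - 1 - max_layer))


if __name__ == "__main__":
    print([dyadic_temporal_layer(i, 3) for i in range(8)])
    for layer in range(3):
        print(layer, frame_rate_at_layer(30.0, 3, layer))
```

With three layers, keeping only layer 0 gives a quarter of the frame rate, layers 0 and 1 give half, and all three give the full rate. Frames in a given layer only need frames from the same or lower layers to be decoded, which is what makes the higher layers safe to drop.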
The advantage of doing temporal resolution manipulation through SVC, instead of just dropping frames in a usual MCU for example, is that it does not require decoding the media stream to know which frames to drop. The information is readily available in the packet headers, so dropping frames becomes as easy as dropping packets, whereas decoding the frames (and re-encoding the resulting stream) can represent as much as 80% of the cost of media manipulation in an MCU.
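As a rough illustration of "dropping frames is just dropping packets", the sketch below reads the temporal layer ID (TID) straight from the VP8 RTP payload descriptor, assuming the layout defined in RFC 7741, without touching the encoded video. The function names and the forwarding policy are my own assumptions for the example, not code from any actual media server.

```python
from typing import Optional


def vp8_temporal_id(payload: bytes) -> Optional[int]:
    """Return the TID from a VP8 RTP payload descriptor, or None if absent.

    `payload` is the RTP payload (RTP header already stripped).
    Layout assumed from RFC 7741.
    """
    if len(payload) < 2 or not payload[0] & 0x80:   # X bit clear: no extension
        return None
    ext = payload[1]
    pos = 2
    if ext & 0x80:                                  # I: PictureID present
        if pos >= len(payload):
            return None
        pos += 2 if payload[pos] & 0x80 else 1      # M bit: 15-bit PictureID
    if ext & 0x40:                                  # L: TL0PICIDX present
        pos += 1
    if not ext & 0x20:                              # T: TID not signalled
        return None
    if pos >= len(payload):
        return None
    return payload[pos] >> 6                        # top two bits are the TID


def should_forward(payload: bytes, max_tid: int) -> bool:
    """Hypothetical policy: forward packets at or below the target layer.

    Packets without a TID are forwarded, since they cannot be classified.
    """
    tid = vp8_temporal_id(payload)
    return tid is None or tid <= max_tid


if __name__ == "__main__":
    # X=1; extension byte with I, L, T set; 15-bit PictureID; TL0PICIDX; TID=2
    packet = bytes([0x80, 0xE0, 0x81, 0x23, 0x05, 0x80])
    print(vp8_temporal_id(packet), should_forward(packet, 1))
```

A forwarding server would call something like `should_forward` per packet and per receiver, lowering `max_tid` when that receiver's bandwidth drops. No decoder is involved at any point.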
So VP8 can handle temporal scalability, but it was not set up that way in Chrome. Some smart vendors (*cough* TokBox *cough*) were already taking advantage of it. One can add temporal scalability support to all native code (in this specific case, the mobile SDKs) to get an extra edge in the mobile-to-mobile use case, while still using normal VP8 when a browser is involved. Smart.
Well, it seems that Google is now adding this feature to the webrtc engine and to Chrome. The ETA is unknown at this point, but interested parties can follow these code review entries (header modifications here, and SDP options here) to know more.
It makes a lot of sense to start experimenting with SVC with only one scalability aspect, and on a codec that has been supported in webrtc for a very long time now, before jumping on VP9, which is less mature.
Happy hacking.
This work by Dr. Alexandre Gouaillard is licensed under a Creative Commons Attribution 4.0 International License.
This blog is not about any commercial product or company, even if some might be mentioned or be the object of a post in the context of their usage of the technology. Most of the opinions expressed here are those of the author, and not of any corporate or organizational affiliation.
Thanks for the helpful update, Alex.
Apparently Open H.264 also supports temporal scalability:
> Temporal scalability up to 4 layers in a dyadic hierarchy
Do you have any insights into whether it may be leveraged in Chrome and/or FF now or in the future?
Chrome is using a mix of OpenH264 and ffmpeg for H.264, which makes it more complicated. In any case, it looks like Chrome, Firefox and Microsoft are all pushing for VP9 for full SVC support, from what I can see and hear from them. H.264 is being relegated to interoperability and mobile HW-accelerated use cases (the latter being another case which, AFAIK, does not support temporal scalability).