Native libwebrtc for Windows: WinRTC

I’m a little bit late on this one, but in May 2019 Microsoft held its “Build” event and unveiled the new iteration of its Windows native WebRTC library. This is the third iteration, and it’s remarkable in many ways, especially for gaming and hardware acceleration, so let’s dig into the history and current support of libwebrtc on Windows, backed by Microsoft!

I. webrtc UWP

It was a heavily modified fork of libwebrtc, synchronised up to m71. Eventually, maintaining such a fork became a daunting task, even for Microsoft, with the resources of Unity, Mixer, HoloLens, and all the other groups depending on it. They have since moved the base layer to the new WinRTC project. You can read more about webrtc UWP here:

https://webrtc-uwp.github.io

II. “3D Streaming Toolkit” and “Mixed Reality webrtc”

Both are additional layers on top of webrtc-UWP that added functionality closer to gaming apps, including support for more formats, immersive technologies (AR/VR), and partial hardware acceleration.

Both depend on webrtc-UWP and are effectively deprecated now. WebRTC has changed in several fundamental ways since Microsoft started 3DST, MR-WebRTC and webrtc-UWP.

When they started, there wasn’t really any supported, extensible model for adding new encoder/decoder paths, nor for adding new or different video formats.

So Microsoft elected instead to overwrite the default H.264 pipeline with hooks into NvPipe (originally into NVENC directly) and then into NVENC. Those interested in how to support NVIDIA hardware acceleration in WebRTC can take a look at the next section.

The main reason for going this route was that it allowed Microsoft to support WebRTC cross-platform, e.g. running on both Windows and Linux, a requirement that came from the customers they were working with.

NVIDIA was the sole hardware encoder of interest, as the main purpose of the project was cloud rendering, and only NVIDIA GPUs are ubiquitous in public cloud infrastructure.

Today, webrtc.org has a much better factored structure for hardware-accelerated implementations, which allows for injection, and WinRTC leverages that.
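For reference, this is roughly what that injection model looks like in the webrtc.org API around m84: you hand your own VideoEncoderFactory to CreatePeerConnectionFactory(). A minimal sketch, where the my_mf_encoder_factory parameter stands in for whatever hardware-backed webrtc::VideoEncoderFactory you implement yourself; exact signatures vary slightly between milestones.

#include <memory>
#include <utility>

#include "api/audio_codecs/builtin_audio_decoder_factory.h"
#include "api/audio_codecs/builtin_audio_encoder_factory.h"
#include "api/create_peerconnection_factory.h"
#include "api/video_codecs/builtin_video_decoder_factory.h"
#include "api/video_codecs/video_encoder_factory.h"

// Build a PeerConnectionFactory that uses your own (e.g. Media Foundation
// or NVENC backed) video encoder factory instead of the built-in one.
rtc::scoped_refptr<webrtc::PeerConnectionFactoryInterface>
CreateFactoryWithHardwareEncoder(
    std::unique_ptr<webrtc::VideoEncoderFactory> my_mf_encoder_factory) {
  return webrtc::CreatePeerConnectionFactory(
      /*network_thread=*/nullptr, /*worker_thread=*/nullptr,
      /*signaling_thread=*/nullptr, /*default_adm=*/nullptr,
      webrtc::CreateBuiltinAudioEncoderFactory(),
      webrtc::CreateBuiltinAudioDecoderFactory(),
      std::move(my_mf_encoder_factory),  // <- the injection point
      webrtc::CreateBuiltinVideoDecoderFactory(),
      /*audio_mixer=*/nullptr, /*audio_processing=*/nullptr);
}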

III. Notes on Hardware Acceleration

Synopsis of the libraries available for HW acceleration per platform/OS.
Courtesy of Tyler Gibson, author of the 3D Streaming Toolkit, Microsoft.

1. Conceptual level

While not there yet, the webrtc-UWP project had the base capability to enable universal hardware acceleration for any video codec on any supported hardware for Windows clients.

There’s a set of complications around NVENC/AMF/Quick Sync, namely that the only way to support them on UWP (and thus HoloLens) is through Media Foundation.

Unfortunately, Media Foundation isn’t supported on any other platform.

2. Implementation in webrtc-uwp

webrtc-UWP uses this encoder for H.264: https://github.com/webrtc-uwp/webrtc-windows/tree/releases/m75/third_party/winuwp_h264/H264Encoder

Because of https://github.com/webrtc-uwp/webrtc-windows/blob/releases/m75/third_party/winuwp_h264/H264Encoder/H264Encoder.cc#L133, it enables the available hardware transforms.
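The mechanism behind that line is the standard Media Foundation one: the encoder’s sink writer is created from an attribute store with MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS set, so Media Foundation is allowed to pick a hardware MFT (NVENC, AMF, Quick Sync) when one is present. A minimal sketch of the pattern, an illustration rather than a copy of the webrtc-uwp code, with error handling mostly elided:

#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Create a sink writer that is allowed to use hardware encoder MFTs.
HRESULT CreateHardwareFriendlySinkWriter(IMFMediaSink* media_sink,
                                         IMFSinkWriter** sink_writer) {
  ComPtr<IMFAttributes> attributes;
  HRESULT hr = MFCreateAttributes(&attributes, 2);
  if (FAILED(hr)) return hr;

  // Allow Media Foundation to pick a hardware transform when available,
  // instead of always falling back to the software encoder.
  attributes->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
  // Typical for real-time scenarios such as WebRTC.
  attributes->SetUINT32(MF_LOW_LATENCY, TRUE);

  return MFCreateSinkWriterFromMediaSink(media_sink, attributes.Get(),
                                         sink_writer);
}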

3. Choosing the encoder/decoder hardware

Media Foundation is tricky in that it explicitly doesn’t expose the underlying hardware. It chooses the “best available” transform for the given conditions, to assure compatibility over control.

If you have an NVIDIA card, it will be used for both encode and decode when possible.

It *is* possible to force priority or to explicitly choose an encoder/decoder.

Forcing priority is done through codec merit: https://docs.microsoft.com/en-us/windows/win32/medfound/codec-merit

Changing this priority requires creating a custom MFT wrapping a system MFT and then assigning it a higher merit value.

See https://github.com/zzhpublic/obs-studio/tree/master/plugins/win-mf for explicit selection from enumeration.
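For illustration, explicit selection boils down to enumerating the hardware MFTs yourself with MFTEnumEx and activating the one you want, rather than letting the merit ranking decide. A minimal sketch using the standard Media Foundation API (the ListHardwareH264Encoders helper is mine, not webrtc-uwp or OBS code; COM and MFStartup initialization are assumed):

#include <windows.h>
#include <mfapi.h>
#include <mftransform.h>
#include <wrl/client.h>
#include <vector>

using Microsoft::WRL::ComPtr;

// List the hardware H.264 encoder MFTs registered on this machine.
std::vector<ComPtr<IMFActivate>> ListHardwareH264Encoders() {
  std::vector<ComPtr<IMFActivate>> encoders;
  MFT_REGISTER_TYPE_INFO output_type = {MFMediaType_Video, MFVideoFormat_H264};
  IMFActivate** activates = nullptr;
  UINT32 count = 0;

  if (SUCCEEDED(MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER,
                          MFT_ENUM_FLAG_HARDWARE | MFT_ENUM_FLAG_SORTANDFILTER,
                          /*pInputType=*/nullptr, &output_type, &activates,
                          &count))) {
    for (UINT32 i = 0; i < count; ++i) {
      WCHAR name[256] = {};
      UINT32 length = 0;
      // e.g. L"NVIDIA H.264 Encoder MFT", or an AMD / Intel Quick Sync MFT.
      activates[i]->GetString(MFT_FRIENDLY_NAME_Attribute, name,
                              ARRAYSIZE(name), &length);
      encoders.emplace_back(activates[i]);
      activates[i]->Release();
    }
    CoTaskMemFree(activates);
  }
  return encoders;
}

// Picking one explicitly, instead of letting merit decide:
//   ComPtr<IMFTransform> encoder;
//   encoders[chosen_index]->ActivateObject(IID_PPV_ARGS(&encoder));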

The webrtc-uwp code could be extended to enable this forced codec selection in a straightforward manner.

4. If I had to do it again …

The alternative, then, is FFmpeg (libavcodec), which does have great cross-platform support and basically bundles all the individual vendor SDKs behind a convenient wrapper.

It uses D3D11 and D3D9 (DXVA2) for decoding on Windows, and AMF / NVENC / libmfx (Intel Quick Sync) / libavcodec for encoding (the latter being a full software implementation).
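As a rough illustration of what that wrapper buys you (a sketch, not 3DST or WinRTC code): with libavcodec an application can simply ask for the vendor-specific H.264 encoders by name and fall back to software, letting FFmpeg deal with each vendor SDK underneath. Which of these are actually present depends on how FFmpeg was built and on the GPU/driver installed.

extern "C" {
#include <libavcodec/avcodec.h>
}

// Prefer a hardware H.264 encoder, fall back to software otherwise.
const AVCodec* FindH264Encoder() {
  // Standard FFmpeg encoder names: NVIDIA NVENC, AMD AMF, Intel Quick Sync.
  const char* hardware_candidates[] = {"h264_nvenc", "h264_amf", "h264_qsv"};
  for (const char* name : hardware_candidates) {
    if (const AVCodec* codec = avcodec_find_encoder_by_name(name)) {
      return codec;
    }
  }
  // Software fallback (libx264 or libopenh264, whichever is compiled in).
  return avcodec_find_encoder(AV_CODEC_ID_H264);
}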

Microsoft also made some enhanced versions for all Microsoft platforms: https://github.com/microsoft/FFmpegInterop and https://github.com/ffmpeginteropx/FFmpegInteropX, plus https://github.com/M2Team/FFmpegUniversal for building it.

Since FFmpeg is already bundled with the WebRTC build process, this should be feasible to extend for UWP. However, in any of these cases, there is work to do in the WebRTC core to override and/or provide alternative flags to OpenH264 for encoding on Windows (Android/iOS hardware acceleration already exists).

Swapping out OpenH264 for FFmpeg (for encode) would be the ideal long-term path forward, but there are very real potential licensing issues there that would need to be carefully examined.

5. Notes on Intel Quick Sync / libmfx

Intel hardware does support these codecs through Quick Sync on recent generations: https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video

On Windows, you also need driver-level support for the feature to be available to applications.

Here is the current situation on Windows:

  • VP9 HW decoding => has been there for quite a long time (through D3D11VA).
  • VP9 HW encoding => the Intel graphics driver includes a VP9 encoder MFT, but it is not enabled in Chrome.
  • VP8 HW decoding => enabled behind a flag.
  • VP8 HW encoding => the encoder MFT was removed from the Intel graphics driver some time ago. It thus cannot be enabled in Chrome, or in any other native app for that matter.

OWT (Open WebRTC Toolkit) is a different story: it uses the Intel Media SDK for hardware acceleration, not DirectX or Media Foundation.

IV. WinRTC

So WinRTC is the new kid on the block.

It is based on m84, which is not only more recent but also does not contain the recently disclosed WebRTC security bug. Millicast.com SDKs and native apps like OBS-Studio-WebRTC have been updated already, but if you’re using another provider for your WebRTC solution, you might want to check which version of WebRTC you are being provided.

One of the most visible changes is that instead of making a full, heavily modified fork with its own build system and everything, like they did previously, it is now a shallow fork. They are staying very close to the original webrtc.org code, with a patch system on top. Several times in the corresponding presentation they mention the maintenance burden, so we can safely imagine it was the main reason behind this move.

Mixed Reality WebRTC (UWP only) is still alive and kicking on top of the webrtc.org code.

WinRTC is a Windows component, a wrapper on top of webrtc.org. In turn, it is used as a base for .NET bindings, but also, very interestingly, React Native bindings!

[Figure: The architecture of WinRTC]

For those interested, you can see a clear presentation with a demo here: https://www.youtube.com/watch?v=GKrTmgZT-EA and read more here: https://github.com/microsoft/winrtc/blob/documentation-edits/docs/FAQ.md

It seems like Microsoft is really serious about supporting and leading this, while support for MSVC (non-clang) compilation of webrtc.org is difficult at best. I have no experience interacting with the Microsoft repository, but I think it is worth a try for all those who want to use libwebrtc from MSVC.

Interestingly enough, a new GN flag has been added to trigger the use of hardware-accelerated H.264 encoders and decoders through Media Foundation. It is more advanced than the original implementation, which was hardcoding NVENC, at the cost of not being able to choose which encoder/decoder will be used when several are present, as explained earlier in this blog post.

# When set to true, an H264 encoder using Windows Media Foundation
# will be included in libwebrtc.lib
rtc_win_use_mf_h264 = false

It will be interesting to see whether support for other codecs like VP9, H.265, … will be added through MFT as well. We know that the Intel drivers support some of those; for example, Apple is leveraging them for H.265, and Intel mentioned VP9 is doable. Otherwise, it sounds like a cool project for a hackathon. I really hope that IETF 109 in November will happen in real life in Bangkok, but given the recent announcement by the Thai government to keep the borders closed until September, it is now more a dream than a hope 🙁

Credits

Thanks to the webrtc teams at Microsoft and at Intel for kindly providing information, and of course for making the code available in the first place.

Thanks to Voluntas, leader of the SORA SFU project in Japan, for pointing to some interesting sources and material.

Happy hacking y’all.
