This is a translated, adapted version of an original post by NTT’s Iwase Yoshimasa available here, with agreement from the author. As the ecosystem moves quickly, some updates were added in blue and in italic.
This post describes the current state (as of September 2016) of MCU and SFU media servers used in WebRTC solutions. I hope it will serve as a quick reference for those wanting to know more about the concepts and the available projects. The details of each product introduced here are not provided, but a link to each product is, so you can read further if you want. Moreover, we mention almost exclusively standalone media servers and do not touch on WebRTC CPaaS or PaaS offerings.
One can divide WebRTC system architectures roughly into two types:
- Those that do not terminate the encryption or access the media
  - P2P architecture
  - using a TURN server
  - [Alex Note] those supporting PERC in the future
- Those that do
  - (VoIP-WebRTC interoperability servers also belong here, but are not covered)
As usual, choosing the best solution depends heavily on the use case, and no architecture is considered best overall.
Main media servers (in alphabetical order)
TL;DR: this is for you. Most of the content to follow is summarised in this table. Make sure to read the comments on the right side.
Intel Collaboration Suite for WebRTC
It includes client SDKs (JS / Android / iOS / Windows) and server SDKs (SFU / MCU / SIP gateway), so almost everything you could want comes already packaged. Rather than being developed from scratch, the MCU / SFU is based on Licode.
By the way, it was recently mentioned as a core technology in the partnership between Intel and South Korean telecommunications giant SK Telecom.
The Janus core is a WebRTC “gateway” developed on top of libsrtp and libnice (implementations of the SRTP and ICE protocols, also used by Google and Mozilla). By adding a variety of plug-ins, you can achieve different functions or use cases, for example an SFU. As I wrote in a previous article, it has been used in Slack. It is implemented in C. The license was originally AGPL, but was changed to GPLv3 after a discussion with Dr Alex and Oleg (creator of CoTurn) on the Discuss-Webrtc mailing list.
Jitsi Videobridge, acquired by Atlassian, is an SFU written in Java. Atlassian allowed the project to remain open source (even though they changed the license), and development is continuing. From the start Jitsi has used XMPP for signalling (with clients), and its own XMPP extension, Colibri, to communicate with the signaling server. It also provides a REST API.
The entire team is very close to the IETF standards and to the browsers. It was the quickest to implement simulcast (and may be the only one so far). Emil, the ex-CEO and founder of Jitsi and tech lead for the Videobridge, is also the author of Trickle ICE.
While Atlassian is obviously using it in their products, others like http://talky.io/ have used it for different use cases.
Originally designed as an MCU, Kurento itself is implemented in C++. There are SDKs for JS, Node, and Java, which developers use to control Kurento. How Kurento Media Server can be managed with Node.js is covered in great detail in “Kurento + WebRTC + Node.js”. At present, it can also behave as an SFU.
[Alex Note]: bought by Twilio on September 20th.
There is also a managed service based on Kurento called NUBOMEDIA. People who do not want to operate their own servers can use either that or elasticRTC.
Although it was originally only an MCU, Licode can now also behave as an SFU. Licode itself is implemented in C++. As described above, it has been used by Intel as a base for their media server logic.
The project being a little bit older than the others, there is no GitHub repository, only a SourceForge repository.
PowerMedia XMS, developed by Dialogic, is a commercial media server. The fixed layout is painful, but it is a historical problem inherited from RFC 5707. If you have implemented support for RFC 5707, you can probably control it. It should be noted that, in Japan, Softbank Corporation is using it for a very large install base. (Reference)
Sora is the only SFU “made in Japan” (by Shiguredo) among those mentioned so far. Unlike other SFUs, in addition to functioning as a video router, it implements a resource optimisation feature (by means of snapshots) and many other unique features. Refer to the Shiguredo WebRTC SFU Sora development logs for other advanced features.
While this post is about media servers, I think it’s good to remind the audience that WebRTC communication does not have to go through a media server: there is of course also communication that bypasses media servers entirely (P2P / TURN).
The problem with P2P (a.k.a. full mesh) is that it does not scale very well on the client side, i.e. the number of people in a given conversation is limited. If you implement things smartly, according to Mr. Philipp Hancke, you should be able to handle about 8 audio+video streams on a normal PC. In addition (this is my own opinion), if the communication is not two-way, or if you do not render all the video streams, you can handle more than that.
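To make the client-side scaling argument concrete, here is a rough Python sketch that counts the media streams a single client has to send and receive in each topology. It is only an illustration under simplifying assumptions that are not in the original post: every participant both sends and receives, each participant produces exactly one audio+video stream, and there is no simulcast. The function name `streams_per_client` is made up for this example.

```python
from typing import Tuple


def streams_per_client(participants: int, topology: str) -> Tuple[int, int]:
    """Return (outgoing, incoming) stream counts for one client.

    Assumes one audio+video stream per participant, everyone sending
    and receiving, and no simulcast (illustrative assumptions only).
    """
    if participants < 2:
        raise ValueError("a conversation needs at least two participants")
    others = participants - 1
    if topology == "mesh":
        # Full mesh: one encoded copy is sent to every other participant.
        return others, others
    if topology == "sfu":
        # SFU: one stream goes up, the server forwards everyone else's.
        return 1, others
    if topology == "mcu":
        # MCU: one stream goes up, one mixed stream comes back.
        return 1, 1
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for n in range(2, 7):
        row = {t: streams_per_client(n, t) for t in ("mesh", "sfu", "mcu")}
        print(n, row)
```

With five participants, for instance, a full mesh client sends 4 streams and receives 4, while the same client behind an SFU sends 1 and receives 4. The upload side is what grows fastest in a mesh, since every outgoing copy is encoded and sent separately.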
15 thoughts on “Overview of WebRTC Media Servers”
What about Spreed?
Please ask the original author by following the link in the introduction of the post.
Hi Alex – just wanted to provide a few more updates on the Dialogic PowerMedia XMS as it has had some updates since this post (Disclaimer: I work for Dialogic):
- PowerMedia XMS now supports both MCU and SFU so developers can choose which they want based on their use case
- It supports other control APIs beyond RFC 5707 (MSML), including RESTful, JSR-309, VXML and NetAnn interfaces
- Support for VP9 codec (including transcoding VP9 to any codec)
- Support for AWS (prebuilt AMIs provided in regions)
I don’t think the original author was trying to be exhaustive there. In any case, thanks for the information. Given the overwhelming reaction to this post, we will likely make an update sometime in Q1, and we will add the info you provided.
A small correction by Mozilla’s Nils Ohlmeier (@nilsohlmeier):
“@agouaillard small correction for your #webrtc media server article: Mozilla uses nICEr https://github.com/resiprocate/nICEr … not libnice like the others”
Robert Poschenrieder @robposch:
@agouaillard ever heard of OpenScape MediaServer? We use it to power all things WebRTC in @CircuitHQ.
We at Infrared5 / Red5 Pro, just released our WebRTC supporting server offering as well. https://blog.red5pro.com/red5pro-release-2-webrtc/
WebRTC to WebRTC, WebRTC to HLS, WebRTC to Flash, Flash to WebRTC, WebRTC to Mobile RTSP, Mobile RTSP to WebRTC.
Do any of the media servers for WebRTC have adaptive bitrate video capability? Or is there any method of implementing adaptive streaming in WebRTC in general?
My region has low bandwidth which also fluctuates. Would really appreciate an answer!
The codecs used by WebRTC are almost all bandwidth adaptive themselves, so nothing specific needs to be done on the server side.
What does OSS mean?
Open Source Software.
Thanks for this article. It would be nice to get a similar review of WebRTC / SIP gateways.
I have been thinking about having a separate dedicated list for the Flash world (Wowza, ….) and the SIP world (Asterisk, FreeSWITCH, Kamailio, …) but never found the time to go through all of them. Another problem is finding the right tooling to benchmark and compare them. With the emergence of KITE, we hope to get closer to a tool that would give us an unbiased comparison. Maybe early 2018.