WebSockets and MSE: Does Wowza’s Solution Provide Real-Time Latency?

Live video streaming is a complicated process involving an array of codecs and protocols. The team at Red5 Pro has spent the past 14 years analyzing the best approach to real-time live streaming. As such, reducing latency and increasing scalability have guided the decisions around which protocols were implemented in the Red5 Pro Platform.

One such decision was not to use WebSockets and Media Source Extensions (MSE). As part of our ongoing series of technical articles, this post outlines how WebSockets and MSE work together, and why the combination is ultimately not fast enough to provide real-time latency.

First, let’s break down the two main elements of this solution – MSE and WebSockets:


What is MSE?

Media Source Extensions is a W3C browser API specification that allows JavaScript to feed data streams to the media codecs in web browsers that support the HTML5 audio and video elements. Instead of a single src URL pointing at a media file, MSE makes the element's source a MediaSource object, an information container holding one or more SourceBuffer objects that represent the chunks of media making up the stream. By directly manipulating the source of the HTML5 video, a player gains fine-grained control over how much content is fetched and how often.
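As a rough sketch of what that API usage looks like in the browser (the MIME/codec string and the function names here are illustrative assumptions, not any particular player's code):

```javascript
// Minimal MSE wiring sketch: attach a MediaSource to an HTML5 <video>
// element and append fragmented-MP4 chunks as they arrive.
// SEGMENT_MIME is an assumed codec string; it depends on the actual stream.
const SEGMENT_MIME = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';

function attachMediaSource(videoElement) {
  const mediaSource = new MediaSource();
  // The element's src points at the MediaSource, not at a file URL.
  videoElement.src = URL.createObjectURL(mediaSource);

  return new Promise((resolve) => {
    mediaSource.addEventListener('sourceopen', () => {
      const sourceBuffer = mediaSource.addSourceBuffer(SEGMENT_MIME);
      resolve({ mediaSource, sourceBuffer });
    });
  });
}

// Append one media chunk (an ArrayBuffer) when the buffer is idle.
function appendChunk(sourceBuffer, chunk) {
  if (!sourceBuffer.updating) {
    sourceBuffer.appendBuffer(chunk);
  }
}
```

The key point is that the video element no longer plays a file: JavaScript decides when each chunk of media is handed to the decoder.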

Even though this article discusses WebSocket implementations, players for HTTP-based protocols rely on MSE as well. HLS fragments are passed to MSE and displayed by the player. The player uses the DASH or HLS manifest to obtain the necessary media information, and the HTML5 video element decodes the content directly from the MSE buffers. In fact, the underlying technology behind MSE has long been used internally by browsers; only recently was the API exposed to JavaScript developers.


What is a WebSocket?

WebSocket is a computer communications protocol that uses a single TCP connection to provide full-duplex communication. The WebSocket protocol was standardized by the IETF as RFC 6455, and the WebSocket API was standardized by the W3C.


How do WebSockets and MSE Work Together?

Although WebSocket is different from HTTP, they still work together. WebSocket works over HTTP ports 80 and 443 and supports HTTP intermediaries and proxies. To achieve this compatibility, the WebSocket handshake uses the HTTP Upgrade header to change from the HTTP protocol to the WebSocket protocol.

WebSocket creates a connection over TCP port 80 (or 443 with TLS encryption) to enable messages to be passed back and forth between a web browser (or other client application) and a web server. By providing a standardized way to push content to the client without the client first requesting it, a bidirectional data flow is established that can deliver audio/video data and support interactivity.

Once the path for sending data has been established through a WebSocket, MSE is then used to display the media itself.


Who Is Using this Solution and (More Importantly) What Is the Latency?

This creative use of JavaScript was pioneered by NanoCosmos and is used in Wowza’s proprietary WOWZ technology. In an attempt to reduce latency, the packets, which represent segments of audio and video, are stored client-side in a 250ms buffer before being delivered to the MSE API for playback.

Despite the short segment size, browser implementations of the MSE API are slow to process the video, resulting in around three seconds of latency: nowhere near good enough for real-time. True interactivity cannot happen when it takes three seconds for the subscriber to view the video.


Real-Time Live Streaming with WebRTC

As we’ve mentioned, you can only have real-time streaming if the latency is under 500 milliseconds. The fast pace of information, spurred forward by the ever-increasing adoption of mobile devices, means every second (even partial seconds) counts. Broadcasting live events, social media chatting, drone surveillance, and live auctions, among many other use cases, all require real-time latency.

That’s why Red5 Pro integrated with WebRTC. Currently, our sub-500-millisecond latency is the only way to achieve true real-time live streaming. Importantly, Red5 Pro maintains that performance even when scaling to millions of broadcasters and subscribers.

For a more in-depth view of how Red5 Pro works along with all the live streaming protocols in general (WebRTC, HLS, CMAF, and more), take a look at our whitepaper.

Looking for something else? Send an email to info@red5.net or schedule a call.