Delivering live streaming video effectively is hard. Many pieces have to fit together, which means a stream can break down in plenty of places if they don’t.
A very big piece of the puzzle is the infrastructure behind transporting a video. You need to find a way to connect the video that is being broadcast with a subscriber who will watch it. Up until now, the best option to send a live video over the internet was a content delivery network (CDN).
The three most important features for live video streaming are:
- Real-time latency
- Multi-directional streams
- Data synchronization
However, it is becoming increasingly obvious that live video streaming and content delivery networks are not an effective combination, because CDNs cannot support these features. This post covers the reasons why.
CDNs Don’t Meet Today’s High Streaming Demands
CDNs emerged in the ’90s as a way to ease some of the bottlenecks in media delivery for websites, including video (initially prerecorded) and other content. For video they rely on HTTP-based formats such as HLS and CMAF, and they work by caching content on a network of geographically distributed proxy servers and data centers. Though perfectly suited to delivering static content, this approach is less conducive to constantly updated content such as live video, because the ongoing storage of cached content delays delivery.
The delay between the moment a video is broadcast and the moment it is seen is known as latency. Live video increasingly requires real-time latency because any delay between broadcaster and subscriber has consequences. In a conversation, for example, one person may start a new statement before the others have heard the previous one, resulting in staggered statements and, ultimately, a breakdown in communication.
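To get a feel for where the delay in segment-based HTTP delivery comes from, here is a back-of-the-envelope sketch. It assumes that a player buffers a few whole segments before it starts playing and that encoding, caching, and network hops add some fixed overhead. The function name and every number below are illustrative assumptions, not measurements of any particular CDN or player.

```python
def segmented_latency_seconds(segment_seconds, buffered_segments, overhead_seconds=0.0):
    """Rough broadcast-to-viewer delay for segment-based HTTP streaming.

    Simplified model: the viewer is behind the live edge by every segment
    the player buffers, plus a fixed overhead for encoding and delivery.
    """
    if segment_seconds <= 0 or buffered_segments < 1 or overhead_seconds < 0:
        raise ValueError("segment length and buffer depth must be positive")
    return segment_seconds * buffered_segments + overhead_seconds


# Illustrative values only: 6-second segments, 3 buffered, 2 s of overhead.
long_segment_estimate = segmented_latency_seconds(6.0, 3, overhead_seconds=2.0)

# Shorter segments reduce the delay but do not remove it.
short_segment_estimate = segmented_latency_seconds(2.0, 3, overhead_seconds=2.0)

print(f"6 s segments: ~{long_segment_estimate:.0f} s behind live")
print(f"2 s segments: ~{short_segment_estimate:.0f} s behind live")
```

Even with short segments, this simple model lands well above the sub-second range that a live conversation needs.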
More than just delaying the video images and audio, latency issues compound when metadata is delivered as well. This causes synchronization problems between when data was collected and when it is delivered. Use cases such as live auctions, live video shopping, video surveillance applications, accompanying text chats, and other interactive live event experiences all need synchronized data to work effectively.
While the CDN approach works well for distributing a stream from a publisher to many subscribers, it is not built to carry streams in the opposite direction, from subscriber to broadcaster. Each server in the CDN is essentially used as an ingest point that pushes the stream to the CDN for delivery at scale. Under this architecture, two-way communication is inefficient: a CDN is best at broadcasting single streams that subscribers simply watch, not at supporting a two-way chat in which each subscriber also publishes video of their own.
That kind of unidirectional delivery is a barrier to interactivity and more exciting experiences. An engaging live streaming infrastructure must support a wide variety of ingest and egress sources. Having access to cameras and being able to send that stream out adds important versatility. For instance, you can incorporate user-generated videos or multiple camera views.
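A quick count shows how different the two patterns are. The sketch below compares a one-way broadcast with a fully interactive session in which every participant both publishes and watches everyone else. The function names are hypothetical, and the all-to-all model is a simplifying assumption; real applications often show only a subset of participants at once.

```python
def one_to_many(viewers):
    """One broadcaster publishes; each viewer subscribes to that one stream."""
    return {"published": 1, "subscribed": viewers}


def all_to_all(participants):
    """Every participant publishes one stream and subscribes to all others."""
    return {
        "published": participants,
        "subscribed": participants * (participants - 1),
    }


broadcast = one_to_many(viewers=5)
watch_party = all_to_all(participants=6)

print("broadcast:  ", broadcast)
print("watch party:", watch_party)
```

In the interactive case the number of ingest points grows with the audience, which is exactly the direction a publish-once, cache-everywhere design does not handle well.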
WebRTC to the Rescue
WebRTC works directly in a web browser without requiring additional plugins or native app downloads. It typically establishes a connection over UDP and delivers encrypted media over SRTP, the secure profile of RTP. Because it does not depend on cached segments, WebRTC can deliver latency of 500 milliseconds or less. Unlike older HTTP-based protocols, WebRTC was designed for real-time communication from the start.
Such low latency also addresses the issue of data synchronization. Because CDNs operate with higher latency, streams and data have to be synchronized at the application level as the data is collected. With WebRTC, the data can be sent at the same time as the video over the WebRTC data channel, eliminating the need for any additional configuration. Furthermore, the data channel can be connected across multiple clients through a shared object. This consistent transfer of data ensures full interactivity between broadcaster and subscriber and supports any extra features.
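To make the synchronization problem concrete, here is a minimal sketch of the kind of application-level logic a delayed stream requires: data events (bids in a live auction, say) are held back until the viewer's playback position reaches the moment they were captured. The class and method names are hypothetical, and this is not the WebRTC data channel API or any vendor's shared object API.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(order=True)
class TimedEvent:
    timestamp_ms: int                       # capture time on the stream's clock
    payload: Any = field(compare=False)     # the data itself is never compared


class MetadataSynchronizer:
    """Holds data events until video playback catches up with them."""

    def __init__(self) -> None:
        self._pending: List[TimedEvent] = []

    def add(self, timestamp_ms: int, payload: Any) -> None:
        heapq.heappush(self._pending, TimedEvent(timestamp_ms, payload))

    def release_due(self, playhead_ms: int) -> List[Any]:
        """Return payloads whose timestamp is at or before the playhead."""
        due = []
        while self._pending and self._pending[0].timestamp_ms <= playhead_ms:
            due.append(heapq.heappop(self._pending).payload)
        return due


sync = MetadataSynchronizer()
sync.add(1500, {"bid": 120})
sync.add(500, {"bid": 100})    # events may arrive out of order
sync.add(3000, {"bid": 150})

shown_at_1s = sync.release_due(playhead_ms=1000)
shown_at_2s = sync.release_due(playhead_ms=2000)
print(shown_at_1s, shown_at_2s)
```

The longer the video delay, the longer this buffer has to hold events back. When video and data arrive within a few hundred milliseconds of each other, the buffer has very little to do.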
Since WebRTC is designed to work in the browser, what about mobile devices running native apps? Mobile apps can create the same experience by using RTSP to deliver streaming video. Since RTSP also uses RTP, it provides the same sub-500ms latency.
Does WebRTC Scale?
There is a common misconception that WebRTC is not scalable. This simply is not true. WebRTC does scale; you just have to reimagine the network you use to deliver your content.
Take Red5 Pro’s autoscaling solution, which uses a cloud-based clustering architecture rather than a centralized server. Each cluster consists of distributed server instances, or nodes, of three types: origin, relay, and edge. Each node is a virtual machine (VM) or cloud instance running on either a private network or a public cloud. Within this topology, any given origin node ingests incoming streams and communicates with multiple edge nodes to support thousands of participants. For larger deployments, origin nodes can stream to relay nodes, which in turn stream to multiple edge nodes, extending the cluster to virtually unlimited scale. It’s important to note that scaling is performed dynamically: a stream manager monitors network traffic in real time and spins nodes up or down as traffic fluctuates.
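The sizing logic behind an origin, relay, and edge topology can be sketched in a few lines. This is a simplified model for a single stream, not Red5 Pro’s actual stream manager algorithm. The function name, the per-node capacities, and the assumption that one origin can feed every relay are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    origins: int
    relays: int
    edges: int


def plan_cluster(subscribers, subscribers_per_edge, edges_per_upstream):
    """Estimate the nodes needed to fan one stream out to its subscribers.

    subscribers_per_edge: assumed capacity of a single edge node.
    edges_per_upstream:   assumed number of edges one origin or relay can feed.
    """
    if subscribers < 0 or subscribers_per_edge < 1 or edges_per_upstream < 1:
        raise ValueError("capacities must be positive")

    # Always keep at least one edge so the first subscriber can connect.
    edges = max(1, math.ceil(subscribers / subscribers_per_edge))

    # Small deployments: the origin feeds the edges directly.
    if edges <= edges_per_upstream:
        return ClusterPlan(origins=1, relays=0, edges=edges)

    # Larger deployments: insert a relay tier between origin and edges.
    relays = math.ceil(edges / edges_per_upstream)
    return ClusterPlan(origins=1, relays=relays, edges=edges)


small = plan_cluster(subscribers=1_500, subscribers_per_edge=500, edges_per_upstream=10)
large = plan_cluster(subscribers=50_000, subscribers_per_edge=500, edges_per_upstream=10)
print(small)
print(large)
```

A stream manager would rerun a calculation like this as traffic changes and compare the result with the nodes currently running to decide what to spin up or down.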
Furthermore, things get really exciting when an autoscaling solution is connected to a cross-cloud experience delivery network (XDN). The intention behind implementing an XDN is to create fully interactive experiences. Using protocols such as WebRTC and RTSP, an XDN is able to deliver real-time latency of under 500ms while still remaining fully scalable.
Instead of relying on a single hosting provider such as AWS, Microsoft Azure, Google Cloud Platform, or DigitalOcean, a multi-cloud XDN works across a variety of hosting platforms. In addition to delivering real-time latency through protocols such as WebRTC and RTSP, it can run on any hosting platform compatible with Terraform. This means that you can use the best platform based on geographic availability, price, or other factors.
Though CDNs are still useful for passive, VOD-based applications, the future is moving toward more interactive streaming. Users are looking for new features like watch parties, where they come together on a video call to watch and react to a big sports game or a live concert. These kinds of experiences aim to make the virtual feel more like real life. However you build your streaming application, it will have to support three main features: real-time latency, full scalability, and multidirectional streaming.