Live streaming involves integrating a few different pieces of technology. Those pieces must work seamlessly together and offer a full range of features. Achieving the best performance, highest utility, and best value is paramount.
Based on the latest trends in video-first applications, here are 8 key components for assembling your live video streaming tech stack.
Real-Time Latency: 500 Milliseconds or Less
Latency is the time between when the video is captured and when it is delivered to the viewer. As such, it has a fundamental effect on the user experience.
More than just low latency, you need real-time latency of under 500 milliseconds. Anything higher than that will negatively impact the proper flow of live streaming. If there’s a delay of multiple seconds, it’s not live. It’s really that simple.
Obviously a two-way (multi-directional) feed that involves back-and-forth communication will suffer from high latency, but one-way streaming suffers just as much. Take, for example, this year's broadcast of America's largest sporting event, the Super Bowl, where the fastest latency was 28 seconds. Some viewers even saw the game with around a minute of latency. That's plenty of time for someone at the stadium to text or tweet spoilers for one of the world's most watched events, and certainly nowhere near real-time streaming. Not to mention the issues that could create for gambling: forget spoilers, that's just plain cheating.
Furthermore, interactive features such as a fan wall or a watch party with your friends require real-time latency as well. The NBA, for example, is one of the major organizations interested in bringing fans into the stadium virtually. That entails creating a video panel that displays selfie videos of fans as they enjoy the game (score depending). By capturing audience feedback in real time, crowds can actually cheer as something happens. Not only does this improve the fan experience, but the players enjoy it too. Other live events, such as concerts and stand-up comedy performances, can benefit from the same technology.
Outside of the live entertainment industry, real-time latency is required for many other use cases.
Drone surveillance plays a key role in safety applications such as combating fires, floods and other emergencies. In addition to surveying the landscape, drones can also provide coverage outside the range of fixed cameras and, in the case of smoke-choked air space, outside the range of helicopters and airplanes. Drones also serve as a major component in military and border patrol operations worldwide.
Additionally, the education field benefits from real-time latency. For fluid back-and-forth conversations between students and teachers, the video must travel as fast as possible. That way, clarifying questions can be asked in a timely manner, avoiding any confused backtracking. Other features, such as anti-cheating student monitoring services like Honorlock, need to be able to respond to instances of cheating as soon as possible.
Of course, the simple act of having a back-and-forth conversation requires real-time latency to keep the conversation from becoming stuttered and unnatural. This affects many applications, from social platforms and conferencing solutions to telehealth practices. When there is a delay of several seconds, there is no way for conversation to flow smoothly, which leads to a confusing experience for your end users.
Your video streaming technology stack needs to incorporate WebRTC, as that is currently the only browser-based protocol that can meet the sub-500-millisecond real-time latency requirement. In addition to WebRTC, other protocols such as RTSP can be used in native mobile apps to achieve the same performance.
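To see why segmented HTTP delivery cannot meet the real-time bar, it helps to sketch a glass-to-glass latency budget. The figures below are illustrative assumptions, not measurements of any particular platform: WebRTC keeps every stage small, while segment-based delivery buffers whole segments before playback begins.

```python
# Illustrative glass-to-glass latency budgets (all figures are assumptions).
webrtc_budget_ms = {
    "capture_and_encode": 50,
    "network_transit": 100,
    "jitter_buffer": 100,
    "decode_and_render": 50,
}

hls_budget_ms = {
    "capture_and_encode": 50,
    "segment_buffering": 18_000,   # e.g. three 6-second segments held by the player
    "cdn_cache_and_transit": 2_000,
    "decode_and_render": 50,
}

def total(budget):
    return sum(budget.values())

print(f"WebRTC total: {total(webrtc_budget_ms)} ms")  # lands under 500 ms
print(f"HLS total:    {total(hls_budget_ms)} ms")     # lands in the tens of seconds
```

Even with generous assumptions for the segmented path, the buffering stage alone dwarfs the entire WebRTC budget.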
Delivering true real-time latency helps with synchronization as well.
To over-simplify, there are essentially three types of data transported through a live stream: audio frames, video frames, and metadata. The audio and visual frames of data taken respectively from the microphone and camera are packaged together for transportation across a network. Since the audio and video are muxed together, there is less concern about them arriving at the same time. That leaves the other part: the metadata.
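The effect of muxing can be sketched very simply: audio and video frames are interleaved into one ordered transport stream by timestamp, so they travel (and arrive) in lockstep, while metadata is carried out-of-band. The frame timings below are illustrative only.

```python
# Minimal mux sketch: interleave audio and video frames by timestamp so
# they share one ordered transport stream; metadata travels separately.
audio = [("audio", t) for t in (0, 20, 40, 60)]   # 20 ms audio frames
video = [("video", t) for t in (0, 33, 66)]       # ~30 fps video frames

muxed = sorted(audio + video, key=lambda frame: frame[1])
print(muxed)
```

Because both media types ride the same ordered stream, their relative timing survives transport automatically; only the separately delivered metadata needs extra care.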
Essentially, metadata plays an important role in expanding the functionality of live video. Elements included with the video feed can range from performing actions, such as placing a bet or sending a chat message, to carrying information such as GPS coordinates or video overlays. For example, a sports or esports broadcast could feature a real-time graphic display that must remain up to date with everything happening on screen.
Educational applications can feature a digital whiteboard where the teacher can share notes and pictures with the class.
That data needs to be synchronized properly in order to be useful. Otherwise there will be a disconnect between what is on the screen and the additional information that should correspond with it. If there is a delay between when a user clicks to place an auction bid and when it actually registers in the system, that is a problem: the bid could be invalidated by a subsequent bid or, even worse, the price could end up higher than anticipated because the real-world auction is ahead of what the online attendee sees. Furthermore, a live sports stream could carry viewer information such as how long someone watched and whether they switched away at any point. Esports applications, such as those made by Singular Live and Activision Blizzard, are another case where interactive graphics, overlays, and gameplay data need to align correctly.
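One common way to keep out-of-band metadata in step with the picture is to stamp each event with the stream's presentation time at the source and hold it at the player until its clock reaches that stamp. This is a generic sketch of the idea with made-up event names, not any platform's actual API.

```python
# Sketch: release metadata events only once the player clock reaches the
# presentation timestamp (pts) they were stamped with at the source.
events = [
    {"pts_ms": 1000, "type": "overlay", "text": "GOAL!"},
    {"pts_ms": 2500, "type": "bid", "amount": 150},
]

def due_events(events, player_clock_ms):
    """Return events whose presentation time the player has reached."""
    return [e for e in events if e["pts_ms"] <= player_clock_ms]

print(due_events(events, 1200))  # only the overlay is due so far
```

With real-time latency, the gap between the source clock and the player clock is small enough that events fire within a fraction of a second of the action they describe.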
One approach to ensuring proper synchronization is to run at high latency so that there is time to line everything up as the data is collected. This was the approach taken by Net Insight's Sye product (now owned by Amazon). As already discussed, however, high latency is a blocker for many use cases, making real-time live streaming the only viable option.
Just as WebRTC is the fastest way to transport the video feed, the accompanying data can be sent over the WebRTC data channel. Alternatively, separate WebSocket channels can provide similar latency. Both techniques are implemented by Red5 Pro with our SharedObjects method. SharedObjects manage data feeds across multiple clients, allowing for the consistent transfer of data. This ensures full interactivity between broadcaster, subscriber, and any extra features.
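The shared-object pattern can be modeled in a few lines: a single server-held state dictionary whose updates fan out to every subscribed client, so everyone sees the same data at the same time. This is a toy illustration of the pattern, not Red5 Pro's actual SharedObjects API.

```python
# Toy model of the shared-object pattern: one authoritative state dict
# whose updates are pushed to every subscribed client.
class SharedObject:
    def __init__(self):
        self.state = {}
        self.subscribers = []      # callables invoked on every update

    def subscribe(self, on_update):
        self.subscribers.append(on_update)

    def set(self, key, value):
        self.state[key] = value    # update the authoritative copy
        for notify in self.subscribers:
            notify(key, value)     # fan the change out to all clients

room = SharedObject()
seen = []
room.subscribe(lambda k, v: seen.append((k, v)))
room.set("chat", "hello")
print(seen)  # every subscriber observed the update
```

Whether the fan-out rides a WebRTC data channel or a WebSocket, the shape is the same: clients never mutate local copies directly, they only react to updates from the shared state.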
A third component of a live streaming video stack is a method for adding as many video streams as your app demands. That means both the ability to add more broadcasts, such as audience webcams or non-user-contributed video like a drone feed or an additional camera angle during a live event, and the ability to add many subscribers that can watch those streams. Scaling ingest and egress are equally important, and most platforms don't do a good job on both. More on the importance of multi-directional streaming in the next section. For now, let's take a look at how app developers scale their one-to-many live OTT broadcasts.
For egress streams, the longstanding approach is to use an HTTP-based CDN architecture for scalability. However, this comes with a major tradeoff in latency, because CDNs must cache data in different physical data centers, which slows down the live streaming process.
Even though scaling is a fairly easy concept to understand, configuring a fully scalable architecture is a little more involved. Cloud-based infrastructure circumvents CDN limitations: configuring cloud-based autoscaling enables servers to dynamically spin up or down as needed. Red5 Pro utilizes the peer-to-peer communications mechanism of WebRTC, along with server software that is hierarchically deployed in three-tiered clusters across public or private cloud environments.
Each autoscale cluster can consist of four node types: Origin, Edge, Relay, and Transcoder. The core Origin is where encoded content is created or ingested, Edge nodes are responsible for delivering unicast streams to subscribers, and Relays connect Origins to Edges to add more connections than a single Origin can handle on its own. Transcoders generate a configurable ladder of different qualities and bitrates to ensure that stream quality is maximized according to the available network conditions.
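A transcoder's bitrate ladder can be sketched as deriving every rendition at or below the source quality. The rungs below are illustrative examples, not Red5 Pro's default configuration.

```python
# Illustrative ABR ladder: (height, kbps) rungs are example values only.
LADDER = [
    (1080, 4500),
    (720, 2500),
    (480, 1200),
    (360, 800),
]

def renditions(source_height):
    """Keep only the rungs the source resolution can actually feed."""
    return [(h, kbps) for h, kbps in LADDER if h <= source_height]

print(renditions(720))  # a 720p source yields 720p and below
```

Each subscriber is then served the highest rung their measured bandwidth can sustain, which is what lets quality degrade gracefully instead of the stream stalling.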
Basically, the autoscaling system works by spinning up as many Origins, Edges, and/or Relays as needed. The system knows when to spin up new instances through an additional node called a Stream Manager. While the Stream Manager is technically outside the autoscaling cluster, it contains all the connection logic to route a broadcast stream to the correct Origin and a subscribing stream to the corresponding Edge.
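The Stream Manager's routing role can be sketched as two lookups: publishers land on the least-loaded Origin, and subscribers are handed an Edge attached to the Origin serving their stream. The node names and load logic below are illustrative, not the product's actual implementation.

```python
# Sketch of Stream Manager-style routing (illustrative topology and logic).
cluster = {
    "origins": {"origin-1": []},  # origin -> list of streams it ingests
    "edges": {"edge-1": "origin-1", "edge-2": "origin-1"},  # edge -> origin
}

def route_publisher(stream, cluster):
    """Send a new broadcast to the least-loaded Origin."""
    origin = min(cluster["origins"], key=lambda o: len(cluster["origins"][o]))
    cluster["origins"][origin].append(stream)
    return origin

def route_subscriber(stream, cluster):
    """Hand the viewer an Edge attached to the Origin carrying the stream."""
    origin = next(o for o, streams in cluster["origins"].items() if stream in streams)
    return next(e for e, o in cluster["edges"].items() if o == origin)

origin = route_publisher("game-cam-1", cluster)
edge = route_subscriber("game-cam-1", cluster)
print(origin, edge)
```

When load rises, new Origin or Edge nodes are simply added to these maps, which is what lets the cluster grow without any change to client logic.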
For geographic scaling, autoscaling server clusters can be set up in different regions. This ensures that participants streaming from across the globe can still access the video stream. To get around the issue of different hosting providers in different parts of the world, a cross-cloud solution allows access to the best regional data center regardless of providers.
Another layer of scalability is supporting multi-directional streaming.
The common CDN approach is good for unidirectional delivery, where a publisher can send a stream to many subscribers. However, CDN infrastructure does not effectively support subscribers communicating back to the publisher. Being confined to streaming in a single direction blocks the creation of interactive live events. Much of the excitement of live activities comes from the connections between all participants. Otherwise, it's basically just like streaming a movie from your couch.
A good live video streaming technology stack requires supporting a wide range of media players while also handling the ingress and egress of live streams coming from publishers. Having access to cameras and being able to send that stream out adds important versatility.
Multi-directional streaming bolsters a variety of use cases. It unlocks the ability to handle user-generated streams and works well with surveillance applications, which often have many ingest sources and a small number of subscribers.
Other uses for multi-directional streaming are hybrid applications such as watch parties, where groups of people gather in a video chat room to watch the same broadcast at the same time. For venue-based live events such as music or sports, a video wall can be set up where spectators stream video of themselves watching the event, creating an atmosphere closer to that of an in-person crowd. Social applications such as Twitch and Facebook Live also require multi-directional streaming.
Having a full set of options is only as useful as the ability to actually integrate them into any specific application.
An HTML5 SDK makes configuring a webapp much easier and practical, while a Mobile SDK for Android and iOS makes configuring a native app much more effective.
By its very nature, security configurations such as a round-trip authentication plug-in need to be customizable as well.
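The round-trip idea is that the media server calls back to your own business logic before accepting a publish or subscribe. The sketch below shows the generic shape of that flow using a signed token; the function names and signing scheme are illustrative, not Red5 Pro's actual plug-in interface.

```python
# Generic round-trip authentication sketch: your service issues a signed
# token, and the validator checks it before a publish/subscribe is allowed.
import hashlib
import hmac

SECRET = b"replace-me"  # shared between your auth service and the validator

def issue_token(username: str) -> str:
    sig = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}:{sig}"

def validate_token(token: str) -> bool:
    username, sig = token.split(":", 1)
    expected = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)  # constant-time comparison

token = issue_token("viewer42")
print(validate_token(token))           # a genuine token passes
print(validate_token("viewer42:bad"))  # a forged token is rejected
```

Because the validation step is your code, you can swap the HMAC check for a database lookup, an OAuth call, or any policy your application needs, which is exactly why this piece of the stack must stay customizable.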
Of course, there is the consideration of adding custom features. In addition to all the extra features discussed above, all of which also require customization, another example is adding custom camera angles for live events so users can choose how they want to see the action. Depending upon the business model, ad insertion could be useful. There is also the consideration of ever-evolving VR technology. Customization is very important so the tech stack can be flexible enough to adapt to future requirements.
Being able to build your own server-side logic can also be a major factor in choosing a video streaming platform. Make sure that you are able to build in custom webhooks and tie into your own authentication. Oftentimes a provider's API comes up short on key features you need for your application. A good example of this is Vimeo Live: it works really well for configuring a specific feature, but adding more can be a problem.
In other words, customization allows your application to stand out and be truly unique and useful.
Another important choice is where and how you will host your entertainment streaming platform. While choosing a convenient Platform as a Service (PaaS) solution can be an easy way to go, it can have real consequences if you end up in a service trap. As PaaS providers make decisions, you may need to move your application to one that better supports your needs.
Business decisions that are almost entirely outside of your control can greatly affect your business. PaaS providers are subject to the same issues many other businesses face: internal decisions shaped by evolving market trends, or a more dramatic company buyout, can result in big changes. They could drop support for a region where many of your fans live, or change their pricing. Apprehension over having to rebuild your entire application from scratch can lock you into working with a company that slows down your growth.
The best option is to use a solution with a flexible API that is hosting agnostic and allows you to build your own server-side application logic. This gives you the freedom to port your application to another hosting provider. It also means that if you want to add features you didn't foresee early in development, you can do so. Multi-platform support removes the risk of being permanently bound to a single provider. The aforementioned mobile SDKs should be portable as well, since you don't want to tear everything down and have to build it all up again.
Any technology stack should include good support. It is essential for ensuring that everything works as expected and that nothing is left out in terms of functionality. Making the most out of every part of the stack is important. Advanced support contracts, chat channels and online ticketing systems can all be leveraged to make sure those needs are met.
Everyone wants to know they are getting the best value. Some streaming platforms employ per-minute or per-stream charges. While this may work in the short term, it can become very costly as everything starts to scale. Depending upon the choice of hosting platform, charges for data usage are calculated in different ways: sometimes it's tiered, sometimes metered usage with a surcharge added.
Red5 Pro features a monthly or yearly subscription for a software license. As it is a non-hosted solution, all data charges and charges for running a server instance are paid directly to a hosting provider. There are no additional surcharges tacked onto the data you use. Instead, Red5 Pro charges for each server node running the Red5 Pro software. That way, it doesn't matter how many connections are established or how much data is consumed.
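The difference between the two pricing models comes down to what the cost scales with. Every rate and count below is a made-up illustration, not an actual quote from any provider:

```python
# Hypothetical cost comparison (all rates and counts are illustrative):
# metered billing grows with audience size; per-node licensing grows only
# with the infrastructure you run.
viewers = 10_000
minutes_watched = 60
per_minute_rate = 0.005           # $ per viewer-minute (hypothetical)

metered_cost = viewers * minutes_watched * per_minute_rate

nodes_needed = 6                  # hypothetical cluster for this audience
per_node_rate = 80                # $ per node per month (hypothetical)
licensed_cost = nodes_needed * per_node_rate

print(f"Metered:  ${metered_cost:,.2f}")   # scales with every extra viewer
print(f"Per-node: ${licensed_cost:,.2f}")  # scales only with server count
```

Under these assumptions, doubling the audience doubles the metered bill, while the per-node bill only rises when the cluster actually needs more servers.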
Interested in building your own live streaming application? Let us show you what Red5 Pro can do: send an email to email@example.com or schedule a call.