AWS Outage Underscores Need for Cross-Cloud Streaming Architecture

Significant internet service disruptions triggered by the latest AWS outage have once again made the case that providers of live-streamed video can only attain the persistent performance they’re looking for through reliance on a multi-cloud streaming infrastructure.

The widely reported outage, centered in the AWS US-EAST-1 Region and lasting several hours on December 7, 2021, caused disruptions impacting users in several regions worldwide. As described by Fortune, “The list of high-profile companies significantly impacted by Tuesday’s Amazon Web Services outage reads like a who’s-who of everyday 21st century brands: Netflix, Disney+, Ring, Robinhood, Instacart, Roku, McDonald’s, even Amazon itself.”

Such outages, of course, are not unique to Amazon. Indeed, outages are a fact of life with all iterations of datacenter technology, whether it’s used in private operations or to support public services.

Multiple Disruptions Impacting Millions

But disruptions are especially damaging when the affected facilities are used to provide services to hundreds of millions of end users, as in the cases of AWS and the other major infrastructure-as-a-service (IaaS) providers, Microsoft Azure and Google Cloud. For example, in December 2020 a Google Cloud outage interrupted several services with mass user bases, as did another the following April. Microsoft Azure experienced two outages with global impact, from different causes, three weeks apart in March and April 2021. And in late November 2020, AWS registered an outage with an impact similar to that of the recent disruption.

When it comes to video streaming, the outage issue is also a major problem for services that depend on content delivery network (CDN) platforms that run on a fixed set of facilities under proprietary control of the CDN operator. Akamai is a case in point with several incidents that generated headlines over the past two years, including two in June and July 2021.

Akamai’s problems were resolved after an hour or so but were disruptive enough to have knocked out online operations at several banks, airlines, and other entities around the world. The risks to companies that depend solely on a single provider like Akamai are unavoidable. As noted by Light Reading, Akamai has acknowledged in at least one quarterly SEC filing that “cybersecurity breaches and attacks on us … could lead to significant costs and disruptions that would harm our business, financial results and reputation.”

Given all that can go wrong when datacenter facilities are used to support complex streaming operations, outages are an inevitable downside to capitalizing on the many benefits that have driven the world’s shift from proprietary hardware to reliance on software running on commodity appliances. Reporting on findings from a comprehensive survey of cloud outages over several years, researchers at the University of Chicago noted, “Service outages are hard to escape from. It has become a new year’s tradition that news websites report the worst cloud outages in the previous year.”

Nor is there any sign that power outages, cyberattacks, software glitches, component failures, and other causes behind the disruptions will diminish anytime soon. As Rob Enderle, principal at the Enderle Group, told the E-Commerce Times following the 2020 AWS outage, “These are complex systems undergoing maintenance at a component level and almost always under attack.”

The New Cross-Cloud Streaming Model

The way forward was well articulated in the aftermath of the December 2021 AWS outage by Yale Law School cybersecurity expert Sean O’Brien. “The latest AWS outage is a prime example of the danger of centralized network infrastructure,” O’Brien said in an interview with CBS. Instead, he added, we need a new network model that resembles “the peer-to-peer roots of the early internet.”

Nothing could be truer when it comes to video streaming, especially when the use cases involve live video streaming. As the ability to stream video in real time becomes ever more essential to applications across every segment of the digital economy, the market can no longer tolerate the risks of relying on traditional approaches to streaming content, whether the support comes from servers of a single IaaS provider or those under exclusive control of a CDN operator.

This is why Red5 Pro’s Experience Delivery Network (XDN) platform is built on a cross-cloud architecture. The platform supports fail-safe real-time streaming across multiple clouds at any distance in any direction, with end-to-end latencies no greater than 200 to 400 ms and often much lower.

Customers can take advantage of XDN pre-integrations with AWS, Microsoft Azure, Google Cloud, and DigitalOcean to employ any combination of these IaaS platforms in cross-cloud operations. And they can tap any of more than a dozen other IaaS providers whose facilities can be integrated into an XDN multi-cloud operation via the widely used Terraform open-source multi-cloud toolset provided by HashiCorp.

Terraform facilitates cross-cloud instantiations by translating IaaS resources into a high-level configuration syntax that allows IaaS APIs to be abstracted for access through a Terraform Cloud API specific to each cloud operator. By leveraging those APIs, the XDN platform can manage any combination of contractually available Terraform-compatible IaaS resources as holistically integrated components of the live streaming infrastructure. In addition, the platform can be manually integrated to work with the APIs of any cloud provider that isn’t integrated with Terraform.

The Fail-Safe XDN Architecture

When customers choose to deploy their XDN infrastructures on more than one of these cloud services, the XDN Stream Manager ensures all streams are optimally routed based on the best options available at any instant across all cloud instantiations. This orchestration of routes occurs over a hierarchy of XDN Nodes instantiated in clusters. Each cluster is anchored by one or more core Origin Nodes, which ingest encoded content and stream it out to Relay Nodes, each of which serves an array of Edge Nodes that deliver live unicast streams to their assigned service areas.
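To make the routing idea concrete, here is a minimal sketch of how a viewer might be assigned to an Edge Node when nodes are spread across several clouds. It is an illustration only, not the Stream Manager's actual logic: the EdgeNode fields, the pick_edge function, and the sample latencies are all hypothetical, and the rule applied (lowest measured latency among healthy nodes) is an assumed stand-in for whatever criteria the platform really uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeNode:
    """Hypothetical view of one Edge Node as seen by a routing component."""
    name: str
    cloud: str          # illustrative label such as "aws" or "azure"
    latency_ms: float   # assumed: measured round-trip time to the viewer
    healthy: bool       # assumed: result of the latest health check


def pick_edge(edges, unavailable_clouds=frozenset()):
    """Return the healthy edge with the lowest latency, skipping clouds marked as down."""
    candidates = [
        e for e in edges
        if e.healthy and e.cloud not in unavailable_clouds
    ]
    if not candidates:
        raise LookupError("no healthy edge node available in any cloud")
    return min(candidates, key=lambda e: e.latency_ms)


# Sample data, invented for the example.
EDGES = [
    EdgeNode("edge-aws-1", "aws", 35.0, True),
    EdgeNode("edge-azure-1", "azure", 48.0, True),
    EdgeNode("edge-gcp-1", "gcp", 41.0, False),
]
```

With this sample data, pick_edge(EDGES) selects the AWS edge. Calling pick_edge(EDGES, {"aws"}) simulates a cloud-wide outage and falls through to the Azure edge, which is the behavior a cross-cloud deployment is meant to provide.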

The XDN platform makes use of the Real-time Transport Protocol (RTP) as the foundation for interactive streaming. RTP is the underlying transport for both WebRTC (Web Real-Time Communication), the peer-to-peer video communications protocol Red5 Pro has adapted for one-to-many applications, and the Real Time Streaming Protocol (RTSP), a one-to-many video streaming alternative to HTTP that was standardized by the IETF in 1998.

In most cases, WebRTC is the preferred option for streaming on the XDN platform by virtue of its support by all the major browsers, which eliminates the need for device plug-ins. RTSP, often the preferred option when mobile devices are targeted, can be activated through Red5 Pro iOS and Android SDKs.

There are also other options for receiving and transmitting video in real time when devices are not equipped to work with WebRTC or RTSP. These include the Real-Time Messaging Protocol (RTMP), Secure Reliable Transport (SRT), and MPEG Transport Stream (MPEG-TS), all of which can be retained as encapsulations transmitted over RTP.

The XDN platform also provides full support for the multi-profile transcodes used with ABR streaming by utilizing intelligent Edge Node interactions with client devices to deliver content in the profiles appropriate to each user. And to ensure ubiquitous connectivity for every XDN use case, the platform supports content delivery in HTTP Live Streaming (HLS) mode as a fallback. In the rare instances where devices can’t be engaged via any of the other XDN-supported protocols, they will still be able to render the streamed content, albeit with the multi-second latencies that typify HTTP-based streaming.
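The protocol options described above amount to a preference list with a guaranteed fallback. The sketch below shows one way such a selection could be expressed. The choose_protocol function and the lowercase protocol labels are hypothetical, and the relative order of RTMP, SRT, and MPEG-TS is an assumption made for the example; the article only establishes that WebRTC is usually preferred, that RTSP is common on mobile, and that HLS is the last resort.

```python
# Assumed preference order for real-time delivery; only the position of
# WebRTC (first) and HLS (fallback) is taken from the text above.
REAL_TIME_PREFERENCE = ["webrtc", "rtsp", "rtmp", "srt", "mpegts"]
FALLBACK = "hls"


def choose_protocol(supported):
    """Pick the first real-time protocol the device supports, else fall back to HLS."""
    normalized = {p.lower() for p in supported}
    for protocol in REAL_TIME_PREFERENCE:
        if protocol in normalized:
            return protocol
    # No real-time option matched: deliver over HLS and accept
    # multi-second latency.
    return FALLBACK
```

A browser reporting {"WebRTC", "HLS"} would be served over WebRTC, while a device reporting only {"hls"} would get the HLS fallback.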

The cluster-wide redundancy that’s essential to fail-safe operations is enabled by the Stream Manager’s autoscaling mechanism with reliance on platform controllers designed to work with each cloud provider’s APIs. With persistent performance monitoring of all engaged IaaS providers, the Red5 Pro platform can instantaneously shift processing from a malfunctioning component within a node to another appliance in that node.
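As a rough illustration of what shifting processing away from a failed component involves, the following sketch moves streams off instances that have failed a health check and onto the least-loaded healthy ones. Everything here is hypothetical: the reassign_streams function, the stream and instance identifiers, and the least-loaded placement rule are assumptions chosen for the example, not a description of how the XDN platform implements failover.

```python
def reassign_streams(assignments, healthy):
    """Move streams on failed instances to the least-loaded healthy instance.

    assignments: dict mapping stream id -> instance id (hypothetical identifiers)
    healthy:     set of instance ids that passed the latest health check
    Returns a new dict; the input is left unchanged.
    """
    if not healthy:
        raise RuntimeError("no healthy instance left to take over streams")

    load = {instance: 0 for instance in healthy}
    result = {}

    # First pass: streams already on healthy instances stay where they are.
    for stream, instance in assignments.items():
        if instance in healthy:
            result[stream] = instance
            load[instance] += 1

    # Second pass: orphaned streams go to whichever healthy instance
    # currently carries the fewest streams (ties broken by name).
    for stream, instance in assignments.items():
        if instance not in healthy:
            target = min(sorted(load), key=load.get)
            result[stream] = target
            load[target] += 1

    return result
```

For example, if instance "a" fails while carrying two streams and instances "b" and "c" remain healthy, the two orphaned streams are spread across "b" and "c" rather than piled onto one of them.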

These capabilities also apply to XDN-wide load balancing. By translating the commands of the XDN operations system (OS) to the API calls of the cloud operators, the OS is able to execute the load balancing essential to persistent high performance across the entire infrastructure without manual intervention.
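The translation of a single orchestration command into provider-specific API calls is commonly done with one controller per cloud behind a shared interface. The sketch below shows that pattern under stated assumptions: CloudController, RecordingController, scale_out, and the provider and region labels are all invented for illustration. The stand-in controller only records the request it would make, since no real cloud SDK calls are used here.

```python
from abc import ABC, abstractmethod
from itertools import cycle, islice


class CloudController(ABC):
    """Uniform interface the orchestrator calls; each subclass hides one provider's API."""

    @abstractmethod
    def launch_edge(self, region: str) -> str:
        ...


class RecordingController(CloudController):
    """Stand-in that records the request instead of calling a real cloud API."""

    def __init__(self, provider: str):
        self.provider = provider
        self.calls = []

    def launch_edge(self, region: str) -> str:
        request = f"{self.provider}:launch:{region}"
        self.calls.append(request)
        return request


def scale_out(controllers, region_by_provider, count):
    """Launch `count` edge instances, rotating across providers to spread load."""
    launched = []
    for controller in islice(cycle(controllers), count):
        region = region_by_provider[controller.provider]
        launched.append(controller.launch_edge(region))
    return launched
```

With two controllers labeled "cloud-a" and "cloud-b", a call such as scale_out(controllers, {"cloud-a": "us-east", "cloud-b": "eu-west"}, 3) issues two launch requests to the first provider and one to the second. A real deployment would replace RecordingController with a class that wraps each provider's own SDK.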

—————————————————————————-

As described in this white paper and many blogs, the era of real-time streaming is upon us, impacting everything from how traditional live content is streamed to the myriad interactive modes of operation associated with extended reality (XR) applications, watch parties, live-stream shopping, multiplayer gaming, remote work, and much else. Indeed, as explained in this blog, all the talk about the emerging Metaverse would be meaningless without assurance that massively scalable real-time video connectivity as provided by XDN infrastructure is a well-established fact of internet life.

With the implementation of cross-cloud operations on the XDN platform, there’s no reason to anticipate that the disastrous disruptions to digital commerce posed by inevitable cloud outages will carry into the real-time streaming era. To learn more about how to take advantage of the cross-cloud flexibility of XDN architecture, contact info@red5.net or schedule a call.