Witnessing the cascade of extended reality (XR) innovations spilling into commercial and private life, it doesn’t take long to see the rainbow effect many people are referring to as the Metaverse.

The question is, what will it take to put some real pots of gold under that rainbow?

An outpouring of recent mainstream press headlines touting the emerging Metaverse reflects accelerating progress across all the segments of XR development: virtual reality (VR), augmented reality (AR), mixed reality (MR) and holography. But, as noted in a previous blog, all this activity won’t morph into the transformative force visionaries expect without recourse to network connectivity that enables execution of the audio/visual functionalities essential to Metaverse-caliber communications and commerce.

Here and in succeeding blogs we’re taking a deeper dive into how progress along each track of XR development requires networking capabilities akin to those supported by Red5 Pro’s Experience Delivery Network (XDN) platform. This is the technology that is allowing the world to move beyond unidirectional streaming supported by traditional content delivery networks (CDNs) to create a real-time interactive networking foundation for livestreamed video-rich content, whatever the use case might be.

Much has been made about the role 5G will play in bringing XR to life at massive scales, and for good reason, given this mobile platform’s superiority over previous generations. But it’s important to remember that pervasive access to XR applications will also depend on an XR-optimized internet streaming infrastructure that can interoperate with 5G, and other proprietary multigigabit access networks as well, to deliver XR payloads to and from the cloud in any direction, at any scale and distance.

An Emerging Norm Belies the Futuristic Connotations of the Metaverse

It’s also important to note that these are not speculative capabilities awaiting commercial activation. Entities of every description are employing XDN infrastructure in response to demand for XR and myriad other applications involving real-time interactivity with live streamed video.

Seen in this light, the Metaverse is just a convenient label for a futuristic-seeming transformation that’s already well underway. Soon enough, references to the Metaverse and its sci-fi connotations will give way to more mundane branding suited to life in this new era of internet usage.

Whatever the label, it will be about the pervasive integration of computationally generated virtual experience with everyday reality. While the fully immersive experiences associated with VR have been highlighted in much of the commentary about the Metaverse, the truth is, the many strands of XR technology will be entwined in multiple ways to support different use cases.

It’s a point Facebook CEO Mark Zuckerberg made in conjunction with his recent declaration that Facebook is a “Metaverse company.” Zuckerberg drew a lot of attention to his thoughts about VR as the linchpin to life in the Metaverse, but lest anyone bridle at the thought of going about their daily routines as cartoonish avatars, he made it clear that he’s talking about something much more compelling.

In an interview with The Verge, he said, “The interactions that we have will be a lot richer, they’ll feel real. In the future, instead of just doing this [interview] over a phone call, you’ll be able to sit as a hologram on my couch, or I’ll be able to sit as a hologram on your couch, and it’ll actually feel like we’re in the same place, even if we’re in different states or hundreds of miles apart.”

Facebook’s Metaverse ambitions, like those of many other VR developers, extend into AR and MR as well as holography. Unlike people wearing Oculus and other head-mounted VR devices (HMDs), AR users, with the aid of more compact eyewear, can interact with virtual objects and graphics without losing contact with their real-world surroundings.

MR, widely viewed as a subset of AR that uses more advanced eyewear, takes this half-in, half-out relationship to virtual existence a step further. In these use cases, real-world spaces are populated with holographic and other renderings of life-like virtual beings and 3D objects that users can interact with.

Progress Toward Normalizing XR Eyewear

Metaverse naysayers often claim that widespread public aversion to donning specialized headgear will prevent XR usage from gaining the universal traction foreseen by proponents. But it remains to be seen whether market resistance to early bulky versions of HMDs and AR eyewear like Google Glass will persist as form factors shrink.

A major force driving improvements in form factors comes from advances in systems-on-chips (SoCs) that have freed users from connecting to PCs or smartphones for compute power. Dense combinations of CPU, GPU and AI processing enable capabilities approaching those of high-end tethered HMDs, an advance that has resonated with consumers, as evidenced by booming sales of the tetherless Oculus Quest.

For example, Qualcomm’s Snapdragon XR2 SoC, used with the recently released Quest 2, supports 4K video resolution at 120 frames per second (fps), 6K at 90 fps, and 8K at 60 fps. The chipset offers 11 times the AI processing power of the XR1 and interacts with multiple graphics APIs to enable hardware-accelerated composition, dual-display functionality and 3D overlays.

Another contribution to reductions in AR, MR, and VR headgear form factors comes from breakthroughs in display technology. For example, startups DigiLens and WaveOptics have developed eyeglass-size lenses that support fields of vision (FOVs) appropriate to XR headgear using waveguides based on the principles of light ray refraction employed in holographic systems. Facebook, too, is now developing the technology, according to VentureBeat.

Made of thin transparent material, the embedded waveguides project light from micro LEDs or other miniaturized light sources to form an image on the user’s eyes, in contrast to the image-forming process that occurs when illuminated pixels are arrayed across a screen. For VR purposes, a waveguide is paired with a liquid crystal blackout layer that blocks out external light.

Other approaches to miniaturizing XR displays are underway as well. Fraunhofer, a German developer of micro displays with high pixel densities, is a case in point. The company says the combination of two one-inch-square displays per eye delivers a high-resolution, 100°+ wide FOV, resulting in viewing experiences comparable to VR displays with headgear that is half the weight and a quarter the size of typical HMDs.

Qualcomm, in a post assessing future trends, describes the types of “sleek and stylish XR glasses” people might one day happily wear to engage with all categories of XR experience. The company predicts such glasses will feature multi-functional, semi-transparent lenses supporting display surfaces and telescopic viewing. Rims and earpieces will be embedded with multiple dot-size devices, including tracking and recording cameras; motion, health, ambient light and thermal imaging sensors; directional speakers and microphones; image projectors; and haptic devices conveying a sense of touch in user interactions with virtual elements.

How close we are to such eyewear is reflected in the smart glasses produced by Ray-Ban for use with Facebook View, an app developed for this use case. This is not an XR app, but the glasses, running on Qualcomm SoCs, are a big step in the direction envisioned in Qualcomm’s post. Billed as designer-caliber eyewear, they allow people to make phone-free mobile calls, take photos, record video, and listen to music via touch and verbal commands.

Trends in VR Usage

So how close are we to entering an era where real-world experience routinely and seamlessly blends with virtualized inter-personal experiences and pervasive, screen-free access to computer-generated graphics and data applications? Progress along all the XR vectors justifies the growing optimism expressed by people in the Metaverse camp.

Where VR is concerned, the technology has weathered years of well-publicized disappointments to gain a significant, albeit still limited role in entertainment, social networking, game playing, workplace collaboration and training, education, and health care. Multiple research studies cite rapid increases in VR usage across all sectors over the past two or three years, and are projecting high growth rates in the years ahead.

For example, in March, eMarketer predicted the number of U.S. consumers accessing VR at least once monthly will reach 58.9 million by year’s end, 22.2 million of whom will do so wearing headsets, while the rest will engage in 2D 360-degree displays of live action on smartphone and PC screens. That would equate to a 7.5-million increase in headset and 8.3-million increase in non-headset users compared to 2019. By YE 2023, eMarketer predicts the total number of monthly U.S. VR users will reach 65.9 million.

Researchers’ projections for a rapid increase in global spending on VR are in broad agreement. In 2018, ResearchAndMarkets predicted the spend would go from $7.9 billion that year to $34.08 billion in 2023, equating to a 33.95% CAGR. More recently, Grandview Research predicted the VR market, valued at $15.81 billion in 2020, will grow at 18% CAGR to reach $59.43 billion in 2028. Another, more aggressive take on the prospects comes from Fortune Business Insights, which predicts the global spend will increase at a 44.8% CAGR from $6.30 billion in 2021 to $84.09 billion in 2028.
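The three forecasts above can be cross-checked with simple compound-growth arithmetic. The sketch below is illustrative only: the function names are our own, and it assumes the number of compounding periods is simply the end year minus the start year, which the cited reports may define differently.

```python
def project(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Constant annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# (source, start year, start value in $B, end year, end value in $B, quoted CAGR)
# All figures are the ones quoted in the paragraph above.
forecasts = [
    ("ResearchAndMarkets", 2018, 7.90, 2023, 34.08, 0.3395),
    ("Grandview Research", 2020, 15.81, 2028, 59.43, 0.18),
    ("Fortune Business Insights", 2021, 6.30, 2028, 84.09, 0.448),
]

for name, y0, v0, y1, v1, rate in forecasts:
    years = y1 - y0  # assumption: one compounding period per calendar year
    print(
        f"{name}: quoted rate projects ${project(v0, rate, years):.2f}B, "
        f"implied CAGR is {implied_cagr(v0, v1, years):.2%}"
    )
```

Under that assumption, each quoted growth rate reproduces its quoted end value to within a few hundredths of a billion dollars, so the figures are internally consistent even though the three firms start from very different baselines.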

VR’s strongest growth is occurring where the technology has practical ramifications beyond entertainment. Just a third of the spending documented by Grandview is going toward consumer uses of VR while 53% is related to commercial applications in retail, real estate and other sales arenas. The remaining share of projected spending goes to healthcare, the fastest growing segment, then enterprise, aerospace, and defense.

The VR Networking Imperative

Of course, not all this market activity is related to networked applications of VR. In the consumer sector, most content developed for VR so far, including games, music video, documentaries, and other types of short- and long-form entertainment, has been made available for downloading over networks without a live streaming requirement.

But now the networking possibilities opened up by bandwidth-reducing innovations in distribution are swinging the focus to fully immersive multiplayer gaming and socialization, as well as less immersive 360-degree approaches to viewing sports, concerts and other live events with and without HMDs. In the realm of multiplayer participation, producers are seeing significant consumer engagement in everything from high-action competitions to social environments where players in avatar mode interact in virtual sports bars to play darts, paintball, laser tag, and other popular games.

Semi-immersive sports viewing through HMDs has become an expanding area of VR development, typically involving user-controlled viewing of live action from virtual positions in the stands. For example, VR coverage of the Olympics, which was featured more or less as a novelty with limited content during the 2016 summer and 2018 winter competitions, played a much bigger role during this year’s airing of the pandemic-delayed 2020 Summer Olympics from Tokyo.

Networked connectivity is equally, if not more, important to fully immersive VR applications in enterprise segments, where group participation in collaboration, training, and other activities is a priority. For example, VR-based training is cropping up with increasing frequency across the business world as the technology-driven need for continuous workforce training intensifies. Networked connectivity of VR-equipped trainees offers an alternative to allocating physical space for on-premises training.

In some industries, VR is taking hold to the point that it’s being integrated into many facets of day-to-day operations. For example, some auto manufacturers are using VR for design, technician training, buyers’ feature selections and virtual showrooms. Ford is especially aggressive with multiple VR programs in these areas.

Another promising arena for networked VR applications is healthcare, where the technology is proving to be useful in surgery, diagnostics, pain control, injury rehabilitation and treatment for conditions affecting mental health. As telemedicine gains more traction, network support for VR will enable remote execution of many of these applications, including extensions of life-saving procedures that would otherwise be unavailable in most locations. For example, specialists in pre-surgery VR modeling based on CT, ultrasound, and MRI scans using techniques like those supplied by Bioflight VR could be called on remotely to help surgeons who don’t have that expertise.

As in other fields, the medical profession is also using VR for training and basic research. A case in point is the kind of training Stanford’s Salisbury Robotics Lab provides for would-be surgeons. There, students work in a surgical simulation environment that uses sensors monitoring treatment of a virtual patient to enable computer analysis of their techniques. More broadly, VR is used to help all types of medical students learn anatomy, study specialized procedures like infection control, and hone their treatment skills through feedback on their performance in virtual situations.

Ultimately, networked connectivity is key to realizing the Metaverse vision for workplace activity in general, as exemplified by Facebook’s beta release of the virtual workspace it calls Horizon Workrooms. The platform is designed to enable live interactions among employees occupying the space as avatars. These avatars can converse realistically with the aid of sensors conveying their movements and facial expressions. They can sketch out ideas on a shared whiteboard and display their computer screens to each other in the virtual space.

This is merely the opening gambit for Facebook as it pursues Zuckerberg’s vision of using holography to add verisimilitude to the proceedings. Indeed, Workrooms is already pushing the XR envelope by making use of advances like MR desk and keyboard tracking, hand tracking, remote desktop streaming, video conferencing integration, and spatial audio.

Bandwidth-Friendly Advances Supporting VR Networking

VR introduces new challenges when it comes to streaming live, interactive immersive content. But they are no longer insurmountable in an era characterized by high-bandwidth fixed connectivity, 5G and real-time streaming as enabled by Red5 Pro.

Enabling live networked applications of VR requires incessant real-time transmission of volumetric payloads in multiple directions. To ensure a simultaneously shared experience, those payloads must keep pace with every action impacting what’s happening in the virtual space, delivering each participant’s unique view of the unfolding scene in tandem with every turn of the head.

When animation is involved, as in the Workrooms use case or multiplayer gaming and socialization, the VR impact on bandwidth consumption is minimal. But when cameras capturing scenes are involved, the bit load soars.

Most providers strive to deliver an optimal viewing experience by transmitting two 4K feeds, one for each eye, with at least 10-bit coding at 60 fps or better. But some sports producers, as in the case of the U.K.’s BT, have begun using 8K resolution to avoid the pixelating effects that occur when eyes are close to HMD viewing surfaces.
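To put that bit load in perspective, the following sketch estimates the uncompressed data rate of such feeds. It is back-of-the-envelope arithmetic, not a measurement of any provider’s stream: the UHD frame dimensions (3840x2160 and 7680x4320) and 4:2:0 chroma subsampling (1.5 samples per pixel) are our assumptions, and the function name is hypothetical. Encoded streams are far smaller than these raw figures.

```python
def raw_bitrate_bps(width: int, height: int, bit_depth: int, fps: int,
                    samples_per_pixel: float = 1.5) -> float:
    """Uncompressed video bitrate in bits per second.

    samples_per_pixel=1.5 corresponds to 4:2:0 chroma subsampling
    (an assumption; 4:4:4 would be 3.0).
    """
    return width * height * samples_per_pixel * bit_depth * fps


UHD_4K = (3840, 2160)  # assumed frame size for "4K"
UHD_8K = (7680, 4320)  # assumed frame size for "8K"

per_eye_4k = raw_bitrate_bps(*UHD_4K, bit_depth=10, fps=60)
stereo_4k = 2 * per_eye_4k  # one feed per eye
single_8k = raw_bitrate_bps(*UHD_8K, bit_depth=10, fps=60)

print(f"4K, one eye: {per_eye_4k / 1e9:.2f} Gbps uncompressed")
print(f"4K, two eyes: {stereo_4k / 1e9:.2f} Gbps uncompressed")
print(f"8K, one feed: {single_8k / 1e9:.2f} Gbps uncompressed")
```

Because an 8K frame has four times the pixels of a 4K frame, every step up in resolution multiplies the raw load accordingly, which is why techniques that avoid sending the whole panorama matter so much.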

Providers can get around the high levels of bandwidth usually associated with 4K and 8K by limiting how much of the captured camera input is transmitted at each moment. This is done through viewport-dependent streaming (VDS), also known as “tiling,” which leverages MPEG’s Omnidirectional Media Format (OMAF) for 360-degree VR and the OMAF-based specifications for volumetric content under development by the Virtual Reality Industry Forum (VRIF).

So far, VDS as embodied in OMAF does not support six degrees of freedom (6DoF), which is to say, VR applications common to multiplayer gaming that allow the user to realistically interact with the virtual space while moving around. Having accomplished 6DoF for audio, OMAF developers anticipate they will achieve support for video 6DoF within the next year or so. This is critical to most of the networked implementations of VR embodied in the Metaverse concept.

VDS breaks the entire panorama into segments, or tiles, so that an immersive experience can be delivered by transmitting only the content needed to fill the user’s screen, or viewport, with whatever has changed since the previously received viewport. Tiles assembled for each user’s viewport can be compressed and delivered at varying degrees of resolution in tandem with how the eye registers different parts of the immediate FOV in real life. Researchers have found that use of tiling with VR produces bitrate savings in the range of 40%-65% compared to viewport-independent streaming (VIS), which transmits the entire 360-degree viewing space.
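The tile-selection idea can be sketched in a few lines. This is a simplified geometric illustration, not the OMAF specification or any vendor’s implementation: it assumes an equirectangular panorama cut into a uniform grid, treats the viewport as a rectangle in yaw/pitch space (ignoring distortion near the poles), and uses hypothetical names throughout.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tile:
    col: int  # index along yaw, 0..cols-1, starting at yaw 0 degrees
    row: int  # index along pitch, 0..rows-1, starting at pitch -90 degrees


def _yaw_overlap(a_start: float, a_width: float,
                 b_start: float, b_width: float) -> bool:
    """True if two arcs on a 360-degree circle overlap."""
    # Offset of arc b's start measured from arc a's start, in [0, 360).
    offset = (b_start - a_start) % 360.0
    # Either b starts inside a, or b wraps past 360 back into a.
    return offset < a_width or offset + b_width > 360.0


def visible_tiles(yaw: float, pitch: float, hfov: float, vfov: float,
                  cols: int = 8, rows: int = 4) -> List[Tile]:
    """Tiles that intersect a viewport centered at (yaw, pitch), in degrees."""
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows
    view_left = (yaw - hfov / 2.0) % 360.0
    view_bottom = max(-90.0, pitch - vfov / 2.0)
    view_top = min(90.0, pitch + vfov / 2.0)

    selected = []
    for row in range(rows):
        tile_bottom = -90.0 + row * tile_h
        tile_top = tile_bottom + tile_h
        if tile_top <= view_bottom or tile_bottom >= view_top:
            continue  # this row of tiles lies entirely outside the viewport
        for col in range(cols):
            if _yaw_overlap(view_left, hfov, col * tile_w, tile_w):
                selected.append(Tile(col, row))
    return selected


tiles = visible_tiles(yaw=0.0, pitch=0.0, hfov=90.0, vfov=90.0)
print(f"{len(tiles)} of {8 * 4} tiles intersect the viewport")
```

With the example values, a 90-by-90-degree viewport looking straight ahead touches 4 of the 32 tiles, including two that wrap around the 0/360-degree seam. The tile count is not the same thing as the bitrate saving cited above, since practical systems typically also send some surrounding tiles at reduced quality so the picture does not break up when the user turns.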

VIS remains in wide use and likely will continue to have a role in delay-tolerant live scenarios that don’t support a fully immersive experience, as in the case of many sports applications. The technology can be used to deliver a reasonable viewing experience over high-capacity broadband networks through buffering techniques that reduce bandwidth consumption to some extent. But pervasive use of VR for live-streamed use cases will depend on infrastructure that can support VDS.

Whether the transmitted content comes from cameras or graphics engines, every FOV must be constantly updated without causing discomfort as users shift their gazes across the virtual panorama. This requires real-time interactive streaming capabilities that transcend the limitations of HTTP-based streaming technology.

The Roles of AR and MR in Networked XR

The same real-time connectivity imperative holds when networks are used to support user experiences with AR and MR applications. The previously cited ResearchAndMarkets report predicted revenue generated by AR usage would grow at a 40.29% CAGR from $11.4 billion in 2018 to $60.55 billion in 2023. The widely shared assumption that spending on AR is eclipsing the VR spend at a faster growth rate reflects how rapidly the technology, which is usually depicted as including MR applications, has taken hold across the consumer and enterprise markets.

Of course, most AR apps employ local processing in smartphones to produce the effects appearing on screens with other content, including video or photos captured by phone cameras. In addition to playing games like Pokemon Go, consumers are putting AR apps to use for pre-purchase looks at themselves wearing new makeup or clothes, for redesigning homes and arranging furniture, and much else.

There’s also an abundant supply of general-use AR apps running in cell phones that support workday routines. For example, Measure, an Apple app, takes advantage of iOS phone cameras and spatial awareness to measure everything from objects to entire rooms. Google’s Just a Line app allows users to draw in 3D space.

But network-delivered support for AR and MR apps with content delivered from the cloud is a fast-growing aspect of how the technology is used, especially in workplace scenarios. Enterprises, institutions and government entities are putting immersive, hands-free AR to use where data and images delivered from remote sources make it much easier to get things done.

As listed in a recent guide issued by TechRepublic, some of the more common use cases requiring real-time connectivity include:

  • Maintenance assistance providing technicians guidance from experts, Internet of Things sensors and other sources that can be used to support a task immediately at hand.
  • Training beyond simple educational apps where users can view and zoom in on cloud-based 3D representations of complex machinery or other objects to see how things work.
  • Support in engineering and architectural design through access to 3D modeling templates and CAD files that can be displayed in real space.

The amount of data transmitted moment to moment in such instances is typically a tiny fraction of the volume involved with VR. But the mission-critical nature of these applications mandates infrastructure that will deliver data and graphics in tandem with real-time usage requirements.

Incorporating Holographic Rendering into Networked XR Experiences

All of the networking requirements associated with XR applications are essential to making holographic elements part of the interactive virtualized experience. Of course, holograms occupy a generic category that covers a lot of ground, much of which is of no interest in the XR space. For example, holographic techniques used in concert settings featuring deceased performers and in free-floating images bursting from billboards and storefronts aren’t applicable.

But many companies have made significant progress with other versions of holographic technology that will play an important role. Advanced MR eyewear like Microsoft’s HoloLens, Nreal’s Light and Magic Leap’s glasses make it possible for engineers, doctors, and anyone else interacting with virtual objects to get a 360-degree perspective on holographic images that can be manipulated in real space.

One illustration of what’s in store involves use of a system devised by Washington University scientists in St. Louis to facilitate physicians’ visualization of heart interiors during procedures targeting irregular heartbeats. Physicians gain insight into how to conduct the ablation procedures through gestures controlling HoloLens-projected 3D images of patients’ hearts that have been derived from electroanatomic and catheter data.

A glasses-free approach to use of holograms is underway at Light Field Lab, a startup that enables projections of unusually realistic free-standing holograms from purpose-built screens. The need for networked delivery of such displays in livestreamed consumer and enterprise use cases has sparked development of the Immersive Technology Media Format (ITMF) by the Immersive Digital Experience Alliance, an organization spearheaded by CableLabs, Charter Communications, Light Field Lab and others.

ITMF provides a standardized approach to creating and distributing holograms with light field technology, which projects and converges rays of light from every direction through volumetric points of space known as voxels. These are the three-dimensional equivalent of a pixel.

ITMF is also meant to encapsulate use of VR technology, thereby providing what its backers deem to be a more inclusive approach to networking XR experiences than can be accomplished via the VRIF formats. It remains to be seen what impact this approach to delivering holographic experiences with VR will have on the marketplace, but it could have a role to play with implementation of the real-time interactive infrastructures that will anchor the new era in internet usage.

The XDN Foundation for Networking XR

This new networking paradigm will involve an interplay between transport over 5G and competing proprietary networks on the one hand and, on the other, the internet connectivity provided by the type of infrastructure supported by Red5 Pro’s multi-cloud XDN technology.

As described at length in this white paper, the multidirectional real-time streaming capabilities of XDN infrastructure are implemented through a software stack hierarchically deployed and orchestrated in three-tiered clusters across one or more public or private clouds. Each cluster consists of core Origin Nodes where encoded content is ingested and streamed out to Relay Nodes, each of which serves an array of Edge Nodes that deliver live unicast streams to their assigned service areas.
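The three-tier hierarchy described above can be modeled as a simple tree. The sketch below is a conceptual illustration only and does not use Red5 Pro’s APIs: the class names, node names and per-edge subscriber capacities are all hypothetical placeholders chosen to show how streams fan out from an Origin through Relays to Edges.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EdgeNode:
    name: str
    max_subscribers: int  # hypothetical capacity, not a Red5 Pro figure


@dataclass
class RelayNode:
    name: str
    edges: List[EdgeNode] = field(default_factory=list)


@dataclass
class OriginNode:
    name: str
    relays: List[RelayNode] = field(default_factory=list)


def cluster_capacity(origin: OriginNode) -> int:
    """Total subscribers the cluster's edges could serve at once."""
    return sum(e.max_subscribers for r in origin.relays for e in r.edges)


def delivery_path(origin: OriginNode, edge_name: str) -> List[str]:
    """Node names a stream passes through on its way to a given edge."""
    for relay in origin.relays:
        for edge in relay.edges:
            if edge.name == edge_name:
                return [origin.name, relay.name, edge.name]
    raise KeyError(f"no edge named {edge_name!r} in this cluster")


# A toy cluster: one origin, two relays, three edges (all values made up).
origin = OriginNode("origin-1", [
    RelayNode("relay-a", [EdgeNode("edge-a1", 2000), EdgeNode("edge-a2", 2000)]),
    RelayNode("relay-b", [EdgeNode("edge-b1", 2000)]),
])

print(cluster_capacity(origin))
print(" -> ".join(delivery_path(origin, "edge-b1")))
```

The point of the tree shape is that capacity grows by adding Relay and Edge Nodes beneath an Origin rather than by loading more viewers onto the node that ingests the stream.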

The XDN platform has been pre-integrated for use with AWS, Microsoft Azure, Google Cloud and DigitalOcean. It can also be extended to over a dozen other cloud services through use of the integration APIs and other tools comprising HashiCorp’s widely used Terraform software stack.

While end-to-end latency over any number of XDN connections at any distance involving any number of clouds may be as high as 400ms, which is more than adequate for most real-time use cases, lower latencies in the sub-50ms range required by the XR use cases discussed here are supported as well. These latencies are attained in instances where usage is limited to a small geographic area or the applications running on the XDN rely on 5G connectivity to users, as occurs in the tie-in between XDN infrastructure and AWS Wavelength Zones.

Wavelength Zones are instantiations of AWS compute and storage services housed in carrier datacenters that aggregate local cell tower traffic. Direct access to AWS facilities through these edge datacenters eliminates transmission delays of anywhere from tens of milliseconds to multiple seconds that are incurred by traffic traversing cell sites, metro, and regional aggregation centers, and the internet to get to and from the cloud. By deploying XDN infrastructure in Wavelength Zones, applications developers, service providers and the carriers themselves can deliver interactive video streams at end-to-end latencies below 100ms between any points served by these AWS cloud on/off ramps.

This path pioneered by the partnership between AWS Wavelength and Red5 Pro is just the beginning of the global transition to the integration between next-generation real-time internet streaming and 5G technology. Other cloud operators, including Microsoft Azure and Google Cloud, are following Amazon’s lead with their own 5G edge strategies.

MNOs across the world are taking advantage of these initiatives to accelerate the transition to the capabilities essential to realizing the full potential of XR technology. Integrations with real-time internet streaming infrastructure will complete that transition.

For more information on how to employ XDN architecture in pursuit of these goals, contact info@red5pro.com or schedule a call.
