The pandemic catapulted video conference calls from occasional use to an everyday essential. Businesses and schools now rely on video calls to conduct day-to-day operations. This drastic change exposed the shortcomings of many platforms currently used for video conferencing.
High latency, security concerns, and an inability to scale are just a few of the issues plaguing those who depend on video conferencing. There is also the concern that most streaming platforms were designed for general use and may not have all the features needed for a virtual or hybrid lifestyle.
So what can one do?
This post covers the essential features to consider when building a conferencing application.
If your goal is to support as many users as possible, your application must be accessible on a variety of platforms. Participants should be able to join the conference call from their preferred devices, whether a laptop, tablet, or mobile device. One way to do this is through a browser-based application.
For the highest degree of effectiveness, your platform should work without any additional hassles like configuring a native app (as with Zoom) or downloading a plugin. Although it may not sound like a particularly difficult problem, setting up an app can be an issue since it will need to be configured according to the user’s operating system and specific device setup. This extra level of complexity can be an obstacle to entry, and can cause compatibility issues with different operating systems and configurations.
Directly running a web application in the browser is a much easier way to ensure accessibility via a variety of platforms. Additionally, since this approach involves just one web application rather than multiple native apps, it minimizes the amount of maintenance required to keep everything up to date.
Of course, not all web apps are created equal. Some browser-based apps require users to install browser plugins, which can lead to conflicts and create more upkeep. A web app that runs without any plugins is the easiest and best-functioning solution. This is where WebRTC comes in. Built on web standards, WebRTC is a real-time communication technology that works directly in the browser. Through a simple API exposed by the browser, WebRTC can deliver live video streams with a latency of under 500ms. The combination of low latency and ease of use means that solutions supporting WebRTC are the best choice for video conferencing apps.
Additionally, if you wish to create separate native applications for mobile devices, Red5 Pro has developed mobile SDKs using RTSP to allow for seamless performance on native apps for Android and iOS.
Authentication and Security
Pre-built conferencing solutions such as Zoom have faced serious security issues, with uninvited guests and hackers jumping into chats. Building a fully customized solution is the best way to prevent unauthorized stream access, and it helps make sure you are always in control of your data.
To prevent anyone tampering with the video stream, security must be a top priority when building a conferencing application. One of the main ways to protect the stream is through encryption. Encryption keeps information secure, and makes it available only to those who are authorized to have it.
For browser-based applications, WebRTC is a great way to include encryption. WebRTC was designed with security in mind and has built-in encryption, so all audio, video, and other data are protected without having to rely on third-party plugins. WebRTC’s security implementation is also standard across regions, which makes it adaptable and reliable no matter where users are located.
Round-trip authentication creates an extra layer of defense to further guard against intruders during online calls. With this authentication, you can fully customize who has permission to view and interact with the stream. This keeps hackers at bay so you can hold your virtual hangouts in peace.
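To make the idea concrete, here is a minimal sketch of the kind of check a round-trip authentication step might perform. It assumes that the media server hands each join request to your own backend before admitting the participant. Every name in it (`JoinRequest`, `Grant`, `TokenLookup`, `authorizeJoin`) is hypothetical and is not part of the Red5 Pro API, and the in-memory map stands in for what would normally be a network call to your auth service.

```typescript
// Hypothetical types for illustration only; not the Red5 Pro API.
type Role = "publisher" | "subscriber";

interface JoinRequest {
  token: string;
  room: string;
  role: Role;
}

interface Grant {
  room: string;
  roles: Role[];
  expiresAt: number; // Unix time in milliseconds
}

// Stand-in for the application's own auth backend. In a real deployment the
// media server would reach it over the network, not through a function call.
type TokenLookup = (token: string) => Grant | undefined;

function authorizeJoin(req: JoinRequest, lookup: TokenLookup, now: number): boolean {
  const grant = lookup(req.token);
  if (grant === undefined) return false;      // unknown token
  if (grant.expiresAt <= now) return false;   // token has expired
  if (grant.room !== req.room) return false;  // token was issued for another room
  return grant.roles.some((r) => r === req.role); // role must be explicitly granted
}

// Example data: one token that may only subscribe to the "standup" room.
const exampleGrants = new Map<string, Grant>([
  ["abc123", { room: "standup", roles: ["subscriber"], expiresAt: 2000 }],
]);
const lookupGrant: TokenLookup = (token) => exampleGrants.get(token);

const canWatch = authorizeJoin(
  { token: "abc123", room: "standup", role: "subscriber" },
  lookupGrant,
  1000,
);
const canBroadcast = authorizeJoin(
  { token: "abc123", room: "standup", role: "publisher" },
  lookupGrant,
  1000,
);
```

With this example data, `canWatch` is `true` and `canBroadcast` is `false`: the same token can view the call but cannot publish to it. Checking the room and role on every request, rather than only checking that a token exists, is what lets you customize who may view and who may interact.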
For even more security, choose a hosting solution that allows you to have full authority over and access to your data. If it is available to you, setting up a self-hosted solution means that data is stored on your own servers, so you can retain total control over that information. Alternatively, most major hosting platforms take data seriously, but you will have to review their standards and choose which provider best fits the needs of your particular use case. Also, consider the compatibility of your video streaming SDK with your hosting solution. For example, Red5 Pro is hosting-agnostic, which means it supports a variety of hosting platforms including AWS, Azure, Digital Ocean, GCP, or your own servers.
Even before COVID-19, businesses were increasingly hiring employees who live far from the main office. This goes beyond offshore operations: sometimes the best person for the job simply does not live within commuting distance. In our increasingly virtual world, ensuring that your streams maintain high performance across geographic regions is imperative.
To support a geographically diverse workforce, each person needs access to a high-quality stream that stays in sync with all the other people on a call. While it would be great if everyone had a stable and fast internet connection, that’s not always the case. Features such as adaptive bitrate (ABR) streaming and transcoding (covered in more detail later in the post) help give users the best possible experience despite any connection issues.
Additionally, the server infrastructure must be able to support multiple regions at scale to provide a positive streaming experience. Software that supports both single-cloud and cross-cloud deployments further ensures that the best cloud environment for each region can be used. Red5 Pro’s cloud-based autoscaling architecture, which spins up new servers to meet increased demand, is one good example of such a setup. With the upcoming XDN infrastructure, users can choose their regions based on their expected audience.
Here’s where things get tricky. You now have synchronized, secure, interactive live video conferences, but the goal is to support a large number of those conferences concurrently. Depending on the use case, each individual call may have a high number of participants. A distribution network at full capacity lacks the resources to deliver new streams, regardless of where they need to go.
How can you prepare for concurrent conferences at scale? To start, use a load-testing tool to determine how much load your current setup can handle, and then adjust your system as needed. As an example, Red5 Pro has developed load-testing “bees” that can launch any number of simultaneous “attacks” against a server. The bees create clients that subscribe to a video stream on the server, and as the number of clients increases, the load test reveals how many concurrent connections the system can handle. This allows you to prepare the necessary number of server instances ahead of time, based on the expected audience.
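Turning a load-test result into an instance count is simple arithmetic. The sketch below is one way to do it, assuming you reserve some headroom so that no instance runs right at the limit the load test found. The function name, the 20% default reserve, and the example numbers are illustrative assumptions, not measured Red5 Pro figures.

```typescript
// Hypothetical planning helper; all numbers used with it are made up.
function instancesNeeded(
  expectedConnections: number,
  maxPerInstance: number, // concurrent connections one instance sustained in load testing
  reservePercent: number = 20, // share of capacity deliberately left unused
): number {
  if (expectedConnections < 0) {
    throw new Error("expectedConnections must not be negative");
  }
  // Capacity we are willing to use on each instance after the reserve.
  const usablePerInstance = Math.floor((maxPerInstance * (100 - reservePercent)) / 100);
  if (usablePerInstance < 1) {
    throw new Error("usable capacity per instance must be at least 1");
  }
  // Round up: a partially filled instance still has to be running.
  return Math.ceil(expectedConnections / usablePerInstance);
}

// If load testing showed 2,000 connections per instance and 10,000 are expected:
const plannedInstances = instancesNeeded(10000, 2000);
```

Here each instance is planned for 1,600 connections (80% of 2,000), so `plannedInstances` is 7. Rerunning the calculation with a different reserve is a quick way to see how much the safety margin costs in extra servers.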
With the load-testing results in mind, autoscaling technology such as the one provided by Red5 Pro allows automatic creation and deletion of server instances based on the amount of traffic experienced at a given time. Leveraging cloud infrastructure, Red5 Pro’s autoscaling solution enables full scalability to millions of concurrent users.
A Stream Manager (a Red5 Pro server application that manages traffic and monitors server usage) establishes clusters, or groups, of active server nodes in specified geographic regions. Through this fully automated process, the system can scale up and down in real time depending on demand, all while maintaining a real-time latency of under 500 milliseconds.
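The core of any autoscaler is a decision rule that compares current usage against thresholds. The following is a generic sketch of such a rule, not Red5 Pro's Stream Manager logic: the type names, the threshold fields, and the example policy values are all assumptions chosen for illustration.

```typescript
// Generic threshold-based scaling rule; hypothetical names throughout.
type ScaleAction = "scale-out" | "scale-in" | "hold";

interface ClusterState {
  nodes: number;       // server instances currently running in the region
  connections: number; // concurrent connections across those instances
}

interface ScalePolicy {
  capacityPerNode: number; // e.g. taken from load-testing results
  scaleOutAt: number;      // utilization (0..1) at or above which a node is added
  scaleInAt: number;       // utilization (0..1) at or below which a node is removed
  minNodes: number;        // never shrink the cluster below this size
}

function decideScaling(state: ClusterState, policy: ScalePolicy): ScaleAction {
  // An undersized (or empty) cluster always grows first.
  if (state.nodes < policy.minNodes) return "scale-out";
  if (state.nodes === 0) return state.connections > 0 ? "scale-out" : "hold";

  const utilization = state.connections / (state.nodes * policy.capacityPerNode);
  if (utilization >= policy.scaleOutAt) return "scale-out";
  if (utilization <= policy.scaleInAt && state.nodes > policy.minNodes) return "scale-in";
  return "hold";
}

const examplePolicy: ScalePolicy = {
  capacityPerNode: 1000,
  scaleOutAt: 0.8,
  scaleInAt: 0.3,
  minNodes: 1,
};

const busy = decideScaling({ nodes: 2, connections: 1700 }, examplePolicy);
const quiet = decideScaling({ nodes: 2, connections: 400 }, examplePolicy);
```

With this policy, `busy` is `"scale-out"` (85% utilization) and `quiet` is `"scale-in"` (20% utilization). Keeping a gap between the two thresholds avoids adding and removing nodes in rapid succession when traffic hovers around a single cutoff.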
Those more familiar with the peer-to-peer nature of WebRTC might be wondering how it can be realistically scaled without a gigantic data center the size of ENIAC. Rather than creating direct peer-to-peer connections between the broadcaster and subscribers, Red5 Pro configured their architecture so that each WebRTC connection is routed through a server instance (node). Connecting distant users to each other using a powerful cloud-based backbone rather than forcing clients to navigate across the world through the internet makes for a much better, more efficient experience.
Speaking of efficiency, reducing total resource consumption improves application performance across the board, since the resulting decrease in CPU usage frees up capacity for other processes. System inefficiencies waste resources and block full system optimization.
For an efficient cloud-based server architecture, you need to get the highest number of streams per server instance. This prevents you from running more servers than necessary, which reduces your total hosting cost. Ensuring a smoothly operating streaming system requires a streamlined platform that maximizes the number of connections you can get per server instance.
Mobile users also benefit from reduced CPU consumption, as it reduces battery drain as well. Mobile devices tend to have less CPU power to spare than a desktop or laptop computer, so efficiency gains can make all the difference for mobile app performance.
Furthermore, the choice of transport protocol also affects efficiency. While HTTP-based protocols such as HLS and MPEG-DASH, as well as RTMP (which runs over TCP rather than HTTP), are easy to scale, they can result in high latencies of at least 2-3 seconds. That is nowhere near the 500ms needed for fully interactive conference calls. WebRTC, by contrast, supports full scalability with sub-500ms real-time latency.
Lastly, codec choice also has its consequences. Factors such as hardware support, browser compatibility, and bandwidth consumption all have an effect on the latency and efficiency of a live stream.
As outlined above, there are many different factors that go into setting up a live stream, all of which affect how well your video conferencing app will work. Image and audio quality certainly play major roles in creating an app that lives up to consumer expectations.
When conferencing, communication needs to be clear and smooth, not choppy and fragmented. Internet connectivity and bandwidth speed can have detrimental effects on a video stream, especially when participants join the call from different global regions. Even users who are in highly connected areas can still suffer through periods of throttled bandwidth or instability.
WebRTC has a built-in feature that helps ensure clear communication. In most video calls, audio matters more than video. Even in the case of a presentation, a blip in the slideshow is easy for the viewer to recover from while continuing to follow along. A gap in audio is much more disruptive. Accordingly, if frames are dropped due to bandwidth issues, WebRTC will prioritize sending the audio frames while dropping the video frames. This allows the video to catch up and correctly synchronize with the audio again.
However, it would be ideal if frames were not dropped in the first place. Features such as transcoding and ABR help make sure that each call participant always gets the best quality their connection speed allows. Transcoding produces multiple renditions of the stream at different qualities, such as high, medium, and low. ABR then allows the client, the publisher, or both to automatically request a lower bitrate that better suits their current network conditions.
Since video streaming is a bandwidth-intensive activity, the trick is allowing those with better connectivity to get the highest possible quality without having to shape the entire video call around the person with the lowest quality. When it comes to bandwidth speed, transcoding solves for the lowest common denominator, while still allowing those with better internet access to get full quality.
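The selection step behind ABR can be sketched in a few lines: given the renditions that transcoding produced and an estimate of a participant's available bandwidth, pick the highest-bitrate rendition that fits. This is a simplified illustration, not Red5 Pro's implementation. The `Variant` type, the bitrate ladder, and the 0.8 safety factor are assumptions made for the example.

```typescript
// Simplified ABR selection; names and numbers are illustrative assumptions.
interface Variant {
  name: string;
  bitrateKbps: number;
}

function pickVariant(variants: Variant[], availableKbps: number, safetyFactor: number = 0.8): Variant {
  if (variants.length === 0) {
    throw new Error("at least one variant is required");
  }
  // Sort a copy from lowest to highest bitrate so the input is left untouched.
  const sorted = variants.slice().sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  // Only spend part of the measured bandwidth, leaving room for fluctuation.
  const budgetKbps = availableKbps * safetyFactor;

  // Start from the lowest rendition so every participant gets something,
  // then upgrade to each higher rendition that still fits the budget.
  let choice = sorted[0];
  for (const variant of sorted) {
    if (variant.bitrateKbps <= budgetKbps) {
      choice = variant;
    }
  }
  return choice;
}

const ladder: Variant[] = [
  { name: "high", bitrateKbps: 2500 },
  { name: "medium", bitrateKbps: 1000 },
  { name: "low", bitrateKbps: 400 },
];

const forOfficeUser = pickVariant(ladder, 3500);
const forMobileUser = pickVariant(ladder, 1500);
const forWeakLink = pickVariant(ladder, 300);
```

With this ladder, `forOfficeUser` is the high rendition, `forMobileUser` is medium, and `forWeakLink` falls back to low even though 400 kbps exceeds its budget, because the lowest rendition is the floor. Each participant is evaluated independently, which is how one person's poor connection avoids dragging down the rest of the call.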
Red5 Pro’s approach to transcoding streams and ABR is detailed in the Red5 Pro server documentation, along with an example for iOS on GitHub.
In addition to the security features discussed earlier, there are other ways you may want to customize your app. While out-of-the-box options are convenient, they can have limitations and may not be compatible with your product’s specific needs.
Red5 Pro’s Experience Delivery Network (XDN), for example, provides a fully customizable, flexible solution to video conferencing. XDN provides the infrastructure for synchronized video streaming, and handles tasks like deployment and autoscaling while leaving room for you to add the customizations you wish to include in your application. For example, server-side logic can be written in any language or use any platform: Node, Go, Ruby on Rails, Java, etc. Data can be sent and received in a shared data channel, which is useful for creating a variety of elements. Take a look at Red5 Pro’s SharedObject implementation. Using the SharedObject, you can enhance your conference call experience by developing custom features to encourage collaboration, such as live overlays, chat functions, interactive graphics and virtual whiteboards that are synchronized with the video stream.
There are plenty of other features that will help your app stand out, including the following:
- Screen sharing
- VoIP integration
- Visual and audio effects
- Virtual and augmented reality
- Integration with other software, such as web browsers and search engines, social media sites like Facebook and Twitter, Google Docs, and more
There are many factors to consider when it comes to building a conferencing application. If you are interested in creating a conferencing app that can be customized to your needs, we’d love to help you. Message email@example.com or schedule a call to discuss how Red5 Pro’s XDN architecture can work for you.