By its nature, video streaming is a quality-sensitive application, especially when it comes to live video. Maintaining a consistent stream is very important. Every video and audio frame can carry key information, so mishandling those frames can seriously hurt the user experience.
In the real world, things tend to go wrong at the very worst time, like the last minute of a big match between two undefeated teams. Any problem at that point can have dire consequences: upset users turn to social media to vent their frustrations, causing lasting damage to the reputation of your streaming platform.
Maximizing stability and ensuring consistent performance means fixing issues as they are discovered. The question is how proactive you can be in identifying problem areas. Waiting for a stream to crash or degrade into a garbled mess of LEGO-like blocks and static is an easy way to figure out what needs to be fixed, but it is far from effective. It is much better to identify issues proactively and address them in a timely manner. Fixing problems right as they arise saves money as well. As with a disease, it is better to catch a problem in its early stages and prevent the damage from spreading.
Maintaining the stream quality for your users involves the constant monitoring of production environments. System administrators need to check the state and manage the resources of the production environment. To make this process a little less manually intensive, a monitoring system should automatically send alerts if any problem is detected.
Beyond stream quality, monitoring systems can also collect information like CPU, memory, network, and disk space utilization. Collected user data can be analyzed to measure metrics such as the total number of viewers, geographic distribution, bandwidth usage, and other insights into user behavior. This kind of information can be obtained through monitoring services or through API calls designed to gather stream statistics.
The following is a list of three tools that we recommend using to create an effective system for live video stream monitoring. It should be noted that we focused on Red5 Pro autoscaling deployments in this post, but much of this setup could be used for other video streaming systems.
Grafana is visualization and analytics software that displays monitoring data in a graphical interface. According to their website, “Grafana allows you to query, visualize, alert on, and understand your metrics no matter where they are stored.”
Grafana includes an alerting system that can monitor the resources used by the instances and notify system administrators if any problem is detected. In this way, system administrators always have up-to-date information about their environments and nodes and can respond quickly to any issue that arises.
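To make the idea of threshold-based alerting concrete, here is a minimal Python sketch. It is not how Grafana evaluates alert rules internally; the metric names (`cpu_percent`, `memory_percent`, `disk_percent`) and the threshold values are hypothetical and only illustrate the pattern of comparing a sample against limits and producing notifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    metric: str       # hypothetical metric name, e.g. "cpu_percent"
    threshold: float  # alert when the sampled value is above this limit


def evaluate(sample, rules):
    """Return one readable alert per rule whose threshold is exceeded."""
    alerts = []
    for rule in rules:
        value = sample.get(rule.metric)
        # Metrics missing from the sample are skipped rather than alerted on.
        if value is not None and value > rule.threshold:
            alerts.append(f"{rule.metric}={value} exceeds {rule.threshold}")
    return alerts


# Illustrative limits only; real limits depend on your instances.
RULES = [
    Rule("cpu_percent", 90.0),
    Rule("memory_percent", 85.0),
    Rule("disk_percent", 80.0),
]

if __name__ == "__main__":
    sample = {"cpu_percent": 97.5, "memory_percent": 40.0, "disk_percent": 81.0}
    for alert in evaluate(sample, RULES):
        print(alert)
```

In a real deployment the rules live in Grafana and the notification goes to a contact point such as email or chat; the sketch only shows the comparison step.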
Prometheus is an open-source monitoring system with a dimensional data model, flexible query language, efficient time-series database, and a modern alerting approach.
For a typical monitoring solution, Prometheus is installed on a central server that collects the data from a stream source (such as Red5 Pro instances) and manages it. The instances to monitor are specified in a dynamically updatable JSON file that resides on the Prometheus central server. Each monitored instance runs the Node Exporter software, which captures data from the operating system and makes it available to Prometheus.
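The JSON file of targets mentioned above follows Prometheus's file-based service discovery format: a list of groups, each with a `targets` list of `host:port` strings and optional `labels`. The following Python sketch shows one way such a file could be generated. The `environment` label name, the `targets.json` file name, and the helper function names are assumptions for illustration; port 9100 is the Node Exporter port used in this post.

```python
import json
import os
import tempfile

NODE_EXPORTER_PORT = 9100  # the port this post opens for Node Exporter


def build_target_groups(hosts, environment):
    """Build one file_sd target group; hosts are deduplicated and sorted."""
    targets = sorted({f"{host}:{NODE_EXPORTER_PORT}" for host in hosts})
    return [{"targets": targets, "labels": {"environment": environment}}]


def write_targets_file(path, groups):
    """Write the JSON to a temp file, then rename it over the real one.

    The rename keeps Prometheus from ever reading a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(groups, handle, indent=2)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "targets.json")  # hypothetical file name
        groups = build_target_groups(["192.168.1.201", "example.com"], "Prod")
        write_targets_file(path, groups)
        with open(path) as handle:
            print(handle.read())
```

Prometheus picks up a file like this through a `file_sd_configs` entry in its scrape configuration and reloads it when the file changes, which is what makes the target list dynamically updatable.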
Finally, Grafana interfaces with Prometheus to receive the data and display it in a user interface.
Here’s a quick rundown of the steps to deploy Grafana and Prometheus on a Red5 Pro autoscaled system with links to specific documentation.
Prepare a monitoring instance with Grafana + Prometheus + Node Exporter
- Install Grafana Dashboard
- Install Prometheus Monitoring Server
- Install Node Exporter (Agent)
Prepare Red5 Pro Stream Manager instance with Node Exporter
- Install the Red5 Pro Stream Manager: choose your cloud provider and install a Stream Manager instance.
- Install Node Exporter on the Stream Manager instance.
- Open port 9100 on your firewall to the Monitoring Server IP.
Prepare Red5 Pro Node image with Node Exporter
- Install the Red5 Pro Node image: choose your cloud provider and install a Node instance by following that provider's instructions.
- Install Node Exporter on the Node instance.
- Open port 9100 on your firewall to the Monitoring Server IP. (This is the security group that will be used for autoscaling and deploying node instances.)
Set up and configure the monitoring instance with Grafana + Prometheus + Node Exporter
Prometheus collects metrics from monitored targets by scraping HTTP metrics endpoints on those targets. To add monitoring targets such as the Stream Manager and nodes to Prometheus, use the update_targets.sh script, which is configured in the steps that follow.
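To show what a scrape returns, here is a minimal Python sketch that parses a few lines in the Prometheus text exposition format, which is what Node Exporter serves on its metrics endpoint. The sample text is illustrative rather than captured from a real instance, and the parser is deliberately simplified: it ignores comment lines and optional timestamps and does not unescape label values.

```python
import re

# name, optional {labels}, value, optional integer timestamp
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, float_value) samples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative sample in the shape Node Exporter produces.
SAMPLE = """\
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+09
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

Prometheus performs this scrape itself on a schedule; the sketch is only meant to make the shape of the scraped data visible.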
- Configure the Prometheus targets
- Create a new folder in the Prometheus folder (the cron entry later in this section assumes /etc/prometheus/scripts)
- Copy the script update_targets.sh to that folder
- Modify the following variables in update_targets.sh:
|Variable|Description|
|---|---|
|`SM_DOMAIN="stream-manager.com"`|Stream Manager domain name. It is used for API calls to get information about deployed nodes.|
|`SM_API_KEY="abc123"`|Stream Manager API key. It is used for API calls to get information about deployed nodes.|
|`ENVIRONMENT="Prod"`|Your environment name. This parameter can be used to select a different environment in the Grafana web application.|
|`ADDITIONAL_TARGETS=(192.168.1.201 example.com)`|Array of additional target hosts, such as stream managers and others.|
Install jq, which the bash script uses to work with JSON:
sudo apt update
sudo apt install jq
Add a record to the crontab to run the script automatically:
echo "* * * * * root /etc/prometheus/scripts/update_targets.sh >> /etc/prometheus/scripts/targets.log" >> /etc/crontab
systemctl restart cron
Check the log file (/etc/prometheus/scripts/targets.log, as set in the cron entry above)
Import the Red5 Pro Stream Manager dashboard into Grafana
Go to the newly created Red5 Pro Stream Manager dashboard
- Explore the dashboard and start monitoring your live streams
Red5 Pro Node Checker
Now that you have Grafana and Prometheus up and running, we recommend adding one additional tool to your video stream monitoring arsenal: the Red5 Pro Node Checker. At Red5 Pro, we built our own Node Checker specifically for live streaming. It is a stand-alone Node.js server that can be deployed in a Red5 Pro autoscaling environment. Like the solutions above, the Node Checker monitors the Red5 Pro edge nodes in your deployment, and it verifies that WebRTC is running properly.
In practice, the Node Checker checks the health of the edges by periodically retrieving the list of in-service edges and published streams through the Stream Manager API. If at least one edge and one stream are found, a live stream is randomly selected from the list and an attempt is made to subscribe to that stream through every edge.
The health check runs on a single live stream because it has been observed that when an edge is unresponsive for one stream, it is unresponsive for the others as well. Testing one stream is therefore more efficient than testing every possible stream.
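The selection logic described above can be sketched in a few lines of Python. This is not the Node Checker's actual code (which is a Node.js server); the function name `run_health_check` and the injected `try_subscribe` callback are hypothetical stand-ins for the real Stream Manager API calls and the browser-based subscribe attempt.

```python
import random


def run_health_check(edges, streams, try_subscribe, rng=random):
    """Probe every edge with one randomly chosen live stream.

    Returns a dict mapping each edge to True (subscribed) or False.
    Returns an empty dict when there are no edges or no streams to test.
    """
    if not edges or not streams:
        return {}
    stream = rng.choice(streams)  # one stream stands in for all of them
    return {edge: try_subscribe(edge, stream) for edge in edges}


if __name__ == "__main__":
    # Hypothetical data; a real checker would fetch these from the
    # Stream Manager API and subscribe through a headless browser.
    edges = ["10.0.0.11", "10.0.0.12"]
    streams = ["match-final", "studio-feed"]

    def fake_subscribe(edge, stream):
        return edge != "10.0.0.12"  # pretend the second edge is unresponsive

    print(run_health_check(edges, streams, fake_subscribe))
```

Passing the subscribe step in as a callback keeps the selection logic testable without a browser or a live deployment.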
For each subscribe attempt, the Node Checker launches a headless Google Chrome instance that loads a locally served HTML5 page. The URL of the page includes the IP of the edge server to use, the stream name to subscribe to, and the maximum number of retries. Once the page subscribes successfully or reaches the maximum number of retries, it calls a REST API exposed by the Node Checker to report whether subscribing succeeded. If the page reports that it could not subscribe, or the page times out, the Node Checker flags the edge as unresponsive. After a configurable number of unresponsive health checks, the Node Checker reports the unresponsive edge to the Stream Manager using the sunset node API.
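Two pieces of this flow lend themselves to a short Python sketch: building the probe page URL and counting unresponsive checks before reporting an edge. The query parameter names (`host`, `stream`, `retries`), the page URL, and the class name `FailureTracker` are assumptions for illustration. The sketch also assumes that a successful check resets an edge's failure count, which the description above does not specify.

```python
from urllib.parse import urlencode


def build_probe_url(base_url, edge_ip, stream_name, max_retries):
    """Build the URL of the locally served probe page (hypothetical params)."""
    query = urlencode(
        {"host": edge_ip, "stream": stream_name, "retries": max_retries}
    )
    return f"{base_url}?{query}"


class FailureTracker:
    """Counts unresponsive health checks per edge."""

    def __init__(self, threshold):
        self.threshold = threshold  # configurable number of failed checks
        self.failures = {}

    def record(self, edge, subscribed):
        """Record one result; return True when the edge should be reported."""
        if subscribed:
            # Assumption: a successful check clears earlier failures.
            self.failures.pop(edge, None)
            return False
        self.failures[edge] = self.failures.get(edge, 0) + 1
        return self.failures[edge] >= self.threshold


if __name__ == "__main__":
    print(build_probe_url("http://localhost:8080/probe.html", "10.0.0.11", "live1", 3))
    tracker = FailureTracker(threshold=2)
    for result in (False, False):
        if tracker.record("10.0.0.11", result):
            print("report 10.0.0.11 to the Stream Manager")
```

A page timeout would simply be recorded as `subscribed=False`, so both failure paths described above go through the same counter.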
When the Node Checker reports a problem edge, the Stream Manager stops forwarding new clients to it so that the process of shutting down the dysfunctional node can begin. The Stream Manager then monitors the existing clients on the reported edge and, once all of them have disconnected, terminates that edge. Additionally, a new edge instance is spun up to compensate for the loss in capacity.
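The drain-then-terminate behavior can be illustrated with a toy in-memory model in Python. This is not the Stream Manager's implementation: the class and method names are hypothetical, routing new clients to the least-loaded edge is an assumption, and so is the choice to start the replacement edge at report time rather than after termination.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    name: str
    clients: int = 0
    sunset: bool = False  # True once the edge has been reported


class EdgePool:
    """Toy model of sunsetting an edge; for illustration only."""

    def __init__(self, edges):
        self.edges = {edge.name: edge for edge in edges}
        self._spawned = 0

    def report_unresponsive(self, name):
        """Stop routing to the edge and add a replacement (assumed timing)."""
        self.edges[name].sunset = True
        self._spawned += 1
        replacement = Edge(f"replacement-{self._spawned}")
        self.edges[replacement.name] = replacement
        self._terminate_if_drained(name)

    def route_new_client(self):
        """Send a new client to the least-loaded edge that is not sunset."""
        candidates = [e for e in self.edges.values() if not e.sunset]
        if not candidates:
            return None
        edge = min(candidates, key=lambda e: e.clients)
        edge.clients += 1
        return edge.name

    def client_disconnected(self, name):
        edge = self.edges[name]
        edge.clients = max(0, edge.clients - 1)
        self._terminate_if_drained(name)

    def _terminate_if_drained(self, name):
        edge = self.edges.get(name)
        if edge is not None and edge.sunset and edge.clients == 0:
            del self.edges[name]  # all clients gone: terminate the edge


if __name__ == "__main__":
    pool = EdgePool([Edge("edge-a", clients=2), Edge("edge-b", clients=1)])
    pool.report_unresponsive("edge-a")
    print(pool.route_new_client())  # never edge-a while it is draining
    pool.client_disconnected("edge-a")
    pool.client_disconnected("edge-a")
    print(sorted(pool.edges))  # edge-a is gone once its clients have left
```

The key property the model captures is that existing viewers on a reported edge are not cut off; the edge only disappears after its last client leaves.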
Monitoring your application is important for making sure that everything is running smoothly. Effective monitoring identifies problems as soon as they emerge so that they can be addressed as quickly as possible. That kind of efficiency is essential to maintaining customer satisfaction by minimizing any negative effects on the end-user experience. Grafana, Prometheus, and the Red5 Pro Node Checker are all good tools for adding stream monitoring to your application.