Red5 Pro Autoscaling and Stream Manager

Red5 Pro enables developers to deploy servers in clusters, allowing virtually unlimited scaling for their live streaming applications. Red5 Pro features an autoscaling solution that can be deployed on a cloud platform such as Google Compute Engine or Amazon's AWS. The Stream Manager is a Red5 Pro Server Application that manages other Red5 Pro instances, using live stream information in real time to scale the overall streaming architecture up or down depending on the current load.

Red5 Pro Stream Manager's Autoscaler component manages streaming load efficiently by adding/removing Red5 Pro servers in real-time to meet dynamic traffic demands.


The following documents also refer to Autoscaling and the Red5 Pro Stream Manager:

> Red5 Pro Stream Manager API

> Red5 Pro Stream Manager User Guide

> Deploying Stream Manager and Autoscaling on Amazon Web Services (AWS)

> Deploying Stream Manager and Autoscaling on Google Cloud Compute


NOTE: Stream Manager concepts are covered in the Red5 Pro Stream Manager User Guide.


Design Overview


Autoscaling refers to the ability to scale server instances on the fly when you do not know the size of your network traffic or when traffic changes very dynamically. That said, autoscaling can also be used in a normal scenario where the network size is stable and traffic conditions are known in advance.

Autoscaling also optimizes operating costs by putting automatic cluster management in place: it monitors Nodes (Red5 Pro Server instances) and adds or removes Nodes to ensure efficient use of servers.

A smart autoscaling solution can also take care of server failure and recovery in real-time by providing backup instances as needed.

Failure recovery is not implemented as an explicit feature. However, if the loss of a Node causes load stress on the NodeGroup, the Autoscaler will automatically create a new node as a replacement and add it to the group, depending on the autoscaling load condition and the scale policy that the group uses. Node replacement also takes place if the failure causes the total node count to drop below the minimum count specified in the scale policy.

Red5 Pro Autoscaler is a software module built into Red5 Pro Stream Manager that is capable of smart Node management in real-time. Simply put, Autoscaler adds or removes server nodes - edges and origins - from a NodeGroup based on load conditions in real-time without manual intervention.


Red5 Pro Autoscaler Responsibilities

  1. Collect NodeGroup stats: Red5 Pro Stream Manager receives basic load statistics for each NodeGroup through the Red5CloudWatch servlet over HTTP. The stats are provided by the origin node in the NodeGroup.

  2. Use predefined policies for scaling behaviour: Autoscaler uses scaling policies (rules) that describe the basic norms for scaling a NodeGroup. A scale policy can be attached to a NodeGroup, and each NodeGroup can use a different scale policy.

  3. Add a server node to a NodeGroup: Autoscaler adds a new instance to a NodeGroup when the total load for the given node type (origin / edge) is nearing an overload condition and there is a shortage of sibling nodes of that role with available connection slots. Autoscaler uses the group's scale policy to determine how to scale the group outwards (a sketch of this check appears after this list).

  4. Remove a server node from a NodeGroup: Autoscaler removes an existing server instance from a NodeGroup. Autoscaler uses the group's scale policy to determine how to scale the group inwards.
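As a minimal sketch of the check described in item 3, assuming hypothetical names and fields (GroupLoad, slotsPerNode, policyMaxLimit) that are not part of the actual scale policy format:

```java
// Illustrative only: a hypothetical check for whether a NodeGroup needs another
// node of a given role (edge / origin) based on its reported load.
public final class ScaleOutCheck {

    /** Hypothetical snapshot of the load carried by one role within a NodeGroup. */
    public static final class GroupLoad {
        public final int activeNodes;        // current nodes of this role
        public final int currentConnections; // total connections they carry
        public final int slotsPerNode;       // assumed capacity of a single node
        public GroupLoad(int activeNodes, int currentConnections, int slotsPerNode) {
            this.activeNodes = activeNodes;
            this.currentConnections = currentConnections;
            this.slotsPerNode = slotsPerNode;
        }
    }

    /**
     * True when total load is nearing an overload condition (threshold as a
     * fraction of capacity, e.g. 0.8) and the policy still allows more nodes.
     */
    public static boolean shouldScaleOut(GroupLoad load, int policyMaxLimit, double threshold) {
        int capacity = load.activeNodes * load.slotsPerNode;
        boolean nearingOverload = capacity == 0
                || (double) load.currentConnections / capacity >= threshold;
        return nearingOverload && load.activeNodes < policyMaxLimit;
    }
}
```

For example, with a threshold of 0.8 this check would request a new edge once the group's edges are roughly 80% full, provided the policy's maximum node count for that role has not been reached.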


Scaling Strategies


Scaling plans outline the different types of strategies that autoscaling can implement for a given cluster or system-wide. Given below are a few possible scaling plans supported by the Autoscaler component.

Dynamic Scaling

Scaling automatically based on load conditions.

  • Autoscaler implements scaling outwards (edge / origin addition) automatically when a NodeGroup reports overload on group edges or origins respectively.

Autoscaler supports two distinct mechanisms for scaling outwards: the traditional (rigid) mode and the flexible competitive mode.

rigid mode: Autoscaler waits for a fixed estimated time (coolDownPeriod) before the next autoscale operation.

competitive mode: Autoscaler waits only a few seconds between consecutive autoscaling operations (if the alarm persists), which can be more efficient than waiting for the coolDownPeriod to expire. To use competitive mode, set autoscale.scaleout.mode to competitive; to use the traditional method, set autoscale.scaleout.mode to rigid.

The default mode is competitive.
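A minimal sketch of how the two modes could translate into different waits between scale-out attempts; the class and field names (ScaleOutThrottle, coolDownPeriodMs, competitiveWaitMs) are hypothetical and not the Stream Manager implementation:

```java
// Illustrative only: hypothetical helper showing how the two scale-out modes
// described above could translate into different wait intervals.
public final class ScaleOutThrottle {

    private final String mode;              // "rigid" or "competitive" (autoscale.scaleout.mode)
    private final long coolDownPeriodMs;    // fixed wait used by rigid mode
    private final long competitiveWaitMs;   // short wait used by competitive mode

    public ScaleOutThrottle(String mode, long coolDownPeriodMs, long competitiveWaitMs) {
        this.mode = mode;
        this.coolDownPeriodMs = coolDownPeriodMs;
        this.competitiveWaitMs = competitiveWaitMs;
    }

    /** Returns how long to wait before the next scale-out attempt. */
    public long nextWaitMs(boolean alarmStillActive) {
        if ("competitive".equalsIgnoreCase(mode)) {
            // Competitive mode: re-check after only a few seconds while the alarm persists.
            return alarmStillActive ? competitiveWaitMs : coolDownPeriodMs;
        }
        // Rigid mode: always honor the full cool-down period between operations.
        return coolDownPeriodMs;
    }
}
```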

  • Autoscaler implements scaling inwards (node removal) automatically when a node is underutilized or idle. Origins and edges are scaled in differently: for edges, the NodeGroup's origin requests a scale-in operation, whereas for origins, Stream Manager evaluates the scale-in condition and issues a scale-in on the best-suited origin node.

The autoscaling mechanism modes and other parameters are explained in detail in the Red5 Pro Stream Manager User Guide.


Autoscaling Lifecycle

The autoscaling life cycle generally comprises two types of operations: scale-out (expansion) and scale-in (contraction).

[Figure: Autoscaling Lifecycle]

Scale-Out

Adding a new Node to the NodeGroup:

[Figure: Autoscaling Lifecycle - Scale-Out]

PSEUDO FLOW

  1. Red5CloudWatch: Receives reports of a group's load status and stream statistics via the Red5CloudWatch servlet.
  2. Red5CloudWatchClient: Evaluates a possible alarm condition for the targeted node types (edges / origins).
  3. Autoscaler: Receives the alarm and checks for any overlapping autoscaling activities and other necessary conditions. If the autoscaling conditions are not satisfied, Autoscaler will not perform any autoscaling activity on the NodeGroup until they are. If the conditions are met, the system proceeds to scale outwards.
  4. Autoscaler: Reads in the Launch Configuration and Scale Policy for this group.
  5. Autoscaler: Prepares a new node for the group using the Launch Configuration and Scale Policy.
  6. Stream Manager: Tracks instance state transitions. A newly launched instance is tracked and its status is updated in the database until it becomes an active part of the targeted NodeGroup (a sketch of this tracking appears after this flow).
    • Newly launched instance: PENDING
    • Instance machine running but Red5 Pro not ready: RUNNING
    • Instance machine running and Red5 Pro ready: INSERVICE

Once the new node is ready it calls back Red5 Pro Stream Manager with its host IP address.

  1. Node: Checks role (Edge or Origin).
  2. Red5CloudWatchClient: Helps the new node determine its instance role and any parent(s) it has to link with.
  3. Red5CloudWatchClient: Performs dynamic clustering by associating the new edge / origin to other edge(s) / origin(s) in the NodeGroup.

At this point, the NodeGroup has N+1 nodes and the load alarm condition drops below the threshold.
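A minimal sketch of the state tracking from step 6, assuming a simple callback-driven tracker; the class and method names (LaunchTracker, onNodeCallback) are hypothetical:

```java
// Illustrative only: hypothetical tracker that records the state of a newly
// launched instance until it joins the NodeGroup as an active member.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class LaunchTracker {

    private final Map<String, String> nodeStates = new ConcurrentHashMap<>();

    /** Called when the cloud platform confirms the launch request. */
    public void onLaunched(String nodeId) {
        nodeStates.put(nodeId, "PENDING");
    }

    /** Called when the platform reports the VM has booted (Red5 Pro not yet ready). */
    public void onMachineRunning(String nodeId) {
        nodeStates.put(nodeId, "RUNNING");
    }

    /** Called when the node calls back with its host IP, i.e. Red5 Pro is ready. */
    public void onNodeCallback(String nodeId, String hostIp) {
        nodeStates.put(nodeId, "INSERVICE");
        // At this point the node would be clustered (edge linked to origin, etc.).
    }
}
```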


Scale-In

Removing a Node from a NodeGroup:

[Figure: Autoscaling Lifecycle - Scale-In]

  1. Red5CloudWatchClient: For edge scale-in, an origin places a call to Red5 Pro Stream Manager to remove an edge. For origin scale-in, Stream Manager monitors the underutilization condition and triggers a scale-in of origins.
  2. Autoscaler: Attempts to remove the instance.
  3. Autoscaler: Retrieves the Launch Configuration and Scale Policy for this group.
  4. Autoscaler: Checks the Scale Policy.
  5. Autoscaler: Requests node removal.
  6. Stream Manager: Tracks instance state transitions. A terminating instance is tracked and its status is updated in the database until the machine shuts down on the cloud platform and its reference is removed from the database.
    • Pending deletion: TERMINATING
  7. The record is removed from the database once the cloud platform reports that the instance was shut down properly.

The group now has N-1 nodes.

Autoscaler uses the group's scale policy to automatically keep track of the minimum number of nodes (by role) that it must maintain at all times, as sketched below.
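A minimal sketch of such a guard, assuming a hypothetical ScalePolicy view with a per-role minimum (getMinLimit); the real policy format and checks are defined by Stream Manager:

```java
// Illustrative only: hypothetical guard that blocks a scale-in request
// when removing a node would drop the group below the policy minimum.
public final class ScaleInGuard {

    /** Hypothetical view of a group's scale policy limits for one role. */
    public interface ScalePolicy {
        int getMinLimit(String role); // e.g. minimum "edge" or "origin" count
    }

    public static boolean canRemove(ScalePolicy policy, String role, int activeCountForRole) {
        // Only allow removal if the group stays at or above the configured minimum.
        return activeCountForRole - 1 >= policy.getMinLimit(role);
    }
}
```

With this kind of guard, a scale-in request for an edge would be rejected if the group is already at the minimum edge count defined in its policy.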


Cloud Platform Support

Red5 Pro Stream Manager provides tight integration with Google Cloud Compute Platform and Amazon Web Services for managing Red5 Pro instances.

The server administrator needs to create a Red5 Pro image on the platform. Red5 Pro Stream Manager is then able to generate live instances from this image and use them in real time.

Each cloud platform where the system runs provides a platform-specific API for interacting with machines directly. Since this varies across cloud platforms, the interaction is standardized through a common interface for communicating with the external APIs. Generally this interaction involves operations such as starting a new instance, terminating an instance, and tracking instance states while an instance is starting up or terminating.

The current version of Red5 Pro Stream Manager supports the following operations on Google Cloud Platform and Amazon Web Services via the streammanager-cloudplatform API bridge (see the interface sketch after the list).

  1. Create Instance
  2. Delete Instance
  3. Read Instance
  4. Update Instance (Metadata only)
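A minimal sketch of what a platform-neutral bridge exposing these four operations could look like; the interface and type names (CloudPlatformBridge, CloudInstance) are hypothetical and not the actual streammanager-cloudplatform API:

```java
// Illustrative only: hypothetical shape of a cloud-platform bridge exposing
// the four operations listed above in a platform-neutral way.
public interface CloudPlatformBridge {

    /** Launch a new VM from the prepared Red5 Pro image. */
    CloudInstance createInstance(String nodeGroupId, String role, String launchConfigName);

    /** Terminate a running VM. */
    void deleteInstance(String instanceId);

    /** Read the current details and state of a VM. */
    CloudInstance readInstance(String instanceId);

    /** Update instance metadata only (tags / labels), never the machine itself. */
    void updateInstanceMetadata(String instanceId, java.util.Map<String, String> metadata);
}

/** Hypothetical value object returned by the bridge. */
final class CloudInstance {
    final String instanceId;
    final String host;
    final String state; // normalized state name, e.g. PENDING / RUNNING

    CloudInstance(String instanceId, String host, String state) {
        this.instanceId = instanceId;
        this.host = host;
        this.state = state;
    }
}
```

A concrete bridge per platform (AWS, Google Cloud) would implement this interface against that platform's own SDK or REST API.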

Red5 Node Lifecycle

For scale-in and scale-out to work properly, Stream Manager needs to track the different states of an instance via the cloud platform API. Since instances are not directly accessible to Stream Manager, it leverages the cloud platform's API to launch instances and track changes in instance state.

Each cloud platform provides different instance states with distinct semantics attached to each state. Red5 Pro Stream Manager normalizes the instance states provided by the platform into its own set of states for internal use. This allows for cross-platform integration with any cloud service. The normalized states are listed below, with a simple enum sketch after the list.

[Figure: Red5 Node Lifecycle]

Node Lifecycle States

  • PENDING: Instance has just been launched and is starting up
  • RUNNING: Instance has completed boot procedure but Red5 Pro has not started
  • INSERVICE: Red5 service on the instance is ready and reachable
  • TERMINATING: Instance is ordered to terminate and is now terminating / shutting down
  • NULL: State unknown (Optional)
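As a simple Java sketch of these normalized states, with an example mapping from raw cloud-platform state names (the mapping shown is an assumption for illustration, not Stream Manager's internal table):

```java
import java.util.Map;

// Illustrative only: the normalized node states listed above, plus an assumed
// example mapping from raw cloud-platform state names to these values.
public enum NodeState {
    PENDING,      // instance launched, still starting up
    RUNNING,      // VM booted, Red5 Pro not yet ready
    INSERVICE,    // Red5 Pro ready and reachable
    TERMINATING,  // instance ordered to terminate / shutting down
    NULL;         // state unknown (optional)

    // Example mapping for AWS EC2-style state names; other platforms would map
    // their own state strings (e.g. GCE PROVISIONING / STAGING) the same way.
    private static final Map<String, NodeState> FROM_CLOUD = Map.of(
            "pending", PENDING,
            "running", RUNNING,
            "shutting-down", TERMINATING,
            "stopping", TERMINATING,
            "terminated", TERMINATING);

    public static NodeState fromCloudState(String raw) {
        return FROM_CLOUD.getOrDefault(raw == null ? "" : raw.toLowerCase(), NULL);
    }
}
```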

Node Lifecycle Flow

  • Instance Launch | State - PENDING
  • Instance started on cloud | State - RUNNING
  • Red5 Pro ready | State - INSERVICE
  • Instance terminate | State - TERMINATING
  • Instance terminated | Node is removed from database

Red5 Pro Stream Manager Features

  • Red5 Pro Node Management
  • Cloud platform instance management
  • NodeGroup load monitoring
  • RDS (Database) interaction service
  • Stream statistics collection and storage
  • Autoscaling
  • REST services for operations

Red5 Pro Stream Manager Components

Cloud Platform Bridge

Handles communication with cloud platform for instance lifecycle management.

Database Layer

Manages communication with your RDS data source for database operations.

REST API Layer

Allows simple cluster administration through Stream Manager and provides a simple public interface for co-ordinating between publishers and subscribers for stream production and consumption.

Autoscaler

Handles automatic scaling operations to expand and contract your server fleet.

Policy Manager

Manages your scaling policies in Stream Manager's policy store.

Configuration Manager

Manages launch configurations in Stream Manager's configuration store.

Instance Manager

Handles creation and termination of cloud server instances directly from Stream Manager.

Red5CloudWatch

Monitors your Red5 Pro instances for availability and traffic status.


Red5 Pro Stream Manager Stream Operations

Streaming Lifecycle: Broadcast to Subscribe Streaming Operations

[Figure: Red5 Pro Stream Manager Stream Operations]