Iotellect Server Failover

An Iotellect Server failover cluster includes:

  • A single Master Server

  • One or more Failover Servers

  • A clustered or replicated database shared by all servers

The failover servers are activated when the master server fails, e.g. due to:

  • Network outage

  • Hardware failure

  • Operating System crash

  • Iotellect Server failure

  • Shortage of disk space

  • Any other reason

Failover Server Modes

Failover servers may work in Normal Mode or Read-only Mode. The difference between these modes is explained below. The mode of a failover server is controlled by the Failover Mode global cluster setting.

Normal Failover Mode

In Normal Failover Mode, the failover server takes full control of the Iotellect cluster upon Master Server failure. It controls and monitors the devices, services operator connections, etc. All configuration changes and events are stored in the database and will be available to the Master server once it becomes operative again.

Read-only Failover Mode

In Read-Only Failover Mode, the failover server does not make any changes to the underlying database. At first glance it behaves like a Normal Failover server: devices are controlled, their configuration settings may be changed by operators, actions can be executed, and all system functions are available. However, no configuration changes or events are stored in the database. This causes several limitations:

  • Historical events received by the Read-Only Failover Server will not be available, e.g. when browsing event history or building charts

  • All configuration changes will be lost if the Master Server is re-activated or the Failover Server is restarted

Read-only Failover nodes are very useful for quickly restoring cluster reliability after a permanent Master node failure:

  • Make the Normal Failover node the new Master node

  • Make a Read-only Failover node the new Normal Failover node by editing its Failover Mode global cluster setting

The above operations take mere minutes, yet the cluster remains protected against a failure of the new Master node. A new Read-only Failover node can then be set up without any rush.

Only one Normal Failover node is allowed within a high availability cluster. Other failover nodes must work in Read-Only Mode.
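The role rules above can be sketched in a few lines of Python. This is only an illustration, not part of the product: the node names, the role strings, and the functions validate_roles and recover_from_master_loss are hypothetical, and in a real cluster the roles are changed through the global cluster settings rather than in code.

```python
from collections import Counter

# Role labels used only in this sketch (hypothetical strings, not product values).
MASTER = "Master"
NORMAL = "Normal Failover"
READ_ONLY = "Read-only Failover"


def validate_roles(roles):
    """Return a list of problems found in a {node_name: role} mapping."""
    counts = Counter(roles.values())
    problems = []
    if counts[MASTER] != 1:
        problems.append(f"expected exactly 1 Master node, found {counts[MASTER]}")
    if counts[NORMAL] > 1:
        problems.append(
            f"only 1 Normal Failover node is allowed, found {counts[NORMAL]}"
        )
    return problems


def recover_from_master_loss(roles, lost_master):
    """Plan the new roles after the Master node is permanently lost.

    The Normal Failover node becomes the Master, and one Read-only
    Failover node (if any) becomes the new Normal Failover node.
    """
    if roles.get(lost_master) != MASTER:
        raise ValueError(f"{lost_master!r} is not the Master node")
    new_roles = {name: role for name, role in roles.items() if name != lost_master}
    normal = [name for name, role in new_roles.items() if role == NORMAL]
    if not normal:
        raise ValueError("no Normal Failover node available to promote")
    read_only = sorted(name for name, role in new_roles.items() if role == READ_ONLY)
    new_roles[normal[0]] = MASTER
    if read_only:
        new_roles[read_only[0]] = NORMAL
    return new_roles


if __name__ == "__main__":
    cluster = {"node-a": MASTER, "node-b": NORMAL, "node-c": READ_ONLY}
    print(validate_roles(cluster))
    print(recover_from_master_loss(cluster, "node-a"))
```

After the planned change the cluster has two nodes again (a Master and a Normal Failover node), which is the state in which a new Read-only Failover node can be added at leisure.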

Failover Scenarios

This section describes several common Failover Cluster configurations. Note that the database cluster is shown as a "cloud" in the images below. In practice, the databases participating in the database cluster will run on the same physical servers as the Iotellect Server installations.

Two Nodes

The most common failover cluster configuration includes two servers: a Master Server and a Normal Failover Server. Once the Master fails, the Failover server switches to Failover Master mode and takes over the Master's operations.

Three Nodes

A three-node failover cluster helps to maintain system reliability even when the Master server has failed.

If the Master server fails, the three-node cluster works like a two-node cluster. This protects against a failure of the Failover Master and gives system administrators time to restore three-node operation.

Failover Cluster Setup

Perform the following steps to set up an Iotellect Server failover cluster:

  • Install two or more copies of Iotellect Server on different physical servers. Each instance of Iotellect Server will be referred to as a Node.

  • Configure the same data storage to be used by every node, i.e. configure all nodes to use a single shared (and possibly clustered/replicated) database.

Using a single (clustered or non-clustered) database for all Iotellect Server cluster nodes is an absolute requirement.

After startup, the master node should start normal operations, while failover nodes should switch to standby mode and show the "Monitoring Master Server Status" message on their splash screens.

Failover Mode Operation

If a Master node fails, it stops performing the regular cluster node updates called "heart beats". The absence of these updates is noticed by the Failover nodes. If no Master heart beats occur for longer than the Node Failure Detection Time, the failover nodes are activated and start servicing normal system activities, such as device control operations and operator actions.

The service interruption interval equals the sum of the Node Failure Detection Time and the failover node activation time. This gap is typically less than a minute.
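The detection rule and the interruption arithmetic can be illustrated with a minimal Python sketch. It is not the server's implementation: the function names are hypothetical, timestamps are plain seconds, and the numbers in the usage part are illustrative values rather than product defaults.

```python
def master_failed(last_heartbeat, now, detection_time):
    """True when no Master heart beat arrived for longer than detection_time.

    All arguments are in seconds; last_heartbeat and now are timestamps
    taken from the same clock.
    """
    return (now - last_heartbeat) > detection_time


def service_interruption(detection_time, activation_time):
    """Interruption interval: detection time plus failover activation time."""
    return detection_time + activation_time


if __name__ == "__main__":
    # Illustrative values only: 30 s detection time, 20 s activation time.
    print(master_failed(last_heartbeat=100.0, now=120.0, detection_time=30.0))
    print(master_failed(last_heartbeat=100.0, now=131.0, detection_time=30.0))
    print(service_interruption(detection_time=30.0, activation_time=20.0))
```

With these illustrative values the Master is considered failed only once more than 30 seconds pass without a heart beat, and the total interruption is 50 seconds.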

Disconnection of Failover Nodes

If a failover node is disconnected from the cluster, e.g. for an update, cluster operation continues without any change. However, the Master server constantly monitors the heart beats of the Failover nodes. If no failover nodes seem to be alive for longer than the Node Failure Detection Time, the Master server fires a warning event in the Administration context.
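The warning condition on the Master side differs from the failover-side check: a single silent failover node is not enough, all of them must be silent. A minimal sketch of that condition follows, assuming a hypothetical mapping of node names to the timestamp of their last heart beat; the function name and node names are illustrative only.

```python
def no_failover_alive(last_seen, now, detection_time):
    """True when every failover node has been silent longer than detection_time.

    last_seen maps a failover node name to the timestamp (in seconds) of
    its last heart beat. An empty mapping also yields True, since no
    failover node is alive in that case either.
    """
    return all((now - seen) > detection_time for seen in last_seen.values())


if __name__ == "__main__":
    # One node was taken down for an update, the other still sends heart beats.
    print(no_failover_alive({"failover-1": 10.0, "failover-2": 95.0}, 100.0, 30.0))
    # Both nodes have been silent for more than 30 seconds.
    print(no_failover_alive({"failover-1": 10.0, "failover-2": 50.0}, 100.0, 30.0))
```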

Failover Alert

Once the Master Server of an Iotellect failover cluster fails, the Failover node raises a Failover Alert. This helps to quickly notify system administrators of the situation. By default, an e-mail message is sent to the administrators. However, it is recommended to configure an SMS message to be sent in case of a Failover Alert. See Alert SMS Notifications for details.

Making Failover Server a Master

In some rare cases the Master server may be completely lost in a severe accident, e.g. due to a major hardware failure. In this case it's necessary to make one of the Failover nodes the Master node.

To switch a Failover node to be the Master node:

  • Set up a new Failover node to preserve the total number of nodes in the cluster. This new Failover node must have the same configuration as the Failover node being changed to Master.

  • Change the Cluster Role global configuration setting of the Failover node being promoted to Master

  • For all nodes in the cluster, update the Heartbeat Interface IP Address, Heartbeat Port, and Heartbeat Addresses of Other Nodes to reflect the IP address and port of the new Master and the replacement Failover node.

  • Restart the new Master node
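The heartbeat update in the third step is easy to get wrong by hand, because every node must list all other nodes. The sketch below derives the three settings for each node from a single node table. It is an illustration only: the node names, IP addresses, port number, and the "ip:port" address format are assumptions, and the resulting values still have to be entered into each node's configuration.

```python
def heartbeat_settings(nodes):
    """Build per-node heartbeat settings from a {name: (ip, port)} table.

    The "ip:port" format of the other nodes' addresses is an assumption
    made for this sketch.
    """
    settings = {}
    for name, (ip, port) in nodes.items():
        others = [
            f"{other_ip}:{other_port}"
            for other_name, (other_ip, other_port) in nodes.items()
            if other_name != name
        ]
        settings[name] = {
            "Heartbeat Interface IP Address": ip,
            "Heartbeat Port": port,
            "Heartbeat Addresses of Other Nodes": others,
        }
    return settings


if __name__ == "__main__":
    # Hypothetical addresses of the new Master and the replacement Failover node.
    nodes = {
        "new-master": ("10.0.0.2", 5000),
        "replacement-failover": ("10.0.0.3", 5000),
    }
    for name, values in heartbeat_settings(nodes).items():
        print(name, values)
```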

Configuring Client for Failover

To prepare Iotellect Client to work in the clustered environment:

  • Create two or more server connections in your workspace: the first one for the Master Server and the others for the Failover servers. Specify the addresses of the Master and Failover servers in the connection settings.

  • Disable connections to the Failover servers to suppress startup connection errors. This is necessary since Failover nodes won't accept Iotellect Client connections while working in standby mode.

  • Once the Master node fails (you'll see connection errors), just enable one of the failover connections.
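To decide which connection to enable, it can help to check which server currently accepts connections. The following sketch is not an Iotellect Client feature: it uses only a plain TCP connection attempt from the Python standard library, and the host names and port number are placeholders. Because standby Failover nodes do not accept client connections, the first reachable entry is the node that is currently active.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(servers, probe=is_reachable):
    """Return the first (host, port) pair accepted by probe, or None."""
    for host, port in servers:
        if probe(host, port):
            return (host, port)
    return None


if __name__ == "__main__":
    # Placeholder addresses: Master first, then the Failover servers.
    servers = [("master.example", 9000), ("failover.example", 9000)]
    print(first_reachable(servers))
```

The probe argument makes it possible to replace the TCP check with another one, for example in a test.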

Configuring Web UI for Failover

Once the Master node of the high availability cluster fails, system operators will no longer be able to log in to the Web UI at its usual address, since the IP address and host name of the Failover node differ from those of the failed Master node. There are two resolutions for this issue:

  • All operators may manually navigate to the URL of the Failover node. In practice, this URL may be bookmarked in their browsers for emergency cases.

  • It is possible to set up automatic DNS redirection. Search the Internet for "DNS failover" to find available solutions. Here is just one useful link: http://www.simplefailover.com/scenario3.aspx

Cluster Servers Security

All servers in the cluster have access to all information flowing inside Iotellect. Therefore, the same security precautions should be taken for both Master and Failover servers.
