iWay Hot Backup Listener

The iWay Hot Backup listener is a distributed backup facility. In distributed backup, iWay Service Manager (iSM) can be configured to operate normally with its own complement of listeners, and also to carry a configuration of listeners and services that defines the work to be done if a monitored iSM fails. In this configuration, the iSMs share the workload in normal operation. If one iSM fails unexpectedly, the other can invoke the required services to accomplish the tasks of the failing iSM.

For example, assume that there are two iSM instances in the enterprise. The first is configured with four active listeners, A, B, C, and D. The second has its own listeners, E, F, G, and H. In this case, the first iSM instance would also be configured with E, F, G, and H, but these would be marked as inactive. Similarly, the second iSM would be configured with A through D, marked inactive. As these two iSMs run, each handles its own four listeners. If either iSM fails, the other marks its inactive listeners as active and begins to handle the workload of the failed iSM.
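The takeover in this example can be sketched as a small model. This is only an illustration of the active/inactive bookkeeping described above; the `Instance` class and its `take_over` method are hypothetical names and are not part of iSM.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Toy model of one iSM instance: listener name -> active flag."""
    name: str
    listeners: dict = field(default_factory=dict)

    def active(self):
        """Return the names of the listeners currently marked active."""
        return sorted(n for n, on in self.listeners.items() if on)

    def take_over(self, failed):
        """Mark as active every listener that was active on the failed instance."""
        for listener in failed.active():
            if listener in self.listeners:
                self.listeners[listener] = True


own_first, own_second = "ABCD", "EFGH"
# Each instance carries all eight listeners, but only its own four are active.
first = Instance("first", {**{n: True for n in own_first},
                           **{n: False for n in own_second}})
second = Instance("second", {**{n: False for n in own_first},
                             **{n: True for n in own_second}})

second.take_over(first)  # assume the first instance has failed
print(second.active())
```

After the takeover, the second instance handles all eight listeners, while its own four keep running unchanged.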

To accomplish this, install the Hot Backup listener, available from iWay Software. This adds a new listener type, Backup, to the list of available listeners. The Backup listener must be configured on the iSM that is to handle the monitoring and recovery of the four listeners. No additional software is required on the iSM being monitored. The iSM being monitored needs to point to the server on which the Backup listener is running through the backup settings screen on the console.

The Hot Backup listener can monitor multiple configurations. Each monitored configuration specifies the host name and port number of the Hot Backup listener. The generated state signal identifies the host and configuration that was being monitored when the state changed.

Application designers should be aware that when multiple servers are being monitored, more than one server can fail at the same time (for example, when a section of the network fails), which triggers more than one copy of the backup process flow. Application designers should also remember that a process flow can be selected by routing that examines the host or configuration attributes.
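As a rough sketch of why the host and configuration attributes matter, the following model keys a recovery plan on the (host, config) pair, so that simultaneous failures each select their own set of listeners. The table `RECOVERY_PLANS`, the function name, and the host name `serverB` are assumptions made for illustration; in iSM this selection would be done by routing rules, not by Python code.

```python
# Hypothetical recovery plans keyed by (host, config) of the monitored iSM.
RECOVERY_PLANS = {
    ("serverA", "base"): ["A", "B", "C", "D"],
    ("serverB", "base"): ["E", "F", "G", "H"],
}


def listeners_to_start(dead_signals):
    """Return the listeners to activate for each failed (host, config) pair."""
    plan = {}
    for host, config in dead_signals:
        # An unknown pair gets an empty plan instead of raising an error.
        plan[(host, config)] = RECOVERY_PLANS.get((host, config), [])
    return plan


# Two servers fail together, so two independent plans are produced.
simultaneous = [("serverA", "base"), ("serverB", "base")]
for key, listeners in listeners_to_start(simultaneous).items():
    print(key, "->", listeners)
```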

You can use the following diagram for reference purposes during the configuration stages of the iWay Hot Backup listener.

Configuring Backup Settings

To access the Backup Settings pane, click Backup Settings under the Server section in the iSM console.

iSM can be deployed to fail over automatically to another waiting machine, usually referred to as a hot backup host. Simple failover relies on the native ability of iWay to emit and respond to heartbeat messages, which signify normal operation of the primary server. More sophisticated backup can be configured through the Backup listener on the backup server.

Using the Backup Settings page, provide a value for the Location of Backup field with the host name and port number of the iSM that is monitoring this instance of iSM, as shown in the following image.

The following table lists and describes the parameters that are found in the Backup Settings page:

| Parameter | Description |
|---|---|
| Location of backup | Location of the live system failover partner. This is the server that monitors this iSM server. Specify the host name and port number (host:port) of the server to which heartbeat signals are sent. The port must match the port of the Backup listener configured on the system that monitors this server. |
| Heartbeat port | The port to listen on for the live system heartbeat. This entry applies only when operating in backup (-b) mode. |
| Threshold | Period for the backup to tolerate no heartbeat. This entry applies only when operating in backup (-b) mode. |

In this example, this iSM is being monitored by a computer named otherbox on port 1559. Do not fill in the other fields, because they are not used by the Backup listener described here. Those fields instruct the iSM that it is a full backup and prevent any processing until monitoring on the heartbeat port detects a failure. This means that distributed backup cannot operate on a computer that is monitoring for full backup.
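The host:port form of the Location of backup value can be checked with a short helper before it is entered. This is a minimal sketch; `parse_location` is a hypothetical function written for this example and does not reflect how iSM itself validates the field.

```python
def parse_location(value):
    """Split a 'host:port' string into (host, port) and validate both parts."""
    host, sep, port = value.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number


# The value used in this example: monitored by otherbox on port 1559.
print(parse_location("otherbox:1559"))
```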

While the iSM is operating, a heartbeat is sent to the backup iSM. In this case it is sent to otherbox.

Configuring the Backup Listener

The backup listener on the otherbox machine performs the task of monitoring the iWay server, serverA. To participate in distributed backup, a backup listener needs to be added and configured. The listener properties are explained below:

| Property Name | Property Type | Property Description |
|---|---|---|
| Port* | integer | Port (socket) number on which messages are exchanged. This must match the port number specified in the Location of backup field in the Backup Settings pane. |
| Local Bind Address | string | Local bind address for multi-homed hosts. Usually left empty. Note: The Local Bind Address is used for multi-homed networks and is not considered further here. |
| Threshold | integer | Period, in seconds, for the backup to tolerate no heartbeat. For example, if the threshold is set to 6, the listener waits 6 seconds without a heartbeat from the monitored server before it generates a DEAD signal. The default is 60 seconds. Set the value to suit the speed of the network. See the note that follows this table. |
| Whitespace Normalization | choice | Specifies how the parser treats white space in element content. Choose preserve to turn off all normalization, as prescribed by the XML Specification. Choose condense to remove extra white space in pretty-printed documents and for compatibility with earlier versions. |
| Record in Activity Log(s) | boolean | If set, activity on this channel is recorded in the activity logs. Otherwise, the activity is not recorded. |

Note: A heartbeat is generated every tenth of a second by the monitored iSM and sent to the monitoring server. When the monitored server has not sent a heartbeat for the period specified by the Threshold value, the monitoring server sends a final message to confirm whether the monitored server is DEAD. If it receives no response, the monitoring server determines that the server is DEAD and generates a message accordingly.

The listener generates messages that can be handled by a normal iSM workflow or workflows of varying complexity. The dispatched messages are in XML, and take the form:

<signal host='hostname' config='configname'>state</signal> 

where:

hostname

Is the name of the system where iSM is running.

configname

Is the name of the iSM configuration being used (for example, base).

state

Can be LIVE, DEAD, or CLOSE.
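To illustrate the message form, the following sketch parses a signal document with the Python standard library and returns its three parts. The function name `parse_signal` and the sample host name are assumptions for the example; inside iSM the message would be examined by the process flow itself.

```python
import xml.etree.ElementTree as ET

VALID_STATES = {"LIVE", "DEAD", "CLOSE"}


def parse_signal(xml_text):
    """Return (host, config, state) from a backup listener signal message."""
    root = ET.fromstring(xml_text)
    if root.tag != "signal":
        raise ValueError(f"unexpected root element: {root.tag}")
    state = (root.text or "").strip()
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    return root.get("host"), root.get("config"), state


message = "<signal host='serverA' config='base'>DEAD</signal>"
print(parse_signal(message))
```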

The following table describes each state in more detail.

| State | Description |
|---|---|
| LIVE | The monitored iSM has gone live. This message appears when the monitored iSM resumes or begins operation. If the monitoring iSM starts first, this message is dispatched when the monitored iSM starts. |
| DEAD | The monitored iSM has not sent a heartbeat for the threshold duration, and the monitored iSM was previously in a LIVE state. |
| CLOSE | The monitored iSM has signaled that it is terminating normally. If it restarts, you will receive a LIVE message. Once a monitored iSM signals CLOSE, missed heartbeats are ignored. Heartbeats are monitored only when the monitored iSM is in a LIVE state. |
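The state rules above can be summarized as a small state machine. The sketch below is a simplified model and not the listener's actual implementation: the class and method names are hypothetical, timestamps are passed in explicitly so the behavior is easy to follow, and the final reconfirmation message that the monitoring server sends before declaring DEAD is omitted.

```python
class HeartbeatMonitor:
    """Simplified model of the LIVE / DEAD / CLOSE rules for one monitored iSM."""

    def __init__(self, threshold=60.0):
        self.threshold = threshold  # seconds to tolerate no heartbeat
        self.state = None           # nothing heard from the monitored iSM yet
        self.last_beat = None

    def heartbeat(self, now):
        """Record a heartbeat; emit LIVE only when the state changes."""
        self.last_beat = now
        if self.state != "LIVE":
            self.state = "LIVE"
            return "LIVE"
        return None

    def close(self):
        """The monitored iSM reported a normal shutdown."""
        self.state = "CLOSE"
        return "CLOSE"

    def tick(self, now):
        """Check for a missed threshold; only a LIVE server can become DEAD."""
        if self.state == "LIVE" and now - self.last_beat > self.threshold:
            self.state = "DEAD"
            return "DEAD"
        return None


monitor = HeartbeatMonitor(threshold=6)
for event in (monitor.heartbeat(0.0), monitor.tick(5.0), monitor.tick(7.0)):
    print(event)
```

Because `tick` tests for the LIVE state first, missed heartbeats after a CLOSE or a DEAD signal produce no further messages, which matches the table.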

The dispatched messages are used to process the signals coming from the monitored iWay server. This can be done by:

  • Using the control agent (XDControlAgent). Your workflow can post internal messages to start or stop other named listeners on the monitoring server. In this way, the monitoring iSM controls the startup and shutdown of listeners on its own server as the state of the monitored iSM changes. The agent accepts a list of configured listener names and posts a start or stop message, as selected in the Action parameter.

  • Using an email emit agent to send email notifications when the server comes down or starts up.

You are not limited to using the control agent. Any form of messaging system that can alert a manager about a change in status or request to perform specific actions will suffice.

Selecting the appropriate action to take based on the dispatched message(s) is accomplished by standard iSM routing and switching control, based on an analysis of the incoming message.

Procedure: How to Create a Backup Listener

To create a Backup listener:

  1. Start the iSM.
  2. Once the iSM is open, click Registry in the top frame.

    The Registry section is displayed.

  3. Under the Components section on the left side, select Listeners.

    The Listeners pane is displayed.

  4. Click Add, to add a new listener.
  5. Select backup from the Type drop-down list, as shown in the following image.

    Configuration parameters for the listener are displayed.

  6. Provide values for the configuration parameters according to the following table and click Next.

    | Parameter | Value |
    |---|---|
    | Port * | 1559 |
    | Local bind address | Do not specify a value for this parameter |
    | Threshold | 60 |
    | Whitespace Normalization | preserve |
    | Record in Activity Log(s) | true |

  7. Enter the name of the listener as Backup and leave the description empty.
  8. Click Finish to create the listener.

    The backup listener is created.

Procedure: How to Configure the Backup Listener Channel and the System Being Backed Up

This section describes how a backup listener can be used to monitor a system.

  1. Construct an inlet that consists of a backup listener on the machine that is performing the backup.

    The backup listener must be configured on the machine that is performing the backup.

  2. Construct a process flow that consists of a QAAgent configured with the properties in the following image.
  3. Add the process flow, QAprocess, to a route called QAroute, as shown in the following image.
  4. Construct a channel called testbackup, which consists of the inlet, the route QAroute, and a default outlet.
  5. Build and deploy the channel.
  6. Start the channel.
  7. Let ServerA be the machine that needs to be backed up.
  8. Stop ServerA, then start it from the command line.
  9. Restart the server using the restart link in the iSM console.

    The channel, testbackup, on the monitoring server, otherbox, processes one message when the monitored server stops and another when it starts. The contents of the messages can be viewed in the trace files created by the QAAgent. For example, the following trace file entries show the signal state.

    <?xml version="1.0" encoding="UTF-8" ?><debug>Out_trace.txt
    <document> <signal errors="0">CLOSE</signal> </document> 
    <?xml version="1.0" encoding="UTF-8" ?><debug>Out_trace.txt 
    <document> <signal errors="0">LIVE</signal> </document>

    The message can also be processed using xpath(/signal) as described in the next section. Similarly, the channel processes messages when the restart option is clicked from the console.
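Outside iSM, the same lookup can be tried against the traced document with the limited XPath support in the Python standard library. This is only a sketch: `signal_state` is a hypothetical helper, and it is not the iSM xpath() function. It accepts either a bare signal element or one wrapped in a document element, as in the trace entries above.

```python
import xml.etree.ElementTree as ET


def signal_state(xml_text):
    """Return the text of the signal element, whether or not it is wrapped."""
    root = ET.fromstring(xml_text)
    node = root if root.tag == "signal" else root.find(".//signal")
    if node is None:
        raise ValueError("no signal element found")
    return (node.text or "").strip()


traced = '<document> <signal errors="0">CLOSE</signal> </document>'
print(signal_state(traced))
```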

Monitoring Server State Through the Backup Listener

As the state of the monitored iSM changes, the monitoring iSM dispatches messages.

For instance, consider the following process flow, in which a decision switch invokes the control agent based on the state of the monitored iSM. The switch tests the signal element in the XML message generated by the backup listener.

The Status switch is a decision-based object, which is configured as follows:

The control agent Start Channels is invoked when the listeners need to be started, that is, when they are no longer running on the configured iSM. The edges are configured as follows:

In the following example, the control agent processes the DEAD and CLOSE signals and accordingly starts the channels (Notification_Channel and DashBoard_Channel), as required by the monitoring backup application (backup listener). The control agent Start Channels is configured as shown in the following figure.

Procedure: How to Start the Notification and Dashboard Channels

This example shows how to start the channels (Notification_Channel and DashBoard_Channel).

  1. Start iWay Service Manager.
  2. Publish the process flow (Failover_Processing).
  3. Add the process flow (Failover_Processing) to a route, and call it Failover_Route.
  4. Configure the Backup listener to monitor the desired iWay server.
  5. Add the listener to an inlet called backupinlet.
  6. Click Registry in the top pane.
  7. Click Channels under the Conduits section.
  8. Build, deploy, and start the backup channel.
  9. Stop the monitored iWay server.

    The channels (Notification_Channel and DashBoard_Channel) are started when the monitored application is stopped and the signal is generated by the backup listener. The backup listener is monitoring ServerA.

Hot Backup Use Cases

The use case in this section describes how to add distributed Hot Backup capability to provide automatic failover for the primary server.

An iSM server instance must be able to fail over automatically to a backup iSM server instance in the event of failure. It is also desirable for the backup server to run actively and perform other server tasks, which maximizes server resources while it waits to pick up the workload of a failed server.

In this scenario, there are two deployed channels that contain two process flows:

  • One process flow uses a File listener to pick up a batch of files containing financial records and then transforms the records to SWIFT XML files.
  • A second process flow uses a File listener to pick up the SWIFT XML files and then transforms them to standard SWIFT FIN formats.

Note: These channels are deployed to the primary iSM server instance and are in their normal Started and Active states.

To provide for failover in the event of a failure:

  • Install an iSM server instance on a different machine with the same configuration as the primary. This server is designated as the backup iSM server instance.
  • Deploy the same channels to the backup server, but put them in the Stopped and Inactive states.
  • Deploy a Backup listener on the backup server.
  • Finally, update the primary server to include a Backup setting that points to the backup server.

Hot Backup Use Case Architecture

The following diagram illustrates the normal operation of a primary iSM server.

The following diagram illustrates the failover operation when the primary iSM server has failed.

Configuring the Primary iSM Server Instance

To navigate to the Backup system properties, go to the iAM console (blue console). Click Configuration, System Properties, then Backup.

On this screen, enter the name of the backup server and its port number. The port number is the port specified in the Backup listener configured on the backup server.

  • A heartbeat is generated approximately every tenth of a second by the monitored iSM and sent to the monitoring server.
  • The heartbeat rate could be affected if the system clock were to slow down, but the workload of the server should not have any real impact on it.

Deploying the Primary iSM Server Instance

Channels are deployed as usual, as shown in the image below.

Backup iSM Server Instance

The same channels that are deployed on the primary server are deployed on the backup server, but in the Stopped and Inactive states. They are activated only when a failover from the primary server is detected. The backup.channel detects failover by listening for heartbeats from the primary server.

The following image illustrates normal operation, in which heartbeats are being received.

The following image illustrates a backup operation that did not receive heartbeats.

Configuring the Backup Channel

The backup.channel consists of a backup listener and a process flow that handles the messages that result from the heartbeats.

Backup Listener

The Backup Listener is created from the iSM Console by selecting the Backup type from the Listener type drop-down list. The listener listens for the heartbeats emitted from the primary server. Depending on the condition, it may create an XML message that contains:

<signal host='hostname' config='configname'>state</signal> 

  • Enter a port number that this server will use to listen for heartbeats emitted from the primary server.
  • Threshold is the length of time that the backup server waits for a heartbeat before starting itself up.

Note: The threshold value will depend on the deployed environment.

Process Flow

The process flow interrogates the inbound XML that is generated by the backup listener. The state of the primary server, as indicated by its heartbeat, is retrieved using XPath, and a Decision Switch determines the next action to take.

The Decision Switch Object is shown below.

The Start Channel Service contains a list of channels to be started when a failover from the primary server occurs, as shown in the image below.

The Email Object notifies someone when the primary server failure is detected and backup has become active, as shown in the image below.

The Stop Channel Service contains a list of channels to be stopped when the primary server is LIVE, as shown in the image below.

The Email Object notifies someone when the primary server is back up and the backup server is on stand-by, as shown in the image below.
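The decision logic of this process flow can be outlined as follows. This is a sketch under stated assumptions, not the process flow itself: the function `plan_actions` and the channel names are hypothetical, and the choice to take no action on CLOSE is an assumption, since this use case describes only the failover (DEAD) and recovery (LIVE) branches.

```python
# Hypothetical names for the two channels deployed on the backup server.
FAILOVER_CHANNELS = ["swift_xml.channel", "swift_fin.channel"]


def plan_actions(state, channels=FAILOVER_CHANNELS):
    """Map a signal state to the ordered actions the flow would perform."""
    if state == "DEAD":
        # Primary failed: start the standby channels, then notify someone.
        return [("start", c) for c in channels] + [
            ("email", "primary server failed; backup is active")]
    if state == "LIVE":
        # Primary is back: stop the channels and return to stand-by.
        return [("stop", c) for c in channels] + [
            ("email", "primary server is back up; backup is on stand-by")]
    # CLOSE (normal shutdown) is left to site policy in this sketch.
    return []


for action in plan_actions("DEAD"):
    print(action)
```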