Kea  1.5.0
isc::ha::CommunicationState Class Reference (abstract)

Holds communication state between the two HA peers.

#include <communication_state.h>

Inherited by isc::ha::CommunicationState4 and isc::ha::CommunicationState6.

Public Member Functions

 CommunicationState (const asiolink::IOServicePtr &io_service, const HAConfigPtr &config)
 Constructor.

virtual ~CommunicationState ()
 Destructor.

virtual void analyzeMessage (const boost::shared_ptr< dhcp::Pkt > &message)=0
 Checks if the DHCP message appears to be unanswered.

bool clockSkewShouldTerminate () const
 Indicates whether the HA service should enter "terminated" state as a result of the clock skew exceeding maximum value.

bool clockSkewShouldWarn ()
 Indicates whether the HA service should issue a warning about high clock skew between the active servers.

virtual bool failureDetected () const =0
 Checks if the partner failure has been detected based on the DHCP traffic analysis.

int64_t getDurationInMillisecs () const
 Returns duration between the poke time and current time.

int getPartnerState () const
 Returns last known state of the partner.

bool isCommunicationInterrupted () const
 Checks if communication with the partner is interrupted.

bool isHeartbeatRunning () const
 Checks if recurring heartbeat is running.

std::string logFormatClockSkew () const
 Returns current clock skew value in the logger friendly format.

void poke ()
 Pokes the communication state.

void setPartnerState (const std::string &state)
 Sets partner state.

void setPartnerTime (const std::string &time_text)
 Provide partner's notion of time so the new clock skew can be calculated.

void startHeartbeat (const long interval, const boost::function< void()> &heartbeat_impl)
 Starts recurring heartbeat (public interface).

void stopHeartbeat ()
 Stops recurring heartbeat.
 

Protected Member Functions

virtual void clearUnackedClients ()=0
 Removes information about clients which the partner server failed to respond to.

bool isClockSkewGreater (const long seconds) const
 Checks if the clock skew is greater than the specified number of seconds.

void startHeartbeatInternal (const long interval=0, const boost::function< void()> &heartbeat_impl=0)
 Starts recurring heartbeat.
 

Protected Attributes

boost::posix_time::time_duration clock_skew_
 Clock skew between the active servers.

HAConfigPtr config_
 High availability configuration.

boost::function< void()> heartbeat_impl_
 Pointer to the function providing heartbeat implementation.

long interval_
 Interval specified for the heartbeat.

asiolink::IOServicePtr io_service_
 Pointer to the common IO service instance.

boost::posix_time::ptime last_clock_skew_warn_
 Holds a time when last warning about too high clock skew was issued.

int partner_state_
 Last known state of the partner server.

boost::posix_time::ptime poke_time_
 Last poke time.

asiolink::IntervalTimerPtr timer_
 Interval timer triggering heartbeat commands.
 

Detailed Description

Holds communication state between the two HA peers.

The HA service constantly monitors the state of the connection between the two peers. If the connection is lost it is an indicator that the partner server may be down and failover actions should be triggered.

Any command successfully sent over the control channel is an indicator that the connection is healthy. The most common command sent over the control channel is a lease update. If the DHCP traffic is heavy, the number of generated lease updates is sufficient to determine whether the connection is healthy or not. There is no need to send heartbeat commands in this case. However, if the DHCP traffic is low there is a need to send heartbeat commands to the partner at the specified rate to keep up-to-date information about the state of the connection.

This class uses an interval timer to run heartbeat commands over the control channel. The implementation of the heartbeat is external to this class and is provided via the CommunicationState::startHeartbeat method. This implementation is required to call the poke method when it receives a successful response to the heartbeat command. The poke method must also be called when a lease update is successful.
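The contract between the timer and the externally supplied heartbeat implementation can be sketched as follows. This is a minimal, self-contained illustration rather than Kea code: `HeartbeatDriver`, `fire()`, `interval()` and `pokeCount()` are hypothetical names, and `fire()` merely stands in for the interval timer expiring.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for the heartbeat wiring; not the Kea class.
class HeartbeatDriver {
public:
    // Remember the interval and the externally supplied implementation.
    void startHeartbeat(long interval_ms, std::function<void()> heartbeat_impl) {
        interval_ms_ = interval_ms;
        heartbeat_impl_ = std::move(heartbeat_impl);
    }

    // Forget the implementation; later fire() calls do nothing.
    void stopHeartbeat() {
        interval_ms_ = 0;
        heartbeat_impl_ = nullptr;
    }

    bool isHeartbeatRunning() const {
        return static_cast<bool>(heartbeat_impl_);
    }

    long interval() const { return interval_ms_; }

    // Assumption: called where a real interval timer would expire.
    void fire() {
        // Copy first so the callback may safely call stopHeartbeat().
        std::function<void()> impl = heartbeat_impl_;
        if (impl) {
            impl();
        }
    }

    // The heartbeat implementation calls this after a successful exchange.
    void poke() { ++poke_count_; }

    int pokeCount() const { return poke_count_; }

private:
    long interval_ms_ = 0;
    std::function<void()> heartbeat_impl_;
    int poke_count_ = 0;
};
```

A caller would typically pass a lambda that sends the heartbeat command and invokes `poke()` only when the partner answers successfully, so a failed heartbeat leaves the poke state untouched.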

The poke method sets the "last poke time" to the current time, thus indicating that the connection is healthy. The getDurationInMillisecs method is used to check how long the server has been unable to communicate with the partner. This duration is simply the time elapsed since the last successful poke. If this duration becomes greater than the configured threshold, the server assumes that communication with the partner is interrupted.
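The poke bookkeeping described above can be illustrated with a small sketch. It is not the Kea implementation: `PokeTracker` and `lastPoke()` are hypothetical names, the threshold parameter is assumed to be in milliseconds, and `std::chrono::steady_clock` is used here in place of the `boost::posix_time` types listed among the class attributes.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the poke bookkeeping; not the Kea class.
class PokeTracker {
public:
    using Clock = std::chrono::steady_clock;

    // max_response_delay_ms: assumed threshold, in milliseconds.
    explicit PokeTracker(std::int64_t max_response_delay_ms)
        : max_response_delay_ms_(max_response_delay_ms),
          poke_time_(Clock::now()) {}

    // Record a successful exchange with the partner.
    void poke() { poke_time_ = Clock::now(); }

    Clock::time_point lastPoke() const { return poke_time_; }

    // Milliseconds elapsed between the last poke and `now`.
    std::int64_t durationInMillisecs(Clock::time_point now = Clock::now()) const {
        return std::chrono::duration_cast<std::chrono::milliseconds>(
                   now - poke_time_).count();
    }

    // Interrupted once the elapsed time is greater than the threshold.
    bool isCommunicationInterrupted(Clock::time_point now = Clock::now()) const {
        return durationInMillisecs(now) > max_response_delay_ms_;
    }

private:
    std::int64_t max_response_delay_ms_;
    Clock::time_point poke_time_;
};
```

Passing `now` explicitly keeps the check deterministic in tests; production callers would rely on the default argument.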

The derivations of this class provide DHCPv4 and DHCPv6 specific mechanisms for detecting server failures based on the analysis of the received DHCP messages, i.e. how long the clients have been trying to communicate with the partner and message types they sent. In particular, the increased number of Rebind messages may indicate issues with the DHCP server.

This class is also used to monitor the clock skew between the active servers. Maintaining a reasonably low clock skew is essential for the HA service to function properly. This class calculates the clock skew by comparing the local time of the server with the time returned by the partner in response to a heartbeat command. Depending on which thresholds this value exceeds, CommunicationState::clockSkewShouldWarn and CommunicationState::clockSkewShouldTerminate indicate whether the HA service should continue to operate normally, should start issuing a warning about high clock skew, or should enter the "terminated" state and refuse to operate further until the clocks are synchronized. Leaving the "terminated" state requires administrative intervention and a restart of the HA service.

Definition at line 74 of file communication_state.h.

Constructor & Destructor Documentation

◆ CommunicationState()

isc::ha::CommunicationState::CommunicationState ( const asiolink::IOServicePtr &  io_service,
const HAConfigPtr &  config 
)

Constructor.

Parameters
  io_service  pointer to the common IO service instance.
  config  pointer to the HA configuration.

Definition at line 44 of file communication_state.cc.

◆ ~CommunicationState()

isc::ha::CommunicationState::~CommunicationState ( )
virtual

Destructor.

Stops scheduled heartbeat.

Definition at line 52 of file communication_state.cc.

References stopHeartbeat().

Member Function Documentation

◆ analyzeMessage()

virtual void isc::ha::CommunicationState::analyzeMessage ( const boost::shared_ptr< dhcp::Pkt > &  message)
pure virtual

Checks if the DHCP message appears to be unanswered.

This method is used to provide the communication state with a received DHCP message directed to the HA partner, to detect if the partner fails to answer DHCP messages directed to it. The DHCPv4 and DHCPv6 specific derivations implement this functionality.

This check is orthogonal to the heartbeat mechanism and is usually triggered after several consecutive heartbeats go unanswered.

The general approach to server failure detection is based on the analysis of the "secs" field value (DHCPv4) and "elapsed time" option value (DHCPv6). They indicate for how long the client has been trying to complete the DHCP transaction. If these values exceed a configured threshold, the client is considered to fail to communicate with the server. This fact is recorded by this object. If the number of distinct clients failing to communicate with the partner exceeds a configured maximum value, this server considers the partner to be offline. In this case, this server will most likely start serving clients which would normally be served by the partner.

All information gathered by this method is cleared when the poke method is invoked.
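The counting approach described above might look roughly like the sketch below. It is only an illustration under stated assumptions: `UnackedClientTracker`, its constructor parameters and the use of a string client identifier are hypothetical, and the real DHCPv4 and DHCPv6 derivations extract the elapsed time from the packet itself. The special case for a maximum of 0 follows the description of failureDetected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical tracker; names and types are illustrative, not Kea's.
class UnackedClientTracker {
public:
    UnackedClientTracker(std::uint32_t max_ack_delay_secs,
                         std::size_t max_unacked_clients)
        : max_ack_delay_secs_(max_ack_delay_secs),
          max_unacked_clients_(max_unacked_clients) {}

    // client_id: assumed unique client identifier.
    // secs: how long the client has been trying (the "secs" field or
    // "elapsed time" option, already converted to seconds).
    void analyze(const std::string& client_id, std::uint32_t secs) {
        if (secs > max_ack_delay_secs_) {
            unacked_.insert(client_id);  // a set counts distinct clients only
        }
    }

    // Failure once the number of distinct unacked clients exceeds the
    // maximum; a maximum of 0 disables the analysis and always reports true.
    bool failureDetected() const {
        return max_unacked_clients_ == 0 ||
               unacked_.size() > max_unacked_clients_;
    }

    // Invoked on poke: the detection procedure starts over.
    void clear() { unacked_.clear(); }

private:
    std::uint32_t max_ack_delay_secs_;
    std::size_t max_unacked_clients_;
    std::set<std::string> unacked_;
};
```

Using a set means that repeated retransmissions from one client do not inflate the count; only the number of distinct clients matters.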

Parameters
  message  DHCP message to be analyzed. This must be a message which belongs to the partner, i.e. the caller must select only the messages belonging to the partner before calling this method.

Implemented in isc::ha::CommunicationState6, and isc::ha::CommunicationState4.

◆ clearUnackedClients()

virtual void isc::ha::CommunicationState::clearUnackedClients ( )
protected pure virtual

Removes information about clients which the partner server failed to respond to.

This information is cleared by CommunicationState::poke. The derivations of this class must provide DHCPv4 and DHCPv6 specific implementations of this method. The poke method is called to indicate that the connection has been successfully (re)established. Therefore the client counters are reset and the failure detection procedure starts over.

See CommunicationState::analyzeMessage for details.

Implemented in isc::ha::CommunicationState6, and isc::ha::CommunicationState4.

Referenced by poke().

◆ clockSkewShouldTerminate()

bool isc::ha::CommunicationState::clockSkewShouldTerminate ( ) const

Indicates whether the HA service should enter "terminated" state as a result of the clock skew exceeding maximum value.

If the clocks on the active servers are not synchronized (for example, in response to the warning triggered by clockSkewShouldWarn) and they drift further, the clock skew may exceed a second threshold, which should cause the HA service to enter the "terminated" state. In this state the servers still respond to DHCP clients normally, but they send neither lease updates nor heartbeats. The administrator must then correct the problem (synchronize the clocks) and restart the service. This method indicates whether the service should terminate.

Currently, the terminal threshold for the clock skew is hardcoded to 60 seconds. In the future it may become configurable.

Returns
true if the HA service should enter "terminated" state.

Definition at line 206 of file communication_state.cc.

References isClockSkewGreater().

◆ clockSkewShouldWarn()

bool isc::ha::CommunicationState::clockSkewShouldWarn ( )

Indicates whether the HA service should issue a warning about high clock skew between the active servers.

The HA service monitors the clock skew between the active servers. The clock skew is calculated from the local time and the time returned by the partner in response to a heartbeat. When clock skew exceeds a certain threshold the HA service starts issuing a warning message. This method returns true if the HA service should issue this message.

Currently, the warning threshold for the clock skew is hardcoded to 30 seconds. In the future it may become configurable.

This method is called for each heartbeat. If we issue a warning for each heartbeat it may flood logs with those messages. This method provides a gating mechanism which prevents the HA service from logging the warning more often than every 60 seconds. If the last warning was issued less than 60 seconds ago this method will return false even if the clock skew exceeds the 30 seconds threshold. The correction of the clock skew will reset the gating counter.
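The threshold check and the gating described above can be sketched as follows. This is a simplified illustration, not the Kea source: `ClockSkewMonitor` and its members are hypothetical, `std::chrono` replaces the `boost::posix_time` types, and the 30 and 60 second values are the hardcoded thresholds quoted in this documentation.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the clock skew checks; not the Kea class.
class ClockSkewMonitor {
public:
    using Clock = std::chrono::steady_clock;
    using Seconds = std::chrono::seconds;

    void setSkew(Seconds skew) { skew_ = skew; }

    // True if the absolute skew is greater than the given limit.
    bool isSkewGreater(Seconds limit) const {
        const Seconds abs_skew = (skew_ < Seconds::zero()) ? -skew_ : skew_;
        return abs_skew > limit;
    }

    // Warn above 30 seconds, but not if the last warning was issued
    // less than 60 seconds before `now`.
    bool shouldWarn(Clock::time_point now) {
        if (!isSkewGreater(Seconds(30))) {
            warned_ = false;  // skew corrected: reset the gate
            return false;
        }
        if (warned_ && (now - last_warn_) < Seconds(60)) {
            return false;     // still inside the gating window
        }
        warned_ = true;
        last_warn_ = now;
        return true;
    }

    // Terminal threshold of 60 seconds, as described for
    // clockSkewShouldTerminate.
    bool shouldTerminate() const { return isSkewGreater(Seconds(60)); }

private:
    Seconds skew_ = Seconds::zero();
    bool warned_ = false;
    Clock::time_point last_warn_;
};
```

Because `shouldWarn` updates the time of the last warning, it cannot be const, which matches the non-const signature of clockSkewShouldWarn.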

Returns
true if the warning message should be logged because the clock skew exceeds the warning threshold.

Definition at line 179 of file communication_state.cc.

References isClockSkewGreater(), and last_clock_skew_warn_.

◆ failureDetected()

virtual bool isc::ha::CommunicationState::failureDetected ( ) const
pure virtual

Checks if the partner failure has been detected based on the DHCP traffic analysis.

In the special case when max-unacked-clients is set to 0 this method always returns true. Note that max-unacked-clients set to 0 means that failure detection is not really performed. Returning true in that case simplifies the code of the HAService, which doesn't need to check whether failure detection is enabled. It simply calls this method in the 'communications interrupted' situation to check if the server should be transitioned to the 'partner-down' state.

Returns
true if the partner failure has been detected, false otherwise.

Implemented in isc::ha::CommunicationState6, and isc::ha::CommunicationState4.

◆ getDurationInMillisecs()

int64_t isc::ha::CommunicationState::getDurationInMillisecs ( ) const

Returns the duration between the poke time and the current time, in milliseconds.

Returns
Duration between the poke time and the current time, in milliseconds.

Definition at line 167 of file communication_state.cc.

References poke_time_.

Referenced by isCommunicationInterrupted().

◆ getPartnerState()

int isc::ha::CommunicationState::getPartnerState ( ) const
inline

Returns last known state of the partner.

Returns
Partner's state if it is known, or a negative value otherwise.

Definition at line 92 of file communication_state.h.

References partner_state_.

◆ isClockSkewGreater()

bool isc::ha::CommunicationState::isClockSkewGreater ( const long  seconds) const
protected

Checks if the clock skew is greater than the specified number of seconds.

Parameters
  seconds  a positive value to compare the clock skew with.
Returns
true if the absolute clock skew is greater than the specified number of seconds, false otherwise.

Definition at line 212 of file communication_state.cc.

References clock_skew_.

Referenced by clockSkewShouldTerminate(), and clockSkewShouldWarn().

◆ isCommunicationInterrupted()

bool isc::ha::CommunicationState::isCommunicationInterrupted ( ) const

Checks if communication with the partner is interrupted.

This method checks if the communication with the partner appears to be interrupted. This is the case when the time since the last successful communication is longer than the configured max-response-delay value.

Returns
true if communication is interrupted, false otherwise.

Definition at line 174 of file communication_state.cc.

References config_, and getDurationInMillisecs().

◆ isHeartbeatRunning()

bool isc::ha::CommunicationState::isHeartbeatRunning ( ) const
inline

Checks if recurring heartbeat is running.

Returns
true if heartbeat is running, false otherwise.

Definition at line 129 of file communication_state.h.

References timer_.

◆ logFormatClockSkew()

std::string isc::ha::CommunicationState::logFormatClockSkew ( ) const

Returns the current clock skew value in a logger-friendly format.

Definition at line 226 of file communication_state.cc.

References clock_skew_.

◆ poke()

void isc::ha::CommunicationState::poke ( )

Pokes the communication state.

Sets the last poke time to current time. If the heartbeat timer has been scheduled, it is reset (starts over measuring the time to the next heartbeat).

Definition at line 138 of file communication_state.cc.

References clearUnackedClients(), poke_time_, startHeartbeatInternal(), and timer_.

◆ setPartnerState()

void isc::ha::CommunicationState::setPartnerState ( const std::string &  state)

Sets partner state.

Parameters
  state  new partner's state in textual form. Supported values are those returned in response to a ha-heartbeat command.
Exceptions
  BadValue  if an unsupported state value was provided.
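A text-to-number mapping of this kind could be sketched as below. Everything in it is an assumption made for illustration: the state strings are the names commonly seen in ha-heartbeat responses, the enumerator names and numeric values are hypothetical, and `std::invalid_argument` stands in for the BadValue exception.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical identifiers; the real constants live in Kea's HA headers.
enum PartnerState {
    ST_HOT_STANDBY = 0,
    ST_LOAD_BALANCING,
    ST_PARTNER_DOWN,
    ST_READY,
    ST_SYNCING,
    ST_TERMINATED,
    ST_WAITING,
    ST_UNAVAILABLE
};

// Translate the textual state into its numeric form.
// Throws std::invalid_argument for an unsupported value.
int partnerStateFromText(const std::string& state) {
    // Assumed textual names; verify against the actual heartbeat response.
    static const std::map<std::string, int> kStates = {
        {"hot-standby", ST_HOT_STANDBY},
        {"load-balancing", ST_LOAD_BALANCING},
        {"partner-down", ST_PARTNER_DOWN},
        {"ready", ST_READY},
        {"syncing", ST_SYNCING},
        {"terminated", ST_TERMINATED},
        {"waiting", ST_WAITING},
        {"unavailable", ST_UNAVAILABLE}
    };
    const auto it = kStates.find(state);
    if (it == kStates.end()) {
        throw std::invalid_argument("unsupported HA state: " + state);
    }
    return it->second;
}
```

Keeping all known states non-negative leaves negative values free to mean "unknown", which is how getPartnerState reports a partner whose state has not been learned yet.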

Definition at line 57 of file communication_state.cc.

References isc::ha::HA_HOT_STANDBY_ST, isc::ha::HA_LOAD_BALANCING_ST, isc::ha::HA_PARTNER_DOWN_ST, isc::ha::HA_READY_ST, isc::ha::HA_SYNCING_ST, isc::ha::HA_TERMINATED_ST, isc::ha::HA_UNAVAILABLE_ST, isc::ha::HA_WAITING_ST, isc_throw, and partner_state_.

◆ setPartnerTime()

void isc::ha::CommunicationState::setPartnerTime ( const std::string &  time_text)

Provide partner's notion of time so the new clock skew can be calculated.

Parameters
  time_text  Partner's time received in response to a heartbeat. The time must be provided in the RFC 1123 format.
Exceptions
  isc::http::HttpTimeConversionError  if the time format is invalid.
Todo:
Consider some other time formats which include millisecond precision.
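To make the skew calculation concrete, the sketch below parses two RFC 1123 timestamps and subtracts them. It does not use Kea's `HttpDateTime`; `daysFromCivil`, `parseRfc1123` and `clockSkewSeconds` are hypothetical helpers, `std::invalid_argument` stands in for the conversion error, and the sign convention (partner time minus local time) is an assumption. Field ranges are not validated.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

// Days since 1970-01-01 for a Gregorian calendar date.
long long daysFromCivil(int y, int m, int d) {
    y -= (m <= 2) ? 1 : 0;  // treat January and February as end of prior year
    const int era = (y >= 0 ? y : y - 399) / 400;
    const int yoe = y - era * 400;                                   // [0, 399]
    const int doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  // [0, 365]
    const int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return static_cast<long long>(era) * 146097 + doe - 719468;
}

// Parse e.g. "Tue, 19 Dec 2017 18:53:35 GMT" into seconds since the epoch.
// Throws std::invalid_argument if the text does not match that layout.
long long parseRfc1123(const std::string& text) {
    static const char* const kMonths[] = {"Jan", "Feb", "Mar", "Apr",
                                          "May", "Jun", "Jul", "Aug",
                                          "Sep", "Oct", "Nov", "Dec"};
    char wday[4] = {0};
    char mon[4] = {0};
    char zone[4] = {0};
    int day = 0, year = 0, hour = 0, minute = 0, sec = 0;
    const int fields = std::sscanf(text.c_str(), "%3s, %d %3s %d %d:%d:%d %3s",
                                   wday, &day, mon, &year,
                                   &hour, &minute, &sec, zone);
    if (fields != 8 || std::strcmp(zone, "GMT") != 0) {
        throw std::invalid_argument("not an RFC 1123 date: " + text);
    }
    int month = 0;
    for (int i = 0; i < 12; ++i) {
        if (std::strcmp(mon, kMonths[i]) == 0) {
            month = i + 1;
        }
    }
    if (month == 0) {
        throw std::invalid_argument("unknown month in: " + text);
    }
    return daysFromCivil(year, month, day) * 86400LL +
           hour * 3600LL + minute * 60LL + sec;
}

// Skew in seconds: partner's time minus local time (assumed convention).
long long clockSkewSeconds(const std::string& partner_time,
                           const std::string& local_time) {
    return parseRfc1123(partner_time) - parseRfc1123(local_time);
}
```

Since RFC 1123 timestamps carry whole seconds only, a skew computed this way cannot resolve sub-second differences, which is the limitation the Todo note refers to.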

Definition at line 218 of file communication_state.cc.

References clock_skew_, isc::http::HttpDateTime::fromRfc1123(), and isc::http::HttpDateTime::getPtime().

◆ startHeartbeat()

void isc::ha::CommunicationState::startHeartbeat ( const long  interval,
const boost::function< void()> &  heartbeat_impl 
)

Starts recurring heartbeat (public interface).

Parameters
  interval  heartbeat interval in milliseconds.
  heartbeat_impl  pointer to the heartbeat implementation function.

Definition at line 81 of file communication_state.cc.

References startHeartbeatInternal().

◆ startHeartbeatInternal()

void isc::ha::CommunicationState::startHeartbeatInternal ( const long  interval = 0,
const boost::function< void()> &  heartbeat_impl = 0 
)
protected

Starts recurring heartbeat.

Parameters
  interval  heartbeat interval in milliseconds.
  heartbeat_impl  pointer to the heartbeat implementation function.

Definition at line 87 of file communication_state.cc.

References heartbeat_impl_, interval_, io_service_, isc_throw, and timer_.

Referenced by poke(), and startHeartbeat().

◆ stopHeartbeat()

void isc::ha::CommunicationState::stopHeartbeat ( )

Stops recurring heartbeat.

Definition at line 128 of file communication_state.cc.

References heartbeat_impl_, interval_, and timer_.

Referenced by ~CommunicationState().

Member Data Documentation

◆ clock_skew_

boost::posix_time::time_duration isc::ha::CommunicationState::clock_skew_
protected

Clock skew between the active servers.

Definition at line 315 of file communication_state.h.

Referenced by isClockSkewGreater(), logFormatClockSkew(), and setPartnerTime().

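◆ config_

HAConfigPtr isc::ha::CommunicationState::config_
protected

High availability configuration.

Referenced by isCommunicationInterrupted().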

◆ heartbeat_impl_

boost::function<void()> isc::ha::CommunicationState::heartbeat_impl_
protected

Pointer to the function providing heartbeat implementation.

Definition at line 307 of file communication_state.h.

Referenced by startHeartbeatInternal(), and stopHeartbeat().

◆ interval_

long isc::ha::CommunicationState::interval_
protected

Interval specified for the heartbeat.

Definition at line 301 of file communication_state.h.

Referenced by startHeartbeatInternal(), and stopHeartbeat().

◆ io_service_

asiolink::IOServicePtr isc::ha::CommunicationState::io_service_
protected

Pointer to the common IO service instance.

Definition at line 292 of file communication_state.h.

Referenced by startHeartbeatInternal().

◆ last_clock_skew_warn_

boost::posix_time::ptime isc::ha::CommunicationState::last_clock_skew_warn_
protected

Holds a time when last warning about too high clock skew was issued.

Definition at line 319 of file communication_state.h.

Referenced by clockSkewShouldWarn().

◆ partner_state_

int isc::ha::CommunicationState::partner_state_
protected

Last known state of the partner server.

Negative value means that the partner's state is unknown.

Definition at line 312 of file communication_state.h.

Referenced by getPartnerState(), and setPartnerState().

◆ poke_time_

boost::posix_time::ptime isc::ha::CommunicationState::poke_time_
protected

Last poke time.

Definition at line 304 of file communication_state.h.

Referenced by getDurationInMillisecs(), and poke().

◆ timer_

asiolink::IntervalTimerPtr isc::ha::CommunicationState::timer_
protected

Interval timer triggering heartbeat commands.

Definition at line 298 of file communication_state.h.

Referenced by isHeartbeatRunning(), poke(), startHeartbeatInternal(), and stopHeartbeat().


The documentation for this class was generated from the following files:
communication_state.h
communication_state.cc