Open-E Knowledgebase

[JDSS] JovianDSS failover mechanism technologies explained

Article ID: 3161
Last updated: 03 Apr, 2020

Additional information:

  • product name: JovianDSS
  • product version: all
  • build: all

Subject:

JovianDSS failover mechanism technologies explained

Contents:

JovianDSS uses STONITH ("Shoot The Other Node In The Head" or "Shoot The Offending Node In The Head"), a fencing technique for computer clusters that prevents cluster split-brain and removes potential cluster instability.

JovianDSS has several functions that prevent cluster split-brain and instability. Together they provide STONITH functionality and more:

  1. Network-based ring-ping (heartbeat and ping nodes), controlled by the Cluster Resource Manager, which can decide to reboot a node or to export/import a pool. A reboot can be a soft reboot, an immediate kernel reboot, or an IPMI-based reboot. Network-based split-brain protection works well if the cluster is properly configured and the hardware works as expected.

  2. In case of a wrong configuration or an unexpected hardware malfunction, JovianDSS uses pool-based split-brain protection. The function is described in the document:

    http://open-zfs.org/w/images/d/d9/05-MMP-openzfs-2017.4.pdf

    Overview of the MMP (Multi-Modifier Protection) functionality:

    "MMP prevents ZFS from importing a pool that is active on another host, under most circumstances"

    MMP prevents pool import in case of a Cluster Resource Manager malfunction. MMP does not allow a forced import of a pool that is in use by another cluster node.

  3. JovianDSS has a built-in “Critical system error response policy” (please find the screenshot attached), which prevents cluster instability and triggers failover in case of unexpected hardware malfunctions.

  4. JovianDSS has a built-in Cluster watchdog, which monitors user volumes for availability (please find the config screenshot attached). In case of a volume malfunction, the system is rebooted in order to start failover. If the kernel-triggered reboot does not work, JovianDSS uses the IPMI hardware watchdog to force the reboot and guarantee a clean failover.
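
The layered protection listed above can be illustrated with a small, simplified model. This is only a sketch under stated assumptions, not JovianDSS code: the names `NodeView`, `Action`, `decide_action` and `reboot_with_escalation`, as well as the exact decision rules, are hypothetical and chosen for illustration. It models a ring-ping style decision (peer heartbeat plus ping nodes), an MMP-style guard that refuses to import a pool still active on another node, and a reboot that escalates from one method to the next.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, Optional


class Action(Enum):
    NONE = "none"            # cluster looks healthy, nothing to do
    TAKE_OVER = "take_over"  # import the pool on this node
    RELEASE = "release"      # export the pool or reboot this node
    REFUSE_IMPORT = "refuse" # pool is still active elsewhere (MMP-style guard)


@dataclass
class NodeView:
    """What a single node can observe (hypothetical fields)."""
    peer_heartbeat_ok: bool      # is the other node's heartbeat still seen?
    reachable_ping_nodes: int    # how many ping nodes answer this node?
    pool_active_elsewhere: bool  # does the pool show activity from another host?


def decide_action(view: NodeView) -> Action:
    """Assumed decision rules, for illustration only."""
    if view.peer_heartbeat_ok:
        return Action.NONE
    if view.reachable_ping_nodes == 0:
        # No heartbeat and no ping nodes: this node is probably the isolated one.
        return Action.RELEASE
    if view.pool_active_elsewhere:
        # Network says "peer is gone", but the pool says otherwise: do not import.
        return Action.REFUSE_IMPORT
    return Action.TAKE_OVER


def reboot_with_escalation(methods: Iterable[Callable[[], bool]]) -> Optional[int]:
    """Try reboot methods in order (e.g. soft, kernel, IPMI).

    Each method is modeled as a callable returning True on success.
    Returns the index of the first method that succeeded, or None.
    """
    for index, method in enumerate(methods):
        if method():
            return index
    return None


if __name__ == "__main__":
    view = NodeView(peer_heartbeat_ok=False, reachable_ping_nodes=2,
                    pool_active_elsewhere=False)
    print(decide_action(view))
    # Soft and kernel reboot "fail" here, so the third (IPMI-like) method is used.
    print(reboot_with_escalation([lambda: False, lambda: False, lambda: True]))
```

In this model a reboot method simply reports success or failure so the escalation order can be followed; on a real system a successful reboot would not return at all.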





Revision: 1
Posted: 03 Apr, 2020 by Kowalski .
Updated: 03 Apr, 2020 by Rybak M.
Tags
JovianDSS failover split-brain jdss

The Knowledge base is managed by Open-E data storage software company.