Monday, August 31, 2009

Multi-Node Cluster Shared Nothing Storage

Abstract

A number of months back, a new release of Sun Cluster arrived in conjunction with OpenSolaris 2009.06. This release offered a new architecture for a lower-cost fail-over cluster capability using Shared-Nothing Storage. This paper discusses the benefits of a broader implementation plan to further reduce costs and increase scalability.

Shared Nothing Storage

With the advent of ZFS under Solaris and COMSTAR under OpenSolaris, there is a new no-cost architecture in the world of high availability under Sun: Shared Nothing Storage.

The benefits are clear in this environment:


  • External Storage is not required (with its complexity and costs)

  • Additional storage area network infrastructure is not required (with its complexity and costs)

  • The OS of the active node continually keeps all the local disks in sync (with virtually no complexity)

There are some drawbacks to this environment:


  • Full peak CPU capacity must be provisioned on both platforms, even though only the active node runs the applications.

Applications can be run under Node-1 while Node-2 is always kept up to date, ready for failover of the storage as well as the applications which sit on that storage pool.
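
As a minimal sketch of how this could be wired together (the hostnames, IP addresses, device names, and pool name below are assumptions for illustration), Node-2 can export one of its internal disks as a COMSTAR iSCSI logical unit, and Node-1 can then build a ZFS mirror across its own internal disk and that remote LU:

    # node-2 (standby): enable COMSTAR and export a local disk as an iSCSI LU
    svcadm enable stmf
    svcadm enable -r svc:/network/iscsi/target:default
    sbdadm create-lu /dev/rdsk/c1t1d0s2       # register the raw disk as a logical unit
    stmfadm add-view 600144F0...              # make the LU visible (GUID comes from create-lu)
    itadm create-target                       # create the iSCSI target

    # node-1 (active): discover the remote LU and mirror it with a local disk
    iscsiadm add discovery-address 192.168.1.2
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi                         # build device nodes for the new LUN
    zpool create apppool mirror c1t1d0 c3t0d0 # local disk + iSCSI disk, kept in sync by ZFS

Every write to apppool lands on both the local disk and the remote LU, so Node-2 always holds a current copy of the pool without any external array or SAN fabric.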

Dual-Node Shared Nothing Storage

Some people may not be too impressed: there is still a node which is completely unused. This additional node may be considered a pure cost in an H-A or D-R environment. This is not necessarily true if other strategies are taken into consideration.

For example, in a dual-active configuration, the individual internal storage of each node can be exposed to the other through dual initiators, so that the CPU capacity of both nodes is fully leveraged during peak times.

The benefits are clear in this environment:


  • External Storage is not required (with its complexity and costs)

  • Additional storage area network infrastructure is not required (with its complexity and costs)

  • The OS of the active node continually keeps all the local disks in sync (with virtually no complexity)

  • 200% CPU capacity on two platforms can be leveraged during peak usage times

There are some drawbacks to this environment:


  • Fail-over of a single node results in a reduction to 100% of CPU capacity

Applications can be run under Node-1 and Node-2 while the disks on the opposing node are always kept up to date, ready for failover of the storage as well as the applications which sit on that storage pool.
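
A minimal takeover sketch under the same assumptions (the pool names are illustrative): Node-1 serves pool1 and Node-2 serves pool2, each pool mirrored across a local disk and an iSCSI LU backed by a disk in the other node. If Node-2 fails, Node-1 can force-import pool2, because one half of pool2's mirror physically lives on a Node-1 disk:

    # node-1, after node-2 has failed:
    zpool import             # the surviving half of pool2 shows up on a local disk
    zpool import -f pool2    # force the import; the pool comes up degraded but consistent
    zfs mount -a             # bring the datasets online and restart node-2's applications

    # when node-2 returns, hand the pool back and let ZFS resilver the stale half
    zpool export pool2       # on node-1
    zpool import pool2       # on node-2
    zpool status pool2       # watch the resilver complete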

Multi-Node Shared Nothing Storage

The dual-active-node shared-nothing architecture seems very beneficial, but what can be done in the very typical three-tier environment?

Considering how simple it is to move pools as well as zones around, multi-node clustering can be done with a couple of simple scripts (a sketch appears at the end of this section).

For example, in a triple-active configuration, the individual internal storage of each node can be exposed to the other nodes through triple initiators, so that the CPU capacity of all three nodes is fully leveraged during peak times.


The benefits are clear in this environment:


  • External Storage is not required (with its complexity and costs)

  • Additional storage area network infrastructure is not required (with its complexity and costs)

  • The OS of the active node continually keeps all the local disks in sync (with virtually no complexity)

  • 300% CPU capacity across all platforms can be leveraged during peak processing times

  • Failover of a single node means only a decrease to 200% CPU processing capacity

Applications can be run under Node-1, Node-2, and Node-3 while the disks on the opposing nodes are always kept up to date, ready for failover of the storage as well as the applications which sit on that storage pool.
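
A sketch of the "couple of simple scripts" mentioned above, assuming each workload lives in a Solaris Container whose zonepath sits on the pool being moved (the zone, pool, and path names are hypothetical):

    # on the node giving up the workload (e.g. Node-1):
    zoneadm -z appzone1 halt        # stop the Container
    zoneadm -z appzone1 detach      # detach its configuration
    zpool export pool1              # release the storage pool

    # on the node taking over (e.g. Node-3):
    zpool import pool1                                   # use 'zpool import -f' after a hard failure
    zonecfg -z appzone1 create -a /pool1/zones/appzone1  # recreate the config from the detached zone
    zoneadm -z appzone1 attach                           # attach (add -u to update packages if needed)
    zoneadm -z appzone1 boot

The same pair of scripts covers planned load balancing and unplanned failover; only the import needs to be forced when the source node is down.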

Application in Network Management

What does this have to do with Network Management?

Very often, multiple platforms are used as polling platforms, with a high-availability requirement on an embedded database. There is usually a separate cost for H-A kits for applications as well as databases.

Placing each of the tiers within a Solaris Container is the first step to business optimization, higher availability, and cost reduction.


As a reminder, Oracle RDBMS can legally be run within a CPU Capped Solaris 10 Container, in order to reduce CPU licensing costs, leaving plenty of CPU available for failing over applications from other tiers. As additional capacity is needed by the business, the additional license can be purchased and the cap extended to other cores on the existing platform.
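
As an illustration (the zone name and CPU counts below are assumptions), the cap is a single zonecfg resource, so the database Container can start small and grow as licenses are purchased:

    # create a capped Container for the database tier, sized for 4 CPUs
    zonecfg -z oradb
    zonecfg:oradb> add capped-cpu
    zonecfg:oradb:capped-cpu> set ncpus=4
    zonecfg:oradb:capped-cpu> end
    zonecfg:oradb> exit

    # later, after licensing more cores, raise the cap
    zonecfg -z oradb "select capped-cpu; set ncpus=8; end"
    prctl -n zone.cpu-cap -v 800 -r -i zone oradb   # apply to the running zone (100 = one CPU)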

Pushing the H-A requirements down to the OS level eliminates application and license complexities and enables drag-and-drop load balancing or disaster recovery under Solaris 10 or OpenSolaris using Solaris Containers. Running an RDBMS within a capped Solaris 10 Container gives the business the flexibility to buy and stage hardware without having to pay for unused CPU cycles until they are actually needed.

- - - - - - - - - - - - - - - - - - -

Update - 2009-01-07: Another blog posting about this feature:

Solaris tip of the week: iscsi failover with COMSTAR


Update - 2019-10-21: The previous "Solaris tip of the week" post no longer exists; it has been transferred to:
https://jaydanielsen.wordpress.com/2009/12/10/solaris-tip-of-the-week-iscsi-failover-with-comstar/
I've been researching HA iscsi configurations recently, and I'd like to capture and share what I've learned about the COMSTAR stack. I have a simple demo that you can use for your own experiments...