Wednesday, August 29, 2012

ZFS: A Multi-Year Case Study in Moving From Desktop Mirroring (Part 4)

Abstract:
ZFS was created by Sun Microsystems to innovate the storage subsystem of computing systems by simultaneously expanding capacity & security exponentially while collapsing the formerly striated layers of storage (i.e. volume managers, file systems, RAID, etc.) into a single layer in order to deliver capabilities that would normally be very complex to achieve. One such innovation introduced in ZFS was the ability to dynamically add additional disks to an existing filesystem pool, remove the old disks, and dynamically expand the pool for filesystem usage. This paper discusses the upgrade of high capacity yet low cost mirrored external media under ZFS.

Case Study:
A particular Media Design House had formerly used multiple external mirrored storage devices on desktops, as well as racks of archived optical media, to meet its storage requirements. A pair of (formerly high-end) 400 Gigabyte Firewire drives lost a drive. An additional pair of (formerly high-end) 500 Gigabyte Firewire drives experienced a drive loss within a month. A media wall of CDs and DVDs was getting cumbersome to retain.

First Upgrade - Migration to Solaris:
A newer version of Solaris 10 was released, which included more recent features. The Media House was pleased to adopt Update 8, with the possibility of using a Level 2 ARC for increased read performance and Intent Logging for increased write performance. A 64-bit PCI card supporting gigabit Ethernet was used on the desktop SPARC platform, serving mirrored 1.5 Terabyte "green" disks over "green" gigabit Ethernet switches. The Media House determined this configuration performed adequately.


ZIL Performance Testing:
Testing was performed to determine the benefit of leveraging a new feature in ZFS called the ZFS Intent Log, or ZIL. Tests were run across consumer grade USB SSDs in different configurations. It was determined that any flash could be used for the ZIL to gain a performance increase, but that an enterprise grade SSD provided the best increase: about 20% with the commonly used throughput load of large file writes going to the mirror. It was decided at that point to hold off on the use of SSDs, since performance was already adequate.

Second Upgrade - Drives Replaced:
One of the USB drives exhibited some odd behavior from the time it was purchased, but it was decided the drives behaved well enough under ZFS mirroring. Eventually, the drive started to perform poorly and was logging occasional errors. When the drives were nearly out of capacity, they were upgraded from a 1.5 TB mirror to a 2 TB mirror.
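
One common way to move a mirror onto larger disks is to attach the new disks to the existing mirror, wait for resilvering to finish, and then detach the old disks. The sketch below only prints such a plan and does not run it. The helper name `plan_mirror_upgrade` and the device names `OLD_A`, `OLD_B`, `NEW_A` and `NEW_B` are hypothetical placeholders, and the exact procedure used at the Media House is not recorded here. How and when the extra capacity becomes visible depends on the Solaris release.

```shell
# Hypothetical dry-run helper: prints the zpool commands for swapping a
# two-way mirror onto larger disks. It executes nothing.
# $1 = pool, $2 $3 = old devices, $4 $5 = new devices
plan_mirror_upgrade() {
    pool=$1; old1=$2; old2=$3; new1=$4; new2=$5
    # Attach each new disk to the existing mirror (making it a 4-way mirror).
    echo "zpool attach $pool $old1 $new1"
    echo "zpool attach $pool $old1 $new2"
    # Resilvering must finish before any old disk is detached.
    echo "zpool status $pool"
    # Detach the old, smaller disks once the new ones are fully resilvered.
    echo "zpool detach $pool $old1"
    echo "zpool detach $pool $old2"
}

# Placeholder device names, not taken from the case study.
plan_mirror_upgrade zpool2 OLD_A OLD_B NEW_A NEW_B
```

Printing the plan first gives the administrator a chance to check the device names before any command touches the pool.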

Third Upgrade - SPARC Upgraded:
While the Ultra60 desktop was being moved to a new location in the media house, preventative maintenance (PM) was conducted to remove dust, but the Ultra60 did not boot in the new location. It was time to move the storage to a newer server.

The old Ultra60 was a nice unit, with 2 Gig of RAM and dual 450MHz UltraSPARC II CPUs, but it did not offer some of the features that modern servers offer. An updated V240 platform was chosen: dual 1.5GHz UltraSPARC IIIi CPUs, 4 Gig of RAM, redundant power supplies, and an upgraded UPS.

Listing the Drives:

After booting the new system and attaching the USB drives, a general "disks" command was run to force a discovery of the drives. Whether this is strictly needed is not important, but it is a step seasoned system administrators take.

Listing the drives is simple:
V240/root$ ls -la /dev/rdsk/c*0
lrwxrwxrwx 1 root root 46 Jan  2  2010 /dev/rdsk/c0t0d0s0 -> ../../devices/pci@1e,600000/ide@d/sd@0,0:a,raw
lrwxrwxrwx 1 root root 47 Jan  2  2010 /dev/rdsk/c1t0d0s0 -> ../../devices/pci@1c,600000/scsi@2/sd@0,0:a,raw
lrwxrwxrwx 1 root root 47 Jan  2  2010 /dev/rdsk/c1t1d0s0 -> ../../devices/pci@1c,600000/scsi@2/sd@1,0:a,raw
lrwxrwxrwx 1 root root 47 Mar 25  2010 /dev/rdsk/c1t2d0s0 -> ../../devices/pci@1c,600000/scsi@2/sd@2,0:a,raw
lrwxrwxrwx 1 root root 47 Sep  4  2010 /dev/rdsk/c1t3d0s0 -> ../../devices/pci@1c,600000/scsi@2/sd@3,0:a,raw
lrwxrwxrwx 1 root root 59 Aug 14 21:20 /dev/rdsk/c3t0d0 -> ../../devices/pci@1e,600000/usb@a/storage@2/disk@0,0:wd,raw
lrwxrwxrwx 1 root root 58 Aug 14 21:20 /dev/rdsk/c3t0d0s0 -> ../../devices/pci@1e,600000/usb@a/storage@2/disk@0,0:a,raw
lrwxrwxrwx 1 root root 59 Aug 14 21:20 /dev/rdsk/c4t0d0 -> ../../devices/pci@1e,600000/usb@a/storage@1/disk@0,0:wd,raw
lrwxrwxrwx 1 root root 58 Aug 14 21:20 /dev/rdsk/c4t0d0s0 -> ../../devices/pci@1e,600000/usb@a/storage@1/disk@0,0:a,raw

The USB storage was recognized. ZFS may not recognize the pool automatically when the drives are plugged into different USB ports on a new machine, so "zpool status" and "zpool list" report no pools. ZFS will, however, see the drives through the "zpool import" command.
V240/root$ zpool status
no pools available
V240/root$ zpool list
no pools available
V240/root$ zpool import
  pool: zpool2
    id: 10599167846544478303
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        zpool2      ONLINE
          mirror    ONLINE
            c3t0d0  ONLINE
            c4t0d0  ONLINE
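
When several pools are visible, the "zpool import" listing can be summarized with a small filter. The following is a minimal sketch, assuming a POSIX shell and awk (on Solaris 10, nawk may be the safer choice); the helper name `importable_pools` is hypothetical. It reads the listing on standard input and prints each pool name with its state.

```shell
# Hypothetical helper: print "name state" for every pool found in
# `zpool import` output supplied on standard input.
importable_pools() {
    awk '
        $1 == "pool:"  { name = $2 }         # remember the pool name
        $1 == "state:" { print name, $2 }    # emit it with its state
    '
}

# Sample text copied from the listing above. On a live system the real
# command would be piped in instead:  zpool import | importable_pools
importable_pools <<'EOF'
  pool: zpool2
    id: 10599167846544478303
 state: ONLINE
status: The pool was last accessed by another system.
EOF
```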

Importing Drives on New Platform:
Since the drives were taken from another platform, ZFS tried to warn the administrator, but the admin is all too well aware that the old Ultra60 is dysfunctional and that importing the drive mirror is exactly what is desired.
V240/root$ time zpool import zpool2
cannot import 'zpool2': pool may be in use from other system, it was last accessed by Ultra60 (hostid: 0x80c6e89a) on Mon Aug 13 20:10:14 2012
use '-f' to import anyway

real    0m6.48s
user    0m0.01s
sys     0m0.05s

The drives are ready for import. With the force flag, the storage becomes available.
V240/root$ time zpool import -f zpool2

real    0m23.64s
user    0m0.02s
sys     0m0.08s
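
Forcing an import is only safe when the system that last used the pool is really out of service. The sketch below shows one way a script could check this before adding "-f": it extracts the host name and hostid from the refusal message and compares the name with a host known to be retired. The helper names `last_accessed_by` and `safe_to_force` are hypothetical, and the parsing assumes the message wording shown above.

```shell
# Hypothetical helper: print "hostname hostid" taken from the
# "cannot import" message supplied on standard input.
last_accessed_by() {
    sed -n 's/.*last accessed by \([^ ]*\) (hostid: \([^)]*\)).*/\1 \2/p'
}

# Hypothetical helper: succeed only if the message on standard input
# names the retired host given as $1.
safe_to_force() {
    host=$(last_accessed_by | awk '{ print $1 }')
    [ -n "$host" ] && [ "$host" = "$1" ]
}

# Intended use on a live system (left as comments, nothing is executed):
#   msg=$(zpool import zpool2 2>&1)
#   if printf '%s\n' "$msg" | safe_to_force Ultra60; then
#       zpool import -f zpool2
#   fi
```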

The pool was imported quickly.
V240/root$ zpool status
  pool: zpool2
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpool2      ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0

errors: No known data errors
V240/root$ zpool list
NAME     SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
zpool2  1.81T  1.34T   480G    74%  ONLINE  -

The move of the storage to the existing SPARC server went very well.
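
The CAP column in the "zpool list" output can be checked by hand: 1.34T used out of 1.81T is about 74%. The sketch below reproduces that arithmetic, assuming a POSIX shell and awk, and assuming the size suffixes are binary multiples (1T = 1024G). The helper names `to_gb` and `pct_used` are hypothetical.

```shell
# Hypothetical helper: convert a zpool size such as 1.81T or 480G to
# whole gigabytes, assuming binary multiples.
to_gb() {
    echo "$1" | awk '{
        unit = substr($0, length($0))              # last character: M, G or T
        n = substr($0, 1, length($0) - 1) + 0      # numeric part
        if (unit == "T") n *= 1024
        else if (unit == "M") n /= 1024
        printf "%.0f\n", n
    }'
}

# Hypothetical helper: percent of the pool in use. $1 = SIZE, $2 = USED
pct_used() {
    echo "$(to_gb "$1") $(to_gb "$2")" |
        awk '{ printf "%.0f\n", 100 * $2 / $1 }'
}

pct_used 1.81T 1.34T    # prints 74, matching the CAP column
```

A check like this could feed a simple alert when the mirror nears the point where the previous upgrade was triggered.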

Conclusions:
ZFS for this ongoing engagement has proved very reliable. The ability to reduce rebuild time from days to seconds, upgrade underlying OS releases, retain compatibility with older file system releases, increase write throughput by adding consumer or commercial grade flash storage, recover from drive failures, and recover from chassis failure demonstrates the robustness of ZFS as the basis for a storage system.
