
Monday, October 19, 2015

Solaris 11.2: Extending ZFS rpool Under Virtualized x86

Solaris 11.2: Extending ZFS "rpool" Under Virtualized x86

Abstract

Often, after an OS is first installed, resources or redundancy are required beyond what was originally in scope for a project. Adding disks, each carrying its own file system, was an early solution, but those file systems always sat beside the original one, pushing the effort of spanning them onto the applications. Virtual file systems were created so additional storage could be added or mounted anywhere in a file system tree. Volume managers were later created to provide volumes which file systems could sit on top of, with tweaks to the file systems to allow expansion. In the modern world, file systems like ZFS provide all of those capabilities. In a virtualized environment, the underlying disks are no longer even disks, and can be extended using shared storage, making file systems like ZFS even more important.

[Solaris Zone/Container Virtualization for Solaris 10+]

Use Cases

This document discusses use cases where Solaris 11.2 is installed in an x86 environment on top of VMWare, and a vSphere administrator extends the virtual disks upon which the ZFS root file system was installed.

Two specific use cases to be evaluated include:
1) A simple Solaris 11.2 x86 installation with a single "rpool" Root Pool where it needs a mirror and was sized too small.
2) A more complex Solaris 11.2 x86 installation with a mirrored "rpool" Root Pool where it was sized too small.

A final Use Case is evaluated, which can be applied after either one of the previous cases:
3) Extend swap space on a ZFS "rpool" Root Pool

The relevant ZFS terminology is "autoexpand": the pool property which lets ZFS grow to fill an extended virtual disk. For this article, the VMWare vSphere virtual disk extension itself is out of scope. This process is expected to work with other hypervisors as well.


[Solaris Logo, courtesy former Sun Microsystems]

Use Case 1: Simple OS Installation Problem

Problem Background: Single Disk Lacks Redundancy and Capacity

When a simple Solaris 11.2 installation is performed, the OS may land on a single disk.
sun9999/root# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        rpool     ONLINE       0     0     0
          c2t1d0  ONLINE       0     0     0

errors: No known data errors

sun9999/root#

As the platform becomes more important, additional disk space (beyond the original 230GB) may be required in the root pool, as well as additional redundancy (beyond the single disk).
sun9999/root# zpool list
NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  228G   182G  46.4G  79%  1.00x  ONLINE  -

sun9999/root#

Under Solaris, these attributes can be augmented without additional software or reboots.
[Sun Microsystems Logo]

Solution: Add and Extend Virtual Disks

Solaris systems under x86 are increasingly deployed under VMWare. Virtual disks may be the original allocation, and these disks can be added and later even extended by the hypervisor. It may take some time before Solaris 11 recognizes that a change was made to the underlying virtual disks and that they can be extended. The disks must be carefully identified before making any changes. Only three commands are actually required, summarized in the sketch below.
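
As a minimal sketch (the device names follow the worked example below and will differ on other systems), the three required commands are:

sun9999/root# zpool attach -f rpool c2t1d0 c2t0d0   # step 1: mirror the root pool onto the new disk
sun9999/root# zpool set autoexpand=on rpool         # step 2: allow the pool to grow into larger disks
sun9999/root# devfsadm -Cv                          # step 3: rescan and clean device nodes after the vSphere resize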

[OCZ solid state hard disk]

Identifying the Disk Candidates

The disks can be identified with the "format" command.
sun9999/root# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c2t0d0
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c2t1d0
          /pci@0,0/pci15ad,1976@10/sd@1,0
       2. c2t2d0
          /pci@0,0/pci15ad,1976@10/sd@2,0

Specify disk (enter its number):

The 3x disks identified above are clearly virtual, but the role of each disk is unclear.

The "zpool status" performed earlier identified Disk "1" as a root pool disk.

The older-style Virtual File System Table will show other disks with older file system types. In the following case, Disk "2" clearly carries a UFS file system, which cannot be used for the root pool.
sun9999/root# grep c2 /etc/vfstab
/dev/dsk/c2t2d0s0 /dev/rdsk/c2t2d0s0 /u000 ufs 1 yes onerror=umount
This leaves Disk "0", to be verified via "format", which may be a good candidate for root mirroring.
Specify disk (enter its number): 0
selecting c2t0d0
[disk formatted]
Note: detected additional allowable expansion storage space that can be
added to current SMI label's computed capacity.
Select <partition expand> to adjust the label capacity.
...
format>
Solaris 11.2 has noted that Disk "0" can also be extended.

The "format" command will also verify the other sliced.
Specify disk (enter its number): 1
selecting c2t1d0
[disk formatted]
/dev/dsk/c2t1d0s1 is part of active ZFS pool rpool. Please see zpool(1M).

...
format> disk
...

Specify disk (enter its number)[1]: 2
selecting c2t2d0
[disk formatted]
Warning: Current Disk has mounted partitions.
/dev/dsk/c2t2d0s0 is currently mounted on /u000. Please see umount(1M).

format> quit

sun9999/root#

Clearly, Disk "0" is the only disk available for mirroring the root pool.

[Sun Microsystems Storage Server]
Adding Disk "0" to Root Pool "rpool"

It was already demonstrated that the single "c2t1d0" device is in the "rpool" and that the new candidate disk is "c2t0d0". To create a mirror, use "zpool attach" to add the new candidate device to the existing device, and observe progress with "zpool status" until resilvering is complete.
sun9999/root# zpool attach -f rpool c2t1d0 c2t0d0
Make sure to wait until resilver is done before rebooting.
sun9999/root# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
        Run 'zpool status -v' to see device specific details.
  scan: resilver in progress since Thu Oct 15 17:19:49 2015
    184G scanned
    39.5G resilvered at 135M/s, 21.09% done, 0h18m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t0d0  DEGRADED     0     0     0  (resilvering)

errors: No known data errors
sun9999/root#
The previous resilver suggests future maintenance on the mirror, with a similar amount of data, may take roughly 20 minutes.
[Seagate External Hard Disk]

Extending Root Pool "rpool"

Verify there is a known good mirror so the root pool can be extended safely.
sun9999/root# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 184G in 0h19m with 0 errors on Thu Oct 15 17:39:34 2015
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0

errors: No known data errors


sun9999/root#

The newly attached "c2t0d0" virtual disk was labeled at its full extended size, while the original "c2t1d0" still carries the smaller label.
sun9999/root# prtvtoc -h /dev/dsk/c2t0d0
       0     24    00        256    524288    524543
       1      4    00     524544 1048035039 1048559582
       8     11    00  1048559583     16384 1048575966
sun9999/root# prtvtoc -h /dev/dsk/c2t1d0
       0     24    00        256    524288    524543
       1      4    00     524544 481803999 482328542
       8     11    00  482328543     16384 482344926
sun9999/root#
Next, enable auto-expansion on the "rpool", so the pool can resize once the "c2t1d0" disk has been relabeled.
sun9999/root# zpool set autoexpand=on rpool
sun9999/root# zpool get autoexpand rpool
NAME   PROPERTY    VALUE  SOURCE
rpool  autoexpand  on     local

sun9999/root#
Detect the new disk size for the existing "c2t1d0" disk that was resized.
sun9999/root# devfsadm -Cv
...
devfsadm[13903]: verbose: removing file: /dev/rdsk/c2t1d0s14
devfsadm[13903]: verbose: removing file: /dev/rdsk/c2t1d0s15
devfsadm[13903]: verbose: removing file: /dev/rdsk/c2t1d0s8
devfsadm[13903]: verbose: removing file: /dev/rdsk/c2t1d0s9
sun9999/root#
The expansion should now take place nearly instantaneously.

[Oracle Logo]

Verifying the Root Pool "rpool" Expansion

Note that the original "c2t1d0" disk has now been extended to match.
sun9999/root# prtvtoc -h /dev/dsk/c2t0d0
       0     24    00        256    524288    524543
       1      4    00     524544 1048035039 1048559582
       8     11    00  1048559583     16384 1048575966

sun9999/root# prtvtoc -h /dev/dsk/c2t1d0
       0     24    00        256    524288    524543
       1      4    00     524544 1048035039 1048559582
       8     11    00  1048559583     16384 1048575966


sun9999/root#
The disk space is now extended to roughly 500GB.
sun9999/root# zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  498G   184G  314G  37%  1.00x  ONLINE  -

sun9999/root#
And it is not a bad time to scrub the new disks to ensure there are no errors; it will take about an hour.

sun9999/root# zpool scrub rpool
sun9999/root# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 1h3m with 0 errors on Thu Oct 15 19:58:09 2015
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0

errors: No known data errors
sun9999/root#

The Solaris installation on the ZFS Root Pool "rpool" is healthy.

[Oracle Servers]

Use Case 2: Medium Complexity OS Installation

Problem: Mirrored Disks Lack Capacity

The previous section was extremely detailed; this section will be briefer. Like the previous section, there is a lack of capacity in the root pool. Unlike the previous section, this pool is already mirrored.

Solution: Extend Mirrored Root Pool "rpool"

The following use case merely extends the Solaris 11 Root Pool "rpool" after the VMWare Administrator has already increased the size of the root virtual disks. Note that only two commands are required, summarized in the sketch below.
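
As a minimal sketch (no attach is needed this time, since the mirror already exists):

sun9998/root# zpool set autoexpand=on rpool   # step 1: allow the pool to grow into the resized disks
sun9998/root# devfsadm -Cv                    # step 2: rescan device nodes so the new size is seen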

Extend Root Pool "rpool"

The following steps take only seconds to run.

sun9998/root# zpool list
NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  228G   179G  48.9G  78%  1.00x  ONLINE  -


sun9998/root# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 99.1G in 0h11m with 0 errors on Tue Apr  7 15:48:39 2015
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0

errors: No known data errors


sun9998/root# echo | format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c2t0d0
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c2t2d0
          /pci@0,0/pci15ad,1976@10/sd@2,0
       2. c2t3d0
          /pci@0,0/pci15ad,1976@10/sd@3,0
Specify disk (enter its number): Specify disk (enter its number):

sun9998/root# zpool set autoexpand=on rpool
sun9998/root# zpool get autoexpand rpool
NAME   PROPERTY    VALUE  SOURCE
rpool  autoexpand  on     local


sun9998/root# devfsadm -Cv
devfsadm[7155]: verbose: removing file: /dev/dsk/c2t0d0s10
devfsadm[7155]: verbose: removing file: /dev/dsk/c2t0d0s11
...

devfsadm[7155]: verbose: removing file: /dev/rdsk/c2t3d0s8
devfsadm[7155]: verbose: removing file: /dev/rdsk/c2t3d0s9

sun9998/root# zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  498G   179G  319G  35%  1.00x  ONLINE  -


sun9998/root#

And the effort is done, as fast as you can type the commands.

[Sun Microsystems Flash Module]

Verify Root Pool "rpool"

The following verification is for the paranoid: the scrub will be kicked off in the background, performance will be monitored for about 20 seconds on 2-second polls, and the verification may take about 1-5 hours (depending on how busy the system or I/O subsystem is).

sun9998/root# zpool scrub rpool

sun9998/root# zpool iostat rpool 2 10
          capacity     operations    bandwidth
pool   alloc   free   read  write   read  write
-----  -----  -----  -----  -----  -----  -----
rpool   179G   319G     11    111  1.13M  2.55M
rpool   179G   319G    121      5  5.58M  38.0K
rpool   179G   319G    103    189  6.15M  2.53M
rpool   179G   319G    161      8  4.60M   118K
rpool   179G   319G     82      3  10.3M  16.0K
rpool   179G   319G    199    113  6.38M  1.56M
rpool   179G   319G     31      5  1.57M  38.0K
rpool   179G   319G    117      3  9.64M  18.0K
rpool   179G   319G     30     96  2.28M  1.74M
rpool   179G   319G     24      4  3.12M  36.0K

sun9998/root# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 4h32m with 0 errors on Fri Oct 16 00:42:28 2015
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0

errors: No known data errors
sun9998/root#
The Solaris installation and ZFS Root Pool "rpool" are healthy.

Use Case 3: Add Swap in a ZFS "rpool" Root Pool

Problem: Swap Space Lacking

After more disk space is added to the ZFS "rpool" Root Pool, it may be desirable to extend the swap space. This must be done as a separate operation, after the "rpool" has already been extended.

Solution: Add Swap to ZFS and the Virtual File System Table

The user community determines they need to increase swap from 12 GB to 20 GB, but they cannot afford a reboot. There are 2 steps required:
1) add swap space
2) make swap space permanent
First, existing swap space must be understood.

Review Swap Space

Swap space can be reviewed for reservation, activation, and persistence with "swap", "zfs", and "grep".
sun9999/root# zfs list rpool/swap
NAME         USED  AVAIL  REFER  MOUNTPOINT
rpool/swap  12.4G   306G  12.0G  -


sun9999/root# swap -l -h
swapfile                 dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 279,1     4K      12G      12G


sun9999/root# grep swap /etc/vfstab
swap                      -  /tmp    tmpfs  - yes     -
/dev/zvol/dsk/rpool/swap  -  -       swap   - no      -


sun9999/root# 
Note, the "zfs list" above will only work with a single swap dataset. When adding a second swap dataset, a different methodology must be used.

Swap Space Dataset Creation

Adding swap space to the existing root pool, without a reboot, requires adding another dataset. To increase from 12 GB to 20 GB, the additional dataset should be 8 GB. This takes a split second.
sun9999/root# zfs create -V 8G rpool/swap2
sun9999/root# 
The swap dataset is now ready to be manually activated.

Swap Space Activation


The swap space is activated using the "swap" command. This takes a split second.
sun9999/root# swap -a /dev/zvol/dsk/rpool/swap2

sun9999/root# swap -l -h
swapfile                    dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap  279,1        4K      12G      12G
/dev/zvol/dsk/rpool/swap2 279,3        4K     8.0G     8.0G

sun9999/root#
This swap space is only temporary, until the next reboot.

Swap Space Persistence

To make the swap space persistent across a reboot, it must be added to the Virtual File System Table.
sun9999/root# cp -p /etc/vfstab /etc/vfstab.2015_10_16_dh
sun9999/root# vi /etc/vfstab

(add the following line)
/dev/zvol/dsk/rpool/swap2  -  -       swap   - no      -
sun9999/root#
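
For scripted or non-interactive environments, the same line can be appended without an editor (a sketch; the device path must match the swap volume created earlier):

sun9999/root# printf '/dev/zvol/dsk/rpool/swap2\t-\t-\tswap\t-\tno\t-\n' >> /etc/vfstab
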
The added swap space will now be activated automatically upon the next reboot.

Swap Space Validation

Commands to verify the ZFS swap datasets, the active swap devices, and the persistent vfstab entries:
sun9999/root# zfs list | grep swap
rpool/swap                         12.4G   298G  12.0G  -
rpool/swap2                        8.25G   297G  8.00G  -


sun9999/root# swap -l -h
swapfile                    dev    swaplo   blocks     free 
/dev/zvol/dsk/rpool/swap  279,1        4K      12G      12G
/dev/zvol/dsk/rpool/swap2 279,3        4K     8.0G     8.0G


sun9999/root# grep swap /etc/vfstab
swap                       -   /tmp  tmpfs  -  yes     -
/dev/zvol/dsk/rpool/swap   -   -     swap   -  no      -
/dev/zvol/dsk/rpool/swap2  -   -     swap   -  no      -


sun9999/root#
Note that the "zfs list" command is now piped through "grep", to capture multiple datasets.
A total of [12G + 8G =] 20GB is now available in swap.
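
As a final cross-check (a sketch; exact figures will vary with live reservations, so no sample output is shown), the "swap -s" summary should reflect the full 20GB across used and available:

sun9999/root# swap -s -h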

Conclusions

Most of the above document is fluff, filled with paranoia, checking important items multiple times to ensure no data loss. Very few commands are required to perform the mirroring and root pool extension; Solaris provides a seamless methodology at the OS level to perform activities which are often painful under other operating systems, or which require additional 3rd party software.

Tuesday, January 22, 2013

Cisco Fires Shot at EMC: Parallels to Replace VMWare?

A very short update on Cloud Computing...

EMC: Build Their Own Server...
We remember that during EMC World in June 2012, EMC started the process of building their own cloud system... without Cisco.

EMC: The Missing Switch...
EMC's VMWare acquired Nicira, to fix the hole in their networking stack, as discussed during VMWare World 2012 in August 2012. VMWare also started selling cloud engineering professional services.

Cisco: The Missing Hypervisor...
This left Cisco in a very difficult position - where would Cisco go to get a hypervisor? Cisco just took a significant equity stake in Parallels, in order to gain one. Perhaps they should have thought about KVM on Illumos.

EMC & Cisco: The Missing OS...
While EMC and Cisco are continuing to gobble up components (still missing an OS) for their proprietary clouds, Oracle had released The First Cloud OS back in 2012 - it was called Solaris 11. Of course, Microsoft can't be left behind, copying Oracle and saying Windows Server 2012 is the First Cloud OS! LOL!


Of course, Illumos is still an option for both EMC and Cisco... and Cisco would not have needed to buy an equity stake in Parallels, had they gone the Illumos route from the beginning. Joyent has been selling Clouds on Illumos for some time, even appearing in Gartner's Magic Quadrant starting in 2009.

What was Cisco thinking?

Monday, September 24, 2012

EMC: Shakeup and Network Management Implications

[image courtesy: blog of Chuck Hollis, EMC VP --Global Marketing CTO]
Changes at EMC

EMC has been going through a great deal of changes over the years.

2003-12 - [html] - EMC Purchases VMWare for Hypervisor
2004-12 - [html] - EMC Purchases SMARTS for Network Fault Management
2007-11 - [html] - EMC Purchases Voyence for VoyenceControl
2009-11 - [html] - VMWare, EMC, Cisco Announce VCE (Virtual Computing Environment) VBlock Architecture
2012-05 - [html] - EMC Purchases Watch4Net for APG
2012-06 - [html] - Cisco, NetApp Announce FlexPod Architecture
2012-06 - [html] - EMC Produces Own Blade Servers

Some of the internal changes are more political than product-infrastructural.

2012-09 - [html] - EMC CEO Succession Politics

During EMC world, Oracle was made to look like the red-headed bastard step child, while SPARC hardware probably drives more EMC storage than either company cares to acknowledge.
Implications to Network Management

EMC had normally played well in the multi-vendor environment, because they were a software and storage company - they would sell disks and software to anyone who used any vendor's equipment. This started to change in 2003.

With the acquisition of VMWare in 2003, there was an internal drive to virtualize more software in the proprietary Intel space, rather than play in the Open Systems space. With the VCE announcement, using Cisco to push into the carrier space further pressed the Open Systems vendors.  In 2012, with the announcement from Cisco to partner with NetApp and EMC producing their own blades, the internal political pressure to abandon Open Systems will continue.
Ironically, historical analysis of performance, configuration, and event data from Network and Systems Management platforms drove the need for robust disk storage systems... the Telecommunications market was one of the original Big Data platforms. Big Data using off-the-shelf network management software requires Open Systems (with massive vertical [socket-count] and horizontal [blade-count] scalability). No robust system implemented on traditional Open Systems platforms would be built without external EMC storage. The push away from Open Systems platforms (to lower-end Linux & Windows platforms) ironically drives EMC storage out of the solutions... yet this [increasingly] is the direction from EMC.

Will EMC's investment in the Open Systems management tools from SMARTS, Voyence, and Watch4Net continue through the transition from EMC CEO Joe Tucci? EMC recently killed cross-vendor object storage. With VMWare CEO Paul Maritz assuming a higher profile, will the traditional EMC suffer greater loss in their EMC Network Management and EMC storage customer bases? Pat Gelsinger, president and COO of EMC's Information Infrastructure Products, became CEO of VMware, which may offer greater influence for Open Systems management in VMWare's proprietary Intel sphere.

[Graph courtesy: seekingalpha.com article]
Traditional Network, Systems, Storage, and Security Management may be losing the last viable multi-vendor player, as the industry consolidates into vertical, proprietary stove-piped systems. It is really EMC's choice as to whether their political structure decides to carry the Open Systems banner and say "we are different - we manage everything on everything" (and are worth the premium we charge) or whether they choose to lose the moral high ground offered by Open Systems and suffer the lower profit margins of a solely proprietary Intel platform [which offers no marketing differentiation.]

[Tombstone of Elizabeth in Yangzhou courtesy: wikimedia.org]
If EMC's SMARTS, VoyenceControl, and APG will not "manage everything on anything" in the near future, EMC will lose their main competitive position against dominant industry players like HP & IBM. Without "everything on anything", EMC will not be worth the money they currently charge. Any two-bit open source network & systems management framework supports "management of everything on anything"... so if EMC continues to choose vertical isolation [with the abandonment of Open Systems such as Itanium, POWER, and SPARC], EMC may soon no longer be "the only game in town", and there will no longer be a need to even consider EMC as a competitor to Tivoli and OpenView. Throwing away a great marketing feature through poor management decisions will result in the death of EMC NSM.

Tuesday, August 28, 2012

VMWare Resolves Some Issues

VMWare 5.1 Resolves Some Issues

Abstract:
With the advent of simple and cost-effective virtualization under Solaris 10 - Zones, LDoms, and VirtualBox - pressure has been placed upon dominant virtualization vendors to create less expensive alternatives. VMWare, after being purchased by EMC, had decided to move in the opposite direction, making the purchasing of VMWare very difficult, with odd pricing constraints in ESXi 5.0 in July 2011. The market has moved on to 2012, and ESXi 5.1 has been released, fixing some of VMWare's problems.

Compatibility Issue Resolved:
If customers wanted to move an older VM to newer hardware, the VMs needed to be upgraded. In other words, there were compatibility issues which needed to be resolved. VMs created under ESX Server 3.5 and later will now run under ESXi 5.1 unchanged. This is good news for service providers.

No Longer Windows Bound:
Customers who had VMWare ESXi were required to use a lousy Microsoft Windows platform to manage the VMWare platform. When managing an ESXi server in a DMZ, this made little sense for a service provider. This has now been resolved, with a web interface.

Memory Tax Issue Resolved:
The pricing constraints of ESXi 5.0 forced service providers to decide: is VMWare the correct hypervisor for the job... is Windows and/or Linux worth the aggravation of being nickel-and-dimed to death? When trying to determine hardware and hypervisor pricing for a new cluster - where one does not know exactly how much memory will be required per instance, because the infrastructure is being purchased by a managed services provider before the first customer deal is sold - how does one know how much to buy?

Clearly, EMC's VMWare did not have a clue. The confusion that the pricing placed upon managed service providers negatively impacted purchasing of other EMC software products such as ITOI (aka Ionix, aka SMARTS) and RSA Archer, enVision, etc. If a managed service provider cannot determine what to buy, they will not buy from that vendor. Solaris is clearly the better choice for Network Management, and other vendors are clearly the better choice for tools bound to VMWare & Windows.

The removal of the memory constraints in ESXi 5.1 was a good move to simplify pricing. EMC Software is now in a better position to compete against other virtualized platforms.

Outstanding Core Issues:
For reasonable flexibility in the data center environment, when there is a spike in usage, there needs to be a way to easily migrate heavily used live instances to less utilized hypervisors. Dynamic migration with autobalancing is included with Oracle LDoms, but is not quite there yet with VMWare.

When dealing with network virtualization, if one is trying to emulate a WAN environment, one could spin up dozens of zones under a Solaris 11 platform and apply WAN characteristics (latency, throughput, etc.) to the virtual network. Technology like Solaris Crossbow is missing from VMWare; a sketch of the Crossbow approach follows.
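
As a minimal sketch (the link names and the bandwidth cap are illustrative, and latency shaping would require additional tooling beyond Crossbow), a private virtual switch is created, a virtual NIC is attached to it, and the NIC's throughput is capped to a T1-like rate:

# dladm create-etherstub stub0
# dladm create-vnic -l stub0 vnic0
# dladm set-linkprop -p maxbw=1.5M vnic0
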
Conclusions:
VMWare is a great benefit to the Windows and Linux world, but constraints by the vendor made purchasing difficult and implementation less desirable. Some of the issues have been resolved, but management is not yet what it needs to be for managed service providers.

Wednesday, June 20, 2012

EMC: Building The Cloud, Kicking Cisco Out?


Abstract:
EMC used to be a partner in the Data Center with a close relationship with vendors such as Sun Microsystems. With the movement of Sun to create ZFS and their own storage solution, the relationship was strained, with EMC responding by suggesting the discontinuance of software development on Solaris platforms. EMC purchased VMWare and entered into a partnership with Cisco - Cisco produced the server hardware in the Data Center while EMC provided VMWare software and with EMC storage. The status-quo is poised for change, again.

[EMC World 2012 Man - courtesy: computerworld]

EMC World:
Cisco, a first-tier network provider of choice, had started building their own blade platforms and entered into a relationship with EMC for storage and OS virtualization (VMWare) technology. EMC announced, just days ago during EMC World 2012, that they will start producing servers. EMC - a cloud virtualization provider, a cloud virtual switch provider, a cloud software management provider, and a cloud storage provider - has now moved into being a cloud server provider as well.

Cisco Response:
Apparently aware of the EMC development work before the announcement, Cisco released FlexPods with NetApp. The first release of FlexPods can be managed by EMC management software, because VMWare is still the hypervisor of choice. There is a move towards supporting Hyper-V in a future release of FlexPods. There is also a movement towards providing a complete management solution through Cisco Intelligent Automation for Cloud. Note that EMC's VMWare vCenter sits as only a small brick in the solution Cisco acquired, which includes NewScale and Tidal.

[Cisco-NetApp FlexPod courtesy The Register]

NetApp Position:
NetApp's Val Bercovici, CTO of Cloud, declares "the death of [EMC] VMAX." Cisco was rumored to have been in a position to buy NetApp in 2009 and 2010; now, with EMC marginalizing Cisco in 2012, NetApp becomes more important - and NetApp's stock is dropping like a stone.
[former Sun Microsystems logo]
Cisco's Mishap:
Cisco - missing server hardware, a server hypervisor, a server operating system, tape storage, disk storage, and management technologies - decided to enter into a partnership with EMC. Why this happened is a mystery, when system administrators in data centers used to use identical console cables for Cisco and Sun equipment - that should have been their first clue.

Had Cisco been more forward-looking, they could have purchased Sun and acquired all of their missing pieces: Intel, AMD, and SPARC servers; Xen on x64 Solaris and LDoms on SPARC; Solaris on Intel and SPARC; StorageTek; ZFS Storage Appliances; and Ops Center for multi-platform systems management.

Cisco now has virtually nothing but blade hardware, and has started acquiring management software [NewScale and Tidal]... will NetApp be next?

[illumos logo]

Recovery for Cisco:
An OpenSolaris base with hypervisor and ZFS is the core of what Cisco really needs to rise from the ashes of their missed purchase of Sun and unfortunate partnership with EMC.

From a storage perspective, ZFS is mature, providing a near-superset of all features offered by competing storage subsystems (where is the embedded Lustre?). If someone could bring clustering to ZFS, there would be nothing missing - making ZFS a complete superset of everything on the market.

Xen was created around the need for OpenSolaris support, so Xen could easily be resurrected with a little investment by Cisco. Cloud provider Joyent created KVM on top of OpenSolaris and donated the work back to Illumos, so Cisco could easily fill their hypervisor need, to compete with EMC's VMWare.

[SmartOS logo from Joyent]
SGI figured out they needed a first-class storage subsystem, and placed Nexenta (based upon Illumos) in their server lineup. What Cisco really needs is a company like Joyent (based upon Illumos) - to provide storage and a KVM hypervisor. Joyent would also provide Cisco with a cloud solution - a completely integrated stack, from the ground on up... not as valuable as Sun, but probably a close second, at this point.

Thursday, June 14, 2012

Network Management at EMC World 2012

 
[EMC World 2012 Man - courtesy: computerworld]


Abstract:
EMC purchased network management vendor SMARTS, with their InCharge suite, a number of years ago, rebranding the suite as Ionix. EMC purchased Voyence, rebranding it as NCM (Network Configuration Manager). After EMC World 2012, they completed the acquisition of Watch4Net APG (Advanced Performance Grapher). The suite of these platforms is now being rolled into a single new brand called EMC IT Operations Intelligence. EMC World 2012 was poised to advertise the new branding in a significant way.
Result:
EMC World 2012 in Las Vegas, Nevada was unfortunately pretty uneventful for service providers. Why was it uneventful?

The labs for EMC IT Operations Intelligence did not function. There were a lot of other labs which functioned, but not the Network Management labs. EMC World 2012 was a sure "shot-in-the-head" for demonstrating, to service providers, the benefits of running EMC Network Management tools in a VM.

After 7 days, EMC could not get their IT Operations Intelligence Network Management Suite running in a VMWare VM.

Background:
Small customers may host their network management tools in a VMWare VM. Enterprises will occasionally implement their network management systems on smaller systems, where they know they will get deterministic behavior from the underlying platform.

Service Providers traditionally run their mission-critical network management systems on larger UNIX systems, so as to provide instant scalability (swap in CPU boards) and 99.999% availability (reboot once a year, whether they need to or not).

The platform of choice in the Service Provider market for scalable Network Management platforms has been SPARC Solaris, for decades... clearly, for a reason. This was demonstrated well at EMC World 2012.

The Problem:
Why not host a network management platform in a VMWare infrastructure? Besides the fact that EMC could not make it happen, after 1 year of preparation and 7 days of struggling... there are basic logistics.

Network Management is dependent upon ICMP and SNMP.  Both of these protocols are "connectionless protocols" - sometimes referred to as "unreliable protocols". Why would a network management platform use "unreliable protocols"?

The IETF understands that network management should always be light: each poll is a single packet, while a TCP-based protocol requires a 3-way handshake to start the transaction, the poll of the single packet, and then a breakdown with another handshake. Imagine doing this for thousands of devices every x seconds - not very lightweight, not very smart. A "connection based protocol" will also hide the nature of an unreliable underlying network, which is exactly what a network management platform is supposed to expose - so it can be fixed.
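
As a rough illustration (the device count and polling interval are assumed purely for the arithmetic; a TCP exchange is counted as a 3-way handshake, request, response, and 4-packet teardown):

  UDP: 10,000 devices x 2 packets / 60 seconds =   ~333 packets/second
  TCP: 10,000 devices x 9 packets / 60 seconds = ~1,500 packets/second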

Now place a network management platform in a VM, where the network connection runs from the VM (holding an operating system with a TCP/IP stack) down through the hypervisor (another operating system with another TCP/IP stack, which is also sharing the resources of that VM with other VMs). If there is the slightest glitch in the VM or the hypervisor which causes packets to be queued or dropped, the VMWare infrastructure itself will signal to the Network Management Centers that there is a network problem in their customer's network!

Politics:
Clearly, someone at EMC does not understand Network Management, nor do they understand Managed Service Providers.

The Network Management Platform MUST BE ROCK SOLID, so the Network Operations Center personnel will NEVER mistake alerts in their console from a customer's managed device for a local performance issue in their VM.

EMC used Solaris to reach into the Telco Data Centers, and EMC later used Cisco to reach into the Telco Data Centers - now EMC is done using their partners. VMWare was the platform of choice to [not] demonstrate their Network Management tools on. Cisco was the [soon to be replaced] platform of choice, since EMC announced they will start building their own servers.

Either someone at EMC is asleep at the wheel, or they need to grow a spine to support their customers. Either way, this does not bode well for EMC as a provider of software solutions for service providers.


Business Requirements:
In order for a real service provider to reliably run a real network management system in a virtualized environment:
  • The virtualized platform must not insert any overhead.
  • All resources provided must be deterministic.
  • Patches are installed while the system is live.
  • Engagement of patches must be deterministic.
  • Patch engagement must be fast.
  • Rollback of patches must be deterministic.
  • Patch rollback must be fast.
  • Availability must be 99.999%.




Solutions:
There are many platforms which fulfill these basic business requirements, but none of them are VMWare. Ironically, only the SPARC Solaris platform is currently supported by EMC for IT Operations Intelligence, EMC does not support SPARC Solaris under VMWare, and EMC chose not to demonstrate their Network Management suite on a platform which meets service provider requirements.

Today, Zones is about the only virtualization technology which offers 0%-overhead virtualization. (Actually, on SMP systems, virtualizing via Zones can increase application throughput, if Zones are partitioned by CPU board.) Zones, to work in this environment, seem to work best with external storage providers, like EMC. A sketch of such a partitioned zone follows.
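
As a minimal sketch (the zone name and CPU count are illustrative; strict per-CPU-board placement would additionally involve resource pools, beyond this sketch), a zone can be given dedicated CPUs so its behavior stays deterministic:

# zonecfg -z nmzone
zonecfg:nmzone> create
zonecfg:nmzone> set zonepath=/zones/nmzone
zonecfg:nmzone> add dedicated-cpu
zonecfg:nmzone:dedicated-cpu> set ncpus=8
zonecfg:nmzone:dedicated-cpu> end
zonecfg:nmzone> commit
zonecfg:nmzone> exit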

Any platform which offers a 0% virtualization penalty with ZFS support can easily meet a service provider's technical platform business requirements. Of these, the first three below are probably the best supported by commercial interests:
  • Oracle SPARC Solaris
  • Oracle Intel Solaris
  • Joyent SMART OS
  • OpenIndiana
  • Illumian
  • BeleniX
  • SchilliX
  • StormOS
Conclusion:
Today's market is becoming more proprietary with each passing day. The movement towards supporting applications only under proprietary solutions (such as VMWare) demonstrated its risk during EMC World 2012. A network management provider would be ill-advised to use any network management tool which is bound to a single proprietary platform element and does not support POSIX platforms.

Tuesday, May 22, 2012

EMC: DataBridge


The Register posted an interesting article regarding EMC DataBridge, which is being displayed during EMC World 2012 in Las Vegas Nevada this week.
DataBridge will support EMC's ProSphere, Data Protection Advisor, Unified Infrastructure Manager, IT Operations Intelligence Suite, Storage Configuration Advisor and EMC AppSync… There will be two DataBridge apps from EMC covering chargeback and resource analysis for visualisations of storage capacity utilisation… Actually DataBridge has an obvious possible extension to cover Vblocks, with storage, network and compute data coming across the data bridge, as it were, from the Vblock's component Cisco and EMC physical gear and the VMware software gear.


What makes this interesting is the variety of software platforms covered in the EMC DataBridge solution. It is a good sign to see them brought in under a single umbrella.

Network Management Connection
EMC is primarily a storage company, which did not have a real management solution for complete storage solutions... but at least they understood managed services. They produced a suite of applications to manage their storage solutions, which were available under multiple hardware and multiple OS platform vendors.

There was a company called SMARTS (an acronym for Systems Management ARTS) which specialized in multi-vendor network fault management (the suite was branded as SMARTS InCharge.) SMARTS understood managed services accounts, where multiple hardware and OS platform vendors were supported. They were later purchased by EMC with their tool suite rebranded as EMC Ionix. Now, EMC's management tool suite is being rebranded again as IT Operations Intelligence Suite, as can be seen above as one of the feeds into DataBridge. Once again, SMARTS was available under multiple hardware and OS platform vendors.

There was another company called Voyence, which specialized in Configuration and Policy Management of multi-vendor network equipment. Voyence was considered a best-in-class provider, also attractive to the Service Provider business, since they supported multiple hardware and OS platform vendors. They were also purchased by EMC. Their product was brought under the Ionix umbrella and re-branded as NCM, or Network Configuration Manager. This is rolled into the IT Operations Intelligence Suite.

EMC also purchased a hypervisor company called VMWare. The union of EMC with VMWare seemed odd, but it started to become more clear. With EMC providing storage and a hypervisor, they partnered with Cisco and produced a product called Vblock, noted in the article. This is where EMC began to lose focus on the Managed Services arena. With multi-hardware and multi-OS vendor support disappearing from their portfolio, it seems EMC is dropping support for platforms which will not run under their proprietary VMWare hypervisor.

Conclusion:
While DataBridge looks like an interesting tool for EMC shops, it should be noted that EMC appears to be backing away from Managed Service Accounts. VMWare still runs on platforms other than Cisco's, but one might suggest (from recent EMC history) that this is only the case because EMC did not buy Cisco.

If DataBridge will be capable of running on any non-VMWare-hosted operating system, that would be a surprise, given EMC's trend of abandoning Managed Services Accounts. If EMC wants DataBridge to be a serious competitor in the market, EMC will have to demonstrate a commitment to partners not locked into VMWare.