Thursday, October 31, 2019

How to Kill a Zombie in Solaris

Abstract:

When a parent spawns a child process, the parent is sent a signal (SIGCHLD) once the child process has died or was terminated, and is expected to collect the child's exit status. If the parent dies first, the init process inherits the children, and will collect their exit status once the children die. Collecting the exit status of a dead child is called "reaping". Sometimes, things do not go as planned. It is a good topic for Halloween.

[artwork for "ZombieLoad" malware, courtesy zombieloadattack]

When things do not go as planned:

It may take a few minutes for a dead child to be reaped by its parent or by the init process, which is quite normal.

If child processes are dying and the parent is not reaping them, each child remains in the process table and becomes a Zombie, not taking memory or CPU, but consuming a process slot. Under modern OS's, like Solaris, the process table can hold millions of entries, but zombies still consume kernel resources, as well as userland resources whenever the process table needs to be parsed.
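
To make "reaping" concrete, here is a minimal shell sketch (an illustration only, not taken from any Solaris source). The shell plays the parent: it starts a child in the background and collects the child's exit status with the wait builtin, which is the shell-level counterpart of a C program calling wait() or waitpid(). The function name run_and_reap is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start a child that exits with the given status,
# then reap it by collecting that status in the parent (this shell).
run_and_reap() {
    sh -c "exit $1" &                # spawn the child in the background
    child=$!                         # remember the child's PID
    status=0
    wait "$child" || status=$?       # reap: collect the child's exit status
    echo "child exited with status $status"
}

run_and_reap 3
```

A parent that never performs the equivalent of the wait step is the kind of program that leaves zombies behind.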

Identifying Zombies

Zombies are most easily identified as "defunct" processes.
# ps -ef | grep defunct
root 1260 1 0 - ? 0:00 <defunct>
This defunct process would normally be managed by the parent process, which is "1" or init, but in this case we can clearly see that this process is not disappearing.
# ps -ef | grep init
root 1 0 0 Oct 25 ? 8:51 /sbin/init
But why call them Zombies and not just Defunct?
$ ps -elf | egrep '(UID|defunct)'
 F S  UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
 0 Z root 125 4549 0   0  -    -  0     -     -   ? 0:00 <defunct>
The "S" or "State" flag identifies the defunct process with a "Z" for Zombie, and all can see them.
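
When there are many zombies, it helps to know which parent is failing to reap them. The following is a small sketch, assuming the "ps -eo s,pid,ppid,comm" output format (state, PID, parent PID, command). The helper name zombies_from_ps is hypothetical, and the sample rows are made up apart from the zombie row, which mirrors the listing above.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -eo s,pid,ppid,comm" output on stdin and
# print one line per zombie (state Z) with its PID and parent PID.
zombies_from_ps() {
    awk 'NR > 1 && $1 == "Z" { print "pid=" $2 " ppid=" $3 }'
}

# Sample input standing in for live ps output.
zombies_from_ps <<'EOF'
S   PID  PPID COMMAND
S     1     0 /sbin/init
Z   125  4549 <defunct>
S  4549     1 /usr/bin/app
EOF
```

On a live system the same filter would be fed with: ps -eo s,pid,ppid,comm | zombies_from_ps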

(Plus, this is being published on Halloween, or All Hallows' Eve, the day before All Hallows' Day or All Saints' Day... this is when people remember the death of the "hallows" or Saints & Martyrs, who had passed on before. So, let's also remember the deaths of the processes!)


[The Grim Reaper, courtesy Encyclopedia Britannica]

To Kill a Zombie:

How does one kill a Zombie?
Well, they are already dead... in the movies, they are shot in the head.
In the modern operating system world of Solaris, we seek the reaper, we Don't Fear The Reaper.

The tool is called Process Reap or "preap" - the manual page is wonderfully descriptive!
# preap 1260
1260: exited with status 0
It should be noted that processes being traced cannot be reaped, that damage can occur to the parent process if the child is forcibly reaped, and that the OS may also place restrictions on reaping recently terminated processes.

To force a reaping, one can place a proverbial "bullet in the head" of the zombie.
# preap -F 125
125: exited with status 0
So, there we go, two dead zombies, see how they no longer run.
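
With thousands of zombies, typing each PID by hand is not practical. The sketch below is a cautious dry run, assuming the "ps -eo s,pid,ppid,comm" output format: it only prints the preap commands it would run, so that the list can be reviewed first. The helper name preap_dry_run and the sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: read "ps -eo s,pid,ppid,comm" output on
# stdin and print (not execute) one preap command per zombie.
preap_dry_run() {
    awk 'NR > 1 && $1 == "Z" && $2 ~ /^[0-9]+$/ { print "preap " $2 }'
}

# Sample input standing in for live ps output.
preap_dry_run <<'EOF'
S   PID  PPID COMMAND
Z  1260     1 <defunct>
Z   125  4549 <defunct>
S  4549     1 /usr/bin/app
EOF
```

Printing rather than executing keeps the decision with the administrator, which matters given the caveats about forcibly reaped children noted above.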

Conclusion:

This administrator had personally seen poorly written C code leaving thousands of zombies behind daily. The application development team no longer had any C programmers on their staff, so this was a good option. It should be carefully exercised on a development or test box, to evaluate the results on the application, before conducting the procedure in production.

Monday, October 28, 2019

Germany: Oracle Updates on SPARC & Solaris 11.4

Abstract

Oracle CloudDay will be opening in various countries around the world, from 2019q4 to 2020q1! November & December of 2019 will afford people in Germany the opportunity to discuss the continued advances in Oracle's SPARC Solaris!

The Annunciation

Joerg Moellenkamp published a short announcement in German, which is translated to English:

Business breakfast in HAM, FRA, DUS, MUC and BER in November / December 2019 ...

Posted by Joerg Moellenkamp on Monday, October 28, 2019
This is a German-language event; the following text is translated from German:

After a long break we would like to continue the series of business breakfasts and invite you to join us.

This time it's all about SPARC news, Solaris 11.4, the operation of Solaris, and the technical issues of consolidation and cloud native computing on the Oracle Private Cloud Appliance:


    Our partner Marcel Hofstetter will present the tool "JomaSoft VDCF", with which the operation of Solaris can be made more efficient.

    Before that, Jörg Möllenkamp (Oracle) will report on the news that came with the Solaris 11 SRUs and on the renewal of legacy systems.

    The event concludes with a presentation on consolidation and Cloud Native Computing on the Oracle Private Cloud Appliance by Jan Brosowski and Thomas Müller (also Oracle).


The agenda:
09:00 breakfast
09:30 Welcome and OOW News
09:45 News in Solaris 11 and experiences with the refresh of SPARC systems
11:00 break and continuation of breakfast
11:15 JomaSoft VDCF - Efficient Solaris operation (Marcel Hofstetter, JomaSoft)
12:15 PCA - consolidating today's world and looking to the future with Cloud Native
13:15 End of the event

The event takes place at 5 locations in Germany. If you would like to attend one of the events, please register by e-mail with the contact listed for that date.

Hamburg 5.11.
Oracle office Hamburg
Kühnehöfe 5 (corner of Kohlentwiete), 22761 Hamburg
Registration with Hans-Peter Hinrichs

Frankfurt 27.11.
Oracle office Frankfurt
Neue Mainzer Straße 46-50 (Garden Tower), 60311 Frankfurt
Registration with Matthias Burkard

Dusseldorf 3.12.
Oracle office Dusseldorf
Rolandstraße 44, 40476 Dusseldorf
Registration with Michael Färber

Munich 11.12.
Oracle headquarters and Munich office
Riesstrasse 25, 80992 Munich
Registration with Elke Freymann

Berlin 12.12.
Oracle Customer Visit Center Berlin
Behrenstraße 42 (Humboldt Carré), 10117 Berlin
Registration with Hans-Peter Hinrichs

This event is certainly also interesting for colleagues from other departments. I would be very happy if you forward this invitation.

We look forward to your visit!

PS: We do not want to miss the opportunity to refer you to the Modern Cloud Days in Darmstadt. This Oracle event will take place on 11.12. It will provide clients with exciting cloud insights, and Oracle will report on concepts, ideas and best practices in keynotes and sessions. For this event, you can sign up at https://www.oracle.com/cloudday. The link is not for the business breakfast.
 The slides will be welcomed!

Concluding Thoughts

It is always good to get news on the SPARC Solaris front, for those workloads which do not run as well on other platforms.

Monday, October 7, 2019

Solaris 11.4: Eliminating Silent Data Corruption

Abstract:

Storage has been increasing in geometric proportions for decades. As storage has been increasing, a problem referred to as Silent Data Corruption has been noticed. Forward-thinking engineers at Sun Microsystems had created ZFS to manage this risk by having discovery & correction occur passively & automatically upon future reads & writes. Oracle later purchased Sun Microsystems and introduced proactive automated discovery & correction on a monthly basis, as part of Solaris 11.4.

The Problem:

Silent Data Corruption has been measured by various industry players dealing with massive quantities of storage.
The fast database at Greenplum, which is a database software company specializing in large-scale data warehousing and analytics, faces silent corruption every 15 minutes.[9] As another example, a real-life study performed by NetApp on more than 1.5 million HDDs over 41 months found more than 400,000 silent data corruptions, out of which more than 30,000 were not detected by the hardware RAID controller. Another study, performed by CERN over six months and involving about 97 petabytes of data, found that about 128 megabytes of data became permanently corrupted.
As storage continues to expand, the need to resolve silent corruption becomes more important.

The Passive Solution:

Jeff Bonwick at Sun Microsystems created ZFS specifically to address storage as data storage quantities increased. The ZFS File System was not a 32-bit file system, like 30-year-old technology, but was engineered to be a 128-bit file system, projected to accommodate data growth over the next 30 years. With such a massive quantity of data to be retained, Silent Data Corruption was addressed by computing a checksum of the data during the write and verifying it on future reads. If the checksum does not match on a read, a redundant copy of the block on the ZFS File System is automatically read, and the bad block is rewritten with the correct data. This feature was unique to Solaris.
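
The checksum-on-write, verify-on-read, repair-from-redundancy idea can be illustrated with a toy shell sketch. This is not how ZFS is implemented: it models a two-way mirror as two plain files and uses the POSIX cksum utility as the checksum. All file names and helper names (side_a, side_b, mirror_write, mirror_read) are hypothetical.

```shell
#!/bin/sh
# Toy model of checksum-on-write, verify-on-read, repair-from-mirror.

# Print the CRC of a file (first field of POSIX cksum output).
block_sum() {
    cksum < "$1" | awk '{ print $1 }'
}

# Write the same data to both mirror sides and record its checksum.
mirror_write() {    # usage: mirror_write DIR DATA
    printf '%s' "$2" > "$1/side_a"
    printf '%s' "$2" > "$1/side_b"
    block_sum "$1/side_a" > "$1/checksum"
}

# Read side_a; if its checksum is wrong, fall back to side_b and repair side_a.
mirror_read() {     # usage: mirror_read DIR
    want=$(cat "$1/checksum")
    if [ "$(block_sum "$1/side_a")" = "$want" ]; then
        cat "$1/side_a"
    elif [ "$(block_sum "$1/side_b")" = "$want" ]; then
        cp "$1/side_b" "$1/side_a"       # heal the bad copy
        cat "$1/side_a"
    else
        echo "unrecoverable: both copies fail the checksum" >&2
        return 1
    fi
}

# Demo: write, silently corrupt one side, then read.
demo=${TMPDIR:-/tmp}/mirror_demo.$$
mkdir -p "$demo"
mirror_write "$demo" "important data"
printf 'imp0rtant data' > "$demo/side_a"    # simulate silent corruption
mirror_read "$demo"; echo
rm -rf "$demo"
```

The read returns the good data and repairs the damaged copy as a side effect, which is the behaviour described above. Without a second copy, the corruption could only be detected, not corrected.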

A system administrator can read every block via an operation referred to as a "scrub".
sc25client01/root# zpool list rpool
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  416G   296G  120G  71%  1.00x  ONLINE  -


sc25client01/root# zpool scrub rpool

sc25client01/root#
This scrub will continue in the background until every allocated block on every disk has been read. The scrub runs at a low priority, which is intended to keep it from interfering with the operation of the platform or applications.


The Proactive Solution:

With the release of Solaris 11.4, formerly known as Solaris 12, a scrub that reads every block of data in the storage pool is scheduled by default once a month. By reading every block of data once a month, silent data corruption can be rooted out and corrected automatically, which is a unique feature of Oracle's Solaris!

Under an older OS release (Solaris 11.3 SRU 31), notice that the lastscrub property does not exist.
sc25client01/root# uname -a
SunOS sc01client01 5.11 11.3 sun4v sparc sun4v

sc25client01/root# pkg list entire
NAME (PUBLISHER) VERSION                    IFO
entire           0.5.11-0.175.3.31.0.6.0    i--

sc25client01/root# zpool get lastscrub rpool
bad property list: invalid property 'lastscrub'
For more info, run: zpool help get
Under a modern OS release (Solaris 11.4 SRU 13), the last scrub occurred less than a month ago.
sun9781/root# uname -a
SunOS sun1824-cd 5.11 11.4.13.4.0 sun4v sparc sun4v

sun9781/root# pkg list entire
NAME (PUBLISHER) VERSION                    IFO
entire           11.4-11.4.13.0.1.4.0       i--

sun9781/root# zpool get lastscrub rpool
NAME   PROPERTY   VALUE   SOURCE
rpool  lastscrub  Sep_10  local
The details of the last scrub can be seen through the status subcommand.
sun9781/root# zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  278G  36.9G  241G  13%  1.00x  ONLINE  -

sun9781/root# zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
  scan: scrub repaired 0 in 16m24s with 0 errors on Tue Sep 10 03:42:44 2019

config:
        NAME                       STATE      READ WRITE CKSUM
        rpool                      ONLINE        0     0     0
          mirror-0                 ONLINE        0     0     0
            c0t5000CCA0251CF0F0d0  ONLINE        0     0     0
            c0t5000CCA0251E4BC8d0  ONLINE        0     0     0

errors: No known data errors
A scrub reads only allocated blocks, so the 36.9 Gigabytes in use on the above 278 Gigabyte pool were read in a little over 16 minutes, and checked with no errors to be corrected.
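
As a rough worked example, the scan line can be turned into an average scrub rate. The sketch below assumes the scan line format shown above, assumes that durations with hours would look like 1h2m3s, assumes that zpool reports G as GiB, and needs a POSIX awk. The helper names are hypothetical, and the figure ignores that a mirror reads each copy of the data.

```shell
#!/bin/sh
# Hypothetical helpers for reading the "scan:" line of zpool status.

# Pull the duration token (e.g. 16m24s) out of a scan line read from stdin.
scrub_duration() {
    sed -n 's/.*scrub repaired .* in \([0-9hms]*\) with .*/\1/p'
}

# Convert a duration such as 16m24s or 1h2m3s into seconds.
duration_to_seconds() {
    echo "$1" | awk '{
        total = 0
        s = $0
        while (match(s, /^[0-9]+[hms]/)) {
            n = substr(s, 1, RLENGTH - 1) + 0
            unit = substr(s, RLENGTH, 1)
            if (unit == "h") total += n * 3600
            else if (unit == "m") total += n * 60
            else total += n
            s = substr(s, RLENGTH + 1)
        }
        print total
    }'
}

# Average rate in MiB/s for ALLOC_GIB of allocated data read in SECONDS.
scrub_rate() {      # usage: scrub_rate ALLOC_GIB SECONDS
    awk -v g="$1" -v s="$2" 'BEGIN { printf "%.1f\n", g * 1024 / s }'
}

line='  scan: scrub repaired 0 in 16m24s with 0 errors on Tue Sep 10 03:42:44 2019'
secs=$(duration_to_seconds "$(echo "$line" | scrub_duration)")
echo "scrub took $secs seconds, about $(scrub_rate 36.9 "$secs") MiB/s"
```

Tracking this figure from month to month gives a simple way to estimate how long a scrub of a much fuller pool might take.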

Conclusions:

Network Management is well aware that the more storage is needed, the more critical the data recovery process becomes. Redundancy through advanced file systems like ZFS, under managed-services-class operating systems like Solaris, is a good choice. Solaris 11.4 helps keep data healthy, no matter how many physical disks are managed or how much data is retained.