Tuesday, December 31, 2019

SSH Timeout Across a Firewall


Firewalls represent a "choke point" between a Data Center and The Internet. The function of the Firewall is to protect the Data Center from unauthorized & malicious access. This "choke point" also typically terminates socket connections which have been idle for extended periods of time, to reduce the number of unnecessary connections which need to be statefully inspected. This termination of idle sockets will sometimes interrupt administrative sessions over SSH while longer running interactive jobs (e.g. backups, software installs, manual data loads, etc.) are in progress, and an interrupted job can leave a database corrupt. KeepAlive functionality in SSH can be engaged to inhibit this behavior.

What is the Error?

When a firewall terminates the connection, the client connecting to the Solaris Server in the remote Data Center may exhibit the following error message:

Received disconnect from 2: Timeout, your session not responding.

And the connection is terminated.

In an outsourced Data Center environment, a controlling TTY over an SSH connection was being terminated after 10 idle minutes, while manually running an interactive backup script. The script produced no output to the controlling TTY during the copy of a significant quantity of data across a WAN connection.

What is a KeepAlive?

A KeepAlive packet is normally a 0 byte packet, sent along an open SSH session at a regular interval, to keep a firewall from assuming the connection is idle during longer periods of non-interactivity.

Even though the packet carries no payload, and so adds nothing to the text sent or received by the application, it still has headers which are seen by the firewall. Those headers keep the firewall from terminating the session for lack of traffic.
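At the TCP level, a keepalive is requested per socket. The following is a minimal Python sketch of that idea, assuming the KeepAlive setting corresponds to the standard SO_KEEPALIVE socket option; the function name make_keepalive_socket is hypothetical and nothing here is taken from the SSH source code.

```python
import socket


def make_keepalive_socket():
    """Return an unconnected TCP socket with keepalive probes enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the kernel to probe the peer when the connection is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    # A non-zero value means the option is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

The timing of such kernel probes is an operating system setting and can be much longer than a firewall's idle limit, which is one reason an application level interval is also worth configuring.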

How to Configure KeepAlive

Under Solaris 11.3, there is a system-wide configuration file which can be updated on the server receiving the connections. By default, KeepAlive functionality is disabled under Solaris 11.3.

SUN0101/root# egrep '(KeepAlive|ClientAlive)' /etc/ssh/sshd_config
# KeepAlive specifies whether keep alive messages are sent to the client.
#KeepAlive yes
ClientAliveCountMax 0
ClientAliveInterval 600

The following adjustment will enable KeepAlives to be sent every 120 seconds, while forcing a disconnection after 240 seconds without a response (so the firewall is always seeing traffic, and a truly unresponsive connection will be terminated by the SSH server instead).

SUN0101/root# egrep '(KeepAlive|ClientAlive)' /etc/ssh/sshd_config
KeepAlive yes
ClientAliveCountMax 2
ClientAliveInterval 120
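The 240 second figure is the interval multiplied by the allowed number of missed responses. A minimal Python sketch of that arithmetic follows; the function name client_alive_timeout is hypothetical, and the fallback values of 0 and 3 are the OpenSSH defaults, assumed here, which may differ for the SSH server shipped with Solaris.

```python
def client_alive_timeout(config_text):
    """Return seconds until an unresponsive client is dropped, or None if probes are off."""
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out directives
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Assumption: the first occurrence of a keyword wins, as in OpenSSH.
            settings.setdefault(parts[0].lower(), parts[1].strip())
    interval = int(settings.get("clientaliveinterval", 0))    # assumed default
    count_max = int(settings.get("clientalivecountmax", 3))   # assumed default
    if interval == 0:
        return None  # no client-alive messages are sent
    return interval * count_max


if __name__ == "__main__":
    sample = "KeepAlive yes\nClientAliveCountMax 2\nClientAliveInterval 120\n"
    print(client_alive_timeout(sample))
```

The interval should be chosen comfortably below the firewall's idle limit (10 minutes in the case described above), so that traffic is always seen before the firewall gives up.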

A backup of the original file should be made, in case the change needs to be rolled back.
Console access to the system will be needed to perform the roll back, if ssh is mis-configured.
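One way to take such a backup is sketched below in Python. The helper name backup_file and the timestamped ".bak" naming are assumptions made for illustration; a plain copy with any other tool serves equally well.

```python
import shutil
import time


def backup_file(path):
    """Copy path to path.<timestamp>.bak and return the name of the copy."""
    backup_path = "{}.{}.bak".format(path, time.strftime("%Y%m%d%H%M%S"))
    # copy2 also preserves the modification time and permission bits.
    shutil.copy2(path, backup_path)
    return backup_path
```

Run with sufficient privileges, backup_file("/etc/ssh/sshd_config") would leave a dated copy next to the original, which can be copied back from the console if the new settings lock out ssh.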

How to Enable KeepAlive

Changing the configuration file alone will not apply the changes to new sessions, nor to open sessions. To apply the change to new sessions, refresh the configuration through the service.

SUN0101/root# svcs ssh
STATE          STIME    FMRI                    .
online         May_03   svc:/network/ssh:default

SUN0101/root# svcadm refresh svc:/network/ssh:default

Any existing ssh sessions will still be timed out by the firewall, within its configured limit.
Any new ssh session will not be timed out by the firewall, with the keep alive enabled.
Any new ssh session which goes into an abnormal state, where the client does not respond, will be terminated by the SSH service in about 4 minutes (2 unanswered KeepAlives, 120 seconds apart).


Network Management of Solaris Systems in Cloud based Data Centers is still quite usable, when firewalls are deployed by Cloud Providers and clean up idle connections.  These types of environments have long been used in mission critical arenas, with secured servers residing in DMZ's and ISZ's - so a remote data center with a perimeter firewall is just "old hat".

Monday, November 11, 2019

Building OpenJDK Under Solaris SPARC and x64

[logo, courtesy OpenJDK]

[Duke Thinking]


There has been much discussion regarding Java and Solaris, across SPARC, Intel x86, and AMD x64. Oracle had released Java into the wild, under a fast continuous release plan, with paid support plans for long term releases. OpenJDK has been the reference implementation of Java SE since Java 7. One concern in the industry has been how to compile the latest versions of Java, and this has been discussed in detail by the various authors cited below.
[Duke Plugging]

Building OpenJDK 12 under Solaris:

Guest Author Petr Sumbera had published an article on Oracle's Solaris Blog on how to compile your own version of OpenJDK 12. What is interesting is that it uses an existing JDK to perform that process!

Compiling OpenJDK 12 requires an existing JDK: JDK 8 under Intel Solaris and JDK 11 under SPARC Solaris. In addition, you will need Oracle Solaris Studio 12.4.

[Duke Drawing]

OpenJDK on SPARC Solaris:

In March 2019, Oracle Author Martin Mueller also published an article on Building OpenJDK Java Releases on Oracle Solaris/SPARC. Again, you will need Oracle Solaris Studio 12.4. This article may help in the build of OpenJDK for SPARC.

[Duke Plumbing]

OpenJDK on Intel or AMD Solaris:

In February 2019, Author Andrew Watkins of the Birkbeck College Computer Science Department also published an article on Building OpenJDK 12 on Solaris 11 x86_64. This followed his October 2018 article on Building OpenJDK 11 on Solaris 11 x86_64, which in turn followed his September 2018 article on Building OpenJDK 10 on Solaris 11 x86_64. These articles may help in the building of OpenJDK for Intel or AMD systems.


The latest Java, license & support free, is available if one wants to roll their own. If commercial support is desired, Java can still be acquired from Oracle with a Long Term Support (LTS) contract.

Monday, November 4, 2019

Distributed Denial of Service, Amazon Cloud & Consequences

[Amazon Web Services Logo, Courtesy Amazon]


The US Military has been involved in advancing the art of computing infrastructure since the early days of computing. With many clouds built inside the Pentagon, an effort to standardize on an external cloud vendor was initiated. Unlike many contracts, where vendors compete with one another for a piece of the pie, this was a "live and let die" contract, for the whole proverbial pie, not just a slice. Many vendors & government proponents did not like this approach, but the proverbial "favoured son", who had a CIA contract, approved. This is that son's story.

Problems of Very Few Large Customers

Very few large customers create distortions in the market.
  1. Many understand that consolidating smaller contracts into very few large contracts is unhealthy. A few very large single consumers, like the Military, create an environment where suppliers will exit the business if they can not win some business, since the number of buyers is too small. This limits the possible suppliers in time of war.
  2. Some complain that personal disputes can get in the way of objective decision making in large business transactions.
  3. Others warn that political partisanship can wreck otherwise potentially terrific technology decisions.
  4. Many complain that only a few large contracts offer opportunity for corruption at many levels, because the stakes are so high for the huge entities trying to gain that business.
  5. In older days, mistakes by smaller suppliers gave opportunity for correction before the next bid... but when very few bids are offered, fleeting opportunities require substantially deep pockets to survive a bid loss.
  6. Fewer customer opportunities discourage innovation, since the risk of being innovative may result in the loss of an opportunity, when the few RFP issuers may be rigidly bound by the restraints of older technology requests and discouraged from higher costing newer technology opportunities.
In the end, these logical issues may not have been the only realistic problems.

[Amazon Gift Card, Courtesy Amazon]

Amazon's Business to Lose

From the very beginning, Amazon's Jeff Bezos had his way in. Former Defense Secretary James Mattis, hired Washington DC Lobbyist Sally Donnelly, who formerly worked for Amazon, and the Pentagon was soon committed to moving all their data to the private cloud. The irony is that Bezos, who has a bitter disagreement with President Trump, now had a proverbial "ring in the nose" of President Trump's "second in command" with the Armed Forces, in 2017.

Amazon's Anthony DeMartino, a former deputy chief of staff in the secretary of defense’s office, who previously consulted for Amazon Web Services, was also extended a job at Amazon, after working through the RFP process.

The requirements of the RFP suspiciously looked like they were tailor-written for Amazon, requesting features that only Amazon could offer. Competitors like Oracle had changed their whole business model, to redirect all corporate revenue into Cloud Computing, to even qualify for the $2 Billion in revenue required to be allowed to bid on the RFP! How did such requirements appear?

Amazon's Deap Ubhi left the AWS Cloud Division to work at the Pentagon, helped create the JEDI procurement contract, and later returned to Amazon. Ubhi, a venture capitalist, worked as one of a four-person team to shape the JEDI procurement process, while in secret negotiations with Amazon to be re-hired for a future job. The Intercept further reminded us:
Under the Procurement Integrity Act, government officials who are “contacted by a [contract] bidder about non-federal employment” have two options: They must either report the contact and reject the offer of employment or promptly recuse themselves from any contract proceedings.
The Intercept also noted that Ubhi accepted a verbal offer from Amazon, for the purchase of one of his owned companies, during the time of his working on the Market Research that would eventually form the RFP.

A third DoD individual, tailoring the RFP, was also offered a job at Amazon, according to Oracle court filings, but this person's name was redacted from the record.

At the highest & lowest levels, the JEDI contract appeared to be "Gift-Wrapped" for Amazon.

[Amazon CEO Jeff Bezos hosting Trump's Former Defense Secretary James Mattis at HQ, courtesy Twitter]

Amazon Navigating Troubled Waters

December 23, 2018, President Trump pushed out Secretary of Defense James Mattis, after Mattis had offered a resignation letter effective February 2019.

January 24, 2019, Pentagon investigates Oracle's concerns about unfair practices in the hiring of a Cloud Procurement Contract worker from Amazon.

April 11, 2019, Microsoft & Amazon become finalists in the JEDI cloud bidding, knocking out other competitors like Oracle & IBM.

June 28, 2019, Oracle Corporation files lawsuit against Federal Government for creating RFP rules which violate various Federal Laws, passed by Congress, to restrict corruption. Oracle also argued that three individuals, who tilted the process towards Amazon, were effectively "paid off" by receiving jobs at Amazon.

July 12, 2019, Judge rules against Oracle in lawsuit over bid improprieties, leaving Microsoft & Amazon as finalists.

August 9, 2019, Newly appointed Secretary of Defense Mark Esper was to complete "a series of thorough reviews of the technology" before the JEDI procurement is executed.

On August 29, 2019, the Pentagon awarded its DEOS (Defense Enterprise Office Solutions) cloud contract, a 10-year, $7.6 billion deal, to Microsoft, based upon their 365 platform.

On October 22, 2019, Secretary of Defense Mark Esper withdrew from reviewing bids on the JEDI contract, due to his son being employed by one of the previous losing bidders.

Serendipity vs Spiral Death Syndrome

Serendipity is the occurrence and development of events by chance with a beneficial result. The opposite may be Spiral Death Syndrome, when an odd event may create a situation where catastrophic failure becomes unavoidable.

What happens when an issue, possibly out of the control of a bidder, becomes news during a vendor choice?

This may have occurred with Amazon AWS, in their recent bid for a government contract. Amazon pushed to have the Pentagon Clouds outsourced, at one level below The President and even had the rules written for an RFP, to favor a massive $10 Billion 10 year single contract agreement favoring them.

October 22, 2019, Amazon Web Services was hit by a Distributed Denial of Service (DDoS) attack, taking down users of Amazon AWS for hours. Oddly enough, it was a DNS attack, centered upon Amazon S3 storage objects. External vendors measured the outages to last 13 hours.

On October 25, 2019, the Pentagon awarded its JEDI (Joint Enterprise Defense Infrastructure) cloud contract, a 10-year, $10 billion deal, to Microsoft. The Pentagon had over 500 separate clouds, to be unified under Microsoft, and it looks like Microsoft will do the work, with the help of smaller partners.


Whether the final choice of the JEDI provider was Serendipitous for Microsoft, or the result of Spiral Death Syndrome for Amazon, is for the reader to decide. For this writer, the final stages of choosing a bidder, where the favoured bidder looks like they could have been manipulating the system at the highest & lowest levels of government, even having the final newly installed firewall [Mark Esper] torn down 3 days earlier, is an amazing journey. A 13 hour cloud outage seems to have been the final proverbial "nail in the coffin" for a skilled new bidder who was poised to become the ONLY cloud service provider to the U.S. Department of Defense.

(Full Disclosure: a single cloud outage for Pentagon Data, just before a pre-emptive nuclear attack on the United States & European Allies [under our nuclear umbrella], lasting 13 hours, could have not only been disastrous, but could have wiped out Western Civilization. Compartmentalization of data is critical for data security and the concept of a single cloud seems ill-baked, in the opinion of this writer.)

Thursday, October 31, 2019

How to Kill a Zombie in Solaris


When a parent spawns a child process, the parent is sent a signal (SIGCHLD) once the child process has died or was terminated, and the parent then collects the child's exit status. If the parent dies first, the init process inherits the children, and will receive the signals once the children die. Collecting the exit status is called "reaping". Sometimes, things do not go as planned. It is a good topic for Halloween.

[artwork for "ZombieLoad" malware, courtesy zombieloadattack]

When things do not go as planned:

It may take a few minutes for the exit signal to be reaped by a parent or init process, which is quite normal.

If child processes are dying and the parent is not reaping them, each child remains in the process table and becomes a Zombie, not taking Memory or CPU, but consuming a process slot. Under modern OS's, like Solaris, the process table can hold millions of entries, but zombies still consume kernel resources, and userland resources when process tables need to be parsed.
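The life cycle described above can be reproduced in a few lines. The following is a minimal Python sketch, not taken from any Solaris tool: the function name make_and_reap_zombie, the exit code of 7, and the 0.2 second pause are assumptions chosen for illustration.

```python
import os
import time


def make_and_reap_zombie(exit_code=7):
    """Fork a child that exits at once, then reap it; return (exit status, still listed)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)           # child: terminate immediately
    time.sleep(0.2)                   # parent: give the child time to exit
    os.kill(pid, 0)                   # signal 0 only checks that the PID still exists
    _, status = os.waitpid(pid, 0)    # reap: collect the exit status, freeing the slot
    try:
        os.kill(pid, 0)
        still_listed = True
    except ProcessLookupError:
        still_listed = False          # after reaping, the PID is gone
    return os.WEXITSTATUS(status), still_listed


if __name__ == "__main__":
    print(make_and_reap_zombie())
```

Between the child's exit and the call to waitpid, the child is exactly the kind of defunct entry discussed here. A parent that never makes that call leaves the entry behind for as long as the parent itself lives.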

Identifying Zombies

Zombies are most easily identified as "defunct" processes.
# ps -ef | grep defunct
root 1260 1 0 - ? 0:00 <defunct>
This defunct process would normally be managed by the parent process, which is "1" or init, but in this case we can clearly see that this process is not disappearing.
# ps -ef | grep init
root 1 0 0 Oct 25 ? 8:51 /sbin/init
But why call them Zombies and not just Defunct?
$ ps -elf | egrep '(UID|defunct)'
 0 Z root 125 4549 0 0   -  -    0  -     -     ?   0:00 <defunct>
The "S" or "State" flag identifies the defunct process with a "Z" for Zombie, and all can see them.
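When many zombies are present, picking them out by hand becomes tedious. The following minimal Python sketch filters saved "ps -elf" output by the state column; the function name find_zombies is hypothetical, and the column order (F, S, UID, PID, PPID) is assumed from the sample line above.

```python
def find_zombies(ps_elf_output):
    """Return (pid, ppid) pairs for every line whose state column is 'Z'."""
    zombies = []
    for line in ps_elf_output.splitlines():
        fields = line.split()
        # Assumed column order: F S UID PID PPID ...
        if len(fields) >= 5 and fields[1] == "Z":
            zombies.append((int(fields[3]), int(fields[4])))
    return zombies


if __name__ == "__main__":
    sample = " 0 Z root 125 4549 0 0   -  -    0  -     -     ?   0:00 <defunct>"
    print(find_zombies(sample))
```

The parent PID in each pair points at the process which is failing to reap, which is usually the more useful thing to investigate.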

(Plus, this is being published on Halloween, or All Hallows' Eve, the day before All Hallows' Day or All Saints' Day... this is when people remember the death of the "hallows" or Saints & Martyrs, who had passed on before. So, let's also remember the deaths of the processes!)

[The Grim Reaper, courtesy Encyclopedia Britannica]

To Kill a Zombie:

How does one kill a Zombie?
Well, they are already dead... in the movies, they are shot in the head.
In the modern operating system world of Solaris, we seek the reaper, we Don't Fear The Reaper.

The tool is called Process Reap or "preap" - the manual page is wonderfully descriptive!
# preap 1260
1260: exited with status 0
It should be noted that processes being traced can not be reaped, that damage can occur to the parent process if the child is forcibly reaped, and that the OS may also put restrictions on reaping recently terminated processes.

To force a reaping, one can place a proverbial "bullet in the head" of the zombie.
# preap -F 125
125: exited with status 0
So, there we go, two dead zombies, see how they no longer run.


This administrator had personally seen poorly written C code leaving thousands of zombies behind daily. The application development team no longer had any C programmers on their staff, so this was a good option. It should be carefully exercised on a development or test box, to evaluate the results on the application, before conducting the procedure in production.

Monday, October 28, 2019

Germany: Oracle Updates on SPARC & Solaris 11.4


Oracle CloudDay will be opening in various countries around the world, from 2019q4 to 2020q1! November & December of 2019 will afford people in Germany the opportunity to discuss the continued advances in Oracle's SPARC Solaris!

The Annunciation

Joerg Moellenkamp published a short announcement in German, which is translated to English:

Business breakfast in HAM, FRA, DUS, MUC and BER in November / December 2019 ...

Posted by Joerg Moellenkamp on Monday, October 28, 2019
This is an event in the German language; the following text is in German:

After a long break we would like to continue the series of business breakfasts and invite you to join us.

This time it's all about SPARC news, Solaris 11.4, the operation of Solaris, and the technical issues of consolidation and cloud native computing on the Oracle Private Cloud Appliance:

    Our partner Marcel Hofstetter will present the tool "Jomasoft VDCF", with which the operation of Solaris can be made more efficient.

    Before that, Jörg Möllenkamp (Oracle) will report on the news that came with the Solaris 11 SRUs and on the renewal of legacy systems.

    The event concludes with a presentation on consolidation and Cloud Native Computing on the Oracle Private Cloud Appliance by Jan Brosowski and Thomas Müller (also Oracle)

The agenda:
09:00 breakfast
09:30 Welcome and OOW News
09:45 News in Solaris 11 and experiences with the refresh of SPARC systems
11:00 break and continuation of breakfast
11:15 JomaSoft VDCF - Efficient Solaris operation (Marcel Hofstetter, Jomasoft)
12:15 PCA - consolidate the current world and think with the Cloud Native into the future
13:15 End of the event

The event takes place at 5 locations in Germany. If you would like to attend one of the events, please register by e-mail at the e-mail address stated on the date.

Hamburg 5.11.
Oracle office Hamburg
Kühnehöfe 5 (corner of Kohlentwiete), 22761 Hamburg
Registration with Hans-Peter Hinrichs

Frankfurt 27.11
Oracle office Frankfurt,
Neue Mainzer Straße 46-50 (Garden Tower), 60311 Frankfurt
Registration with Matthias Burkard

Dusseldorf 3.12.
Oracle office Dusseldorf
Rolandstraße 44, 40476 Dusseldorf
Registration with Michael Färber

Munich 11.12.
Oracle headquarters and Munich office
Riesstrasse 25, 80992 Munich
Registration with Elke Freymann

Berlin 12.12.
Oracle Customer Visit Center Berlin
Behrenstraße 42 (Humboldt Carré), 10117 Berlin
Registration with Hans-Peter Hinrichs

This event is certainly also interesting for colleagues from other departments. I would be very happy if you forward this invitation.

We look forward to your visit!

PS: We do not want to miss the opportunity to refer you to the Modern Cloud Days in Darmstadt. This Oracle event will take place on 11.12. It will provide clients with exciting cloud insights, and Oracle will report on concepts, ideas and best practices in keynotes and sessions. For this event, you can sign up at https://www.oracle.com/cloudday. The link is not for the business breakfast.
 The slides will be welcomed!

Concluding Thoughts

It is always good to get news on the SPARC Solaris front, for those workloads which do not run as well on other platforms.

Monday, October 7, 2019

Solaris 11.4: Eliminating Silent Data Corruption


Storage has been increasing in geometric proportions, for decades. As storage has been increasing, a problem referred to as Silent Data Corruption has been noticed. Forward thinking engineers at Sun Microsystems had created ZFS to manage this risk by having discovery & correction occur passively & automatically upon future reads & writes. Oracle later purchased Sun Microsystems and introduced proactive automated discovery & correction on a monthly basis, as part of Solaris 11.4.

The Problem:

Silent Data Corruption has been measured by various industry players dealing with massive quantity of storage.
the fast database at Greenplum, which is a database software company specializing in large-scale data warehousing and analytics, faces silent corruption every 15 minutes.[9] As another example, a real-life study performed by NetApp on more than 1.5 million HDDs over 41 months found more than 400,000 silent data corruptions, out of which more than 30,000 were not detected by the hardware RAID controller. Another study, performed by CERN over six months and involving about 97 petabytes of data, found that about 128 megabytes of data became permanently corrupted.
As storage continues to expand, the need to resolve silent corruption becomes more important.

The Passive Solution:

Jeff Bonwick at Sun Microsystems created ZFS, specifically to address storage as data storage quantities increased. The ZFS File System was not a 32 bit File System, like 30 year old technology, but was engineered to be a 128 bit filesystem, projected to accommodate data for the next 30 years. With such a massive quantity of data to be retained, Silent Data Corruption was addressed by computing a checksum on the data during the write and verifying it on future reads. If the checksum does not match on the read, then a redundant copy of the block on the ZFS File System is automatically read, and the bad block is corrected from it. This feature was unique to Solaris.
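The checksum-on-write, verify-on-read idea can be illustrated with a toy model. The following Python sketch is not ZFS code: the class name ToyMirror, the two in-memory "disks", and the use of SHA-256 are all assumptions made for illustration, and no claim is made about which checksum algorithm a given pool uses.

```python
import hashlib


class ToyMirror:
    """Two copies of each block, plus a checksum stored apart from the data."""

    def __init__(self):
        self.copies = [{}, {}]    # two "disks", each mapping block id -> bytes
        self.checksums = {}       # block id -> checksum recorded at write time

    def write(self, block_id, data):
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.copies:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.checksums[block_id]
        for disk in self.copies:
            data = disk[block_id]
            if hashlib.sha256(data).hexdigest() == expected:
                # A good copy was found: rewrite any copy that fails verification.
                for other in self.copies:
                    if hashlib.sha256(other[block_id]).hexdigest() != expected:
                        other[block_id] = data
                return data
        raise IOError("all copies of block {} are corrupt".format(block_id))


if __name__ == "__main__":
    mirror = ToyMirror()
    mirror.write(1, b"payload")
    mirror.copies[0][1] = b"pazload"      # simulate silent corruption on one disk
    print(mirror.read(1), mirror.copies[0][1])
```

Because the checksum is kept apart from the block it describes, a block which was silently altered on disk cannot vouch for itself, and the read falls through to the redundant copy.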

A system administrator can read every block via an operation referred to as a "scrub".
sc25client01/root# zpool list rpool
rpool  416G   296G  120G  71%  1.00x  ONLINE  -

sc25client01/root# zpool scrub rpool

This scrub will continue in the background until every block on every disk has been read. The scrub reads data at a rate intended not to interfere with the operation of the platform or applications.

The Proactive Solution:

With the release of Solaris 11.4, formerly known as Solaris 12, a read of every block of data in the entire storage pool is scheduled by default once a month. By reading every block of data once a month, silent data corruption can be rooted out and corrected automatically, which is a unique feature of Oracle's Solaris!

Under an older OS release (Solaris 11.3 SRU 31),  notice that the property does not exist.
sc25client01/root# uname -a
SunOS sc01client01 5.11 11.3 sun4v sparc sun4v

sc25client01/root# pkg list entire
NAME (PUBLISHER) VERSION                    IFO
entire           0.5.11-    i--

sc25client01/root# zpool get lastscrub rpool
bad property list: invalid property 'lastscrub'
For more info, run: zpool help get
Under a modern OS release (Solaris 11.4 SRU 13), the last scrub occurred less than a month ago.
sun9781/root# uname -a
SunOS sun1824-cd 5.11 sun4v sparc sun4v

sun9781/root# pkg list entire
NAME (PUBLISHER) VERSION                    IFO
entire           11.4-       i--

sun9781/root# zpool get lastscrub rpool
rpool  lastscrub  Sep_10  local
The last scrub details can be seen through the status option.
sun9781/root# zpool list
rpool  278G  36.9G  241G  13%  1.00x  ONLINE  -

sun9781/root# zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
  scan: scrub repaired 0 in 16m24s with 0 errors on Tue Sep 10 03:42:44 2019

        NAME                       STATE      READ WRITE CKSUM
        rpool                      ONLINE        0     0     0
          mirror-0                 ONLINE        0     0     0
            c0t5000CCA0251CF0F0d0  ONLINE        0     0     0
            c0t5000CCA0251E4BC8d0  ONLINE        0     0     0

errors: No known data errors
The above 278 Gigabyte pool, holding 36.9 Gigabytes of allocated data, was scrubbed in a little over 16 minutes, with no errors to be corrected.
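The duration on the "scan:" line can be turned into an average read rate. The following is a minimal Python sketch; the function names are hypothetical, the optional hours field in the pattern is an assumption about longer scrubs, and the rate assumes the scrub read the 36.9G reported as allocated by "zpool list".

```python
import re


def scrub_seconds(scan_line):
    """Return the scrub duration in seconds from a 'scan:' line, or None if absent."""
    match = re.search(r"in (?:(\d+)h)?(?:(\d+)m)?(\d+)s", scan_line)
    if match is None:
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def scrub_rate_mib_per_s(allocated_gib, seconds):
    """Average read rate in MiB/s, treating the allocated figure as GiB."""
    return allocated_gib * 1024.0 / seconds


if __name__ == "__main__":
    line = "scan: scrub repaired 0 in 16m24s with 0 errors on Tue Sep 10 03:42:44 2019"
    elapsed = scrub_seconds(line)
    print(elapsed, round(scrub_rate_mib_per_s(36.9, elapsed), 1))
```

Tracking this rate from month to month gives a rough idea of how long the scheduled scrub will take as the pool fills.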


Network Management is well aware that the more storage is needed, the more critical the data recovery process becomes. Redundancy through advanced file systems like ZFS, under managed services class operating systems like Solaris, is a good choice. Solaris 11.4 keeps data healthy, no matter what quantity of physical disks is managed or data is retained.

Friday, September 20, 2019

Solaris 10: Extended Support to 2024

Solaris 10: Introduction

Oracle Solaris 10 has been an amazing OS update, including ground breaking features like Zones (Solaris Containers), ZFS, Services, Dynamic Tracing (against live production operating systems without impact), and Logical Domains. These features have been emulated by the market (imitation is the finest form of flattery!)

Solaris 10: End of Life

As with all good things, they must come to an end. Sun Microsystems was purchased by Oracle and eventually, the greatest OS known to the industry needed to be updated. Oracle set a retirement date of January 2021. Oracle had indicated an uplift in support costs would be needed, for Solaris 10 systems.

Solaris 10: Extended Support to 2024

No migration tools were ever provided by Oracle to facilitate migration from Solaris 10 to Solaris 11, so migration to Solaris 11 has been slow. Oracle had decided in September 2019 that Extended Support for Solaris 10, without additional financial penalty, would continue to 2024!