It was a bad day for the Apache team - apache.org was hacked on 2009-08-28 through one of their servers running CentOS. CentOS is a Linux distribution derived from Red Hat Enterprise Linux and built around the RPM package manager.
When your system has been hacked, what would be your first choice to recover?
Go to an on-line backup? (How do you know it was not also compromised?)
Go to a tape backup? (How do you know it was not also compromised?)
How far back do you go? (Do you only keep 3 backups?)
Do you re-build from scratch?
aurora.apache.org runs Solaris 10, and we were able to restore the box to a known-good configuration by cloning and promoting a ZFS snapshot from a day before the CGI scripts were synced over. Doing so enabled us to bring the EU server back online, and to rapidly restore our main websites.
The Apache team was very fortunate - this server ran Solaris with ZFS on Sun SPARC hardware. They were able to roll back to a snapshot and recover.
It was mentioned that the ZFS snapshot was used to ultimately recover the web site. How does ZFS offer this type of capability?
ZFS is a 128-bit unified file system and volume manager. It offers virtually unlimited volume size... and virtually unlimited snapshots. Any production environment exposed to the internet should use ZFS as a best practice, to be able to quickly revert to a pre-corruption state.
For example, you can schedule a snapshot of your system every day, or every 15 minutes (if you want!), and you can hold these snapshots for a week with very little overhead, since a snapshot only consumes space as the live data diverges from it. At any point in time, you can drop back to a previous state, just as the Apache Software Foundation decided to do.
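The schedule-and-retain idea above can be sketched in a few lines of Python. This is only an illustration, not the Apache team's setup: the dataset name `tank/www`, the `auto-` prefix, and the timestamp format are all assumptions made up for the example. The functions only build the `zfs snapshot` and `zfs destroy` command lines; nothing is executed.

```python
from datetime import datetime, timedelta

# Hypothetical dataset name, snapshot prefix, and timestamp format,
# chosen for illustration only.
DATASET = "tank/www"
PREFIX = "auto-"
STAMP = "%Y%m%d-%H%M"


def snapshot_command(dataset, now):
    """Build the argv for `zfs snapshot dataset@auto-<timestamp>`."""
    return ["zfs", "snapshot", f"{dataset}@{PREFIX}{now.strftime(STAMP)}"]


def expired_snapshots(names, now, keep=timedelta(days=7)):
    """Return the scheduled snapshots that are older than the retention window."""
    expired = []
    for name in names:
        _, _, label = name.partition("@")
        if not label.startswith(PREFIX):
            continue  # leave manually created snapshots alone
        taken = datetime.strptime(label[len(PREFIX):], STAMP)
        if now - taken > keep:
            expired.append(name)
    return expired


def destroy_commands(names):
    """Build one `zfs destroy` argv per expired snapshot."""
    return [["zfs", "destroy", name] for name in names]
```

Run from cron every 15 minutes, a script like this could hand each command list to `subprocess.run(cmd, check=True)` and feed `expired_snapshots` the names from the system's snapshot listing. Note that `zfs destroy` will refuse to remove a snapshot that still has a clone depending on it.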
What exactly is a "snapshot"? The zfs manual page reads:
A read-only version of a file system or volume at a given point in time. It is specified as filesystem@name or volume@name.
A writable file system created from an old snapshot is called a "clone". The manual page reads:
A clone is a writable volume or file system whose initial contents are the same as another dataset. As with snapshots, creating a clone is nearly instantaneous, and initially consumes no additional space.
This "clone" can be "promoted" so as to become the master version of the file system or volume. Promotion does not erase the old content by itself; it makes it possible to destroy the old file system afterwards. This is also described in the zfs manual page:
Clones can only be created from a snapshot. When a snapshot is cloned, it creates an implicit dependency between the parent and child. Even though the clone is created somewhere else in the dataset hierarchy, the original snapshot cannot be destroyed as long as a clone exists. The origin property exposes this dependency, and the destroy command lists any such dependencies, if they exist.
The clone parent-child dependency relationship can be reversed by using the promote subcommand. This causes the “origin” file system to become a clone of the specified file system, which makes it possible to destroy the file system that the clone was created from.
This is basically the process that Apache.org used to recover their Solaris web server.
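The clone-and-promote sequence described in the manual page can be sketched as a short Python helper that builds the commands in order. This is an illustration under assumptions, not the exact procedure Apache ran: the dataset, snapshot, and clone names are hypothetical, and the `-compromised` suffix is just one way to keep the bad copy around for inspection. The subcommands themselves (`clone`, `promote`, `rename`) are the ones the zfs manual page documents.

```python
def rollback_plan(dataset, snapshot, clone_name):
    """Return the zfs commands that swap `dataset` for a clone of `snapshot`.

    All names are supplied by the caller; nothing is executed here.
    """
    if "/" not in dataset:
        # Assumption of this sketch: the dataset is not the pool's root dataset.
        raise ValueError("dataset must be below the pool root, e.g. tank/www")
    parent = dataset.rsplit("/", 1)[0]
    clone = f"{parent}/{clone_name}"
    return [
        # Writable copy of the known-good snapshot.
        ["zfs", "clone", f"{dataset}@{snapshot}", clone],
        # Reverse the dependency: the old file system becomes the clone.
        ["zfs", "promote", clone],
        # Move the compromised file system aside instead of destroying it.
        ["zfs", "rename", dataset, f"{dataset}-compromised"],
        # Put the restored file system at the original name.
        ["zfs", "rename", clone, dataset],
    ]
```

For example, `rollback_plan("tank/www", "known-good", "www-restored")` yields four command lists that could be reviewed and then run one at a time. Keeping the compromised file system under a new name, rather than destroying it right away, leaves the evidence available for later analysis.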
Securing those web servers is a different story. SSH is no magic bullet - this was also compromised. There is no magic bullet in the open-source world. Different open-source communities have different certification processes. A Linux kernel release may appear and be slowly accepted into a distribution, with patches made along the way by both the original kernel team and a separate distribution company.
One of the sections in Apache's report was about positive lessons:
- The use of ZFS snapshots enabled us to restore the EU production web server to a known-good state.
- Redundant services in two locations allowed us to run services from an alternate location while continuing to work on the affected servers and services.
- A non-uniform set of compromised machines (Linux/CentOS i386, FreeBSD-7 amd_64, and Solaris 10 on sparc) made it difficult for the attackers to escalate privileges on multiple machines.
While your front-end Linux boxes may get hacked, diversifying your infrastructure with an additional (secure) OS, and using a real unified file system & volume management system like ZFS under Solaris, makes the hackers struggle while providing options to every-day system administrators.
IBM recently acknowledged Sun Solaris as being best-in-class in security - something to keep in mind.
Be cautious of other vendors like Microsoft with long-outstanding security holes in their IIS web serving software or security issues which they refuse to fix. These are not good candidates for customer-facing systems - for obvious reasons. Imagine gaining access to the IIS server and just querying the usernames and passwords from the embedded MSSQL server - oh, the humanity!