
Monday, March 14, 2011

Background Radiation and Sun's E-Cache Crisis of 1999



Abstract:

As the density of circuits increases, features get smaller; as frequencies increase, voltages get lower. These trends combine to reduce the amount of charge used to represent a bit, increasing the sensitivity of memory to background radiation. For example, the original UltraSPARC-I processor ran at 143MHz and had a 256KB e-cache (external cache). The cache design used simple byte parity to protect the data, which was sufficient as the amount of charge used to hold a bit was large enough that an ionizing particle would drain off only a small amount, not enough to flip a bit.

When this design was scaled up in the UltraSPARC-II processor to run at 400MHz with an 8MB e-cache, however, the amount of charge used to hold a bit was so small that background radiation would easily flip bits, producing on average one flipped bit per processor per year. While that might not seem like a high rate, a customer with 12 systems of 32 processors each would on average experience one failure a day. This is what led to Sun's infamous e-cache parity crisis of 1999...

Document:

For the story and Steve Chessin's contribution to the solution, see the ACM publication [HTML|PDF]. His Sun/Oracle blog retains this article from August 2010.



Tuesday, October 13, 2009

Sun Takes #1 Spot in TPC-C Benchmarks!




Abstract
Sun has long participated in benchmarks, though some have been left idle by Sun for many years. Sun has released a new TPC-C benchmark result, using a cluster of T2+ systems, earlier than advertised.
An interesting blog on the topic
Interesting Observations
  • Order of magnitude fewer racks to produce a faster solution
  • Order of magnitude fewer watts per 1000 tpmC
  • Sun's 36 sockets to IBM's 32 sockets
  • 10 GigE & FC instead of InfiniBand
  • Intel-based OpenSolaris storage servers, instead of AMD "Thumper"-based servers
Some thoughts:
  • The order of magnitude improvements in space and power consumption were obviously more compelling to someone than shooting for an order of magnitude improvement in performance
  • The performance could have been faster by adding more hosts to the RAC configuration, but the order of magnitude comparisons would be lost
  • The cost savings for the superior-performing SPARC cluster are dramatic: fewer hardware components to maintain, lower HVAC costs, lower UPS costs, lower generator costs, lower cabling costs, and lower data center square footage costs
  • The pricing per SPARC core is still too high for the T2 and T2+ processors, compared to the performance of competing sockets
  • The negative hammering by a few internet posters about the Sun OpenSPARC CoolThreads processors not being capable of running large databases is finally put to rest
It would have been nice to see:
  • a more scalable SMP solution, although this clustered solution will expand more readily in a horse race against IBM
  • a full Sun QDR InfiniBand configuration
  • a full end-to-end 10GigE configuration
  • the T2 with embedded 10GigE clustered, instead of the T2+ with a 10GigE card

Thursday, September 10, 2009

What's Better: USB or SCSI?


Abstract
Data usage and archiving are exploding everywhere. The bus options for adding storage keep increasing, with new bus protocols appearing regularly. With systems so prevalent throughout businesses and homes, when should one choose a different bus protocol for accessing the data? This set of tests pits some older mid-range internal SCSI drives against a brand-new massive external USB drive.

Test: Baseline
The Ultra60 test system is a Sun UltraSPARC-II server, running dual 450MHz CPUs and 2 Gigabytes of RAM. Internally, there are two 80-pin 180 Gigabyte SCSI drives. Externally, there is one 1.5 Terabyte Seagate Extreme drive. A straight "dd" will be done from a 36 Gigabyte root slice to the internal drive and to the external disk.
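
The article does not show how the target file systems were prepared. A minimal sketch of what that setup might have looked like on a Solaris system with ZFS follows; the device names (c0t1d0s6, c0t1d0s7, c2t0d0) and the placement of /u001, /u002, and /u003 are assumptions, not details taken from the test system.

Ultra60-root$ # hypothetical device names - confirm the real ones with format or rmformat
Ultra60-root$ newfs /dev/rdsk/c0t1d0s6      # UFS target, to be mounted as /u001
Ultra60-root$ mkdir -p /u001
Ultra60-root$ mount /dev/dsk/c0t1d0s6 /u001
Ultra60-root$ zpool create u002 c0t1d0s7    # ZFS pool on the internal SCSI drive, mounts at /u002
Ultra60-root$ zpool create u003 c2t0d0      # ZFS pool on the external USB drive, mounts at /u003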


Test #1a: Write Internal SCSI with UFS
The first copy was to an internal disk running the UFS file system. The system hovered around 60% idle time, with about 35% of CPU time pegged in the SYS category, for the entire time of the copy.

Ultra60-root$ time dd if=/dev/dsk/c0t0d0s0 of=/u001/root_slice_0
75504936+0 records in
75504936+0 records out

real 1h14m6.95s
user 12m46.79s
sys 58m54.07s
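
The idle and SYS percentages quoted in these tests are the kind of figures a tool such as vmstat reports; the article does not say which tool was used, so the following is only an illustration of how the copy could be watched from a second terminal.

Ultra60-root$ # the last three columns (us/sy/id) report user, system, and idle CPU time every 5 seconds
Ultra60-root$ vmstat 5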


Test #1b: Read Internal SCSI with UFS
The read back of this file was used to create a baseline for other comparisons. The system hovered around 50% idle time, with about 34% of CPU time pegged in the SYS category, for the entire time of the copy. The read took about 34 minutes.

Ultra60-root$ time dd if=/u001/root_slice_0 of=/dev/null
75504936+0 records in
75504936+0 records out

real 34m13.91s
user 10m37.39s
sys 21m54.72s
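
Since no block size was passed to dd, the record counts above imply its default 512-byte blocks, which lets the 36 Gigabyte figure and a rough baseline throughput be checked with simple arithmetic:

Ultra60-root$ # 75504936 records x 512 bytes = 38,658,527,232 bytes (about 36 GiB),
Ultra60-root$ # read back in roughly 2054 seconds, or about 18.8 MB/s
Ultra60-root$ echo "scale=1; 75504936 * 512 / 2054 / 1000000" | bc
18.8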


Test #2a: Write Internal SCSI with ZFS
The internal disk was tested again using the ZFS file system instead of the UFS file system. The system hovered around 50% idle, with about 45% of CPU time pegged in the SYS category. The write time lengthened by about 50% using ZFS.

Ultra60-root$ time dd if=/dev/dsk/c0t0d0s0 of=/u002/root_slice_0
75504936+0 records in
75504936+0 records out

real 1h49m32.79s
user 12m10.12s
sys 1h34m12.79s
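
The article does not show how the pool was monitored during the ZFS write; as an illustration only, per-pool bandwidth could be watched from a second terminal with zpool iostat (the pool name u002 is an assumption):

Ultra60-root$ # report read/write operations and bandwidth for the pool every 5 seconds
Ultra60-root$ zpool iostat u002 5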


Test #2b: Read Internal SCSI with ZFS
The 36 Gigabyte read with ZFS took about 50% longer than with UFS. The CPU was not strained much more, however.

Ultra60-root$ time dd if=/u001/root_slice_0 of=/dev/null
75504936+0 records in
75504936+0 records out

real 51m15.39s
user 10m49.16s
sys 36m46.53s
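
The "about 50% longer" figure can be checked directly from the two real times (34m13.91s for UFS, 51m15.39s for ZFS):

Ultra60-root$ # 3075.39 s / 2053.91 s, about 1.5, i.e. roughly 50% longer
Ultra60-root$ echo "scale=3; 3075.39 / 2053.91" | bc
1.497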


Test #3a: Write External USB with ZFS
The third copy was to an external disk running the ZFS file system. The system hovered around 0% idle time, with about 95% of CPU time pegged in the SYS category, for the entire time of the copy. The copy consumed about the same amount of time as the ZFS copy to the internal disk.

Ultra60-root$ time dd if=/dev/dsk/c0t0d0s0 of=/u003/root_slice_0
75504936+0 records in
75504936+0 records out

real 1h52m13.72s
user 12m49.68s
sys 1h36m13.82s


Test #3b: Read External USB with ZFS
Read performance is slower over USB than it is over SCSI with ZFS. The time is 82% slower than the UFS SCSI read and 21% slower than the ZFS SCSI read. CPU utilization seems to be slightly higher with USB (roughly 10% less idle time with USB than with SCSI).

Ultra60-root$ time dd if=/u003/root_slice_0 of=/dev/null
75504936+0 records in
75504936+0 records out

real 1h2m50.76s
user 12m6.22s
sys 42m34.05s


Untested Conditions

Firewire and eSATA were also attempted, but these bus protocols would not work reliably with the Seagate Extreme 1.5TB drive on any platform tested (several Macintosh and Sun workstations). If you are interested in a real interface besides USB, this external drive is not the one you should be investigating - it is a serious mistake to purchase.

Conclusion

The benefits of ZFS do not come without a cost in time. Reads and writes are about 50% slower, but that cost may be worth it for the benefits: unlimited snapshots, unlimited file system expansion, error correction, compression, tolerance of one or two disk failures, future three-disk failure tolerance, future encryption, and future clustering.

If you are serious about your system performance, SCSI is definitely a better choice over USB to provide throughput with minimal CPU utilization - regardless of file system. If you have invested in CPU capacity and have capacity to burn (i.e. a multi-core CPU), then buying external USB storage may be reasonable instead of purchasing SCSI.