[SPARC processor, courtesy Oracle SPARC T5/M5 Kick-Off]
Abstract:
Computing systems started with single processors. As requirements grew, multiple processors were lashed together using SMP (Symmetric Multi-Processing) to add computing power to a single system, with tasks broken up into processes and threads, but the transition to multi-threaded computing was a long one. The lack of scalability for some problems produced MPP (Massively Parallel Processing) platforms, which lash systems together and use special software to load-balance the jobs to be processed. General purpose applications were very difficult to program for MPP platforms, so massively Multi-Core and Multi-Threaded processors started to appear. Oracle recently released the SPARC T5 processor and systems, producing an SMP platform that scales to a massive number of sockets, cores, and threads in a single chassis. This leverages existing multi-threaded software and reduces the need for MPP in real-world applications, while placing tremendous pressure on the Operating System layer.
[SPARC logo, courtesy SPARC.org]
With the "T" series, SPARC processors began a steady growth in core and thread counts, accompanied by a movement to massively threaded software.
SPARC | Cores per Socket | GHz | Threads per Socket | Max Sockets | Total Cores | Total Threads |
---|---|---|---|---|---|---|
T1 | 8 | 1.4 | 32 | 1 | 8 | 32 |
T2 | 8 | 1.6 | 64 | 1 | 8 | 64 |
T2+ | 8 | 1.6 | 64 | 4 | 32 | 256 |
T3 | 16 | 1.6 | 128 | 4 | 64 | 512 |
T4 | 8 | 3 | 64 | 4 | 32 | 256 |
T5 | 16 | 3.6 | 128 | 8 | 128 | 1024 |
M5 | 6 | 3.6 | 48 | 32 | 192 | 1536 |
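The two total columns follow directly from the per-socket figures multiplied by the socket count. A minimal sketch in Python, using only the numbers from the table above (the dictionary and function names are illustrative, not from any Oracle tool):

```python
# Per-socket figures taken from the table: (cores, threads, sockets).
SPARC = {
    "T1": (8, 32, 1),
    "T2": (8, 64, 1),
    "T2+": (8, 64, 4),
    "T3": (16, 128, 4),
    "T4": (8, 64, 4),
    "T5": (16, 128, 8),
    "M5": (6, 48, 32),
}

def totals(cores, threads, sockets):
    """Return (total_cores, total_threads) for a fully populated system."""
    return cores * sockets, threads * sockets

if __name__ == "__main__":
    for name, (cores, threads, sockets) in SPARC.items():
        total_cores, total_threads = totals(cores, threads, sockets)
        print(f"{name}: {total_cores} cores, {total_threads} threads, "
              f"{threads // cores} threads per core")
```

The threads-per-core figure printed at the end is derived from the same two columns; it is not a separate column in the table.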
The movement to massively threaded processors meant that applications needed to be re-written to take advantage of the new, higher throughput. Certain applications were already well suited for this workload (e.g. web servers) - but many were not.
[DTrace infrastructure and providers]
The movement to massively threaded software, to take advantage of the higher overall throughput offered by the new processor technology, was difficult for application programmers. Technologies such as DTrace were added to advanced operating systems such as Solaris to assist developers and systems administrators in pin-pointing their code hot-spots for later re-write.
When the SPARC T4 was released, its S3 core supported a feature called the "Critical Thread API", to assist application programmers who could not resolve some single-thread bottlenecks. The S3 core could automatically switch into a single-threaded mode (sacrificing throughput) to address hot-spots. The faster S3 core in the T4 (and T5) was also clocked at a higher rate, providing an overall boost to single-threaded workloads over previous processors - even at the same number of cores and threads. The ability of the S3 to execute instructions out of order also sped up single-threaded applications.
The SPARC T4 and T5 processors finally offered application developers a no-compromise processor. For heavy single-threaded workloads, Oracle released the SPARC M5 processor, which scales single-threaded workloads further without having to rely upon systems produced by long-time SPARC partner and competitor, Fujitsu.
[Solaris logo, courtesy Sun Microsystems]
A single system scaling to 192 cores and 1536 threads poses incredible challenges to Operating System designers. Steve Sistare from Oracle discusses some of these challenges in a Part 1 article and the solutions in a Part 2 article. Some of the challenges overcome by Solaris included:
CPU scaling issues include:
•increased lock contention at higher thread counts
•O(NCPU) and worse algorithms
Memory scaling issues include:
•working sets that exceed VA translation caches
•unmapping translations in all CPUs that access a memory page
•O(memory) algorithms
•memory hotspots
Device scaling issues include:
•O(Ndevice) and worse algorithms
•system bandwidth limitations
•lock contention in interrupt threads and service threads
Clearly, the engineering team at Oracle were up for the tasks created for them by the Oracle SPARC engineering team. Innovation from Sun Microsystems continues under Oracle. It will take years for other Operating System vendors to "catch up".
Network Management Applications:
In the realm of Network Management, many polling applications use threads to scale, because network communication to edge devices is bottlenecked by latency - making the SPARC "T" processors an excellent choice in carrier-based environments.
The data returned by the massively multi-threaded pollers needed to be placed in a database, in a consistent fashion. This posed a problem during the device "discovery" process, which is normally single-threaded and experienced massive slow-downs under the "T" processors - until the T4 was released. With processors like the SPARC T4 and SPARC T5, Network Management applications gain the proverbial "best of both worlds": massive hardware thread scalability for pollers and excellent single-threaded throughput during discovery bottlenecks with the "Critical Thread API."
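The polling pattern described above can be sketched in a few lines of Python. This is a hypothetical illustration, not code from any Network Management product: `poll_device`, the device names, and the placeholder metric are all assumptions, and `time.sleep` stands in for network round-trip latency. Many pollers run concurrently because each one spends most of its time waiting, while results are written by a single thread so that the stand-in "database" is updated in a consistent order.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.05  # seconds; assumption standing in for a network round trip

def poll_device(name):
    """Hypothetical poller: wait on the 'network', then return a reading."""
    time.sleep(SIMULATED_LATENCY)
    return name, len(name)  # placeholder metric, not real device data

def poll_all(devices, workers):
    """Poll every device with a pool of threads; store results from one thread."""
    store = {}  # stand-in for the database, written only by the calling thread
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the writes are deterministic.
        for name, value in pool.map(poll_device, devices):
            store[name] = value
    return store

if __name__ == "__main__":
    devices = [f"router-{i}" for i in range(32)]
    for workers in (1, 32):
        start = time.perf_counter()
        poll_all(devices, workers)
        print(f"{workers:>2} worker(s): {time.perf_counter() - start:.2f}s")
```

With one worker the run should take roughly the sum of all simulated latencies; with 32 workers it should take roughly one latency period. That gap is the reason latency-bound pollers benefit from many hardware threads, while the single-threaded storage step is the kind of work that depends on per-thread speed.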
The latest SPARC platforms are optimal platforms for massive Network Management applications. There is no other platform on the planet which compares to SPARC for managing "The Internet".