Wednesday, July 11, 2012

Architecture Update: The ARMs Race


Abstract:
With the explosion of ARM processors in embedded systems, ranging from phones to tablets, ARM designers are creating ever more complex processors, CPU manufacturers are creating more options for system designers, and even system designers are discussing the movement of ARM from embedded to server and desktop systems.

CPU Architecture

64 Bit ARM v8
First, let's discuss the CPU architecture updates to the ARM processing architecture. Traditionally, ARM processors have been 32 bit designs. While this was more than adequate for embedded systems, it limited the application of the architecture in other spheres of computing.



At the ARM TechCon conference in Santa Clara, California, in October 2011, the ARM v8 64 bit architecture was announced, demonstrating 64 bit extensions similar to what was done with SPARC, AMD, and later Intel CPUs.

The new ARMv8 architecture has two execution states, the AArch32 state that is compatible with prior generations of 32-bit ARM processors, and AArch64, the new 64-bit extensions. At the moment, the ARMv8 architecture has only been profiled for what ARM calls the A line of its Cortex reference designs, which means they are designated for application processing such as that done on smartphones and tablets.
The ARMv8 architecture will bring forward TrustZone virtualization (which debuted with the ARM v6) and NEON SIMD instructions, which debuted with the ARM v7 designs. The interesting thing about the ARMv8 is that it will offer double-precision floating point math through that NEON unit.
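
From the software side, the two execution states mostly show up as the machine name an operating system reports. Here is a minimal Python sketch that maps such a name to AArch64 or AArch32. The function name arm_execution_state is hypothetical, and the strings it matches (aarch64, arm64, armv7l and similar) are assumptions based on what common operating systems report, not anything defined in the ARMv8 documents. It describes only what the running OS says, not what the silicon is capable of.

```python
import platform


def arm_execution_state(machine: str) -> str:
    """Classify an OS-reported machine string (hypothetical helper).

    Assumption: 64-bit ARM systems report names such as 'aarch64' or
    'arm64', and 32-bit ARM systems report names starting with 'arm'
    (for example 'armv7l').
    """
    name = machine.lower()
    # Check the 64-bit names first, because 'arm64' also starts with 'arm'.
    if name.startswith("aarch64") or name == "arm64":
        return "AArch64"
    if name.startswith("arm"):
        return "AArch32"
    return "not ARM"


# Report the state of whatever machine this happens to run on.
print(platform.machine(), "->", arm_execution_state(platform.machine()))
```

On a 32 bit ARM v7 board this would be expected to report AArch32, while an ARMv8 system running a 64 bit OS would be expected to report AArch64.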
ARM Update: Mali 450 GPU moves from 4 to 8 cores

The Register published a short article about an ARM roadmap split:
ARM is doubling the punch of its Mali 400 graphics processors with extra cores for tablet, phone and TV makers that are not ready for combined graphics and compute chips.
The microprocessor architect has announced the Mali 450 GPU, featuring eight cores instead of four.
ARM said the 450 showed it remains committed to the 400 range, and said it is now splitting its roadmap.

ARM Designs Quad-Core 32-bit v7
In April of 2011, ARM announced their quad-core Cortex-A15 processor was scheduled to appear in smart phones or tablets in 2012-2013.
ARM's Cortex-A15, however, will up the ante with an out-of-order superscalar pipeline, 40-bit memory-addressing capabilities, floating-point and media-handling improvements, and a clock speed of up to 2.5GHz, all at power requirements said to be comparable to the company's current Cortex-A9 design.
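
To put the 40-bit memory addressing in perspective, here is a small Python sketch comparing the byte-addressable range of 32, 40, and 64 bit addresses. The helper names are hypothetical, and the 64 bit figure is only the theoretical ceiling of a 64 bit address; actual chips typically implement fewer physical address bits.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_readable(size: int) -> str:
    """Format a power-of-two byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024
        index += 1
    return f"{size} {units[index]}"


for bits in (32, 40, 64):
    print(f"{bits}-bit addresses: {human_readable(addressable_bytes(bits))}")
# 32-bit addresses: 4 GiB
# 40-bit addresses: 1 TiB
# 64-bit addresses: 16 EiB
```

The jump from 4 GiB to 1 TiB is why a 32 bit core with 40-bit addressing is interesting for servers, even before the 64 bit designs arrive.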
The Cortex-A9 is the design upon which such top-end smartphone and tablet chips as Apple's A5, Nvidia's Tegra 2, and Samsung's Exynos 4210 are based. The Cortex-A15 design, meanwhile, has already been licensed by Texas Instruments and Nvidia, and Nvidia

(ARM v8 Exception Model, courtesy ARMv8 Architecture PDF)
ARM Starts Designing 64-bit
Richard Grisenthwaite, Lead Architect at ARM, provided a technology preview of the ARMv8 64 bit architecture at ARM TechCon in 2011. It was clear from the document that 32 bit ARM processors would continue to be designed and that this was merely a new line of processor design which manufacturers could leverage.

System Designers


(A Boston Viridis server, front view, no cover, courtesy The Register)

(Boston Limited's first Viridis server, courtesy The Register)



Boston
The Register writes that U.K. IT supplier Boston is releasing its Viridis platform, based upon its Calxeda partnership, using the Smoothstone ARM processor.
"The Viridis server is using the 1.4GHz variant of the ECX-1000 processors and plunks a 4GB DDR3 memory stick in for each node on the card. The card has two 10GE network ports and four SATA disk ports per processor... The dozen processor cards including memory burn only 300 watts."

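The Register's numbers for the Viridis (a dozen cards, 300 watts, 4GB per node) can be turned into a rough power budget. The sketch below is a back-of-the-envelope calculation, not a measurement. The figure of four nodes per card is an assumption: it is not stated in the Boston quote, but it matches what the HP Redstone figures later in this post imply for Calxeda cards (72 nodes on 18 boards). The function name viridis_budget is hypothetical.

```python
CARDS = 12          # "the dozen processor cards"
TOTAL_WATTS = 300   # "burn only 300 watts", memory included
GB_PER_NODE = 4     # "a 4GB DDR3 memory stick in for each node"


def viridis_budget(nodes_per_card: int = 4) -> dict:
    """Rough per-card and per-node figures.

    nodes_per_card defaults to 4, which is an assumption and not a
    number taken from the Boston quote.
    """
    nodes = CARDS * nodes_per_card
    return {
        "watts_per_card": TOTAL_WATTS / CARDS,
        "nodes": nodes,
        "watts_per_node": TOTAL_WATTS / nodes,
        "total_memory_gb": nodes * GB_PER_NODE,
    }


for key, value in viridis_budget().items():
    print(f"{key}: {value}")
```

The 25 watts per card follows directly from the quote; the per-node and memory totals move with the nodes-per-card assumption.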


(Dell Quad ARM Server Chassis)

(Dell Quad ARM Server Blade)

Dell
It was mentioned during the June 2012 Network Management "System Vendor: CISC, RISC, EPIC Update" that Dell was breaking into the ARM marketplace. Each Dell blade holds four 32-bit ARM servers. Ironically, Dell's blade server looks a lot like a far less rugged version of the old Sun Fire B1600 blade chassis, which contained 3U high SPARC RISC blades, but each Sun blade, from a decade ago, held only a single 64 bit server.

(HP Redstone ARM v7 32 bit Server, courtesy The Register)
HP Enters the ARMs Race
In November 2011, HP released ARM RISC servers to supplement their ProLiant CISC servers and Itanium LWIS processors. This was the result of "Project Moonshot".
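
The density figures The Register gives for the Redstone, quoted in this section (three rows of six boards, 72 nodes per half-width 2U tray, 288 nodes in 4U), can be cross-checked with a little arithmetic. The Python sketch below is only that check. The constants come from the quote, the nodes-per-board value is derived rather than quoted, and the function name redstone_density is hypothetical.

```python
ROWS = 3                # "three rows of these ARM boards"
BOARDS_PER_ROW = 6      # "with six per row"
NODES_PER_TRAY = 72     # "a total of 72 server nodes"
TRAY_HEIGHT_U = 2       # "half-width 2U slot"
TRAYS_SIDE_BY_SIDE = 2  # two half-width trays fill the rack width


def redstone_density(chassis_height_u: int = 4) -> tuple:
    """Return (boards per tray, nodes per board, nodes, nodes per U)."""
    boards = ROWS * BOARDS_PER_ROW
    nodes_per_board = NODES_PER_TRAY // boards  # derived, not quoted
    trays = TRAYS_SIDE_BY_SIDE * (chassis_height_u // TRAY_HEIGHT_U)
    nodes = trays * NODES_PER_TRAY
    return boards, nodes_per_board, nodes, nodes // chassis_height_u


print(redstone_density())
# (18, 4, 288, 72)
```

The numbers hang together: 18 boards at four nodes each gives 72 nodes per tray, and four trays in 4U gives 288 nodes, or 72 per rack unit.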
"To make the Redstone, HP took a half-width, single-height ProLiant tray server and ripped out just about everything but the tray. In goes the passive backplane that the Calxeda EnergyCard, and HP can cram three rows of these ARM boards, with six per row, for a total of 72 server nodes, in a half-width 2U slot... That gives you 288 server nodes in a 4U rack space, or 72 servers per rack unit."
ARM CPU Manufacturers

Samsung
Samsung, a DRAM manufacturer and more recently a cell phone manufacturer, appears to be hiring CPU designers, possibly for the ARM CPU chips used in Apple's and its own cell phones. Various chip designers are being harvested, with experience ranging from Sun Microsystems and Oracle to AMD.


Calxeda
In November 2011, Calxeda announced their 32 bit quad-core ARM v7 EnergyCore ECX-1000 Series CPUs.
"Calxeda has spent the past several years tweaking the 32-bit ARMv7 core to come up with its own system-on-chip (SoC) and related interconnect fabric suitable for hyperscale parallel and distributed computing where nodes have only modest memory needs."

(Applied Micro CEO Paramesh Gopi, courtesy The Register)
Applied Micro
Also in October of 2011, Applied Micro announced their X-Gene 64 bit ARM v8 processors.

(X-Gene ARMv8 CPU)
"The X-Gene chip will also include DDR3 main memory controllers, two 10 Gigabit Ethernet ports, SATA storage and PCI-Express peripheral controllers, and a power/management module – all on the same die as the cores."
"The cores will have L1 and L2 caches per core, a shared L3 cache that spans the cores, and have a target clock speed of 3GHz."
"The X-Gene chip also has on-chip CPU and I/O virtualization, just like x86, Sparc, Power, and Itanium chips do. The architecture also allows for various kinds of offload engines to be plugged in and perhaps integrated on the chip package."



(X-Gene ARM v8 Block Diagram, courtesy The Register)
The X-Gene is supposed to be ready to ship in the second half of 2012, which is right about now. Taiwan Semiconductor Manufacturing Company (TSMC) is first etching the chips using a 40nm process, with subsequent designs in 28nm.


Nvidia
At the Las Vegas Consumer Electronics Show in January 2011, video processing chip giant Nvidia discussed phones based upon their Tegra 2 dual-core ARM Cortex-A9 chips (which bundle graphics processing), licensed the future Cortex-A15 design, and announced "Project Denver", circa 2013, targeting desktops.
"Denver provides a choice. System builders can now choose a high-performance processor based on a RISC instruction set with modern features such as fixed-width instructions, predication, and a large general register file. These features enable advanced compiler techniques and simplify implementation, ultimately leading to higher performance and a more energy-efficient processor."
Back in September of 2010, Nvidia president and CEO Jen-Hsun Huang also discussed their "Kepler" GPU architecture, due in 2011, and the "Maxwell" GPU architecture, due in 2013.
(Armada XP Processor, courtesy The Register)
Marvell

Chip manufacturer Marvell acquired the XScale ARM RISC CPU business from Intel in 2006. In 2010, Marvell announced its quad-core 32 bit ARM v7 Armada XP processor, implemented on a 40nm process.
"...running at 1.6 GHz with a shared 2 MB L2 cache memory... The chip will include variants that support 64-bit DDR2, DDR3, and DDR3 low-voltage memory chips. For on-chip DDR3 controllers, the memory can run at up to 800 MHz and ... has ECC memory scrubbing."
"The chip includes four PCI-Express 2.0 x4 interfaces and four Gigabit Ethernet controllers etched into its silicon; it has 16 SERDES lanes for implementing USB, PCI-Express, SATA, SGMII, and QSGMII ports..."
It seems 2013 could prove very interesting for Marvell.
Conclusions
It is very odd not to see IBM producing any platforms based upon ARM, but very interesting to see IBM assisting ARM to reduce its chip process down to 14nm, back in January of 2011. One has to wonder at what point IBM will stop developing POWER (POWER 7+ is now about 10 months late?) or stop helping ARM produce smaller & faster processors. Up until this point, POWER was not in competition with ARM, but clearly ARM is climbing the food chain, moving to thin client desktops, cell phones, tablets, and now servers.

Apple's iOS, derived from Mac OS X and based upon BSD UNIX, and Google's Android Linux are the main OS players in the ARM arena, with Microsoft starting to produce Windows ports.

An OpenSolaris port to ARM has drawn interest for some time: a code contribution was made back in 2009, additional work with some code appeared in the Feb/Mar 2012 timeframe, Illumos developers were considering ARM in March 2012, a Google "Summer of Code" ARM project idea was suggested in April 2012, and a grad student showed interest in April 2012. With all the activity around ARM servers, one might hope that there will be additional interest in the Illumos community.

Will other OS vendors port to ARM?
