Thursday, August 14, 2014

The [Almost] Great Internet Crash: 2014-08-12 is 512K Day

Since the creation of computers, people have been connecting computer equipment through different means. Proprietary cables and protocols were constantly being designed, until the advent of The Internet. Based upon TCP/IP, there seemed to be little to limit its growth, until the 32-bit address range started to "run out" as more people in more countries wanted to come on-line. On August 12, 2014, an event affectionately referred to as "512K Day" occurred, a direct result of the IPv4 hacks used to keep the older address scheme alive until IPv6 could be implemented.

The Internet Protocol was first specified in "RFC 760" in January 1980, later replaced by "RFC 791" in September 1981; both belong to the RFC series now published through the Internet Engineering Task Force (IETF). These documents defined the 32-bit version of the protocol, called "IPv4". During the first decade of The Internet, addresses were allocated in basic "classes", according to the network size the applicant needed.

As corporations and individuals started to use The Internet, it was realized that class-based allocation was not scalable, so the IETF published "RFC 1518" and "RFC 1519" in 1993 to break the larger blocks down into finer-grained slices for allocation, called Classless Inter-Domain Routing (or "CIDR"), which was subsequently refreshed in 2006 as "RFC 4632". Network Address Translation ("NAT") was also created.

The Private Internet Addresses were published as part of "RFC 1918" in February 1996, in order to help alleviate the problem of "sustained exponential growth". Service Providers used NAT and CIDR to continue to facilitate the massive expansion of The Internet, using private networks hidden behind a single Internet-facing IP address.
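
To put the numbers in perspective, here is a minimal Python sketch using only the standard `ipaddress` module. It shows the size of the whole 32-bit IPv4 space, the three private blocks reserved by RFC 1918, and how "does this host need NAT?" reduces to a membership test. The helper name `needs_nat` and the sample addresses are illustrative assumptions, not something defined by the RFC.

```python
import ipaddress

# The three private blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Every possible 32-bit IPv4 address.
IPV4_TOTAL = 2 ** 32


def needs_nat(address: str) -> bool:
    """Return True if the address sits in an RFC 1918 block.

    Hypothetical helper: such a host is not routable on the public
    Internet and must be translated to a public address first.
    """
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    print(f"IPv4 total: {IPV4_TOTAL:,} addresses")
    for block in RFC1918_BLOCKS:
        print(f"{block}: {block.num_addresses:,} private addresses")
    print(needs_nat("192.168.1.20"))  # private, hidden behind NAT
    print(needs_nat("8.8.8.8"))       # public, directly routable
```

The whole IPv4 space is only about 4.3 billion addresses, which is why the Wall Street Journal figure of nearly 13 billion connected devices quoted later cannot be served without tricks such as NAT.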

In 1998, the IETF formalized "IPv6", as a successor protocol for The Internet, based upon 128 bits. The thought was that providers would move to IPv6 and sunset IPv4 along with the NAT hack.
[Example of a private network sitting behind a public WAN/Internet connection]
Address Exhaustion:
Routing and system vendors have started supporting IPv6, but the vast majority of users continue to rely on the CIDR and NAT hacks to run on The Internet. The Internet, for the most part, has run out of IPv4 addresses, a condition called Address Exhaustion.
The IP address space is managed by the Internet Assigned Numbers Authority (IANA) globally, and by five regional Internet registries (RIR) responsible in their designated territories for assignment to end users and local Internet registries, such as Internet service providers. The top-level exhaustion occurred on 31 January 2011.[1][2][3] Three of the five RIRs have exhausted allocation of all the blocks they have not reserved for IPv6 transition; this occurred for the Asia-Pacific on 15 April 2011,[4][5][6] for Europe on 14 September 2012, and for Latin America and the Caribbean on 10 June 2014.
Now, over a decade later, people are still using IPv4 with CIDR and NAT, trying to avoid the inevitable migration to IPv6.

[Normal outage flow with an unusual spike on 2014-08-12]

Warning... Warning... Will Robinson!
People were well aware of the problems with relying on CIDR and NAT: address space would become so fragmented over time that routing tables would eventually hit their maximums, crashing segments of The Internet.

Some discussions started around 2007 about how to mitigate this issue over the next half-decade. It was known that routing equipment can handle only a limited number of routes.
...this _should_ be a relatively safe way for networks under the gun to upgrade (especially those running 7600/6500 gear with anything less than Sup720-3bxl) to survive on an internet with >~240k routes and get by with these filtered routes, either buying more time to get upgrades done or putting off upgrades for perhaps a considerable time.

On May 12, 2014 - Cisco published a technical article warning people of the upcoming event.
As an industry, we’ve known for some time that the Internet routing table growth could cause Ternary Content Addressable Memory (TCAM) resource exhaustion for some networking products. TCAM is a very important component of certain network switches and routers that stores routing tables. It is much faster than ordinary RAM (random access memory) and allows for rapid table lookups.
No matter who provides your networking equipment, it needs to be able to manage the ongoing growth of the Internet routing table. We recommend confirming and addressing any possible impacts for all devices in your network, not just those provided by Cisco.

On June 9, 2014 - Cisco published technical article 117712 on how to deal with the 512K route limit on some of their largest equipment... when the high-speed TCAM memory segment overflows.
When a route is programmed into the Cisco Express Forwarding (CEF) table in the main memory (RAM), a second copy of that route is stored in the hardware TCAM memory on the Supervisor as well as any Distributed Forwarding Card (DFC) modules on the linecards.

This document focuses on the FIB TCAM; however, the information in this document can also be used in order to resolve these error messages:
%MLSCEF-SP-4-FIB_EXCEPTION_THRESHOLD: Hardware CEF entry usage is at 95% capacity for IPv4 unicast protocol
%MLSCEF-DFC4-7-FIB_EXCEPTION: FIB TCAM exception, Some entries will be software switched 
%MLSCEF-SP-7-FIB_EXCEPTION: FIB TCAM exception, Some entries will be software switched
Cisco's solution takes TCAM space away from IPv6 and MPLS labels, but allows up to 1 million routes to be allocated.

On July 25, 2014 - people started reminding others to adjust their routing cache sizes!
As many readers on this list know the routing table is approaching 512K routes.
For some it has already passed this threshold.
How do they know? Well, common people have an insight into this through the "CIDR Report"... yes, anyone can watch the growth of The Internet.

[ warning on 2014-05-06 of the 512K limit]

The Day Parts of The Internet Crashed:
Cisco published a Service Provider note "SP360", to note the event.
Today we know that another significant milestone has been reached, as we officially passed the 512,000 or 512k route mark!
Our industry has known this milestone was approaching for some time. In fact it was as recently as May 2014 that we provided our customers with a reminder of the milestone, the implications for some Cisco products, and advice on appropriate workarounds.

Both technical journals and business journals started noticing the issue. People started to notice that The Internet was becoming unstable on August 13, 2014. The Wall Street Journal published on August 13, 2014:
The problem also draws attention to a real, if arcane, issue with the Internet's plumbing: the shrinking number of addresses available under the most popular routing system. That system, called IPv4, can handle only a few billion addresses. But there are already nearly 13 billion devices hooked up to the Internet, and the number is quickly growing, Cisco said.
Version 6, or IPv6, can hold many orders of magnitude more addresses but has been slow to catch on. In the meantime, network engineers are using stopgap measures

The issue was inevitable, but what was the sequence of events?

[BGP spike shown by BGPMon]
One Blip from One Large Provider:
Apparently, Verizon released thousands of small networks into the global routing tables:
So whatever happened internally at Verizon caused aggregation for these prefixes to fail which resulted in the introduction of thousands of new /24 routes into the global routing table.  This caused the routing table to temporarily reach 515,000 prefixes and that caused issues for older Cisco routers.
Luckily Verizon quickly solved the de-aggregation problem, so we’re good for now. However the Internet routing table will continue to grow organically and we will reach the 512,000 limit soon again.
Whether this was a mistake or not is not the issue; this situation was inevitable.
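
The effect of a failed aggregation can be illustrated with a short Python sketch using the standard `ipaddress` module. The /16 blocks, the starting table size of 500,000 routes, and the count of 60 leaked aggregates are made-up illustrative values (the quote above only reports that the table briefly reached about 515,000 prefixes); the 512,000 figure is the route mark quoted from Cisco.

```python
import ipaddress

ROUTE_MARK = 512_000  # the "512k" route mark quoted in the post


def deaggregate(prefix: str, new_prefix: int = 24) -> list:
    """Split one aggregate route into its more-specific routes."""
    return list(ipaddress.ip_network(prefix).subnets(new_prefix=new_prefix))


def table_size_after_leak(current_routes: int, aggregates: list) -> int:
    """Route count if each aggregate is replaced by its /24s."""
    leaked = sum(len(deaggregate(p)) for p in aggregates)
    return current_routes - len(aggregates) + leaked


if __name__ == "__main__":
    # One /16 announced as /24s becomes 256 routes instead of 1.
    specifics = deaggregate("10.1.0.0/16")
    print(len(specifics))

    # Re-aggregating collapses them back into the single /16.
    print(list(ipaddress.collapse_addresses(specifics)))

    # Hypothetical table: 500,000 routes, 60 aggregates leak as /24s.
    aggregates = [f"10.{i}.0.0/16" for i in range(60)]
    size = table_size_after_leak(500_000, aggregates)
    print(size, size > ROUTE_MARK)
```

Under these assumed numbers, a few dozen de-aggregated blocks are enough to push a table that was comfortably below the mark over it, which is why a single provider's blip could matter.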
In Conclusion:
The damage was done, but perhaps it was for the best. People should be looking at making sure their internet connection is ready for when it happens again. People should be asking questions such as: "why are we still using NAT?" and "when are we moving to IPv6?" If your service provider is still relying upon NAT, they are in no position to move to IPv6, and are contributing to the instability of The Internet.

Sunday, May 25, 2014

Solaris: Loopback Optimization and TCP_FUSION

Since the early days of computing, the slowest interconnects have always been between platforms, through input and output channels. The movement from serial ports to higher-speed communication channels such as TCP/IP became the standard mechanism for applications to communicate not only between physical systems, but also on the same system! During Solaris 10 development, a capability called TCP_FUSION was introduced to increase the performance of the TCP/IP stack for applications communicating on the same server. Some application vendors may be unaware of the safeguards built into Solaris 10 to prevent denial of service attacks or starvation of applications due to the high performance of TCP writers on the loopback interface.
Authors Brendan Gregg and Jim Mauro describe the functionality of TCP_FUSION in their book: DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X, and FreeBSD.
Loopback TCP packets on Solaris may be processed by tcp fusion, a performance feature that bypasses the ip layer. These are packets over a fused connection, which will not be visible using the ip:::send and ip:::receive probes, (but they can be seen using the tcp:::send and tcp:::receive probes.) When TCP fusion is enabled (which it is by default), loopback connections become fused after a TCP handshake, and then all data packets take a shorter code path that bypasses the IP layer.
The modern application hosted under Solaris will demonstrate a significant benefit over being hosted under alternative operating systems.

Demonstrated Benefits:
TCP socket performance, under languages such as Java, may demonstrate a significant performance improvement, often shocking software developers!
While comparing java TCP socket performance between RH Linux and Solaris, one of my test is done by using a java client sending strings and reading the replies from a java echo server. I measure the time spent to send and receive the data (i.e. the loop back round trip).
The test is run 100,000 times (more occurrence are giving similar results). From my tests Solaris is 25/30% faster on average than RH Linux, on the same computer with default system and network settings, same JVM arguments (if any) etc.
The answer seems clear: TCP_FUSION is the primary reason.
In Solaris that's called "TCP Fusion" which means two local TCP endpoints will be "fused". Thus they will bypass the TCP data path entirely. 
Testing will confirm this odd performance benefit of stock Solaris over Linux.
Nice! I've used the command
echo 'do_tcp_fusion/W 0' | mdb -kw

and manage to reproduce times close to what I've experienced on RH Linux. I switched back to re-enable it using
echo 'do_tcp_fusion/W 1' | mdb -kw

Thanks both for your help.
Once people understand the benefits of TCP_FUSION, they will seldom go back.
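
The round-trip measurement described in the quoted test can be sketched in a few lines. The original test used a Java client and echo server; the Python version below is only an illustrative stand-in, and the function names, payload, and iteration count are assumptions. Running it on two operating systems on the same hardware would give a comparable loopback number for each, though Python's own overhead will dominate more than the JVM's would.

```python
import socket
import threading
import time


def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo every byte back until EOF."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since TCP may split a reply."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def measure_round_trips(count: int, payload: bytes = b"hello\n") -> float:
    """Return the average loopback round-trip time in seconds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()

    with socket.create_connection(listener.getsockname()) as client:
        start = time.perf_counter()
        for _ in range(count):
            client.sendall(payload)
            reply = recv_exact(client, len(payload))
            assert reply == payload
        elapsed = time.perf_counter() - start

    server.join(timeout=5)
    listener.close()
    return elapsed / count


if __name__ == "__main__":
    avg = measure_round_trips(10_000)
    print(f"average round trip: {avg * 1e6:.1f} microseconds")
```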

Old Issues:
The default nature of TCP_FUSION means any application hosted under Solaris 10 or above will, by default, receive the benefit of this huge performance boost. Some early releases of Solaris 10 without patches may experience a condition where a crash can occur, because of kernel memory usage. The situation, workaround, and resolution are described:

Solaris 10 systems may panic in the tcp_fuse_rcv_drain() TCP/IP function when using TCP loopback connections, where both ends of the connection are on the same system. This may allow a local unprivileged user to cause a Denial of Service (DoS) condition on the affected host.
To work around the described issue until patches can be installed, disable TCP Fusion by adding the following line to the "/etc/system" file and rebooting the system: set ip:do_tcp_fusion = 0x0.
This issue is addressed in the following releases: SPARC Platform Solaris 10 with patch 118833-23 or later and x86 Platform Solaris 10 with patch 118855-19 or later.
Disabling the TCP_FUSION feature is no longer needed for DoS protections.

Odd Application Behavior:
If an application running under Solaris does not experience a performance boost, but rather a performance degradation, it is possible your ISV does not completely understand TCP_FUSION or the symptoms of an odd code implementation. When developers expect the receiving application on a socket to respond slowly, this can result in bad behavior with TCP sockets accelerated by Solaris.

Instead of optimizing the behavior of their receiving applications to take advantage of the potential 25%-30% performance benefit, some application vendors chose to suggest disabling TCP_FUSION with their applications: Riverbed's Stingray Traffic Manager and Veritas NetBackup (4x slowdown). Those unoptimized TCP reading applications, which perform reads 8x slower than their TCP writing counterparts, perform extremely poorly in the TCP_FUSION environment.

Possible bad TCP_FUSION interaction?
There is a better way to debug this issue rather than shutting off the beneficial behavior. Blogger Steffen Weiberle at Oracle wrote pretty extensively on this.

First, one may want to understand if it is being used. TCP_FUSION is often used, but not always:
There are some exceptions to this, including when using IPsec, IPQoS, raw-socket, kernel SSL, non-simple TCP/IP conditions, or the two end points are on different squeues. A fused connect will revert to unfused if an IP Filter rule will drop a packet. However TCP fusion is done in the general case.
When TCP_FUSION is enabled for an application, there is a risk that the TCP data provider can provide data so fast over TCP that it can cause starvation of the receiving application! Solaris OS developers anticipated this in their acceleration design.
With TCP fusion enabled (which it is by default in Solaris 10 6/06 and later, and in OpenSolaris), when a TCP connection is created between processes on a system, the necessary things are set up to transfer data from the sender to the receiver without sending it down and back up the stack. The typical flow control of filling a send buffer (defaults to 48K or the value of tcp_xmit_hiwat, unless changed via a socket operation) still applies. With TCP Fusion on, there is a second check, which is the number of writes to the socket without a read. The reason for the counter is to allow the receiver to get CPU cycles, since the sender and receiver are on the same system and may be sharing one or more CPUs. The default value of this counter is eight (8), as determined by tcp_fusion_rcv_unread_min.
Some ISV developers may have coded their applications in such a way to anticipate that TCP is slow and coded their receiving application to be less efficient than the sending application. If the receiving application is 8x slower in servicing the reading from the TCP socket, the OS will slow down the provider. Some vendors call this a "bug" in the OS.
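
The "writes without a read" check described above can be pictured with a toy model. The Python sketch below is not the Solaris kernel code: the class and function names are hypothetical, the exact boundary (whether the writer stalls on the 8th or 9th unread write) is an assumption, and the send-buffer flow control is deliberately left out. It only shows how the counter behaves for the default of 8, a raised value of 32, and 0 (check disabled).

```python
class FusedLoopbackModel:
    """Toy model of the consecutive-write check.

    NOT the Solaris implementation; it only mimics the behavior
    described in the quote. `unread_min` plays the role of the
    tcp_fusion_rcv_unread_min tunable (default 8, 0 disables the check).
    The send-buffer (tcp_xmit_hiwat) flow control is not modeled.
    """

    def __init__(self, unread_min: int = 8) -> None:
        self.unread_min = unread_min
        self.unread_writes = 0  # consecutive writes the reader has not consumed

    def write(self) -> bool:
        """Return True if the write is accepted, False if the writer
        would block (or get EAGAIN with non-blocking I/O)."""
        if self.unread_min and self.unread_writes >= self.unread_min:
            return False
        self.unread_writes += 1
        return True

    def read(self) -> None:
        """A read by the receiver resets the consecutive-write counter."""
        self.unread_writes = 0


def accepted_before_block(unread_min: int, attempts: int) -> int:
    """Count writes accepted when the reader never reads."""
    model = FusedLoopbackModel(unread_min)
    return sum(1 for _ in range(attempts) if model.write())


if __name__ == "__main__":
    print(accepted_before_block(8, 100))   # default threshold
    print(accepted_before_block(32, 100))  # quadrupled threshold
    print(accepted_before_block(0, 100))   # check disabled
```

In this model a reader that services the socket at least once per eight writes never stalls the writer, which is the behavior a well-designed receiving application would show.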

When doing large writes, or when the receiver is actively reading, the buffer flow control dominates. However, when doing smaller writes, it is easy for the sender to end up with a condition where the number of consecutive writes without a read is exceeded, and the writer blocks, or if using non-blocking I/O, will get an EAGAIN error.
So now, one may see the symptoms: TCP applications whose connections on the same system experience slowdowns and may even return EAGAIN errors.
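
On the application side, an EAGAIN on a non-blocking socket is a signal to wait, not a failure. The sketch below shows one common way to handle it in Python, where EAGAIN/EWOULDBLOCK surfaces as `BlockingIOError`. It uses `socket.socketpair()` as a portable stand-in for a loopback connection, so the refusals here come from an ordinary full buffer rather than from the Solaris consecutive-write check; the function names and payload size are assumptions for illustration.

```python
import select
import socket
import threading


def send_all_nonblocking(sock: socket.socket, data: bytes, timeout: float = 5.0) -> int:
    """Send all of `data` on a non-blocking socket.

    Returns how many times the kernel refused a write (EAGAIN /
    EWOULDBLOCK, surfaced by Python as BlockingIOError).
    """
    view = memoryview(data)
    refusals = 0
    while view:
        try:
            sent = sock.send(view)
            view = view[sent:]
        except BlockingIOError:
            refusals += 1
            # Wait until the socket is writable again instead of spinning.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("receiver did not drain the socket in time")
    return refusals


def drain(sock: socket.socket, expected: int, out: list) -> None:
    """Reader side: consume `expected` bytes and record the total."""
    total = 0
    while total < expected:
        chunk = sock.recv(65536)
        if not chunk:
            break
        total += len(chunk)
    out.append(total)


if __name__ == "__main__":
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    payload = b"x" * (4 * 1024 * 1024)
    received = []
    t = threading.Thread(target=drain, args=(reader, len(payload), received))
    t.start()
    refusals = send_all_nonblocking(writer, payload)
    t.join()
    writer.close()
    reader.close()
    print(f"delivered {received[0]} bytes after {refusals} refused writes")
```

A writer built this way simply slows to the reader's pace when the OS pushes back, instead of reporting an error to the user.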

Tuning Option: Increase Slow Reader Tolerance
If the TCP reading application is known to be 8x slower than the TCP writing application, one option is to raise the threshold at which the TCP writer becomes blocked, so that as many as 32 writes can be issued [per single read] before the OS blocks the writer, from a safety perspective. Steffen Weiberle also suggested:
To test this I suggested the customer change the tcp_fusion_rcv_unread_min on their running system using mdb(1). I suggested they increase the counter by a factor of four (4), just to be safe.
# echo "tcp_fusion_rcv_unread_min/W 32" | mdb -kw
tcp_fusion_rcv_unread_min:      0x8            =       0x20

Here is how you check what the current value is.
# echo "tcp_fusion_rcv_unread_min/D" | mdb -k
tcp_fusion_rcv_unread_min:      32

After running several hours of tests, the EAGAIN error did not return.
Tuning Option: Removing Slow Reader Protections
If the reading application is just poorly written and will never keep up with the writing application, another option is to remove the write-to-read protection entirely. Steffen Weiberle wrote:
Since then I have suggested they set tcp_fusion_rcv_unread_min to 0, to turn the check off completely. This will allow the buffer size and total outstanding write data volume to determine whether the sender is blocked, as it is for remote connections. Since the mdb is only good until the next reboot, I suggested the customer change the setting in /etc/system.
* Set TCP fusion to allow unlimited outstanding writes up to the TCP send buffer set by default or the application.
* The default value is 8.
set ip:tcp_fusion_rcv_unread_min=0
There is a buffer safety tunable, where the writing application will block if the kernel buffer fills, so you will not crash Solaris if you turn this write-to-read ratio safety switch off.

Tuning Option: Disabling TCP_FUSION
This is the proverbial hammer for pushing a tack into a cork board. Steffen Weiberle wrote:
To turn TCP Fusion off all together, something I have not tested with, the variable do_tcp_fusion can be set from its default 1 to 0.
And I would like to note that in OpenSolaris only the do_tcp_fusion setting is available. With the delivery of CR 6826274, the consecutive write counting has been removed.
Network Management has not investigated what the changes were in the final releases of OpenSolaris or more recent Solaris 11 releases from Oracle with regard to TCP_FUSION tuning.
Tuning Guidelines:
The assumption of Network Management is that the common systems administrator is working with well-designed applications, where the application reader is keeping up with the application writer, under Solaris 10. If there are ill-behaved applications under Solaris 10, but one is interested in maintaining the 25%-30% performance improvement, the earlier tuning steps below will help far more than the typical ISV-suggested final step.

Check for TCP_FUSION - 0=off, 1=on (default)
SUN9999/root# echo "do_tcp_fusion/D" | mdb -k
do_tcp_fusion: 1

Check for TCP_FUSION unread to written ratio - 0=off, 8=default
SUN9999/root# echo "tcp_fusion_rcv_unread_min/D" | mdb -k
tcp_fusion_rcv_unread_min:      8

Quadruple the TCP_FUSION unread to write ratio and check the results:
SUN9999/root# echo "tcp_fusion_rcv_unread_min/W 32" | mdb -kw
tcp_fusion_rcv_unread_min:      0x8            =       0x20
SUN9999/root# echo "tcp_fusion_rcv_unread_min/D" | mdb -k
tcp_fusion_rcv_unread_min:      32

Disable the unread to write ratio and check the results:
SUN9999/root# echo "tcp_fusion_rcv_unread_min/W 0" | mdb -kw
SUN9999/root# echo "tcp_fusion_rcv_unread_min/D" | mdb -k
tcp_fusion_rcv_unread_min:      0

Finally, disable TCP_FUSION to lose all performance benefits of Solaris, but keep your ISV happy.
SUN9999/root# echo "do_tcp_fusion/W 0" | mdb -kw

May this be helpful for Solaris 10 platform administrators, especially with Network Management platforms!

Thursday, May 1, 2014

Oracle Solaris 11.2 Release

Oracle Solaris 11.2 Release Event
Oracle has released the second update to the Solaris 11 operating system. During the release event, various people from the Oracle team gave overviews, followed by deep-dives with more technical information. Notes from the deep-dive sessions follow.

Video Events
The individual video events are all available at a single site; the individual videos could not be embedded into the blog due to a bug in the way Oracle presented the EMBED video tag.

Oracle Solaris 11.2 - Engineered for the Cloud: Mark Hurd
[Video] - Mark Hurd: President, Oracle

Oracle Solaris 11.2 - Engineering for the Cloud: John Fowler
[Video] - John Fowler: Executive Vice President Systems, Oracle

Oracle Solaris 11.2 - Engineering for the Cloud: Panel
[Video] - Customer Panel
Panel Members:
  • Bryon Ackerman: VP Internet Systems, Wells Fargo
  • Greg Lavender: CTO, Cloud Architecture & Infrastructure Engineering, Citi
  • Krishna Tangirala: Director of Infrastructure, B&H Photo and Video

Oracle Solaris Lifecycle Management
[Video] - Eric Saxe: Senior Manager Software Development, Oracle
Key take-away point: Flash Archive to Unified Archive
1. Solaris 10 has Flash Archive while Solaris 11 has Unified Archive
2. Unified Archive is Foundational
3. Completely portable on same CPU architecture
4. Native support for virtualization & zones (p2v, v2p, v2v, etc.)
Use Cases for Unified Archive
1. Cloning & Golden Image from Physical Platform to LDom to Zones and back
2. Disaster Recovery
3. Delivering Vendor & Customer Applications
Management Features Include:
1. OpenStack: Glance serves Unified Archive images into the cloud
2. Oracle Enterprise OpsCenter is fully integrated and free for Premier Support

Oracle Solaris 11.2: Virtualization and SDN
[Video] - Markus Flierl, VP Solaris Development
Operating Platform, Comprehensive Solution
1. Full Operating System
2. Full Virtualization
3. Full OpenStack
Zones enhanced with Kernel Zones
1. 26% overhead for 4x Linux on VMware
2. ~1% overhead for 4x Solaris Zones
Cost Reduction from Intel / Linux
1. 68% reduced expenditures under Intel
2. 74% reduced expenditures under SPARC
Discussion about Unified Archive
1. Encrypted Package & Delivery
2. Compliance Checks
3. Oracle Application Packaging
4. Customer Packaging
Kernel Zones
1. Benefit of Licensing
2. Live Resource Rebalancing
Software Defined Networking (SDN) Virtualization
1. Bundled into Solaris
2. Optimized for Fabric Hardware Offloading
3. Tunnel over Generic or Old Fabric
4. VXLAN and Distributed Virtual Switch
5. Fully Integrated into OpenStack
Application Driven SDN
1. Application can get its own virtual network
2. Resources allocated on network via distributed virtual network
3. Priority can be provided to individual applications
4. Java 8 will make SDN fully accessible
Other Features:
1. High Availability is Fully Integrated
2. Enterprise Manager Ops Center fully integrated
3. OpenStack - Unified, Industry Cross-Platform, Zone and Kernel Zone

Oracle Solaris OpenStack
[Video] - Eric Saxe: Senior Manager Software Development, Oracle
Problems and Solutions in Datacenter
• Several Weeks to Months normally needed to deploy systems
• Cloud offers deployment acceleration from weeks to minutes
• Manage data center as a single system
• Better H-A and Data Redundancy
• Python Based Open Source Cloud Infrastructure
• Provides: IaaS, PaaS, SaaS
• Self-Service web-based cloud portal
• Compute infrastructure in minutes
• Provides REST APIs to build programmatic expansions
OpenStack Core Services
• Horizon - Web Based Portal
• Nova - Virtual Machine lifecycle provisioning (Install, Start, Migrate, etc.)
• Cinder - Manage and Provision Block Storage for VM Instances
• Neutron - Manage and Provision Networking Service, Virtual Network
• Keystone - Offers identity and authentication for users, admins, and internal services
• Swift - Provides Object Storage Service
End User and Community Support
• Joined OpenStack Foundation
• Supported in Solaris
• Capabilities to be Contributed Upstream
Solaris Contributions to OpenStack
• Solaris is trusted in Enterprise
• Solaris scales for large workloads
• Solaris offers gigabytes to terabytes of physical memory
• Solaris offers unsurpassed data integrity
• Solaris is secure by design
• Solaris offers industry leading observability and compliance
• Solaris features such as zones and kernel zones
• Solaris capabilities to include software defined networking
• Solaris OS imaging and templating technologies

Oracle Solaris: Optimized for Database, Java and Applications
[Video] - Markus Flierl, VP Solaris Development
Optimizations for Oracle Database and Java Integrated Directly into Core Solaris 11.2
• Virtualization
• Software Defined Networking
• OpenStack
Optimizations Plan:
• Engineered Together
• Tested Together
• Certified Together
• Deployed Together
• Upgraded Together
• Managed Together
• Supported Together
Pre-optimized Bundles through Unified Archives
Interesting Notes:
• Over half of SPARC SuperCluster customers are retiring non-Solaris platforms or new
• SPARC SuperCluster growth rate over 100% year over year
• InfiniBand for optimal network performance
• Storage built for Oracle Database
• 26% performance gain with 4x VMs (vs Intel/RedHat/VM)
• Dramatic increase in performance from T3 to T4 and T5
Solaris 11.2 with Oracle 12c Enhancements
• Oracle 12c Offloading RAC locking into Solaris Kernel (Higher Throughput, Reduced Latency)
• 65% less $/tpm with SPARC T5-2 Solaris (vs Intel/RedHat/VM)
• Optimize Database with 32 TB SGA startup from 40 minutes to 2 minutes
• New Solaris Shared Memory interface to resize SGA with no downtime
• Software Defined Network optimized and provided to Oracle 12c
• DTrace instrumentation for I/O events in Oracle 12c
• Oracle 12c v$kernel_io_outlier
Solaris 11.2 with Java Enhancements
• Solaris 11 Massive JVM improvement from T4, T5 / Java 7 to T5 / Java 8 over Intel
• Automatic Large Page Support
• Locking Infrastructure
• Zero Percent Virtualization
• Java Mission Control for DTrace visualization
Future SPARC Improvements
• Database Query Acceleration
• Java Acceleration
• Application Data Protection
• Data Compression and Decompression
Unified Archive Benefit to Applications in:
• Physical Hardware
• Zones
• LDom/OracleVM
• Customer can Leverage
Oracle was the first and best customer using Oracle SuperCluster

The Economics of Oracle Solaris: Lower Your Costs
[Video] - Scott Lynn: Solaris Product Management
History of SPARC Performance Leadership
• Power7+: 10% improvement over 3 years
• Intel x86: 20%-50% improvement each generation
• SPARC: Over 2x Performance Improvement each generation
Solaris and SPARC Performance Leadership
• 78% decrease in Price/Performance over M9000
• 85% decrease in software costs on large Intel dual socket
• 68% decrease in software costs/vm using Intel Solaris over RedHat and VMware
• 74% decrease in overall cost/vm using SPARC Solaris T5-2 over RedHat and VMware

SPARC CPU Architecture, Solaris OS, Virtualization, OpenStack - Complete.

Friday, April 25, 2014

Engineering for the Cloud: Solaris 11.2

Webcast: Announcing Oracle Solaris 11.2
Tuesday April 29, 2014
1:00pm (ET) / 10:00am (PT)

1:00pm-1:20pm Welcome and Introduction
Speaker: Mark Hurd, President, Oracle
1:20pm-2:00pm Announcing Oracle Solaris 11.2
Speakers: John Fowler, Executive Vice President, Systems, Oracle;
Markus Flierl, Vice President, Solaris Engineering, Oracle
2:00pm-2:30pm Oracle Solaris: Real-world Perspectives
Direct from the Experts: Oracle Solaris Deep Dives
• Oracle Solaris Lifecycle Management: Agile. Secure. Compliant.
• Oracle Solaris 11.2 Virtualization and SDN: Integrated. Efficient. Secure.
• Oracle Solaris OpenStack
• Oracle Solaris: Optimized for Oracle Database, Oracle Java and Oracle Applications
• The Economics of Oracle Solaris: Lower Your Costs
Speakers: Markus Flierl, Vice President, Solaris Engineering, Oracle;
Scott Lynn, Solaris Product Manager, Oracle;
Eric Saxe, Senior Manager, Software Development, Oracle

Thursday, April 17, 2014

Hardware: American Sell-Off with IBM and Google

[IBM Logo, courtesy IBM]

As the misguided U.S. economy continues to run up massive debt and a massive trade deficit, the sell-off of U.S. high-technology assets to non-U.S. companies, fat with outsourcing cash, continues. Lenovo, a Chinese company, continues its purchases in the United States of the inventors of technology.
[Chinese glorifying revolution, courtesy The Telegraph]
Chinese Lenovo Purchasing U.S. Hard Technology

Chinese global company Lenovo has been purchasing its way into the U.S. market through many technologies essentially invented in the United States. IBM seems to be the most significant seller.

        [IBM PC, courtesy Wikipedia]
        • 2005-05-01 - PC Division acquired from IBM (PC's and ThinkPad Laptops)
          Chinese computer maker Lenovo has completed its $1.75 billion purchase of IBM’s personal computer division, creating the world’s third-largest PC maker, the company said Sunday. The deal — one of the biggest foreign acquisitions ever by a Chinese company
          [IBM Thinkpad, courtesy tecqcom]
        • 2006-04-10 - Lenovo makes break with the IBM brand (on PC's, not ThinkPad Laptops)
          Since Lenovo took over the IBM personal computer business on May 1, 2005, the company's advertising and marketing efforts have excluded IBM almost entirely. The four television spots that Lenovo ran during the Turin Winter Olympics, for example, never mentioned IBM at all. In fact, the only connection to the iconic brand is the IBM logo, which still adorns Lenovo's ThinkPad laptops.
        • 2013-01-07 - Lenovo to create ThinkPad-focused business unit to compete at the high end
          Lenovo is reorganizing its operations into two business groups... As part of the restructuring, it will create two new divisions, Lenovo Business Group and Think Business Group.The reorganization, which will be completed on April 1 [2013]
          [IBM Servers, courtesy Wikipedia]
        • 2014-01-23- Lenovo to buy IBM's x86 server business for $2.3bn (PC Servers)
          Lenovo and IBM announced on Thursday they have signed a definitive agreement that will see the Chinese hardware giant acquire the IBM's x86 server business for the tidy sum of $2.3bn, with approximately $2bn to be paid in cash and the balance in Lenovo stock.
          Adding to the PC business Lenovo acquired from IBM in 2005, Lenovo will take charge of IBM's System x, BladeCenter and Flex System blade servers and switches, x86-based Flex integrated systems, NeXtScale and iDataPlex servers and associated software, blade networking and maintenance operations.
          [Motorola Droid RAZR, courtesy Wikipedia]
        • 2014-01-29 - Motorola Cellphone Company acquired from Google (by Lenovo)
          Lenovo has signed a deal to buy the loss-making Motorola Mobility smartphone manufacturer for $2.91bn, but a switched-on Google is keeping the patents owned by the firm it gobbled two years ago for $12.5bn.
          "The acquisition of such an iconic brand, innovative product portfolio and incredibly talented global team will immediately make Lenovo a strong global competitor in smartphones," said Lenovo's CEO Yang Yuanqing. "We will immediately have the opportunity to become a strong global player in the fast-growing mobile space."
        • 2014-01-29 - Lenovo splits into 4 groups after buying IBM's server business
          A few days after announcing its plan to buy IBM’s x86 server business, the Chinese company is dividing its operations into four business groups... enterprise products... developing a software ecosystem...PCs and mobile products. The changes go into effect on April 1 [2014]
        Clearly, Lenovo has a vision for the U.S. Market and is executing upon it. How unfortunate that American companies such as IBM and Google see little value or possibility in domestic hardware innovation, moving into the future.
        [HP Logo, courtesy eWeek]
        Impacts in the U.S. Market

        Partners and customers of IBM feel a great deal of uncertainty through such acquisitions. Previous acquisitions leveraged the IBM logo to help reassure customers, but with the latest purchase, competitors such as HP are seeing a lot of noise.
        • 2014-04-11 - HP: Lenovo's buy of IBM x86 biz is bad, bad, bad...
          "Customers and partners are concerned. They are concerned about what the future will be for them – not only in the product but also in support and services," claimed the exec veep and GM of the Enterprise Group.
          HP has an internal migration programme to support customers with IBM servers as they decide to make the switch, he pointed out.
          But providing maintenance support is something that HP and other vendors already offer on third-party kit as standard.
        HP had tried to consolidate all of its computing systems under Intel Itanium, before trying to shut them all down. HP also tried to sell off its PC business, but relented, possibly due to customer pressure. How conservative U.S. customers who would only buy IBM will respond to their favorite manufacturer leaving the industry may not be a difficult conclusion to reach, especially for companies like HP.
        Concluding Thoughts:
        The massive technology bleed from the United States is partially due to commoditization, but also due to the migration to Cloud and Appliances. The value provided by Intel computing vendors is becoming less significant, with Intel shipping entire motherboards bundling CPU, Floating Point, Memory Management Units, Ethernet, and most recently Video. Cell phones appear to be drastically simplifying, as well. Perhaps there was nothing of value left for Intel or cell phone based manufacturers to do? Can Apple buck the trend?

        Sunday, April 13, 2014

        Security: Heartbleed, Apple, MacOSX, iOS, Linux, and Android

        Nearly every computing device today is connected to others via a network of some kind. These connections open up opportunities, or vulnerabilities, for exploitation by mafia, criminals, or government espionage via malware. While Apple computers running MacOSX are immune, along with iOS based mobile devices such as iPhones and iPads... huge numbers of Linux and Android devices are at risk!


        This particular vulnerability can be leveraged by many sources in order to capture usernames and passwords, and those account credentials can later be used for nefarious purposes. Such purposes include: command and control to attack commercial, financial, or government targets, or even to launch attacks against entire national electrical grids; stealing money; stealing compute resources. The defect is well documented.

        Apple and Android/Linux Vulnerabilities:

        There are many operating systems which are vulnerable to this defect, but for this article, we are only really concerned about the mobile market.
        While most of the buzz surrounding OpenSSL's Heartbleed vulnerability has focussed on websites and other servers, the SANS Institute reminds us that software running on PCs, tablets and more is just as potentially vulnerable.
        Williams said a dodgy server could easily send a message to vulnerable software on phones, laptops, PCs, home routers and other devices, and retrieve up to 64KB of highly sensitive data from the targeted system at a time. It's an attack that would probably yield handy amounts of data if deployed against users of public Wi-Fi hotspots, for example.
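        The 64KB figure has a simple origin: a heartbeat message carries a 16-bit payload length field, and the defective code echoed back that many bytes without checking the field against the payload it had actually received. What follows is a toy Python model of that missing check, not OpenSSL code. The function names, the simplified message layout (padding is omitted), and the contents of the "adjacent memory" are all assumptions made up for illustration.

```python
import struct

# Hypothetical data that happens to sit next to the payload in memory.
ADJACENT_MEMORY = b"user=alice&password=hunter2;session=abc123"


def build_heartbeat(payload: bytes, claimed_len: int) -> bytes:
    """Build a simplified request: 1-byte type, 2-byte claimed length, payload."""
    return struct.pack("!BH", 1, claimed_len) + payload


def respond(request: bytes, check_bounds: bool) -> bytes:
    """Echo the payload back, with or without validating the claimed length."""
    _msg_type, claimed_len = struct.unpack("!BH", request[:3])
    payload = request[3:]
    if check_bounds and claimed_len > len(payload):
        # Patched behaviour: an inconsistent request is silently dropped.
        return b""
    # Defective behaviour: trust the claimed length and read past the payload.
    memory = payload + ADJACENT_MEMORY
    return memory[:claimed_len]


honest = respond(build_heartbeat(b"hi", 2), check_bounds=False)
leaked = respond(build_heartbeat(b"hi", 40), check_bounds=False)
patched = respond(build_heartbeat(b"hi", 40), check_bounds=True)

print(honest)
print(leaked)
print(patched)
print("largest claimable length:", 0xFFFF, "bytes")
```

        In this model a two-byte payload that claims to be 40 bytes long comes back with 38 bytes of neighbouring data attached, while the bounds-checked version returns nothing. The largest value a 16-bit field can claim is 65,535 bytes, which is where "up to 64KB ... at a time" comes from.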
        While Google said in a blog post on April 9 that all versions of Android are immune to the flaw, it added that the “limited exception” was one version dubbed 4.1.1, which was released in 2012.
        Security researchers said that version of Android is still used in millions of smartphones and tablets, including popular models made by Samsung Electronics Co., HTC Corp. and other manufacturers. Google statistics show that 34 percent of Android devices use variations of the 4.1 software.

        The company said less than 10 percent of active devices are vulnerable. More than 900 million Android devices have been activated worldwide.
        After taking a few days to check its security, the fruity firm joined other companies in publicly announcing how worried or secure its customers should feel.
        “Apple takes security very seriously. IOS and OS X never incorporated the vulnerable software and key Web-based services were not affected,” an Apple spokesperson said.

        To give a sense of the number of mobile Android devices at risk, take the population of the United States, roughly 317 million people, as a baseline. If just under 10 percent of the 900 million activated Android devices are vulnerable, that is up to 90 million Linux based Android devices, equivalent to roughly 28% of the population of the United States! This is no small number of mobile devices: there is a lot of patching that needs to be done, or mobile devices which should be destroyed. Ensure you check your Android device!
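        The comparison can be checked with a few lines of arithmetic. This small Python sketch takes the quoted figures at face value and assumes the upper bound of 10 percent ("less than 10 percent of active devices") applied to the 900 million activated devices; the variable names are illustrative only.

```python
US_POPULATION = 317_000_000        # rough baseline used in the text
ANDROID_ACTIVATED = 900_000_000    # "more than 900 million" activated devices
VULNERABLE_PERCENT = 10            # assumed upper bound: "less than 10 percent"

# Integer arithmetic keeps the device count exact.
vulnerable_devices = ANDROID_ACTIVATED * VULNERABLE_PERCENT // 100

# Express the device count as a percentage of the U.S. population.
percent_of_population = 100 * vulnerable_devices / US_POPULATION

print(f"Vulnerable devices (upper bound): {vulnerable_devices:,}")
print(f"Equivalent share of U.S. population: {percent_of_population:.1f}%")
```

        Because both inputs are bounds rather than measurements, the result is best read as an order-of-magnitude comparison, not a count of affected people.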

        Thursday, April 10, 2014

        Window Manager Lineup

        [TWM History, courtesy Wikipedia]
        X Windows is a Client-Server based windowing system, where the client applications can run on foreign servers and the X Windows Server provides the resources the client needs to run properly, such as Frame Buffer, Keyboard, and Mouse. The X Windows Client application may run on any Hardware or OS Platform, consuming memory and CPU resources on the remote side, and is not bound by architecture or byte order to the X Server. This article discusses one such client, the Window Manager.

        [X Windows Architecture, Courtesy Wikipedia]

        An X Client may consume resources from a single X Server, whether it is as simple as a Clock Application or as complicated as a Desktop Publishing Application. An X Client may consume resources from multiple X Servers for gaming, such as X Tank or X Battle. A special kind of X Client is called the Window Manager. The Window Manager acts as a client: it may run as a local client, on the platform hosting the X Server, or it can run on a different platform hosting clients. The Window Manager provides controls to the desktop environment, which is ultimately virtualized through the X Protocol.
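        Whether a client such as a Window Manager runs locally or on another platform, it is conventionally told which X Server to use through a display name of the form [host]:display[.screen], usually taken from the DISPLAY environment variable. The sketch below parses such a name in Python; it assumes only that basic form (no IPv6 literals or protocol prefixes), and the host name "desktop1" is made up for the example.

```python
from typing import NamedTuple


class XDisplay(NamedTuple):
    host: str      # empty string means "the local machine"
    display: int   # which X Server on that host
    screen: int    # which screen of that server


def parse_display(value: str) -> XDisplay:
    """Parse a display name of the form [host]:display[.screen]."""
    host, sep, rest = value.rpartition(":")
    if not sep:
        raise ValueError(f"no ':' in display name {value!r}")
    number, _, screen = rest.partition(".")
    return XDisplay(host, int(number), int(screen) if screen else 0)


def tcp_port(target: XDisplay) -> int:
    """Conventional TCP port of a networked X Server: 6000 + display number."""
    return 6000 + target.display


# ":0" is a local X Server; "desktop1:1.0" is a hypothetical remote one.
for value in (":0", "desktop1:1.0"):
    target = parse_display(value)
    if target.host:
        print(f"{value!r}: remote X Server {target.host}, TCP port {tcp_port(target)}")
    else:
        print(f"{value!r}: X Server on the local machine, display {target.display}")
```

        A Window Manager started on an application host with a display name pointing at the user's desktop would consume that desktop's Frame Buffer, Keyboard, and Mouse while using the application host's memory and CPU.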

        [Open Look Virtual Window Manager, courtesy Layer 3 Networking]
        Window Managers come in many different flavors. A recent article series on window managers hit the Layer 3 Networking Blog and offers a view into what may be appropriate for a vendor's virtual desktop environment.

        2013-03-17 --- A Memory Comparison of Light... Desktops – Part 1
        Fortunately, ...we have plenty of other choices, and we do like change. We have no need to keep using desktops we don’t like. I will describe some of choices in this article, and I’ll attempt to measure the RAM memory requirements.

        2013-04-09 --- A Memory Comparison of Light... Desktops – Part 2
        ...I’ve tried to investigate the RAM memory requirements for running some of the most common light window managers and desktop environments available... Prompted by several readers, I’ve decided to include also the big, well-known memory hogs that grab most of the... market, i.e. KDE, Unity and Gnome.

        2014-02-15 --- A Memory Comparison of Light Linux Desktops – Part 3
        Unused memory goes into a special buffering pool, where the kernel caches all recently used data. If a process attempts to read a file and the kernel already has the file cached, reading it is as fast as reading RAM. Filesystem-heavy task, such as compiling source code, processing video files, etc. benefit from as much free memory as possible in buffering pool. It is not uncommon today to see users with powerful systems running tiling window managers in only a few megabytes of memory.
        [Lineup of Window Managers by Resource Utilization, courtesy Layer 3 Networking]
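        The buffering pool described in Part 3 is why a desktop's memory footprint on Linux is usually judged after subtracting buffers and cache from the memory in use. The sketch below shows one way to do that subtraction in Python from text in the format of /proc/meminfo. It is not the method the cited articles used, and the sample numbers are invented for illustration.

```python
# Invented sample in the layout of Linux's /proc/meminfo (values in kB).
SAMPLE_MEMINFO = """\
MemTotal:        2048000 kB
MemFree:          512000 kB
Buffers:           64000 kB
Cached:           768000 kB
"""


def parse_meminfo(text: str) -> dict:
    """Return a mapping of field name to integer value (kB) from meminfo text."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_excluding_cache(info: dict) -> int:
    """Memory held by programs: total minus free, buffers, and page cache."""
    return (info["MemTotal"] - info["MemFree"]
            - info.get("Buffers", 0) - info.get("Cached", 0))


info = parse_meminfo(SAMPLE_MEMINFO)
print("Used by programs (kB):", used_excluding_cache(info))
```

        On a Linux system the same functions can be fed the contents of /proc/meminfo instead of the sample string; other operating systems expose these counters differently.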

        The author of these articles placed a disproportionate weight upon Linux, which did not even exist when X Windows was released, so it should be noted that any OS can leverage these Window Managers. The layer of control the Window Manager offers to the virtual desktop user is what is most important for the environment where virtualization is occurring. What really matters is the application being virtualized, not the window manager. Which desktop features are required to deliver the virtualized application to the end user is therefore an economics question, and this article series provides excellent data points for an architect to leverage in making the appropriate business decision.

        Wednesday, April 2, 2014

        Security: Android Phone App Steals CPU

        [Android marketplace shopping bag, courtesy AndroidAuthority]
        Malware was traditionally seen as only a Microsoft Windows problem. Now that highly secured, multi-platform, standards-based UNIX environments are losing influence, malware continues to spread to poorly secured Linux environments. More importantly, Google Android's mobile phone and tablet platforms have fallen victim. Attacks continue mercilessly.

        [Old analog time clock]
        Recent History
        Validated attacks on Linux and Android have been reported for January through November 2013, December 2013, January through February 2014, and March 2014, and more malware keeps hitting both platforms. The most recent attacks use Linux based Android phones to create money for others.

        [Virus eating desktop computer]
        Latest Attack
        At the end of March 2014, a new attack was discovered... not only in third-party Android application sources on the internet, but also in multiple infected applications found on Google Play.

        2014-03-26 - Apps with millions of Google Play downloads covertly mine cryptocurrency
        Yes, smartphones can generate digital coins, but at a painfully glacial pace.

        According to a blog post published Tuesday by a researcher from antivirus provider Trend Micro, the apps are Songs, installed from one million to five million times, and Prized, which was installed from 10,000 to 50,000 times. Neither the app descriptions nor their terms of service make clear that the apps subject Android devices to the compute-intensive process of mining, Trend Micro Mobile Threats Analyst Veo Zhang wrote. As of Wednesday afternoon, the apps were still available.
        If you download applications from Google Play or from non-Google sites, you may notice terrible battery life, increased battery temperature, and increased network usage.
        [Global network image]
        What This Means To You
        While Google has managed to remove some trojan applications which were designed to steal CPU time from your smart phone in order to harvest digital coins for application developers, there are others sitting in Google Play and in non-regulated application markets.

        Wednesday, March 26, 2014

        Security: Software Piracy, Android Phones, and SMS Spam

        [Courtesy: Android Authority]
        Ever since the creation of computers, people have been copying software to avoid paying for it, or paying to distribute something that people don't want. Pirated Applications and Spam are two primary means of distributing viruses, malware, and worms. Baby steps against these on-line monsters are occasionally made.

        In Review: 2013

        From January to November last year, nearly 2 viruses, trojans, or other pieces of generic malware were discovered each month in the Android mobile application market. December had a couple more discovered. For every piece of malware discovered, there are countless mobile applications which have not yet been discovered... built to steal credit card information or identities, or even "command and control" applications that turn your mobile device into a robot against unsuspecting targets (while you pay for the data traffic that is produced!)

        Starting: 2014

        Although a consolidated list of mobile malware in the Android market has not been completed, it is clear that there is some progress in this space... no matter how small.

        2014-03-25 U.S. Government First Convictions Over Pirated Mobile Android Applications
        The US has enforced its first convictions for illegally distributing counterfeit mobile apps, after two Florida men pleaded guilty for their part in a scheme that sold pirated apps with a total retail value of more than $700,000. Thomas Allen Dye, 21, and 26-year-old Nicholas Anthony Narbone both pleaded guilty to the same charge - conspiracy to commit criminal copyright infringement - earlier this month and are due to be sentenced in June and July respectively. Both men were in the Appbucket group, of which Narbone was the leader, which made and sold more than a million copyrighted Android mobile apps through the group's alternative online market.

        2014-03-26 Chinese Arrest 1,500 in Fake Cellular Tower Text Message Spam Raid
        China’s police have arrested over 1,500 people on suspicion of using fake base stations to send out mobile SMS spam. The current crackdown, began in February, according to Reuters. Citing a Ministry of Public Security missive, the newswire says a group operating in north-east Liaoning province, bordering North Korea, is suspected of pinging out more than 200 million spam texts.

        In Conclusion:
        Be diligent! Remember to purchase your applications from reputable places. Don't be seduced into stealing applications on-line or purchasing them under list price. Being a thief could make you a victim!

        Tuesday, March 4, 2014

        Security: Linux, Viruses, Malware, and Worms

        Not long after the advent of The Internet, worms, viruses, and other malware became prevalent. Microsoft based platforms were the original serious target, because of poor security measures. Over time, malware started to attack Linux based Android mobile phones. Now, the latest attacks appear to be hitting Linux based consumer grade internet routers, which were originally used to help protect Microsoft Windows based platforms in the home. These attacks have spiked in the first two months of 2014.

        [Huawei TP-Link image, courtesy rootatnasro]
        2013-01-11 - How I saved your a** from the ZynOS (rom-0) attack!! (Full disclosure)
        Hello everyone, I just wanted to discuss some vulnerability I found and exploited for GOODNESS .. just so that SCRIPT KIDIES won’t attack your home/business network .
        Well, in Algeria the main ISP ( Algerie Telecom ) provide you with a router when you pay for an internet plan. So you can conclude that every subscriber is using that router . TD-W8951ND is one of them, I did some ip scanning and I found that every router is using ZYXEL embedded firmware.

        [Linksys Router, courtesy ARS Technica]
        2014-02-14 - Bizarre attack infects Linksys routers with self-replicating malware
        Linksys is aware of the malware called “The Moon” that has affected select older Linksys E-Series routers and select older Wireless-N access points and routers. The exploit to bypass the admin authentication used by the worm only works when the Remote Management Access feature is enabled. Linksys ships these products with the Remote Management Access feature turned off by default. Customers who have not enabled the Remote Management Access feature are not susceptible to this specific malware. Customers who have enabled the Remote Management Access feature can prevent further vulnerability to their network, by disabling the Remote Management Access feature and rebooting their router to remove the installed malware. Linksys will be working on the affected products with a firmware fix that is planned to be posted on our website in the coming weeks.

        [ASUS Warning, courtesy ARS Technica]
        2014-02-17 - Dear Asus router user: You’ve been pwned, thanks to easily exploited flaw
        "This is an automated message being sent out to everyone effected [sic]," the message, uploaded to his device without any login credentials, read. "Your Asus router (and your documents) can be accessed by anyone in the world with an Internet connection. You need to protect yourself and learn more by reading the following news article:"
        Two weeks ago, a group posted almost 13,000 IP addresses its members said hosted similarly vulnerable Asus routers.

        If you are doing any serious internet based work, take care to watch the firmware of your consumer grade internet router, and apply firmware upgrades as they become available. If you are running a business, a commercial grade router with a managed service may be of special interest. A short PDF on "SOHO Pharming" helps clarify the risks. The avoidance of Linux based Android phones or consumer grade Linux routers may be the next best step.