With Cray on board, Intel rounds out its interconnect strategy – what about the others?
The impact goes beyond HPC, to the whole enterprise…
By virtue of my focus on high-end system integration, I have been aware for years of attempts by Intel, and by other CPU vendors, to consolidate the high-end system interconnect part of their product offerings. After all, seamlessly connecting a large number of high-margin multi-socket servers into single systems, whether clusters or even tighter shared-memory machines, helps make a winning proposition for many multi-million dollar deals in the datacenter, cloud and supercomputing space.
As the server and networking businesses come closer together, we saw Intel add Fulcrum's switch chips to its existing 10GE node controllers, then acquire QLogic's InfiniBand arm, and now, just recently, finally conclude a long-negotiated deal with Cray to get hold of its high-speed interconnect. What's the impact?
Even in message-passing clusters, the mainstream standard of today, a very efficient low-latency interconnect can boost the performance of a very large system substantially, whether that is an extra 10% in the Linpack benchmark for a higher TOP500 standing and the related owner ego boost, or much more than that in actual highly parallel applications that are sensitive to lag on the connections between systems. Think molecular modeling or weather simulation here…
Any trouble for AMD, since it was close with Cray? Not really. The problem there is that Cray needs a new, faster CPU core from AMD, and the CPU vendor will need some time for a full turnaround from Bulldozer to bring in a much faster core that can compete with Intel again. On the other hand, AMD has had an ultrafast, open-spec large-system interconnect for years: the High Node Count extension to HyperTransport, which uses inexpensive InfiniBand infrastructure as the physical layer but runs the efficient, low-overhead HyperTransport protocol. It is handled by the HT Consortium and can be used on all HyperTransport-enabled CPUs. This includes the Chinese Loongson MIPS CPUs, whose high-speed, multi-teraflop-per-chip derivative is expected to be deployed (as the main CPU, not an accelerator, mind you) some 2 years from now in one of China's 100 PetaFLOPs systems, in the city of Chongqing. The irony, yes, is that non-x86 CPUs may benefit the most from it.
And the others? Mellanox, which became the dominant InfiniBand total-solution provider with its acquisition of fellow Israeli company Voltaire, needs to diversify beyond InfiniBand, where it now competes against Intel (who created InfiniBand, after all), and beyond its Ethernet presence, where the likes of Huawei are gaining ground too. To keep its market position, it may need to look at solutions above InfiniBand in both performance and features.
System vendors like SGI and HP will lose a bit more of their differentiation, as Intel's broader interconnect range will cover more of their niches. SGI's NUMAlink is still the fastest and tightest way to link multiple Xeons, enabling a fully shared single memory space and a single system image for 2,048 Xeon cores and 16 TB of RAM in one block, and even more in the Ivy Bridge EX generation. Such capabilities could now come to Intel interconnects sold openly to any system integrator.
Finally, the smaller fish will likely have to get away from the dangerous open seas now. Small corners under the coral reefs, meaning specific niches such as NUMA single-memory-image interconnects for supercomputers or application-specific networks, may be the solution.
Picture Credits: Chinese VR-Zone