It is perhaps inevitable that, in a publication such as Scientific Computing World, editorial coverage of high-performance computing should be dominated by superlatives: journalists are naturally drawn to write about the first, the biggest, the fastest, the most energy efficient, the unique. Thus, for example, the Gauss Centre for Supercomputing tends to be the focus in Germany, for that is where the largest machines are based. In contrast, only recently has there been an article on our website (a portal opens to German HPC centres) featuring the Gauss Alliance, which is made up of the slightly smaller, university-based supercomputers.
But this selectivity is distorting. At the end of this year, Transtec expects to deliver a 1,100 node Lenovo cluster to the Karlsruhe Institute of Technology (KIT) in what is the largest contract in Transtec’s history. Earlier this year, in April, Oxford University officially inaugurated its own Lenovo cluster, where the integrator this time was OCF. At the University of Vienna, in the summer of last year, ClusterVision installed Europe’s first high-performance cluster to be cooled by total immersion in mineral oil. Also on the ‘green IT’ front, Megware has just completed the installation of a cluster at the University of Greifswald in Germany, involving direct hot water cooling, where the waste heat is repurposed to a district heating circuit in the datacentre building, in accordance with the university’s strategy of creating a ‘CO2-neutral university’.
There are many more contracts for business and industrial customers than academic institutions, but these seldom get much attention. Very often the customer insists on commercial secrecy in the contract, so that many such installations cannot be reported. Here the distorting selectivity is not journalistic caprice, but fear of disclosing business sensitive information. (The Formula 1 car racing companies, for example, all make extensive use of high-performance computing, but it is impossible to extract details of their installations.)
The marketplace for HPC systems is clearly more subtle and complicated than a concentration on the highest end machines, or on academic contracts, might indicate. Lenovo manufactures and sells HPC systems, but its machines also reach end-users through the mediation of integrators such as Transtec and OCF. And Lenovo is not alone; NEC and Huawei are other examples of manufacturers who partner with integrators. Dell has a close business relationship with ClusterVision.
In a neat inversion of the pattern, Bull, the biggest European supercomputer supplier, regards Atos, now its parent company, as an integrator, albeit one dedicated solely to selling Bull hardware.
Only a few companies (Cray is one example) tend not to use integrators. While Cray has a very distinct commercial strategy of aiming only at the very high end of the market, it is not stretching the meaning of the word too much to regard Cray as sometimes being its own integrator. Its averred policy is to offer what it considers the best technology, whether it originated within the company or is brought in from outside.
IBM’s strategy is different; it has settled on the Power architecture and aims to extend its reach by building cooperation through the Open Power Foundation, of which both Bull and OCF are members. Nonetheless, in this short survey of the HPC marketplace, IBM appears to be a bit of an outlier.
Large supercomputing centres have their own reservoir of expertise and talent, allowing them to draw up their own detailed specifications and thus buy direct from the manufacturers. But smaller organisations cannot hope to command such expertise. Even though a university may have profound technical knowledge, it may be deep but not broad enough to cover all the factors and components that have to be brought together – that have to be ‘integrated’ – to make a modern cluster work. It is for these customers that the specialist integrators offer the greatest benefit. An oil exploration company or a wind-turbine manufacturer just wants to get on with the geophysics or the turbulence calculations and does not want to have to worry about project managing the installation of the high-performance cluster that will do those calculations.
Integrators provide solutions and service
According to Oliver Tennert of Transtec: ‘We see ourselves as a solution provider, not a manufacturer’. It is this distinction that lies at the heart of what the integrator can offer. Tennert pointed out that: ‘Lenovo can sell you only Lenovo, whereas we partner with lots of companies and therefore can select what fits best for a particular solution.’
Christopher Huggins, ClusterVision’s commercial director, also stressed the flexibility of the offering from an integrator: ‘Tier 1s have specific configurations they like to offer to the market, whereas the customer may want more memory, or GPUs, or cooling that the Tier 1 may not offer. We can offer something extra.’
Transtec had tried to position itself as a manufacturer in HPC, but it moved away from that towards being a solutions-oriented company, according to Tennert, so that, while it sells hardware: ‘Hardware is not our differentiator. Our knowledge, our ability to choose optimal components, and our service, are key,’ he continued. ‘Customer care doesn’t start with selling and stop with delivery.’
In Huggins’ view, one reason to work with an integrator is that ‘You get personalised service. If a customer calls ClusterVision support, they will get someone who lives and breathes HPC.’ In the larger companies, the Tier 1s, this focus can get lost, he said.
Integrators understand the science
Geography is important too, as it enhances not just local but sector-specific knowledge; Transtec’s location, near Stuttgart, means that there are about 200 engineering companies within 30 miles, all of them part of the supply chain of larger technology companies in the region – a classic instance of Germany’s famous Mittelstand. Apart from an original focus on academic computing, Transtec moved into providing workstations, which is still where the bulk of computational engineering is done, and which is still an important part of the company’s business. But proximity to engineering customers also brings with it extensive experience with CAE workflows, which can be utilised for HPC: ‘We know the workflow specifics of our industrial customers,’ Tennert added.
As an HPC specialist, ClusterVision is also a company with a background in science and research, Huggins pointed out. ‘There are a lot of PhDs and ex-scientists walking around. It’s a different persona from a Tier 1 – we understand the science a lot more.’ ClusterVision is based in Amsterdam and this offers a different advantage of geography. The company’s offices are minutes from Schiphol airport and so, if there is a service issue that requires an onsite visit: ‘we’re perfectly placed to send engineers out to all four corners of Europe,’ he said.
Historically, ClusterVision too was very strong in academia but over the years it has branched out to installations in industry, to the point now where more than 50 per cent of its business is in industry. As is so often the case, the company is contractually restrained from identifying its commercial customers, but they range geographically from the Middle East (almost certainly in the petrochemical industry) to Scandinavia (in the automotive sector), and in the pharmaceutical industry elsewhere in the EU. It also has industrial and academic customers in the UK. However, the company’s strategy, according to Huggins, is to go after industrial contracts in the 5 to 7 million euro bracket. ‘This is a growing area, where we see Cray more and more. Our high end is their low end.’
Tennert stressed that Transtec does not provide the software solution, such as Ansys, but rather the platform on which to run it: ‘Ninety per cent of the cost for the customer is hardware,’ he said. Sometimes, because the customer is not an IT specialist, they do not know what they want and the integrator’s job is ‘to give them the feeling that we understand what they need’.
Keeping an eye on the technology
The integrator also has to watch what is changing in the hardware market, Tennert continued. In his view, Huawei is very aggressive in terms of price in the server market, while Supermicro remains dominant, but is not always the automatic choice. Lenovo, which has taken over IBM’s x86 business, is still establishing itself in its new identity. He pointed out that the quality of Intel hardware is very good, but that that is reflected in the price and not all his customers want to pay a premium because they do not necessarily need the ‘nice to have’ features.
At ClusterVision, according to Huggins, ‘We take pleasure in providing solutions to the market that are cutting edge.’ In his view, integrators can bring new technology to the market more quickly than the larger companies: ‘We have to jump through fewer hoops than the Tier 1s – we’re faster to market.’ It is not always the technology that is the issue for the larger companies, Huggins explained. There have been cases where the larger companies have dithered over the pricing of their products, whereas a smaller company such as ClusterVision faces fewer complications and can therefore reach a decision more quickly.
Customers and providers both benefit
It is not just the end-user customers who profit from the services of the integrators – the large manufacturers also benefit. Tennert cited three of Transtec’s partners: Lenovo; NEC; and Huawei. Each has a slightly different approach but all benefit from using integrators as intermediaries. NEC, he suggested, has realised that its installation and delivery services are limited so they need partners such as Transtec to ‘amplify’ their sales and distribution.
Huawei is also looking for partners, he explained, because it is not interested in selling direct to the end-user customer (or not yet, at any rate). As a relative newcomer to HPC, certainly compared to NEC and Lenovo (in its former IBM persona), the market pressure on Huawei is to find partners if it is to get its products accepted. Lenovo is a manufacturer that also sells direct, but because of the size of the company its sales and distribution are focused on very large orders, which means that smaller customers find it easier to acquire Lenovo hardware via an integrator.
ClusterVision works with the ‘white box’ providers such as Supermicro, and Huggins sees one of the primary advantages of using an integrator such as his company as being its ability to deploy hardware and software from multiple vendors, to the end-user customer’s advantage. But it also has a very strong partnership with Dell, Huggins said. Dell is very strong in HPC in the USA but less so in Europe and the Middle East, so ‘that is where we complete their offering.’ Dell has a much larger sales force than ClusterVision, so it has many accounts to manage, and Dell’s account manager may not be an HPC specialist. Delivering an HPC cluster requires different project management from what Dell is used to, and so ClusterVision provides that project management.
However, Tennert was careful to point out that an integrator is not just a sales channel: the relationship is a partnership with both sides bringing different expertise to the project. Being simply a distributor does not work in HPC, he said; something more was needed: a solution provider. Huggins concurred; the service that a specialist provides starts long before a sale is made – before the customer even writes the tender, he said.
Being a relatively small company was not a hindrance when it came to keeping up with technology, Tennert felt; rather, it could be an advantage. The very large companies of necessity focused on themselves and their own technological lines of development. For the time being, x86 technology is dominant and within that, the main player is Intel. However, Lenovo is right to be looking at ARM, in Tennert’s view, and NEC’s SX vector system is also worth watching. IBM is now specialising in the Power architecture and in Big Data applications which, he said, were not traditionally areas that Transtec specialised in.
Why technology providers can also be integrators
Even though one of Transtec’s partners – Lenovo – is a very different organisation, Noam Rosen, its EMEA sales director, also believes that ‘HPC is a solution, not a specific product’. In a multinational industrial corporation that does its own manufacturing, from PCs and mobile phones through to enterprise servers, Lenovo’s concept of partnership includes partnering internally with the general-purpose server team, so that there is a fit with the HPC business, he said. The HPC business also benefits from some similarity with hyperscale computing – the data centres for Google, Facebook, and the like. In fact, since Lenovo is a Chinese company, about a third of its revenue comes from hyperscale data centres within China for whom the hyperscale team within Lenovo has been developing unique design form factors.
But a counterweight to any introspective tendencies, highlighted as a potential problem by Tennert, was that many of the company’s clients at the very high end have their own talents, and so partnership with their expertise can help the company decide what to bring to the marketplace that makes sense for HPC. He cited the example of the Hartree Centre in the UK with whom Lenovo is collaborating over an ARM prototype.
Even within the context of its own systems, Lenovo is, to some extent, also an integrator. It does not make its own interconnects, for example, but uses Mellanox InfiniBand, though Rosen also looked forward to having the option of Intel’s interconnect fabric as an alternative. ‘We design the solution with the client,’ he said, ‘integrating all the elements we need to make a workable HPC cluster.’ Not everything is designed and built by Lenovo, he pointed out.
Now that the HPC business has separated from IBM, it can offer other storage options and other scheduling and machine management software, he said, citing the shipment, this summer, of two DDN storage systems to a client in the Middle East. As part of this greater flexibility, during the ISC high-performance meeting in Frankfurt in July, Rosen was not just meeting potential customers but also complementary vendors wanting to work with Lenovo.
When the IT services company Atos acquired Bull last year, there were many questions about whether the high-performance computing side of the business would survive. As the new relationship develops, a rather surprising view of the advantages to Bull is beginning to emerge: that the relation of Bull to Atos is analogous to the relation of, say, Lenovo to its integrator partners. Indeed, in an interview with Scientific Computing World at the ISC high performance meeting in Frankfurt in July, Claude Derue, head of HPC marketing for Bull, put it in precisely these terms: ‘Atos is an integrator. Bull is the technology provider to a world-wide organisation, Atos, that is an integrator.’
Atos as integrator expands the market for Bull
Bull has often been regarded as a very French company, exemplified perhaps by the contract with the Atomic Energy Commission (CEA) announced in July to develop a 25 Petaflop machine, the Tera1000, intended as a forerunner to Exaflop computers; it has a presence in European climate research and weather forecasting, with machines at the German DKRZ and in the Netherlands Meteorological Institute. According to Jean-Pierre Panziera, the company’s CTO for Extreme Computing, for historical reasons it has also had good ties with Brazil.
In September, the firm announced that it had delivered the first Petaflop system for open use by the country’s academic community, as part of a partnership with the Brazilian National Scientific Computing Laboratory (LNCC) and its Ministry of Science, Technology and Innovation (MCTI). It is named Santos Dumont, after the Brazilian national hero Alberto Santos-Dumont, who was born in 1873 and who became an aviation pioneer. He designed, built, and flew the first practical dirigible, demonstrating that routine, controlled flight was possible. The installation of Santos Dumont as the central node of the Brazilian National System of High Performance Computing (SINAPAD) means the country now has the biggest supercomputer in Latin America.
But Panziera conceded that the French image of the company was not without foundation; about 55 per cent of Bull’s business hitherto had been in France. For him, part of the significance of the acquisition by Atos is that it opens up markets for Bull’s technology that are far wider than could have been reached by Bull itself. And the expansion is not only geographical, it also expands the market for smaller systems as well: ‘What we have with Atos is a broader presence, in the northern European centres, in the central European states, and in Asia/Pacific – both high-end and SMEs.’
He stressed the importance to Bull’s business of smaller, industrial systems, running not open source or home-made application software but programs from the independent software vendors (ISVs). ‘You get big chunks such as the CEA contract, but you cannot live on this alone,’ he said. The Atos connection brings with it some 175 major contracts worldwide, in the 20 million to 40 million euro range: ‘This is the future for HPC,’ he said.
Up to now, he continued, HPC had served a largely traditional market for research, weather and climate modelling. The future however was to serve the vertical requirements and with Atos, ‘it will be possible to respond more precisely to the specific needs of our customers.’ He pointed to the fusion of big data and HPC: ‘Today all our customers face an explosion of data. You have to ensure that the way you manage data is done properly – that the lifecycle fits the customer and that is the advantage of Atos, so we can take a step ahead of the competitors.’
Like Tennert and Rosen, Panziera too emphasised that while ‘we do technology, in the end we sell solutions to our customers and that involves understanding their problems.’ Customers’ needs are evolving, whether the customer is an oil company, a weather forecaster, or a company designing solar cells, so ‘what do we develop and what do we integrate from other technology providers?’ he asked. ‘The simplest thing is to take what is available – we do take boxes from Supermicro and storage solutions from Seagate or DDN.’
But sometimes the customer does not have their own datacentre, he continued, and so packaging is required at the level of the rack, the chassis and the datacentre. One of the things that marks Bull out as a technology provider, rather than an integrator, is that it can provide the boards themselves – with air cooling at one end, and Bull’s own design of direct contact liquid cooling at the high end.
Solutions not just technology at the high end
The systems marketplace in HPC has a range of companies, some of them ‘pure’ integrators, some of them mixing integration services with their own technology. Some technology providers, such as Lenovo and Dell, offer direct to the end customer for higher-end systems, but work through integrators as well, typically for the smaller clusters. The Bull-Atos relationship also spans the range, from smaller systems to very high-end clusters (Bull has an Exascale roadmap).
Bull counts therefore also as a large-system provider, like Cray and IBM. And just as Bull laid its emphasis on the customer, so too do Cray and IBM. According to Barry Bolding, senior vice president and chief strategy officer of Cray: ‘We provide computational tools that help our customers – not just computers but sets of tools.’ Sumit Gupta, formerly with Nvidia, now at IBM, said: ‘The customer doesn’t think “I’m doing HPC or high performance analytics” – the question is “what does the client need?”.’
In some respects, Cray sees its role in part as that of an integrator too, albeit for the very high-end systems. Bolding was clear that Cray is prepared to discontinue its own technology if something comes on the market that is better than its own in-house product and, indeed, it has done so in the past. ‘If someone has already solved a problem, then we don’t want to reinvent that just to say it’s Cray.’ As one historical example, he cited the technology transition over vector processors: ‘We said we didn’t need to make our own because the move to commodity processors was a better solution for the customer than sticking with our old system.’
Although Cray may decide to ‘integrate’ commodity components into its own products in this fashion, it provides the value-add for its customers by its own design and quality. Bolding pointed out that, for example, Cray had put a lot of effort and money into voltage regulation – more so than the commodity providers – and that this led to increased reliability across the cabinets. Similarly, it was better for Cray to design its own boards – they are not cheaper but they are more reliable, Bolding said, whereas the quality of commodity boards was not as good.
‘We have to do constant comparisons with commodity pricing – what is the right trade-off between reliability and cost?’ he said. In this respect, what is best for the customer may not necessarily always be what a technological purist might consider the best technology. The interplay of technology and market dynamics is a complex one: ‘Our engineers have to love the technology. My job is the business – I need a different perspective. If we can help a beautiful technology survive, we’ll do it if it is best for the customer.’
The extended sense in which Cray can be regarded as an integrator is apparent in a further comment by Bolding – that the judgement over which components to provide in-house, and which to buy from the commodity market, is ‘a judgement that has to be made at the system level’. Individual processors or storage components may be outstanding, but Cray, in Bolding’s view, has to deliver to its customers an integrated system ‘that can be up and running and solving their problems in hours’.
Convergence of HPC and big data in a single architecture?
Virtually all of the suppliers of HPC systems, whether they be technology providers like Bull, Lenovo, IBM and Cray or integrators such as Transtec and ClusterVision, believe that the future lies in some sort of convergence between data processing and the traditional number-crunching business of HPC. In finding ways to address this challenge, a subtle divergence appears to be opening in the market, principally between IBM and most of the others.
Many of the companies believe that different architectures will have to be developed, tailored to different applications – there will be no one-size fits all. In contrast, IBM believes that its Power series of processors and the Open Power ‘ecosystem’ behind it will offer the way to do both HPC and big data processing.
For IBM’s Gupta, Open Power is a ‘chance to redefine the data centre’. If high-performance computing and high-performance analytics are converging, he believes that the underlying technologies can transfer, although some of the specific technologies may be different. IBM now has a single architecture – a data-centric one – and this means that in future HPC architectures, ‘you will move the compute to the data’, rather than the other way round, he said. ‘We’re not just going after HPC, but big data, cloud, machine learning, and the enterprise. We have an architecture that goes across all these sectors.’
There will be some tweaking and optimising for different workloads, he continued, but it is clear that IBM’s vision is for the underlying Open Power architecture to run across all applications and types of work. Gupta sees both the Open Power foundation and his job, in particular, as encouraging interest among developers to write applications: ‘Once you have an architecture, you need an ecosystem. IBM understands developers.’
Different technology for different workloads?
Cray’s Bolding, however, believes: ‘We will see a diversification of technologies to solve different problems. When you’re delivering a system and the customer has distinctive needs, you have to mix and match. We have to have different technologies. You’re going to see architectures that are adapted to workloads.’
Lenovo’s Rosen has a similar view. He believes that the future of HPC will neither be a single architecture nor a single processor, but workload-optimised rather than general-purpose systems. If HPC does evolve in this direction, it differentiates it from the hyperscale business, in his view. He sees a role not only for ARM processors, but also FPGAs and GPUs in systems that have been optimised for specific workloads.
Although Rosen did not cite this example, a case in point might be the QPACE2 machine, built by Eurotech, that recently went live at the University of Regensburg in Germany, as reported on the Scientific Computing World website. Among other distinctive design features, this uses Xeon Phi co-processors rather than conventional CPUs to do all the computing for problems in quantum chromodynamics.
In the face of an uncertain future, Cray’s strategy is to leave its options open for as long as possible, by creating an underlying platform that is flexible enough to accommodate very different technologies and on which it can build different choices. According to Bolding, when it is appraising its technology choices, Cray has to look five to seven years into the future – far further ahead than a conventional integrator. ‘Because of our unique position at the high end with long delivery times, we get early access to the roadmaps from all the technology providers and we model many types of processors and other technologies so we can make choices. But we have to make decisions about how to have flexibility to leave decisions as late as possible.’ That’s why, he concluded, ‘we have to design a good platform.’
There is convergence in Cray’s strategy – it is bringing its current-generation XC and CS lines into a single platform, called Shasta. But the emphasis is on its adaptability to enable different configurations, and thus different options for price and performance, to meet the differing requirements of commercial, government and academic organisations. Shasta’s architecture is being designed with the flexibility to accommodate a variety of processors, networking, memory, cabinet packaging, power, and cooling systems.
Dr Tom Wilkie is the editor for Scientific Computing World.
You can contact him at tom.wilkie@europascience.com.