
PRACE Software Strategy for European Exascale Systems

 

1 Introduction


Building on the successful implementation of the Partnership for Advanced Computing in Europe (PRACE), the European Commission (EC) has increased its efforts to develop a world-class supercomputing ecosystem in Europe. The EC, EuroHPC Joint Undertaking (JU) and EU Member States have made significant investments in European petascale and pre-exascale infrastructure, have put exascale supercomputers on the roadmap, and are actively exploring new post-exascale architectures. The return on investment will be directly linked to the productivity of end-users in academia, in industry, and in the public sector. Key to this productivity is an ecosystem of user-oriented software: scientific applications and workflows that act as significant multipliers for the investment in hardware [1]. Investments in software should be a top priority of any HPC strategy. The extent to which these investments are needed and their real impact are often underestimated.
New scientific and societal challenges, such as COVID-19, the European Green Deal, or quantum computing, require that scientists and programmers design new applications from scratch. The convergence of HPC, Data Analytics (DA) and Machine Learning (ML) calls for the development and maintenance of workflows whose complexity far exceeds that of traditional simulation codes, and requires experts with the right skills. New and diverse hardware from various vendors, including new instruction set architectures as well as new accelerators and dedicated solutions, requires continuous re-engineering and modernisation of software. Increases in computational performance can no longer rely on Moore’s law; instead, software engineering and algorithmic progress will continue to bring orders-of-magnitude speed-ups for various applications.
This document describes PRACE’s position [2] on, and contributions to, applications and software for the European exascale systems.

2 State of Play


Europe is fortunate in that almost all the applications running on its HPC infrastructure are European or have strong European participation in their development. Funders and communities need to put in dedicated effort to maintain Europe’s excellence and sovereignty in HPC software.
On the one hand, to address a particular scientific challenge or to target new technology, dedicated, large-scale consortia (in the style of Future and Emerging Technologies (FET) projects) can design and develop applications from scratch to prototype. On the other hand, HPC applications often start as small scientific projects developed by PhD students and postdocs, and only later grow into large collaborative efforts. These developments are typically funded through research grants, which rarely focus on the software itself and are thus limited in their impact. Moreover, PRACE partners from HPC sites across Europe can provide access to experienced Research Software Engineers (RSEs) who offer software engineering consulting and effort, also in the longer term and with strategic input. Early on, Europe recognised the strength of the open-source approach to scientific software, which allowed for rapid evolution of codes, growth of the user base, and reproducibility of scientific results based on this software.
Funding models need to accommodate the development of applications that support the HPC research infrastructure, so that appropriate grants are made available every step of the way.

3 European HPC Ecosystem


European Centres of Excellence (CoEs) have become an important structuring instrument to help support, maintain, evolve, and develop these applications over the timescales (years to decades) that are typical for scientific applications. In addition, the newly installed national Competence Centres (nCCs) will contribute to the value chain by deploying the applications and by offering consulting services in specific domains. CoEs have the responsibility to ensure the availability, efficiency, and further evolution of selected applications on the EuroHPC JU infrastructure, as well as to support and grow the user base. CoEs and nCCs naturally approach the software with domain-specific needs in mind, but they also require expertise in hardware architecture and software engineering, and must engage in co-design to deliver efficient, sustainable and reusable software products for their users.
To support CoEs and nCCs, easy access to large-scale computing infrastructures and their expert staff, for scientific projects and code development, is essential. PRACE provides this key element of the ecosystem and acts as a co-ordinator between the CoEs and HPC centres in Europe.

4 PRACE Value


PRACE awards resources on European HPC systems to scientific and industrial projects and supports those projects in reaching their targets. PRACE supports software development with both a short-term and a longer-term perspective. Additionally, PRACE supports essential training, by teaching HPC software engineering skills, by training trainers, and by providing a large number of users with access to state-of-the-art infrastructure.
High-Level Support Teams (HLSTs) help awarded research groups port and optimise their codes in the early phase of access, to ensure cost-effective use of the provided resources. Support is available for all users, including industry (for example via the SHAPE calls), and is often assigned as part of the allocation process. In addition, as part of the PRACE 6th Implementation Phase (PRACE-6IP) project, teams composed of software engineers from various PRACE partners perform strategic developments to modernise codes and to ready applications for future architectural generations. These teams contribute technical know-how and software engineering expertise to deliver software that addresses the long-term needs of the communities. Their members have in-depth architectural knowledge and an overview of current developments in software engineering.
These strategic development projects are selected via a highly competitive call for proposals. They can provide essential but generic building blocks for domain science codes, offering separation of concerns (separating the domain science aspects of the models from the architectural aspects of the underlying implementation), which facilitates the evolution of HPC software at both the scientific and the software engineering level. They receive support from stakeholders such as CoEs, Tier-0 users, and the centres themselves. An example of such a development, part of the Forward-looking Software Solutions Work Package of the PRACE-6IP project, is the GHEX (Generic Halo-exchange for the Exascale) project, which develops a library for an important HPC primitive. GHEX is used in applications from various domains, from solar physics to weather and climate, and targets different (accelerated) hardware architectures. This work will impact major European projects, such as ‘Destination Earth’, and has already been used as a procurement benchmark for a pre-exascale EuroHPC system (LUMI).
In addition to software development, PRACE can act as a platform to provide feedback from the user communities and to promote European tools and applications in the vendors’ software stacks.
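To make the halo-exchange primitive and the separation of concerns more concrete, the following is a minimal serial sketch. It is not the GHEX API: the function names (split_with_halos, exchange_halos, smooth) and the one-dimensional periodic field are assumptions made purely for illustration, and a real library would move the data between processes or accelerators rather than between Python lists.

```python
# Illustrative sketch only: a serial stand-in for a halo exchange.
# All names are hypothetical and do not correspond to the GHEX API.

def split_with_halos(values, parts):
    """Split a 1D field into `parts` equal chunks, each padded with one halo cell per side."""
    size = len(values) // parts
    return [[0.0] + values[p * size:(p + 1) * size] + [0.0] for p in range(parts)]

def exchange_halos(chunks):
    """Infrastructure side: fill each halo cell from the periodic neighbour's edge cell."""
    n = len(chunks)
    for p, chunk in enumerate(chunks):
        left = chunks[(p - 1) % n]
        right = chunks[(p + 1) % n]
        chunk[0] = left[-2]    # last interior cell of the left neighbour
        chunk[-1] = right[1]   # first interior cell of the right neighbour

def smooth(chunk):
    """Domain-science side: three-point average over interior cells; halos are only read."""
    return [(chunk[i - 1] + chunk[i] + chunk[i + 1]) / 3.0
            for i in range(1, len(chunk) - 1)]

field = [float(i) for i in range(8)]
chunks = split_with_halos(field, 2)
exchange_halos(chunks)
result = [x for chunk in chunks for x in smooth(chunk)]
```

In this sketch the kernel (smooth) knows nothing about how neighbouring data arrives, so the exchange routine could be replaced by a distributed or accelerator-aware implementation without touching the domain code.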

5 Global Race


HPC software development efforts outside of Europe provide a useful reference frame to judge European efforts. Two of the most visible projects are the US Exascale Computing Project (ECP) and the Japanese Exascale Computing Programme. The ECP is centred around the various exascale machines that will be installed in the USA and funds developments at a rate of USD 300 million annually, excluding funding of the systems. One of the clear goals of the ECP is to deliver software on a fixed schedule: the seven-year project started in 2016 and is to deliver in 2023. Its mission is to support several exascale machines. The Japanese programme, on the other hand, is focussed on one flagship machine, Fugaku, and has invested approximately USD 50 million annually for five years in software, the majority in scientific applications. These two international players have thus recognised the importance of the application software stack, and accelerating Europe’s efforts in this direction is similarly essential in order to stay competitive.
Note that Europe has a bottom-up approach to software development, aiming to support a broad range of user communities and SMEs. It targets multiple hardware architectures, including possible new processors and accelerators such as those produced by the European Processor Initiative (EPI), as well as quantum computing. Software developers need to interact actively with hardware developers to make sure the efforts on both sides of the equation are aligned. Funding levels and project structures supporting these efforts need to grow and improve alongside.

6 PRACE Contribution


PRACE has the experience and the expertise to support and co-ordinate software development for exascale and beyond, based on an existing pan-European network of partners that, together, provide the range of skills needed for this undertaking. Indeed, as a pan-European platform for HPC research on top of infrastructure provided by the PRACE Hosting Members and EuroHPC JU, PRACE should not limit itself to conducting the pan-European Peer Review Process, but should continue to guarantee success of the awarded projects, by supporting application readiness both in the short and in the longer term.
The model in which software engineering experts deliver software in close collaboration with the domain scientists is successful and will need to be increasingly funded as HPC architectures and workflows become more diverse and complex. A Framework Partnership Agreement (FPA) would adequately provide funding to ensure continuity of this model and support the important long-term development of European tools and applications.
The CoEs and PRACE partners collaborate on complementary software development, with CoEs bringing the domain perspective and PRACE the infrastructure perspective. In addition, opportunities are needed to extend this expertise and to be more inclusive of the wider European HPC ecosystem.
Finally, PRACE provides a well-established and successful training programme across Europe. This includes the traditional programme of (online) courses, seasonal schools and workshops, as well as the training of trainers programme. Additionally, PRACE will continue to be an instrument for funding software development efforts in a bottom-up way: dozens to hundreds of projects will be funded to develop and modernise HPC software, and a similar number of people will be trained as a result. This could be modelled on similar initiatives that exist in a number of EU Member States, where academic Principal Investigators (PIs) lead the developments and partner with centre experts for a dedicated, long-term effort. Furthermore, PRACE emphasises the value of software through recognition in the form of prizes and certificates, which have value on CVs and in industry.

7 Summary


PRACE’s role in the global race for exascale software is five-fold:

  1. PRACE provides enabling support at the time resources are awarded; the process is timely and simple, leading to a more efficient use of the infrastructure.
  2. The PRACE partners have the in-house expertise to design, develop and refactor software in a forward-looking way, so that scientific applications can run at scale on modern architectures. PRACE software engineers have expert knowledge and typically have access to vendor information under non-disclosure agreements (NDAs), which lets them make strategic software design decisions as early as possible. They can contribute to, and integrate applications in, the European open-source software stack, providing an incentive to HPC vendors to adopt it.
  3. PRACE fosters pan-European collaboration in software development and contributes to raising the quality of support and the exchange of knowledge between all partners involved.
  4. PRACE provides the necessary co-ordination of efforts in Europe. For example, instead of individual CoEs, nCCs and HPC centres engaging on a bilateral basis, PRACE, FocusCoE and CASTIEL could lead a co-ordination effort.
  5. PRACE will extend its efforts to deliver a trained and skilled workforce with HPC software engineering skills.

8 References


[1] PRACE Scientific Case https://prace-ri.eu/about/scientific-case/
[2] PRACE Position Paper https://prace-ri.eu/about/position-papers/

9 Acknowledgement


PRACE acknowledges the contribution of the PRACE Software for Exascale Working Group members: Lee Margetts, Chair of the PRACE IAC (Manchester University); Alan Simpson, Leader PRACE-6IP WP7 (EPCC); Joost VandeVondele, Leader PRACE-6IP WP8 (ETH Zurich); Matej Praprotnik, former Chair of the PRACE SSC (Kemijski institut + UL FMF); Núria López, Member of the PRACE SSC (ICIQ); Laura Grigori, Chair of the PRACE SSC (INRIA); Jesus Labarta (BSC); Philippe Segers, Member of the PRACE BoD (GENCI); Marjolein Oorsprong, Communication Officer (PRACE aisbl); Serge Bogaerts, Managing Director (PRACE aisbl); Veronica Teodor, Project Manager PRACE-6IP (Forschungszentrum Jülich); Dirk Brömmel, Project Management Office PRACE-6IP (Forschungszentrum Jülich); Florian Berberich, Operations Director & Council Secretary (PRACE aisbl).


For further information contact: info@prace-ri.eu


Copyright of the content contained in this report remains with its original owners. The PRACE name and logo are © PRACE. The Partnership for Advanced Computing in Europe (PRACE) is an international non-profit association with its seat in Brussels. The PRACE Research Infrastructure provides a persistent world-class High-Performance Computing service for scientists and researchers from academia and industry in Europe. The computer systems and their operations accessible through PRACE are provided by 5 PRACE members (BSC representing Spain, CINECA representing Italy, ETH Zurich/CSCS representing Switzerland, GCS representing Germany and GENCI representing France). The Implementation Phase of PRACE receives funding from the EU’s Horizon 2020 Research and Innovation Programme (2014-2020) under grant agreement 823767. For more information, see https://prace-ri.eu/.