Since the DeiC HPC services started in November 2020, a consortium of universities (AU, DTU and SDU) has been working hard to finish the ambitious development of the DeiC Integration Portal. The vision of the DeiC Integration Portal is to provide a national solution to access all the DeiC HPC systems and future DeiC services under one common portal. After two years of development, UCloud has now been expanded with new functionality to integrate with the DeiC HPC providers.
Denmark currently has three national HPC services operated and hosted by different consortia of Danish universities and coordinated by the Danish e-Infrastructure Cooperation (DeiC). All researchers in Denmark can apply for resources on the national HPC services, including the Danish part of the European supercomputer LUMI, either through their universities’ Front Office or via national calls.
Along with the establishment of the national HPC services, it was also envisioned that researchers should be able to access the DeiC systems via a common national portal. This portal should ideally make it “as easy to use the national HPC centers as AWS-, Azure- and Google cloud service” (from the DeiC call in 2020). The DeiC Board decided to issue a call for expressions of interest for the development of the DeiC Integration Portal, which at the time was also referred to as Project 5.
In 2020, the consortium of AU, DTU and SDU, with SDU as the coordinating body, submitted a proposal to base this portal on UCloud. The DeiC Board accepted the proposal the same year.
“When we answered the DeiC call in 2020, we understood the potential behind the vision of the DeiC Board. At the time the UCloud software platform was maturing into a full-fledged solution for e-research, and it seemed an ideal starting point for the DeiC Project 5,” says Claudio Pica, professor at SDU and coordinator of the winning consortium.
Advantages for the users
For the users, there are many advantages of having a common portal to access the national HPC services. Professor Kristoffer Nielbo, from the Center for Humanities Computing at Aarhus University, explains:
“As a researcher (and an infrastructure provider), a common portal brings us closer to the seamless integration of multiple national HPC systems. Such access simplifies my workflows and saves valuable resources otherwise spent on mentally, and sometimes physically, ‘switching’ between platforms. It also makes transitioning from interactive to batch jobs less ‘scary.’ Finally, the portal reduces resources spent on onboarding new researchers in my lab because they only have to learn how to access HPC through the integration portal.”
Kristoffer Nielbo tested the portal during the project’s pilot phase in fall 2022, and he was very happy with the result.
“I was surprised at how well the portal reproduced the familiar user experience of DeiC Interactive HPC – where UCloud has been used for several years. Even though the mode of running jobs is fundamentally different (although DeiC Interactive HPC can run batch jobs), the project and file management, which are large parts of UCloud, were very similar. I wish more national HPC systems had been available during testing.”
A common portal also makes it easier for the DeiC Interactive HPC users to use and transition to other more “traditional” HPC systems, such as the LUMI supercomputer.
“Even in my lab, I can see that more researchers that used to use DeiC Interactive HPC are now planning to use DeiC Throughput HPC. Project 5 arrived at the right time for many DeiC Interactive HPC users – we have just started to ‘develop an appetite’ for HPC. That being said, I see the different national HPC systems as complementary, and Project 5 enables more users to benefit from more systems,” says Kristoffer Nielbo.
Implementation of the design
To better understand how the DeiC Integration Portal has been implemented in UCloud, it may be useful to look at how UCloud used to work. In the figure below, an end-user wants to run an application. Using their laptop, they open UCloud, find the application in the application store and click on the “Start” button. This causes their laptop to send a message to UCloud, containing the user’s command. UCloud then sends a similar message to the “DeiC Interactive HPC” computing resources.
In Project 5, the consortium developed a component called the UCloud Integration Module (or UCloud/IM) which sits at the service provider and which is controlled by the service provider. The UCloud/IM communicates with UCloud and exposes the computing resources of the provider. The service providers have full control over what the UCloud/IM can do.
“At a technical level, UCloud/IM is plugin-based software. This means that, as a provider, you can choose and adapt the IM to fit your environment. We have packed it full of features for controlling authentication and authorization. It has several different implementations for compute, storage, licenses and more,” says Dan Sebastian Thrane, team leader for cloud services at the SDU eScience Center.
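The plugin idea described above can be illustrated with a minimal Python sketch. It is not the UCloud/IM code: the registry, the plugin names (`slurm`, `local`, `posix`) and the configuration format are all hypothetical, chosen only to show how a provider might pick one implementation per kind of service.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping (kind, name) to a plugin class.
PLUGINS: Dict[Tuple[str, str], Callable[[], object]] = {}


def register(kind: str, name: str):
    """Class decorator that adds a plugin to the registry."""
    def decorator(cls):
        PLUGINS[(kind, name)] = cls
        return cls
    return decorator


@register("compute", "slurm")
class SlurmCompute:
    def describe(self) -> str:
        return "compute jobs are handed to a Slurm scheduler"


@register("compute", "local")
class LocalCompute:
    def describe(self) -> str:
        return "compute jobs run directly on this host"


@register("storage", "posix")
class PosixStorage:
    def describe(self) -> str:
        return "files live on a POSIX file system"


def load_plugins(config: Dict[str, str]) -> Dict[str, object]:
    """Instantiate the plugin the provider selected for each kind."""
    loaded = {}
    for kind, name in config.items():
        if (kind, name) not in PLUGINS:
            raise ValueError(f"no {kind} plugin named {name!r}")
        loaded[kind] = PLUGINS[(kind, name)]()
    return loaded


# Assumed provider configuration: one plugin per kind of service.
provider_config = {"compute": "slurm", "storage": "posix"}
module = load_plugins(provider_config)
for kind, plugin in module.items():
    print(kind, "->", plugin.describe())
```

In this sketch the provider, not the portal, decides which implementations are loaded, which mirrors the statement that providers choose and adapt the module to fit their environment.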
The UCloud/IM was designed to maintain a high level of IT security and the integrity of the individual service providers.
To use an analogy, without the UCloud/IM, sending a message via the DeiC Integration Portal would (from the service providers’ perspective) be like giving the postman the keys to your house to deliver the mail. Instead the UCloud/IM acts like a “mailbox”, where the postman can leave your mail without entering your house.
Design Principles
It has been important for the consortium behind the DeiC Integration Portal to have a transparent design and an inclusive development process. A DeiC Steering Group, which included representatives from all the universities in Denmark, was formed by the DeiC Board. This steering group has discussed the design of the portal throughout the development period and approved the final result.
It has also been important for the consortium and DeiC to stress that the DeiC Integration Portal does not replace or control any functionality which DeiC service providers have. It simply exposes these functionalities in a secure and user-friendly way to all users with a common interface, acting as a secure message brokering system.
“The DeiC Integration Portal initiative aims to facilitate access to remote compute resources through a joint portal with multiple backend HPC resources. These backend service providers are at the same time HPC service providers to their home universities and part of the emerging national HPC infrastructure. This mission duality implies that the resource providers, at all times, should be able to maintain full integrity and local control,” says Michael Rasmussen, section leader for Research-IT (RIT) at Technical University of Denmark.
Full integrity and local control have been achieved by following a set of design principles:
- Zero trust design
- Exclusively users local to the service providers
- Configurable integration module with no elevated privileges
- Local validation and authorization control for all actions following the local policies
“‘Never trust, always verify’ (zero trust) has been a guiding principle for the design of the process from initiating a job, submitting the job request to the service provider, queuing and executing the job, and finally reporting back to the portal. Users authenticate with home-institution credentials (via WAYF) on login to the Integration Portal and can from here apply for compute resources. Once the DeiC Front Office of a user’s home-institution approves an application for resources, the local resource provider can authorize access by having a local user account created and associated with the user’s DeiC Integration Portal account,” says Michael Rasmussen.
If the user does not comply with the code of conduct, the compute resource provider can disable the user’s connection via the integration module and lock the local user account to prevent re-logins until further notice. This means that only user accounts that are validated, created and authenticated locally can act on the local resource provider facility, thereby ensuring local integrity and control.
“If a DeiC Integration Portal user unknown to the local resource provider facility submits a job, the process of validating and creating the new user account is completely controlled by the resource provider. This ensures that only locally validated users act on the local facility,” says Michael Rasmussen.
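The local validation flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the portal's implementation: the `Provider` class, its fields and the account naming scheme are hypothetical. The point it shows is that every request is checked on the provider side, and that account creation and locking are decided only there.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Provider:
    """Hypothetical provider-side state; nothing here is set by the portal."""
    approved: Set[str] = field(default_factory=set)       # identities with an approved application
    local_accounts: Dict[str, str] = field(default_factory=dict)  # portal identity -> local account
    locked: Set[str] = field(default_factory=set)         # identities locked by the provider

    def create_account(self, portal_id: str) -> str:
        """Create a local account, but only for an approved identity."""
        if portal_id not in self.approved:
            raise PermissionError("no approved resource application")
        local = f"user{len(self.local_accounts) + 1:04d}"
        self.local_accounts[portal_id] = local
        return local

    def lock(self, portal_id: str) -> None:
        """Block an identity, e.g. after a code-of-conduct violation."""
        self.locked.add(portal_id)

    def submit(self, portal_id: str, job: str) -> str:
        """Validate the request locally before anything is queued."""
        if portal_id in self.locked:
            raise PermissionError("account locked by provider")
        local = self.local_accounts.get(portal_id)
        if local is None:
            # Unknown identity: the provider decides whether an account is created.
            local = self.create_account(portal_id)
        return f"{job} queued as {local}"


provider = Provider(approved={"alice"})
print(provider.submit("alice", "simulation"))
```

A request from an identity that is not in `approved`, or one that has been locked, raises `PermissionError` before any job is queued.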
Integration with DeiC Large Memory HPC
On 19 December 2022, the DeiC Large Memory HPC system was enabled on the DeiC Integration Portal as the first of several planned service providers.
DeiC Large Memory HPC is a traditional HPC system with large memory nodes (up to 4 TB per node) that uses Slurm as its workload manager. Historically, this kind of system has been used primarily by the natural sciences, such as physics and chemistry, for large-scale simulations of physical and biological systems via non-interactive batch jobs. As such, it is very different from the DeiC Interactive HPC platform.
Traditional HPC users from the natural sciences will also benefit from the new integration.
“The DeiC Integration Portal provides project management features previously lacking on the system. From the platform, the project PI (or a project administrator) can manage users in the project themselves. Previously they had to write to the user support whenever a user had to be added to the project. Similarly, users are now able to upload their SSH keys directly, instead of sending them via mail,” says Martin Lundquist Hansen, team leader for the infrastructure team at the SDU eScience Center.
Martin Lundquist Hansen furthermore explains that:
“The integration also allows users to manage their files and Slurm jobs directly from the UCloud platform. This is especially important for users less familiar with traditional text based HPC systems, but even for more experienced users this might be convenient in some cases. It is important to emphasize, however, that the DeiC Integration Portal simply provides an additional method for accessing the system, while traditional SSH access is still possible and unchanged.”
Like Kristoffer Nielbo, Martin Lundquist Hansen stresses that the DeiC Integration Portal may help users of the DeiC Interactive HPC system transition to other DeiC HPC systems:
“With the new integration, users can consume resources on the DeiC Large Memory HPC system in the same way they are already consuming resources on UCloud. There is of course a difference in the type of applications that conventionally are used on the two types of systems, but they can now be accessed and executed in a uniform way. As users learn to run jobs on the system via the UCloud platform, the transition to accessing the system via SSH might also become easier, due to familiarity with certain aspects of the system.”
The implementation of the DeiC Integration Portal also offers a new avenue for running interactive jobs on traditional HPC clusters, like the DeiC Large Memory HPC system, something that is not typically done on these types of systems. For example, a popular application is JupyterLab, a web-based application that allows you to work interactively with languages such as Python and R. Thanks to the DeiC Integration Portal, these applications can be launched as a Slurm job and the users can then work with the application directly from their browsers.
“We are planning to implement more applications of this type in the future, such that the resources are more readily available for non-expert users,” says Martin Lundquist Hansen.
Currently JupyterLab and RStudio are available for the DeiC Large Memory HPC.
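To give a sense of what launching JupyterLab as a Slurm job can look like, here is a minimal Python sketch that builds a batch script. It is an assumption about the general approach, not the script the portal actually generates: the function name, the default port and the resource values are hypothetical. The `#SBATCH` directives and the `jupyter lab` options used are standard ones.

```python
def jupyter_batch_script(job_name: str, hours: int, mem_gb: int, port: int = 8888) -> str:
    """Return the text of a Slurm batch script that starts JupyterLab."""
    if hours <= 0 or mem_gb <= 0:
        raise ValueError("hours and mem_gb must be positive")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
        "",
        "# Start JupyterLab on the allocated node without opening a browser there.",
        "# Listening on all interfaces is an assumption of this sketch; a real",
        "# deployment would restrict and proxy access to the session.",
        f"jupyter lab --no-browser --ip=0.0.0.0 --port={port}",
    ]
    return "\n".join(lines) + "\n"


# Example: a two-hour session with 512 GB of memory (values are illustrative).
print(jupyter_batch_script("jupyterlab", hours=2, mem_gb=512))
```

The resulting text could be passed to `sbatch`; how the browser is then connected to the running session is outside the scope of this sketch.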
Integration with DTU Sophia
The DTU Sophia HPC cluster, which is part of the DeiC Throughput HPC service, is also available on the DeiC Integration Portal.
The Sophia system is hosted at DTU Campus Risø. The HPC cluster consists of dual-processor AMD EPYC nodes fully connected through a 100G InfiniBand fat-tree topology. The full description can be found in the system documentation.
Currently, the main user groups on Sophia are from DTU Wind and DTU Construct. They typically run heavy-duty numerical simulations such as Computational Fluid Dynamics workloads, using software like Ellipsys, OpenFOAM, PETSc, and WRF. Other commonly used applications are AI/Machine Learning, Quantum Chemistry (Density Functional Theory), Monte Carlo and Molecular Dynamics codes. Commercial applications, like ABAQUS, COMSOL, Mathematica, and Matlab, are also widely used.
Integration with LUMI/Puhuri
The third planned integration is with the LUMI supercomputer. LUMI has its own project management portal called Puhuri, which is used to create projects on the LUMI supercomputer. The consortium has worked with the Puhuri development team to support this functionality from the DeiC Integration Portal. Due to the scope of the Puhuri portal, this integration will, however, be limited to project management and resource requests on LUMI. It is not yet possible to run jobs on LUMI directly from the DeiC Integration Portal.
What comes next?
Now that the DeiC Integration Portal has launched, more DeiC services can be added over time. The majority of DeiC HPC services are already part of the portal: DeiC Interactive HPC (where hardware is placed both at SDU and AAU), DTU Sophia (part of DeiC Throughput HPC), DeiC Large Memory HPC and LUMI. The remaining DeiC HPC services, which are part of DeiC Throughput HPC, will be added in the future.
The DeiC Integration Portal will also make it possible to integrate with the upcoming DeiC data management services. A possible integration with DeiC data management services could mean that researchers will be able to use their data across the whole portfolio of DeiC services, for example to analyse data at different DeiC HPC centers.
In collaboration with DeiC, we plan to improve the look and branding of the new DeiC Integration Portal.
Outside of Denmark, the functionality of the new DeiC Integration Portal has already caught the attention of research institutions. These include, for example, the HALRIC consortium, which recently received 11 million euros to build collaborations between companies, hospitals and universities (see the press release from Lund University). Within Denmark, there has been a dialogue with the Danish Bioimaging Infrastructure (DBI-INFRA) Image Analysis Core Facility, which is also interested in the possibilities offered by the platform.
No doubt, the attention the DeiC Integration Portal has received both nationally and on a European level is an acknowledgement of the skills and competences of the consortium’s developers and the original vision of the DeiC Board from 2020. Surely, this is only the beginning of many future collaborations, which will benefit the research environment in Denmark.