Presenting EPCC's novel architectures and compilers activities at SC24
14 November 2024
With around 15,000 attendees, Supercomputing is the HPC community's largest global conference. Running every November in the US, it is an opportunity to showcase the latest developments and share what we are all doing with the wider community. With so many people and organisations attending, there is something for everybody, including strong representation from the novel architecture and compiler communities targeting HPC.
My activities at SC24 will focus on two main topics in which we are heavily involved at EPCC: novel architectures for high performance computing (HPC), and compilers.
Novel architectures are interesting because they have the potential to deliver improved performance at reduced energy usage. However, programming such new hardware is challenging, so these architectures are intrinsically linked with the compilers that help address this.
Fortran compilers
With around 60% of codes running on the ARCHER2 UK national supercomputer written in Fortran, it is crucially important that we have high quality, open Fortran compilers. I will be presenting the paper "Fully integrating the Flang Fortran compiler with standard MLIR" at the LLVM workshop on the Monday morning. Firstly, the paper compares the performance of the open source Flang Fortran compiler against the Cray and GNU compilers, highlighting that Flang tends to fall short.
Secondly, and this is the main purpose of the paper, it describes the integration of Flang with the standard MLIR dialects and transformations. MLIR provides composable compiler infrastructure, and an interesting design decision by Flang was to integrate with this in the initial parts of the compiler. After this, however, Flang relies on a bespoke compilation pipeline that is separate from the rest of MLIR. We show that, by fully embracing MLIR, one can deliver improved performance compared to the current state of the art and, crucially, that the flexibility this provides unlocks new opportunities.
Integrating OpenMP with Zig
Also at the LLVM workshop, one of my PhD students, David Kacs, will present the paper "Pragma driven shared memory parallelism in Zig by supporting OpenMP loop directives", which describes the integration of OpenMP with the Zig programming language.
Before studying for a PhD with us, David was one of our MSc in HPC students and this work was started as part of his dissertation.
The premise is that the performance and memory safety offered by Zig mean it is potentially an interesting option for HPC workloads; however, there are specific aspects of the ecosystem that must be further developed for it to become a realistic proposition. Support for OpenMP, which provides threaded programming, is one of these. David will describe the work to integrate it, highlight interoperability with Fortran, and provide an evaluation of Zig against C and Fortran for HPC workloads.
RISC-V for HPC workshop
On the Monday afternoon I will chair the RISC-V for HPC workshop. This is the fifth time we have run this workshop and it will be a great opportunity for the RISC-V and HPC communities to come together and discuss the challenges and opportunities offered by RISC-V. Last year at SC23, we filled the room and so are looking forward to a busy session this year.
The workshop will kick off with a keynote talk from Dave Ditzel, who is not only the founder of Esperanto Technologies but also a legend in the industry, having founded Transmeta among several other companies. We also have research paper presentations, hardware vendor talks, and a talk from RISC-V International, so it will be a challenge keeping everyone on time and fitting everything in!
I will present a research paper talk on "Accelerating stencils on the Tenstorrent Grayskull RISC-V accelerator". This paper describes our work porting a stencil-based kernel to the Tenstorrent Grayskull PCIe accelerator.
Comprising over five hundred RISC-V cores, these accelerators were initially designed for AI workloads, but the raw compute also provides great potential for scientific computing, and Tenstorrent has opened up its tooling to allow direct programming of the hardware. Indeed, to the best of our knowledge, this is not only the first example of a scientific computing workload running on a Tenstorrent accelerator, but also the first on any sort of RISC-V PCIe accelerator more generally. By exploring the most appropriate code and algorithmic techniques, we are ultimately able to deliver slightly better performance on the Grayskull than on a 24-core Xeon Platinum CPU, with around five times less energy usage.
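For readers less familiar with the term, a stencil computation updates every point of a grid from a fixed pattern of neighbouring points. The Python sketch below is purely illustrative and is not the kernel from the paper: it assumes a simple four-point Jacobi averaging stencil on a small 2D grid, and the function names `jacobi_step` and `jacobi` are hypothetical.

```python
def jacobi_step(grid):
    """Return a new grid in which each interior point is the average
    of its four direct neighbours (an assumed, illustrative stencil)."""
    rows, cols = len(grid), len(grid[0])
    # Copy the grid so boundary values carry over unchanged and the
    # input is not modified while we are still reading from it.
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] +
                grid[i][j - 1] + grid[i][j + 1]
            )
    return new


def jacobi(grid, iterations):
    """Apply the stencil sweep a fixed number of times."""
    for _ in range(iterations):
        grid = jacobi_step(grid)
    return grid


if __name__ == "__main__":
    # 4x4 grid with a hot top edge and zeros elsewhere.
    start = [[4.0] * 4] + [[0.0] * 4 for _ in range(3)]
    for row in jacobi(start, 1):
        print(row)
```

Every update reads several neighbouring values, so in kernels of this shape the way data is laid out and moved typically matters as much as the arithmetic itself.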
RISC-V and HPC
On the Wednesday I will chair the panel RISC-V and HPC: How Can We Benefit from the Open Hardware Revolution? (1:30pm to 3pm). This session will bring together experts in the RISC-V community with the HPC audience at large and provide a platform to highlight the current state of RISC-V and to discuss some of the exciting new developments, as well as current blockers, that we face.
Democratizing AI Accelerators for HPC Applications
Later on Wednesday I will stand in for Joseph Lee, who left EPCC for pastures new earlier in the year, at the Democratizing AI Accelerators for HPC Applications: Challenges, Success, and Support BoF (5:15pm to 6:45pm). This event will explore how accelerators designed for AI can be used for more traditional scientific computing workloads, and I will give a talk summarising some of the activities at EPCC in this area.
MLIR compiler technology
On the Thursday I will chair the MLIR for HPC: An Opportunity to Revolutionize HPC Programming Tools BoF (12:15pm to 1:15pm). I am co-leading this with Johannes de Fine Licht from Next Silicon, and the purpose of the session is to dive deep into the MLIR compiler technology. We will have four excellent panellists who are heavily involved in this area, and we ultimately hope to make a strong case for why the HPC community should care about this technology.
Programming FPGAs and AIEs for HPC
On the Friday morning I will be giving an invited talk at the H2RC workshop (9:20am to 9:40am) entitled "Lowering the Barriers to Programming FPGAs and AIEs for HPC".
I will describe work that we are doing at EPCC in the xDSL project to enable seamless programming of FPGAs and AI engines (AIEs) for scientific computing workloads. The premise is that, in order to obtain the full benefit from such architectures, it must be possible to use them from existing, mainly Fortran, codes with few or no changes required from the end programmer. By leveraging the wide LLVM and MLIR ecosystems, we are able to make significant advances towards this goal, ultimately resulting in impressive performance and energy savings.
SC promises to be a busy week, but it's always great fun. If you want to chat to me about any of these topics then please grab me!