Continuing to share some templates for getting DFT software running in Modal and taking advantage of their GPU hardware. This time we'll be looking at setting up a Modal Image to run VASP. I'll be working with VASP v6.3.0, but I expect this approach to work for any version v6.x.x. In the last post, we looked at setting up ABACUS. Check that post if you want to run DFT in a serverless environment but don't have a VASP license:
A simple guide for compiling ABACUS to run with GPU acceleration in Modal. The post explains how to build ABACUS with CUDA support and run DFT calculations in a serverless environment. It covers why Modal’s on‑demand GPUs (like A100) can help, and which ABACUS setup (plane waves with basis_type pw and ks_solver bpcg) tends to work best on GPUs in version 3.9.0.
The process is fairly similar to last time: we choose a custom Docker image, install any required packages, set up some compiler links, then compile the software from source.
I mostly followed VASP's OpenACC build template for the NVIDIA HPC SDK. I had never heard of OpenACC before, but this is what it is:
OpenACC is an open standard for parallel programming that uses compiler directives in C, C++, and Fortran to offload code to accelerators like GPUs. It allows developers to add directives to existing code to manage parallel execution and data movement, making it easier to accelerate applications for heterogeneous systems with both CPUs and GPUs.
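To make that concrete, here's a tiny self-contained Fortran example (not from VASP, just an illustration): a single directive marks a loop for offload, and the same source still builds as plain CPU code when the compiler ignores the directive.

```fortran
! Minimal OpenACC illustration: a SAXPY loop.
! With nvfortran, compile with -acc to offload to the GPU,
! or without it to run the identical code on the CPU.
program saxpy_acc
  implicit none
  integer, parameter :: n = 1000000
  real :: x(n), y(n), a
  integer :: i

  a = 2.0
  x = 1.0
  y = 0.0

  ! The directive asks the compiler to parallelize this loop on the
  ! accelerator and to manage data movement for x and y.
  !$acc parallel loop copyin(x) copy(y)
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do
  !$acc end parallel loop

  print *, 'y(1) =', y(1)
end program saxpy_acc
```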
Kinda cool. You build once and automatically get support for both CPU and GPU acceleration. It reminds me of hybrid web/mobile apps built with a framework like React Native, and I wonder how much performance you leave on the table with this approach. Anyways, I'll save my personal opinions for another post and get back to the instructions.
We're going to make the environment setup easy on ourselves by using an NVIDIA HPC SDK container, offered directly by NVIDIA. This is the container I chose:
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nvhpc?version=24.11-devel-cuda12.6-ubuntu22.04
It's got all sorts of CUDA tooling installed, and it still uses a familiar distro, Ubuntu. From the page, key features of the NVIDIA HPC SDK for Linux include:
Support for NVIDIA Blackwell architecture GPUs
Support for NVIDIA Ampere Architecture GPUs with FP16, TF32 and FP64 tensor cores and MIG
Support for x86-64 and Arm Server multicore CPUs
NVC++ ISO C++17 compiler with Parallel Algorithms acceleration on GPUs, OpenACC and OpenMP
NVFORTRAN ISO Fortran 2003 compiler with array intrinsics acceleration on GPUs, CUDA Fortran, OpenACC and OpenMP
NVC ISO C11 compiler with OpenACC and OpenMP
NVCC NVIDIA CUDA C++ compiler
cuBLAS GPU-accelerated basic linear algebra subroutine (BLAS) library
And a lot more; check out the container page for details. The key points for us are that it has everything we need for OpenACC and that it supports Arm hardware, which I believe is what Modal's CPUs run on.
Before we can pull the image we need on Modal, we have to set up an NVIDIA Cloud account. This is just a requirement NVIDIA imposes for pulling from their registry. It doesn't cost anything, but it is kind of annoying.
See their guide on how to get set up. I followed that guide, and you should too, but I'll paraphrase the steps here. Specifically, we want the instructions for pulling a container with the Docker CLI, since Modal follows the same flow.
Create an API key. It should look like nvapi-***.
Create a Secret in Modal with the following setup:
The keys must be named REGISTRY_USERNAME and REGISTRY_PASSWORD. Set REGISTRY_USERNAME to the literal value $oauthtoken, and set REGISTRY_PASSWORD to the API key (nvapi-***) you generated from your NVIDIA Cloud account. Pay close attention to these names and values; they are not arbitrary.
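If you prefer the CLI to the web dashboard, you can create the secret from your terminal. The secret name nvidia-ngc is just my choice here; any name works as long as you reference it consistently later:

```bash
# Quote $oauthtoken so your shell doesn't expand it as a variable
modal secret create nvidia-ngc \
  REGISTRY_USERNAME='$oauthtoken' \
  REGISTRY_PASSWORD=nvapi-your-key-here
```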
We are ready to move on to the Modal Image setup now.
We build the image like so:
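Here's a sketch of what that script can look like. The secret name nvidia-ngc, the paths under /opt, and the function body are illustrative choices, and the exact Image methods vary a bit across Modal client versions:

```python
# vasp_modal.py -- a sketch, not a verbatim script; adjust names and paths.
import modal

vasp_image = (
    # Pull the NVIDIA HPC SDK container, authenticating with the
    # REGISTRY_USERNAME / REGISTRY_PASSWORD secret created earlier.
    modal.Image.from_registry(
        "nvcr.io/nvidia/nvhpc:24.11-devel-cuda12.6-ubuntu22.04",
        secret=modal.Secret.from_name("nvidia-ngc"),
    )
    # Bake the VASP source, makefile, and build script into the image.
    # (Older Modal clients used copy_local_dir/copy_local_file instead.)
    .add_local_dir("vasp.6.3.0", "/opt/vasp.6.3.0", copy=True)
    .add_local_file("makefile.include.gpu", "/opt/makefile.include.gpu", copy=True)
    .add_local_file("build_vasp.sh", "/opt/build_vasp.sh", copy=True)
    # Compile at image-build time so function cold starts get a ready binary.
    .run_commands("bash /opt/build_vasp.sh")
)

app = modal.App("vasp-gpu")

@app.function(image=vasp_image, gpu="A100", timeout=60 * 60)
def run_vasp(workdir: str = "/root/calc") -> str:
    """Run vasp_std in a directory that already holds INCAR, POSCAR, etc."""
    import subprocess

    result = subprocess.run(
        # --allow-run-as-root because Modal containers run as root.
        ["mpirun", "--allow-run-as-root", "-np", "1",
         "/opt/vasp.6.3.0/bin/vasp_std"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The first `modal run vasp_modal.py` triggers the image build, which is when the (long) compile happens; subsequent runs reuse the cached image.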
Locally in the same folder as this Python file, you'll need your VASP source (mine is in a folder called vasp.6.3.0), the Makefile (mine is called makefile.include.gpu), and a build script to pull everything together (called build_vasp.sh here).
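So the working directory looks something like this (the script name vasp_modal.py is arbitrary):

```
.
├── vasp_modal.py          # the Modal app script
├── vasp.6.3.0/            # VASP source from your license portal
├── makefile.include.gpu
└── build_vasp.sh
```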
You'd download the VASP source from their website, but the rest of it I can share here:
makefile.include.gpu
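Your exact file will depend on your VASP version and SDK paths, but a sketch in the spirit of the makefile.include.nvhpc_acc template that ships with the VASP 6.x source looks something like this (cc80 targets A100-class GPUs, and cuda12.6 matches the container above; both are assumptions to adjust):

```make
# Sketch based on VASP's bundled makefile.include.nvhpc_acc template.
# The -gpu compute capability, CUDA version, and library paths are
# assumptions -- match them to your hardware and SDK install.

CPP_OPTIONS = -DHOST=\"LinuxNV\" \
              -DMPI -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Duse_bse_te \
              -Dtbdyn \
              -Dqd_emulate \
              -Dfock_dblbuf \
              -D_OPENACC \
              -DUSENCCL -DUSENCCLP2P

CPP         = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)

FC          = mpif90 -acc -gpu=cc80,cuda12.6
FCL         = mpif90 -acc -gpu=cc80,cuda12.6 -c++libs

FREE        = -Mfree
FFLAGS      = -Mbackslash -Mlarge_arrays
OFLAG       = -fast
DEBUG       = -Mfree -O0 -traceback

OBJECTS     = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o

# Quad-precision emulation via the qd library bundled with the HPC SDK
QD          ?= /opt/nvidia/hpc_sdk/Linux_x86_64/24.11/compilers/extras/qd
LLIBS       = -L$(QD)/lib -lqdmod -lqd
INCS        = -I$(QD)/include/qd

# BLAS/LAPACK/scaLAPACK plus GPU math libraries from the HPC SDK
LLIBS      += -Mscalapack -llapack -lblas
LLIBS      += -cudalib=cublas,cusolver,cufft,nccl -cuda

# FFTW (not bundled with the SDK; install or build it separately)
FFTW       ?= /usr/local
LLIBS      += -L$(FFTW)/lib -lfftw3
INCS       += -I$(FFTW)/include
```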
Because of all the complexity going on here, I'm well aware that there could be further optimizations to these flags and configuration that would make a significant performance difference. If you find anything, please share it here!
build_vasp.sh
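Again a sketch rather than a verbatim file; it assumes the /opt paths used in the image definition above:

```bash
#!/usr/bin/env bash
# build_vasp.sh -- sketch; assumes the /opt layout from the image definition.
set -euo pipefail

cd /opt/vasp.6.3.0
cp /opt/makefile.include.gpu makefile.include

# DEPS=1 lets VASP 6.x build in parallel; drop ncl/gam if you only need std.
make DEPS=1 -j"$(nproc)" std
make DEPS=1 -j"$(nproc)" ncl
make DEPS=1 -j"$(nproc)" gam

# Binaries land in /opt/vasp.6.3.0/bin/{vasp_std,vasp_ncl,vasp_gam}
```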
Note that this script builds three different binaries: vasp_std, vasp_ncl, and vasp_gam. For many use cases you likely only need vasp_std; if so, I'd recommend removing the make commands for the others, since they make the compile quite long. Expect to wait ~30 minutes to an hour for compilation regardless.
And that's all there is. We have our main Python Modal script pulling in the VASP source, our Makefile, and finally a build script to bring it all together.
Good luck with it! This enables some really cool use cases for building first-principles simulation apps that you can share with others, letting them run calculations on their own structures/data. If you make something cool, share it here as an API! Modal already makes that very easy with their FastAPI integration. Happy building.
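For example, here's a hypothetical sketch of exposing the function above as a web endpoint (on recent Modal clients the decorator is @modal.fastapi_endpoint; older versions call it @modal.web_endpoint). The relax function and its body are placeholders:

```python
# Hypothetical endpoint, extending the vasp_modal.py sketch from earlier.
# Note: web endpoints need FastAPI in the image, e.g. via
# vasp_image.pip_install("fastapi[standard]").
import modal

@app.function(image=vasp_image, gpu="A100", timeout=60 * 60)
@modal.fastapi_endpoint(method="POST")
def relax(structure: dict) -> dict:
    # Write the incoming structure to a POSCAR, run vasp_std as in
    # run_vasp above, then parse what you need out of OUTCAR/CONTCAR.
    return {"status": "ok"}
```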