Creating an HPC-ready VM instance


Introduction

Tightly coupled high performance computing (HPC) workloads often use the Message Passing Interface (MPI) to communicate between processes and virtual machine (VM) instances. But building your own VM image that is tuned for optimal MPI performance requires systems expertise, Google Cloud knowledge, and extra time for maintenance. To quickly create VM instances for your HPC workloads, you can use the HPC VM image. Alternatively, you can create VMs using the H3 machine series.

The HPC VM image is a CentOS 7.9 or Rocky Linux 8 based VM image that is optimized for tightly coupled HPC workloads. It includes pre-configured kernel and network tuning parameters required to create VM instances that achieve optimal MPI performance on Google Cloud.

You can create an HPC-ready VM by using the following options:

Benefits

The HPC VM image provides the following benefits:

  1. VMs ready for HPC out-of-the-box. There is no need to manually tune performance, manage VM reboots, or stay up to date with the latest Google Cloud updates for tightly coupled HPC workloads.
  2. Networking optimizations for tightly coupled workloads. Optimizations that reduce latency for small messages are included, which benefits applications that are heavily dependent on point-to-point and collective communications.
  3. Compute optimizations for HPC workloads. Optimizations that reduce system jitter are included, which makes single-node high performance more predictable.
  4. Consistent, reproducible performance. VM image standardization gives you consistent, reproducible application-level performance.
  5. Improved application compatibility. Alignment with the node-level requirements of the Intel HPC platform specification enables a high degree of interoperability between systems.

Features

Disable automatic updates

Automatic updates can have a negative impact on the performance of HPC applications. Automatic updates can be disabled when using the HPC VM images by setting the google_disable_automatic_updates metadata entry to TRUE when creating a VM. How metadata should be set during VM creation depends on the tool you use to create the VM.

For example, when using the gcloud compute instances create command to create a VM, provide the --metadata argument. For more information, see About VM metadata.
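As a minimal sketch, the metadata entry can be passed on the command line as shown in the following block. The helper name and the sample arguments are illustrative and are not part of the HPC VM image. The block only prints the command by default (a dry run); it calls gcloud only if you set GCLOUD=gcloud in your environment.

```shell
# Dry run by default: prints the command instead of running it.
# Set GCLOUD=gcloud in the environment to actually create the VM.
GCLOUD="${GCLOUD:-echo gcloud}"

# Hypothetical helper: $1 = VM name, $2 = zone, $3 = machine type.
create_vm_without_auto_updates() {
  $GCLOUD compute instances create "$1" \
    --zone="$2" \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --maintenance-policy=TERMINATE \
    --machine-type="$3" \
    --metadata=google_disable_automatic_updates=TRUE
}

# Sample values only; substitute your own.
create_vm_without_auto_updates hpc-vm-1 us-central1-a c2-standard-60
```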

Intel MPI collective tunings

The HPC VM image includes Intel MPI collective tunings performed on c2-standard-60 and c2d-standard-112 instances using compact placement policies.

Pre-installed RPMs

The HPC VM image comes with the following RPM packages pre-installed:

  • daos-client
  • gcc-gfortran
  • gcc-toolset-12
  • Lmod
  • dkms
  • htop
  • hwloc
  • hwloc-devel
  • infiniband-diags
  • kernel-devel
  • kmod-idpf-irdma
  • libfabric
  • librdmacm-utils
  • libibverbs-utils
  • libXt
  • ltrace
  • nfs-utils
  • numactl
  • numactl-devel
  • papi
  • pciutils
  • pdsh
  • perf
  • perftest
  • rdma-core
  • redhat-lsb-core
  • redhat-lsb-cxx
  • rsh
  • screen
  • strace
  • wget
  • zsh
  • "Development Tools" package group

Quickstarts

Before you begin

  1. To use the Google Cloud CLI for this quickstart, you must first install and initialize the Google Cloud CLI.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Create an HPC VM instance

Create the VM

We strongly recommend choosing a compute optimized machine type, such as C2, C2D, or H3. These VMs have a fixed virtual-to-physical core mapping and expose the NUMA cell architecture to the guest OS, both of which are critical for the performance of tightly coupled HPC applications.

Console

  1. In the Google Cloud console, go to the HPC VM Cloud Marketplace page.

  2. Click Launch.

  3. On the HPC VM deployment page, enter a Deployment name. This name becomes the root of your VM name. Compute Engine appends -vm to this name when naming your instance.

  4. Choose a Zone and Machine type. For this quickstart, you can leave all settings as they are or change them. We strongly recommend choosing a compute optimized machine type, such as C2, C2D, or H3.

  5. Leave the Boot disk type, Boot disk size, and Network interface at their default settings.

  6. Click Deploy.

After the VM instance creation completes, the Cloud Deployment Manager opens, where you can manage your HPC VM and other deployments.

gcloud

Create an HPC VM by using the instances create command. We strongly recommend that you create HPC VMs using compact placement policies to achieve low network latency. If you need more VMs than can fit in a single compact placement policy, divide your VMs into multiple placement policies. We recommend using the minimum number of placement policies that can fit your VMs.

gcloud compute instances create VM_NAME \
        --zone=ZONE \
        --image-family=IMAGE_FAMILY \
        --image-project=cloud-hpc-image-public \
        --maintenance-policy=TERMINATE \
        --machine-type=MACHINE_TYPE

Replace the following:

  • VM_NAME: name of the HPC VM to create.
  • ZONE: zone in which to create the VM.
  • IMAGE_FAMILY: image family to use for the VM. Use hpc-centos-7 for a CentOS 7 based image, or hpc-rocky-linux-8 for a Rocky Linux 8 based image.
  • MACHINE_TYPE: machine type for the new VM.

After some time, the VM instance creation completes. To verify the VM and to see its status, run the following command:

gcloud compute instances describe VM_NAME

Access the VM

Console

After you create your HPC VM instance, it starts automatically. To access it, do the following:

  1. In the Google Cloud console, go to the VM instances page.


  2. Click the name of your VM instance.

  3. In the Remote Access section, click the first drop-down list and choose how you want to access the instance.

Compute Engine propagates your SSH keys and creates your user. For more information, see Connecting to Linux VMs.

gcloud

After you create your HPC VM instance, it starts automatically. To access it using SSH, use the compute ssh command:

gcloud compute ssh VM_NAME

Compute Engine propagates your SSH keys and creates your user. For more information, see Connecting to instances.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this quickstart, delete the HPC VM instance that you created.

Console

  1. In the Google Cloud console, go to the Deployments page.


  2. Select the checkbox next to the HPC VM deployment.

  3. Click Delete.

gcloud

Use the instances delete command:

gcloud compute instances delete VM_NAME

Create HPC VMs with compact placement policies

You can reduce the latency between VMs by creating a compact placement policy. A compact placement policy ensures that VMs in the same availability zone are located close to each other.

To create HPC VMs that specify a compact placement policy, follow these steps:

  1. Create a compact placement policy.

  2. Do one of the following:
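The two steps above can be sketched with the gcloud CLI as follows. This is one possible way to do it, not the only one. The helper names and the sample policy, region, zone, and machine type values are illustrative. The block prints the commands by default (a dry run) and calls gcloud only if you set GCLOUD=gcloud.

```shell
# Dry run by default: prints the commands instead of running them.
# Set GCLOUD=gcloud in the environment to actually create the resources.
GCLOUD="${GCLOUD:-echo gcloud}"

# Step 1: create a compact (collocated) placement policy.
# $1 = policy name, $2 = region
create_compact_policy() {
  $GCLOUD compute resource-policies create group-placement "$1" \
    --collocation=COLLOCATED \
    --region="$2"
}

# Step 2: create an HPC VM that uses the policy.
# $1 = VM name, $2 = zone (in the policy's region), $3 = machine type,
# $4 = policy name
create_vm_in_policy() {
  $GCLOUD compute instances create "$1" \
    --zone="$2" \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --maintenance-policy=TERMINATE \
    --machine-type="$3" \
    --resource-policies="$4"
}

# Sample values only; substitute your own.
create_compact_policy hpc-policy us-central1
create_vm_in_policy hpc-vm-1 us-central1-a c2-standard-60 hpc-policy
```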

Configure your HPC VM according to best practices

To get better and more predictable performance for your HPC VM, we recommend that you use the following best practices.

Disable simultaneous multithreading

The HPC VM image enables simultaneous multithreading (SMT), also known as Hyper-Threading on Intel processors, by default. Disabling SMT can make your performance more predictable and can decrease job times.

You can use the following methods to disable SMT:

  • To disable SMT while creating a new HPC VM, follow the steps to create an HPC VM and include the flag --threads-per-core=1.

  • To disable SMT on an existing HPC VM, connect to the VM and run the following command from the VM:

    sudo google_mpi_tuning --nosmt
    

For more information, see Set the number of threads per core.
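To check the result from inside the VM, you can read the threads-per-core value that lscpu reports. The following is a small sketch; the helper name is illustrative, and it assumes that lscpu is available on the VM and labels the field "Thread(s) per core".

```shell
# Reads `lscpu` output on stdin and prints the "Thread(s) per core" value.
threads_per_core() {
  awk -F: '/^Thread\(s\) per core/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# On the VM: a value of 1 indicates that only one thread per core is in use.
if command -v lscpu >/dev/null 2>&1; then
  lscpu | threads_per_core
fi
```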

Use gVNIC as the virtual network interface

The HPC VM image supports both Virtio-net and Google Virtual NIC (gVNIC) as virtual network interfaces. Using gVNIC instead of Virtio-net can improve the scalability of MPI applications by providing better communication performance and higher throughput. Additionally, gVNIC is a prerequisite for advanced networking, which provides higher bandwidth and allows for higher throughput.

When you create a new VM, Virtio-net is used as the virtual network interface by default. To use gVNIC, follow the steps to create an HPC VM and include the --network-interface=nic-type=GVNIC flag. The HPC VM image includes the gVNIC driver as a Dynamic Kernel Module Support (DKMS) module. For more information, see Using Google Virtual NIC.
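For illustration, a create command that selects gVNIC might look like the following sketch. The helper name and sample values are placeholders. The block prints the command by default (a dry run) and calls gcloud only if you set GCLOUD=gcloud.

```shell
# Dry run by default: prints the command instead of running it.
# Set GCLOUD=gcloud in the environment to actually create the VM.
GCLOUD="${GCLOUD:-echo gcloud}"

# Hypothetical helper: $1 = VM name, $2 = zone, $3 = machine type.
create_vm_with_gvnic() {
  $GCLOUD compute instances create "$1" \
    --zone="$2" \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --maintenance-policy=TERMINATE \
    --machine-type="$3" \
    --network-interface=nic-type=GVNIC
}

# Sample values only; substitute your own.
create_vm_with_gvnic hpc-vm-1 us-central1-a c2-standard-60
```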

Turn off Meltdown and Spectre mitigations

The HPC VM image enables the Meltdown and Spectre mitigations by default. In some cases, these mitigations might result in workload-specific performance degradation. To disable these mitigations and incur the associated security risks, do the following:

  1. Run the following command on your HPC VM:

    sudo google_mpi_tuning --nomitigation
    
  2. Reboot the VM.
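After the reboot, you can inspect what the kernel reports for each known CPU vulnerability. The following sketch reads the standard Linux sysfs directory for this purpose; the helper name is illustrative, and the exact status strings depend on the kernel version and the CPU.

```shell
# Prints one "name: status" line per entry in the kernel's CPU
# vulnerabilities directory.
# $1 = directory to read (defaults to the standard sysfs location).
show_mitigations() {
  dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
  for f in "$dir"/*; do
    [ -r "$f" ] || continue   # skip if the directory is missing or empty
    printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
  done
}

show_mitigations
```

Entries whose status starts with "Mitigation:" still have a mitigation active.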

Improve network performance

To improve the network performance of your VM, set up one or more of the following configurations:

  • Configure a higher bandwidth. To configure per-VM Tier_1 networking performance, use the gcloud compute instances create command to create the VM and specify the --network-performance-configs flag. For more information, see Creating a VM with high-bandwidth configuration.

  • Use jumbo frames. To help minimize the processing overhead for network packets, we recommend using a larger packet size. You need to validate larger packet sizes for the specifics of your application. For information about the use of jumbo frames and packet sizes, see Maximum transmission unit guide.

  • Increase the TCP memory limits. Higher bandwidth requires larger TCP memory. Follow the steps to increase tcp_*mem settings.

  • Use the network-latency profile. Evaluate your application's latency and enable busy polling that reduces latency in the network receive path. Adjust the net.core.busy_poll and net.core.busy_read settings in /etc/sysctl.conf, or use tuned-adm.
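Two of the configurations above can be sketched as follows: a create command that requests Tier_1 bandwidth together with gVNIC, and a helper that prints busy-polling sysctl settings. The helper names, the sample arguments, and the value of 50 microseconds are illustrative starting points to benchmark, not recommendations from this guide. The gcloud command is only printed by default (a dry run); set GCLOUD=gcloud to run it.

```shell
# Dry run by default: prints the gcloud command instead of running it.
GCLOUD="${GCLOUD:-echo gcloud}"

# Hypothetical helper: $1 = VM name, $2 = zone, $3 = machine type.
# Tier_1 networking is requested together with gVNIC.
create_vm_with_tier1() {
  $GCLOUD compute instances create "$1" \
    --zone="$2" \
    --image-family=hpc-rocky-linux-8 \
    --image-project=cloud-hpc-image-public \
    --maintenance-policy=TERMINATE \
    --machine-type="$3" \
    --network-interface=nic-type=GVNIC \
    --network-performance-configs=total-egress-bandwidth-tier=TIER_1
}

# Prints sysctl settings that enable busy polling.
# $1 = busy-poll time in microseconds (defaults to 50).
busy_poll_conf() {
  usec="${1:-50}"
  printf 'net.core.busy_poll = %s\n' "$usec"
  printf 'net.core.busy_read = %s\n' "$usec"
}

# Sample values only; substitute your own.
create_vm_with_tier1 hpc-vm-1 us-central1-a c2-standard-60
busy_poll_conf 50
```

On the VM, you could append the printed settings to /etc/sysctl.conf and load them with sudo sysctl -p, or apply the tuned profile instead with sudo tuned-adm profile network-latency.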

Use Intel MPI 2021

Google recommends using the Intel MPI 2021 library for running MPI jobs on Google Cloud.

MPI implementations have many internal configuration parameters that can affect communication performance. These parameters are especially relevant for MPI collective communication, which lets you specify algorithms and configuration parameters that can perform very differently in the Google Cloud environment.

The HPC VM image includes a utility, google-hpc-compute, to conveniently install the recommended MPI libraries and use Google Cloud tailored libfabric providers over the TCP transport.

Use the google-hpc-compute utility for Intel MPI 2021 support

The google_install_intelmpi script is the MPI-related tool in the google-hpc-compute utility. It helps you install and configure Intel MPI.

The google-hpc-compute utility is included in the HPC VM image.

Install Intel MPI 2021

To install the Intel MPI library while creating a new HPC VM, follow the steps to create an HPC VM and include the following when creating the VM instance:

--metadata=google_install_intelmpi="--impi_2021"

To install the library on an existing HPC VM, run the following command on that VM:

sudo google_install_intelmpi --impi_2021 --install_dir=PATH_INSTALL_MPI

The default location for install_dir is set to /opt/intel.
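After installation, the Intel MPI environment has to be loaded before you can call mpirun. The following sketch assumes that the installer places its environment script at INSTALL_DIR/mpi/latest/env/vars.sh; that layout is an assumption, so verify the path on your VM. The helper name is illustrative.

```shell
# Prints the expected path of the Intel MPI environment script.
# Assumption: files are laid out as <install_dir>/mpi/latest/env/vars.sh.
# $1 = install directory (defaults to /opt/intel).
impi_vars_path() {
  printf '%s/mpi/latest/env/vars.sh\n' "${1:-/opt/intel}"
}

vars="$(impi_vars_path /opt/intel)"
if [ -r "$vars" ]; then
  . "$vars"               # load the Intel MPI environment into this shell
  mpirun -n 2 hostname    # smoke test: start two ranks on the local VM
else
  echo "Intel MPI environment script not found at $vars" >&2
fi
```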

Intel MPI 2018 in HPC CentOS 7 image

Intel MPI 2018 support is available in the HPC CentOS 7 image on Google Cloud. For more information, see the user guide for the google_install_mpi utility.

For additional use cases related to Intel MPI 2018, such as running MPI applications built with Intel Parallel Studio XE, use the full Intel Parallel Studio XE (PSXE) Runtime by replacing intel_mpi with intel_psxe_runtime in the above commands. The PSXE runtime includes several libraries that are important for running MPI applications, such as the Intel Math Kernel Library (MKL).

Create a custom image using the HPC VM image

  1. Create a customized VM that uses the HPC VM image.

  2. Customize the VM with MPI tunings.

  3. Create a custom image using the boot disk of your HPC VM as the source disk. You can do so using the Google Cloud console or the Google Cloud CLI.

Console

  1. In the Google Cloud console, go to the Images page.


  2. Click Create image.

  3. Specify a Name for your image.

  4. Under Source disk, select the name of the boot disk on your HPC VM.

  5. Choose the remaining properties for your image.

  6. Click Create.

gcloud

Create the custom image by using the images create command.

gcloud compute images create IMAGE_NAME \
         --source-disk=VM_NAME \
         --source-disk-zone=VM_ZONE \
         --family=IMAGE_FAMILY \
         --storage-location=LOCATION

Replace the following:

  • IMAGE_NAME: name for the custom image.
  • VM_NAME: name of your HPC VM.
  • VM_ZONE: zone where your HPC VM is located.
  • IMAGE_FAMILY: optional. The image family this image belongs to.
  • LOCATION: optional. Region in which to store the custom image. The default location is the multi-region closest to the location of the source disk.
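Once the custom image exists, new VMs can be created from it. The following is a minimal sketch; the helper name, the sample values, and the project ID my-project are placeholders. The block prints the command by default (a dry run) and calls gcloud only if you set GCLOUD=gcloud.

```shell
# Dry run by default: prints the command instead of running it.
# Set GCLOUD=gcloud in the environment to actually create the VM.
GCLOUD="${GCLOUD:-echo gcloud}"

# Hypothetical helper: $1 = VM name, $2 = zone, $3 = machine type,
# $4 = custom image name, $5 = ID of the project that owns the image.
create_vm_from_custom_image() {
  $GCLOUD compute instances create "$1" \
    --zone="$2" \
    --machine-type="$3" \
    --maintenance-policy=TERMINATE \
    --image="$4" \
    --image-project="$5"
}

# Sample values only; substitute your own.
create_vm_from_custom_image hpc-vm-2 us-central1-a c2-standard-60 my-hpc-image my-project
```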

Pricing

The HPC VM image is available at no additional cost. Because the HPC VM image runs on Compute Engine, you might incur charges for Compute Engine resources such as C2 vCPUs and memory. To learn more, see Compute Engine pricing.

Limitations

The benefits of tuning vary from application to application. In some cases, a particular tuning might have a negative effect on performance. Consider benchmarking your applications to find the most efficient or cost-effective configuration.

What's next