IBM Power System AC922 Introduction and Technical Overview: Paper

Front cover
Draft Document for Review March 5, 2018 1:24 pm REDP-5472-00
IBM Power System AC922

Introduction and Technical Overview
Alexandre Bicas Caldeira
Redpaper
Draft Document for Review March 5, 2018 1:24 pm 5472edno.fm
International Technical Support Organization
IBM Power System AC922 Introduction and Technical

Overview
March 2018
REDP-5472-00
5472edno.fm Draft Document for Review March 5, 2018 1:24 pm
Note: Before using this information and the product it supports, read the information in “Notices” on page v.
First Edition (March 2018)
This edition applies to the IBM Power System AC922 server models 8335-GTG and 8335-GTW.
© Copyright International Business Machines Corporation 2018. All rights reserved.

Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Draft Document for Review March 5, 2018 1:24 pm 5472TOC.fm
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .v
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Chapter 1. Product summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Key server features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Server models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Power AC922 server model 8335-GTG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Power AC922 server model 8335-GTW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Minimum features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 2. System architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.1 System architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Processor subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.1 POWER9 processor overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.2 Processor feature codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Memory subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.1 Memory features and placement rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.2 Memory bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 I/O subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.1 PCIe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.2 IBM CAPI2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.3 OpenCAPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.4 The NVIDIA Tesla V100 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.5 NVLINK 2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5 PCI adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.1 Slot configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.2 LAN adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.3 Fibre Channel adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.4 CAPI-enabled InfiniBand adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.5 Compute intensive accelerator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.6 Flash storage adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6 System ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.7 Internal storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.7.1 Disk and media features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.8 External I/O subsystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.9 Location Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.10 IBM System Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.11 Operating system support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.11.1 Ubuntu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.11.2 Red Hat Enterprise Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.11.3 Additional information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.12 Java. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
© Copyright IBM Corp. 2018. All rights reserved. iii

5472TOC.fm Draft Document for Review March 5, 2018 1:24 pm
Chapter 3. Physical infrastructure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.1 Operating environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Physical package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 System power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 System cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.5 Rack specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5.1 IBM Enterprise Slim Rack 7965-S42. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5.2 AC power distribution units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.5.3 Rack-mounting rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.4 OEM racks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
iv IBM Power System AC922 Introduction and Technical Overview

Draft Document for Review March 5, 2018 1:24 pm 5472spec.fm
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS”

WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
© Copyright IBM Corp. 2018. All rights reserved. v

5472spec.fm Draft Document for Review March 5, 2018 1:24 pm
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright
and trademark information” at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
AIX® POWER® Redbooks®
DS8000® Power Systems™ Redpaper™
Easy Tier® POWER9™ Redbooks (logo) ®
EnergyScale™ PowerHA® Storwize®
IBM® PowerLinux™ System Storage®
IBM FlashSystem® PowerVM® XIV®
OpenCAPI™ Real-time Compression™
The following terms are trademarks of other companies:
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
Other company, product, or service names may be trademarks or service marks of others.
vi IBM Power System AC922 Introduction and Technical Overview

Draft Document for Review March 5, 2018 1:24 pm 5472pref.fm
Preface
This IBM® Redpaper™ publication is a comprehensive guide that covers the IBM Power
System AC922 server (8335-GTG and 8335-GTW models). The Power AC922 server is the
next generation of the IBM Power processor-based systems, which are designed for deep
learning and artificial intelligence (AI), high-performance analytics, and high-performance
computing (HPC).
This paper introduces the major innovative Power AC922 server features and their relevant
functions:
򐂰 Powerful IBM POWER9™ processors that offer 16 cores at 2.6 GHz with 3.09 GHz turbo
performance or 20 cores at 2.0 GHz with 2.87 GHz turbo for the 8335-GTG
򐂰 Eighteen cores at 2.98 GHz with 3.26 GHz turbo performance or 22 at 2.78 GHz cores
with 3.07 GHz turbo for the 8335-GTW
򐂰 IBM Coherent Accelerator Processor Interface (CAPI) 2.0, OpenCAPI™, and
second-generation NVIDIA NVLink technology for exceptional processor-to-accelerator
intercommunication
򐂰 Up to six dedicated NVIDIA Tesla V100 GPUs
This publication is for professionals who want to acquire a better understanding of IBM Power
Systems™ products and is intended for the following audiences:
򐂰 Clients
򐂰 Sales and marketing professionals
򐂰 Technical support professionals
򐂰 IBM Business Partners
򐂰 Independent software vendors (ISVs)
This paper expands the set of IBM Power Systems documentation by providing a desktop
reference that offers a detailed technical description of the Power AC922 server.
This paper does not replace the current marketing materials and configuration tools. It is
intended as an extra source of information that, together with existing sources, can be used to
enhance your knowledge of IBM server solutions.
Authors
This paper was produced by a specialist working at the International Technical Support
Organization, Austin Center.
Alexandre Bicas Caldeira is a Certified IT Specialist and a former Product Manager for
Power Systems Latin America. He holds a degree in computer science from the Universidade
Estadual Paulista (UNESP) and an MBA in marketing. His major areas of focus are
competition, sales, marketing, and technical sales support. Alexandre has more than 20
years of experience working on IBM Systems Solutions and has worked also as an IBM
Business Partner on Power Systems hardware, IBM AIX®, and IBM PowerVM® virtualization
products.
© Copyright IBM Corp. 2018. All rights reserved. vii

5472pref.fm Draft Document for Review March 5, 2018 1:24 pm
The project that produced this publication was managed by:
Scott Vetter
Executive Project Manager, PMP
Thanks to the following people for their contributions to this project:
Adel El-Hallak, Volker Haug, Ann Lund, Cesar Diniz Maciel, Chris Mann, Scott Soutter, Jeff
Stuecheli
IBM
Now you can become a published author, too!

Here’s an opportunity to spotlight your skills, grow your career, and become a published
author—all at the same time! Join an ITSO residency project and help write a book in your
area of expertise, while honing your experience using leading-edge technologies. Your efforts
will help to increase product acceptance and customer satisfaction, as you expand your
network of technical contacts and relationships. Residencies run from two to six weeks in
length, and you can participate either in person or as a remote resident working from your
home base.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our papers to be as helpful as possible. Send us your comments about this paper or
other IBM Redbooks publications in one of the following ways:
򐂰 Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
򐂰 Send your comments in an email to:
redbooks@us.ibm.com
򐂰 Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Stay connected to IBM Redbooks

򐂰 Find us on Facebook:
http://www.facebook.com/IBMRedbooks
򐂰 Follow us on Twitter:
http://twitter.com/ibmredbooks
viii IBM Power System AC922 Introduction and Technical Overview

Draft Document for Review March 5, 2018 1:24 pm 5472pref.fm
򐂰 Look for us on LinkedIn:

http://www.linkedin.com/groups?home=&gid=2130806
򐂰 Explore new Redbooks publications, residencies, and workshops with the IBM Redbooks
weekly newsletter:
https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
򐂰 Stay current on recent Redbooks publications with RSS Feeds:
http://www.redbooks.ibm.com/rss.html
Preface ix
5472pref.fm Draft Document for Review March 5, 2018 1:24 pm
x IBM Power System AC922 Introduction and Technical Overview

Draft Document for Review March 5, 2018 1:24 pm 5472ch01.fm
Chapter 1. Product summary

The IBM Power System AC922 is the next generation of the IBM POWER® processor-based
systems, which are designed for deep learning and artificial intelligence (AI),
high-performance analytics, and high-performance computing (HPC).
The system is a co-designed with OpenPOWER Foundation members and will be deployed at
the most powerful supercomputer on the planet with a partnership between IBM, NVIDIA,
Mellanox, and others. It provides the latest technologies that are available for HPC, improving
even more the movement of data from memory to GPU accelerator cards and back, enabling
faster and lower latency data processing.
Among the new technologies the system provides, the most significant are the following ones:
򐂰 Two POWER9 processors with up to 40 cores (8335-GTG) or 44 cores (8335-GTW) and
improved buses
򐂰 1 TB of DDR4 memory (8335-GTG) with improved speed or 1 TB for the 8335-GTW
򐂰 Up to six NVIDIA Tesla V100 (Volta) GPUs, delivering up to 100 TFlops each, which is a
5x improvement compared to the previous generation
򐂰 Second-generation NVLINK with 2x throughput compared to the first generation
Because the massive computing capacity is packed into just 2Us of rack space, special
cooling systems supports the largest configurations. Therefore, to accommodate distinct data
center infrastructure requirements, the system is available in two distinct models:
򐂰 8335-GTG: Up to four GPUs and air-cooled
򐂰 8335-GTW: Up to six GPUS and water-cooled
© Copyright IBM Corp. 2018. All rights reserved. 1

5472ch01.fm Draft Document for Review March 5, 2018 1:24 pm
Figure 1-1 shows the front and rear views of an Power AC922 server.

Figure 1-1 Front and rear views of the Power AC922 server
1.1 Key server features

The Power AC922 server addresses the demanding needs of deep learning and AI,
high-performance analytics, and HPC.
An updated list of ported HPC applications that can use the IBM POWER technology is
available at IBM Power Systems HPC Applications Summary.
The system includes several features to improve performance:

򐂰 POWER9 Processors
– Each POWER9 processor module has either 16 or 20 cores based 64-bit architecture
• Clock speeds for 16-core chip of 2.6 GHz (3.09 GHz turbo - 8335-GTG)
• Clock speeds for 20-core chip of 2.0 GHz (2.87 GHz turbo - 8335-GTG
• Clock speeds for 18-core chip of 2.98 GHz (3.26 GHz turbo - 8335-GTW)
• Clock speeds for 18-core chip of 2.78 GHz (3.076 GHz turbo - 8335-GTW)
– 512 KB of L2 cache per core, and up to 120 MB of L3 cache per chip
– Up to 4 threads per core
– 120 GB/s memory bandwidth per chip
– 64 GB/s SMP interconnect between POWER9 chips
򐂰 DDR4 Memory
– Sixteen dual in-line memory module (DIMM) memory slots
– Maximum of 1024 GB DDR4 system memory (8335-GTG)
– Improved clock from 1333 MHz to 2666 MHz for reduced latency
򐂰 NVIDIA Tesla V100 GPUs
– Up to six NVIDIA Tesla V100 GPUs, based on the NVIDIA SXM2 form factor
connectors.
– 7.8 TFLOPs per GPU for double precision
– 15.7 TFLOPs per GPU for single precision
2 IBM Power System AC922 Introduction and Technical Overview

– 125 TFLOPs per GPU for deep learning

• New 640 Tensor Cores per GPU, designed for deep learning
– 16 GB HBM2 internal memory with 900 GB/s bandwidth, 1.5x the bandwidth compared
to Pascal P100
– Liquid cooling for six GPUs configurations to improve compute density
򐂰 NVLink 2.0
– Twice the throughput, compared to the previous generation of NVLink
– Up to 200 GB/s of bi-directional bandwidth between GPUs
– Up to 300 GB/s of bi-directional bandwidth per POWER9 chip and GPUs, compared to
32 GB/s of traditional PCIe Gen3
򐂰 OpenCAPI 3.0
– Open protocol bus to allow for connections between the processor system bus in a
high speed and cache coherent manner with OpenCAPI compatible devices like
accelerators, network controllers, storage controllers, and advanced memory
technologies
– Up to 100 GB/s of bi-directional bandwidth between CPUs and OpenCAPI devices
򐂰 PCIe Gen4 Slots
– Four PCIe Gen4 slots up to 64 GB/s bandwidth per slot, twice the throughput from
PCIe Gen3
• Three Coherent Accelerator Processor Interface (CAPI) 2.0 capable slots
Figure 1-2 shows the physical locations of the main server components.

0

% & '*-$( 0 6'%"'

0 =9

&$%&#' 0
0 68,5
, ! ',$(

0 7)& #'&)6;#(%# +
"
0 6)& #')=#(%# +
"
0 6)& #')9#(%#
% !
#" '+$(
0 7755
0 77<
0 #'($

!
0 $'#8$%>$
0 7#%!'#%
0 855
0 #'&($ 0 " 7,5
. '+$(
0 6;#%75#%
0 9
!!
!% 0 68,51=88:/#" *2
0 7)&. 0 *&'!-&
Figure 1-2 Location of server main components
Chapter 1. Product summary 3

1.2 Server models

The Power AC922 is manufactured in two distinct models:
Table 1-1 Summary of Power AC922 server available models

Server POWER9 Maximum Maximum Cooling
Model Chips Memory GPU Cards
8335-GTG 2 1 TB 4 Air cooled
8335-GTW 2 1 TB 6 Water cooled
1.2.1 Power AC922 server model 8335-GTG

This summary describes the standard features of the Power AC922 model 8355-GTG:
򐂰 19-inch rack-mount (2U) chassis
򐂰 Two POWER9 processor modules:
– 16-core 2.6 GHz processor module
– Up to 1024 GB of 2666 MHz DDR4 error correction code (ECC) memory
򐂰 Two small form factor (SFF) bays for hard disk drives (HDDs) or solid state drives (SSDs)
that support:
– Two 1 TB 7200 RPM NL SATA disk drives (#ELD0)
– Two 2 TB 7200 RPM NL SATA disk drives (#ES6A)
– Two 960 GB SATA SSDs (#ELU4)
– Two 1.92 TB SATA SSDs (#ELU5)
򐂰 Integrated SATA controller
򐂰 Four Peripheral Component Interconnect Express (PCIe) Gen4 slots:
– Two PCIe x16 Gen4 Low Profile slot, CAPI enabled
– One PCIe x8 Gen4 Low Profile slot, CAPI enabled
– One PCIe x4 Gen4 Low Profile slot
򐂰 Two or four NVIDIA Tesla V100 GPU (#EC4J), based on the NVIDIA SXM2 form factor
connectors air-cooled
򐂰 Integrated features:
– IBM EnergyScale™ technology
– Hot-swap and redundant cooling
– Two 1 Gb RJ45 ports
– One front USB 3.0 port for general use
– One rear USB 3.0 port for general use
– One system port with RJ45 connector
򐂰 Two power supplies (both are required)

The internal view of the fully populated server with four GPUs can be seen in Figure 1-3. In
this photo, the air baffles have been removed to better show the major components.
Figure 1-3 Power AC922 server model 8335-GTG fully populated with four GPUs
1.2.2 Power AC922 server model 8335-GTW

This summary describes the standard features of the Power AC922 model 8335-GTW:
򐂰 19-inch rack-mount (2U) chassis
򐂰 Two POWER9 processor modules:
– Up to 2048 GB of 2666 MHz DDR4 error correction code (ECC) memory
򐂰 Two small form factor (SFF) bays for hard disk drives (HDDs) or solid state drives (SSDs)
that support:
– Two 1 TB 7200 RPM NL SATA disk drives (#ELD0)
– Two 2 TB 7200 RPM NL SATA disk drives (#ES6A)
– Two 960 GB SATA SSDs (#ELU4)
򐂰 Integrated SATA controller
򐂰 Four Peripheral Component Interconnect Express (PCIe) Gen4 slots:
– Two PCIe x16 Gen4 Low Profile slot, CAPI enabled
– One PCIe x8 Gen4 Low Profile slot, CAPI enabled
– One PCIe x4 Gen4 Low Profile slot

򐂰 Four or six NVIDIA Tesla V100 GPU (#EC4J), based on the NVIDIA SXM2 form factor
connectors water-cooled
򐂰 Integrated features:
– IBM EnergyScale technology
– Hot-swap and redundant cooling
– Two 1 Gb RJ45 ports
– One rear USB 3.0 port for general use
– One system port with RJ45 connector
򐂰 Two power supplies (both are required)
The internal view of the fully populated server with six GPUs and water cooling system
installed can be seen in Figure 1-4. In this photo, the air baffles have been removed to better
show the major components.
Figure 1-4 Power AC922 server model 8335-GTW fully populated with six GPUs
Note: A hardware management console is not supported on the Power AC922 server.
1.2.3 Minimum features

The minimum initial order for the Power AC922 model 8355-GTG must include the following
minimum features:
򐂰 Two processor modules with at least 16 cores each
򐂰 256 GB of memory (sixteen 16 GB memory DIMMs)
򐂰 Two HDDs or SSDs
򐂰 Two #EC3L PCIe 2-port 100 Gbps Ethernet adapters
򐂰 Two #EC4J compute-intensive accelerators (NVIDIA V100)
򐂰 Two power supplies and power cords (both are required)

򐂰 An Linux OS indicator
򐂰 A rack integration indicator
򐂰 A Language Group Specify


Chapter 2. System architecture

This chapter describes the overall system architecture for the Power AC922 server. The
bandwidths that are provided throughout the section are theoretical maximums that are used
for reference.
Note: The speeds that are shown are at an individual component level. Multiple
components and application implementation are key to achieving the preferred
performance. Always do the performance sizing at the application-workload environment
level and evaluate performance by using real-world performance measurements and
production workloads.

2.1 System architecture

The Power AC922 server is a two single-chip module (SCM) system. Each SCM is attached
eight memory RDIMM slots. The server has a maximum capacity of 16 memory DIMMs which
allows for up to 1024 GB of memory.
The system board has sockets for four or six GPUs depending on the model, each 300 Watt
capable. Additionally, the server has a total of four PCIe Gen3 slots; three of these slots are
Coherent Accelerator Processor Interface (CAPI)-capable.
A diagram with the location of the processors, memory DIMMs, GPUs and PCIe slots can be
seen in Figure 2-1.

"# !

Figure 2-1 Component location for four GPU and six GPU system planars
An integrated SATA controller is fed through a dedicated PCI bus on the main system board
and allows for up to two SATA HDDs or SSDs to be installed. The location of the integrated
SATA connector can be seen in Figure 2-2 on page 11. This bus also drives the integrated
Ethernet and USB ports.


Figure 2-2 Integrated SATA connector
The POWER9 processor brings enhanced memory and I/O connectivity, improved chip to
chip communication and a new bus called NVLINK 2.0. A diagram with the external processor
connectivity can be seen in Figure 2-3 on page 12.
Chapter 2. System architecture 11

*
--()- .)/ 0/1%/.$
'+& .%!2))3%")*
*&)("$ 00)2
*)+'(,

--()- .)/ 0/1%/.$

$%&

$%&

"'()*

*0.)("'00)".
.' "$%&
*&)("$ 00)2
*) )0
0)* 0)* 0)*

--()- .)/ 0/1%/.$

*&)() )0
0)
) 2'.*
Figure 2-3 POWER9 chip external connectivity
Faster DDR4 memory DIMMs at 2666 MHz are connected to two memory controllers via
eight channels with a total bandwidth of 120 GB/s. Symmetric Multiprocessing chip-to-chip
interconnect is done via a four channel SMP bus with 64 GB/s bidirectional bandwidth.
The latest PCIe Gen4 interconnect doubles the channel bandwidth from previous PCIe Gen3
generation, allowing for the 48 PCIe channels to drive total of 192 GB/s bidirectional
bandwidth between I/O adapters and the POWER9 chip.
The connection between GPUs and between CPUs and GPUs is done via a link called
NVLINK 2.0, developed by IBM, NVIDIA and the OpenPOWER Foundation. This link provides
up to 5x more communication bandwidth between CPUs and GPUs (when compared to
traditional PCIe Gen3) and allows for faster data transfer between memory and GPUs and
between GPUs. Complex and data hungry algorithms like the ones used in machine learning
can benefit from having these enlarged pipelines for data transfer once the amount of data
needed to be processed is many times larger than the GPU internal memory. For more
information about NVLINK 2.0 please see 2.4.5, “NVLINK 2.0” on page 28.
Each POWER9 CPU and each GPU have six NVLINK channels, called Bricks, each one
delivering up to 50 GB/s bi-directional bandwidth. These channels can be aggregated to allow
for more bandwidth or more peer to peer connections.
The Figure 2-4 on page 13 compares the POWER9 implementation of NVLINK 2.0 with
traditional processor chips using PCIe and NVLINK.

(!/&.&'0!2'00)".&1&.4 (!/&.&'0!2'00)".&1&.4 '00)".&1&.4

2&.%) )0
2&.% 2&.%
4*.)+ 4*.)+ 4*.)+

)+'(4 )+'(4 )+'(4
* * *

)+'(40* )+'(40* )+'(40*
('")**'( ('")**'(

%&& %&& %&&
*

*
* .'
) )0
) )0

(&"'*
'2.! '2.! '2.! '2.! '2.! '2.!
* *
.' .'
(&"'
(&"'*
Figure 2-4 NVLINK 2.0 POWER9 implementation versus traditional architectures
On traditional processors, communication is done via PCIe Gen3 buses. Once the processor
has to handle all the GPU to GPU communication and GPU to system memory
communication, having more than two GPUs per processor would potentially create a
bottleneck on the data flow from system memory to GPUs.
To reduce this impact on the GPU to GPU communication, NVLINK brings a 50 GB/s direct
link between GPUs, reducing the dependency on the PCIe bus to exchange data between
GPUs but still depending on PCIe Gen3 to GPU to system memory communications.
The NVLINK 2.0 implementation on POWER9 goes beyond traditional implementation by

implementing 1.5x more memory bandwidth and aggregating NVLINK Bricks to allow for up to
3x faster communication between GPUs and system memory to GPU, reducing potential
bottlenecks throughout the system. The goal is to move data from system memory to GPU’s
16 GB internal memory as fast as possible so that GPU processing doesn’t have to stop and
wait for data to be moved to be able to continue.
Once NVLINK Bricks are combined differently depending on the server having four or six
GPUs (with two POWER9 processors) to maximize bandwidth, there are two distinct logical
diagrams depending on the amount of maximum GPUs supported per system.
Figure 2-5 on page 14 shows the logical system diagram for the Power AC922 server
(8335-GTG) with four GPUs, where the six NVLINK Bricks are divided in groups of three
allowing for 150 GB/s buses between GPUs.

NVLink 2.0
NVIDIA NVIDIA NVIDIA NVIDIA

VOLTA VOLTA VOLTA VOLTA
GPU GPU GPU GPU
50 GB/s per channel (Brick)
NVLink 2.0 150 GB/s aggregated

bandwidth
(3 Bricks)
15 GB/s per channel

DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM

POWER9 X Bus POWER9
DDR4 DIMM CPU 0 64 GB/s CPU 1 DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
PCIe Gen4 x8
PCIe Gen4 x16 - CAPI PCIe Gen4 x16 - CAPI
PCIe Gen4 x8 CAPI PCIe Gen4 x8
PCIe Gen4 x4
PCIe Gen2 x4 PCIe Gen2 x4 PCIe Gen2 x4

PEX
PCIe Gen2 x2 PCIe Gen2 x1
2 x 1Gbps Ethernet Internal Storage

Broadcom BMC USB Controller
2x Internal Rear Front

IPMI VGA
RJ-45 USB USB USB
Figure 2-5 The Power AC922 server model 8335-GTG logical system diagram
Figure 2-5 shows the logical system diagram for the Power AC922 server (8335-GTW) with
six connected GPUs, where the six NVLINK Bricks are divided in three groups of two Bricks,
allowing for 100 GB/s buses between GPUs but allowing for more connected GPUs.

100 GB/s aggregated

bandwidth
(2 Bricks)
NVLink 2.0
NVIDIA NVIDIA NVIDIA NVIDIA NVIDIA NVIDIA

VOLTA VOLTA VOLTA VOLTA VOLTA VOLTA
GPU GPU GPU GPU GPU GPU
50 GB/s per channel

NVLink 2.0 (Brick)
15 GB/s per channel

DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM

POWER9 X Bus POWER9
DDR4 DIMM CPU 0 64 GB/s CPU 1 DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
DDR4 DIMM DDR4 DIMM
PCIe Gen4 x8
PCIe Gen4 x16 - CAPI PCIe Gen4 x16 - CAPI
PCIe Gen4 x4
PCIe Gen2 x4 PCIe Gen2 x4 PCIe Gen2 x4

PEX
PCIe Gen2 x2 PCIe Gen2 x1
2 x 1Gbps Ethernet Internal Storage

Broadcom BMC USB Controller
2x Internal Rear Front

IPMI VGA
RJ-45 USB USB USB
Figure 2-6 The Power AC922 server model 8335-GTW logical system diagram
2.2 Processor subsystem

This section introduces the latest processor in the Power Systems product family and
describes its main characteristics and features in general.
The POWER9 processor in the Power AC922 server is the latest generation of the POWER
processor family. Based on the 14 nm FinFET Silicon-On-Insulator (SOI) architecture, the
chip size is 685 mm x 685 mm and contains eight billion transistors.
2.2.1 POWER9 processor overview

The POWER9 chip has four variations, depending on the server scalability and whether their
are being used on Linux ecosystem designed servers or PowerVM ecosystem servers.
The main differences reside in the scalability, maximum core count, SMT capability, and
memory connection. Table 2-1 on page 16 compares the chip variations.

Table 2-1 POWER9 chip variations

Variation Maximum SMP Maximum Memory Memory
Cores Connections SMT Connection Bandwidth
Scale-Out Linux 24 2 sockets SMT4 Direct 120 GB/s
Scale-Out PowerVM 24 2 sockets SMT4 Direct 120 GB/s
Scale-Up Linux 12 16 sockets SMT8 Memory 230 GB/s

Buffer
Scale-Up PowerVM 12 16 sockets SMT8 Memory 230 GB/s

Buffer
The main reason for this differentiation between Linux Ecosystem and PowerVM ecosystem
is that Linux Ecosystems have a greater need of granularity and core counts per chip while
PowerVM Ecosystem has a greater need of stronger threads and higher per core
performance for better efficiency on licensing.
A diagram reflecting the main differences can also be seen in Figure 2-7.
&002 '*4*."+&.&+&Y"! '1"( '*4*."+&.&+&Y"!

'("*%&& '("*%&&

2"0.

%"0."( '00" . %"0."( '00" .

Q *' '".
Q &(" ."+'(4 .. %
&"0
&"0
"+'(4

"+'(4

Q 0!0*.(4.0!(!
Q *"+'(40!1&!.%
"
"

&0'* &0'* &0'* &0'*

2"&

%"0."( '00" . %"0."( '00" .

Q *' '".
Q 0##"("!"+'(4 .. %
&"0
&"0
"+'(4

"+'(4

Q "*&$0"!
Q *"+'(40!1&!.%
"
"

0##"( 0##"(
&0'* &0'* &0'* &0'*
Figure 2-7 POWER9 chip variations
The Power AC922 server utilizes the Scale-out Linux version, with up to 24 cores and SMT4.
The POWER9 chip contains two memory controllers, PCIe Gen4 I/O controllers, and an
interconnection system that connects all components within the chip at 7 TB/s. Each core has
256 KB of L2 cache, and all cores share 120 MB of L3 embedded DRAM (eDRAM). The
interconnect also extends through module and system board technology to other POWER9
processors in addition to DDR4 memory and various I/O devices.

While scale-out variations have direct memory connections, scale-up POWER9

processor-based systems use memory buffer chips to interface between the POWER9
processor and DDR4 memory. Each buffer chip also includes an L4 cache to reduce the
latency of local memory accesses.
Figure 2-8 shows the POWER9 processor with 24 cores.
Figure 2-8 The 24-core POWER9 processor
Each POWER9 processor has eight memory channels, designed to address up to 512 GB of
memory. Theoretically, a two-socket server could address up to 8 TB of memory and a
16-socket system could address up to 64 TB of memory. Due to the current state of memory
DIMM technology, the largest available DIMM is 64 GB and therefore the largest amount of
memory supported on the Power AC922 is 1024 GB (64 GB x 8 channels x 2 processors).
2.2.2 Processor feature codes

The Power AC922 (8335-GTG) server supports two processor configurations only.
Processor features must be in quantity two and cannot be mixed.

Table 2-2 POWER9 processor features supported

Feature Description Min/Max OS
code support
EP0K POWER9 16-core 2.60 GHz (3.09 GHz Turbo) - 190W 2/2 Linux
EP0M POWER9 20-core 2.00 GHz (2.87 GHz Turbo) - 190W 2/2 Linux
2.3 Memory subsystem

The Power AC922 server is a two-socket system that supports two POWER9 SCM processor
modules. The server supports a maximum of 16 DDR4 RDIMMs slots in the main system
planar directly connected to the POWER9 processor.
Memory features equate to a single memory DIMM. All memory DIMMs must be populated
and mixing of different memory feature codes is not supported. Memory feature codes that
are supported are as follows:
򐂰 16 GB DDR4
򐂰 32 GB DDR4
򐂰 64 GB DDR4
Plans for future memory growth needs should be taken into account when deciding which
memory feature size to use at the time of initial system order once an upgrade will require a
full replacement of the installed DIMMs.
2.3.1 Memory features and placement rules

Each feature code equates to a single memory DIMMs. On the Table 2-3 it is shown the
available memory feature codes for ordering:
Table 2-3 Memory features supported

Feature Description Min/Max OS
code support
EM61 16 GB DDR4 2666 MHz DDR4 RDIMM 16/16 Linux
The supported maximum memory is 1024 GB by installing a quantity of 16 #EM64 memory

DIMMs. For the Power AC922 server (models 8335-GTG and 8335-GTW), the following
requirements apply:
򐂰 All the memory DIMMs must be populated.
򐂰 Memory features cannot be mixed.
򐂰 The base memory is 256 GB with sixteen 16 GB, 2666 MHz DDR4 memory modules
(#EM61).
Table 2-4 on page 19 shows total memory and how it can be accomplished by the quantities
of each memory feature code.

Table 2-4 Supported memory feature codes for Power AC922 server
Total installed memory
Memory features 256 GB 512 GB 1024 GB
16 GB (#EM61) 16
32 GB (#EM63) 16
64 GB (#EM64) 16
2.3.2 Memory bandwidth

The POWER9 processor has exceptional cache, memory, and interconnect bandwidths.
Table 2-5 shows the maximum bandwidth estimates for a single core on the server.
Table 2-5 The Power AC922 server single-core bandwidth estimates

Single core 8335-GTG and 8335-GTW
2.860 GHz 3.259 GHz
L1 (data) cache 137.28 GBps 156.43 GBps
L2 cache 137.28 GBps 156.43 GBps
L3 cache 183.04 GBps 208.57 GBps
The bandwidth figures for the caches are calculated as follows:

򐂰 L1 cache: In one clock cycle, two 16-byte load operations and one 16-byte store operation
can be accomplished. The value varies depending on the clock of the core, and the
formulas are as follows:
– 2.860 GHz Core: (2 x 16 B + 1 x 16 B) x 2.860 GHz = 137.28 GBps
򐂰 L2 cache: In one clock cycle, one 32-byte load operation and one 16-byte store operation
can be accomplished. The value varies depending on the clock of the core, and the
formula is as follows:
򐂰 L3 cache: One 32-byte load operation and one 32-byte store operation can be
accomplished at half-clock speed, and the formula is as follows:
Table 2-6 shows the overall bandwidths for the entire Power AC922 server populated with the
two processor modules.
Table 2-6 The Power AC922 server total bandwidth estimates

Total bandwidths 8335-GTG
20 cores @ 2.860 GHz 16 cores @ 3.259 GHz
L1 (data) cache 2746 GBps 2503 GBps
L2 cache 2746 GBps 2503 GBps

Total bandwidths 8335-GTG
20 cores @ 2.860 GHz 16 cores @ 3.259 GHz
L3 cache 3661 GBps 3337 GBps
Total memory 240 GBps 240 GBps
SMP interconnect 64 GBps 64 GBps
PCIe interconnect 272 GBps 272 GBps
Where:
򐂰 Total memory bandwidth: Each POWER9 processor has eight memory channels running
at 15 GBps. The bandwidth formula is calculated as follows:
8 channels x 15 GBps = 120 GBps per processor module
򐂰 SMP interconnect: The POWER9 processors are connected using an X-bus. The
bandwidth formula is calculated as follows:
1 X bus * 4 bytes * 16 GHz = 64 GBps
򐂰 PCIe interconnect: Each POWER9 processor has 34 PCIe lanes running at 16 Gbps
full-duplex. The bandwidth formula is calculated as follows:
34 lanes x 2 processors x 16 Gbps x 2 = 272 GBps
2.4 I/O subsystem

The key components of the I/O subsystem are discussed in the following.
2.4.1 PCIe
PCIe uses a serial interface and allows for point-to-point interconnections between devices by
using a directly wired interface between these connection points. A single PCIe serial link is a
dual-simplex connection that uses two pairs of wires, one pair for transmit and one pair for
receive, and can transmit only one bit per cycle. These two pairs of wires are called a lane. A
PCIe link can consist of multiple lanes. In these configurations, the connection is labeled as
x1, x2, x8, x12, x16, or x32, where the number is effectively the number of lanes.
The Power AC922 supports the new PCIe Gen4, which are capable of 32 GBps simplex
(64 GBps duplex) on a single x16 interface. PCIe Gen4 slots also support previous
generation (Gen3 and Gen2) adapters, which operate at lower speeds according to the
following rules:
򐂰 Place x1, x4, x8, and x16 speed adapters in the same size connector slots first before
mixing adapter speed with connector slot size.
򐂰 Adapters with lower speeds are allowed in larger sized PCIe connectors, but larger speed
adapters are not compatible in smaller connector sizes (that is, a x16 adapter cannot go in
an x8 PCIe slot connector).
PCIe adapters use a different type of slot than PCI adapters. If you attempt to force an
adapter into the wrong type of slot, you might damage the adapter or the slot.

POWER9-based servers support PCIe low profile (LP) cards, due to the restricted height of
the server.
Before adding or rearranging adapters, use the System Planning Tool to validate the new
adapter configuration. For more information about the System Planning Tool, see the
following website:
http://www.ibm.com/systems/support/tools/systemplanningtool/
If you are installing a new feature, ensure that you have the software that is required to
support the new feature and determine whether there are existing update prerequisites to
install. To obtain this information, use the IBM prerequisite website:
https://www-912.ibm.com/e_dir/eServerPreReq.nsf
The following sections describe other I/O technologies that enhance or replace the PCIe
interface.
2.4.2 IBM CAPI2

IBM CAPI2 is the evolution of CAPI and defines a coherent accelerator interface structure for
attaching special processing devices to the POWER9 processor bus. As the original CAPI,
CAPI2 can attach accelerators that have coherent shared memory access with the
processors in the server and share full virtual address translation with these processors, now
using a standard PCIe Gen4 buses with twice the bandwidth compared to the previous
generation.
Applications can have customized functions in Field Programmable Gate Arrays (FPGAs) and
enqueue work requests directly in shared memory queues to the FPGA. Applications can
also have customized functions by using the same effective addresses (pointers) they use for
any threads running on a host processor. From a practical perspective, CAPI allows a
specialized hardware accelerator to be seen as an additional processor in the system with
access to the main system memory and coherent communication with other processors in the
system. Figure 2-9 shows a comparison of the traditional model, where the accelerator has to
go thru the processor to access memory, with CAPI.

Figure 2-9 CAPI accelerator attached to the POWER9 processor
The benefits of using CAPI include the ability to access shared memory blocks directly from
the accelerator, the ability to perform memory transfers directly between the accelerator and
processor cache, and the ability to reduce the code path length between the adapter and the
processors. This reduction in the code path length might occur because the adapter is not

operating as a traditional I/O device, and there is no device driver layer to perform processing.
CAPI also presents a simpler programming model.
The accelerator adapter implements the Power Service Layer (PSL), which provides address
translation and system memory cache for the accelerator functions. The custom processors
on the system board, consisting of an FPGA or an ASIC, use this layer to access shared
memory regions, and cache areas as though they were a processor in the system. This ability
enhances the performance of the data access for the device and simplifies the programming
effort to use the device. Instead of treating the hardware accelerator as an I/O device, it is
treated as a processor, which eliminates the requirement of a device driver to perform
communication. It also eliminates the need for direct memory access that requires system
calls to the OS kernel. By removing these layers, the data transfer operation requires fewer
clock cycles in the processor, improving the I/O performance.
The implementation of CAPI on the POWER9 processor allows hardware companies to

develop solutions for specific application demands. Companies use the performance of the
POWER9 processor for general applications and the custom acceleration of specific
functions. They do so by using a hardware accelerator with a simplified programming model
and efficient communication with the processor and memory resources.
For a list of supported CAPI adapters, see 2.5.4, “CAPI-enabled InfiniBand adapters” on
page 32.
2.4.3 OpenCAPI
While CAPI is a technology present in IBM POWER processors and depending on IBM’s
intellectual property (the Processor Service Layer, or PSL), several industry solutions would
benefit from having a mechanism of connecting different devices to the processor, with low
latency, including memory attachment. The PCIe standard is pervasive to every processor
technology, but its design characteristics and latency, do not allow the attachment of memory
tor load/store operations.
With this in mind, the OpenCAPI Consortium was created, with the goal of defining a device
attachment interface, opening the CAPI interface to other hardware developers and extending
its capabilities. OpenCAPI aims to allow memory, accelerators, network, storage and other
devices to be connected to the processor through a high bandwidth, low latency interface,
becoming the interface of choice for connecting high performance devices.
By providing a high bandwidth low latency connection to devices, OpenCAPI allows several
applications to improve networking, make use of FPGA accelerators, use expanded memory
beyond server internal capacity, and reduce latency to storage devices. Some of these use
cases and examples can be seen in Figure 2-10 on page 23.

Figure 2-10 OpenCAPI uses cases
The design of OpenCAPI allows for very low latency in accessing attached devices (in the
same range of DDR memory access - 10 ns), which enables memory to be connected
through OpenCAPI and serve as main memory for load/store operations. In contrast, PCIe
latency is 10 times bigger (around 100 ns). That alone show a significant enhancement in
OpenCAPI when compared to traditional PCIe interconnects.
Open CAPI is agnostic to processor architecture and as such the electrical interface is not
being defined by the OpenCAPI consortium or any of its workgroups. On the POWER9
processor, the electrical interface is based on the design from the 25G workgroup within the
OpenPower Foundation, which encompasses a 25 GB/s signaling and protocol built to enable
very low latency interface on CPU and attached devices. Future capabilities include increase
speeds to 32 GB/s and 56 GB/s signaling.
The current design for the adapter is based on a PCIe card that draws power from the PCIe
slot, while connecting to the OpenCAPI port on the planar through a 25 GBs/s cable, as seen
in the Figure 2-11 on page 24.

!!

!

"
!
"!
! !
# !
Figure 2-11 OpenCAPI compatible adapter and 25G link
The OpenCAPI interface uses the same electrical interconnect as NVLink 2.0 – systems can
be designed to have an NVLink-attached GPU, an OpenCAPI-attached device, or both. The
use of OpenCAPI adapters limits the mount of NVLINK ports available for GPU
communication. Each POWER9 chip has six NVLINK ports, out of which four can be
alternatively be used for OpenCAPI adapters as shows in Figure 2-12.
!
!

!
!

!
%

!
!

!
!

!
%
"
!#
$#
Figure 2-12 OpenCAPI and NVLINK shared ports on the POWER9 chip

2.4.4 The NVIDIA Tesla V100

NVIDIA’s new NVIDIA Tesla V100 accelerator, code named Volta, takes GPU computing to
the next level. This section describes the Tesla V100 accelerator, as shown in Figure 2-13.
Figure 2-13 NVIDIA Tesla V100 for NVLINK accelerator
NVIDIA Tesla V100 is the world’s most advanced data center GPU ever built to accelerate
artificial intelligence (AI), high-performance computing (HPC), and graphics. Powered by
NVIDIA Volta, the latest GPU architecture, Tesla V100 offers the performance of 100 CPUs in
a single GPU - enabling data scientists, researchers, and engineers to tackle challenges that
were once impossible
The Tesla V100 includes the following key features:

򐂰 Volta architecture
By pairing CUDA Cores and Tensor Cores within a unified architecture, a single server
with Tesla V100 GPUs can replace hundreds of commodity CPU servers for traditional
HPC and Deep Learning.
򐂰 Tensor core
Equipped with 640 Tensor Cores, Tesla V100 delivers 125 TeraFLOPS of deep learning
performance. That’s 12X Tensor FLOPS for DL Training, and 6X Tensor FLOPS for DL
Inference when compared to NVIDIA Pascal GPUs.
򐂰 Next generation NVLINK
NVIDIA NVLink in Tesla V100 delivers 2X higher throughput compared to the previous
generation. Up to eight Tesla V100 accelerators can be interconnected at up to 300 GB/s
to unleash the highest application performance possible on a single server.

򐂰 Maximum efficiency mode

The new maximum efficiency mode allows data centers to achieve up to 40% higher
compute capacity per rack within the existing power budget. In this mode, Tesla V100 runs
at peak processing efficiency, providing up to 80% of the performance at half the power
consumption.
򐂰 HBM2
With a combination of improved raw bandwidth of 900 GB/s and higher DRAM utilization
efficiency at 95%, Tesla V100 delivers 1.5X higher memory bandwidth over Pascal GPUs
as measured on STREAM.
򐂰 Programmability
Tesla V100 is architected from the ground up to simplify programmability. Its new
independent thread scheduling enables finer-grain synchronization and improves GPU
utilization by sharing resources among small jobs.
The Tesla V100 is built to deliver exceptional performance for the most demanding compute
applications. It delivers the following performance benefits:
򐂰 7.8 TFLOPS of double-precision floating point (FP64) performance
򐂰 15.7 TFLOPS of single-precision (FP32) performance
򐂰 125 Tensor TFLOP/s of mixed-precision.
With 640 Tensor Cores, Tesla V100 is the world’s first GPU to break the 100 teraflops
(TFLOPS) barrier of deep learning performance. The next generation of NVIDIA NVLink
connects multiple V100 GPUs at up to 300 GB/s to create the world’s most powerful
computing servers. AI models that would consume weeks of computing resources on
previous systems can now be trained in a few days. With this dramatic reduction in training
time, a whole new world of problems will now be solvable with AI.
Multiple GPUs are common in workstations, as are the nodes of HPC clusters and
deep-learning training systems. A powerful interconnect is extremely valuable in
multiprocessing systems. NVIDIA’s new NVIDIA Tesla V100 doesn’t rely on traditional PCIe
for data transfers, instead using the new NVLINK 2.0 bus that creates an interconnect for
GPUs that offer higher bandwidth than PCI Express Gen3 (PCIe) and are compatible with the
GPU ISA to support shared memory multiprocessing workloads.
Once PCIe buses are not used for data transfer, the GPU cards don’t need to comply with the
traditional PCIe card format. In order to improve density in the Power AC922 server the GPUs
have a different form factor called SXM2. This form factor allows for the GPU to be connected
directly on the system planar. Figure 2-14 on page 27 shows the SXM2 GPU module top and
bottom views and the connectors used for the GPU modules.


Figure 2-14 SXM2 GPU module views
The location of the GPUs on the Power AC922 system planar can be seen in Figure 2-15.

Figure 2-15 GPU location in a six GPU configuration

Cooling for four GPU configurations (Power AC922 model 8335-GTG) is done via air while for
six GPU configurations (model 8335-GTW) are water based. For more information on server
water cooling please see Chapter 3, “Physical infrastructure” on page 39.
For more information on the Tesla V100 please visit Inside Volta Parallel for All blog:
https://devblogs.nvidia.com/parallelforall/inside-volta
2.4.5 NVLINK 2.0

NVLINK 2.0 is NVIDIA’s new generation high-speed interconnect technology for
GPU-accelerated computing. Supported on SXM2-based Tesla V100 accelerator boards,
NVLink significantly increases performance for both GPU-to-GPU communications and for
GPU access to system memory.
Support for the GPU ISA allows programs running on NVLINK-connected GPUs to execute
directly on data in the memory of another GPU and on local memory. GPUs can also perform
atomic memory operations on remote GPU memory addresses, enabling much tighter data
sharing and improved application scaling.
NVLINK 2.0 uses NVIDIA’s new High-Speed Signaling interconnect (NVHS). NVHS transmits
data over a link called Brick that connects two processors (GPU-to-GPU or GPU-to-CPU). A
single Brick supports up to 50 GB/s of bidirectional bandwidth between the endpoints.
Multiple Links can be combined to form Gangs for even higher-bandwidth connectivity
between processors. The NVLINK implementation in Tesla V100 supports up to six Links,
allowing for a gang with an aggregate maximum theoretical bandwidth of 300 GB/s
bidirectional bandwidth.
Although traditional NVLINK implementation primarily focuses on interconnecting multiple

NVIDIA Tesla V100s together, under POWER9 it also connects Tesla V100 GPUs with IBM
POWER9 CPUs allowing for direct system memory access providing GPUs with an extended
memory orders of magnitude larger than the internal 16 GB memory.
On Power implementation, Bricks are always combined to provide the highest bandwidth
possible. Figure 2-16 on page 29 compares the bandwidth of the POWER9 processor
connected with two GPUs and three GPUs.


Figure 2-16 CPU to GPU and GPU to GPU interconnect using NVLink 2.0
All the initialization of the GPU is through the PCIe interface. The PCIe interface also contain
the side band communication for status, power management, and so on. Once the GPU is up
and running, all data communication is using the NVLink.
2.5 PCI adapters

This section describes the types and functions of the PCI adapters that are supported by the
Power AC922 server.
The Power AC922 server leverages the latest PCIe Gen4 technology, allowing for 32 GB/s
unidirectional and 64 GB/s bi-directional bandwidth.
Note: PCIe adapters on the Power AC922 server are not hot-pluggable.
2.5.1 Slot configuration

The Power AC922 server has four PCIe Gen4 slots. Figure 2-17 on page 30 shows a
rear-view diagram of the Power AC922 server with its PCIe slots.

!%
•
'$" !%
• $ •
'$"
!! %!$! %!#$(%#!!!$
•
• $
"$" $#"& #
•

&$!" "$!$
!(#&""$
( % #' %!#$

!%
•
"'$" &"$
• $
•

!%
•
'$"
• $
Figure 2-17 Rear-view PCIe slots and main components
Table 2-7 provides the PCIe Gen4 slot configuration.
Table 2-7 The Power AC922 server PCIe slot properties

Slot Description Card size CAPI capable
Slot 1 PCIe Gen4 x4 Half height, No

half length
Slot 2 PCIe Gen4 x8 Shared Half height, Yes

half length
Slot 3 PCIe Gen4 x16 Half height, Yes

half length
Slot 4 PCIe Gen4 x16 Half height, Yes

half length
Slot 2 has a shared connection between the two POWER9 CPUs. When using a dual channel
Mellanox InfiniBand ConnectX5 (IB-EDR) Network Interface Card (#EC64) it allows for each
CPU to have direct access to the Infiniband card. If the #EC64 card is not installed, the
shared slot will operate as a single x8 PCIe Gen4 slot attached to processor 0.
Figure 2-18 on page 31 shows the logical diagram of the slot 2 connected to the two
POWER9 processors.

POWER9 X Bus POWER9

CPU 0 64 GB/s CPU 1
PCIe Gen4 x8
Figure 2-18 Shared PCIe slot 2 logical diagram
Only LP adapters can be placed in LP slots. A x8 adapter can be placed in a x16 slot, but a
x16 adapter cannot be placed in a x8 slot.
2.5.2 LAN adapters

To connect the Power AC922 server to a local area network (LAN), you can use the LAN
adapters that are supported in the PCIe slots of the system unit. Table 2-8 lists the supported
LAN adapters for the server.
Table 2-8 Supported LAN adapters

Feature Description Max OS
code support
EC2R PCIe3 LP 2-port 10GbE (NIC& RoCE) SFP28 Adapter x8 3 Linux
EC2T PCIe3 LP 2-port 25/10GbE (NIC& RoCE) SFP28 Adapter x8 2 Linux
EC3L PCIe3 LP 2-port 100GbE (NIC& RoCE) QSFP28 Adapter x16 2 Linux
EL3Z PCIe2 LP 2-port 10/1GbE BaseT RJ45 Adapter x8 3 Linux
EL4M PCIe2 LP 4-port 1GbE Adapter 4 Linux
EN0T PCIe2 LP 4-Port (10Gb+1GbE) SR+RJ45 Adapter x8 3 Linux
EN0V PCIe2 LP 4-port (10Gb+1GbE) Copper SFP+RJ45 Adapter x8 2 Linux
2.5.3 Fibre Channel adapters

The Power AC922 server supports direct or SAN connection to devices that use Fibre
Channel adapters. Table 2-9 on page 32 summarizes the available Fibre Channel adapters,
which all have LC connectors.
If you are attaching a device or switch with an SC-type fiber connector, an LC-SC 50 micron
fibre converter cable (#2456) or an LC-SC 62.5 micron fibre converter cable (#2459) is
required.

Table 2-9 Fibre Channel adapters supported

code support
EL43 PCIe3 LP 16 Gb 2-port Fibre Channel Adapter x8 3 Linux
EL5V PCIe3 LP 32 Gb 2-port Fibre Channel Adapter x8 3 Linux
2.5.4 CAPI-enabled InfiniBand adapters

Table 2-10 shows the available CAPI adapters.
Table 2-10 Available CAPI adapters

code support
EC64 PCIe4 LP 2-port 100 Gb EDR InfiniBand Adapter x16 3 Linux
EC62 PCIe4 LP 1-port 100 Gb EDR InfiniBand Adapter x16 3 Linux
2.5.5 Compute intensive accelerator

Compute intensive accelerators are GPUs that are developed by NVIDIA. With NVIDIA
GPUs, the server can offload processor-intensive operations to a GPU accelerator and boost
performance. The Power AC922 server aims to deliver a new class of technology that
maximizes performance and efficiency for all types of scientific, machine learning, deep
learning, AI, engineering, Java, big-data analytics, and other technical computing workloads.
Table 2-11 lists the available compute intensive accelerators.
Table 2-11 Graphics processing units adapters that are supported

code support
EC4J One air-cooled NVIDIA Tesla V100 GPUs 16 GB 4 Linux

(model 8335-GTG only
EC4H One water-cooled NVIDIA Tesla V100 GPUs 16 GB 6 Linux

(model 8335-GTW only)
2.5.6 Flash storage adapters

The available flash storage adapters are shown in Table 2-12.
Table 2-12 Available flash storage adapters

Feature CCIN Description Max OS
code support
EC5A PCIe3 1.6 TB NVMe Flash Adapter 3 Linux

2.6 System ports

The system board has two 1 Gbps Ethernet port, one Intelligent Platform Management
Interface (IPMI) port, one rear USB 3.0 port (model 8335-GTG also has one front USB 3.0
port), and a VGA port shown in Figure 2-17 on page 30.
The integrated system ports are supported for modem and asynchronous terminal
connections with Linux. Any other application that uses serial ports requires a serial port
adapter to be installed in a PCI slot. The integrated system ports do not support IBM
PowerHA® configurations. The VGA port does not support cable lengths that exceed three
meters.
2.7 Internal storage

The internal storage on the Power AC922 server contains the following features:
򐂰 A storage backplane for two 2.5-inch SFF Gen4 SATA HDDs or SDDs.
Limitation: The disks use an SFF-4 carrier. Disks that are used in other Power
Systems servers usually have an SFF-3 or SFF-2 carrier and are not compatible with
this system.
򐂰 One integrated SATA disk controller (non-RAID).

򐂰 The storage split backplane feature is not supported.
Table 2-13 presents a summarized view of these features.
Table 2-13 Summary of features for the integrated SATA disk controller
Option Integrated SATA disk controller
Supported RAID types None - JBOD
Disk bays Two SFF Gen4 (HDDs/SDDs)
SATA controllers Single
IBM Easy Tier® capable controllers No
External SAS ports No
Split backplane No
The 2.5 inch or SFF SAS bays can contain SATA drives (HDD or SSD) that are mounted on a
Gen4 tray or carrier (also knows as SFF-4). SFF-2 or SFF-3 drives do not fit in an SFF-4 bay.
All SFF-4 bays support concurrent maintenance or hot-plug capability.
2.7.1 Disk and media features

The server supports the attachment of up to two SATA storage devices. Table 2-14 on
page 34 lists the supported devices that can be installed. Disk features cannot be mixed.

Table 2-14 Supported storage devices

code support
ELD0 1 TB 7.2k RPM SATA SFF-4 disk drive 2 Linux
ELU4 960 GB SSD SFF-4 disk drive 2 Linux
ELU5 1.92 TB SSD SFF-4 disk drive 2 Linux
ELU6 3.84 TB SSD SFF-4 disk drive 2 Linux
ES6A 2 TB 7.2k RPM 5xx SATA SFF-4 disk drive 2 Linux
The Power AC922 server is designed for network installation or USB media installation. It
does not support an internal DVD drive.
2.8 External I/O subsystems

The Power AC922 server does not support external PCIe Gen3 I/O expansion drawers nor
EXP24S, EXP12X, and EXP24SX storage drawers.
2.9 Location Codes

The location codes for the server main components can be seen on Figure 2-19.
P1C15 DDR4 Memory DIMM

P1C16 P1C6
P1C17
P1C18 NVIDIA
FAN Tesla V100
GPU
P1C13
POWER9 P1C7
CPU
NVIDIA
Tesla V100
GPU
FAN
P1C19
P1C20
P1C21
P1C22 P1C8
NVIDIA
FRONT
P1C23 Tesla V100

REAR
P1C24 GPU
P1C9
P1C25
FAN P1C26 NVIDIA
Tesla V100
GPU
P1C2 – PCIe Gen4 x16 - CAPI
P1C14 P1C3 – PCIe Gen4 x16 - CAPI
POWER9
P1C10
CPU P1C4 – PCIe Gen4 x8 (shared) – CAPI
NVIDIA
Tesla V100
P1C5 – PCIe Gen4 x4
FAN GPU
P1C27
P1C28
P1C29
P1C30 P1C11
P1D1 - SATA HDD/SSD Power Supplies (x2)
NVIDIA
Tesla V100
GPU
P1D2 - SATA HDD/SSD
Power Button / Front USB
Figure 2-19 Power AC922 main components location codes

2.10 IBM System Storage

The IBM System Storage® disk systems products and offerings provide compelling storage
solutions with superior value for all levels of business, from entry-level to high-end storage
systems. For more information about the various offerings, see the following website:
http://www.ibm.com/systems/storage/disk
The following sections describe sample System Storage offerings.
IBM Network Attached Storage

IBM Network Attached Storage (NAS) products provide a wide-range of network attachment
capabilities to a broad range of host and client systems, such as IBM Scale Out Network
Attached Storage and the IBM System Storage N series. For more information about the
hardware and software, see the following website:
http://www.ibm.com/systems/storage/network
IBM Storwize family

The IBM Storwize® family is the ideal solution to optimize the data architecture for business
flexibility and data storage efficiency. Different models, such as the IBM Storwize V3700,
IBM Storwize V5000, and IBM Storwize V7000, offer storage virtualization, IBM Real-time
Compression™, Easy Tier, and many more functions. For more information, see the following
website:
http://www.ibm.com/systems/storage/storwize
IBM FlashSystem family

The IBM FlashSystem® family delivers extreme performance to derive measurable economic
value across the data architecture (servers, software, applications, and storage). IBM offers a
comprehensive flash portfolio with the IBM FlashSystem family. For more information, see the
following website:
http://www.ibm.com/systems/storage/flash
IBM XIV Storage System

The IBM XIV® Storage System is a high-end disk storage system, helping thousands of
enterprises meet the challenge of data growth with hotspot-free performance and ease of
use. Simple scaling, high service levels for dynamic, heterogeneous workloads, and tight
integration with hypervisors and the OpenStack platform enable optimal storage agility for
cloud environments.
XIV Storage Systems extend ease of use with integrated management for large and multi-site
XIV deployments, reducing operational complexity and enhancing capacity planning. For
more information, see the following website:
http://www.ibm.com/systems/storage/disk/xiv/index.html
IBM System Storage DS8000

The IBM System Storage DS8800 storage system is a high-performance, high-capacity, and
secure storage system that delivers the highest levels of performance, flexibility, scalability,
resiliency, and total overall value for the most demanding, heterogeneous storage
environments. The storage system can manage a broad scope of storage workloads that
exist in today’s complex data center, doing it effectively and efficiently.

Additionally, the IBM System Storage DS8000® storage system includes a range of features
that automate performance optimization and application quality of service, and also provide
the highest levels of reliability and system uptime. For more information, see the following
website:
http://www.ibm.com/systems/storage/disk/ds8000/index.html
2.11 Operating system support

The Power AC922 server supports Linux, which provides a UNIX-like implementation across
many computer architectures.
For more information about the software that is available on Power Systems servers, see the
Linux on Power Systems website:
http://www.ibm.com/systems/power/software/linux/index.html
The Linux operating system is an open source, cross-platform OS. It is supported on every
Power Systems server IBM sells. Linux on Power Systems is the only Linux infrastructure that
offers both scale-out and scale-up choices.
2.11.1 Ubuntu
Ubuntu Server 16.04.03 LTS and any subsequent updates are supported. For more
information about Ubuntu for POWER9, see the following website:
https://www.ubuntu.com/download/server
2.11.2 Red Hat Enterprise Linux

Red Hat Enterprise Linux 7.4 for Power LE (POWER9) and any subsequent updates are
supported.
Starting with Red Hat Enterprise Linux 7.1, Red Hat provides separate builds and licenses for
big endian and little endian versions for Power. For more information about RHEL for
POWER9, see the following website:
https://access.redhat.com/ecosystem/hardware/2689861
2.11.3 Additional information

For more information about the IBM PowerLinux™ Community, see the following website:
https://www.ibm.com/developerworks/group/tpl
For more information about the features and external devices that are supported by Linux,
see the following website:
http://www.ibm.com/systems/power/software/linux/index.html

2.12 Java
When running Java applications on the POWER9 processor, the prepackaged Java that is
part of a Linux distribution is designed to meet the most common requirements. If you require
a different level of Java, there are several resources available.
For current information about IBM Java and tested Linux distributions, see the following
website:
https://www.ibm.com/developerworks/java/jdk/linux/tested.html
For additional information about the OpenJDK port for Linux on PPC64 LE and pregenerated
builds, see the following website:
http://cr.openjdk.java.net/~simonis/ppc-aix-port/
Launchpad.net has resources for Ubuntu builds. For more information, see the following
webistes:
https://launchpad.net/ubuntu/+source/openjdk-9


Chapter 3. Physical infrastructure

The objective of this section is to summarize all the physical infrastructure requirements
regarding the IBM Power System AC922 servers.
Additional information can be found in the Knowledge Center for the Power AC922 server.

3.1 Operating environment

Table 3-1 provides the operating environment specifications for the Power AC922 server.
Table 3-1 Operating environment for the 4-GPU 8335-GTG Power AC922 server
Server operating environment
Description Operating Non-operating
Temperatureab Allowable: 5 - 40°C 1 - 60°C

(41 - 104°F) (34 - 140°F)
Recommended: 18 - 27 °C
(64 - 80.6 °F)
Relative humidity 8 - 80% 8 - 80%

Recommeded: 60%
Maximum dew point 24°C (75° F) 27°C (80°F)

Recommended: 15°C (59° F)
Operating voltage 200 - 240 V AC N/A
Operating frequency 50 - 60 Hz +/- 3 Hz N/A
Power consumption 2300 watts maximum N/A
Power source loading 2.6 kVA maximum N/A
Thermal output 8872 BTU/hr maximum N/A
Maximum altitude 3050 m N/A

(10,000 ft.)
Noise level and sound power 7.6/6.7 bels operating/idling N/A

a. Derate maximum allowable dry-bulb temperature 1°C (1.8°F) per 175 m above 950 m. IBM
recommends a temperature range of 18°C - 27°C (64°F - 80.6°F)
b. For model 8335-GTV, heavy workloads might see performance degradation above 30°C
(86°F), 900 m (2953 ft.), or both if internal temperatures result in a central processing unit
(CPU) or graphics processing unit (GPU) clock reduction.
Figure 3-1 shows the flow rate of water that is required based on the inlet temperature of the
water to the rack for a single system.

Figure 3-1 Water flow rate versus temperature
Figure 3-2 provides data on water flow versus pressure drop as a function of the number of
systems in a rack. The facility rack level pressure drop includes the following pressure drops:
򐂰 Supply side Eaton ball valve quick connect pair
򐂰 Supply side 1 in. ID, 6 in. long Hose going to the supply manifold
򐂰 Supply side manifold
򐂰 8335-GTW node
򐂰 Return side manifold
򐂰 Return side 1 in. ID, 6 in. long hose leaving the return manifold
򐂰 Return side Eaton ball valve quick connect pair
Chapter 3. Physical infrastructure 41

Figure 3-2 Water flow rate versus pressure drop
Tip: The maximum measured value is expected from a fully populated server under an
intensive workload. The maximum measured value also accounts for component tolerance
and operating conditions that are not ideal. Power consumption and heat load vary greatly
by server configuration and usage. Use the IBM Systems Energy Estimator to obtain a
heat output estimate that is based on a specific configuration. The estimator is available at
the following website:
http://www-912.ibm.com/see/EnergyEstimator
3.2 Physical package

Table 3-2 shows the physical dimensions of the chassis. The server is available only in a
rack-mounted form factor and requires 2U (2 EIA units) of rack space.
Table 3-2 Physical dimensions for the Power AC922 server

Dimension The Power AC922 server models 8335-GTG and
8335-GTW
Width 441.5 mm (17.4 in.)

Dimension The Power AC922 server models 8335-GTG and

8335-GTW
Depth 845.0 mm (33.3 in.)
Height 86.0 mm (3.4 in.)
Weight (maximum configuration) 30 kg (65 lbs.)
3.3 System power

The Power AC922 server is powered by two 2200W power supplies located in the rear of the
unit.
The power supplies are designed to provide redundancy in case of a power supply failure.
Once GPUs are the largest power consuming devices in the server, depending on the
configuration and utilization, throttling may occur in case of a power supply failure when six
GPUs are installed. In this case system remain operational but may experience reduced
performance until the power supply is replaced.
The power supplies on the server use a new Rong Feng 203P-HP connector. A new power
cable to connect the power supplies to the PDUs in the rack is required rending the reuse of
existing power cables not viable. The PDU connector type (IEC C20 or IEC C19) depends on
the selected rack PDU. An example of the power cable with its connectors can be seen in
Figure 3-3.

Figure 3-3 Power AC922 power cables with the Rong Feng connector
Both 1-phase and 3-phase PDUs are supported. For more information see 3.5.2, “AC power
distribution units” on page 52.
When opting for 3-phase 60A PDUs, a total of 4 PDUs will be required to support a full rack
with 18 Power AC922 servers configured with four GPUs. If 1-phase PDUs are selected a
minimum of 5 PDUs are required to support a full rack of 18 Power AC922 servers with four
GPUs configuration. Once the 1-phase PDUs are limited to 48A, no more than four Power
AC922 servers can be connected to a single PDU.
3.4 System cooling

Air or water cooling depends on the model of the server and the GPUs feature codes
selected. See 2.5.5, “Compute intensive accelerator” on page 32 for a list of available GPUs.

Rack requirement: The IBM 7965-S42 rack with feature #ECR3 or #ECR4 installed
supports the water cooling option for the Power AC922 server (see “Optional water
cooling” on page 49).
When using water cooled systems, the customer is responsible for providing the system that
supplies the chilled conditioned water to the rack. Water condensation can occur in certain
combinations of temperature and relative humidity, which define the dew point. The system
that supplies the cooling water must be able to measure the room dew point and
automatically adjust the water temperature several degrees above dew point. Otherwise, the
water temperature must be above the maximum dew point for that data center installation.
Typical primary chilled water is too cold for use in this application because building chilled
water can be as cold as 4°C - 6°C (39°F - 43°F).
In air cooled systems (8335-GTG), all components are air cooled, including processors and
GPUs that use heatsinks. A picture with the internal view of the server and the two processors
and four GPUs heatsinks installed can be seen in Figure 3-4.

Figure 3-4 Power AC922 air cooled model internal view
In water cooled systems (8335-GTW) processors and GPUs are cooled using water while
other components like memories, PCIe adapters and power supplies are cooled using
traditional air cooling systems. Cold plates to cool two processor modules and up to six GPUs
are shipped. Water lines carrying cool water in and warm water out are also shipped. This
feature is installed in the system unit when the server is manufactured and is not installed in
the field.

When ordering the Power AC922 model 8335-GTW a cooling kit is required. It contain the
pipes, coldplates and splitters required to cool the system. The feature code #XXX will
provide with the internal cooling system of the server as shows in Figure 3-5.

Figure 3-5 Internal cooling components for the 8335-GTW server
A view of the cooling system installed in the server can be seen in Figure 3-6.

Figure 3-6 Internal cooling installed in a 8335-GTW model server

A detailed view on a processor and three GPUs cooling can be seen in Figure 3-7.

Figure 3-7 Processor and GPU water cooling details
Water enters the system and passes through a splitter block, where the water goes in two
different flowpaths. In each flowpath the water flows first through the cpu coldplate and then
through the gpu coldplate. Then, the warm water enters a return line splitter block and out the
server. A picture showing cold water in blue and warm water in red can be seen in Figure 3-8.

Figure 3-8 Cold and warm water flow through the Power AC922 system
When shipped from IBM, an air-cooled server cannot be changed into a water-cooled server;
and a water-cooled server cannot be changed into an air-cooled server.
Customer setup is not supported for water-cooled systems.
The GPU air-cooled and water-cooled servers have the following ordering differences:
򐂰 With an air-cooled server, (8335-GTG), an initial order can be ordered with two GPUs or
four GPUs feature #EC4J.

򐂰 With a water-cooled server (8335-GTW), a quantity of four or six feature #EC4H GPUs
must be ordered.
Note: Power AC922 model 8335-GTW only offers the fixed rail kit option. Ordering this
model with slide rails is not supported. Maintenance of components other than power
supplies and fans must be done on a bench with the server unplugged from the cooling
system.
For more information about the water cooling option, see the following website:
http://www.ibm.com/support/knowledgecenter/POWER8/p8had/p8had_83x_watercool.htm
3.5 Rack specifications

Depending on the model chosen for the Power AC922 server, there are different racks
supported. While the air cooled model (8335-GTG) supports a wide variety of racks,
water-cooled options have just one rack supported as can be seen in Table 3-3:
Table 3-3 Supported racks by model

Type-Model Description Supported by Supported by
8335-GTG 8335-GTW
7014-T00 IBM 7014 Rack Model T00 Yes No
7014-T42 IBM 7014 Rack Model T42 Yes No
7014-S25 IBM Entry Rack Cabinet Model S25 Yes No
7965-94Y IBM 42U Slim Rack Yes No
7965-S42 IBM Enterprise Slim Rack Yes Yes
N/A OEM 19” Rack See 3.5.4, “OEM No

racks” on page 55
Note: Due to the water cooling system, model 8335-GTW server only mounts in the 42U
IBM Enterprise Slim Rack (7965-S42).
These racks are built to the 19 inch EIA 310D standard.
Order information: The Power AC922 server cannot be integrated into these racks during
the manufacturing process and are not ordered together with servers. If the server and any
of the supported IBM racks are ordered together, they are shipped at the same time in the
same shipment but in separate packing material. IBM does not offer integration of the
server into the rack before shipping.
If a system is installed in a rack or cabinet that is not an IBM rack, ensure that the rack meets
the requirements that are described in 3.5.4, “OEM racks” on page 55.

Responsibility: The client is responsible for ensuring that the installation of the drawer in
the preferred rack or cabinet results in a configuration that is stable, serviceable, safe, and
compatible with the drawer requirements for power, cooling, cable management, weight,
and rail security.
3.5.1 IBM Enterprise Slim Rack 7965-S42

The new 2.0-meter (79-inch) Model 7965-S42 is compatible with past and present Power
Systems servers and provides an excellent 19-inch rack enclosure for your data center.
This is a 19" rack cabinet that provides 42U of rack space for use with rack-mounted,
non-blade servers, and I/O drawers. Its 600 mm (23.6 in.) width combined with its 1070 mm
(42.1 in.) depth plus its 42 EIA enclosure capacity provides great footprint efficiency for your
systems and allows it to be easily placed on standard 24-inch floor tiles, allowing for better
thermal and cable management capabilities.
Another difference between the 7965-S42 model rack and the 7014-T42 model rack is that
the “top hat” is on the 40U and 41U boundary instead of the 36U and 37U boundary in the
7014-T42 model.
The IBM power distribution units (PDU) are mounted vertically in four (4) side bays, two (2) on
each side. After the side bays have been filled, PDUs can be mounted horizontally at the rear
of the rack. For more information on IBM PDUs, please see 3.5.2, “AC power distribution
units” on page 52
To allow maximum airflow through the datacenter and the rack cabinets, filler panels are
mounted in the front of the rack in empty EIA locations and the rack offers perforated front and
rear door designs. A front view of the 7965-S42 rack can be seen in Figure 3-9.

Figure 3-9 IBM 7965-S42 racks front view
Ballasts for additional stability will be available therefore it is expected that the 7965-S42
racks will not require the depopulate rules above the 32 EIA location as required with
7014-T42 rack models.
Optional water cooling

When opting for model 8335-GTW of the Power AC922 server, water cooling is mandatory
and therefore the 7965-S42 rack must be ordered with the water cooling option (features
#ECR3 or #ECR4). There are no MES for these features in the field.
These features represents a manifold for water cooling and provides water supply and water
return for one to 20 servers mounted in a 7965-S42 Enterprise Slim Rack.
The feature #ECR3 indicates the manifold with water input and output at the top of the rack.
The feature #ECR4 can be used to order the manifold with water input and output at the
bottom of the rack. Since the hose exits may require some space inside the rack, it is
recommended that a 2U space must be left vacant on the top or bottom of the rack depending
on the location of the hoses chosen. Figure 3-10 shows both options of water input and
output.


Figure 3-10 Top and bottom water input and output for 7965-S42 rack
Figure 3-11 shows a datacenter rock if 7965-S42 racks with water input and output at the top
of the rack.

Figure 3-11 Datacenter rack row with water input and output at the top of racks
The manifold is mounted on the right side of the rack as viewed from the rear and extends for
40U. The manifold does not interfere with the placement of servers or other I/O drawers.
Quick connect fittings are located every 2U on the manifold for water supply and return
providing 20 pairs of fittings.
Figure 3-12 on page 52 shows a manifold for the 7965-S42 rack.

Figure 3-12 Manifold for the 7965-S42 rack
The servers are connected to the manifold through quick-connects. Supply and return hoses
from the manifold to the server are provided as part the server cooling feature.
The manifold has one cold water inlet that leads to the rack and one warm water outlet. Two
4.25 m (14-foot) hose kits are provided with the manifold to connect water supply and return.
Outer diameter of the hoses is approximately 34.5 mm (1.36 in).
You must provide a 1-inch ID barb fitting to attach your facility to the hose kit for each hose.
Only clean, filtered, chemically treated water must be used, not generic building water.
For more information, see the site and hardware planning documentation at:
https://www.ibm.com/support/knowledgecenter/POWER8/p8had/p8had_7965s_watercool.htm
Important: Avoid vertically mounted PDUs on the right side as viewed from the rear of the
rack. The manifold makes access to PDU impossible. Use either horizontally mounted
PDUs, or use vertically mounted PDUs on the left side of the rack.
3.5.2 AC power distribution units

AC power distribution is done via power distribution units (PDUs).

PDUs include the AC power distribution unit #7188 and the AC Intelligent PDU+ #7109. The
Intelligent PDU+ is identical to #7188 PDUs, but it is equipped with one Ethernet port, one
console serial port, and one RS232 serial port for power monitoring.
The PDUs have 12 client-usable IEC 320-C13 outlets. Six groups of two outlets are fed by six
circuit breakers. Each outlet is rated up to 10 amps, but each group of two outlets is fed from
one 15 amp circuit breaker.
Four PDUs can be mounted vertically in the back of the T00 and T42 racks. Figure 3-13
shows the placement of the four vertically mounted PDUs. In the rear of the rack, two
additional PDUs can be installed horizontally in the T00 rack and three in the T42 rack. The
four vertical mounting locations are filled first in the T00 and T42 racks. Mounting PDUs
horizontally consumes 1U per PDU and reduces the space that is available for other racked
components. When mounting PDUs horizontally, the preferred approach is to use fillers in the
EIA units that are occupied by these PDUs to facilitate the correct airflow and ventilation in the
rack.
Rack Rear View
Circuit breaker reset

Status LED
3 4
1 2
Figure 3-13 PDU placement and PDU view

The PDU receives power through a UTG0247 power-line connector. Each PDU requires one
PDU-to-wall power cord. Various power cord features are available for various countries and
applications by varying the PDU-to-wall power cord, which must be ordered separately. Each
power cord provides the unique design characteristics for the specific power requirements. To
match new power requirements and save previous investments, these power cords can be
requested with an initial order of the rack or with a later upgrade of the rack features.
Table 3-4 shows the available wall power cord options for the PDU and iPDU features, which
must be ordered separately.
Table 3-4 Wall power cord options for the PDU and iPDU features
Feature Wall plug Rated voltage Phase Rated amperage Geography

code (Vac)
6653 IEC 309, 230 3 16 amps/phase Internationally available

3P+N+G, 16A
6489 IEC309 230 3 32 amps/phase EMEA

3P+N+G, 32A
6654 NEMA L6-30 200 - 208, 240 1 24 amps US, Canada, LA, and Japan
6655 RS 3750DP 200 - 208, 240 1 24 amps US, Canada, LA, and Japan
(watertight)
6656 IEC 309, 230 1 24 amps EMEA

P+N+G, 32A
6657 PDL 230 - 240 1 32 amps Australia and New Zealand
6658 Korean plug 220 1 30 amps North and South Korea
6492 IEC 309, 2P+G, 200 - 208, 240 1 48 amps US, Canada, LA, and Japan
60A
6491 IEC 309, P+N+G, 230 1 63 amps EMEA

63A
Notes: Ensure that the correct power cord feature is configured to support the power that
is being supplied. Based on the power cord that is used, the PDU can supply 4.8 - 19.2
kVA. The power of all of the drawers that are plugged into the PDU must not exceed the
power cord limitation.
The Universal PDUs are compatible with previous models.
To better enable electrical redundancy, each server has two power supplies that must be
connected to separate PDUs, which are not included in the base order.
For maximum availability, a preferred approach is to connect power cords from the same
system to two separate PDUs in the rack, and to connect each PDU to independent power
sources.
For detailed power requirements and power cord details about the 7014 racks, see IBM
Knowledge Center:
http://www.ibm.com/support/knowledgecenter/api/redirect/powersys/v3r1m5/topic/p7ha
d/p7hadrpower.htm

For detailed power requirements and power cord details about the 7965-94Y rack, see IBM
Knowledge Center:
http://www.ibm.com/support/knowledgecenter/api/redirect/powersys/v3r1m5/topic/p7ha
d/p7hadkickoff795394x.htm
3.5.3 Rack-mounting rules

Consider the following primary rules when you mount the system into a rack:
򐂰 The system can be placed at any location in the rack. For rack stability, start filling a rack
from the bottom.
򐂰 Any remaining space in the rack can be used to install other systems or peripheral devices
if the maximum permissible weight of the rack is not exceeded and the installation rules for
these devices are followed.
򐂰 Before placing the system into the service position, be sure to follow the rack
manufacturer’s safety instructions regarding rack stability.
3.5.4 OEM racks

The system can be installed in a suitable OEM rack if that the rack conforms to the EIA-310-D
standard for 19-inch racks. This standard is published by the Electrical Industries Alliance. For
more information, see IBM Power Systems Hardware IBM Knowledge Center:
http://www.ibm.com/support/knowledgecenter/api/redirect/systems/scope/hw/index.jsp
The website mentions the following key points:

򐂰 The front rack opening must be 450 mm wide ± 0.75 mm (17.72 in. ± 0.03 in.).Figure 0-1
on page 56 is a top view that shows the specification dimensions.

Figure 0-1 Top view of rack specification dimensions (not specific to IBM)
򐂰 The rail-mounting holes must be 465 mm ± 0.8 mm (18.3 in. ± 0.03 in.) apart on-center
(horizontal width between the vertical columns of holes on the two front-mounting flanges
and on the two rear-mounting flanges) as seen on Figure 3-14 on page 57.

Figure 3-14 Mounting flange dimensions
򐂰 The vertical distance between the mounting holes must consist of sets of three holes
spaced (from bottom to top) 15.9 mm (0.625 in.), 15.9 mm (0.625 in.), and 12.67 mm (0.5
in.) on-center, which makes each three-hole set of vertical hole spacing 44.45 mm
(1.75 in.) apart on center. Rail-mounting holes must be 7.1 mm ± 0.1 mm (0.28 in. ±
0.004 in.) in diameter. Figure 0-2 on page 58 shows the top front specification dimensions.

Figure 0-2 Rack specification dimensions top front view
򐂰 A minimum rack opening width of 500 mm (19.69 in.) for a depth of 330 mm (12.99 in.) is
needed behind the installed system for maintenance, service and cable management.
Recommended depth is at least 254 mm (10 in.) of depth within the rack from the rear rack
mount flange to the frame line as shows in Figure 3-15 on page 59.

Figure 3-15 OEM rack opening depth


Draft Document for Review March 5, 2018 1:24 pm 5472bibl.fm
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this paper.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topic in this
document. Note that some publications referenced in this list might be available in softcopy
only.
򐂰 IBM Power System S822LC for High Performance Computing Introduction and Technical
Overview, REDP-5405
򐂰 IBM PowerAI: Deep Learning Unleashed on IBM Power Systems, SG24-8409
You can search for, view, download or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
򐂰 OpenPOWER Foundation
https://openpowerfoundation.org/
򐂰 NVIDIA Tesla V100
https://www.nvidia.com/en-us/data-center/tesla-v100/
򐂰 NVIDIA Tesla V100 Performance Guide
http://images.nvidia.com/content/pdf/volta-marketing-v100-performance-guide-us-
r6-web.pdf
򐂰 IBM Portal for OpenPOWER - POWER9 Monza Module
https://www-355.ibm.com/systems/power/openpower/tgcmDocumentRepository.xhtml?al
iasId=POWER9_Monza
򐂰 OpenCAPI
http://opencapi.org/technical/use-cases/
Help from IBM

IBM Support and downloads
ibm.com/support
IBM Global Services

ibm.com/services

5472bibl.fm Draft Document for Review March 5, 2018 1:24 pm

Back cover
Draft Document for Review March 5, 2018 1:28 pm
REDP-5472-00
ISBN DocISBN
Printed in U.S.A.
®
ibm.com/redbooks

IBM Power System AC922 Introduction and Technical Overview: Paper

Uploaded by

Copyright:

Available Formats

IBM Power System AC922 Introduction and Technical Overview: Paper

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

IBM Power System AC922 Introduction and Technical Overview: Paper

Uploaded by

Copyright:

Available Formats

Front cover

Draft Document for Review March 5, 2018 1:24 pm REDP-5472-00

IBM Power System AC922

Alexandre Bicas Caldeira

International Technical Support Organization

IBM Power System AC922 Introduction and Technical

First Edition (March 2018)

© Copyright International Business Machines Corporation 2018. All rights reserved.

Chapter 1. Product summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Chapter 2. System architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

© Copyright IBM Corp. 2018. All rights reserved. iii

Chapter 3. Physical infrastructure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

iv IBM Power System AC922 Introduction and Technical Overview

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS”

© Copyright IBM Corp. 2018. All rights reserved. v

The following terms are trademarks of other companies:

vi IBM Power System AC922 Introduction and Technical Overview

© Copyright IBM Corp. 2018. All rights reserved. vii

The project that produced this publication was managed by:

Thanks to the following people for their contributions to this project:

Now you can become a published author, too!

Stay connected to IBM Redbooks

viii IBM Power System AC922 Introduction and Technical Overview

򐂰 Look for us on LinkedIn:

x IBM Power System AC922 Introduction and Technical Overview

Chapter 1. Product summary

© Copyright IBM Corp. 2018. All rights reserved. 1

1.1 Key server features

The system includes several features to improve performance:

2 IBM Power System AC922 Introduction and Technical Overview

– 125 TFLOPs per GPU for deep learning

% & '*-$( 0 6'%"'

 , ! ',$(

Figure 1-2 Location of server main components

Chapter 1. Product summary 3

1.2 Server models

Table 1-1 Summary of Power AC922 server available models

8335-GTG 2 1 TB 4 Air cooled

8335-GTW 2 1 TB 6 Water cooled

1.2.1 Power AC922 server model 8335-GTG

4 IBM Power System AC922 Introduction and Technical Overview

1.2.2 Power AC922 server model 8335-GTW

Chapter 1. Product summary 5

1.2.3 Minimum features

6 IBM Power System AC922 Introduction and Technical Overview

Chapter 1. Product summary 7

8 IBM Power System AC922 Introduction and Technical Overview

Chapter 2. System architecture

© Copyright IBM Corp. 2018. All rights reserved. 9

2.1 System architecture

 "#   ! 

10 IBM Power System AC922 Introduction and Technical Overview

Figure 2-2 Integrated SATA connector

Chapter 2. System architecture 11

 *&)("$ 00)2   

 0)*  0)*  0)*

Figure 2-3 POWER9 chip external connectivity

12 IBM Power System AC922 Introduction and Technical Overview

(!/&.&'0!2'00)".&1&.4 (!/&.&'0!2'00)".&1&.4  '00)".&1&.4

% & '*-$( 0 6'%"'

, ! ',$(

"# !

*&)("$ 00)2

0)* 0)* 0)*

(!/&.&'0!2'00)".&1&.4 (!/&.&'0!2'00)".&1&.4 '00)".&1&.4

4.)+ 4.)+ 4*.)+

* * *

('")'( ('")'(

'2.! '2.! '2.! '2.! '2.! '2.!

&002 '4."+&.&+&Y"! '1"( '4."+&.&+&Y"!

%"0."( '00" . %"0."( '00" .

%"0."( '00" . %"0."( '00" .