Technical specifications of NIWA’s HPCF

The specifications of our initial system, installed in 2010, and of the machine after its 2013 upgrade.

Initial system (2010)

Hardware

  • IBM p575/p6 supercomputer with 58 POWER6 32-way 4.7 GHz nodes, for a total of 1856 processors and 5.5 terabytes of memory (28 nodes with 128 GB and 30 with 64 GB; the totals are checked in the short arithmetic after this list)
  • Four 144-port QLogic InfiniBand switches
  • The bandwidth within each 32-way node is as high as 80 GB/s. Nodes communicate with each other at up to 32 GB/s (i.e. 8 × DDR InfiniBand links per node) with a latency of around 4 μs (MPI) for short messages; a sketch of how such a figure is measured follows this list
  • 482.5 terabytes of usable disk storage on 740 disks in an IBM DCS9900 storage array
  • Two IBM TS3500 automated tape libraries with 12 LTO-5 drives, each library capable of storing 2.5 petabytes of data (uncompressed). One tape library is for disaster recovery and contains copies of all the data on the primary library
  • Eight p520/p6 servers for HPC Management, GPFS (General Parallel File System), and TSM (Tivoli Storage Manager) functions
  • A p520/p6 Login Node for external (NeSI) users
  • A BladeCenter with 56 Xeon 2.53 GHz processors and 224 gigabytes of memory for pre- and post-processing tasks
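
The headline totals follow directly from the per-node figures:

    58 nodes × 32 processors = 1856 processors
    28 × 128 GB + 30 × 64 GB = 5504 GB ≈ 5.5 TB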
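
The ~4 μs figure above is the kind of number a two-task ping-pong benchmark produces. The C sketch below shows the standard technique: time many short round trips and halve the mean. It is an illustration, not NIWA's benchmark code; the message size, the repetition count, and the mpcc/poe build-and-run commands mentioned in the header are assumptions about a typical AIX Parallel Environment setup.

    /* pingpong.c – estimate short-message MPI latency between two tasks.
     * Build (e.g.):  mpcc pingpong.c -o pingpong
     * Run on two tasks placed on different nodes, e.g. under POE:
     *   poe ./pingpong -procs 2
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const int reps = 10000;   /* round trips to average over */
        char msg[8] = {0};        /* 8-byte message: latency-dominated */
        int rank, size, i;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0)
                fprintf(stderr, "run with at least 2 MPI tasks\n");
            MPI_Finalize();
            return 1;
        }

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(msg, sizeof msg, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(msg, sizeof msg, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(msg, sizeof msg, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(msg, sizeof msg, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)
            /* one-way latency is half the mean round-trip time */
            printf("one-way latency: %.2f microseconds\n",
                   (t1 - t0) / reps / 2.0 * 1e6);

        MPI_Finalize();
        return 0;
    }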

HPC Software Stack

  • AIX operating system on all p520 servers and p575 nodes
  • General Parallel File System (GPFS) – accessible from all HPCF processors, both POWER6 and BladeCenter x-series
  • Tivoli Storage Manager (TSM) to provide backup and transparent movement of data between storage media (Hierarchical Storage Management)
  • LoadLeveler – to manage and schedule work on the HPCF (an example job script follows this list)
  • XL Fortran, C and C++ compilers
  • IBM High Performance Computing Toolkit
  • TotalView
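
Batch work reaches the machine as LoadLeveler command files. The script below is a minimal sketch of a parallel job filling two of the 32-way nodes; the job name, resource limits and model executable are placeholders rather than NIWA's actual settings.

    # @ job_name         = example_run
    # @ job_type         = parallel
    # @ node             = 2
    # @ tasks_per_node   = 32
    # @ wall_clock_limit = 00:30:00
    # @ output           = $(job_name).$(jobid).out
    # @ error            = $(job_name).$(jobid).err
    # @ queue
    ./model

A script like this is submitted with llsubmit and monitored with llq.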

Physical Infrastructure

  • 370 kW to run the system
  • 120 kW for cooling via chilled water – delivered both directly and indirectly via Computer Room Air Conditioners
  • Weight: 12.6 tonnes, including water

Upgraded system (2013)

FitzRoy was upgraded to the following configuration:

  • IBM p575/p6 supercomputer with 106 POWER6 32-way 4.7 GHz nodes, for a total of 3392 processors and 8.1 terabytes of memory (28 nodes with 128 GB and 78 with 64 GB of memory)
  • 744.5 terabytes of usable disk storage on 1040 disks in two IBM DCS9900 storage arrays
  • Eight 144-port QLogic InfiniBand switches
  • Eight p520/p6 GPFS servers
  • Four x3550 IB Management servers
  • Approximately 8 PB of magnetic tape storage, accessible to users via Hierarchical Storage Management (a sketch for spotting files migrated to tape follows this list)
  • V7000 SSD subsystem for GPFS Metadata
  • A BladeCenter with 113 Xeon 2.53 GHz processors and 448 gigabytes of memory for pre- and post-processing tasks
  • A PureFlex system with 60 Xeon E5-2690 v2 processors, one NVIDIA Tesla K40 GPGPU and 1.4 TB of memory for pre- and post-processing tasks
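
Under Hierarchical Storage Management, cold files are migrated to tape transparently, leaving a small stub on disk; the first read triggers a recall from tape. A common heuristic for spotting a migrated file is to compare its logical size with the disk space actually allocated to it. The C sketch below does this with plain POSIX stat(); it is a generic illustration, not a documented TSM interface, and the half-size threshold is an arbitrary choice.

    /* hsmcheck.c – guess whether a file has been migrated to tape
     * by comparing its logical size with its allocated blocks. */
    #include <stdio.h>
    #include <sys/stat.h>

    int main(int argc, char **argv)
    {
        struct stat st;
        long long allocated;

        if (argc != 2 || stat(argv[1], &st) != 0) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }

        /* st_blocks counts 512-byte units on AIX and Linux */
        allocated = (long long)st.st_blocks * 512;

        printf("logical size: %lld bytes, allocated: %lld bytes\n",
               (long long)st.st_size, allocated);

        if (st.st_size > 0 && allocated < st.st_size / 2)
            printf("probably migrated to tape (stub file)\n");
        else
            printf("probably resident on disk\n");
        return 0;
    }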

Barometer Test System

  • Barometer is also an IBM p575/p6 system, with two POWER6 32-way 4.7 GHz nodes for a total of 64 processors and 128 gigabytes of memory
  • Two p520/p6 Login nodes
  • Token disk and tape storage
  • Two 144-port QLogic InfiniBand switches
  • Two p520/p6 GPFS servers
  • Two x3550 IB Management Servers

The HPCF is supported by:

  • 1 megawatt power system
  • 675 kW of water chilling (expanding to 1375 kW in 2014/15)
  • 840 kW backup generator
  • Automated Control System to manage all power, cooling, fire detection, suppressant discharge and related support systems