Monday 15 September 2014

Setting up a two-node HPC cluster with InfiniBand

In this post, I will share how to build a two-node cluster with an InfiniBand (IB) interconnect.

I installed CentOS 7.0 (the "Server with GUI" option) with IB and iWARP support.

To install the necessary software

Run the following on every node unless stated otherwise. The commands need root privileges.

$ yum groupinstall "Infiniband Support"
$ yum install infiniband-diags perftest qperf opensm

OpenSM is the subnet manager.
 
Extra steps (optional):

Edit /etc/default/opensm so that it lists the port GUIDs:
$ cat /etc/default/opensm
PORTS="0x00117500007005aa 0x0011750000700c2a"

The port GUIDs can be obtained with:
$ ibstat -p
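
To avoid copying GUIDs by hand, here is a minimal sketch that turns the output of ibstat -p into a PORTS line. The function name format_ports is my own and not part of any package, and it assumes ibstat -p prints one 0x-prefixed GUID per line. Note that ibstat -p only lists the ports of the local node, so check the result before putting it into /etc/default/opensm.

```shell
# Hypothetical helper (not part of infiniband-diags): read port GUIDs on
# stdin, one per line, and print a PORTS="..." line for /etc/default/opensm.
format_ports() {
    # Keep only lines that look like 0x-prefixed hex GUIDs and join them
    # with single spaces.
    guids=$(grep -E '^0x[0-9a-fA-F]+$' | paste -s -d ' ' -)
    printf 'PORTS="%s"\n' "$guids"
}
```

Usage, once the function is defined in your shell: ibstat -p | format_ports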

Activate the services:
$ chkconfig rdma on
$ chkconfig opensm on #only on master node

$ service rdma start
$ service opensm start #only on master node
$ shutdown -r now

After the services are started or the node is rebooted, ibstat should show State "Active" and Physical state "LinkUp".
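
The check on the ibstat output can be scripted. The sketch below is only an illustration: ib_link_ready is a name I made up, and it assumes the ibstat output contains lines of the form "State: ..." followed by "Physical state: ..." for each port.

```shell
# Hypothetical helper: read `ibstat` output on stdin and succeed (exit 0)
# only if at least one port reports State "Active" together with
# Physical state "LinkUp".
ib_link_ready() {
    awk '
        # Remember the logical state of the port currently being parsed.
        /^[ \t]*State:/          { state = $2 }
        # The physical state line follows; both values must match.
        /^[ \t]*Physical state:/ { if (state == "Active" && $3 == "LinkUp") ok = 1 }
        END { exit !ok }
    '
}
```

Usage, once the function is defined in your shell: ibstat | ib_link_ready && echo "IB link is up"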

To check network connectivity

Display all switches
$ ibswitches

Display all hosts visible in the network
$ ibhosts

Report link information
$ iblinkinfo
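
For a two-node cluster, a quick sanity check is to count the hosts that ibhosts reports. This is a rough sketch: count_ib_hosts is a hypothetical name, and it assumes each host appears as one line starting with "Ca" in the ibhosts output.

```shell
# Hypothetical helper: read `ibhosts` output on stdin and print the number
# of channel adapters (lines starting with "Ca").
count_ib_hosts() {
    grep -c '^Ca'
}
```

Usage, once the function is defined in your shell: ibhosts | count_ib_hosts
With both nodes up and connected, I would expect this to print 2.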

Testing with ibping
$ ibping -S #on one server
$ ibping -G 0x0011750000700c2a #on another server

Replace the port GUID above with the port GUID of the node running ibping -S.
