[LU-5421] MGS and MDT with dual ib port Created: 27/Jul/14  Updated: 28/Jul/14

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.2
Fix Version/s: None

Type: Task
Priority: Critical
Reporter: Atul Yadav
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

Lustre 2.5.2, CentOS 6.5, dual IB ports.
DRBD, Heartbeat


Epic/Theme: DUAl-ib-port, Lustre-2.5.2
Rank (Obsolete): 15077

 Description   

Dear Team,

We are looking for high availability (HA) that covers both IO node failure and IB port failure.

In our setup:
IO1 role: MGS, MDT and OST
IO2 role: MGS, MDT and OST

Two IB switches are each connected independently to the IO1 and IO2 servers.

IO1-ib0 192.168.2.101
IO1-ib1 192.168.2.102
IO2-ib0 192.168.3.101
IO2-ib1 192.168.3.102

LNET: options lnet networks=o2ib0(ib0),o2ib1(ib1)
Lustre setup:
Prebuilt server RPMs
IO1 and IO2 servers are on both core networks, o2ib0 and o2ib1.
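
For reference, an LNet NID has the form <address>@<network>, and with networks=o2ib0(ib0),o2ib1(ib1) each interface belongs to exactly one network. The sketch below derives the NIDs implied by the address table above. The helper names net_for_iface and nid_for are hypothetical and not part of Lustre. On a live node, `lctl list_nids` reports the NIDs actually configured and is the authoritative check.

```shell
# Hypothetical helper: look up the LNet network for an interface in a
# networks= string of the form "o2ib0(ib0),o2ib1(ib1)".
net_for_iface() {
    # $1 = networks string, $2 = interface name
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^\([^()]*\)($2)\$/\1/p"
}

# Hypothetical helper: build a NID as <address>@<network>.
nid_for() {
    # $1 = networks string, $2 = interface name, $3 = IP address
    net=$(net_for_iface "$1" "$2")
    [ -n "$net" ] || return 1
    printf '%s@%s\n' "$3" "$net"
}

NETWORKS='o2ib0(ib0),o2ib1(ib1)'
nid_for "$NETWORKS" ib0 192.168.2.101   # IO1-ib0 -> 192.168.2.101@o2ib0
nid_for "$NETWORKS" ib1 192.168.2.102   # IO1-ib1 -> 192.168.2.102@o2ib1
nid_for "$NETWORKS" ib0 192.168.3.101   # IO2-ib0 -> 192.168.3.101@o2ib0
nid_for "$NETWORKS" ib1 192.168.3.102   # IO2-ib1 -> 192.168.3.102@o2ib1
```

Under these assumptions, 192.168.2.102 is IO1's ib1 address and would carry @o2ib1, while 192.168.3.101 is IO2's ib0 address and would carry @o2ib0. The NIDs passed to --mgsnode and --failnode in this ticket may be worth comparing against this list.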

ssh IO2 "mkfs.lustre --reformat --fsname=lustre --mdt --mgs --index=0 --failnode=192.168.3.101@o2ib1 --failnode=192.168.2.102@o2ib0 --failnode=192.168.3.102@o2ib1 /dev/drbd0"

ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=0 \
--mgsnode=192.168.2.101@o2ib0 \
--mgsnode=192.168.3.101@o2ib1 \
--mgsnode=192.168.2.102@o2ib0 \
--mgsnode=192.168.3.102@o2ib1 \
--failnode=192.168.2.101@o2ib0 \
--failnode=192.168.3.101@o2ib1 \
/dev/mapper/ost-0"
ssh IO2 "mount -t lustre /dev/mapper/ost-0 /OST0"
ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=1 \
--mgsnode=192.168.2.101@o2ib0 \
--mgsnode=192.168.3.101@o2ib1 \
--mgsnode=192.168.2.102@o2ib0 \
--mgsnode=192.168.3.102@o2ib1 \
--failnode=192.168.2.101@o2ib0 \
--failnode=192.168.3.101@o2ib1 \
/dev/mapper/ost-1"
ssh IO2 "mount -t lustre /dev/mapper/ost-1 /OST1"

ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=2 \
--mgsnode=192.168.2.101@o2ib0 \
--mgsnode=192.168.3.101@o2ib1 \
--mgsnode=192.168.2.102@o2ib0 \
--mgsnode=192.168.3.102@o2ib1 \
--failnode=192.168.2.101@o2ib0 \
--failnode=192.168.3.101@o2ib1 \
/dev/mapper/ost-2"

ssh IO2 "mount -t lustre /dev/mapper/ost-2 /OST2"

# The failover will be IO2
ssh IO1 "mkfs.lustre --reformat --fsname=lustre --ost --index=3 \
--mgsnode=192.168.2.101@o2ib0 \
--mgsnode=192.168.3.101@o2ib1 \
--mgsnode=192.168.2.102@o2ib0 \
--mgsnode=192.168.3.102@o2ib1 \
--failnode=192.168.2.102@o2ib0 \
--failnode=192.168.3.102@o2ib1 \
/dev/mapper/ost-3"
ssh IO1 "mount -t lustre /dev/mapper/ost-3 /OST3"

ssh IO1 "mkfs.lustre --reformat --fsname=lustre --ost --index=4 \
--mgsnode=192.168.2.101@o2ib0 \
--mgsnode=192.168.3.101@o2ib1 \
--mgsnode=192.168.2.102@o2ib0 \
--mgsnode=192.168.3.102@o2ib1 \
--failnode=192.168.2.102@o2ib0 \
--failnode=192.168.3.102@o2ib1 \
/dev/mapper/ost-4"
ssh IO1 "mount -t lustre /dev/mapper/ost-4 /OST4"
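
The five OST format commands above differ only in the index, the device and the failover NIDs. A minimal sketch of generating them from lists, so that each NID is written once, could look like this. ost_format_cmd is a hypothetical helper that only prints the command and does not run mkfs.lustre. It uses only the options already shown in this ticket, and the NID values are copied unchanged from the commands above.

```shell
# Hypothetical helper: print (do not execute) an OST format command.
ost_format_cmd() {
    # $1 = OST index
    # $2 = space-separated MGS NIDs
    # $3 = space-separated failover NIDs
    cmd="mkfs.lustre --reformat --fsname=lustre --ost --index=$1"
    for nid in $2; do
        cmd="$cmd --mgsnode=$nid"
    done
    for nid in $3; do
        cmd="$cmd --failnode=$nid"
    done
    printf '%s /dev/mapper/ost-%s\n' "$cmd" "$1"
}

# NID lists as used for OST index 3 in this ticket.
MGS_NIDS='192.168.2.101@o2ib0 192.168.3.101@o2ib1 192.168.2.102@o2ib0 192.168.3.102@o2ib1'
FAIL_NIDS='192.168.2.102@o2ib0 192.168.3.102@o2ib1'

ost_format_cmd 3 "$MGS_NIDS" "$FAIL_NIDS"
```

Printing first and reviewing the result before running it makes it easier to spot a NID that is inconsistent across targets.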

Is the above configuration correct?

Thank you,
Atul Yadav



 Comments   
Comment by Atul Yadav [ 27/Jul/14 ]

After executing the commands below, we get the following error:
++++++++++++++++++++++++++++++++++++++
[root@IO1 ~]# modprobe -v lustre
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/net/lustre/libcfs.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lvfs.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/net/lustre/lnet.ko networks=o2ib0(ib0),o2ib1(ib1)
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/obdclass.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/ptlrpc.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/fld.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/fid.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/mdc.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/osc.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lov.ko
insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lustre.ko

mkfs.lustre --reformat --fsname=lustre --mdt --mgs --index=0 --failnode=192.168.3.101@o2ib1 --failnode=192.168.2.102@o2ib0 --failnode=192.168.3.102@o2ib1 /dev/drbd0

Permanent disk data:
Target: lustre:MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x65
(MDT MGS first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=192.168.3.101@o2ib1 failover.node=192.168.2.102@o2ib failover.node=192.168.3.102@o2ib1

device size = 380916MB
formatting backing filesystem ldiskfs on /dev/drbd0
target name lustre:MDT0000
4k blocks 97514583
options -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/drbd0 97514583
Writing CONFIGS/mountdata

[root@IO1 ~]# mount -v -t lustre /dev/drbd0 /MDT
arg[0] = /sbin/mount.lustre
arg[1] = -v
arg[2] = -o
arg[3] = rw
arg[4] = /dev/drbd0
arg[5] = /MDT
source = /dev/drbd0 (/dev/drbd0), target = /MDT
options = rw
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Writing CONFIGS/mountdata
mounting device /dev/drbd0 at /MDT, flags=0x1000000 options=osd=osd-ldiskfs,user_xattr,errors=remount-ro,mgs,virgin,update,param=failover.node=192.168.3.101@o2ib1,param=failover.node=192.168.2.102@o2ib,param=failover.node=192.168.3.102@o2ib1,svname=lustre-MDT0000,device=/dev/drbd0
mount.lustre: cannot parse scheduler options for '/sys/block/drbd0/queue/scheduler'
mount.lustre: mount /dev/drbd0 at /MDT failed: Cannot assign requested address retries left: 0
mount.lustre: mount /dev/drbd0 at /MDT failed: Cannot assign requested address

+++++++++++++++++++++++++++++++++
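
One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: the mount above was run on IO1, and one of the failover.node parameters (192.168.2.102@o2ib) carries an address that the table in the description assigns to IO1-ib1. The sketch below lists failover NIDs whose address part matches a local address. local_failover_nids is a hypothetical helper, and both lists are copied from this ticket.

```shell
# Hypothetical helper: print each failover NID whose address part equals
# one of the local IP addresses.
local_failover_nids() {
    # $1 = space-separated failover NIDs
    # $2 = space-separated local IP addresses
    for nid in $1; do
        addr=${nid%@*}            # strip the "@<network>" suffix
        for ip in $2; do
            [ "$addr" = "$ip" ] && printf '%s\n' "$nid"
        done
    done
    return 0
}

# failover.node values from the mkfs.lustre output, and IO1's addresses
# from the description.
local_failover_nids \
    '192.168.3.101@o2ib1 192.168.2.102@o2ib 192.168.3.102@o2ib1' \
    '192.168.2.101 192.168.2.102'
# prints 192.168.2.102@o2ib
```

On the node itself, the local side of this comparison can be taken from `lctl list_nids` instead of a hand-written list.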

Thank you,
Atul Yadav
