Details

    • Task
    • Resolution: Unresolved
    • Critical
    • None
    • Lustre 2.5.2
    • None
    • Lustre 2.5.2, CentOS 6.5, dual IB ports,
      DRBD, Heartbeat

    Description

      Dear Team,

      We are looking for high availability (HA) that covers both IO node failure and IB port failure.

      In our setup:
      IO1 Role MGS, MDT and OST
      IO2 Role MGS, MDT and OST

      Two independent IB switches are each connected to both the IO1 and IO2 servers.

      IO1-ib0 192.168.2.101
      IO1-ib1 192.168.2.102
      IO2-ib0 192.168.3.101
      IO2-ib1 192.168.3.102

      LNET: options lnet networks=o2ib0(ib0),o2ib1(ib1)
      Lustre setup: prebuilt server RPMs
      The IO1 and IO2 servers are on both core networks, o2ib0 and o2ib1.
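
A minimal sketch of how the `networks=` setting above pairs each interface with an LNet network, assuming a NID is written as address@network and that an address on a given interface belongs to the network bound to that interface. The helper names `net_for_iface` and `nid_for` are hypothetical and only print strings; on a real node, `lctl list_nids` reports the NIDs actually configured.

```shell
#!/bin/sh
# Hypothetical helper: find the LNet network bound to an interface in a
# networks= string such as "o2ib0(ib0),o2ib1(ib1)".
net_for_iface() {
    networks=$1
    iface=$2
    old_ifs=$IFS
    IFS=,
    for entry in $networks; do
        case $entry in
            *"($iface)")
                IFS=$old_ifs
                # Keep the part before the opening parenthesis.
                printf '%s\n' "$entry" | cut -d'(' -f1
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

# Hypothetical helper: build the NID string for an address on an interface.
nid_for() {
    net=$(net_for_iface "$1" "$2") || return 1
    printf '%s@%s\n' "$3" "$net"
}

NETWORKS='o2ib0(ib0),o2ib1(ib1)'

# Interface/address pairs as listed in this post.
nid_for "$NETWORKS" ib0 192.168.2.101   # IO1-ib0
nid_for "$NETWORKS" ib1 192.168.2.102   # IO1-ib1
nid_for "$NETWORKS" ib0 192.168.3.101   # IO2-ib0
nid_for "$NETWORKS" ib1 192.168.3.102   # IO2-ib1
```

Under these assumptions the ib1 addresses pair with o2ib1, while the ib0 addresses pair with o2ib0. It may be worth comparing that pairing with the NIDs passed to `--mgsnode` and `--failnode` in the commands in this post.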

      ssh IO2 "mkfs.lustre --reformat --fsname=lustre --mdt --mgs --index=0 --failnode=192.168.3.101@o2ib1 --failnode=192.168.2.102@o2ib0 --failnode=192.168.3.102@o2ib1 /dev/drbd0"

      ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=0 \
      --mgsnode=192.168.2.101@o2ib0 \
      --mgsnode=192.168.3.101@o2ib1 \
      --mgsnode=192.168.2.102@o2ib0 \
      --mgsnode=192.168.3.102@o2ib1 \
      --failnode=192.168.2.101@o2ib0 \
      --failnode=192.168.3.101@o2ib1 \
      /dev/mapper/ost-0"
      ssh IO2 "mount -t lustre /dev/mapper/ost-0 /OST0"
      ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=1 \
      --mgsnode=192.168.2.101@o2ib0 \
      --mgsnode=192.168.3.101@o2ib1 \
      --mgsnode=192.168.2.102@o2ib0 \
      --mgsnode=192.168.3.102@o2ib1 \
      --failnode=192.168.2.101@o2ib0 \
      --failnode=192.168.3.101@o2ib1 \
      /dev/mapper/ost-1"
      ssh IO2 "mount -t lustre /dev/mapper/ost-1 /OST1"

      ssh IO2 "mkfs.lustre --reformat --fsname=lustre --ost --index=2 \
      --mgsnode=192.168.2.101@o2ib0 \
      --mgsnode=192.168.3.101@o2ib1 \
      --mgsnode=192.168.2.102@o2ib0 \
      --mgsnode=192.168.3.102@o2ib1 \
      --failnode=192.168.2.101@o2ib0 \
      --failnode=192.168.3.101@o2ib1 \
      /dev/mapper/ost-2"

      ssh IO2 "mount -t lustre /dev/mapper/ost-2 /OST2"

      For the following OSTs, the failover node will be IO2:
      ssh IO1 "mkfs.lustre --reformat --fsname=lustre --ost --index=3 \
      --mgsnode=192.168.2.101@o2ib0 \
      --mgsnode=192.168.3.101@o2ib1 \
      --mgsnode=192.168.2.102@o2ib0 \
      --mgsnode=192.168.3.102@o2ib1 \
      --failnode=192.168.2.102@o2ib0 \
      --failnode=192.168.3.102@o2ib1 \
      /dev/mapper/ost-3"
      ssh IO1 "mount -t lustre /dev/mapper/ost-3 /OST3"

      ssh IO1 "mkfs.lustre --reformat --fsname=lustre --ost --index=4 \
      --mgsnode=192.168.2.101@o2ib0 \
      --mgsnode=192.168.3.101@o2ib1 \
      --mgsnode=192.168.2.102@o2ib0 \
      --mgsnode=192.168.3.102@o2ib1 \
      --failnode=192.168.2.102@o2ib0 \
      --failnode=192.168.3.102@o2ib1 \
      /dev/mapper/ost-4"
      ssh IO1 "mount -t lustre /dev/mapper/ost-4 /OST4"
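
The five OST commands above differ only in the host, the index, and the pair of failover NIDs. A dry-run sketch that prints the same command lines from one place, so a NID only has to be corrected once. It uses only the options and values already shown in this post. The function name `ost_cmd` is hypothetical, and nothing is executed or formatted.

```shell
#!/bin/sh
# Dry run only: prints the mkfs.lustre command lines, does not run them.
MGSNODES='192.168.2.101@o2ib0 192.168.3.101@o2ib1 192.168.2.102@o2ib0 192.168.3.102@o2ib1'

# Usage: ost_cmd HOST INDEX FAILNODE...
ost_cmd() {
    host=$1
    index=$2
    shift 2
    cmd="mkfs.lustre --reformat --fsname=lustre --ost --index=$index"
    for nid in $MGSNODES; do
        cmd="$cmd --mgsnode=$nid"
    done
    # Remaining arguments are the failover NIDs for this target.
    for nid in "$@"; do
        cmd="$cmd --failnode=$nid"
    done
    printf 'ssh %s "%s /dev/mapper/ost-%s"\n' "$host" "$cmd" "$index"
}

# OSTs 0-2 are formatted on IO2, OSTs 3-4 on IO1, as in the post.
for i in 0 1 2; do
    ost_cmd IO2 "$i" 192.168.2.101@o2ib0 192.168.3.101@o2ib1
done
for i in 3 4; do
    ost_cmd IO1 "$i" 192.168.2.102@o2ib0 192.168.3.102@o2ib1
done
```

The NIDs are copied exactly as written in the post. The sketch does not check whether they are the right ones for each host.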

      Is the above configuration correct?

      Thank You
      Atul Yadav

        Activity

          [LU-5421] MGS and MDT with dual ib port
          atulyadavtech Atul Yadav added a comment -

          After executing the commands below, we get the following error:
          ++++++++++++++++++++++++++++++++++++++
          [root@IO1 ~]# modprobe -v lustre
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/net/lustre/libcfs.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lvfs.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/net/lustre/lnet.ko networks=o2ib0(ib0),o2ib1(ib1)
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/obdclass.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/ptlrpc.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/fld.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/fid.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/mdc.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/osc.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lov.ko
          insmod /lib/modules/2.6.32-431.17.1.el6_lustre.x86_64/extra/kernel/fs/lustre/lustre.ko

          mkfs.lustre --reformat --fsname=lustre --mdt --mgs --index=0 --failnode=192.168.3.101@o2ib1 --failnode=192.168.2.102@o2ib0 --failnode=192.168.3.102@o2ib1 /dev/drbd0

          Permanent disk data:
          Target: lustre:MDT0000
          Index: 0
          Lustre FS: lustre
          Mount type: ldiskfs
          Flags: 0x65
          (MDT MGS first_time update )
          Persistent mount opts: user_xattr,errors=remount-ro
          Parameters: failover.node=192.168.3.101@o2ib1 failover.node=192.168.2.102@o2ib failover.node=192.168.3.102@o2ib1

          device size = 380916MB
          formatting backing filesystem ldiskfs on /dev/drbd0
          target name lustre:MDT0000
          4k blocks 97514583
          options -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
          mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/drbd0 97514583
          Writing CONFIGS/mountdata

          [root@IO1 ~]# mount -v -t lustre /dev/drbd0 /MDT
          arg[0] = /sbin/mount.lustre
          arg[1] = -v
          arg[2] = -o
          arg[3] = rw
          arg[4] = /dev/drbd0
          arg[5] = /MDT
          source = /dev/drbd0 (/dev/drbd0), target = /MDT
          options = rw
          checking for existing Lustre data: found
          Reading CONFIGS/mountdata
          Writing CONFIGS/mountdata
          mounting device /dev/drbd0 at /MDT, flags=0x1000000 options=osd=osd-ldiskfs,user_xattr,errors=remount-ro,mgs,virgin,update,param=failover.node=192.168.3.101@o2ib1,param=failover.node=192.168.2.102@o2ib,param=failover.node=192.168.3.102@o2ib1,svname=lustre-MDT0000,device=/dev/drbd0
          mount.lustre: cannot parse scheduler options for '/sys/block/drbd0/queue/scheduler'
          mount.lustre: mount /dev/drbd0 at /MDT failed: Cannot assign requested address retries left: 0
          mount.lustre: mount /dev/drbd0 at /MDT failed: Cannot assign requested address

          +++++++++++++++++++++++++++++++++

          Thank you
          Atul Yadav
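
A small sketch for inspecting the failover NIDs that `mkfs.lustre` recorded in the `Parameters:` line above. It does not diagnose the mount failure. It only lists the recorded NIDs and marks those whose address the table in the description assigns to IO1 itself. The function names are hypothetical. Treating `@o2ib` as the same network as `@o2ib0` is an assumption based on the output above, where `o2ib0` was given and `o2ib` was printed. On a real node, the local NIDs would come from `lctl list_nids` rather than from a hard-coded list.

```shell
#!/bin/sh
# Print each failover.node NID found in a "Parameters:" line.
failover_nids() {
    for word in $1; do
        case $word in
            failover.node=*) printf '%s\n' "${word#failover.node=}" ;;
        esac
    done
}

# Assumption: "@o2ib" and "@o2ib0" name the same network.
normalise_nid() {
    case $1 in
        *@o2ib) printf '%s0\n' "$1" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}

# Copied from the mkfs.lustre output in this comment.
PARAMS='Parameters: failover.node=192.168.3.101@o2ib1 failover.node=192.168.2.102@o2ib failover.node=192.168.3.102@o2ib1'

# IO1 addresses as listed in the description (ib0 and ib1).
LOCAL_ADDRS='192.168.2.101 192.168.2.102'

for nid in $(failover_nids "$PARAMS"); do
    nid=$(normalise_nid "$nid")
    addr=${nid%@*}
    where=other-node
    for mine in $LOCAL_ADDRS; do
        if [ "$addr" = "$mine" ]; then
            where=this-node
        fi
    done
    printf '%s %s\n' "$nid" "$where"
done
```

Any NID marked `this-node` is a failover entry pointing at the node that is mounting the target, which may be worth reviewing alongside the `lctl list_nids` output on IO1.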


          People

            wc-triage WC Triage
            atulyadavtech Atul Yadav