  1. Lustre
  2. LU-8414

DNE: Setting of remote_dir_gid parameter not persistent


    Description

      The error occurred during soak testing of build '20160713' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713). MDSes have been configured using ldiskfs, OSTs using zfs. The test environment consists of 4 MDSes with 1 MDT each and 6 OSSes with 4 OSTs each. MDS and OSS nodes are configured in an active-active HA configuration.
      Roles:

      • lola-8 MGS/MDS
      • lola-[9-11] MDS

      DNE has been enabled using the command sequence (see Lustre manual page 96):

      pdsh -g mds 'lctl set_param mdt.*.enable_remote_dir=1'
      pdsh -g mds 'lctl set_param mdt.*.enable_remote_dir_gid=-1'
      and, to make the settings persistent,
      pdsh -w lola-8 'lctl set_param -P mdt.*.enable_remote_dir=1'
      pdsh -w lola-8 'lctl set_param -P mdt.*.enable_remote_dir_gid=-1'
      

      (The latter two commands only work on the MGS node.)
      The problem occurs after each of the nodes lola-[9-11] has been restarted or has had its resources failed over / failed back.
      While the parameter 'enable_remote_dir' is persistent on the non-MGS MDSes, the parameter 'enable_remote_dir_gid' is not.
      Therefore the command:

      [soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1  /mnt/soaked/soaktest/hsm_rbh/
      error on LL_IOC_LMV_SETSTRIPE '/mnt/soaked/soaktest/hsm_rbh/' (3): Operation not permitted
      error: setdirstripe: create stripe dir '/mnt/soaked/soaktest/hsm_rbh/' failed
      
      ---------------
      --> Remote dir setting:
      ----------------
      lola-8
      ----------------
      Remote dir_gid setting soaked-MDT0000: -1
      ----------------
      lola-9
      ----------------
      Remote dir_gid setting soaked-MDT0001: 0
      ----------------
      lola-10
      ----------------
      Remote dir_gid setting soaked-MDT0002: -1
      ----------------
      lola-11
      ----------------
      Remote dir_gid setting soaked-MDT0003: 0
      

      failed. This will break all test (slurm) jobs that rely on this functionality.
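The per-MDT listing above can be reproduced with a check along the following lines. This is only a sketch: `report_bad_gid` is a hypothetical helper name, and it assumes that `pdsh` prefixes each line with the host name and that `lctl get_param` prints `name=value` pairs.

```shell
#!/bin/sh
# Sketch: print every MDT whose enable_remote_dir_gid differs from the
# expected value (default -1).
# Assumed input format (pdsh prefix + lctl get_param output), e.g.
#   lola-9: mdt.soaked-MDT0001.enable_remote_dir_gid=0
# report_bad_gid is a hypothetical helper name, not part of Lustre.
report_bad_gid() {
    expected="${1:--1}"
    awk -v want="$expected" '
        /enable_remote_dir_gid=/ {
            n = split($0, kv, "=")   # the value is the field after the last "="
            if (kv[n] != want) print $0
        }'
}

# Intended use on the soak cluster (requires pdsh and lctl on the MDS nodes):
#   pdsh -g mds 'lctl get_param mdt.*.enable_remote_dir_gid' | report_bad_gid -1
```

Any line printed by the helper identifies an MDT on which `lfs setdirstripe` by a non-root user is expected to fail as shown.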

      After setting the parameters on the nodes again, the command

      [soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1  /mnt/soaked/soaktest/hsm_rbh/
      [soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1  -D /mnt/soaked/soaktest/hsm_rbh/
      [soaktest@lola-16 ~]$ lfs getdirstripe  /mnt/soaked/soaktest/hsm_rbh/
      /mnt/soaked/soaktest/hsm_rbh/
      [soaktest@lola-16 ~]$ lfs getdirstripe  /mnt/soaked/soaktest/hsm_rbh/
      /mnt/soaked/soaktest/hsm_rbh/
      lmv_stripe_count: 4 lmv_stripe_offset: 1
      mdtidx           FID[seq:oid:ver]
           1           [0x240007160:0x3:0x0]
           2           [0x28000d714:0x3:0x0]
           3           [0x2c000a810:0x1:0x0]
           0           [0x20000fe01:0x3:0x0]
      

      ended successfully.
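As a stopgap after a restart or failover, the re-apply step described above could be scripted roughly as follows. This is a sketch only: `reapply_dne_params` and the `DRY_RUN` switch are hypothetical names introduced for illustration, while the pdsh group `mds` and the parameter values are the ones used in this report.

```shell
#!/bin/sh
# Sketch: re-apply the DNE settings on all MDS nodes (non-persistent form).
# reapply_dne_params and DRY_RUN are hypothetical names for illustration.
reapply_dne_params() {
    for setting in 'mdt.*.enable_remote_dir=1' 'mdt.*.enable_remote_dir_gid=-1'; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only show the command that would be run.
            echo "pdsh -g mds 'lctl set_param $setting'"
        else
            # The quoted string is passed unexpanded to the remote shell.
            pdsh -g mds "lctl set_param $setting"
        fi
    done
}

# Example: preview the commands without touching the cluster.
#   DRY_RUN=1; reapply_dne_params
```

Running it with `DRY_RUN=1` first makes it easy to confirm the exact commands before they are sent to the MDS nodes.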


            People

              laisiyao Lai Siyao
              heckes Frank Heckes (Inactive)