
IO Errors during the failover - SLES 11 SP2 - Lustre 2.4.2

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.4.2
    • Environment: SLES 11 SP2, Lustre 2.4.2
    • 3
    • 12978

    Description

      We have applied the patch provided in LU-3645, but the customer reports that the issue can still be reproduced.

      Attaching the latest set of logs.

      The issue reoccurred on 18th Feb.

      Attachments

        Activity

          pjones Peter Jones added a comment -

          Any update Rajesh?


          rganesan@ddn.com Rajeshwaran Ganesan added a comment -

          We are in the process of applying the patch. I will get back to you with the results.

          hongchao.zhang Hongchao Zhang added a comment -

          Hi Rajesh,

          What is the result of the test?

          Thanks.

          hongchao.zhang Hongchao Zhang added a comment -

          There is a bug in obd_str2uuid:

           static inline void obd_str2uuid(struct obd_uuid *uuid, const char *tmp)
           {
                  strncpy((char *)uuid->uuid, tmp, sizeof(*uuid));
                  uuid->uuid[sizeof(*uuid) - 1] = '\0';
           }

          It treats "tmp" as if it were also an "obd_uuid", but that is not true in all cases. In "class_add_uuid", for example, "tmp" is "lustre_cfg_string(lcfg, 1)", and obd_str2uuid can copy undefined data beyond the end of "tmp" into "uuid", which can cause two identical "uuid" strings in the config to be treated as different.

          The patch against b2_4 is tracked at http://review.whamcloud.com/#/c/10269/

          Hi Rajesh,
          Could you please try the patch at your site?
          Thanks!

          rganesan@ddn.com Rajeshwaran Ganesan added a comment -

          Hello Hongchao,

          I have uploaded the requested log files to ftp.whamcloud.com:/uploads/LU-4722

          2014-05-08-SR30502_pfs2n17.llog.gz

          Thanks,
          Rajesh

          hongchao.zhang Hongchao Zhang added a comment -

          There is no error in these configs.

          Could you please collect the debug logs (lctl dk > XXX.log) on the problematic node just after mounting the client (make sure "ha" is included in "/proc/sys/lnet/debug")?
          Thanks very much!

          rganesan@ddn.com Rajeshwaran Ganesan added a comment -

          There seems to be some confusion about our systems. The system with prefix pfsc and IP addresses 172.26.4.x is our test system, and the system with prefix pfs2 and IP addresses 172.26.17.x is our production system. We have seen the issue on both systems and therefore provided logs from both. The configuration of both systems should be very similar, and the config was newly generated on both systems.

          After the config was newly generated, the remaining issue is that duplicate IP addresses appear as failover_nids on the servers only. This appears in /proc/fs/lustre/osp/*/import on the MDS, but astonishingly only on pfsc and not on pfs2. It also appears in /proc/fs/lustre/osc/*/import if we mount the file systems on servers, but astonishingly only for some OSTs. This happens if we mount the file system on an OSS, on an MDS that is currently serving an MDT for another file system, or on a currently unused (failover) MDS.

          I am also uploading the logs to the FTP site.
          hongchao.zhang Hongchao Zhang added a comment - edited

          /proc/fs/lustre/osc/pfs2dat2-OST0012-osc-ffff881033429400/import:
          failover_nids: [172.26.8.15@o2ib, 172.26.8.15@o2ib, 172.26.8.14@o2ib]

          Do you mount a Lustre client on the MDT? Normally, the OSC (actually an OSP; there is a symlink in /proc/fs/lustre/osc/ for each entry in /proc/fs/lustre/osp/) name for an MDT is (fsname)-OSTxxxx-osc-MDT0000, while the name for a client is (fsname)-OSTxxxx-osc-(address of superblock).
          hongchao.zhang Hongchao Zhang added a comment - edited

          The configs seem fine.
          Did you also regenerate the config of pfscdat2? It adds "172.26.17.4@o2ib" and "172.26.17.3@o2ib" as target nodes, whereas it was "172.26.4.4@o2ib" and "172.26.4.3@o2ib" previously.

          cmd=cf010 marker=10(0x1)pfscdat2-OST0000 'add osc'
          cmd=cf005 nid=172.26.17.4@o2ib(0x50000ac1a1104) 0:(null)  1:172.26.17.4@o2ib
          cmd=cf001 0:pfscdat2-OST0000-osc  1:osc  2:pfscdat2-clilov_UUID
          cmd=cf003 0:pfscdat2-OST0000-osc  1:pfscdat2-OST0000_UUID  2:172.26.17.4@o2ib
          cmd=cf005 nid=172.26.17.4@o2ib(0x50000ac1a1104) 0:(null)  1:172.26.17.4@o2ib
          cmd=cf00b 0:pfscdat2-OST0000-osc  1:172.26.17.4@o2ib
          cmd=cf005 nid=172.26.17.3@o2ib(0x50000ac1a1103) 0:(null)  1:172.26.17.3@o2ib
          cmd=cf00b 0:pfscdat2-OST0000-osc  1:172.26.17.3@o2ib
          cmd=cf00d 0:pfscdat2-clilov  1:pfscdat2-OST0000_UUID  2:0  3:1
          cmd=cf010 marker=10(0x2)pfscdat2-OST0000 'add osc'

          Does the issue occur again with the newly generated config?

          By the way, since only the OSC (OSP) on the MDT is affected, could you please dump the config of the MDT (the config-uuid-name is fsname-MDT0000, e.g. pfscdat2-MDT0000)?
          Thanks!

          rganesan@ddn.com Rajeshwaran Ganesan added a comment -

          tunefs.lustre --erase-params --mgsnode=172.26.8.12@o2ib --mgsnode=172.26.8.13@o2ib --servicenode=172.26.8.14@o2ib --servicenode=172.26.8.15@o2ib /dev/mapper/ost_pfs2dat2_0
          tunefs.lustre --erase-params --mgsnode=172.26.8.12@o2ib --mgsnode=172.26.8.13@o2ib --servicenode=172.26.8.14@o2ib --servicenode=172.26.8.15@o2ib /dev/mapper/ost_pfs2dat2_1

          2. I have uploaded the files to the FTP server.

          3. Only the OSC on the MDT is affected, having the duplicate entries. In any case, we should find the reason for the wrong behaviour of the servers.

          hongchao.zhang Hongchao Zhang added a comment -

          What is the command line used to create the failover nid?

          Could you please dump the config file of your system? At the MGS node:
          lctl > dl (shows the device list; the device number is in the first column)
          lctl > device #MGS (MGS index, say, 1)
          lctl > dump_cfg pfs2dat2-client (dumps the config to syslog)

          Normally one NID (Node ID) has only one UUID to represent it (one UUID can have multiple NIDs). Duplicated failover NIDs can cause recovery problems, because the client will try the NIDs one by one and may miss the recovery window.

          Thanks

          People

            Assignee: hongchao.zhang Hongchao Zhang
            Reporter: rganesan@ddn.com Rajeshwaran Ganesan
            Votes: 0
            Watchers: 4
