  1. Lustre
  2. LU-8021

interop: 2.1.x server <-> clients version > 2.3: t-f debugsave() debugrestore() defect

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.1.5
    • Severity: 3
    • 9223372036854775807

    Description

      test-framework.sh :

      debugsave() {
          DEBUGSAVE="$(lctl get_param -n debug)"
      }
      

      – DEBUGSAVE holds only the debug value set on the client; the servers' values are not saved.

      debugrestore() {
          [ -n "$DEBUGSAVE" ] && \
              do_nodes $(comma_list $(nodes_list)) "$LCTL set_param debug=\\\"${DEBUGSAVE}\\\";"
          DEBUGSAVE=""
      }
      

      – sets debug=$DEBUGSAVE on all nodes, including the server nodes.
      That is, debugrestore() does not restore the initial debug values set on the servers; it sets every node's debug value to the initial value saved on the client.
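
One way to avoid this would be to save and restore the debug value separately for each node. The following is a minimal sketch under stated assumptions, not the fix that landed for this ticket: the nodes are simulated with one file per node, and node_get_debug, node_set_debug, debugsave_per_node and debugrestore_per_node are hypothetical names that do not exist in test-framework.sh. In a real framework the two node_* helpers would run lctl on the remote node instead.

```shell
# Simulated cluster: one file per node holds that node's debug mask.
SIM_DIR=$(mktemp -d)
SAVE_DIR=$(mktemp -d)
printf '%s\n' "trace inode sec lfsck hsm" > "$SIM_DIR/client1"
printf '%s\n' "trace inode sec" > "$SIM_DIR/server1"

# Hypothetical stand-ins for "lctl get_param -n debug" and
# "lctl set_param debug=..." executed on a given node.
node_get_debug() { cat "$SIM_DIR/$1"; }
node_set_debug() { printf '%s\n' "$2" > "$SIM_DIR/$1"; }

# Save each node's own debug value instead of a single client value.
debugsave_per_node() {
    for node in "$@"; do
        node_get_debug "$node" > "$SAVE_DIR/$node"
    done
}

# Restore every node from its own saved value, then forget the saved copy.
debugrestore_per_node() {
    for saved in "$SAVE_DIR"/*; do
        [ -f "$saved" ] || continue
        node_set_debug "$(basename "$saved")" "$(cat "$saved")"
        rm -f "$saved"
    done
}
```

With this shape, a server is only ever asked to set a mask that it reported itself, so flag names that exist only on newer clients never reach it.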

      Intel clients starting from 2.3 (D_LFSCK was added by LU-957) have some debug masks that are missing on 2.1.x servers.

      libcfs_debug.h :
      #define D_SEC           0x08000000
      #define D_LFSCK         0x10000000 /* For both OI scrub and LFSCK */
      #define D_HSM           0x20000000
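
The effect of these defines can be checked with shell arithmetic. The sketch below uses only the three values quoted above and assumes that a 2.1.x server knows every bit up to and including D_SEC; that assumption is an inference from this report, not something taken from the 2.1.x headers.

```shell
# Values copied from the libcfs_debug.h excerpt above.
D_SEC=$((0x08000000))
D_LFSCK=$((0x10000000))
D_HSM=$((0x20000000))

# Assumption: a 2.1.x server knows every bit up to and including D_SEC.
server_known=$(( (D_SEC << 1) - 1 ))

# A newer client with all flags enabled also carries D_LFSCK and D_HSM.
client_mask=$(( server_known | D_LFSCK | D_HSM ))

# Bits the client would send that the server cannot interpret.
unknown=$(( client_mask & ~server_known ))
printf 'bits unknown to the server: 0x%08x\n' "$unknown"
```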
      

      This debugsave()/debugrestore() defect causes tests to fail when run with PTLDEBUG=-1, because lctl executed on the servers returns EINVAL:

      + debugrestore
      ...
      + /usr/bin/pdsh -R ssh -S -w fre0205,fre0206,fre0207,fre0208 '(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre"  FSTYPE=ldiskfs sh -c "/usr/sbin/lctl set_param debug=\"trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec lfsck\";")'
      fre0206: error: set_param: writing to file /proc/sys/lnet/debug: Invalid argument
      pdsh@fre0207: fre0206: ssh exited with exit code 1
      
      • Reproducible on 2.1 Server <-> 2.5.1 client.
      == sanity test 63b: async write errors should be returned to fsync ===== 21:43:48 (1457502228)
      debug=-1
      1+0 records in
      1+0 records out
      4096 bytes (4.1 kB) copied, 0.00600947 s, 682 kB/s
      fail_loc=0x80000406
      fsync: Input/output error
      192.18.177.138: error: set_param: setting /proc/sys/lnet/debug=trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec lfsck hsm: Invalid argument
      pdsh@osh-1: 192.18.177.138: ssh exited with exit code 1
      debug=trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec lfsck hsm
      debug=trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec lfsck hsm
      Resetting fail_loc and fail_val on all nodes...done.
      PASS 63b (13s)
      resend_count is set to 4 4
      resend_count is set to 4 4
      resend_count is set to 4 4
      resend_count is set to 4 4
      resend_count is set to 4 4
      == sanity test complete, duration 235 sec == 21:44:02 (1457502242)
      Stopping clients: osh-1.xyus.xyratex.com /mnt/lustre (opts:-f)
      Stopping client osh-1.xyus.xyratex.com /mnt/lustre opts:-f
      Stopping clients: osh-1.xyus.xyratex.com /mnt/lustre2 (opts:-f)
      Stopping /mnt/mds1 (opts:-f) on 192.18.177.138
      Stopping /mnt/ost1 (opts:-f) on 192.18.177.138
      Stopping /mnt/ost2 (opts:-f) on 192.18.177.138
      modules unloaded.
      [root@osh-1 tests]# 
      
      
      
      -----------
      
      2.1.x server
      
      [root@osh-1 ~]# lctl set_param -n debug=-1
      You have new mail in /var/spool/mail/root
      [root@osh-1 ~]# lctl get_param -n debug
      trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec
      [root@osh-1 ~]#
      
      2.5.1 client
      [root@osh-1 tests]# lctl set_param -n debug=-1
      [root@osh-1 tests]# lctl get_param -n debug
      trace inode super ext2 malloc cache info ioctl neterror net warning buffs other dentry nettrace page dlmtrace error emerg ha rpctrace vfstrace reada mmap config console quota sec *lfsck hsm*
      [root@osh-1 tests]#
      
      -------------
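
The two outputs above differ only in the trailing lfsck and hsm names. As an illustration, a saved flag list can be split into names the target node accepts and names it would reject. This is a sketch only; filter_supported and rejected_flags are hypothetical helpers, not part of test-framework.sh, and the flag lists in the usage lines are shortened from the outputs above.

```shell
# Print the flags from $1 that also appear in the supported list $2.
filter_supported() {
    wanted=$1; supported=$2; out=""
    for flag in $wanted; do
        case " $supported " in
            *" $flag "*) out="${out:+$out }$flag" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Print the flags from $1 that are missing from the supported list $2.
rejected_flags() {
    wanted=$1; supported=$2; out=""
    for flag in $wanted; do
        case " $supported " in
            *" $flag "*) ;;
            *) out="${out:+$out }$flag" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Shortened flag lists standing in for the 2.5.1 client and 2.1.x server output.
client_flags="console quota sec lfsck hsm"
server_flags="console quota sec"
filter_supported "$client_flags" "$server_flags"
rejected_flags "$client_flags" "$server_flags"
```

Filtering like this would hide the mismatch rather than restore the server's own value, so it is better suited to diagnosing the failure than to fixing debugrestore().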
      

          People

            Assignee: Emoly Liu (emoly.liu)
            Reporter: parinay v kondekar (Inactive)