Lustre / LU-4250

sanity-hsm test_15 failure: 'could not rebind file list'

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.5.0
    • 3
    • 11587

    Description

      Test results are at: https://maloo.whamcloud.com/test_sets/5338e98e-4c98-11e3-8ab0-52540035b04c

      In the test log, a call to get_param fails with an I/O error, and the copytool then reports errors during the rebind:

      CMD: wtm-17vm3 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | egrep 'WAITING|STARTED'
      CMD: wtm-17vm3 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | egrep 'WAITING|STARTED'
      CMD: wtm-17vm3 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | egrep 'WAITING|STARTED'
      CMD: wtm-17vm3 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | egrep 'WAITING|STARTED'
      wtm-17vm3: error: get_param: read('/proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions') failed: Input/output error
      rebind list of files
      CMD: wtm-17vm5 lhsmtool_posix --archive 2 --hsm-root /home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1		 --rebind /home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/tmp.30189 /mnt/lustre
      wtm-17vm5: lhsmtool_posix[10247]: action=2 src=/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/tmp.30189 dst=(null) mount_point=/mnt/lustre
      wtm-17vm5: lhsmtool_posix[10247]: rebind [0x200000401:0x10:0x0] to [0x200000400:0x2f:0x0]
      wtm-17vm5: lhsmtool_posix[10247]: rebind [0x200000401:0x11:0x0] to [0x200000400:0x31:0x0]
      wtm-17vm5: lhsmtool_posix[10247]: rebind [0x200000401:0x12:0x0] to [0x200000400:0x33:0x0]
      wtm-17vm5: lhsmtool_posix[10247]: rebind [0x200000401:0x13:0x0] to [0x200000400:0x35:0x0]
      wtm-17vm5: lhsmtool_posix[10247]: rebind [0x200000401:0x14:0x0] to [0x200000400:0x37:0x0]
      wtm-17vm5: lhsmtool_posix[10247]: cannot rename '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0037/0000/0400/0000/0002/0000/0x200000400:0x37:0x0': No such file or directory (2)
      wtm-17vm5: lhsmtool_posix[10247]: 5 lines read from '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/tmp.30189', 4 rebind successful
      wtm-17vm5: lhsmtool_posix[10247]: process finished, errs: 1 major, 0 minor, rc=-1 (Operation not permitted)
       sanity-hsm test_15: @@@@@@ FAIL: could not rebind file list 
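
The rename error gives enough information to check the archive by hand. Judging only from the two paths in that message, the copytool appears to lay out the archive as six 16-bit hex fields (the low and high halves of the FID's object ID, then the sequence split into four) followed by the FID itself. A minimal sketch under that assumption; `fid_to_archive_path` is a hypothetical helper, not part of sanity-hsm or lhsmtool_posix:

```shell
# Build the archive-relative path for a FID, following the layout visible in
# the "cannot rename" message (inferred from the log, not from the copytool source).
# Usage: fid_to_archive_path 0x200000401:0x14:0x0
fid_to_archive_path() {
    local fid seq rest oid
    fid=${1#\[}          # tolerate an optional leading '['
    fid=${fid%\]}        # ... and an optional trailing ']'
    seq=${fid%%:*}       # first field: sequence
    rest=${fid#*:}
    oid=${rest%%:*}      # second field: object ID
    # Shell arithmetic accepts 0x-prefixed hex constants directly.
    printf '%04x/%04x/%04x/%04x/%04x/%04x/%s\n' \
        $(( $oid & 0xffff )) $(( ($oid >> 16) & 0xffff )) \
        $(( $seq & 0xffff )) $(( ($seq >> 16) & 0xffff )) \
        $(( ($seq >> 32) & 0xffff )) $(( ($seq >> 48) & 0xffff )) \
        "$fid"
}

fid_to_archive_path 0x200000401:0x14:0x0
# -> 0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0
```

With `$arc` standing for the `--hsm-root` directory, `ls "$arc/$(fid_to_archive_path 0x200000401:0x14:0x0)"*` would show whether the archive holds the final file or only the `_tmp` copy named in the copytool log.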
      

      Errors in the copytool log are:

      lhsmtool_posix[10198]: data archiving for '/mnt/lustre/.lustre/fid/0x200000401:0x12:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0012/0000/0401/0000/0002/0000/0x200000401:0x12:0x0_tmp' done
      lhsmtool_posix[10198]: attr file for '/mnt/lustre/.lustre/fid/0x200000401:0x12:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0012/0000/0401/0000/0002/0000/0x200000401:0x12:0x0_tmp'
      lhsmtool_posix[10198]: cannot copy xattr of '/mnt/lustre/.lustre/fid/0x200000401:0x12:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0012/0000/0401/0000/0002/0000/0x200000401:0x12:0x0_tmp': No such file or directory (2)
      lhsmtool_posix[10198]: xattr file for '/mnt/lustre/.lustre/fid/0x200000401:0x12:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0012/0000/0401/0000/0002/0000/0x200000401:0x12:0x0_tmp'
      lhsmtool_posix[10198]: cannot get FID of '[0x200000401:0x12:0x0]': No such file or directory (2)
      lhsmtool_posix[10198]: Action completed, notifying coordinator cookie=0x5283b1ac, FID=[0x200000401:0x12:0x0], hp_flags=0 err=2
      lhsmtool_posix[10198]: llapi_hsm_action_end() on '/mnt/lustre/.lustre/fid/0x200000401:0x12:0x0' ok (rc=0)
      lhsmtool_posix[10199]: data archiving for '/mnt/lustre/.lustre/fid/0x200000401:0x13:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0013/0000/0401/0000/0002/0000/0x200000401:0x13:0x0_tmp' done
      lhsmtool_posix[10199]: attr file for '/mnt/lustre/.lustre/fid/0x200000401:0x13:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0013/0000/0401/0000/0002/0000/0x200000401:0x13:0x0_tmp'
      lhsmtool_posix[10199]: cannot copy xattr of '/mnt/lustre/.lustre/fid/0x200000401:0x13:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0013/0000/0401/0000/0002/0000/0x200000401:0x13:0x0_tmp': No such file or directory (2)
      lhsmtool_posix[10199]: xattr file for '/mnt/lustre/.lustre/fid/0x200000401:0x13:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0013/0000/0401/0000/0002/0000/0x200000401:0x13:0x0_tmp'
      lhsmtool_posix[10199]: cannot get FID of '[0x200000401:0x13:0x0]': No such file or directory (2)
      lhsmtool_posix[10199]: Action completed, notifying coordinator cookie=0x5283b1ad, FID=[0x200000401:0x13:0x0], hp_flags=0 err=2
      lhsmtool_posix[10199]: llapi_hsm_action_end() on '/mnt/lustre/.lustre/fid/0x200000401:0x13:0x0' ok (rc=0)
      lhsmtool_posix[10200]: data archiving for '/mnt/lustre/.lustre/fid/0x200000401:0x14:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0_tmp' done
      lhsmtool_posix[10200]: attr file for '/mnt/lustre/.lustre/fid/0x200000401:0x14:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0_tmp'
      lhsmtool_posix[10200]: cannot copy xattr of '/mnt/lustre/.lustre/fid/0x200000401:0x14:0x0' to '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0_tmp': No such file or directory (2)
      lhsmtool_posix[10200]: xattr file for '/mnt/lustre/.lustre/fid/0x200000401:0x14:0x0' saved to archive '/home/cgearing/.autotest/shared_dir/2013-11-12/223921-70307003924420/arc1/0014/0000/0401/0000/0002/0000/0x200000401:0x14:0x0_tmp'
      lhsmtool_posix[10200]: cannot get FID of '[0x200000401:0x14:0x0]': No such file or directory (2)
      lhsmtool_posix[10200]: Action completed, notifying coordinator cookie=0x5283b1ae, FID=[0x200000401:0x14:0x0], hp_flags=0 err=2
      lhsmtool_posix[10200]: llapi_hsm_action_end() on '/mnt/lustre/.lustre/fid/0x200000401:0x14:0x0' ok (rc=0)
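
To see at a glance which archive requests ended in error, the "Action completed" lines can be filtered on their `err=` field. The following is a sketch that assumes the message format shown in the excerpt above; `failed_fids` is a hypothetical helper name:

```shell
# Read a copytool log on stdin and print every FID whose
# "Action completed" line carries a non-zero err value.
failed_fids() {
    # Reduce each matching line to "<err> <fid>", then keep the non-zero ones.
    sed -n 's/.*Action completed, notifying coordinator.*FID=\(\[[^]]*\]\).* err=\([0-9-]*\).*/\2 \1/p' |
    while read -r err fid; do
        if [ "$err" != 0 ]; then
            printf '%s err=%s\n' "$fid" "$err"
        fi
    done
    return 0
}
```

Run as `failed_fids < copytool.log` (the file name is a placeholder), this would list the three FIDs above, each with `err=2`.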
      

      On the MDS console and dmesg:

      09:08:27:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | egrep 'WAITING|STARTED'
      09:08:27:LustreError: 12057:0:(mdt_xattr.c:134:mdt_getxattr_one()) getxattr failed: -2
      09:08:27:LustreError: 12057:0:(mdt_xattr.c:134:mdt_getxattr_one()) getxattr failed: -2
      09:08:27:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-hsm test_15: @@@@@@ FAIL: could not rebind file list 
      09:08:27:Lustre: DEBUG MARKER: sanity-hsm test_15: @@@@@@ FAIL: could not rebind file list
      09:08:27:Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2013-11-12/lustre-reviews-el6-x86_64--review--1_2_1__19416__-70307003924420-223918/sanity-hsm.test_15.debug_log.$(hostname -s).1384362493.log;
      09:08:27:         dmesg > /logdir/test_logs/2013-11-12/lustre-reviews-el6-x86_64--r
      09:08:27:LustreError: 12057:0:(mdt_xattr.c:134:mdt_getxattr_one()) getxattr failed: -2
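
The console shows the `get_param ... | egrep 'WAITING|STARTED'` poll immediately before the failure, and the test log shows that read failing with an I/O error. Because the read is piped into egrep, a failed read and an empty action list both yield no output. One possible explanation (a hypothesis, not confirmed by these logs) is that the wait ended early for that reason and the rebind started before archiving had finished. A sketch of a check that keeps the two cases apart; `hsm_actions_state` is a hypothetical helper, and the command to run is passed in as arguments:

```shell
# "$@" is the command that dumps the action list, for example:
#   lctl get_param -n mdt.lustre-MDT0000.hsm.actions
# Returns 0 when no action is WAITING or STARTED,
#         1 when at least one still is,
#         2 when the read itself failed (caller should retry, not assume idle).
hsm_actions_state() {
    local out
    # Assign separately from 'local' so the command's exit status is kept.
    out=$("$@" 2>/dev/null) || return 2
    if printf '%s\n' "$out" | grep -Eq 'WAITING|STARTED'; then
        return 1
    fi
    return 0
}
```

A caller would loop while the return value is 1 or 2 and give up after a timeout, so that a transient read error is treated as "unknown" rather than "done".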
      

            People

              Assignee: WC Triage (wc-triage)
              Reporter: James Nunez (jamesanunez, Inactive)