Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5683

HSM: re-archiving dirty file fails, request for cookie not found

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • Lustre 2.6.0
    • Scientific Linux 6.5, Lutre 2.6.0, 1 MDS, 2 OSSs, 2 clients, all VMs hosted on KVM.
      Also reproducible with CentOS 6.5 in other environment.
    • 3
    • 15920

    Description

      During some tests, an issue has been found that when a file is archived, released, changed and re-archived, the re-archival fails under some circumstances. The problem could be reproduced in a very simple way using the lhsmd_posix:

      Boot machines, start Lustre services (MDS, OSS1, OSS2, mount on both clients)

      Start copytool on client 2:

      lhsmtool_posix -A 3 -A 4 -p /tmp /lustre
      

      Start test script on client 1:

      arch=3
      rm -f /lustre/mytest 
      echo thisfile >/lustre/mytest
      lfs hsm_archive -a $arch /lustre/mytest
      sleep 3
      lfs hsm_state /lustre/mytest
      sleep 3
      lfs hsm_release /lustre/mytest
      sleep 3
      lfs hsm_state /lustre/mytest
      sleep 3
      cat /etc/passwd >>/lustre/mytest
      sleep 3
      lfs hsm_state /lustre/mytest
      sleep 3
      lfs hsm_archive -a $arch /lustre/mytest
      sleep 3
      lfs hsm_state /lustre/mytest
      sleep 5
      lfs hsm_state /lustre/mytest
      

      Test script output

      /lustre/mytest: (0x00000009) exists archived, archive_id:3
      /lustre/mytest: (0x0000000d) released exists archived, archive_id:3
      /lustre/mytest: (0x0000000b) exists dirty archived, archive_id:3
      /lustre/mytest: (0x0000000b) exists dirty archived, archive_id:3
      /lustre/mytest: (0x0000000b) exists dirty archived, archive_id:3
      

      Observations while watching the MDS syslog:

      Sep 17 10:39:06 lustre-mds1 kernel: LustreError: 1396:0:(mdt_coordinator.c:1463:mdt_hsm_update_request_state()) lsttest-MDT0000: Cannot find running request for cookie 0x54194821 on fid=[0x200003ab0:0x1:0x0]
      

      Observations on the copytool side:

      lhsmtool_posix[1230]: action=0 src=(null) dst=(null) mount_point=/lustre
      lhsmtool_posix[1230]: waiting for message from kernel
      lhsmtool_posix[1230]: copytool fs=lsttest archive#=3 item_count=1
      lhsmtool_posix[1230]: waiting for message from kernel
      lhsmtool_posix[1231]: '[0x200003ab0:0x1:0x0]' action ARCHIVE reclen 72, cookie=0x5419481f
      lhsmtool_posix[1231]: processing file 'mytest'
      lhsmtool_posix[1231]: archiving '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1231]: saving stripe info of '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' in /tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp.lov
      lhsmtool_posix[1231]: going to copy data from '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1231]: data archiving for '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' done
      lhsmtool_posix[1231]: attr file for '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' saved to archive '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1231]: fsetxattr of 'trusted.hsm' on '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' rc=0 (Success)
      lhsmtool_posix[1231]: fsetxattr of 'trusted.link' on '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' rc=0 (Success)
      lhsmtool_posix[1231]: fsetxattr of 'trusted.lov' on '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' rc=0 (Success)
      lhsmtool_posix[1231]: fsetxattr of 'trusted.lma' on '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' rc=0 (Success)
      lhsmtool_posix[1231]: fsetxattr of 'lustre.lov' on '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' rc=-1 (Operation not supported)
      lhsmtool_posix[1231]: xattr file for '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' saved to archive '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1231]: symlink '/tmp/shadow/mytest' to '../0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0' done
      lhsmtool_posix[1231]: Action completed, notifying coordinator cookie=0x5419481f, FID=[0x200003ab0:0x1:0x0], hp_flags=0 err=0
      lhsmtool_posix[1231]: llapi_hsm_action_end() on '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' ok (rc=0)
      lhsmtool_posix[1230]: copytool fs=lsttest archive#=3 item_count=1
      lhsmtool_posix[1230]: waiting for message from kernel
      lhsmtool_posix[1232]: '[0x200003ab0:0x1:0x0]' action RESTORE reclen 72, cookie=0x54194820
      lhsmtool_posix[1232]: processing file 'mytest'
      lhsmtool_posix[1232]: reading stripe rules from '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0.lov' for '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0'
      lhsmtool_posix[1232]: restoring data from '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0' to '{VOLATILE}=[0x200003ab1:0x1:0x0]'
      lhsmtool_posix[1232]: going to copy data from '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0' to '{VOLATILE}=[0x200003ab1:0x1:0x0]'
      lhsmtool_posix[1232]: data restore from '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0' to '{VOLATILE}=[0x200003ab1:0x1:0x0]' done
      lhsmtool_posix[1232]: Action completed, notifying coordinator cookie=0x54194820, FID=[0x200003ab0:0x1:0x0], hp_flags=0 err=0
      lhsmtool_posix[1232]: llapi_hsm_action_end() on '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' ok (rc=0)
      lhsmtool_posix[1230]: copytool fs=lsttest archive#=3 item_count=1
      lhsmtool_posix[1230]: waiting for message from kernel
      lhsmtool_posix[1233]: '[0x200003ab0:0x1:0x0]' action ARCHIVE reclen 72, cookie=0x54194821
      lhsmtool_posix[1233]: processing file 'mytest'
      lhsmtool_posix[1233]: archiving '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1233]: saving stripe info of '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' in /tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp.lov
      lhsmtool_posix[1233]: going to copy data from '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp'
      lhsmtool_posix[1233]: progress ioctl for copy '/lustre/.lustre/fid/0x200003ab0:0x1:0x0'->'/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp' failed: No such file or directory (2)
      lhsmtool_posix[1233]: data copy failed from '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' to '/tmp/0001/0000/3ab0/0000/0002/0000/0x200003ab0:0x1:0x0_tmp': No such file or directory (2)
      lhsmtool_posix[1233]: Action completed, notifying coordinator cookie=0x54194821, FID=[0x200003ab0:0x1:0x0], hp_flags=0 err=2
      lhsmtool_posix[1233]: llapi_hsm_action_end() on '/lustre/.lustre/fid/0x200003ab0:0x1:0x0' failed: No such file or directory (2)
      Immediately after the test, mdt/*/hsm/actions shows the following:
      lrh=[type=10680000 len=136 idx=11/3] fid=[0x200003ab0:0x1:0x0] dfid=[0x200003ab0:0x1:0x0] compound/cookie=0x5419481f/0x5419481f action=ARCHIVE archive#=3 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
      lrh=[type=10680000 len=136 idx=11/6] fid=[0x200003ab0:0x1:0x0] dfid=[0x200003ab0:0x1:0x0] compound/cookie=0x54194820/0x54194820 action=RESTORE archive#=3 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
      lrh=[type=10680000 len=136 idx=11/9] fid=[0x200003ab0:0x1:0x0] dfid=[0x200003ab0:0x1:0x0] compound/cookie=0x54194821/0x54194821 action=ARCHIVE archive#=3 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=FAILED data=[]
      

      A second, later retry of the archive request sometimes works fine and it also makes a difference if the copytool and the command line actions run on the same client node or on different nodes.

      I already added this as a comment to the possibly related LU-4004, but as this is closed, it doesn't seem to get attention any longer.

      Attachments

        Issue Links

          Activity

            [LU-5683] HSM: re-archiving dirty file fails, request for cookie not found

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14377/
            Subject: LU-5683 llite: Inform copytool of dataversion changes
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f99c0178e6d90bc6674f9d6a4e724b2c113b4a43

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14377/ Subject: LU-5683 llite: Inform copytool of dataversion changes Project: fs/lustre-release Branch: master Current Patch Set: Commit: f99c0178e6d90bc6674f9d6a4e724b2c113b4a43
            pjones Peter Jones added a comment -

            Landed for 2.8

            pjones Peter Jones added a comment - Landed for 2.8

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14649/
            Subject: LU-5683 clio: add CIT_DATA_VERSION
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: fcd45488711ac7341eef2a1bed9c7a8021c57436

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14649/ Subject: LU-5683 clio: add CIT_DATA_VERSION Project: fs/lustre-release Branch: master Current Patch Set: Commit: fcd45488711ac7341eef2a1bed9c7a8021c57436

            > Do you actually mean 'archive' when you said 'restore'?

            Yes.

            > I can see the problem. An easier way to fix the problem is:
            > ...

            It's not clear what will terminate this loop.

            I think that unifying data version with the rest of CLIO is a better fix. With http://review.whamcloud.com/#/c/14640/ we could get rid of some terrible crap like struct obd_info.

            jhammond John Hammond added a comment - > Do you actually mean 'archive' when you said 'restore'? Yes. > I can see the problem. An easier way to fix the problem is: > ... It's not clear what will terminate this loop. I think that unifying data version with the rest of CLIO is a better fix. With http://review.whamcloud.com/#/c/14640/ we could get rid of some terrible crap like struct obd_info.

            When the restore starts ll_data_version() uses the old layout (the layout that was valid before the file was released).

            Do you actually mean 'archive' when you said 'restore'?

            I can see the problem. An easier way to fix the problem is:

            do {
                 gen = get_layout_version();
                 ll_data_version();
            } while (gen != get_layout_version())
            
            
            jay Jinshan Xiong (Inactive) added a comment - When the restore starts ll_data_version() uses the old layout (the layout that was valid before the file was released). Do you actually mean 'archive' when you said 'restore'? I can see the problem. An easier way to fix the problem is: do { gen = get_layout_version(); ll_data_version(); } while (gen != get_layout_version())

            John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/14649
            Subject: LU-5683 clio: add CIT_DATA_VERSION
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 17ac7b8d2f87567ff898468ad77d1f2a577960b7

            gerrit Gerrit Updater added a comment - John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/14649 Subject: LU-5683 clio: add CIT_DATA_VERSION Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 17ac7b8d2f87567ff898468ad77d1f2a577960b7
            jhammond John Hammond added a comment - - edited

            > The second issue is why re-archiving a file fails. I have not fully diagnosed it yet but what I see is a failure from ll_data_version() at ll_ioc_copy_start(). -ENOENT returned at some point in the OFD layer, propagated back and turned into -EIO by tgt_extent_lock().

            When the archive starts ll_data_version() uses the old layout (the layout that was valid before the file was released).

            t:lustre# dd if=/dev/urandom of=f0 bs=1M count=1
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB) copied, 0.23944 s, 4.4 MB/s
            t:lustre# lfs getstripe f0
            f0
            lmm_stripe_count:   1
            lmm_stripe_size:    1048576
            lmm_pattern:        1
            lmm_layout_gen:     0
            lmm_stripe_offset:  0
            	obdidx		 objid		 objid		 group
            	     0	             2	          0x2	             0
            
            t:lustre# lfs hsm_archive f0
            t:lustre# lfs hsm_state f0
            f0: (0x00000001) exists, archive_id:1
            t:lustre# lfs hsm_state f0
            f0: (0x00000009) exists archived, archive_id:1
            t:lustre# lfs hsm_release f0
            t:lustre# lfs hsm_state f0
            f0: (0x0000000d) released exists archived, archive_id:1
            t:lustre# dd if=/dev/urandom of=f0 bs=1M count=1
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB) copied, 0.239978 s, 4.4 MB/s
            t:lustre# lfs hsm_state f0
            f0: (0x0000000b) exists dirty archived, archive_id:1
            t:lustre# 
            t:lustre# 
            t:lustre# date +%s; lctl clear; lfs hsm_archive f0; lctl dk > /tmp/1.dk; sleep 1; lctl dk >> /tmp/1.dk
            1430327516
            t:lustre# lfs getstripe f0
            f0
            lmm_stripe_count:   1
            lmm_stripe_size:    1048576
            lmm_pattern:        1
            lmm_layout_gen:     2
            lmm_stripe_offset:  0
            	obdidx		 objid		 objid		 group
            	     0	             3	          0x3	             0
            

            The error message from the OST_GETATTR handler show that the CT client send the old object id:

            [  194.727468] LustreError: 3526:0:(ldlm_request.c:452:ldlm_cli_enqueue_local()) ### delayed lvb init failed (rc -2) ns: filter-lustre-OST0000_UUID lock: ffff8800d8bd2e90/0xba0c6f5e9969fb9 lrc: 2/0,0 mode: --/PR res: [0x2:0x0:0x0].0 rrc: 1 type: EXT [0->0] (req 0->0) flags: 0x40000000000000 nid: local remote: 0x0 expref: -99 pid: 3526 timeout: 0 lvb_type: 0
            
            jhammond John Hammond added a comment - - edited > The second issue is why re-archiving a file fails. I have not fully diagnosed it yet but what I see is a failure from ll_data_version() at ll_ioc_copy_start(). -ENOENT returned at some point in the OFD layer, propagated back and turned into -EIO by tgt_extent_lock(). When the archive starts ll_data_version() uses the old layout (the layout that was valid before the file was released). t:lustre# dd if=/dev/urandom of=f0 bs=1M count=1 1+0 records in 1+0 records out 1048576 bytes (1.0 MB) copied, 0.23944 s, 4.4 MB/s t:lustre# lfs getstripe f0 f0 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: 1 lmm_layout_gen: 0 lmm_stripe_offset: 0 obdidx objid objid group 0 2 0x2 0 t:lustre# lfs hsm_archive f0 t:lustre# lfs hsm_state f0 f0: (0x00000001) exists, archive_id:1 t:lustre# lfs hsm_state f0 f0: (0x00000009) exists archived, archive_id:1 t:lustre# lfs hsm_release f0 t:lustre# lfs hsm_state f0 f0: (0x0000000d) released exists archived, archive_id:1 t:lustre# dd if=/dev/urandom of=f0 bs=1M count=1 1+0 records in 1+0 records out 1048576 bytes (1.0 MB) copied, 0.239978 s, 4.4 MB/s t:lustre# lfs hsm_state f0 f0: (0x0000000b) exists dirty archived, archive_id:1 t:lustre# t:lustre# t:lustre# date +%s; lctl clear; lfs hsm_archive f0; lctl dk > /tmp/1.dk; sleep 1; lctl dk >> /tmp/1.dk 1430327516 t:lustre# lfs getstripe f0 f0 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: 1 lmm_layout_gen: 2 lmm_stripe_offset: 0 obdidx objid objid group 0 3 0x3 0 The error message from the OST_GETATTR handler show that the CT client send the old object id: [ 194.727468] LustreError: 3526:0:(ldlm_request.c:452:ldlm_cli_enqueue_local()) ### delayed lvb init failed (rc -2) ns: filter-lustre-OST0000_UUID lock: ffff8800d8bd2e90/0xba0c6f5e9969fb9 lrc: 2/0,0 mode: --/PR res: [0x2:0x0:0x0].0 rrc: 1 type: EXT [0->0] (req 0->0) flags: 0x40000000000000 nid: local remote: 0x0 expref: -99 pid: 3526 timeout: 0 lvb_type: 0

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14335/
            Subject: LU-5683 llite: Report first encountered error
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 93ca2db26b4eecc15ee96289370a588b2d7bb35a

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14335/ Subject: LU-5683 llite: Report first encountered error Project: fs/lustre-release Branch: master Current Patch Set: Commit: 93ca2db26b4eecc15ee96289370a588b2d7bb35a

            Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/14377
            Subject: LU-5683 llite: Inform copytool of dataversion changes
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2df254332de7046b2939e7925d522c5729879b8a

            gerrit Gerrit Updater added a comment - Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/14377 Subject: LU-5683 llite: Inform copytool of dataversion changes Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 2df254332de7046b2939e7925d522c5729879b8a

            People

              jhammond John Hammond
              mkukat Michael Kukat
              Votes:
              0 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: