Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-7302

sanity-lfsck test_23b hangs on START_NAMESPACE

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • None
    • None
    • Configuration : 4 node setup : 1 MDS/ 1 OSS/ 2 clients
      Release
      Server 2.7.60
      Client 2.7.60
    • 3
    • 9223372036854775807

    Description

      stdout.log
      Total allocated inode limit: 0, total allocated block limit: 0
      
      == sanity-lfsck test 23b: LFSCK can repair dangling name entry (2) == 01:58:28 (1443319108)
      #####
      The objectA has multiple hard links, one of them corresponding
      to the name entry_B. But there is something wrong for the name
      entry_B and cause entry_B to references non-exist object_C.
      In the first-stage scanning, the LFSCK will think the entry_B
      as dangling, and re-create the lost object_C. When the LFSCK
      comes to the second-stage scanning, it will find that the
      former re-creating object_C is not proper, and will try to
      replace the object_C with the real object_A.
      Inject failure stub on MDT0 to simulate dangling name entry
      fail_loc=0x1621
      fail_loc=0
      'ls' should fail because of dangling name entry
      Trigger namespace LFSCK to find out dangling name entry
      
      
      stderr.log
      pdsh@fre0203: fre0201: ssh exited with exit code 1
      pdsh@fre0203: fre0202: ssh exited with exit code 1
      Using TIMEOUT=20
      excepting tests: 
      
      
      
      

      In the ticket LU-7217 similar timeout is seen during test_23c due to

      #define OBD_FAIL_LFSCK_DELAY3           0x1602
              do_facet $SINGLEMDS $LCTL set_param fail_val=10 fail_loc=0x1602
      

      in test_23b I see following,

      4178 00000080:00200000:1.0:1443429106.393336:0:14812:0:(dcache.c:424:ll_revalidate_nd()) VFS Op:name=d23b.sanity-lfsck, flags=0
      4179 00000080:00200000:1.0:1443429106.393338:0:14812:0:(file.c:3217:__ll_inode_revalidate()) VFS Op:inode=[0x200000401:0x1:0x0](ffff88011bc48140),name=d23b.sanity-lfsck
      4180 00000002:00010000:1.0:1443429106.393339:0:14812:0:(mdc_locks.c:1114:mdc_intent_lock()) (name: ,[0x200000401:0x1:0x0]) in obj [0x200000401:0x1:0x0], intent: getattr flags 00
      4181 00010000:00010000:1.0:1443429106.393341:0:14812:0:(ldlm_lock.c:1493:ldlm_lock_match()) ### not matched ns ffff88011a7faa00 type 13 mode 30 res 8589935617/1 (0 0)
      4182 00010000:00010000:1.0:1443429106.393350:0:14812:0:(ldlm_lock.c:746:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(CR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb     9fa48c lrc: 3/1,0 mode: --/CR res: [0x200000401:0x1:0x0].0 bits 0x0 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x0 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4183 00010000:00010000:1.0:1443429106.393352:0:14812:0:(ldlm_request.c:925:ldlm_cli_enqueue()) ### client-side enqueue START, flags 0x1000
      4184  ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48c lrc: 3/1,0 mode: --/CR res: [0x200000401:0x1:0x0].0 bits 0x2 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x0 expref:      -99 pid: 14812 timeout: 0 lvb_type: 0
      4185 00010000:00010000:1.0:1443429106.393355:0:14812:0:(ldlm_request.c:980:ldlm_cli_enqueue()) ### sending request ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48c lrc: 3/1,0      mode: --/CR res: [0x200000401:0x1:0x0].0 bits 0x2 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4186 00000100:00100000:1.0:1443429106.393360:0:14812:0:(client.c:1575:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc rm:d5ce60dd-b58f-953f-4b1f-b15d8912a206:14812:1513543112333784:192.168.101     .33@tcp:101
      4187 00000100:00100000:1.0:1443429106.393384:0:14812:0:(client.c:2255:ptlrpc_set_wait()) set ffff88011ae58680 going to sleep for 6 seconds
      4188 00000100:00100000:1.0:1443429106.396347:0:14812:0:(client.c:1970:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc rm:d5ce60dd-b58f-953f-4b1f-b15d8912a206:14812:1513543112333784:192.168.101.     33@tcp:101
      4189 00010000:00010000:1.0:1443429106.396351:0:14812:0:(ldlm_request.c:639:ldlm_cli_enqueue_fini()) ### server returned different mode PR ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c     2a0fb9fa48c lrc: 4/1,0 mode: --/CR res: [0x200000401:0x1:0x0].0 bits 0x2 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4190 00010000:00010000:1.0:1443429106.396354:0:14812:0:(ldlm_request.c:665:ldlm_cli_enqueue_fini()) ### client-side enqueue, new policy data ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d     58c2a0fb9fa48c lrc: 4/1,0 mode: --/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4191 00010000:00010000:1.0:1443429106.396356:0:14812:0:(ldlm_lock.c:1047:ldlm_granted_list_add_lock()) ### About to add lock: ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48c      lrc: 4/1,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4192 00010000:00010000:1.0:1443429106.396359:0:14812:0:(ldlm_request.c:710:ldlm_cli_enqueue_fini()) ### client-side enqueue END ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48     c lrc: 4/1,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4193 00010000:00010000:1.0:1443429106.396362:0:14812:0:(ldlm_lock.c:746:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(PR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb     9fa48c lrc: 5/2,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4194 00010000:00010000:1.0:1443429106.396364:0:14812:0:(ldlm_lock.c:800:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(CR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb     9fa48c lrc: 5/2,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4195 00010000:00010000:1.0:1443429106.396366:0:14812:0:(ldlm_lock.c:891:ldlm_lock_decref_internal()) ### do not add lock into lru list ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0     fb9fa48c lrc: 4/1,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4196 00000002:00100000:1.0:1443429106.396369:0:14812:0:(mdc_locks.c:609:mdc_finish_enqueue()) @@@ op: 8 disposition: b, status: 0  req@ffff88011bf403c0 x1513543112333784/t0(0) o101->lustre-MDT0000-mdc-ffff880     1198a9400@192.168.101.33@tcp:12/10 lens 696/536 e 0 to 0 dl 1443429113 ref 1 fl Complete:R/0/0 rc 0/0
      4197 00000002:00010000:1.0:1443429106.396373:0:14812:0:(mdc_locks.c:979:mdc_finish_intent_lock()) ### matching against this ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48c lr     c: 3/1,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4198 00010000:00010000:1.0:1443429106.396383:0:14812:0:(ldlm_lock.c:1493:ldlm_lock_match()) ### not matched ns ffff88011a7faa00 type 13 mode 4 res 8589935617/1 (0 0)
      4199 00000080:00200000:1.0:1443429106.396390:0:14812:0:(llite_lib.c:1952:ll_update_inode()) inode=[0x200000401:0x1:0x0], updating i_size 4096
      4200 00000080:00010000:1.0:1443429106.396392:0:14812:0:(dcache.c:346:ll_lookup_finish_locks()) setting l_data to inode [0x200000401:0x1:0x0](ffff88011bc48140)
      4201 00000080:00010000:1.0:1443429106.396393:0:14812:0:(llite_internal.h:1394:ll_set_lock_data()) setting l_data to inode [0x200000401:0x1:0x0](ffff88011bc48140) for lock 0x2d58c2a0fb9fa48c
      4202 00000080:00010000:1.0:1443429106.396395:0:14812:0:(dcache.c:249:ll_intent_drop_lock()) releasing lock with cookie 0x2d58c2a0fb9fa48c from it ffff88011bf77d28
      4203 00010000:00010000:1.0:1443429106.396395:0:14812:0:(ldlm_lock.c:800:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb     9fa48c lrc: 3/1,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4204 00010000:00010000:1.0:1443429106.396397:0:14812:0:(ldlm_lock.c:874:ldlm_lock_decref_internal()) ### add lock into lru list ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd64ac0/0x2d58c2a0fb9fa48     c lrc: 2/0,0 mode: PR/PR res: [0x200000401:0x1:0x0].0 bits 0x13 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129b042 expref: -99 pid: 14812 timeout: 0 lvb_type: 0
      4205 00000080:00200000:1.0:1443429106.396417:0:14812:0:(file.c:3217:__ll_inode_revalidate()) VFS Op:inode=[0x200000007:0x1:0x0](ffff88011bc49c40),name=/
      4206 00000002:00010000:1.0:1443429106.396419:0:14812:0:(mdc_locks.c:1114:mdc_intent_lock()) (name: ,[0x200000007:0x1:0x0]) in obj [0x200000007:0x1:0x0], intent: lookup flags 00
      4207 00010000:00010000:1.0:1443429106.396421:0:14812:0:(ldlm_lock.c:746:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(PR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd644c0/0x2d58c2a0fb     9fa415 lrc: 2/1,0 mode: PR/PR res: [0x200000007:0x1:0x0].0 bits 0x11 rrc: 3 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129ae97 expref: -99 pid: 11235 timeout: 0 lvb_type: 0
      4208 00010000:00010000:1.0:1443429106.396423:0:14812:0:(ldlm_lock.c:1472:ldlm_lock_match()) ### matched (0 0) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd644c0/0x2d58c2a0fb9fa415 lrc: 2/1,0 mode:      PR/PR res: [0x200000007:0x1:0x0].0 bits 0x11 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x1c3b041d4129ae97 expref: -99 pid: 11235 timeout: 0 lvb_type: 0
      4209 00000080:00010000:1.0:1443429106.396426:0:14812:0:(dcache.c:346:ll_lookup_finish_locks()) setting l_data to inode [0x200000007:0x1:0x0](ffff88011bc49c40)
      4210 00000080:00010000:1.0:1443429106.396427:0:14812:0:(llite_internal.h:1394:ll_set_lock_data()) setting l_data to inode [0x200000007:0x1:0x0](ffff88011bc49c40) for lock 0x2d58c2a0fb9fa415
      4211 00000080:00010000:1.0:1443429106.396428:0:14812:0:(dcache.c:249:ll_intent_drop_lock()) releasing lock with cookie 0x2d58c2a0fb9fa415 from it ffff88011bf77b68
      4212 00010000:00010000:1.0:1443429106.396428:0:14812:0:(ldlm_lock.c:800:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd644c0/0x2d58c2a0fb     9fa415 lrc: 3/1,0 mode: PR/PR res: [0x200000007:0x1:0x0].0 bits 0x11 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129ae97 expref: -99 pid: 11235 timeout: 0 lvb_type: 0
      4213 00010000:00010000:1.0:1443429106.396431:0:14812:0:(ldlm_lock.c:874:ldlm_lock_decref_internal()) ### add lock into lru list ns: lustre-MDT0000-mdc-ffff8801198a9400 lock: ffff88011bd644c0/0x2d58c2a0fb9fa41     5 lrc: 2/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0 bits 0x11 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x1c3b041d4129ae97 expref: -99 pid: 11235 timeout: 0 lvb_type: 0
      4214 00000080:00200000:1.0:1443429106.396435:0:14812:0:(file.c:3510:ll_inode_permission()) VFS Op:inode=[0x200000007:0x1:0x0](ffff88011bc49c40), inode mode 41ed mask 1
      

      Attachments

        1. 23b.lctl.tgz
          567 kB
        2. 26th.oct.23b.lctl.tgz
          1.93 MB
        3. MRP_2979_ptldebug5.txt
          1.86 MB

        Activity

          People

            yong.fan nasf (Inactive)
            parinay parinay v kondekar (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: