Lustre / LU-16404

sanity test_230n: lfs mirror: cannot get UNLOCK lease, Mirroring failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Severity: 3

    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/65516983-67f2-4883-b7ab-1cb5abdf58e5

      test_230n failed with the following error:

      Mirroring failed
      

      Test output is:

      == sanity test 230n: Dir migration with mirrored file ==== 22:36:31 (1671057391)
      lfs mirror mirror: cannot get UNLOCK lease, ext 4: No such file or directory (2)
      error: lfs mirror extend: /mnt/lustre/d230n.sanity/f230n.sanity: cannot merge layout: No such file or directory
       sanity test_230n: @@@@@@ FAIL: Mirroring failed 
      

      In client dmesg, we can see the client was evicted by the MDS:

      [13212.828154] LustreError: 866765:0:(file.c:242:ll_close_inode_openhandle()) lustre-clilmv-ffff93f6048ef000:
          inode [0x200002b15:0x5b9b:0x0] mdc close failed: rc = -2
      [13212.831301] LustreError: 11-0: lustre-MDT0000-mdc-ffff93f6048ef000: operation ldlm_enqueue to node 10.240.38.63@tcp failed: rc = -107
      [13212.831308] Lustre: lustre-MDT0000-mdc-ffff93f6048ef000: Connection to lustre-MDT0000 (at 10.240.38.63@tcp) was lost;
          in progress operations using this service will wait for recovery to complete
      [13212.833555] LustreError: Skipped 1 previous similar message
      [13212.838262] LustreError: 167-0: lustre-MDT0000-mdc-ffff93f6048ef000: This client was evicted by lustre-MDT0000;
          in progress operations using this service will fail.
      [13212.847340] Lustre: lustre-MDT0000-mdc-ffff93f6048ef000: Connection restored to 10.240.38.63@tcp (at 10.240.38.63@tcp)
      [13213.270161] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_230n: @@@@@@ FAIL: Mirroring failed 
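
As a reading aid for the log above: the negative `rc` values are negated Linux errno codes. The sketch below is a minimal illustration, not part of the original report; it assumes the standard Linux errno numbering, and the small lookup table and the helper name `decode_rc` are hypothetical, hard-coded so the result does not depend on the host platform.

```python
import re

# Linux errno numbers relevant to this log (hard-coded assumption;
# the numeric values differ on other operating systems).
LINUX_ERRNO = {2: "ENOENT", 107: "ENOTCONN"}

RC_RE = re.compile(r"rc = (-?\d+)")


def decode_rc(line):
    """Return (rc, errno_name) for a log line containing 'rc = N', else None."""
    m = RC_RE.search(line)
    if m is None:
        return None
    rc = int(m.group(1))
    return rc, LINUX_ERRNO.get(-rc, "UNKNOWN")


# Message fragments copied from the client dmesg excerpt.
client_log = [
    "inode [0x200002b15:0x5b9b:0x0] mdc close failed: rc = -2",
    "operation ldlm_enqueue to node 10.240.38.63@tcp failed: rc = -107",
]

for line in client_log:
    print(decode_rc(line))
```

Here -2 (ENOENT) matches the "No such file or directory (2)" reported by `lfs mirror extend`, and -107 (ENOTCONN) is consistent with the lost connection to lustre-MDT0000 that accompanies the eviction.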
      

      On the MDS side, the lock callback timer expired and the client was evicted as unresponsive:

      [13026.421567] Lustre: DEBUG MARKER: == sanity test 230n: Dir migration with mirrored file ==== 03:41:59 (1679283719)
      [13070.199369] Lustre: mdt_rdpg00_003: service thread pid 541635 was inactive for 42.970 seconds. The thread might be hung,
           or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [13070.203029] Pid: 541635, comm: mdt_rdpg00_003 4.18.0-348.23.1.el8_lustre.x86_64 #1 SMP Thu Mar 2 00:54:25 UTC 2023
      [13070.204965] Call Trace TBD:
      [13070.206179] ldlm_completion_ast+0x7ac/0x900 [ptlrpc]
      [13070.207308] ldlm_cli_enqueue_local+0x307/0x860 [ptlrpc]
      [13070.208596] mdt_object_local_lock+0x509/0xb30 [mdt]
      [13070.209666] mdt_object_lock_internal+0x18d/0x4a0 [mdt]
      [13070.210781] mdt_object_lock+0x1b/0x20 [mdt]
      [13070.211735] mdt_close_handle_layouts+0x935/0x10b0 [mdt]
      [13070.212865] mdt_mfd_close+0x510/0xbc0 [mdt]
      [13070.213811] mdt_close_internal+0xcc/0x250 [mdt]
      [13070.214833] mdt_close+0x2c0/0x8b0 [mdt]
      [13070.215745] tgt_request_handle+0xc8c/0x1950 [ptlrpc]
      [13070.216850] ptlrpc_server_handle_request+0x31d/0xbc0 [ptlrpc]
      [13070.218096] ptlrpc_main+0xc4e/0x1510 [ptlrpc]
      [13127.540308] LustreError: 482890:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s:
           evicting client at 10.240.38.60@tcp  ns: mdt-lustre-MDT0000_UUID lock: 0000000001a8df2b/0x3c9ac870599a2eb7
           lrc: 3/0,0 mode: CR/CR res: [0x200002b15:0x5b9b:0x0].0x0 bits 0xd/0x0 rrc: 4 type: IBT gid 0 flags: 0x60200400000020
           nid: 10.240.38.60@tcp remote: 0x5d158cc36e2e425f expref: 49 pid: 537456 timeout: 13128 lvb_type: 0
      [13128.435958] Lustre: DEBUG MARKER: sanity test_230n: @@@@@@ FAIL: Mirroring failed
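
A rough cross-check of the timing in the MDS log above can be sketched as follows. This is only an illustration under one assumption: that the inactivity counter of the service thread started when it began waiting for the lock in `mdt_close_handle_layouts()`. The regular expressions and variable names are hypothetical; the two log lines are shortened copies of the ones shown.

```python
import re

TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")
INACTIVE_RE = re.compile(r"was inactive for (\d+\.\d+) seconds")


def timestamp(line):
    """Return the seconds-since-boot timestamp at the start of a dmesg line."""
    return float(TS_RE.match(line).group(1))


# Shortened copies of the two relevant MDS dmesg lines.
hung_line = ("[13070.199369] Lustre: mdt_rdpg00_003: service thread pid 541635 "
             "was inactive for 42.970 seconds.")
evict_line = ("[13127.540308] LustreError: 482890:0:(ldlm_lockd.c:261:"
              "expired_lock_main()) ### lock callback timer expired after 100s:")

# When the service thread stopped making progress (assumed start of the lock wait).
inactive = float(INACTIVE_RE.search(hung_line).group(1))
wait_start = timestamp(hung_line) - inactive

# Time from that point until the client was evicted.
elapsed = timestamp(evict_line) - wait_start

print(f"thread blocked since ~{wait_start:.3f}")
print(f"eviction {elapsed:.3f}s later")
```

Under that assumption the thread has been blocked since roughly 13027.23, just after the test start marker at 13026.42, and the eviction follows about 100.3 seconds later, which agrees with the "expired after 100s" message.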
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_230n - Mirroring failed

            People

              Assignee: wc-triage (WC Triage)
              Reporter: maloo (Maloo)