Lustre / LU-7372

replay-dual test_26: test failed to respond and timed out

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5
    • Fix Version/s: None
    • Environment: Server/Client: master, build # 3225, RHEL 6.7
    • Severity: 3

      Description

      This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/1e79d2a6-7d21-11e5-a254-5254006e85c2.

      The sub-test test_26 failed with the following error:

      test failed to respond and timed out
      

      Client dmesg:

      Lustre: DEBUG MARKER: test_26 fail mds1 1 times
      LustreError: 980:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1445937610, 300s ago), entering recovery for MGS@10.2.4.140@tcp ns: MGC10.2.4.140@tcp lock: ffff88007bdd82c0/0x956ab2c8047544d6 lrc: 4/1,0 mode: --/CR res: [0x65727473756c:0x2:0x0].0x0 rrc: 1 type: PLN flags: 0x1000000000000 nid: local remote: 0x223a79061b204538 expref: -99 pid: 980 timeout: 0 lvb_type: 0
      Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1445937910/real 1445937910]  req@ffff880028347980 x1516173751413108/t0(0) o250->MGC10.2.4.140@tcp@10.2.4.140@tcp:26/25 lens 520/544 e 0 to 1 dl 1445937916 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) Skipped 67 previous similar messages
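
      Two details in the client messages above can be checked by hand. The sketch below is only an illustration: it assumes the first word of the lock resource ID (0x65727473756c) is a name packed as little-endian ASCII, and it compares the lock's enqueue time plus the reported 300s wait against the send time of the expired MGC request. The helper name decode_res_name is hypothetical, not a Lustre function.

```python
def decode_res_name(word: int) -> str:
    """Read a 64-bit resource word as little-endian ASCII, dropping NUL padding.

    Assumption: the first word of the resource ID carries a packed name.
    """
    return word.to_bytes(8, "little").rstrip(b"\x00").decode("ascii")


# Values copied from the client dmesg lines above.
res_word = 0x65727473756C    # res: [0x65727473756c:0x2:0x0]
enqueued_at = 1445937610     # "enqueued at 1445937610"
waited_s = 300               # "300s ago"
rpc_sent_at = 1445937910     # "sent 1445937910/real 1445937910"

fsname = decode_res_name(res_word)
timeout_matches_send = (enqueued_at + waited_s == rpc_sent_at)

print(fsname)
print(timeout_matches_send)
```

      If that reading of the resource ID is right, the timed-out lock is the MGC lock for the "lustre" filesystem, and the lock timeout was logged in the same second the expired request to the MGS was sent.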
      

      MDS console:

      09:22:17:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004d92c980 x1516158358024328/t0(0) o101->lustre-MDT0000-lwp-MDT0000@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      09:25:19:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 6 previous similar messages
      09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) lustre-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-5
      09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) Skipped 1 previous similar message
      09:25:19:INFO: task umount:24629 blocked for more than 120 seconds.
      09:25:19:      Not tainted 2.6.32-573.7.1.el6_lustre.x86_64 #1
      09:25:19:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      09:25:19:umount        D 0000000000000000     0 24629  24628 0x00000080
      09:25:19: ffff880059e2bb48 0000000000000086 0000000000000000 00000000000708b7
      09:25:20: 0000603500000000 000000ac00000000 00001c1fd9b9c014 ffff880059e2bb98
      09:25:20: ffff880059e2bb58 0000000101d3458a ffff880076ee3ad8 ffff880059e2bfd8
      09:25:20:Call Trace:
      09:25:20: [<ffffffff8153a756>] __mutex_lock_slowpath+0x96/0x210
      09:25:20: [<ffffffff8153a27b>] mutex_lock+0x2b/0x50
      09:25:20: [<ffffffffa02cb30d>] mgc_process_config+0x1dd/0x1210 [mgc]
      09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      09:25:20: [<ffffffffa07fe28d>] obd_process_config.clone.0+0x8d/0x2e0 [obdclass]
      09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      09:25:20: [<ffffffffa08024c2>] lustre_end_log+0x262/0x6a0 [obdclass]
      09:25:20: [<ffffffffa082efb1>] server_put_super+0x911/0xed0 [obdclass]
      09:25:20: [<ffffffff811b0116>] ? invalidate_inodes+0xf6/0x190
      09:25:20: [<ffffffff8119437b>] generic_shutdown_super+0x5b/0xe0
      09:25:20: [<ffffffff81194466>] kill_anon_super+0x16/0x60
      09:25:20: [<ffffffffa07fa096>] lustre_kill_super+0x36/0x60 [obdclass]
      09:25:20: [<ffffffff81194c07>] deactivate_super+0x57/0x80
      09:25:20: [<ffffffff811b4a7f>] mntput_no_expire+0xbf/0x110
      09:25:20: [<ffffffff811b55cb>] sys_umount+0x7b/0x3a0
      09:25:20: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
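
      For triage, the call trace above can be reduced to its reliable frames. This is a rough sketch, not a Lustre or kernel tool: the regular expression and the parse_frames helper are assumptions written for the frame layout shown in this console log, and frames marked "?" are treated as unreliable stack entries and skipped.

```python
import re

# Matches e.g. "[<ffffffffa02cb30d>] mgc_process_config+0x1dd/0x1210 [mgc]".
# Group 1: optional "? " marker, group 2: symbol, group 3: optional module.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(lines):
    """Return (symbol, module, reliable) tuples for each stack frame line."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if m:
            frames.append((m.group(2), m.group(3) or "kernel", m.group(1) is None))
    return frames


# First frames of the umount trace, copied from the MDS console above.
trace = [
    "09:25:20: [<ffffffff8153a756>] __mutex_lock_slowpath+0x96/0x210",
    "09:25:20: [<ffffffff8153a27b>] mutex_lock+0x2b/0x50",
    "09:25:20: [<ffffffffa02cb30d>] mgc_process_config+0x1dd/0x1210 [mgc]",
    "09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]",
    "09:25:20: [<ffffffffa07fe28d>] obd_process_config.clone.0+0x8d/0x2e0 [obdclass]",
]

frames = parse_frames(trace)
reliable = [(fn, mod) for fn, mod, ok in frames if ok]
# Innermost frame that belongs to a module rather than the core kernel.
first_module_frame = next(fn for fn, mod in reliable if mod != "kernel")
print(first_module_frame)
```

      On this trace the innermost module frame is mgc_process_config, directly above mutex_lock, which matches the report: umount is blocked waiting for a mutex inside the MGC while server_put_super is ending the config log.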
      

      Info required for matching: replay-dual test_26

        Attachments

        1. 1453855057.tgz
          24.15 MB
        2. log-7372
          65 kB

              People

              • Assignee: bobijam Zhenyu Xu
              • Reporter: maloo Maloo
              • Votes: 0
              • Watchers: 18
