Lustre / LU-895

1.8<->2.2 interop: test connectathon hang


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Environment:
      server: lustre-master build #353 RHEL6-x86_64
      client: 1.8.6-wc1

    Description

      While running the parallel-scale test_connectathon lock test, the system hangs. Please find the MDS trace attached. This issue is reproducible.

      client:
      -------------------------------------------------------------------------
      root 6347 0.0 0.0 0 0 ? S 22:14 0:00 [ldlm_bl_03]
      root 6535 0.0 0.0 107324 2144 ttyS0 S+ 22:16 0:00 bash /usr/lib64/lustre/tests/parallel-scale.sh
      root 6536 0.0 0.0 100896 640 ttyS0 S+ 22:16 0:00 tee /tmp/test_logs/2011-12-03/210437/parallel-scale.test_con
      root 10965 0.0 0.0 107324 2192 ttyS0 S+ 22:18 0:00 bash /usr/lib64/lustre/tests/parallel-scale.sh
      root 10967 0.0 0.0 106088 1344 ttyS0 S+ 22:18 0:00 sh runtests -f
      root 10974 0.0 0.0 6672 576 ttyS0 S+ 22:18 0:00 tlocklfs /mnt/lustre/d0.connectathon
      root 10975 0.0 0.0 6508 316 ttyS0 S+ 22:18 0:00 tlocklfs /mnt/lustre/d0.connectathon

      client trace:
      --------------------------------------------------------------------------
      tlocklfs S 0000000000000001 0 10974 10967 0x00000080
      ffff8802b9987ca8 0000000000000086 0000000000000000 0000000000000082
      0000000000000001 ffff8802b1ac5cb8 0000000000000000 0000000100518209
      ffff88031f8dd0b8 ffff8802b9987fd8 000000000000f598 ffff88031f8dd0b8
      Call Trace:
      [<ffffffff8117bf7b>] pipe_wait+0x5b/0x80
      [<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
      [<ffffffff814dbc1e>] ? mutex_lock+0x1e/0x50
      [<ffffffff8117c9d6>] pipe_read+0x3e6/0x4e0
      [<ffffffff811723ea>] do_sync_read+0xfa/0x140
      [<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
      [<ffffffff811bc395>] ? fcntl_setlk+0x75/0x320
      [<ffffffff81204ef6>] ? security_file_permission+0x16/0x20
      [<ffffffff81172e15>] vfs_read+0xb5/0x1a0
      [<ffffffff810d1ac2>] ? audit_syscall_entry+0x272/0x2a0
      [<ffffffff81172f51>] sys_read+0x51/0x90
      [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
      tlocklfs S 0000000000000001 0 10975 10974 0x00000080
      ffff880247309a98 0000000000000082 0000000000000000 0000020300002adf
      0000000000000000 ffffffffa051119b ffff880300000075 000000010051777e
      ffff88031a337ab8 ffff880247309fd8 000000000000f598 ffff88031a337ab8
      Call Trace:
      [<ffffffffa04c300d>] ldlm_flock_completion_ast+0x61d/0x9f0 [ptlrpc]
      [<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
      [<ffffffffa04b1565>] ldlm_cli_enqueue_fini+0x6c5/0xba0 [ptlrpc]
      [<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
      [<ffffffffa04b5074>] ldlm_cli_enqueue+0x344/0x7a0 [ptlrpc]
      [<ffffffffa06c8edd>] ll_file_flock+0x47d/0x6b0 [lustre]
      [<ffffffff81190f40>] ? mntput_no_expire+0x30/0x110
      [<ffffffffa04c29f0>] ? ldlm_flock_completion_ast+0x0/0x9f0 [ptlrpc]
      [<ffffffff8117f451>] ? path_put+0x31/0x40
      [<ffffffff811bc243>] vfs_lock_file+0x23/0x40
      [<ffffffff811bc497>] fcntl_setlk+0x177/0x320
      [<ffffffff811845f7>] sys_fcntl+0x197/0x530
      [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

      Attachments

            People

              Assignee: WC Triage
              Reporter: Sarah Liu