Lustre / LU-9113

insanity test_0 umount fails for /mnt/lustre-mds1, "Fail all nodes" test can't start

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: Lustre 2.10.0
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      onyx-30vm1-3/7/8, Full Group test,
      master branch, v2.9.52, b3520,
      DNE, ZFS
    • Severity:
      3

      Description

      https://testing.hpdd.intel.com/test_sets/3c80e50a-efe9-11e6-8c0d-5254006e85c2

      The client tries multiple times, without success, to unmount mds1, and the unmount eventually times out.

      From MDS console:

      02:50:15:[ 4080.084137] INFO: task umount:19374 blocked for more than 120 seconds.
      02:50:15:[ 4080.086174] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      02:50:15:[ 4080.088282] umount          D ffff8800793b7fc0     0 19374  19373 0x00000080
      02:50:15:[ 4080.090408]  ffff880056d43bd0 0000000000000086 ffff8800422ebec0 ffff880056d43fd8
      02:50:15:[ 4080.092523]  ffff880056d43fd8 ffff880056d43fd8 ffff8800422ebec0 ffff8800793b7fb8
      02:50:15:[ 4080.094612]  ffff8800793b7fbc ffff8800422ebec0 00000000ffffffff ffff8800793b7fc0
      02:50:15:[ 4080.096724] Call Trace:
      02:50:15:[ 4080.098391]  [<ffffffff8168cad9>] schedule_preempt_disabled+0x29/0x70
      02:50:15:[ 4080.100447]  [<ffffffff8168a735>] __mutex_lock_slowpath+0xc5/0x1c0
      02:50:15:[ 4080.102429]  [<ffffffff81689b9f>] mutex_lock+0x1f/0x2f
      02:50:15:[ 4080.104322]  [<ffffffffa0ce6a56>] mgc_process_config+0x7d6/0x1400 [mgc]
      02:50:15:[ 4080.106336]  [<ffffffff810bc064>] ? __wake_up+0x44/0x50
      02:50:15:[ 4080.108272]  [<ffffffffa0b37225>] obd_process_config.constprop.14+0x85/0x2d0 [obdclass]
      02:50:15:[ 4080.110413]  [<ffffffffa0b375f0>] ? lustre_cfg_new+0x180/0x400 [obdclass]
      02:50:15:[ 4080.112481]  [<ffffffffa0b39440>] lustre_end_log+0xf0/0x5c0 [obdclass]
      02:50:15:[ 4080.114533]  [<ffffffffa0b61d2e>] server_put_super+0x7de/0xcd0 [obdclass]
      02:50:15:[ 4080.116595]  [<ffffffff81200802>] generic_shutdown_super+0x72/0xf0
      02:50:15:[ 4080.118594]  [<ffffffff81200bd2>] kill_anon_super+0x12/0x20
      02:50:15:[ 4080.120545]  [<ffffffffa0b36db2>] lustre_kill_super+0x32/0x50 [obdclass]
      02:50:15:[ 4080.122589]  [<ffffffff81200f89>] deactivate_locked_super+0x49/0x60
      02:50:15:[ 4080.124609]  [<ffffffff81201586>] deactivate_super+0x46/0x60
      02:50:15:[ 4080.126559]  [<ffffffff8121e9c5>] mntput_no_expire+0xc5/0x120
      02:50:15:[ 4080.128491]  [<ffffffff8121fb00>] SyS_umount+0xa0/0x3b0
      02:50:15:[ 4080.130375]  [<ffffffff81696949>] system_call_fastpath+0x16/0x1b
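
      For quick triage of console logs like the one above, the hung-task watchdog line can be parsed to pull out the blocked task name and PID. A minimal shell sketch (the sample line is copied verbatim from the trace; the `sed` patterns are illustrative, not part of any Lustre tooling):

      ```shell
      # Sample watchdog line, taken from the MDS console log above.
      line='INFO: task umount:19374 blocked for more than 120 seconds.'

      # Extract the task name (between "task " and ":") and the PID (after ":").
      task=$(printf '%s\n' "$line" | sed -n 's/.*task \([^:]*\):\([0-9]*\) blocked.*/\1/p')
      pid=$(printf '%s\n' "$line" | sed -n 's/.*task \([^:]*\):\([0-9]*\) blocked.*/\2/p')

      echo "$task $pid"
      ```

      Running this against live output (`dmesg | grep 'blocked for more than'`) gives the same fields for every task the hung-task detector reports.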
      


              People

              • Assignee: WC Triage (wc-triage)
              • Reporter: James Casper (casperjx) (Inactive)
              • Votes: 0
              • Watchers: 1
