Lustre / LU-4449

Test failure timeout on sanity-scrub test_3: MGS stuck on umount with obd_unlinked_exports

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • None
    • None
    • None
    • 3
    • 12199

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite run:
      http://maloo.whamcloud.com/test_sets/5e3fd144-766e-11e3-b3c0-52540035b04c.

      The sub-test test_3 failed with the following error:

      test failed to respond and timed out

      Info required for matching: sanity-scrub 3

      MDS/MGS console log

      09:39:21:LustreError: 166-1: MGC10.10.17.217@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
      09:39:21:Lustre: MGS is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?
      09:39:21:Lustre: MGS is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck?
      09:39:21:Lustre: MGS is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 5. Is it stuck?
      09:39:23:Lustre: MGS is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 5. Is it stuck?
      09:39:23:Lustre: MGS is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 5. Is it stuck?
      09:39:23:INFO: task umount:19279 blocked for more than 120 seconds.
      09:39:23:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      09:39:23:umount        D 0000000000000000     0 19279  19278 0x00000080
      09:39:23: ffff880067477aa8 0000000000000086 0000000000000000 ffff88007b83a000
      09:39:23: ffffffffa0931ce3 0000000000000000 ffff88005b046084 ffffffffa0931ce3
      09:39:24: ffff880060265af8 ffff880067477fd8 000000000000fb88 ffff880060265af8
      09:39:24:Call Trace:
      09:39:24: [<ffffffff8150f3f2>] schedule_timeout+0x192/0x2e0
      09:39:24: [<ffffffff810811e0>] ? process_timeout+0x0/0x10
      09:39:24: [<ffffffffa08b767b>] obd_exports_barrier+0xab/0x180 [obdclass]
      09:39:24: [<ffffffffa12c252e>] mgs_device_fini+0xfe/0x580 [mgs]
      09:39:24: [<ffffffffa08e0013>] class_cleanup+0x573/0xd30 [obdclass]
      09:39:24: [<ffffffffa08b9816>] ? class_name2dev+0x56/0xe0 [obdclass]
      09:39:24: [<ffffffffa08e1d3a>] class_process_config+0x156a/0x1ad0 [obdclass]
      09:39:24: [<ffffffffa08da013>] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
      09:39:24: [<ffffffffa08e2419>] class_manual_cleanup+0x179/0x6f0 [obdclass]
      09:45:18: [<ffffffffa08b9816>] ? class_name2dev+0x56/0xe0 [obdclass]
      09:45:18: [<ffffffffa091b51b>] server_put_super+0x94b/0xe30 [obdclass]
      09:45:18: [<ffffffff8118366b>] generic_shutdown_super+0x5b/0xe0
      09:45:18: [<ffffffff81183756>] kill_anon_super+0x16/0x60
      09:45:18: [<ffffffffa08e42d6>] lustre_kill_super+0x36/0x60 [obdclass]
      09:45:18: [<ffffffff81183ef7>] deactivate_super+0x57/0x80
      09:45:18: [<ffffffff811a21ef>] mntput_no_expire+0xbf/0x110
      09:45:18: [<ffffffff811a2c5b>] sys_umount+0x7b/0x3a0
      09:45:18: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
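
      The repeated "waiting for obd_unlinked_exports" warnings above report an unchanged refcount while the reported wait time doubles. As a rough aid for triaging similar console logs, the following is a minimal sketch that extracts those warnings and flags a refcount that never changes. The function names and the "stuck" heuristic are illustrative assumptions, not part of Lustre or Maloo tooling.

```python
import re

# Matches the console warning format seen in this ticket's MDS/MGS log.
WARN_RE = re.compile(
    r"Lustre: (?P<obd>\S+) is waiting for obd_unlinked_exports "
    r"more than (?P<secs>\d+) seconds\. "
    r"The obd refcount = (?P<refcount>\d+)\."
)


def parse_barrier_warnings(log_text):
    """Return a list of (obd name, seconds waited, refcount) tuples."""
    return [
        (m.group("obd"), int(m.group("secs")), int(m.group("refcount")))
        for m in WARN_RE.finditer(log_text)
    ]


def looks_stuck(warnings):
    """Hypothetical heuristic: two or more warnings with an identical refcount."""
    refcounts = [refcount for _, _, refcount in warnings]
    return len(refcounts) >= 2 and len(set(refcounts)) == 1


# Two lines copied from the console log in this ticket.
SAMPLE = (
    "09:39:21:Lustre: MGS is waiting for obd_unlinked_exports more than 8 seconds. "
    "The obd refcount = 5. Is it stuck?\n"
    "09:39:21:Lustre: MGS is waiting for obd_unlinked_exports more than 16 seconds. "
    "The obd refcount = 5. Is it stuck?\n"
)

if __name__ == "__main__":
    found = parse_barrier_warnings(SAMPLE)
    print(found, looks_stuck(found))
```

      A constant refcount across warnings only suggests that references are not being released; it does not identify which export or subsystem is holding them.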
      

      Possibly the fix for LU-4365 is not complete?

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Maloo (maloo)