Lustre / LU-2015

Test failure on test suite obdfilter-survey, subtest test_3a

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • Lustre 2.5.0
    • lustre-master build #1560 zfs
    • 3
    • 4116

    Description

      This issue was created by maloo for Li Wei <liwei@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/6dfc3fae-049c-11e2-bfd4-52540035b04c.

      The sub-test test_3a failed with the following error:

      test failed to respond and timed out

      From the OSS console:

      06:39:39:Lustre: DEBUG MARKER: == obdfilter-survey test 3a: Network survey ========================================================== 06:39:38 (1348234778)
      06:40:00:Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts
      06:40:00:Lustre: DEBUG MARKER: umount -d -f /mnt/ost1
      06:40:01:Lustre: 3533:0:(client.c:1905:ptlrpc_expire_one_request()) @@@ Request  sent has timed out for slow reply: [sent 1348234793/real 1348234793]  req@ffff88003d984800 x1413717737622294/t0(0) o400->MGC10.10.4.186@tcp@10.10.4.186@tcp:26/25 lens 224/224 e 0 to 1 dl 1348234800 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      06:40:01:Lustre: 3533:0:(client.c:1905:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      06:40:01:LustreError: 166-1: MGC10.10.4.186@tcp: Connection to MGS (at 10.10.4.186@tcp) was lost; in progress operations using this service will fail
      06:40:24:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 4. Is it stuck?
      06:40:36:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 4. Is it stuck?
      06:41:07:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 4. Is it stuck?
      06:41:07:Lustre: 3532:0:(client.c:1905:ptlrpc_expire_one_request()) @@@ Request  sent has timed out for slow reply: [sent 1348234845/real 1348234845]  req@ffff88003d984800 x1413717737622299/t0(0) o250->MGC10.10.4.186@tcp@10.10.4.186@tcp:26/25 lens 400/544 e 0 to 1 dl 1348234866 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      06:41:07:Lustre: 3532:0:(client.c:1905:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
      06:42:09:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 4. Is it stuck?
      06:43:42:Lustre: 3532:0:(client.c:1905:ptlrpc_expire_one_request()) @@@ Request  sent has timed out for slow reply: [sent 1348234990/real 1348234990]  req@ffff88003d984800 x1413717737622304/t0(0) o250->MGC10.10.4.186@tcp@10.10.4.186@tcp:26/25 lens 400/544 e 0 to 1 dl 1348235015 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      06:43:42:Lustre: 3532:0:(client.c:1905:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
      06:44:24:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 4. Is it stuck?
      06:47:58:INFO: task umount:13712 blocked for more than 120 seconds.
      06:47:58:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      06:47:58:umount        D 0000000000000000     0 13712  13711 0x00000080
      06:47:58: ffff880038249ac8 0000000000000086 ffffffff00000010 ffff880038249a78
      06:47:58: ffff880038249a38 0000000000000286 ffffffffa05f0ff0 ffff880052692d54
      06:47:58: ffff8800383245f8 ffff880038249fd8 000000000000fb88 ffff8800383245f8
      06:47:58:Call Trace:
      06:47:58: [<ffffffff814fea92>] schedule_timeout+0x192/0x2e0
      06:47:58: [<ffffffff8107e120>] ? process_timeout+0x0/0x10
      06:47:58: [<ffffffffa03a373d>] cfs_schedule_timeout_and_set_state+0x1d/0x20 [libcfs]
      06:47:58: [<ffffffffa0538828>] obd_exports_barrier+0x98/0x180 [obdclass]
      06:47:59: [<ffffffffa0bc6fb2>] ofd_device_fini+0x42/0x230 [ofd]
      06:47:59: [<ffffffffa055ddc7>] class_cleanup+0x577/0xdc0 [obdclass]
      06:47:59: [<ffffffffa053aa36>] ? class_name2dev+0x56/0xe0 [obdclass]
      06:47:59: [<ffffffffa055f6b5>] class_process_config+0x10a5/0x1ca0 [obdclass]
      06:47:59: [<ffffffffa03a3be0>] ? cfs_alloc+0x30/0x60 [libcfs]
      06:47:59: [<ffffffffa0559043>] ? lustre_cfg_new+0x353/0x7e0 [obdclass]
      06:47:59: [<ffffffffa0560427>] class_manual_cleanup+0x177/0x6f0 [obdclass]
      06:47:59: [<ffffffffa053aa36>] ? class_name2dev+0x56/0xe0 [obdclass]
      06:47:59: [<ffffffffa0569997>] server_put_super+0x5a7/0xcb0 [obdclass]
      06:47:59: [<ffffffff8117d34b>] generic_shutdown_super+0x5b/0xe0
      06:48:00: [<ffffffff8117d436>] kill_anon_super+0x16/0x60
      06:48:00: [<ffffffffa0562066>] lustre_kill_super+0x36/0x60 [obdclass]
      06:48:00: [<ffffffff8117e4b0>] deactivate_super+0x70/0x90
      06:48:00: [<ffffffff8119a4ff>] mntput_no_expire+0xbf/0x110
      06:48:00: [<ffffffff8119af9b>] sys_umount+0x7b/0x3a0
      06:48:00: [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
      06:48:00: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
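
For triage, the `obd_unlinked_exports` warnings in the console output above can be extracted mechanically to see how long the unmount waited and whether the refcount ever dropped. This is a minimal sketch, assuming the log lines keep exactly the format shown in this report. The function name `barrier_waits` and the regular expression are illustrative and are not part of the Lustre test framework.

```python
import re

# Matches the "waiting for obd_unlinked_exports" warning as it appears in the
# OSS console output in this report. The pattern is an assumption derived
# from those lines only.
WAIT_RE = re.compile(
    r"Lustre: (?P<obd>\S+) is waiting for obd_unlinked_exports "
    r"more than (?P<secs>\d+) seconds\. The obd refcount = (?P<ref>\d+)\."
)


def barrier_waits(lines):
    """Return (obd name, seconds waited, refcount) for each barrier warning."""
    found = []
    for line in lines:
        m = WAIT_RE.search(line)
        if m:
            found.append((m.group("obd"), int(m.group("secs")), int(m.group("ref"))))
    return found


if __name__ == "__main__":
    sample = [
        "06:40:24:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports "
        "more than 8 seconds. The obd refcount = 4. Is it stuck?",
        "06:44:24:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports "
        "more than 128 seconds. The obd refcount = 4. Is it stuck?",
    ]
    for obd, secs, ref in barrier_waits(sample):
        print(obd, secs, ref)
```

In the log above, the reported wait doubles from 8 to 128 seconds while the refcount stays at 4 in every message, so a summary like this makes it easy to confirm that the count never decreased before the hung-task warning for `umount` appeared.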
      

      This was master with OFD and LDiskFS OSTs.

      Info required for matching: obdfilter-survey 3a

            People

              mdiep Minh Diep
              maloo Maloo
              Votes: 0
              Watchers: 11
