[LU-7872] conf-sanity: test_50i 'test failed to respond and timed out' Created: 14/Mar/16 Updated: 22/Oct/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Canary patch failed during 'review-dne-part-1'.

This issue was created by maloo for Richard Henwood <richard.henwood@intel.com>

Please provide additional information about the failure here.

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/5e327dde-e86b-11e5-be76-5254006e85c2.

Looks happy enough until:

...
CMD: trevis-45vm1.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n mdc.lustre-MDT0001-mdc-[!M]*.active
CMD: trevis-45vm1.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n mdc.lustre-MDT0001-mdc-[!M]*.active
Updated after 7s: wanted '0' got '0'
error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/d50i.conf-sanity/2' (3): No such device
error: mkdir: create stripe dir '/mnt/lustre/d50i.conf-sanity/2' failed
umount lustre on /mnt/lustre.....
CMD: trevis-45vm1.trevis.hpdd.intel.com grep -c /mnt/lustre' ' /proc/mounts
Stopping client trevis-45vm1.trevis.hpdd.intel.com /mnt/lustre (opts:)
CMD: trevis-45vm1.trevis.hpdd.intel.com lsof -t /mnt/lustre
CMD: trevis-45vm1.trevis.hpdd.intel.com umount /mnt/lustre 2>&1
stop mds service on trevis-45vm7
CMD: trevis-45vm7 grep -c /mnt/mds1' ' /proc/mounts
... |
| Comments |
| Comment by James Nunez (Inactive) [ 14/Mar/16 ] |
|
From the MDS1 and MDS3 console log, we see:

13:37:35:LustreError: 18333:0:(osp_dev.c:1259:osp_device_free()) } header@ffff8800451c2b40
13:37:35:
13:37:35:LustreError: 18333:0:(osp_dev.c:1259:osp_device_free()) header@ffff880040837b00[0x1, 1, [0x200000001:0x1017:0x0] hash exist]{
13:37:35:
13:37:35:LustreError: 18333:0:(osp_dev.c:1259:osp_device_free()) ....local_storage@ffff880040837b50
13:37:35:
13:37:35:LustreError: 18333:0:(osp_dev.c:1259:osp_device_free()) ....osd-ldiskfs@ffff8800451e5480osd-ldiskfs-object@ffff8800451e5480(i:ffff88007bb7f6e0:25001/3959569064)[plain]
13:37:35:
13:37:35:LustreError: 18333:0:(osp_dev.c:1259:osp_device_free()) } header@ffff880040837b00
13:37:35:
13:37:35:LustreError: 18333:0:(lu_object.c:1224:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
13:37:35:LustreError: 18333:0:(lu_object.c:1224:lu_device_fini()) LBUG
13:37:35:Pid: 18333, comm: obd_zombid
13:37:35:
13:37:35:Call Trace:
13:37:35: [<ffffffffa06b6875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:37:35: [<ffffffffa06b6e77>] lbug_with_loc+0x47/0xb0 [libcfs]
13:37:35: [<ffffffffa0fadd38>] lu_device_fini+0xb8/0xc0 [obdclass]
13:37:35: [<ffffffffa0fb36ce>] dt_device_fini+0xe/0x10 [obdclass]
13:37:35: [<ffffffffa185f196>] osp_device_free+0x96/0x180 [osp]
13:37:35: [<ffffffffa0f98a2d>] class_decref+0x3dd/0x4c0 [obdclass]
13:37:35: [<ffffffffa0f84b21>] obd_zombie_impexp_cull+0x611/0x970 [obdclass]
13:37:35: [<ffffffffa0f84ee5>] obd_zombie_impexp_thread+0x65/0x190 [obdclass]
13:37:35: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
13:37:35: [<ffffffffa0f84e80>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass]
13:37:35: [<ffffffff810a0fce>] kthread+0x9e/0xc0
13:37:35: [<ffffffff8100c28a>] child_rip+0xa/0x20
13:37:35: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
13:37:35: [<ffffffff8100c280>] ? child_rip+0x0/0x20
13:37:35:
13:37:35:LustreError: 4510:0:(mdt_handler.c:4395:mdt_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed:
13:37:35:LustreError: 4510:0:(mdt_handler.c:4395:mdt_fini()) LBUG
13:37:35:Pid: 4510, comm: umount
13:37:35:
13:37:35:Call Trace:
13:37:35: [<ffffffffa06b6875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:37:35: [<ffffffffa06b6e77>] lbug_with_loc+0x47/0xb0 [libcfs]
13:37:35: [<ffffffffa17141ba>] mdt_device_fini+0x121a/0x12e0 [mdt]
13:37:35: [<ffffffffa0f85b1d>] ? class_disconnect_exports+0x17d/0x2f0 [obdclass]
13:37:35: [<ffffffffa0f9e302>] class_cleanup+0x572/0xd20 [obdclass]
13:37:35: [<ffffffffa0f81336>] ? class_name2dev+0x56/0xe0 [obdclass]
13:37:35: [<ffffffffa0fa0616>] class_process_config+0x1b66/0x24c0 [obdclass]
13:37:35: [<ffffffffa06c1cf1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
13:37:35: [<ffffffffa0fa142f>] class_manual_cleanup+0x4bf/0xc90 [obdclass]
13:37:35: [<ffffffffa0f81336>] ? class_name2dev+0x56/0xe0 [obdclass]
13:37:35: [<ffffffffa0fd29ec>] server_put_super+0x8bc/0xcd0 [obdclass]
13:37:35: [<ffffffff811946eb>] generic_shutdown_super+0x5b/0xe0
13:37:35: [<ffffffff811947d6>] kill_anon_super+0x16/0x60
13:37:35: [<ffffffffa0fa4616>] lustre_kill_super+0x36/0x60 [obdclass]
13:37:35: [<ffffffff81194f77>] deactivate_super+0x57/0x80
13:37:35: [<ffffffff811b4f5f>] mntput_no_expire+0xbf/0x110
13:37:35: [<ffffffff811b5aab>] sys_umount+0x7b/0x3a0
13:37:35: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
13:37:35:
13:37:35:Kernel panic - not syncing: LBUG
13:37:35:Pid: 4510, comm: umount Not tainted 2.6.32-573.18.1.el6_lustre.ge5f28dc.x86_64 #1
13:37:35:Call Trace:
13:37:35: [<ffffffff81539011>] ? panic+0xa7/0x16f
13:37:35: [<ffffffffa06b6ecb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
13:37:35: [<ffffffffa17141ba>] ? mdt_device_fini+0x121a/0x12e0 [mdt]
13:37:35: [<ffffffffa0f85b1d>] ? class_disconnect_exports+0x17d/0x2f0 [obdclass]
13:37:35: [<ffffffffa0f9e302>] ? class_cleanup+0x572/0xd20 [obdclass]
13:37:35: [<ffffffffa0f81336>] ? class_name2dev+0x56/0xe0 [obdclass]
13:37:35: [<ffffffffa0fa0616>] ? class_process_config+0x1b66/0x24c0 [obdclass]
13:37:35: [<ffffffffa06c1cf1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
13:37:35: [<ffffffffa0fa142f>] ? class_manual_cleanup+0x4bf/0xc90 [obdclass]
13:37:35: [<ffffffffa0f81336>] ? class_name2dev+0x56/0xe0 [obdclass]
13:37:35: [<ffffffffa0fd29ec>] ? server_put_super+0x8bc/0xcd0 [obdclass]
13:37:35: [<ffffffff811946eb>] ? generic_shutdown_super+0x5b/0xe0
13:37:35: [<ffffffff811947d6>] ? kill_anon_super+0x16/0x60
13:37:35: [<ffffffffa0fa4616>] ? lustre_kill_super+0x36/0x60 [obdclass]
13:37:35: [<ffffffff81194f77>] ? deactivate_super+0x57/0x80
13:37:35: [<ffffffff811b4f5f>] ? mntput_no_expire+0xbf/0x110
13:37:35: [<ffffffff811b5aab>] ? sys_umount+0x7b/0x3a0
13:37:35: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
13:37:35:Initializing cgroup subsys cpuset
13:37:35:Initializing cgroup subsys cpu
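
For triage, the failed assertions can be pulled out of a console log like the one above with a short script. This is only a rough sketch: the function name `find_assertions` and the regular expressions are hypothetical, written against the line format visible in this log (`HH:MM:SS:LustreError: pid:0:(file:line:func()) ASSERTION( expr ) failed: detail`), and are not part of Lustre or its test framework. Other log formats would likely need different patterns.

```python
import re

# Matches console lines in the shape seen in the log above, e.g.
# "13:37:35:LustreError: 18333:0:(lu_object.c:1224:lu_device_fini()) <message>"
# (assumption: timestamp prefix and "LustreError:" tag are always present)
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):LustreError: "
    r"(?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)

# Matches the message part of a failed assertion; the detail text may be empty.
ASSERT_RE = re.compile(r"^ASSERTION\( (?P<expr>.*) \) failed:\s*(?P<detail>.*)$")


def find_assertions(lines):
    """Return one dict per failed-assertion line found in `lines`."""
    hits = []
    for raw in lines:
        line_match = LINE_RE.match(raw.strip())
        if line_match is None:
            continue  # not a LustreError line in the expected shape
        assert_match = ASSERT_RE.match(line_match.group("msg"))
        if assert_match is None:
            continue  # LustreError line, but not an assertion (e.g. "LBUG")
        hits.append(
            {
                "time": line_match.group("time"),
                "pid": int(line_match.group("pid")),
                "file": line_match.group("file"),
                "line": int(line_match.group("line")),
                "func": line_match.group("func"),
                "expr": assert_match.group("expr"),
                "detail": assert_match.group("detail"),
            }
        )
    return hits


if __name__ == "__main__":
    # Two lines copied from the console log above.
    sample = [
        "13:37:35:LustreError: 18333:0:(lu_object.c:1224:lu_device_fini()) "
        "ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1",
        "13:37:35:LustreError: 4510:0:(mdt_handler.c:4395:mdt_fini()) "
        "ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed:",
    ]
    for hit in find_assertions(sample):
        print(hit["pid"], hit["func"], hit["expr"], hit["detail"])
```

Run against this log, such a script would surface the two assertion sites reported here (`lu_device_fini()` in the `obd_zombid` thread and `mdt_fini()` in `umount`), both on the same `ld_ref` check.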
|
| Comment by nasf (Inactive) [ 17/Mar/16 ] |
|
Another failure instance on master: |