Details
Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.1.0
Labels: None
Environment:
Lustre Branch: master
Lustre Build: http://build.whamcloud.com/job/lustre-reviews/2177/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/54/
Distro/Arch: RHEL6/x86_64(in-kernel OFED, kernel version: 2.6.32-131.6.1.el6.x86_64)
ENABLE_QUOTA=yes
FAILURE_MODE=HARD
FLAVOR=OSS
MGS/MDS Node: fat-amd-1-ib
OSS Nodes:
  fat-amd-3-ib(active) / fat-amd-4-ib(active) -- failover pair serving:
    OST1 (active on fat-amd-3-ib)
    OST2 (active on fat-amd-4-ib)
    OST3 (active on fat-amd-3-ib)
    OST4 (active on fat-amd-4-ib)
    OST5 (active on fat-amd-3-ib)
    OST6 (active on fat-amd-4-ib)
  fat-amd-2-ib (OST7)
Client Nodes: client-[1,2,4,5,12,13,15],fat-intel-4
Network Addresses:
fat-amd-1-ib: 192.168.4.132
fat-amd-2-ib: 192.168.4.133
fat-amd-3-ib: 192.168.4.134
fat-amd-4-ib: 192.168.4.135
client-1-ib: 192.168.4.1
client-2-ib: 192.168.4.2
client-4-ib: 192.168.4.4
client-5-ib: 192.168.4.5
client-12-ib: 192.168.4.12
client-13-ib: 192.168.4.13
client-15-ib: 192.168.4.15
fat-intel-4-ib: 192.168.4.131
Severity: 3
Bugzilla ID: 4902
Description
While running recovery-mds-scale with FLAVOR=OSS, the test failed as follows:
<~snip~>
==== Checking the clients loads AFTER failover -- failure NOT OK
Client load failed on node client-2-ib, rc=1
Client load failed during failover. Exiting
2011-09-13 22:12:19 Terminating clients loads ...
Duration: 7200
Server failover period: 600 seconds
Exited after: 4991 seconds
Number of failovers before exit:
mds1: 0 times
ost1: 1 times
ost2: 1 times
ost3: 0 times
ost4: 0 times
ost5: 1 times
ost6: 0 times
ost7: 1 times
Status: FAIL: rc=5
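For reference, the failing configuration can be sketched as a shell fragment. The variable names follow the standard Lustre test-framework conventions and the values are taken from this report's environment section and the log above; the commented script path is an assumption about this setup, not taken from the report.

```shell
# Sketch of the test configuration from this report (values from the
# Environment section and the log output above).
export ENABLE_QUOTA=yes
export FAILURE_MODE=HARD
export FLAVOR=OSS
export DURATION=7200              # total run time, per "Duration: 7200"
export SERVER_FAILOVER_PERIOD=600 # per "Server failover period: 600 seconds"
# On a configured cluster one would then run (path is an assumption):
#   sh lustre/tests/recovery-mds-scale.sh
echo "FLAVOR=$FLAVOR FAILURE_MODE=$FAILURE_MODE"
```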
Client node client-2-ib hit the following LBUG:
LustreError: 32289:0:(osc_io.c:336:osc_io_commit_write()) ASSERTION(to > 0) failed
LustreError: 32289:0:(osc_io.c:336:osc_io_commit_write()) LBUG
Pid: 32289, comm: dd

Call Trace:
[<ffffffffa03b9855>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa03b9e95>] lbug_with_loc+0x75/0xe0 [libcfs]
[<ffffffffa03c4d76>] libcfs_assertion_failed+0x66/0x70 [libcfs]
[<ffffffffa079dcc0>] osc_io_commit_write+0x1d0/0x1e0 [osc]
[<ffffffff8119b6cf>] ? __mark_inode_dirty+0x13f/0x160
[<ffffffffa04e241f>] cl_io_commit_write+0xaf/0x1f0 [obdclass]
[<ffffffffa07fe6dd>] ? lov_page_subio+0xad/0x260 [lov]
[<ffffffffa04d37d8>] ? cl_page_export+0x58/0x80 [obdclass]
[<ffffffffa07fe937>] lov_io_commit_write+0xa7/0x1d0 [lov]
[<ffffffffa04e241f>] cl_io_commit_write+0xaf/0x1f0 [obdclass]
[<ffffffffa04d26b9>] ? cl_env_get+0x29/0x350 [obdclass]
[<ffffffffa0871e0d>] ll_commit_write+0xdd/0x2b0 [lustre]
[<ffffffff814e4d3c>] ? bad_to_user+0x66/0x612
[<ffffffffa08894a0>] ll_write_end+0x30/0x60 [lustre]
[<ffffffff8110dc84>] generic_file_buffered_write+0x174/0x2a0
[<ffffffff8106dd17>] ? current_fs_time+0x27/0x30
[<ffffffff8110f570>] __generic_file_aio_write+0x250/0x480
[<ffffffff8110f80f>] generic_file_aio_write+0x6f/0xe0
[<ffffffffa089b001>] vvp_io_write_start+0xa1/0x270 [lustre]
[<ffffffffa04df318>] cl_io_start+0x68/0x170 [obdclass]
[<ffffffffa04e37d0>] cl_io_loop+0x110/0x1c0 [obdclass]
[<ffffffffa084291b>] ll_file_io_generic+0x44b/0x580 [lustre]
[<ffffffffa03c8424>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
[<ffffffffa04d26b9>] ? cl_env_get+0x29/0x350 [obdclass]
[<ffffffffa0842b8f>] ll_file_aio_write+0x13f/0x310 [lustre]
[<ffffffffa04d282e>] ? cl_env_get+0x19e/0x350 [obdclass]
[<ffffffffa04d0d8f>] ? cl_env_put+0x1af/0x2e0 [obdclass]
[<ffffffffa0849301>] ll_file_write+0x171/0x310 [lustre]
[<ffffffff8126e781>] ? __clear_user+0x21/0x70
[<ffffffff81172748>] vfs_write+0xb8/0x1a0
[<ffffffff810d1ad2>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81173181>] sys_write+0x51/0x90
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG
Pid: 32289, comm: dd Tainted: G ---------------- T 2.6.32-131.6.1.el6.x86_64 #1
Call Trace:
[<ffffffff814da518>] ? panic+0x78/0x143
[<ffffffffa03b9eeb>] ? lbug_with_loc+0xcb/0xe0 [libcfs]
[<ffffffffa03c4d76>] ? libcfs_assertion_failed+0x66/0x70 [libcfs]
[<ffffffffa079dcc0>] ? osc_io_commit_write+0x1d0/0x1e0 [osc]
[<ffffffff8119b6cf>] ? __mark_inode_dirty+0x13f/0x160
[<ffffffffa04e241f>] ? cl_io_commit_write+0xaf/0x1f0 [obdclass]
[<ffffffffa07fe6dd>] ? lov_page_subio+0xad/0x260 [lov]
[<ffffffffa04d37d8>] ? cl_page_export+0x58/0x80 [obdclass]
[<ffffffffa07fe937>] ? lov_io_commit_write+0xa7/0x1d0 [lov]
[<ffffffffa04e241f>] ? cl_io_commit_write+0xaf/0x1f0 [obdclass]
[<ffffffffa04d26b9>] ? cl_env_get+0x29/0x350 [obdclass]
[<ffffffffa0871e0d>] ? ll_commit_write+0xdd/0x2b0 [lustre]
[<ffffffff814e4d3c>] ? bad_to_user+0x66/0x612
[<ffffffffa08894a0>] ? ll_write_end+0x30/0x60 [lustre]
[<ffffffff8110dc84>] ? generic_file_buffered_write+0x174/0x2a0
[<ffffffff8106dd17>] ? current_fs_time+0x27/0x30
[<ffffffff8110f570>] ? __generic_file_aio_write+0x250/0x480
[<ffffffff8110f80f>] ? generic_file_aio_write+0x6f/0xe0
[<ffffffffa089b001>] ? vvp_io_write_start+0xa1/0x270 [lustre]
[<ffffffffa04df318>] ? cl_io_start+0x68/0x170 [obdclass]
[<ffffffffa04e37d0>] ? cl_io_loop+0x110/0x1c0 [obdclass]
[<ffffffffa084291b>] ? ll_file_io_generic+0x44b/0x580 [lustre]
[<ffffffffa03c8424>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
[<ffffffffa04d26b9>] ? cl_env_get+0x29/0x350 [obdclass]
[<ffffffffa0842b8f>] ? ll_file_aio_write+0x13f/0x310 [lustre]
[<ffffffffa04d282e>] ? cl_env_get+0x19e/0x350 [obdclass]
[<ffffffffa04d0d8f>] ? cl_env_put+0x1af/0x2e0 [obdclass]
[<ffffffffa0849301>] ? ll_file_write+0x171/0x310 [lustre]
[<ffffffff8126e781>] ? __clear_user+0x21/0x70
[<ffffffff81172748>] ? vfs_write+0xb8/0x1a0
[<ffffffff810d1ad2>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81173181>] ? sys_write+0x51/0x90
[<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b
Maloo report: https://maloo.whamcloud.com/test_sets/943d6cda-de9e-11e0-9909-52540025f9af
Please refer to the attached recovery-oss-scale.1315977145.log.tar.bz2 for more logs.