[LU-2598] obdfilter-survey LBUG ASSERTION( iobuf->dr_npages < iobuf->dr_max_pages ) failed
Created: 09/Jan/13  Updated: 15/Oct/13  Resolved: 15/Oct/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.3
Fix Version/s: Lustre 2.4.0, Lustre 2.1.5, Lustre 2.1.6, Lustre 2.5.0

Type: Bug
Priority: Major
Reporter: Malcolm Cowe (Inactive)
Assignee: Jian Yu
Resolution: Fixed
Votes: 0
Labels: LB
Environment:

VirtualBox 4.2.6 VM
CentOS 6.3 x86_64
Kernel 2.6.32-279.2.1.el6_lustre.gc46c389.x86_64
Lustre Version: 2.1.3


Issue Links:
Related
is related to LU-1431 Support for larger than 1MB sequentia... Resolved
Severity: 3
Rank (Obsolete): 6053

 Description   

On execution of obdfilter-survey with rszlo="2048" and rszhi="2048" (non-default params) against a single OST, an LBUG is generated:

LustreError: 3202:0:(filter_io_26.c:297:filter_iobuf_add_page()) ASSERTION( iobuf->dr_npages < iobuf->dr_max_pages ) failed:
LustreError: 3202:0:(filter_io_26.c:297:filter_iobuf_add_page()) LBUG
LustreError: 3203:0:(filter_io_26.c:297:filter_iobuf_add_page()) ASSERTION( iobuf->dr_npages < iobuf->dr_max_pages ) failed:
LustreError: 3203:0:(filter_io_26.c:297:filter_iobuf_add_page()) LBUG
Kernel panic - not syncing: LBUG
Pid: 3202, comm: lctl Not tainted 2.6.32-279.2.1.el6_lustre.gc46c389.x86_64 #1
Call Trace:
[<ffffffff814fd57a>] ? panic+0xa0/0x168
[<ffffffffa0508e5b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa0bef472>] ? filter_iobuf_add_page+0x62/0x70 [obdfilter]
[<ffffffffa0bf2762>] ? filter_commitrw_write+0xa62/0x2e78 [obdfilter]
[<ffffffff8112b690>] ? __lru_cache_add+0x40/0x90
[<ffffffffa0be6fa7>] ? filter_preprw_write+0xc87/0x1cd0 [obdfilter]
[<ffffffff81162710>] ? cache_alloc_refill+0x1c0/0x240
[<ffffffffa0be6252>] ? filter_commitrw+0x272/0x290 [obdfilter]
[<ffffffffa0be8d88>] ? filter_preprw+0x68/0x80 [obdfilter]
[<ffffffffa0be919e>] ? filter_brw+0x3fe/0x740 [obdfilter]
[<ffffffffa0a80f8d>] ? echo_client_kbrw+0xd6d/0x1a50 [obdecho]
[<ffffffffa0517894>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
[<ffffffffa051b0c2>] ? cfs_hash_del+0xa2/0x1d0 [libcfs]
[<ffffffffa0a827d7>] ? echo_client_brw_ioctl+0x1f7/0x1380 [obdecho]
[<ffffffff81212e39>] ? security_capable+0x29/0x30
[<ffffffffa0a86f38>] ? echo_client_iocontrol+0x638/0x1d00 [obdecho]
[<ffffffffa050819b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
[<ffffffffa0509993>] ? cfs_alloc+0x63/0x90 [libcfs]
[<ffffffffa05c4cfa>] ? obd_ioctl_getdata+0x13a/0x1160 [obdclass]
[<ffffffffa05d83cf>] ? class_handle_ioctl+0x12ff/0x1ed0 [obdclass]
[<ffffffff8113ff34>] ? handle_mm_fault+0x1e4/0x2b0
[<ffffffffa05c42ab>] ? obd_class_ioctl+0x4b/0x190 [obdclass]
[<ffffffff8118dff2>] ? vfs_ioctl+0x22/0xa0
[<ffffffff81500a85>] ? page_fault+0x25/0x30
[<ffffffff8118e194>] ? do_vfs_ioctl+0x84/0x580
[<ffffffff8118e711>] ? sys_ioctl+0x81/0xa0
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

The LBUG is consistent and reproducible on my test VM cluster using this command-line for obdfilter-survey:

size="512" rszlo="2048" rszhi="2048" nobjlo="2" thrlo="2" nobjhi="32" thrhi="32" case="disk" rslt_loc="/root/obdres" obdfilter-survey



 Comments   
Comment by Andreas Dilger [ 11/Jan/13 ]

Yes, the maximum IO size is 1MB, but the code shouldn't crash if some larger IO size is specified. The code should return an error in this case, or handle the larger IO by submitting multiple IO requests. This may also be fixed by the 4MB RPC patch in LU-1431.
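The second option mentioned above, splitting an oversized IO into several requests, can be sketched outside the kernel as plain arithmetic. This is only an illustration: `split_io_kb` is a hypothetical helper, not part of Lustre or obdfilter-survey, and the 1024 KB ceiling is taken from the 1MB limit stated in this comment.

```shell
# Hypothetical helper (not Lustre code): split one request of total_kb
# into chunks that never exceed max_kb, and print the chunk sizes.
split_io_kb() {
    local total_kb=$1 max_kb=$2 chunk chunks=""
    # Refuse a non-positive limit instead of looping forever.
    [ "$max_kb" -gt 0 ] || return 1
    while [ "$total_kb" -gt 0 ]; do
        if [ "$total_kb" -lt "$max_kb" ]; then
            chunk=$total_kb
        else
            chunk=$max_kb
        fi
        chunks="${chunks:+$chunks }$chunk"
        total_kb=$(( total_kb - chunk ))
    done
    echo "$chunks"
}

# The failing case from this ticket: a 2048 KB record against a 1024 KB limit.
split_io_kb 2048 1024   # prints "1024 1024"
```

With this shape, a 2048 KB record becomes two requests that each fit the limit, rather than one request that overruns the page array.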

Comment by Andreas Dilger [ 15/Jan/13 ]

The http://review.whamcloud.com/4993 patch looks like it will resolve this problem in osd-ldiskfs/osd-io.c:

-                     bio = bio_alloc(GFP_NOIO, max(BIO_MAX_PAGES,
+                     bio = bio_alloc(GFP_NOIO, min(BIO_MAX_PAGES,

Comment by Jian Yu [ 21/Jan/13 ]

If the patches for LU-1431 are not going to be landed on the Lustre b2_1 branch, then we need to cherry-pick the http://review.whamcloud.com/1741 patch for LU-844 to b2_1, so that obdfilter-survey prints the following message when run with an IO size larger than 1MB:

Test disk case support maximum 1024KB IO data (rszhi=xxxx is too big) please use a smaller value.
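A script-side guard of this kind might look roughly like the sketch below. It is a reconstruction for illustration only, not the actual change in http://review.whamcloud.com/1741: the function name `check_rszhi` and the variable `MAX_RSZ_KB` are hypothetical, and only the 1024 KB limit and the message wording come from the output quoted above.

```shell
# Hypothetical guard (not the real patch): reject record sizes above the
# limit before any IO is issued. Limit and message follow the quoted output.
MAX_RSZ_KB=1024

check_rszhi() {
    local rszhi=$1
    if [ "$rszhi" -gt "$MAX_RSZ_KB" ]; then
        echo "Test disk case support maximum ${MAX_RSZ_KB}KB IO data (rszhi=$rszhi is too big) please use a smaller value."
        return 1
    fi
    return 0
}
```

A caller would run something like `check_rszhi "$rszhi" || exit 1` before starting the survey.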

Here is the test result on the current master branch with rszhi=2048:

== obdfilter-survey test 1a: Object Storage Targets survey == 00:16:37 (1358756197)
+ NETTYPE=tcp rszlo=2048 rszhi=2048 nobjlo=2 thrlo=2 nobjhi=1 thrhi=4 size=512 case=disk rslt_loc=/tmp targets="10.10.4.209:lustre-OST0000 10.10.4.209:lustre-OST0001 10.10.4.209:lustre-OST0002 10.10.4.209:lustre-OST0003 10.10.4.209:lustre-OST0004 10.10.4.209:lustre-OST0005 10.10.4.209:lustre-OST0006" /usr/bin/obdfilter-survey
Test disk case support maximum 1024KB IO data (rszhi=2048 is too big) please use a smaller value.
Resetting fail_loc on all nodes...done.
PASS 1a (1s)

Maloo report: https://maloo.whamcloud.com/test_sets/2d9fdba8-63a3-11e2-824c-52540035b04c

Comment by Andreas Dilger [ 21/Jan/13 ]

The http://review.whamcloud.com/1741 patch is, I now see, at best a workaround for the real problem. The check should be done in the kernel instead of the script, since kernels may be configured differently. Regardless of what the script does, it should not be possible to cause a kernel oops from userspace.

Comment by Jian Yu [ 05/Feb/13 ]

On the latest master branch, after commenting out the "rszhi" check from obdfilter-survey, running the test with rszhi="2048" hit the following assertion failure:

LustreError: 16216:0:(ofd_internal.h:524:ofd_info_init()) ASSERTION( info->fti_exp == ((void *)0) ) failed:

Comment by Andreas Dilger [ 20/Feb/13 ]

Yu Jian, can you please retest manually, now that http://review.whamcloud.com/1741 has landed. Even better would be to write a small sanity test that runs a very short test manually with a huge blocksize (e.g. 32MB) to check that this no longer LASSERTs.

Comment by Jian Yu [ 22/Feb/13 ]

Yu Jian, can you please retest manually, now that http://review.whamcloud.com/1741 has landed.

With http://review.whamcloud.com/1741 on the master branch, running obdfilter-survey with rszhi="2048" always prints:

Test disk case support maximum 1024KB IO data (rszhi=2048 is too big) please use a smaller value.

Now that the 4MB RPC patch http://review.whamcloud.com/4993 has landed on the master branch, I commented out the change from http://review.whamcloud.com/1741 and ran the obdfilter-survey test with rszhi="2048". It passed without any assertion failures.

Lustre master build: http://build.whamcloud.com/job/lustre-master/1269/

+ NETTYPE=tcp rszlo=2048 rszhi=2048 nobjlo=2 thrlo=2 nobjhi=32 thrhi=32 size=512 case=disk rslt_loc=/tmp targets="10.10.4.209:lustre-OST0000 10.10.4.209:lustre-OST0001" /usr/bin/obdfilter-survey
Fri Feb 22 04:52:38 PST 2013 Obdfilter-survey for case=disk from client-12vm1
ost  2 sz  1048576K rsz 2048K obj    4 thr    4 write   38.53 [  14.00,  24.00] rewrite   44.82 [  19.99,  32.00] read 6816.48             SHORT 
ost  2 sz  1048576K rsz 2048K obj    4 thr    8 write   42.19 [  12.00,  31.99] rewrite   50.90 [  18.00,  31.99] read 6419.65             SHORT 
ost  2 sz  1048576K rsz 2048K obj    4 thr   16 write   44.38 [   6.00,  33.99] rewrite   54.78 [  15.99,  33.99] read 6656.54             SHORT 
ost  2 sz  1048576K rsz 2048K obj    4 thr   32 write   39.14 [   0.00,  40.00] rewrite   51.76 [   6.00,  41.99] read 6373.85             SHORT 
ost  2 sz  1048576K rsz 2048K obj    4 thr   64 write   44.55 [   0.00,  47.99] rewrite   55.59 [   6.00,  51.99] read 5994.27             SHORT 
ost  2 sz  1048576K rsz 2048K obj    8 thr    8 write   35.73 [   8.00,  27.99] rewrite   46.46 [  17.99,  27.99] read 6606.66             SHORT 
ost  2 sz  1048576K rsz 2048K obj    8 thr   16 write   39.02 [   8.00,  31.99] rewrite   50.27 [  10.00,  37.99] read 6695.94             SHORT 
ost  2 sz  1048576K rsz 2048K obj    8 thr   32 write   43.88 [   0.00,  39.99] rewrite   50.43 [   0.00,  39.99] read 6350.12             SHORT 
ost  2 sz  1048576K rsz 2048K obj    8 thr   64 write   46.22 [   0.00,  63.98] rewrite   55.43 [   0.00,  59.98] read 6055.04             SHORT 
ost  2 sz  1048576K rsz 2048K obj   16 thr   16 write   35.33 [   0.00,  31.99] rewrite   44.84 [   6.00,  32.00] read 6597.55             SHORT 
ost  2 sz  1048576K rsz 2048K obj   16 thr   32 write   39.48 [   4.00,  35.99] rewrite   44.83 [   0.00,  35.99] read 6348.42             SHORT 
ost  2 sz  1048576K rsz 2048K obj   16 thr   64 write   43.35 [   0.00,  61.98] rewrite   52.14 [   0.00,  49.99] read 6024.13             SHORT 
ost  2 sz  1048576K rsz 2048K obj   32 thr   32 write   36.63 [   0.00,  37.99] rewrite   47.39 [   8.00,  39.99] read 6045.66             SHORT 
ost  2 sz  1048576K rsz 2048K obj   32 thr   64 write   40.83 [   0.00,  49.98] rewrite   50.22 [   4.00,  51.99] read 5918.83             SHORT 
ost  2 sz  1048576K rsz 2048K obj   64 thr   64 write   39.22 [   0.00,  45.99] rewrite   49.17 [   0.00,  51.99] read 6107.23             SHORT 
done!

Maloo report: https://maloo.whamcloud.com/test_sets/24a44ca0-7cf3-11e2-a108-52540035b04c

I'll change the limit of 1024 to 4096 in obdfilter-survey.

Even better would be to write a small sanity test that runs a very short test manually with a huge blocksize (e.g. 32MB) to check that this no longer LASSERTs.

OK, will do.

Comment by Andreas Dilger [ 22/Feb/13 ]

I'll change the limit of 1024 to 4096 in obdfilter-survey.

Note that this would cause an LBUG if the new obdfilter-survey script is run against an old OST... The limit should be conditional upon the remote Lustre version being used.
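One way to make the limit conditional is sketched below. It assumes, for illustration only, that servers at 2.4.0 or later accept 4096 KB records and that everything older must stay at 1024 KB; the ticket does not state the exact cutoff. `version_code` and `max_rsz_kb` are hypothetical helpers written for this example, and how the remote version string is obtained is left out.

```shell
# Hypothetical helper: turn "major.minor.patch" into a comparable integer.
# Expects plain numeric components, e.g. 2.1.3.
version_code() {
    local IFS=.
    set -- $1
    echo $(( ${1:-0} * 10000 + ${2:-0} * 100 + ${3:-0} ))
}

# Hypothetical helper: choose the record size limit (in KB) for a server.
# ASSUMPTION: 2.4.0 is the first version that accepts 4096 KB records.
max_rsz_kb() {
    if [ "$(version_code "$1")" -ge "$(version_code 2.4.0)" ]; then
        echo 4096
    else
        echo 1024
    fi
}

max_rsz_kb 2.1.3   # prints 1024
max_rsz_kb 2.4.0   # prints 4096
```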

Comment by Jian Yu [ 14/Mar/13 ]

Patch for Lustre b2_1 branch is in http://review.whamcloud.com/5715.

Comment by Jian Yu [ 20/Mar/13 ]

Patch for Lustre master branch is in http://review.whamcloud.com/5783.

Comment by Jay Lan (Inactive) [ 29/Mar/13 ]

Hmm, "Note that this would cause LBUG if new obdfilter-survey script is run against an old OST... It should be conditional upon the remote Lustre version being used."

Thanks! I kicked off a regression test yesterday and found my OSS crashed! Ah, a 2.1.4 server, pretty close, right?

Just want to be sure... This problem only affects sanity test-180c, right? It will not cause a production server to crash, will it?

Comment by Oleg Drokin [ 29/Mar/13 ]

Yes, this will only happen if you run obdfilter-survey, which you don't really do in production.

Comment by Jian Yu [ 31/Mar/13 ]

Just want to be sure... This problem only affects sanity test-180c, right? It will not cause a production server to crash, will it?

The sanity test 180c on the Lustre b2_1 branch needs to be improved to interoperate with servers (version < 2.1.5, and 2.2.0 <= version < 2.4.0) that do not have the patch fixing the assertion failure.
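The two version ranges above translate directly into a skip condition. The sketch below is not the test-framework code: `version_code` and `server_lacks_fix` are hypothetical helpers written for this example, and obtaining the server version string is assumed to happen elsewhere.

```shell
# Hypothetical helper: turn "major.minor.patch" into a comparable integer.
# Expects plain numeric components, e.g. 2.1.4.
version_code() {
    local IFS=.
    set -- $1
    echo $(( ${1:-0} * 10000 + ${2:-0} * 100 + ${3:-0} ))
}

# Succeeds (status 0) when the server is in one of the unpatched ranges:
# version < 2.1.5, or 2.2.0 <= version < 2.4.0.
server_lacks_fix() {
    local v
    v=$(version_code "$1")
    [ "$v" -lt "$(version_code 2.1.5)" ] && return 0
    [ "$v" -ge "$(version_code 2.2.0)" ] && [ "$v" -lt "$(version_code 2.4.0)" ]
}

# Example: skip the oversized-IO test against a 2.1.4 server.
if server_lacks_fix 2.1.4; then
    echo "skip: server lacks the LU-2598 fix"
fi
```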

Comment by Jian Yu [ 01/Apr/13 ]

Patch for Lustre b2_1 branch to resolve the interop issues: http://review.whamcloud.com/5902

Comment by Andreas Dilger [ 23/May/13 ]

Patch for b2_4 at http://review.whamcloud.com/6394.

Comment by Jian Yu [ 27/May/13 ]

Patches were landed on Lustre b2_1, b2_4 and master branches.

Comment by Jodi Levi (Inactive) [ 15/Oct/13 ]

Added 2.5.0 FixVersion
