Details
Bug
Resolution: Unresolved
Major
None
Lustre 2.13.0
None
3
Description
I see this semi-frequently in master even after LU-11697, so this must be something else.
This typically happens only in racer, and the full crash looks like this:
[ 8628.366285] Lustre: DEBUG MARKER: == racer test 1: racer on clients: centos-70.localnet DURATION=2700 ================================== 05:27:21 (1549708041)
[ 8629.054425] Lustre: lfs: using old ioctl(LL_IOC_LOV_GETSTRIPE) on [0x200000402:0x4:0x0], use llapi_layout_get_by_path()
[ 8630.549219] Lustre: DEBUG MARKER: racer test_1: @@@@@@ FAIL: generate lss conf (mds1)
[ 8634.303466] LustreError: 14083:0:(mdt_lvb.c:430:mdt_lvbo_fill()) lustre-MDT0000: small buffer size 472 for EA 496 (max_mdsize 496): rc = -34
[ 8779.449264] BUG: unable to handle kernel paging request at ffff8800aa2dc000
[ 8779.449670] IP: [<ffffffff813ee500>] do_csum+0x70/0x180
[ 8779.449670] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 8779.449670] CPU: 9 PID: 15375 Comm: ll_ost_io04_000 3.10.0-7.6-debug #1
[ 8779.449670] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 8779.509742] Call Trace:
[ 8779.509742] [<ffffffff813ee61e>] ip_compute_csum+0xe/0x30
[ 8779.509742] [<ffffffffa035e62e>] obd_dif_ip_fn+0xe/0x10 [obdclass]
[ 8779.523520] [<ffffffffa035e6f9>] obd_page_dif_generate_buffer+0xc9/0x190 [obdclass]
[ 8779.523520] [<ffffffffa05e18db>] tgt_checksum_niobuf_rw+0x28b/0xea0 [ptlrpc]
[ 8779.541604] [<ffffffffa05e7e8d>] tgt_brw_read+0xc2d/0x1e60 [ptlrpc]
[ 8779.541604] [<ffffffffa05e62a5>] tgt_request_handle+0x915/0x1610 [ptlrpc]
[ 8779.541604] [<ffffffffa058b3d9>] ptlrpc_server_handle_request+0x259/0xad0 [ptlrpc]
[ 8779.541604] [<ffffffffa058f3bc>] ptlrpc_main+0xb7c/0x22c0 [ptlrpc]
[ 8779.541604] [<ffffffff810b4ed4>] kthread+0xe4/0xf0
[ 8779.541604] [<ffffffff817c4c77>] ret_from_fork_nospec_begin+0x21/0x21
Note that I still saw this even before the T10-DIF support landed, just with a slightly different trace.
In all cases, it seems that only tgt_brw_read hits this.