[LU-8005] LBUG on osc_req_attr_set Created: 11/Apr/16  Updated: 19/Jul/21  Resolved: 03/May/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: Jinshan Xiong (Inactive) Assignee: Jinshan Xiong (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

After the new readahead code was introduced, a client could hit an LBUG with the following backtrace.

38023:0:(osc_io.c:976:osc_req_attr_set()) LBUG
<4>[762654.201077] Pid: 38023, comm: RegSep_mpi
<4>[762654.201078]
<4>[762654.201078] Call Trace:
<4>[762654.201082] [<ffffffff81004b95>] dump_trace+0x75/0x300
<4>[762654.201089] [<ffffffffa077682a>] libcfs_debug_dumpstack+0x4a/0x70 [libcfs]
<4>[762654.201098] [<ffffffffa0776d5e>] lbug_with_loc+0x3e/0xb0 [libcfs]
<4>[762654.201107] [<ffffffffa0b73f5e>] osc_req_attr_set+0x4de/0x5d0 [osc]
<4>[762654.201132] [<ffffffffa0858dd0>] cl_req_attr_set+0xd0/0x240 [obdclass]
<4>[762654.201157] [<ffffffffa0b62865>] osc_build_rpc+0x505/0x1310 [osc]
<4>[762654.201169] [<ffffffffa0b80139>] osc_send_read_rpc+0x7a9/0x9b0 [osc]
<4>[762654.201185] [<ffffffffa0b84b34>] osc_check_rpcs+0x544/0x720 [osc]
<4>[762654.201202] [<ffffffffa0b850a1>] osc_io_unplug0+0xf1/0x4c0 [osc]
<4>[762654.201218] [<ffffffffa0b8793c>] osc_queue_sync_pages+0x1fc/0x390 [osc]
<4>[762654.201234] [<ffffffffa0b7473b>] osc_io_submit+0x2fb/0x500 [osc]
<4>[762654.201258] [<ffffffffa0857d26>] cl_io_submit_rw+0x66/0x170 [obdclass]
<4>[762654.201286] [<ffffffffa0c17f9c>] lov_io_submit+0x21c/0x4d0 [lov]
<4>[762654.201311] [<ffffffffa0857d26>] cl_io_submit_rw+0x66/0x170 [obdclass]
<4>[762654.201342] [<ffffffffa0cbf278>] ll_io_read_page+0x218/0x270 [lustre]
<4>[762654.201361] [<ffffffffa0cbf47f>] ll_readpage+0x1af/0x3e0 [lustre]
<4>[762654.201370] [<ffffffff810fc18e>] do_generic_file_read+0x13e/0x490
<4>[762654.201374] [<ffffffff810fcf0c>] generic_file_aio_read+0xfc/0x260
<3>[762654.199866] LustreError: 38023:0:(osc_io.c:966:osc_req_attr_set()) lov-page@ffff880cd41c6ab8, raid0
<3>[762654.199872] LustreError: 38023:0:(osc_io.c:966:osc_req_attr_set()) osc-page@ffff880cd41c6b20 8960: 1< 0x845fed 1 0 + + > 2< 36700160 0 4096 0x7 0x8 | (null) ffff880fefbdc5e0 ffff880f683d82f0 > 3< + ffff880d80524440 1 0 0 > 4< 1 0 8 34795520 - | - - - + > 5< - - - + | 0 - | 0 - ->
<3>[762654.199875] LustreError: 38023:0:(osc_io.c:966:osc_req_attr_set()) end page@ffff880cd41c6a00
<3>[762654.199877] LustreError: 38023:0:(osc_io.c:966:osc_req_attr_set()) uncovered page!

This is because osc_io_read_ahead() refers to a DLM lock that belonged to a previous instance of the osc_object and no longer has an osc_object attached. A fix is coming soon.
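
The explanation above can be illustrated with a toy model. This is not Lustre code: the classes DlmLock and OscObject, the ast_data field, and the functions readahead_match, page_is_covered and scenario are all hypothetical names invented for this sketch, and the real coverage check in the osc layer differs in detail. It only assumes that a page counts as covered when a lock spans its index and is attached to the current object, and that the fix attaches the matched lock to the object doing readahead.

```python
class DlmLock:
    """Toy extent lock; ast_data stands in for the object attached to it."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.ast_data = None

    def covers(self, index):
        return self.start <= index <= self.end


class OscObject:
    """Toy stand-in for one in-memory instance of an object."""

    def __init__(self, name):
        self.name = name


def readahead_match(locks, index, obj, set_data):
    """Find a lock spanning index; optionally attach obj to a detached lock."""
    for lock in locks:
        if lock.covers(index):
            if set_data and lock.ast_data is None:
                lock.ast_data = obj  # assumed behaviour of the fix
            return lock
    return None


def page_is_covered(locks, index, obj):
    """Assumed check: the lock must span the page AND be attached to obj."""
    return any(l.covers(index) and l.ast_data is obj for l in locks)


def scenario(set_data):
    lock = DlmLock(0, 255)
    lock.ast_data = OscObject("previous instance")
    # The previous instance goes away; the lock stays cached but detached.
    lock.ast_data = None
    current = OscObject("current instance")
    # Readahead finds the cached lock by extent for the new instance.
    readahead_match([lock], 8, current, set_data)
    # Later, building the read RPC checks that the page is covered.
    return page_is_covered([lock], 8, current)


if __name__ == "__main__":
    print("without setting lock data:", scenario(False))
    print("with lock data set:", scenario(True))
```

In this model the first case leaves the page uncovered, which corresponds to the "uncovered page!" message in the log, while the second case passes the check.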



 Comments   
Comment by Gerrit Updater [ 11/Apr/16 ]

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/19453
Subject: LU-8005 osc: set lock data for readahead lock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5b00c27825912ff0de3bd96ddd40b3b3d15c6ed8

Comment by Gerrit Updater [ 02/May/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19453/
Subject: LU-8005 osc: set lock data for readahead lock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 2b0479c0c959e44a4a3e850d36979fdbdf370d3a

Comment by Joseph Gmitter (Inactive) [ 03/May/16 ]

Landed to master for 2.9.0

Comment by Sarah Liu [ 08/Jun/16 ]

Talked with Jinshan; here is the pattern to reproduce/verify this issue:

[6/8/16, 2:44:51 PM] Jinshan Xiong: Sure, the reproduce pattern is to do I/O, clear page cache, do I/O over and over again
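
A minimal sketch of that pattern, assuming a Linux client where the page cache is cleared by writing to /proc/sys/vm/drop_caches as root. The function names, the iteration count, and the choice of the value "3" (which also drops dentries and inodes) are assumptions for illustration and do not come from the ticket; the file to read is whatever file on the Lustre mount you choose.

```python
import os
import sys

DROP_CACHES = "/proc/sys/vm/drop_caches"  # standard Linux control file


def read_all(path, chunk=1 << 20):
    """Read the whole file sequentially and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(buf)
    return total


def drop_page_cache(ctl=DROP_CACHES, value="3"):
    """Flush dirty data, then ask the kernel to drop clean caches (needs root)."""
    os.sync()
    with open(ctl, "w") as f:
        f.write(value + "\n")


def reproduce(path, iterations=100, ctl=DROP_CACHES):
    """Do I/O, clear the page cache, and repeat; returns bytes read per pass."""
    sizes = []
    for _ in range(iterations):
        sizes.append(read_all(path))
        drop_page_cache(ctl)
    return sizes


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 repro.py /path/on/lustre/file
    reproduce(sys.argv[1])
```

On an affected client the loop is expected to eventually trigger the LBUG; on a client with the fix it should simply run to completion.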

Generated at Sat Feb 10 02:13:48 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.