[LU-10239] Lustre crash (client): The first extent to be fit in a RPC contains 17 chunks, which is over the limit 16. Created: 14/Nov/17  Updated: 26/Feb/20  Resolved: 04/Jan/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Critical
Reporter: Murshid Azman Assignee: Zhenyu Xu
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Related
is related to LU-8135 sanity test_101g fails with 'not all ... Resolved
is related to LU-11449 osc: send RPCs with random i/o Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Our clients have max_rpcs_in_flight set to 16. Some of our clients hit this bug at times, which crashes these nodes:

 

2017-10-27T20:22:40-05:00 node0748 kernel: [14998.118253] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) extent ffff882e6c434000@{[60858 -> 64953/64953], [1|0|+|lockdone|wSu|ffff882dc5cce4d0], [0|4096|+|-| (null)|4096| (null)]} The first extent to be fit in a RPC contains 17 chunks, which is over the limit 16.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.154188] LustreError: 1708:0:(osc_cache.c:1226:osc_extent_tree_dump0()) Dump object ffff882dc5cce4d0 extents at try_to_add_extent_for_io:1931, mppr: 4096.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.181833] LustreError: 1708:0:(osc_cache.c:1239:osc_extent_tree_dump0()) extent ffff882e6c434000@{[60858 -> 64953/64953], [1|0|+|lockdone|wSu|ffff882dc5cce4d0], [0|4096|+|-| (null)|4096| (null)]}urgent 1.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.216119] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) ASSERTION( data->erd_page_count != 0 || chunk_count <= data->erd_max_chunks ) failed: 
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.244827] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) LBUG
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.259546] Pid: 1708, comm: ptlrpcd_00_86
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.270365] 
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.270365] Call Trace:
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.287055] [<ffffffffa07537f3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.301366] [<ffffffffa0753861>] lbug_with_loc+0x41/0xb0 [libcfs]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.314695] [<ffffffffa0c69498>] try_to_add_extent_for_io.isra.24+0xf58/0x12e0 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.328249] [<ffffffffa0c6b9dd>] osc_io_unplug0+0x3fd/0x1950 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.339730] [<ffffffff810d2372>] ? load_balance+0x192/0x990
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.350013] [<ffffffff810ce46c>] ? dequeue_entity+0x11c/0x5d0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.360444] [<ffffffffa0c6db30>] osc_io_unplug+0x10/0x20 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.370874] [<ffffffffa0c49441>] brw_queue_work+0x31/0xd0 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.381510] [<ffffffffa0a5e3d7>] work_interpreter+0x37/0xf0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.392538] [<ffffffffa0a5b0b5>] ptlrpc_check_set.part.23+0x425/0x1dd0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.405132] [<ffffffffa0a5cabb>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.416411] [<ffffffffa0a88a3b>] ptlrpcd_check+0x4db/0x5d0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.427367] [<ffffffffa0a88d57>] ptlrpcd+0x227/0x560 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.437727] [<ffffffff810c4fd0>] ? default_wake_function+0x0/0x20
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.448479] [<ffffffffa0a88b30>] ? ptlrpcd+0x0/0x560 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.458669] [<ffffffff810b064f>] kthread+0xcf/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.467800] [<ffffffff810b0580>] ? kthread+0x0/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.477111] [<ffffffff81696818>] ret_from_fork+0x58/0x90
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.486513] [<ffffffff810b0580>] ? kthread+0x0/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.495492] 
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.500760] Kernel panic - not syncing: LBUG
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.509021] CPU: 203 PID: 1708 Comm: ptlrpcd_00_86 Tainted: P OE ------------ 3.10.0-514.6.1.el7.x86_64 #1
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.524591] Hardware name: Penguin Computing Relion 1904GT/S7200AP, BIOS S72C610.86B.01.02.0001.112820162103 11/28/2016
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.540144] ffffffffa0770ccc 00000000e8cd63f9 ffff882ed87a7958 ffffffff816862ac
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.551919] ffff882ed87a79d8 ffffffff8167f6b3 ffffffff00000008 ffff882ed87a79e8
 
2017-10-27T20:22:40-05:00 node0748 kernel: [14998.118253] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) extent ffff882e6c434000@{[60858 -> 64953/64953], [1|0|+|lockdone|wSu|ffff882dc5cce4d0], [0|4096|+|-| (null)|4096| (null)]} The first extent to be fit in a RPC contains 17 chunks, which is over the limit 16.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.154188] LustreError: 1708:0:(osc_cache.c:1226:osc_extent_tree_dump0()) Dump object ffff882dc5cce4d0 extents at try_to_add_extent_for_io:1931, mppr: 4096.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.181833] LustreError: 1708:0:(osc_cache.c:1239:osc_extent_tree_dump0()) extent ffff882e6c434000@{[60858 -> 64953/64953], [1|0|+|lockdone|wSu|ffff882dc5cce4d0], [0|4096|+|-| (null)|4096| (null)]}urgent 1.
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.216119] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) ASSERTION( data->erd_page_count != 0 || chunk_count <= data->erd_max_chunks ) failed: 
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.244827] LustreError: 1708:0:(osc_cache.c:1931:try_to_add_extent_for_io()) LBUG
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.259546] Pid: 1708, comm: ptlrpcd_00_86
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.270365] 
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.270365] Call Trace:
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.287055] [<ffffffffa07537f3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.301366] [<ffffffffa0753861>] lbug_with_loc+0x41/0xb0 [libcfs]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.314695] [<ffffffffa0c69498>] try_to_add_extent_for_io.isra.24+0xf58/0x12e0 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.328249] [<ffffffffa0c6b9dd>] osc_io_unplug0+0x3fd/0x1950 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.339730] [<ffffffff810d2372>] ? load_balance+0x192/0x990
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.350013] [<ffffffff810ce46c>] ? dequeue_entity+0x11c/0x5d0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.360444] [<ffffffffa0c6db30>] osc_io_unplug+0x10/0x20 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.370874] [<ffffffffa0c49441>] brw_queue_work+0x31/0xd0 [osc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.381510] [<ffffffffa0a5e3d7>] work_interpreter+0x37/0xf0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.392538] [<ffffffffa0a5b0b5>] ptlrpc_check_set.part.23+0x425/0x1dd0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.405132] [<ffffffffa0a5cabb>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.416411] [<ffffffffa0a88a3b>] ptlrpcd_check+0x4db/0x5d0 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.427367] [<ffffffffa0a88d57>] ptlrpcd+0x227/0x560 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.437727] [<ffffffff810c4fd0>] ? default_wake_function+0x0/0x20
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.448479] [<ffffffffa0a88b30>] ? ptlrpcd+0x0/0x560 [ptlrpc]
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.458669] [<ffffffff810b064f>] kthread+0xcf/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.467800] [<ffffffff810b0580>] ? kthread+0x0/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.477111] [<ffffffff81696818>] ret_from_fork+0x58/0x90
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.486513] [<ffffffff810b0580>] ? kthread+0x0/0xe0
 2017-10-27T20:22:40-05:00 node0748 kernel: [14998.495492] 
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.500760] Kernel panic - not syncing: LBUG
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.509021] CPU: 203 PID: 1708 Comm: ptlrpcd_00_86 Tainted: P OE ------------ 3.10.0-514.6.1.el7.x86_64 #1
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.524591] Hardware name: Penguin Computing Relion 1904GT/S7200AP, BIOS S72C610.86B.01.02.0001.112820162103 11/28/2016
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.540144] ffffffffa0770ccc 00000000e8cd63f9 ffff882ed87a7958 ffffffff816862ac
 2017-10-27T20:22:41-05:00 node0748 kernel: [14998.551919] ffff882ed87a79d8 ffffffff8167f6b3 ffffffff00000008 ffff882ed87a79e8
 

 

Seems to be related to LU-8680



 Comments   
Comment by Phil Schwan (Inactive) [ 19/Nov/17 ]

We just ran a job that blew up 1,600 nodes with this issue, so we're highly motivated to solve or work around it.

My theory – which is admittedly not very deep – goes something like this:

  • osc_io_submit() builds up a list of pending pages. It honours max_pages_per_rpc, but I don't see it making any effort to limit the number of chunks? (our max_pages_per_rpc on most nodes = 4096)
  • osc_queue_sync_pages() walks the list of pending items, building up an extent.
  • that extent gets added to the osc_object->oo_reading_exts
  • osc_send_read_rpc() iterates over the oo_reading_exts extents
  • each extent gets passed into try_to_add_extent_for_io()
  • try_to_add_extent_for_io() sees that it has 17 chunks, greater than the 16 limit, and blows up

Am I wrong? What prevents osc_io_submit() from building up a 4096-page I/O that contains more than 16 chunks?

This is a 2.9.52 client, but I checked master and it didn't look like things were materially different in this area.

I could see that cutting max_pages_per_rpc to 16 might work around it, but that would be an extremely harsh workaround, resulting in a lot of extremely tiny I/Os...

Does anyone have a clue to spare, perhaps, to point us in the right direction?

Comment by James Nunez (Inactive) [ 20/Dec/17 ]

Bobijam -

Would you please look into this issue?

Thank you

Comment by Gerrit Updater [ 21/Dec/17 ]

Bobi Jam (bobijam@hotmail.com) uploaded a new patch: https://review.whamcloud.com/30627
Subject: LU-10239 osc: limit chunk size of write submit
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cae710b2f9d2e257add0a572ebc999fd407a07aa

Comment by Gerrit Updater [ 04/Jan/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/30627/
Subject: LU-10239 osc: limit chunk number of write submit
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 93ef6e7863b4e3aadfb3171f7f4958c5040446ec

Comment by Peter Jones [ 04/Jan/19 ]

Landed for 2.13

Comment by Jacek Tomaka [ 08/Feb/20 ]

Hey, Any chances to backport this fix to 2.12.4?

Comment by Jacek Tomaka [ 26/Feb/20 ]

Hmm?

Generated at Sat Feb 10 02:33:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.