[LU-11620] BUG: sleeping function called from invalid context at mm/slub.c:940 Created: 05/Nov/18 Updated: 19/Mar/19 Resolved: 04/Jan/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.5 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.10.7, Lustre 2.12.1 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl | ||
| Environment: | Lustre 2.10.5_2.chaos |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
OSS console log reports

BUG: sleeping function called from invalid context at mm/slub.c:940
in_atomic(): 1, irqs_disabled(): 0, pid: 152563, name: lfsck
CPU: 9 PID: 152563 Comm: lfsck Kdump: loaded Tainted: P OE ------------ T 3.10.0-862.14.4.1chaos.ch6.x86_64 #1
Hardware name: CRAY CRAY-GB512X-CN/S2600JF, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
Call Trace:
[<ffffffff9f334f01>] dump_stack+0x19/0x1b
[<ffffffff9eccf439>] __might_sleep+0xd9/0x100
[<ffffffff9ee06363>] kmem_cache_alloc+0x43/0x240
[<ffffffffc16918e1>] ? ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc16918e1>] ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc1176464>] lu_object_alloc+0x54/0x320 [obdclass]
[<ffffffffc1173cb3>] ? htable_lookup+0x163/0x180 [obdclass]
[<ffffffffc1176910>] lu_object_find_at+0x180/0x2b0 [obdclass]
[<ffffffffc1177a98>] dt_locate_at+0x18/0xb0 [obdclass]
[<ffffffffc15fadb2>] lfsck_layout_slave_prep+0x392/0x5b0 [lfsck]
[<ffffffffc15d1fe6>] lfsck_master_engine+0x196/0x1450 [lfsck]
[<ffffffffc15d1e50>] ? lfsck_master_oit_engine+0x11a0/0x11a0 [lfsck]
[<ffffffff9ecc12d1>] kthread+0xd1/0xe0
[<ffffffff9ecc1200>] ? insert_kthread_work+0x40/0x40
[<ffffffff9f347837>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffff9ecc1200>] ? insert_kthread_work+0x40/0x40

and about a day later, a different stack

BUG: sleeping function called from invalid context at mm/slub.c:940
in_atomic(): 1, irqs_disabled(): 0, pid: 154333, name: ll_ost_out01_00
CPU: 8 PID: 154333 Comm: ll_ost_out01_00 Kdump: loaded Tainted: P W OE ------------ T 3.10.0-862.14.4.1chaos.ch6.x86_64 #1
Hardware name: CRAY CRAY-GB512X-CN/S2600JF, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
Call Trace:
[<ffffffff9f334f01>] dump_stack+0x19/0x1b
[<ffffffff9eccf439>] __might_sleep+0xd9/0x100
[<ffffffff9ee06363>] kmem_cache_alloc+0x43/0x240
[<ffffffffc16918e1>] ? ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc16918e1>] ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc1176464>] lu_object_alloc+0x54/0x320 [obdclass]
[<ffffffffc1173cb3>] ? htable_lookup+0x163/0x180 [obdclass]
[<ffffffffc1176910>] lu_object_find_at+0x180/0x2b0 [obdclass]
[<ffffffffc1176e7f>] lu_object_find_slice+0x1f/0x90 [obdclass]
[<ffffffffc16074ce>] lfsck_orphan_it_next+0x17e/0xc90 [lfsck]
[<ffffffffc160804e>] lfsck_orphan_it_load+0x6e/0x160 [lfsck]
[<ffffffffc1178d28>] dt_index_walk+0xf8/0x450 [obdclass]
[<ffffffffc1179080>] ? dt_index_walk+0x450/0x450 [obdclass]
[<ffffffffc117993c>] dt_index_read+0x44c/0x6b0 [obdclass]
[<ffffffffc13b47e2>] tgt_obd_idx_read+0x612/0x860 [ptlrpc]
[<ffffffffc13b653a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
[<ffffffffc135db5b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
[<ffffffffc135b26b>] ? ptlrpc_wait_event+0xab/0x350 [ptlrpc]
[<ffffffff9ecd6492>] ? default_wake_function+0x12/0x20
[<ffffffff9eccb87b>] ? __wake_up_common+0x5b/0x90
[<ffffffffc1361c70>] ptlrpc_main+0xae0/0x1e90 [ptlrpc]
[<ffffffffc1361190>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
[<ffffffff9ecc12d1>] kthread+0xd1/0xe0
[<ffffffff9ecc1200>] ? insert_kthread_work+0x40/0x40
[<ffffffff9f347837>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffff9ecc1200>] ? insert_kthread_work+0x40/0x40 |
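Both reports show in_atomic(): 1, meaning the allocation was attempted in atomic context, and both reach kmem_cache_alloc() through the same innermost frames. As a triage aid only, the following minimal sketch extracts symbol names from traces in this format and lists the frames two traces share, innermost first. The helper names (reliable_frames, shared_innermost) and the abbreviated trace constants are hypothetical; nothing here is part of Lustre or of the eventual fix.

```python
import re

# One frame looks like "[<ffffffff9ee06363>] kmem_cache_alloc+0x43/0x240"
# or, for an unreliable entry, "[<...>] ? ofd_object_alloc+0x51/0x240 [ofd]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace):
    """Return symbol names innermost-first, skipping '?' (unreliable) frames."""
    return [m.group(2) for m in FRAME_RE.finditer(trace) if m.group(1) is None]


def shared_innermost(a, b):
    """Return the leading frames that two innermost-first lists have in common."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


# Abbreviated copies of the two traces above (first frames only).
TRACE_LFSCK = """
[<ffffffff9f334f01>] dump_stack+0x19/0x1b
[<ffffffff9eccf439>] __might_sleep+0xd9/0x100
[<ffffffff9ee06363>] kmem_cache_alloc+0x43/0x240
[<ffffffffc16918e1>] ? ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc16918e1>] ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc1176464>] lu_object_alloc+0x54/0x320 [obdclass]
[<ffffffffc1173cb3>] ? htable_lookup+0x163/0x180 [obdclass]
[<ffffffffc1176910>] lu_object_find_at+0x180/0x2b0 [obdclass]
[<ffffffffc1177a98>] dt_locate_at+0x18/0xb0 [obdclass]
[<ffffffffc15fadb2>] lfsck_layout_slave_prep+0x392/0x5b0 [lfsck]
"""

TRACE_OST_OUT = """
[<ffffffff9f334f01>] dump_stack+0x19/0x1b
[<ffffffff9eccf439>] __might_sleep+0xd9/0x100
[<ffffffff9ee06363>] kmem_cache_alloc+0x43/0x240
[<ffffffffc16918e1>] ? ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc16918e1>] ofd_object_alloc+0x51/0x240 [ofd]
[<ffffffffc1176464>] lu_object_alloc+0x54/0x320 [obdclass]
[<ffffffffc1173cb3>] ? htable_lookup+0x163/0x180 [obdclass]
[<ffffffffc1176910>] lu_object_find_at+0x180/0x2b0 [obdclass]
[<ffffffffc1176e7f>] lu_object_find_slice+0x1f/0x90 [obdclass]
[<ffffffffc16074ce>] lfsck_orphan_it_next+0x17e/0xc90 [lfsck]
"""

if __name__ == "__main__":
    common = shared_innermost(
        reliable_frames(TRACE_LFSCK), reliable_frames(TRACE_OST_OUT)
    )
    print("\n".join(common))
```

The two stacks diverge only after lu_object_find_at(), so the callers on each side of that point (dt_locate_at() in one, lu_object_find_slice() in the other) are where the atomic context is most likely entered.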
| Comments |
| Comment by Peter Jones [ 05/Nov/18 ] |
|
? |
| Comment by Olaf Faaland [ 05/Nov/18 ] |
|
Peter Jones writes: > ? Wrong window had focus. Argh! |
| Comment by Olaf Faaland [ 05/Nov/18 ] |
|
See https://github.com/LLNL/lustre for the patch stack. |
| Comment by Olaf Faaland [ 05/Nov/18 ] |
|
The second stack looks very similar to https://jira.whamcloud.com/browse/LU-11302 |
| Comment by Olaf Faaland [ 05/Nov/18 ] |
|
Several other stacks, but they all start the same way: lfsck_layout_slave_prep() |
| Comment by Peter Jones [ 06/Nov/18 ] |
|
Lai, can you please investigate? Thanks, Peter |
| Comment by Lai Siyao [ 07/Nov/18 ] |
|
Yes, I think it's a duplicate of |
| Comment by Gerrit Updater [ 07/Nov/18 ] |
|
Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33603 |
| Comment by Gerrit Updater [ 04/Jan/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33603/ |
| Comment by Peter Jones [ 04/Jan/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 07/Jan/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33979 |
| Comment by Gerrit Updater [ 15/Feb/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33979/ |
| Comment by Gerrit Updater [ 25/Feb/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34303 |
| Comment by Gerrit Updater [ 19/Mar/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34303/ |