[LU-14283] sanity-sec test 52 crashes due to ‘divide error: 0000 [#1] SMP PTI’ Created: 30/Dec/20 Updated: 13/Apr/21 Resolved: 25/Jan/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | Wang Shilong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | rhel8.3 | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
sanity-sec test_52 crashes due to ‘divide error: 0000 1 SMP PTI’. We’ve seen this test crash five times for review-dne-ssk or review-dne-selinux-ssk patch testing. Two recent crashes happened while testing RHEL8.3 clients and servers: Looking at the client console log, we see [37329.207145] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock,test_dummy_encryption,skpath=/tmp/test-framework-keys trevis-8vm4@tcp:/lustre /mnt/lustre [37331.900380] Lustre: 1600717:0:(keysetup.c:472:llcrypt_get_encryption_info()) inode 144115188193296385: setting policy filenames_encryption_mode to null [37331.901861] Lustre: 1600717:0:(keysetup.c:472:llcrypt_get_encryption_info()) Skipped 5 previous similar messages [37334.491809] divide error: 0000 [#1] SMP PTI [37334.492358] CPU: 1 PID: 1550060 Comm: ptlrpcd_00_00 Kdump: loaded Tainted: G OE --------- - - 4.18.0-240.1.1.el8_3.x86_64 #1 [37334.493661] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [37334.494315] RIP: 0010:osc_announce_cached.isra.24+0xd4/0x3b0 [osc] [37334.494970] Code: af d0 48 8b 43 50 48 39 ca 48 0f 42 d1 48 89 d1 48 c1 e1 0c f6 80 c5 02 00 00 10 74 1a 8b b3 f0 00 00 00 48 8d 44 32 ff 31 d2 <48> f7 f6 0f af 83 ec 00 00 00 48 01 c1 48 81 f9 ff ff ff 6f b8 ff [37334.496814] RSP: 0018:ffffa4c700833a78 EFLAGS: 00010246 [37334.497361] RAX: 0000000000023fff RBX: ffff958f013085f0 RCX: 0000000024000000 [37334.498104] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff958f013086e4 [37334.498838] RBP: ffff958f1ba8c150 R08: 0000000000033c62 R09: ffff958f1ba8c248 [37334.499564] R10: ffff958efd282880 R11: 0000000000002000 R12: ffff958f013086e4 [37334.500297] R13: 0000000000000010 R14: 0000000000000002 R15: ffff958f013085f0 [37334.501029] FS: 0000000000000000(0000) GS:ffff958f3fd00000(0000) knlGS:0000000000000000 [37334.501848] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [37334.502442] CR2: 00007fc4a2b46000 CR3: 000000005a20a002 CR4: 00000000000606e0 [37334.503184] Call Trace: [37334.503498] osc_brw_prep_request+0xf2e/0x15b0 [osc] [37334.504040] osc_build_rpc+0xb82/0x10b0 [osc] [37334.504514] osc_check_rpcs+0x1703/0x1880 [osc] [37334.505029] ? __switch_to_asm+0x41/0x70 [37334.505454] ? __switch_to_asm+0x35/0x70 [37334.505879] ? __switch_to_asm+0x35/0x70 [37334.506299] ? __switch_to_asm+0x41/0x70 [37334.506728] osc_io_unplug0+0xc0/0x110 [osc] [37334.507194] brw_queue_work+0x2e/0xc0 [osc] [37334.507801] work_interpreter+0x32/0x110 [ptlrpc] [37334.508337] ptlrpc_check_set+0x51a/0x21b0 [ptlrpc] [37334.508883] ? finish_task_switch+0x77/0x2a0 [37334.509368] ptlrpcd_check+0x3d5/0x5b0 [ptlrpc] [37334.509889] ptlrpcd+0x458/0x4c0 [ptlrpc] [37334.510332] ? finish_wait+0x80/0x80 [37334.510751] ? ptlrpcd_check+0x5b0/0x5b0 [ptlrpc] [37334.511262] kthread+0x112/0x130 [37334.511622] ? kthread_flush_work_fn+0x10/0x10 [37334.512106] ret_from_fork+0x35/0x40 |
| Comments |
| Comment by Peter Jones [ 04/Jan/21 ] |
|
Sébastien Could you please advise? Thanks Peter |
| Comment by Gerrit Updater [ 06/Jan/21 ] |
|
Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41144 |
| Comment by John Hammond [ 06/Jan/21 ] |
|
sebastien the logs from the debug patch contain a lot a concerning messages. Are these understood? [ 699.370905] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity-sec test 52: Mirrored encrypted file ======================================================= 15:06:50 \(1609945610\) [ 699.848730] Lustre: DEBUG MARKER: == sanity-sec test 52: Mirrored encrypted file ======================================================= 15:06:50 (1609945610) [ 699.960099] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts [ 699.972026] Lustre: DEBUG MARKER: lsof -t /mnt/lustre [ 700.133566] Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1 [ 700.171015] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre [ 700.180844] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock,test_dummy_encryption trevis-25vm4@tcp:/lustre /mnt/lustre [ 700.605880] BUG: Bad rss-counter state mm:00000000b84532d3 idx:1 val:2 [ 700.607197] BUG: Bad rss-counter state mm:00000000b84532d3 idx:3 val:-2 [ 700.614727] BUG: Bad page state in process lfs pfn:3ca01 [ 700.615774] page:ffffd87ac0f28040 refcount:0 mapcount:-1 mapping:0000000000000000 index:0x0 [ 700.617194] flags: 0xfffffc0000000() [ 700.617850] raw: 000fffffc0000000 0000000000000000 ffffffffc0f20903 0000000000000000 [ 700.622001] raw: 0000000000000000 0000000000000000 00000000fffffffe 0000000000000000 [ 700.623329] page dumped because: nonzero mapcount [ 700.624155] Modules linked in: lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser ib_ipoib libiscsi scsi_transport_iscsi rdma_ucm ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_uverbs ib_core sunrpc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i2c_piix4 virtio_balloon joydev pcspkr ip_tables ext4 mbcache jbd2 mlx4_en ata_generic ata_piix 8139too mlx4_core libata crc32c_intel serio_raw 8139cp virtio_blk mii [ 700.633949] CPU: 1 PID: 45813 Comm: lfs Kdump: loaded Tainted: G OE --------- - - 4.18.0-240.1.1.el8_3.x86_64 #1 [ 700.635821] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 700.636800] Call Trace: [ 700.637382] dump_stack+0x5c/0x80 [ 700.638015] bad_page.cold.115+0xba/0xbf [ 700.638726] __free_pages_ok+0x2c4/0x2d0 [ 700.639436] release_pages+0x305/0x430 [ 700.640117] ? cpumask_any_but+0x20/0x40 [ 700.640819] tlb_flush_mmu_free+0x3d/0x60 [ 700.641539] arch_tlb_finish_mmu+0x89/0x130 [ 700.642277] tlb_finish_mmu+0x1f/0x30 [ 700.642931] unmap_region+0xdd/0x110 [ 700.643585] ? __vma_rb_erase+0x127/0x250 [ 700.644294] do_munmap+0x256/0x440 [ 700.644913] vm_munmap+0x5f/0xa0 [ 700.645499] __x64_sys_munmap+0x22/0x30 [ 700.646208] do_syscall_64+0x5b/0x1a0 [ 700.646887] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 700.647778] RIP: 0033:0x7f4035528a0b [ 700.648418] Code: ff ff 0f 1f 44 00 00 48 8b 15 79 54 2c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff e9 6f ff ff ff f3 0f 1e fa b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 54 2c 00 f7 d8 64 89 01 48 [ 700.651441] RSP: 002b:00007ffc550ab5f8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b [ 700.652712] RAX: ffffffffffffffda RBX: 00007ffc550ab7e0 RCX: 00007f4035528a0b [ 700.653907] RDX: 0000000000000000 RSI: 0000000000402000 RDI: 00007f4034bfe000 [ 700.655109] RBP: 0000000000000002 R08: 0000000000401010 R09: 00007ffc550ab4c0 [ 700.656330] R10: 000000000000000a R11: 0000000000000202 R12: 0000000000000002 [ 700.657530] R13: 00007ffc550ab6d0 R14: 00007f4034bff000 R15: 00007ffc550ab7e0 [ 700.658764] Disabling lock debugging due to kernel taint https://testing.whamcloud.com/test_logs/36ab23fa-5569-491e-855a-66c41621cae8/show_text |
| Comment by Wang Shilong (Inactive) [ 07/Jan/21 ] |
|
It looks "‘divide error" is not easily to reproduce, but bad memory problem is easily reproduced and should be addressed in seperate ticket. |
| Comment by Sebastien Buisson [ 07/Jan/21 ] |
|
Apparently, the message BUG: Bad rss-counter state was printed for every single run of sanity-sec test_52. However, pages with nonzero mapcount were dumped "only" 4 times (out of 200 test runs). I have created |
| Comment by John Hammond [ 07/Jan/21 ] |
|
> It looks "‘divide error" is not easily to reproduce, but bad memory problem is easily reproduced and should be addressed in seperate ticket. Likely it requires more than just running test_52. |
| Comment by James Nunez (Inactive) [ 07/Jan/21 ] |
|
Wang Shilong - |
| Comment by Gerrit Updater [ 14/Jan/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41225 |
| Comment by Andreas Dilger [ 14/Jan/21 ] |
|
The above patch is not really a fix for the root problem of (what appears to be) a case of the obd_connect_data being zeroed out during reconnect (or similar). It is only working around the crash to avoid taking out the client. Probably this patch needs to be reverted in order to reproduce the root cause. |
| Comment by John Hammond [ 14/Jan/21 ] |
|
sebastien could you look again? See https://testing.whamcloud.com/test_sets/13b5e7cf-b908-4799-9d40-9ba4b21dbf00 and the client logs at https://testing.whamcloud.com/test_logs/d8e65e1b-ad1c-4b27-9318-07ebeab40ea2/show_text I see several messages of the form [13483.414663] LustreError: 809674:0:(gss_cli_upcall.c:270:gss_do_ctx_init_rpc()) obd lustre-OST0007-osc-ffff99b9f9160800 not setup The obd device found does not have the obd_set_up bit set. In obd_import_event() if that bit is not set then the event is ignored: static inline void obd_import_event(struct obd_device *obd, struct obd_import *imp, enum obd_import_event event) { ENTRY; if (!obd) { CERROR("NULL device\n"); EXIT; return; } if (obd->obd_set_up && OBP(obd, import_event)) OBP(obd, import_event)(obd, imp, event); EXIT; } In osc_import_event() we have:
case IMP_EVENT_OCD: {
struct obd_connect_data *ocd = &imp->imp_connect_data;
if (ocd->ocd_connect_flags & OBD_CONNECT_GRANT)
osc_init_grant(&obd->u.cli, ocd);
/* See bug 7198 */
if (ocd->ocd_connect_flags & OBD_CONNECT_REQPORTAL)
imp->imp_client->cli_request_portal =OST_REQUEST_PORTAL;
rc = obd_notify_observer(obd, obd, OBD_NOTIFY_OCD);
break;
}
Note: osc_init_grant() is responsible for initializing the cl_max_extent_pages field to a non zero value (according to the OCD data):
/* determine maximum extent size, in #pages */
size = (u64)ocd->ocd_grant_max_blks << ocd->ocd_grant_blkbits;
cli->cl_max_extent_pages = size >> PAGE_SHIFT;
cl_max_extent_pages being zero is giving us the divide error in osc_announce_cached():
/* take extent tax into account when asking for more
* grant space */
nrextents = (nrpages + cli->cl_max_extent_pages - 1) /
cli->cl_max_extent_pages;
Could you determine why the OSC obd_device is not getting setup here? |
| Comment by Sebastien Buisson [ 14/Jan/21 ] |
|
Thanks for this analysis, this is helpful. The messages like: [13483.414663] LustreError: 809674:0:(gss_cli_upcall.c:270:gss_do_ctx_init_rpc()) obd lustre-OST0007-osc-ffff99b9f9160800 not setup occur because the client tries to initialize its GSS context before the obd device is actually set up. This GSS initialization operation is triggered from userspace by lgss_keyring upon key request, so it can happen completely asynchronously. In any case, I would not tend to consider this is harmful. The other question to ask is how can we end up in osc_announce_cached() if the OSC obd_device is not setup. |
| Comment by John Hammond [ 14/Jan/21 ] |
|
> The other question to ask is how can we end up in osc_announce_cached() if the OSC obd_device is not setup. I do not think this is possible. |
| Comment by Wang Shilong (Inactive) [ 14/Jan/21 ] |
|
I did not find why we could hit this issue. Notice once @cl_max_extent_pages is inited ever, even reconnect won't clear it to be zero unless i miss something here, and there is a window so big possibility is it was zero initially... |
| Comment by John Hammond [ 15/Jan/21 ] |
|
It seems very unlikely, but if if osd_connect_flags includes GRANT_PARAM but not GRANT then we could get here. |
| Comment by Gerrit Updater [ 16/Jan/21 ] |
|
Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41256 |
| Comment by Gerrit Updater [ 22/Jan/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41256/ |
| Comment by Gerrit Updater [ 25/Jan/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41225/ |
| Comment by Peter Jones [ 25/Jan/21 ] |
|
Landed for 2.14 |