[LU-15500] Crash in crypto_unregister_alg on client modules unload Created: 29/Jan/22  Updated: 18/Apr/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This has been happening for a while, and hopefully somebody can look at it. It is seen on master and also at customer sites.

[60734.385258] Lustre: server umount lustre-OST0003 complete
[60736.102740] device-mapper: core: cleaned up
[60740.876637] Key type lgssc unregistered
[60741.388491] LNet: Removed LNI 192.168.123.27@tcp
[60742.924666] ------------[ cut here ]------------
[60742.925427] kernel BUG at crypto/algapi.c:405!
[60742.925427] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[60742.925427] Modules linked in: libcfs(OE-) loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) jbd2 mbcache crc32_generic crc_t10dif crct10dif_generic crct10dif_common virtio_balloon virtio_console i2c_piix4 pcspkr ip_tables rpcsec_gss_krb5 ata_generic pata_acpi drm_kms_helper ttm drm ata_piix drm_panel_orientation_quirks floppy libata virtio_blk serio_raw i2c_core [last unloaded: lnet]
[60742.925427] CPU: 10 PID: 12048 Comm: rmmod Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-7.9-debug #1
[60742.925427] Hardware name: Red Hat KVM, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
[60742.925427] task: ffff8802ef7d9280 ti: ffff8802d8558000 task.ti: ffff8802d8558000
[60742.925427] RIP: 0010:[<ffffffff8139fafa>]  [<ffffffff8139fafa>] crypto_unregister_alg+0xaa/0xb0
[60742.925427] RSP: 0018:ffff8802d855be50  EFLAGS: 00010202
[60742.925427] RAX: 0000000000000003 RBX: ffff8802d855be50 RCX: ffff8802d855bde0
[60742.925427] RDX: ffffffff00000001 RSI: ffff8802d855be50 RDI: ffffffff81cf3580
[60742.925427] RBP: ffff8802d855be80 R08: ffffffffa022b360 R09: 0000000000000000
[60742.925427] R10: 0000000000000429 R11: ffff8802ea37b1b8 R12: ffffffffa022b350
[60742.925427] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[60742.925427] FS:  00007fb84f4f9740(0000) GS:ffff880331c80000(0000) knlGS:0000000000000000
[60742.925427] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[60742.925427] CR2: 00007f03c8550288 CR3: 0000000295904000 CR4: 00000000000007e0
[60742.925427] Call Trace:
[60742.925427]  [<ffffffff813a64b2>] crypto_unregister_shash+0x12/0x20
[60742.925427]  [<ffffffffa0219935>] cfs_crypto_adler32_unregister+0x15/0x20 [libcfs]
[60742.925427]  [<ffffffffa0219179>] cfs_crypto_unregister+0x29/0x40 [libcfs]
[60742.925427]  [<ffffffffa021e171>] libcfs_exit+0xa5/0x142 [libcfs]
[60742.925427]  [<ffffffff8110f39b>] SyS_delete_module+0x19b/0x320
[60742.925427]  [<ffffffff817edf49>] ? system_call_after_swapgs+0x96/0x13a
[60742.925427]  [<ffffffff817edf55>] ? system_call_after_swapgs+0xa2/0x13a
[60742.925427]  [<ffffffff817edf49>] ? system_call_after_swapgs+0x96/0x13a
[60742.925427]  [<ffffffff817edf55>] ? system_call_after_swapgs+0xa2/0x13a
[60742.925427]  [<ffffffff817edf49>] ? system_call_after_swapgs+0x96/0x13a
[60742.925427]  [<ffffffff817ee00c>] system_call_fastpath+0x1f/0x24
[60742.925427]  [<ffffffff817edf55>] ? system_call_after_swapgs+0xa2/0x13a
[60742.925427] Code: 31 c0 48 8b 55 e0 65 48 33 14 25 28 00 00 00 75 15 48 83 c4 18 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 44 89 e8 eb dc e8 b6 d7 ce ff <0f> 0b 0f 1f 40 00 66 66 66 66 90 85 f6 7e 46 8d 46 ff 55 48 8d 
[60742.925427] RIP  [<ffffffff8139fafa>] crypto_unregister_alg+0xaa/0xb0
[60742.925427]  RSP <ffff8802d855be50>


 Comments   
Comment by Alex Zhuravlev [ 15/Apr/22 ]

I'm hitting this one quite often.

Comment by Alex Zhuravlev [ 18/Apr/23 ]

This is due to the early exit in osc_checksum_bulk_t10pi() and tgt_checksum_niobuf_t10pi(), where the error path jumps to out: and skips cfs_crypto_hash_final():

        if (rc)
                GOTO(out, rc);

        if (used_number != 0)
                cfs_crypto_hash_update_page(req, __page, 0,
                        used_number * sizeof(*guard_start));

        bufsize = sizeof(cksum);
        cfs_crypto_hash_final(req, (unsigned char *)&cksum, &bufsize);
....
out:

Comment by Gerrit Updater [ 18/Apr/23 ]

"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50666
Subject: LU-15500 osc: calculate checksum unconditionally
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: df5e745a55c709e6ed4954273d268a740f9a9406

Generated at Sat Feb 10 03:18:50 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.