Details
- Bug
- Resolution: Unresolved
- Minor
- None
- None
- None
- 3
- 9223372036854775807
Description
This issue was created by maloo for Arshad <arshad.hussain@aeoncomputing.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/aec6b1c3-5f88-425a-8bf5-50415f59f012
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/111944 - 4.18.0-553.44.1.el8_10.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/111944 - 4.18.0-553.44.1.el8_lustre.x86_64
Crashes executing sanity test 900 during umount:
[17722.241646] LustreError: MGC10.240.28.46@tcp: Connection to MGS (at 10.240.28.46@tcp) was lost; in progress operations using this service will fail
[17727.828813] Lustre: lustre-MDT0001: Not available for connect from 10.240.28.46@tcp (stopping)
[17729.469790] Lustre: lustre-MDT0001: Not available for connect from 10.240.24.216@tcp (stopping)
[17729.471698] Lustre: Skipped 7 previous similar messages
:
[17771.667290] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [umount:573821]
[17772.371426] CPU: 1 PID: 573821 Comm: umount 4.18.0-553.44.1.el8_lustre.x86_64 #1
[17772.522202] RIP: 0010:cfs_hash_for_each_relax+0x17b/0x480 [obdclass]
[17773.230656] Call Trace:
[17774.042684] ? cleanup_resource+0x350/0x350 [ptlrpc]
[17774.044248] ? cfs_hash_for_each_relax+0x17b/0x480 [obdclass]
[17774.045444] ? cfs_hash_for_each_relax+0x172/0x480 [obdclass]
[17774.046634] ? cleanup_resource+0x350/0x350 [ptlrpc]
[17774.047719] ? cleanup_resource+0x350/0x350 [ptlrpc]
[17774.048817] cfs_hash_for_each_nolock+0x124/0x200 [obdclass]
[17774.049984] ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
[17774.262073] __ldlm_namespace_free+0x52/0x4e0 [ptlrpc]
[17774.263285] ldlm_namespace_free_prior+0x5e/0x200 [ptlrpc]
[17774.264623] mdt_device_fini+0x480/0xf80 [mdt]
[17775.511739] obd_precleanup+0xf4/0x220 [obdclass]
[17775.514029] class_cleanup+0x322/0x900 [obdclass]
[17775.515047] class_process_config+0x3bb/0x20a0 [obdclass]
[17775.517336] class_manual_cleanup+0x45b/0x780 [obdclass]
[17775.518435] server_put_super+0xd62/0x11f0 [ptlrpc]
[17775.578275] generic_shutdown_super+0x6c/0x110
[17775.579220] kill_anon_super+0x14/0x30
[17775.580050] deactivate_locked_super+0x34/0x70
[17775.581003] cleanup_mnt+0x3b/0x70
[17775.581767] task_work_run+0x8a/0xb0
[17775.582579] exit_to_usermode_loop+0xef/0x100
[17775.583529] do_syscall_64+0x195/0x1a0
[17775.584330] entry_SYSCALL_64_after_hwframe+0x66/0xcb
Attachments
Issue Links
Activity
Link | New: This issue is related to EX-8050 [ EX-8050 ] |
Assignee | Original: WC Triage [ wc-triage ] | New: Arshad Hussain [ arshad512 ] |
Link | New: This issue is related to EX-1913 [ EX-1913 ] |