RCU stall caused by osc_quota_cleanup

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      sanity-quota on ldiskfs+DNE failed with a timeout (8009s); the client hit an RCU stall.
      https://testing.whamcloud.com/gerrit-janitor/34572/testresults/sanity-quota-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/

      Sep  6 06:54:52 oleg310-client kernel: Lustre: DEBUG MARKER: == sanity-quota test complete, duration 4313 sec ========= 06:54:51 (1693997691)
      Sep  6 06:54:52 oleg310-client sshd[20583]: Received disconnect from 192.168.203.10 port 42586:11: disconnected by user
      Sep  6 06:54:52 oleg310-client sshd[20583]: Disconnected from 192.168.203.10 port 42586
      Sep  6 06:54:52 oleg310-client sshd[20583]: pam_unix(sshd:session): session closed for user root
      Sep  6 06:54:52 oleg310-client systemd-logind: Removed session 600.
      Sep  6 06:55:07 oleg310-client kernel: ------------[ cut here ]------------
      Sep  6 06:55:07 oleg310-client kernel: WARNING: CPU: 2 PID: 20679 at lib/debugobjects.c:286 debug_print_object+0x83/0xa0
      Sep  6 06:55:07 oleg310-client kernel: ODEBUG: activate active (active state 1) object type: rcu_head hint:           (null)
      Sep  6 06:55:07 oleg310-client kernel: Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
      Sep  6 06:55:07 oleg310-client kernel: CPU: 2 PID: 20679 Comm: umount Kdump: loaded Tainted: G           OE  ------------   3.10.0-7.9-debug #1
      Sep  6 06:55:07 oleg310-client kernel: Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Sep  6 06:55:07 oleg310-client kernel: Call Trace:
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff817ded29>] dump_stack+0x19/0x1b
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8108d558>] __warn+0xd8/0x100
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8108d5df>] warn_slowpath_fmt+0x5f/0x80
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81414723>] debug_print_object+0x83/0xa0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff814150af>] debug_object_activate+0x1af/0x210
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8114dc8f>] __call_rcu+0x3f/0x2d0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8114df3d>] call_rcu_sched+0x1d/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0179f44>] xas_free_nodes+0xa4/0xf0 [libcfs]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa017b26f>] xa_destroy+0xdf/0xf0 [libcfs]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa088d4d5>] osc_quota_cleanup+0x15/0x20 [osc]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa086ef1f>] osc_cleanup_common+0xbf/0x1b0 [osc]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f48f9>] class_free_dev+0x219/0x730 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f4ff0>] class_export_put+0x1e0/0x2e0 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f6c15>] class_unlink_export+0x125/0x160 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa030ca30>] class_decref+0x80/0x160 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa030ce71>] class_detach+0x1c1/0x310 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0314b2b>] class_process_config+0x163b/0x27c0 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0315e90>] class_manual_cleanup+0x1e0/0x770 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0903955>] lov_tgts_putref+0x385/0xad0 [lov]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0908927>] lov_disconnect+0x237/0x280 [lov]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0e8fa96>] obd_disconnect+0x56/0x300 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0e98ebc>] ll_put_super+0x81c/0xf30 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff812476ca>] generic_shutdown_super+0x6a/0xf0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81247ac2>] kill_anon_super+0x12/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0ec711b>] lustre_kill_super+0x2b/0x30 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81247ec9>] deactivate_locked_super+0x49/0x60
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81248616>] deactivate_super+0x46/0x60
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81268b1f>] cleanup_mnt+0x3f/0x80
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81268bb2>] __cleanup_mnt+0x12/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff810b69b5>] task_work_run+0xb5/0xf0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8102ccb2>] do_notify_resume+0x92/0xb0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff817f4363>] int_signal+0x12/0x17
      Sep  6 06:55:07 oleg310-client kernel: ---[ end trace 37e266df8306097f ]---
      Sep  6 06:55:07 oleg310-client kernel: ------------[ cut here ]------------
      Sep  6 06:55:07 oleg310-client kernel: WARNING: CPU: 2 PID: 20679 at kernel/rcupdate.c:311 rcuhead_fixup_activate+0x5a/0x70
      Sep  6 06:55:07 oleg310-client kernel: Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
      Sep  6 06:55:07 oleg310-client kernel: CPU: 2 PID: 20679 Comm: umount Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-7.9-debug #1
      Sep  6 06:55:07 oleg310-client kernel: Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Sep  6 06:55:07 oleg310-client kernel: Call Trace:
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff817ded29>] dump_stack+0x19/0x1b
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8108d558>] __warn+0xd8/0x100
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8108d69d>] warn_slowpath_null+0x1d/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff810b6e8a>] rcuhead_fixup_activate+0x5a/0x70
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff814150cf>] debug_object_activate+0x1cf/0x210
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8114dc8f>] __call_rcu+0x3f/0x2d0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8114df3d>] call_rcu_sched+0x1d/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0179f44>] xas_free_nodes+0xa4/0xf0 [libcfs]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa017b26f>] xa_destroy+0xdf/0xf0 [libcfs]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa088d4d5>] osc_quota_cleanup+0x15/0x20 [osc]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa086ef1f>] osc_cleanup_common+0xbf/0x1b0 [osc]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f48f9>] class_free_dev+0x219/0x730 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f4ff0>] class_export_put+0x1e0/0x2e0 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa02f6c15>] class_unlink_export+0x125/0x160 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa030ca30>] class_decref+0x80/0x160 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa030ce71>] class_detach+0x1c1/0x310 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0314b2b>] class_process_config+0x163b/0x27c0 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0315e90>] class_manual_cleanup+0x1e0/0x770 [obdclass]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0903955>] lov_tgts_putref+0x385/0xad0 [lov]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0908927>] lov_disconnect+0x237/0x280 [lov]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0e8fa96>] obd_disconnect+0x56/0x300 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0e98ebc>] ll_put_super+0x81c/0xf30 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff812476ca>] generic_shutdown_super+0x6a/0xf0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81247ac2>] kill_anon_super+0x12/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffffa0ec711b>] lustre_kill_super+0x2b/0x30 [lustre]
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81247ec9>] deactivate_locked_super+0x49/0x60
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81248616>] deactivate_super+0x46/0x60
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81268b1f>] cleanup_mnt+0x3f/0x80
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff81268bb2>] __cleanup_mnt+0x12/0x20
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff810b69b5>] task_work_run+0xb5/0xf0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff8102ccb2>] do_notify_resume+0x92/0xb0
      Sep  6 06:55:07 oleg310-client kernel: [<ffffffff817f4363>] int_signal+0x12/0x17
      Sep  6 06:55:07 oleg310-client kernel: ---[ end trace 37e266df83060980 ]---
       

       

          Activity

            [LU-17097] RCU stall caused by osc_quota_cleanup
            pjones Peter Jones added a comment -

            Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52381/
            Subject: LU-17097 osc: delete items in Xarray before its destroy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a66daa9c1bf40695e10a283dff40a119dfd060bb
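
The subject of the merged patch ("delete items in Xarray before its destroy") describes an ordering: remove and release every stored entry first, then destroy the container. The sketch below is only a userspace toy model of that ordering. `toy_map`, `toy_quota_entry`, and every `toy_*` function are hypothetical stand-ins invented for illustration; they are not the libcfs or kernel XArray API, and this is not the code from patch 52381.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS 8

/* Toy stand-in for an index -> pointer map; NOT the kernel XArray. */
struct toy_map {
	void *slot[TOY_SLOTS];
	size_t live;		/* number of occupied slots */
};

/* Hypothetical per-ID quota record stored in the map. */
struct toy_quota_entry {
	unsigned int id;
};

static void toy_map_init(struct toy_map *m)
{
	for (size_t i = 0; i < TOY_SLOTS; i++)
		m->slot[i] = NULL;
	m->live = 0;
}

/* Store a freshly allocated entry; returns 0 on success, -1 on failure. */
static int toy_map_insert(struct toy_map *m, unsigned int id)
{
	struct toy_quota_entry *e;

	if (id >= TOY_SLOTS || m->slot[id] != NULL)
		return -1;
	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	e->id = id;
	m->slot[id] = e;
	m->live++;
	return 0;
}

/* Remove one entry and hand ownership back to the caller (NULL if absent). */
static void *toy_map_erase(struct toy_map *m, unsigned int id)
{
	void *e;

	if (id >= TOY_SLOTS)
		return NULL;
	e = m->slot[id];
	if (e != NULL) {
		m->slot[id] = NULL;
		m->live--;
	}
	return e;
}

/* Tear down the container itself; in this model it must already be empty. */
static void toy_map_destroy(struct toy_map *m)
{
	assert(m->live == 0);
}

/*
 * Cleanup in the order the patch subject describes: erase and free each
 * entry, then destroy the map. Returns the number of entries released.
 */
static size_t toy_quota_cleanup(struct toy_map *m)
{
	size_t freed = 0;

	for (unsigned int id = 0; id < TOY_SLOTS; id++) {
		void *e = toy_map_erase(m, id);

		if (e != NULL) {
			free(e);
			freed++;
		}
	}
	toy_map_destroy(m);
	return freed;
}
```

A caller would run `toy_map_init()`, add entries with `toy_map_insert()`, and finish with `toy_quota_cleanup()`. The assertion in `toy_map_destroy()` is a design choice of this toy: it makes "destroy while entries remain" fail loudly, which is the situation the patch subject says to avoid.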
            scherementsev Sergey Cheremencev added a comment -

            [ 3896.802710] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [ptlrpcd_00_00:1855]
            [ 3896.804652] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
            [ 3896.814091] CPU: 1 PID: 1855 Comm: ptlrpcd_00_00 Kdump: loaded Tainted: G           OE  ------------   3.10.0-7.9-debug #1
            [ 3896.816533] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
            [ 3896.818482] task: ffff88012b245550 ti: ffff8800ae224000 task.ti: ffff8800ae224000
            [ 3896.820255] RIP: 0010:[<ffffffffa016e00d>]  [<ffffffffa016e00d>] xas_free_nodes+0xdd/0xf0 [libcfs]
            [ 3896.822272] RSP: 0018:ffff8800ae2279f0  EFLAGS: 00000282
            [ 3896.823454] RAX: ffff880136faefe8 RBX: 0000000000000000 RCX: 000000000000000c
            [ 3896.824968] RDX: ffff880136faefea RSI: 0000000000000002 RDI: ffff8800ae227a90
            [ 3896.826450] RBP: ffff8800ae227a18 R08: 000000000000000e R09: 0000000000000000
            [ 3896.827961] R10: 0000000000000000 R11: 0000000008000000 R12: ffff880136faefe8
            [ 3896.829424] R13: ffffffffa016dec6 R14: 0000000000000078 R15: ffff8800ae227fd8
            [ 3896.830997] FS:  0000000000000000(0000) GS:ffff88013e280000(0000) knlGS:0000000000000000
            [ 3896.832791] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [ 3896.834035] CR2: 00007f3a1172c000 CR3: 00000000a8dcc000 CR4: 00000000000006e0
            [ 3896.835631] Call Trace:
            [ 3896.836160]  [<ffffffffa016fa54>] xas_store+0x184/0x540 [libcfs]
            [ 3896.837538]  [<ffffffffa017027c>] __xa_insert+0xdc/0x150 [libcfs]
            [ 3896.838887]  [<ffffffffa08897c0>] osc_quota_setdq+0x220/0x510 [osc]
            [ 3896.840285]  [<ffffffffa0873adc>] osc_brw_fini_request+0xa9c/0x1b10 [osc]
            [ 3896.841694]  [<ffffffffa0874ba7>] brw_interpret+0x57/0xdb0 [osc]
            [ 3896.843082]  [<ffffffffa05b3fc8>] ptlrpc_check_set+0x428/0x2170 [ptlrpc]
            [ 3896.844618]  [<ffffffffa05e4f94>] ptlrpcd+0xa94/0xb70 [ptlrpc]
            [ 3896.845884]  [<ffffffff810baff0>] ? abort_exclusive_wait+0xa0/0xa0
            [ 3896.847259]  [<ffffffffa05e4500>] ? ptlrpcd_partners+0x3a0/0x3a0 [ptlrpc]
            [ 3896.848744]  [<ffffffff810ba114>] kthread+0xe4/0xf0
            [ 3896.849779]  [<ffffffff810ba030>] ? kthread_create_on_node+0x140/0x140
            [ 3896.851231]  [<ffffffff817f3e5d>] ret_from_fork_nospec_begin+0x7/0x21
            [ 3896.852681]  [<ffffffff810ba030>] ? kthread_create_on_node+0x140/0x140
            [ 3896.854083] Code: 5d c3 0f 1f 40 00 41 0f b6 4d 00 4c 89 eb e9 5c ff ff ff 0f 1f 00 48 81 fa 00 10 00 00 0f 86 6b ff ff ff 48 8d 5a fe 0f b6 4a fe <45> 31 e4 e9 3c ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
            [ 3896.859259] Kernel panic - not syncing: softlockup: hung tasks

            https://testing.whamcloud.com/gerrit-janitor/37274/testresults/sanity-quota-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/

            simmonsja James A Simmons added a comment -

            Yes. Patch 52381 will fix the crash
            scherementsev Sergey Cheremencev added a comment - - edited

            Hit the same panic as Alex reported above:

            [ 5388.400457] ------------[ cut here ]------------
            [ 5388.401251] WARNING: CPU: 2 PID: 12374 at lib/debugobjects.c:286 debug_print_object+0x83/0xa0
            [ 5388.402684] ODEBUG: activate active (active state 1) object type: rcu_head hint:           (null)
            [ 5388.404149] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
            [ 5388.411563] CPU: 2 PID: 12374 Comm: umount Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-7.9-debug #1
            [ 5388.413343] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
            [ 5388.414823] Call Trace:
            [ 5388.415251]  [<ffffffff817ded29>] dump_stack+0x19/0x1b
            [ 5388.416116]  [<ffffffff8108d558>] __warn+0xd8/0x100
            [ 5388.416925]  [<ffffffff8108d5df>] warn_slowpath_fmt+0x5f/0x80
            [ 5388.417850]  [<ffffffff81414723>] debug_print_object+0x83/0xa0
            [ 5388.418827]  [<ffffffff814150af>] debug_object_activate+0x1af/0x210
            [ 5388.419897]  [<ffffffffa018be60>] ? xas_alloc+0xd0/0xd0 [libcfs]
            [ 5388.420931]  [<ffffffff8114dc8f>] __call_rcu+0x3f/0x2d0
            [ 5388.421801]  [<ffffffff8114df3d>] call_rcu_sched+0x1d/0x20
            [ 5388.422722]  [<ffffffffa018bf44>] xas_free_nodes+0xa4/0xf0 [libcfs]
            [ 5388.423795]  [<ffffffffa018d26f>] xa_destroy+0xdf/0xf0 [libcfs]
            [ 5388.424779]  [<ffffffffa08a38a5>] osc_quota_cleanup+0x15/0x20 [osc]
            [ 5388.425804]  [<ffffffffa0884f1f>] osc_cleanup_common+0xbf/0x1b0 [osc]
            [ 5388.426902]  [<ffffffffa03068f9>] class_free_dev+0x219/0x730 [obdclass]
            [ 5388.428025]  [<ffffffffa0306ff0>] class_export_put+0x1e0/0x2e0 [obdclass]
            [ 5388.429205]  [<ffffffffa0308c15>] class_unlink_export+0x125/0x160 [obdclass]
            [ 5388.430404]  [<ffffffffa031e18e>] class_decref_free+0x4e/0x90 [obdclass]
            [ 5388.431549]  [<ffffffffa031eaf8>] class_decref+0x48/0xf0 [obdclass]
            [ 5388.432643]  [<ffffffffa031ef01>] class_detach+0x1c1/0x310 [obdclass]
            [ 5388.433728]  [<ffffffffa0326bbb>] class_process_config+0x163b/0x27c0 [obdclass]
            [ 5388.434945]  [<ffffffff81220310>] ? __kmalloc+0x1e0/0x340
            [ 5388.435881]  [<ffffffffa0327f20>] class_manual_cleanup+0x1e0/0x770 [obdclass]
            [ 5388.437115]  [<ffffffffa0919ed5>] lov_tgts_putref+0x385/0xad0 [lov]
            [ 5388.438171]  [<ffffffffa091eea7>] lov_disconnect+0x237/0x280 [lov]
            [ 5388.439178]  [<ffffffffa0eb9c96>] obd_disconnect+0x56/0x300 [lustre]
            [ 5388.440276]  [<ffffffffa0ec30d7>] ll_put_super+0x767/0xce0 [lustre]
            [ 5388.441341]  [<ffffffff8114df3d>] ? call_rcu_sched+0x1d/0x20
            [ 5388.442307]  [<ffffffffa0ef185c>] ? ll_destroy_inode+0x1c/0x20 [lustre]
            [ 5388.443408]  [<ffffffff812639b8>] ? destroy_inode+0x38/0x60
            [ 5388.444361]  [<ffffffff81263aee>] ? evict+0x10e/0x180
            [ 5388.445220]  [<ffffffff817e8d7e>] ? _raw_spin_unlock+0xe/0x20
            [ 5388.446197]  [<ffffffff812907a6>] ? fsnotify_unmount_inodes+0x1d6/0x1e0
            [ 5388.447300]  [<ffffffff812476ca>] generic_shutdown_super+0x6a/0xf0
            [ 5388.448335]  [<ffffffff81247ac2>] kill_anon_super+0x12/0x20
            [ 5388.449275]  [<ffffffffa0ef188b>] lustre_kill_super+0x2b/0x30 [lustre]
            [ 5388.450347]  [<ffffffff81247ec9>] deactivate_locked_super+0x49/0x60
            [ 5388.451397]  [<ffffffff81248616>] deactivate_super+0x46/0x60
            [ 5388.452334]  [<ffffffff81268b1f>] cleanup_mnt+0x3f/0x80
            [ 5388.453213]  [<ffffffff81268bb2>] __cleanup_mnt+0x12/0x20
            [ 5388.454129]  [<ffffffff810b69b5>] task_work_run+0xb5/0xf0
            [ 5388.455070]  [<ffffffff8102ccb2>] do_notify_resume+0x92/0xb0
            [ 5388.456025]  [<ffffffff817f4363>] int_signal+0x12/0x17
            [ 5388.456860] ---[ end trace a193f2979542d3e3 ]---
            [ 5388.457628] BUG: unable to handle kernel NULL pointer dereference at           (null)
            [ 5388.458920] IP: [<ffffffffa018bf58>] xas_free_nodes+0xb8/0xf0 [libcfs]
            [ 5388.460023] PGD 0 
            [ 5388.460373] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
            [ 5388.461195] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
            [ 5388.468704] CPU: 2 PID: 12374 Comm: umount Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-7.9-debug #1
            [ 5388.470441] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
            [ 5388.471875] task: ffff88012b485550 ti: ffff8800a9968000 task.ti: ffff8800a9968000
            [ 5388.473121] RIP: 0010:[<ffffffffa018bf58>]  [<ffffffffa018bf58>] xas_free_nodes+0xb8/0xf0 [libcfs]
            [ 5388.474643] RSP: 0018:ffff8800a996b980  EFLAGS: 00010083
            [ 5388.475541] RAX: 0000000000002710 RBX: ffff8800b0a26910 RCX: 0000000000000005
            [ 5388.476753] RDX: ffff8800b0a26b70 RSI: ffff8800b0a26928 RDI: 0000000000000046
            [ 5388.477945] RBP: ffff8800a996b9a8 R08: 0000000000000000 R09: 5130000000000000
            [ 5388.479144] R10: 6633393161206563 R11: 61727420646e6520 R12: 0000000000000001
            [ 5388.480277] R13: 0000000000000000 R14: ffff8800b0aa06d8 R15: ffff8800a996b9b8
            [ 5388.481433] FS:  00007fbcfa47f880(0000) GS:ffff88013e300000(0000) knlGS:0000000000000000
            [ 5388.482807] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [ 5388.483784] CR2: 0000000000000000 CR3: 00000000b6230000 CR4: 00000000000006e0
            [ 5388.484962] Call Trace:
            [ 5388.485372]  [<ffffffffa018d26f>] xa_destroy+0xdf/0xf0 [libcfs]
            [ 5388.486415]  [<ffffffffa08a38a5>] osc_quota_cleanup+0x15/0x20 [osc]
            [ 5388.487506]  [<ffffffffa0884f1f>] osc_cleanup_common+0xbf/0x1b0 [osc]
            [ 5388.488607]  [<ffffffffa03068f9>] class_free_dev+0x219/0x730 [obdclass]
            [ 5388.489606]  [<ffffffffa0306ff0>] class_export_put+0x1e0/0x2e0 [obdclass]
            [ 5388.490799]  [<ffffffffa0308c15>] class_unlink_export+0x125/0x160 [obdclass]
            [ 5388.492043]  [<ffffffffa031e18e>] class_decref_free+0x4e/0x90 [obdclass]
            [ 5388.493112]  [<ffffffffa031eaf8>] class_decref+0x48/0xf0 [obdclass]
            [ 5388.494192]  [<ffffffffa031ef01>] class_detach+0x1c1/0x310 [obdclass]
            [ 5388.495278]  [<ffffffffa0326bbb>] class_process_config+0x163b/0x27c0 [obdclass]
            [ 5388.496501]  [<ffffffff81220310>] ? __kmalloc+0x1e0/0x340
            [ 5388.497465]  [<ffffffffa0327f20>] class_manual_cleanup+0x1e0/0x770 [obdclass]
            [ 5388.498699]  [<ffffffffa0919ed5>] lov_tgts_putref+0x385/0xad0 [lov]
            [ 5388.499813]  [<ffffffffa091eea7>] lov_disconnect+0x237/0x280 [lov]
            [ 5388.500829]  [<ffffffffa0eb9c96>] obd_disconnect+0x56/0x300 [lustre]
            [ 5388.501904]  [<ffffffffa0ec30d7>] ll_put_super+0x767/0xce0 [lustre]
            [ 5388.503002]  [<ffffffff8114df3d>] ? call_rcu_sched+0x1d/0x20
            [ 5388.503986]  [<ffffffffa0ef185c>] ? ll_destroy_inode+0x1c/0x20 [lustre]
            [ 5388.505124]  [<ffffffff812639b8>] ? destroy_inode+0x38/0x60
            [ 5388.506008]  [<ffffffff81263aee>] ? evict+0x10e/0x180
            [ 5388.506891]  [<ffffffff817e8d7e>] ? _raw_spin_unlock+0xe/0x20
            [ 5388.507887]  [<ffffffff812907a6>] ? fsnotify_unmount_inodes+0x1d6/0x1e0
            [ 5388.509019]  [<ffffffff812476ca>] generic_shutdown_super+0x6a/0xf0
            [ 5388.510087]  [<ffffffff81247ac2>] kill_anon_super+0x12/0x20
            [ 5388.511048]  [<ffffffffa0ef188b>] lustre_kill_super+0x2b/0x30 [lustre]
            [ 5388.512173]  [<ffffffff81247ec9>] deactivate_locked_super+0x49/0x60
            [ 5388.513269]  [<ffffffff81248616>] deactivate_super+0x46/0x60
            [ 5388.514218]  [<ffffffff81268b1f>] cleanup_mnt+0x3f/0x80
            [ 5388.515137]  [<ffffffff81268bb2>] __cleanup_mnt+0x12/0x20
            [ 5388.516044]  [<ffffffff810b69b5>] task_work_run+0xb5/0xf0
            [ 5388.516974]  [<ffffffff8102ccb2>] do_notify_resume+0x92/0xb0
            [ 5388.517922]  [<ffffffff817f4363>] int_signal+0x12/0x17
            [ 5388.518743] Code: 8d 7b 18 48 c7 43 10 01 00 00 00 48 c7 c6 60 be 18 a0 e8 dc 1f fc e0 4c 39 f3 75 b7 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 40 00 <41> 0f b6 4d 00 4c 89 eb e9 5c ff ff ff 0f 1f 00 48 81 fa 00 10 
            [ 5388.523115] RIP  [<ffffffffa018bf58>] xas_free_nodes+0xb8/0xf0 [libcfs]
            [ 5388.524266]  RSP <ffff8800a996b980>
            [ 5388.524876] CR2: 0000000000000000 

            simmonsja, could you comment? Do we need a separate ticket for this? Can we expect https://review.whamcloud.com/c/fs/lustre-release/+/52381 to fix this panic as well?

            https://testing.whamcloud.com/gerrit-janitor/37062/testresults/sanity-quota-zfs-centos7_x86_64-centos7_x86_64/


            close to this issue

            [ 5407.737299] BUG: unable to handle kernel NULL pointer dereference at           (null)
            [ 5407.739021] IP: [<ffffffffa016df58>] xas_free_nodes+0xb8/0xf0 [libcfs]
            [ 5407.740398] PGD 0 
            [ 5407.740789] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
            [ 5407.741671] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) crc32_generic libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 squashfs i2c_piix4 i2c_core pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix serio_raw libata
            [ 5407.750803] CPU: 0 PID: 9909 Comm: umount Kdump: loaded Tainted: G           OE  ------------   3.10.0-7.9-debug #1
            [ 5407.753185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
            [ 5407.755316] task: ffff8800b6498000 ti: ffff8800a804c000 task.ti: ffff8800a804c000
            [ 5407.756831] RIP: 0010:[<ffffffffa016df58>]  [<ffffffffa016df58>] xas_free_nodes+0xb8/0xf0 [libcfs]
            [ 5407.758521] RSP: 0018:ffff8800a804f950  EFLAGS: 00010083
            [ 5407.759476] RAX: 0000000000002710 RBX: ffff880136ef66c8 RCX: 0000000000000023
            [ 5407.760702] RDX: ffff880136ef5930 RSI: ffff880136ef66e0 RDI: 0000000000000046
            [ 5407.761989] RBP: ffff8800a804f978 R08: 0000000000000000 R09: 77b0000000000000
            [ 5407.763203] R10: 00000000af0c2d01 R11: ffff8800af0c2f80 R12: 000000000000002a
            [ 5407.764433] R13: 0000000000000000 R14: ffff880136ef56d0 R15: ffff8800a804f988
            [ 5407.765708] FS:  00007f19ba87f880(0000) GS:ffff88013e200000(0000) knlGS:0000000000000000
            [ 5407.767225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [ 5407.768257] CR2: 0000000000000000 CR3: 00000000a4d64000 CR4: 00000000000006f0
            [ 5407.769490] Call Trace:
            [ 5407.769952]  [<ffffffffa016f26f>] xa_destroy+0xdf/0xf0 [libcfs]
            [ 5407.771357]  [<ffffffffa08814c5>] osc_quota_cleanup+0x15/0x20 [osc]
            [ 5407.773200]  [<ffffffffa0862f1f>] osc_cleanup_common+0xbf/0x1b0 [osc]
            [ 5407.774839]  [<ffffffffa02e88f9>] class_free_dev+0x219/0x730 [obdclass]
            [ 5407.776139]  [<ffffffffa02e8ff0>] class_export_put+0x1e0/0x2e0 [obdclass]
            [ 5407.777489]  [<ffffffffa02eac15>] class_unlink_export+0x125/0x160 [obdclass]
            [ 5407.778876]  [<ffffffffa030016e>] class_decref_free+0x4e/0x90 [obdclass]
            [ 5407.780371]  [<ffffffffa0300ad8>] class_decref+0x48/0xf0 [obdclass]
            [ 5407.781595]  [<ffffffffa0300ee1>] class_detach+0x1c1/0x310 [obdclass]
            [ 5407.782914]  [<ffffffffa0308b9b>] class_process_config+0x163b/0x27c0 [obdclass]
            [ 5407.784281]  [<ffffffff81220310>] ? __kmalloc+0x1e0/0x340
            [ 5407.785303]  [<ffffffffa0309f00>] class_manual_cleanup+0x1e0/0x770 [obdclass]
            [ 5407.786693]  [<ffffffffa08f7955>] lov_tgts_putref+0x385/0xad0 [lov]
            [ 5407.787835]  [<ffffffffa08fc927>] lov_disconnect+0x237/0x280 [lov]
            [ 5407.789099]  [<ffffffffa0e82b46>] obd_disconnect+0x56/0x300 [lustre]
            

            https://testing.whamcloud.com/gerrit-janitor/35992/testresults/sanity-quota-zfs-centos7_x86_64-centos7_x86_64/

            aboyko Alexander Boyko added a comment

            Sorry it took a while to run the patch. As you can see, sanity-quota passes with a normal non-debug kernel.

            https://testing.whamcloud.com/test_sets/921b8bf1-b5ba-4664-9d6c-1c4c164e543c

            simmonsja James A Simmons added a comment

            I just pushed a patch to validate the xarray work with a normal RHEL7 kernel. 

            simmonsja James A Simmons added a comment

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52381
            Subject: LU-17097 tests: validate xarray on RHEL7 non debug kernels
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: f615ad49ca9d4673acdbdafdbea595b911c6269e

            gerrit Gerrit Updater added a comment

            Hi simmonsja 

            I'm working with Oleg to move to RHEL8 for debug kernel testing. This should go away then.

            Thanks for the update. Does this mean there is no issue with your patch, and that the failure is caused by some problem in the RHEL7 debug kernel?

            scherementsev Sergey Cheremencev added a comment

            People

              simmonsja James A Simmons
              scherementsev Sergey Cheremencev