[LU-13587] sanity-quota test_68: Oops: RIP: qpi_state_seq_show+0x86/0xe0 [lquota] Created: 19/May/20 Updated: 22/Apr/22 Resolved: 06/Jan/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Sergey Cheremencev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Description |
|
This issue was created by maloo for jianyu <yujian@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/e4263a98-dc1d-45bd-a14b-079369daac21

test_68 failed with the following error:

MDS crashed during sanity-quota test_68:

Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n qmt.lustre-QMT0000.dt-qpool1.info
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffc14b5216>] qpi_state_seq_show+0x86/0xe0 [lquota]
PGD 800000005ec8d067 PUD 79c98067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) zfs(POE) zunicode(POE) zlua(POE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt rpcrdma rdma_ucm ib_iser rdma_cm iw_cm libiscsi ib_umad scsi_transport_iscsi ib_ipoib ib_cm mlx4_ib sunrpc ib_uverbs ib_core dm_mod iosf_mbi crc32_pclmul ppdev ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev i2c_piix4 virtio_balloon pcspkr parport_pc parport ip_tables ext4 mbcache jbd2 ata_generic pata_acpi mlx4_en ptp pps_core mlx4_core virtio_blk ata_piix 8139too crct10dif_pclmul crct10dif_common libata crc32c_intel devlink serio_raw virtio_pci virtio_ring virtio 8139cp mii floppy
CPU: 1 PID: 28101 Comm: lctl Kdump: loaded Tainted: P OE ------------ 3.10.0-1062.9.1.el7_lustre.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffff9f3b9a8ea0e0 ti: ffff9f3b9f36c000 task.ti: ffff9f3b9f36c000
RIP: 0010:[<ffffffffc14b5216>] [<ffffffffc14b5216>] qpi_state_seq_show+0x86/0xe0 [lquota]
RSP: 0018:ffff9f3b9f36fe28 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
RDX: ffffffffc14c0b2d RSI: ffffffffc14c5e08 RDI: ffff9f3b8b4ef180
RBP: ffff9f3b9f36fe40 R08: 000000000000000a R09: 000000000000fffe
R10: 0000000000000000 R11: ffff9f3b9f36fcbe R12: ffff9f3b91286800
R13: ffff9f3b8b4ef180 R14: ffff9f3b9f36ff18 R15: ffff9f3b8b4ef180
FS: 00007f86d2e0e740(0000) GS:ffff9f3bbfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000004bb4e000 CR4: 00000000000606e0
Call Trace:
[<ffffffff8de72780>] seq_read+0x130/0x440
[<ffffffff8dec2d00>] proc_reg_read+0x40/0x80
[<ffffffff8de4a65f>] vfs_read+0x9f/0x170
[<ffffffff8de4b51f>] SyS_read+0x7f/0xf0
[<ffffffff8e38de21>] ? system_call_after_swapgs+0xae/0x146
[<ffffffff8e38dede>] system_call_fastpath+0x25/0x2a
[<ffffffff8e38de21>] ? system_call_after_swapgs+0xae/0x146
Code: 5c a8 01 00 00 41 8b 8c 1c c0 01 00 00 48 c7 c6 08 5e 4c c1 41 03 8c 1c cc 01 00 00 48 8b 94 1b e0 eb 4b c1 4c 89 ef 48 83 c3 04 <48> 8b 00 44 8b 40 40 31 c0 e8 1c cc 9b cc 48 83 fb 0c 75 be 5b
RIP [<ffffffffc14b5216>] qpi_state_seq_show+0x86/0xe0 [lquota]
RSP <ffff9f3b9f36fe28>
CR2: 0000000000000000 |
| Comments |
| Comment by Jian Yu [ 19/May/20 ] |
|
One more instance on master branch: |
| Comment by John Hammond [ 04/Feb/21 ] |
|
This is very easy to reproduce by getting the pool info files in a loop and creating a new pool. When the info proc file is registered (in qmt_pool_alloc()), the qpi_site pointers are still NULL. They are not set until later in qmt_pool_prepare(). The test was added by |
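The ordering problem described in the comment above can be modelled in user space. The following is only a sketch, not the merged patch: the names qpi_site, qmt_pool_alloc() and qmt_pool_prepare() come from this ticket, while struct site, struct pool_info, QTYPE_CNT, pool_alloc(), pool_prepare() and pool_state_show() are hypothetical stand-ins. It shows one way a show callback can tolerate being called between allocation and preparation, by checking each site pointer before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define QTYPE_CNT 3 /* hypothetical: number of quota types modelled here */

/* Hypothetical stand-in for the per-type site object. */
struct site {
	int nr_entries;
};

/* Hypothetical stand-in for the pool; qpi_site is the name used in the ticket. */
struct pool_info {
	struct site *qpi_site[QTYPE_CNT]; /* NULL until pool_prepare() runs */
};

/* Models qmt_pool_alloc(): the pool becomes visible with NULL site pointers. */
static void pool_alloc(struct pool_info *pool)
{
	assert(pool != NULL);
	memset(pool, 0, sizeof(*pool));
}

/* Models qmt_pool_prepare(): the site pointers are only set here. */
static void pool_prepare(struct pool_info *pool, struct site *sites)
{
	assert(pool != NULL && sites != NULL);
	for (int i = 0; i < QTYPE_CNT; i++)
		pool->qpi_site[i] = &sites[i];
}

/*
 * Models the "info" show callback. Returns the number of bytes written,
 * or -1 if the buffer is too small. A reader that arrives before
 * pool_prepare() gets a "not ready" line instead of a NULL dereference.
 */
static int pool_state_show(const struct pool_info *pool, char *buf, size_t len)
{
	size_t off = 0;

	for (int qtype = 0; qtype < QTYPE_CNT; qtype++) {
		const struct site *site = pool->qpi_site[qtype];
		int n;

		if (site == NULL)
			n = snprintf(buf + off, len - off,
				     "type %d: not ready\n", qtype);
		else
			n = snprintf(buf + off, len - off,
				     "type %d: %d entries\n", qtype,
				     site->nr_entries);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

An alternative with the same effect would be to register the info file only after the pool is fully prepared, so that no reader can observe the half-initialized state. Which approach the merged change takes is not stated in this ticket.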
| Comment by Gerrit Updater [ 11/Jun/21 ] |
|
Sergey Cheremencev (sergey.cheremencev@hpe.com) uploaded a new patch: https://review.whamcloud.com/43986 |
| Comment by Gerrit Updater [ 11/Jun/21 ] |
|
Sergey Cheremencev (sergey.cheremencev@hpe.com) uploaded a new patch: https://review.whamcloud.com/43987 |
| Comment by Gerrit Updater [ 06/Jan/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/43987/ |
| Comment by Cory Spitz [ 06/Jan/22 ] |
|
Fixed for 2.15.0. https://review.whamcloud.com/#/c/43986/ remains, but it is to be abandoned since the test it added was included in https://review.whamcloud.com/#/c/43987/. |