Details
Bug
Resolution: Fixed
Major
Lustre 2.13.0, Lustre 2.12.1
3
9223372036854775807
Description
We are seeing this sort of crash on master semi-regularly in full testing.
[11400.772957] Lustre: DEBUG MARKER: == sanity-quota test 51: Test project accounting with mv/cp ========================================== 11:53:35 (1549367615)
[11402.675816] Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0
[11403.528916] Lustre: DEBUG MARKER: lctl set_param -n osd*.*OS*.force_sync=1
[11404.500328] Lustre: DEBUG MARKER: lctl set_param -n osd*.*OS*.force_sync=1
[11404.961396] BUG: unable to handle kernel paging request at ffff9775f906a000
[11404.962319] IP: [<ffffffffc0b453d5>] lustre_swab_fiemap+0x85/0xa0 [ptlrpc]
[11404.963223] PGD 56852067 PUD 56856067 PMD 7a5ba063 PTE 800000007906a061
[11404.963934] Oops: 0003 [#1] SMP
[11404.964307] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc dm_mod iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul i2c_piix4 parport_pc pcspkr joydev parport glue_helper virtio_balloon ablk_helper cryptd ip_tables ext4 mbcache jbd2 virtio_blk ata_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel serio_raw floppy ata_piix
[11404.972408] 8139too libata virtio_pci virtio_ring virtio 8139cp mii
[11404.973038] CPU: 0 PID: 16619 Comm: ll_ost00_001 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.1.3.el7_lustre.x86_64 #1
[11404.974208] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[11404.974750] task: ffff9775f9b51040 ti: ffff9775f9d58000 task.ti: ffff9775f9d58000
[11404.975462] RIP: 0010:[<ffffffffc0b453d5>] [<ffffffffc0b453d5>] lustre_swab_fiemap+0x85/0xa0 [ptlrpc]
[11404.976388] RSP: 0018:ffff9775f9d5bbb8 EFLAGS: 00010202
[11404.976899] RAX: ffff9775f9069fd8 RBX: 0000000000000000 RCX: 00000000000048c7
[11404.977570] RDX: 00000000000002d7 RSI: 0000000000000002 RDI: ffff9775f90600e8
[11404.978244] RBP: ffff9775f9d5bbb8 R08: 00000000000000b8 R09: 00000000000000e8
[11404.978917] R10: 0000000000000000 R11: 0000000000000020 R12: ffffffffc0b41120
[11404.979595] R13: ffffffffc0c3c260 R14: ffff9775f90600e8 R15: ffffffffc0c41120
[11404.980273] FS: 0000000000000000(0000) GS:ffff9775ffc00000(0000) knlGS:0000000000000000
[11404.981037] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11404.981586] CR2: ffff9775f906a000 CR3: 0000000079bc6000 CR4: 00000000000606f0
[11404.982265] Call Trace:
[11404.982573] [<ffffffffc0b69477>] __req_capsule_get+0x4c7/0x740 [ptlrpc]
[11404.983253] [<ffffffffc0b45350>] ? lustre_swab_obd_quotactl+0xb0/0xb0 [ptlrpc]
[11404.983989] [<ffffffffc0b69705>] req_capsule_client_get+0x15/0x20 [ptlrpc]
[11404.984682] [<ffffffffc0ed7d4e>] ofd_get_info_hdl+0x54e/0x10f0 [ofd]
[11404.985332] [<ffffffffc0b41752>] ? lustre_msg_get_opc+0x22/0xf0 [ptlrpc]
[11404.986050] [<ffffffffc0bab149>] ? tgt_request_preprocess.isra.31+0x299/0x7a0 [ptlrpc]
[11404.986839] [<ffffffffc0bac40a>] tgt_request_handle+0xafa/0x1590 [ptlrpc]
[11404.987523] [<ffffffffc070cf07>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[11404.988188] [<ffffffffc0b4f99e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[11404.988951] [<ffffffff8d4cba9b>] ? __wake_up_common+0x5b/0x90
[11404.989539] [<ffffffffc0b5347c>] ptlrpc_main+0xbbc/0x2090 [ptlrpc]
[11404.990155] [<ffffffff8d4d0880>] ? finish_task_switch+0x50/0x1c0
[11404.990760] [<ffffffffc0b528c0>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[11404.991477] [<ffffffff8d4c1c31>] kthread+0xd1/0xe0
[11404.991958] [<ffffffff8d4c1b60>] ? insert_kthread_work+0x40/0x40
[11404.992556] [<ffffffff8db74c37>] ret_from_fork_nospec_begin+0x21/0x21
[11404.993184] [<ffffffff8d4c1b60>] ? insert_kthread_work+0x40/0x40
[11404.993761] Code: 29 c8 48 8d 44 07 20 48 8b 08 48 0f c9 48 89 08 48 8b 48 08 48 0f c9 48 89 48 08 48 8b 48 10 48 0f c9 48 89 48 10 8b 48 28 0f c9 <89> 48 28 8b 48 2c 0f c9 89 48 2c 39 57 14 77 b3 5d c3 66 0f 1f
[11404.996791] RIP [<ffffffffc0b453d5>] lustre_swab_fiemap+0x85/0xa0 [ptlrpc]
[11404.997499] RSP <ffff9775f9d5bbb8>
[11404.997845] CR2: ffff9775f906a000
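For anyone triaging the oops above, the fault details can be cross-checked from the dump itself. The sketch below is not part of the original report: the helper name `decode_oops` and the variable names are illustrative, the bit meanings are the standard x86 page-fault error-code bits, and the 0x28 displacement is read from the faulting bytes `<89> 48 28` (a 32-bit store of %ecx to 0x28(%rax)).

```python
# Standard x86 page-fault error-code bits (low three bits only).
PF_PROT = 0x1   # 1 = protection violation on a present page, 0 = page not present
PF_WRITE = 0x2  # 1 = write access, 0 = read access
PF_USER = 0x4   # 1 = fault taken in user mode, 0 = kernel mode


def decode_oops(code):
    """Illustrative helper: split an Oops error code into its low flag bits."""
    return {
        "protection_fault": bool(code & PF_PROT),
        "write": bool(code & PF_WRITE),
        "user_mode": bool(code & PF_USER),
    }


# Values copied from the trace above.
oops = decode_oops(0x0003)
rax = 0xFFFF9775F9069FD8
cr2 = 0xFFFF9775F906A000
pte = 0x800000007906A061

store_disp = 0x28                  # displacement in "<89> 48 28"
faulting_addr = rax + store_disp   # address the store was targeting
pte_writable = bool(pte & 0x2)     # PTE bit 1 is the R/W bit
page_aligned = cr2 % 4096 == 0     # fault on the first byte of a 4 KiB page

print(oops)
print(hex(faulting_addr), faulting_addr == cr2)
print("PTE writable:", pte_writable, "| page aligned:", page_aligned)
```

Read this way, the dump shows a kernel-mode write to a present but non-writable page, at an address that is exactly RAX + 0x28 and sits on a page boundary. That is consistent with the byte-swap loop in lustre_swab_fiemap storing past the end of its buffer, although the trace alone does not confirm the root cause.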
Sample reports:
https://testing.whamcloud.com/test_sessions/a7100cee-9175-47ca-aaa4-509431ebb316
https://testing.whamcloud.com/test_sessions/99cd3009-304d-4d61-901e-1eeebea118af
https://testing.whamcloud.com/test_sets/247fe132-2970-11e9-b901-52540065bddc (this one failed on Feb 5th)