Details
Bug
Resolution: Fixed
Major
Lustre 2.8.0
3
Description
While testing current master, I hit this:
<4>[14176.514805] Lustre: lustre-MDT0000-mdc-ffff880058b347f0: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
<4>[14176.529190] Lustre: Skipped 34 previous similar messages
<1>[14176.531440] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
<1>[14176.531442] IP: [<ffffffffa094ee36>] old_init_ucred+0x156/0x390 [mdt]
<4>[14176.531460] PGD 8cea9067 PUD 8ceaa067 PMD 0
<4>[14176.531462] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
<4>[14176.531464] last sysfs file: /sys/devices/system/cpu/possible
<4>[14176.531465] CPU 1
<4>[14176.531466] Modules linked in: lustre ofd osp lod ost mdt mdd mgs osd_ldiskfs ldiskfs lquota lfsck obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 jbd2 mbcache virtio_console virtio_balloon i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
<4>[14176.531488]
<4>[14176.531489] Pid: 4336, comm: mdt00_004 Not tainted 2.6.32-rhe6.7-debug #1 Red Hat KVM
<4>[14176.531491] RIP: 0010:[<ffffffffa094ee36>] [<ffffffffa094ee36>] old_init_ucred+0x156/0x390 [mdt]
<4>[14176.531503] RSP: 0018:ffff8800972b3b20 EFLAGS: 00010287
<4>[14176.531504] RAX: 0000000000000000 RBX: ffff8800983940e0 RCX: 0000000000000000
<4>[14176.531505] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8800900dbc70
<4>[14176.531506] RBP: ffff8800972b3b60 R08: 00000000ffffffec R09: 00000000ffffffef
<4>[14176.531508] R10: 000000000000000f R11: 000000000000000f R12: ffff8800969c4f30
<4>[14176.531509] R13: ffff8800900cf7f0 R14: 0000000000000000 R15: ffff8800b2d5c000
<4>[14176.531511] FS: 0000000000000000(0000) GS:ffff880006240000(0000) knlGS:0000000000000000
<4>[14176.531512] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>[14176.531513] CR2: 000000000000001c CR3: 000000008cea8000 CR4: 00000000000006e0
<4>[14176.531517] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[14176.531518] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[14176.531519] Process mdt00_004 (pid: 4336, threadinfo ffff8800972b0000, task ffff88006ba1e080)
<4>[14176.531520] Stack:
<4>[14176.531521] ffff8800972b3b30 00ffffffa104b390 ffff8800972b3b40 ffff8800900cf7f0
<4>[14176.531523] <d> ffff8800969c4f30 ffff880058e65ce8 0000000000001000 0000000000000013
<4>[14176.531525] <d> ffff8800972b3b90 ffffffffa0950e9d 0000000000001000 ffff8800900cf7f0
<4>[14176.531527] Call Trace:
<4>[14176.531537] [<ffffffffa0950e9d>] mdt_init_ucred_intent_getattr+0x9d/0xe0 [mdt]
<4>[14176.531546] [<ffffffffa094ad51>] mdt_intent_getattr+0x1e1/0x470 [mdt]
<4>[14176.531554] [<ffffffffa093a694>] mdt_intent_policy+0x494/0xc40 [mdt]
<4>[14176.531585] [<ffffffffa11b211f>] ldlm_lock_enqueue+0x12f/0x860 [ptlrpc]
<4>[14176.531613] [<ffffffffa11de067>] ldlm_handle_enqueue0+0x807/0x1580 [ptlrpc]
<4>[14176.531650] [<ffffffffa1264dd1>] tgt_enqueue+0x61/0x230 [ptlrpc]
<4>[14176.531681] [<ffffffffa126585c>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
<4>[14176.531710] [<ffffffffa1210b74>] ptlrpc_main+0xd74/0x1850 [ptlrpc]
<4>[14176.531738] [<ffffffffa120fe00>] ? ptlrpc_main+0x0/0x1850 [ptlrpc]
<4>[14176.531742] [<ffffffff8109f82e>] kthread+0x9e/0xc0
<4>[14176.531745] [<ffffffff8100c2ca>] child_rip+0xa/0x20
<4>[14176.531747] [<ffffffff8109f790>] ? kthread+0x0/0xc0
<4>[14176.531748] [<ffffffff8100c2c0>] ? child_rip+0x0/0x20
<4>[14176.531749] Code: c7 c7 57 4d 99 a0 f3 a6 0f 84 37 01 00 00 89 c6 48 89 d7 e8 0d e0 01 00 48 3d 00 f0 ff ff 0f 87 8a 01 00 00 48 89 43 40 8b 43 04 <41> 3b 46 1c 0f 84 23 01 00 00 49 8b 55 00 31 c0 48 85 d2 74 03
<1>[14176.531764] RIP [<ffffffffa094ee36>] old_init_ucred+0x156/0x390 [mdt]
<4>[14176.531774] RSP <ffff8800972b3b20>
<4>[14176.531775] CR2: 000000000000001c
This is in replay-dual test 26.
Code is
(gdb) l *(old_init_ucred+0x156)
0x1ee66 is in old_init_ucred (/home/green/git/lustre-release/lustre/mdt/mdt_lib.c:469).
464	}
465
466	static void mdt_squash_nodemap_id(struct lu_ucred *ucred,
467					  struct lu_nodemap *nodemap)
468	{
469		if (ucred->uc_o_uid == nodemap->nm_squash_uid) {
470			ucred->uc_fsuid = nodemap->nm_squash_uid;
471			ucred->uc_fsgid = nodemap->nm_squash_gid;
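nodemap is NULL in this case: the faulting instruction (bytes 41 3b 46 1c, i.e. cmp 0x1c(%r14),%eax) reads at offset 0x1c from R14, R14 is 0, and CR2 is 000000000000001c.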
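For illustration only, here is a minimal userspace sketch of the kind of guard that would avoid the dereference at line 469. It is not the fix that was landed for this ticket. The two structs are hypothetical, trimmed-down stand-ins that model only the fields visible in the gdb listing, and the function name squash_nodemap_id_checked and its return value are assumptions. Whether a NULL nodemap should be skipped here or rejected earlier by the caller is a separate question this sketch does not answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Lustre structures: only the fields
 * that appear in the gdb listing are modelled, and their layout does
 * not match the real kernel definitions. */
struct lu_nodemap {
	uint32_t nm_squash_uid;
	uint32_t nm_squash_gid;
};

struct lu_ucred {
	uint32_t uc_o_uid;
	uint32_t uc_fsuid;
	uint32_t uc_fsgid;
};

/* Same comparison and assignments as lines 469-471 of the listing,
 * but a NULL nodemap is skipped instead of dereferenced.
 * Returns 1 if the credentials were squashed, 0 if left untouched. */
static int squash_nodemap_id_checked(struct lu_ucred *ucred,
				     const struct lu_nodemap *nodemap)
{
	assert(ucred != NULL);

	/* The crash in this ticket: nodemap was NULL at this point. */
	if (nodemap == NULL)
		return 0;

	if (ucred->uc_o_uid == nodemap->nm_squash_uid) {
		ucred->uc_fsuid = nodemap->nm_squash_uid;
		ucred->uc_fsgid = nodemap->nm_squash_gid;
		return 1;
	}

	return 0;
}
```

With a NULL nodemap the call returns 0 and leaves the credentials unchanged. With a valid nodemap it behaves like the three lines shown in the listing.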