Details
-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
Lustre 2.5.1
-
None
-
Lustre servers: 2.4.3
Lustre clients: 2.5.1
-
2
-
14248
Description
The setup is as follows:
There are two filesystems: pfs2dat2 and pfs2wor2
Clients:
uc1n996
uc1n997
For pfs2dat2:
MDS: pfs2n12/13
OSS: pfs2n14/15
For pfs2wor2:
MDS: pfs2n16/17
OSS: pfs2n18/19/20/21
The two MDSes involved in the failover were pfs2n12 and pfs2n13. The client uc1n996 panicked with the following stack trace:
last sysfs file: /sys/devices/system/cpu/online
CPU 5
Modules linked in: iptable_filter ip_tables nfs lockd fscache auth_rpcgss nfs_acl sunrpc lmv(U) fld(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic crc32c_intel libcfs(U) ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_multipath vhost_net macvtap macvlan tun kvm_intel kvm uinput microcode iTCO_wdt iTCO_vendor_support acpi_pad power_meter dcdbas sg mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core sb_edac edac_core lpc_ich mfd_core shpchp igb i2c_algo_bit i2c_core ixgbe dca ptp pps_core mdio xfs exportfs sd_mod crc_t10dif wmi ahci megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 2895, comm: ptlrpcd_rcv Not tainted 2.6.32-431.11.2.el6.x86_64 #1 Dell Inc. PowerEdge R620/0PXXHP
RIP: 0010:[<ffffffffa0708bde>] [<ffffffffa0708bde>] lustre_msg_get_opc+0xe/0x110 [ptlrpc]
RSP: 0018:ffff88082b5ddc80 EFLAGS: 00010282
RAX: ffff8800a585e208 RBX: 0000000000000000 RCX: ffff8801a22893a0
RDX: 0000000000000002 RSI: 0000000000000000 RDI: 3237323033093932
RBP: ffff88082b5ddc90 R08: 0000000000000000 R09: 00000000fffffffc
R10: 0000000000000002 R11: 0000000000000004 R12: ffff8809421d7000
R13: ffff8800a585e208 R14: 00000032a434f11a R15: ffff8801a22890c8
FS: 0000000000000000(0000) GS:ffff88085c440000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000346b2727d0 CR3: 000000102a8e5000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ptlrpcd_rcv (pid: 2895, threadinfo ffff88082b5dc000, task ffff8808314deaa0)
Stack:
ffff88082b5ddc90 0000000000000000 ffff88082b5ddcd0 ffffffffa08b6c2d
<d> ffff880563411000 ffff8801a2289000 ffff8801a2289000 ffff88102d915800
<d> ffff8801a22892e0 00000032a434f11a ffff88082b5ddd00 ffffffffa06fd312
Call Trace:
[<ffffffffa08b6c2d>] mdc_replay_open+0xad/0x420 [mdc]
[<ffffffffa06fd312>] ptlrpc_replay_interpret+0x142/0x740 [ptlrpc]
[<ffffffffa06fe994>] ptlrpc_check_set+0x2c4/0x1b40 [ptlrpc]
[<ffffffffa0729ebb>] ptlrpcd_check+0x53b/0x560 [ptlrpc]
[<ffffffffa072a3db>] ptlrpcd+0x20b/0x370 [ptlrpc]
[<ffffffff81065df0>] ? default_wake_function+0x0/0x20
[<ffffffffa072a1d0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
[<ffffffff8109aee6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ae50>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
Code: 24 48 48 83 c4 68 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 45 31 e4 e9 26 ff ff ff 90 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 <81> 7f 08 d3 0b d0 0b 48 89 fb 74 76 c7 05 fc 7e 0a 00 00 01 00
RIP [<ffffffffa0708bde>] lustre_msg_get_opc+0xe/0x110 [ptlrpc]
RSP <ffff88082b5ddc80>
--[ end trace ee65cdcf6a61aa8a ]--
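A possible way to read the trace above, offered as a hedged aside rather than a confirmed diagnosis: the faulting instruction bytes (<81> 7f 08 d3 0b d0 0b) decode to a 32-bit compare of the value at offset 8 from RDI against 0x0bd00bd3, so RDI is presumably the message pointer handed to lustre_msg_get_opc. The small Python sketch below checks whether that RDI value looks like a kernel pointer at all. The helper names decode_register_as_ascii and is_canonical are illustrative only, and the 48-bit canonical-address rule is assumed for this x86_64 kernel.

```python
import struct


def decode_register_as_ascii(hex_value: str) -> str:
    """Render a 64-bit register value as the bytes it would occupy in
    little-endian memory, escaping non-printable bytes as \\xNN."""
    raw = struct.pack("<Q", int(hex_value, 16))
    return "".join(chr(b) if 32 <= b < 127 else "\\x%02x" % b for b in raw)


def is_canonical(addr: int) -> bool:
    """Assumed 48-bit virtual addressing: bits 63..47 must be all 0 or all 1."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    rdi = "3237323033093932"  # RDI from the register dump above
    print(decode_register_as_ascii(rdi))          # 29\x0930272
    print(is_canonical(int(rdi, 16)))             # False
    print(is_canonical(0xFFFF88082B5DDC80))       # True (RSP from the dump)
```

Under these assumptions, RDI is not a canonical address, and its bytes read as ASCII digits separated by a tab. That would be consistent with the request's message pointer referring to memory that had been freed or overwritten with text before the replay callback ran, although the trace alone does not prove it.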
Issue Links
is related to
LU-5507 sanity-quota test_18: Oops: IP: lustre_msg_get_opc+0xe/0x110 [ptlrpc] (Resolved)