[LU-1799] Oops: Kernel access of bad area with IPv6 address Created: 28/Aug/12 Updated: 26/Jun/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Ned Bass | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | llnl | ||
| Environment: |
https://github.com/chaos/lustre/commits/orion-2_3_49_54_2-62chaos |
||
| Severity: | 3 |
| Project: | Orion |
| Rank (Obsolete): | 4447 |
| Description |
|
IBM reported this kernel panic on their BGQ IO node when loading the ptlrpc module with an o2ib network. The IB interface had an IPv4 and IPv6 address. Removing the IPv6 address avoided the crash.
{0}.1.0: Unable to handle kernel paging request for data at address 0x00000138
{0}.1.0: Faulting instruction address: 0xc0000000002cb518
{0}.1.0: Oops: Kernel access of bad area, sig: 11 [#1]
{0}.1.0: SMP NR_CPUS=68 Blue Gene/Q
{0}.1.0: Modules linked in: ko2iblnd(U) ptlrpc(+)(U) obdclass(U) lnet(U) lvfs(U) libcfs(U)
{0}.1.0: NIP: c0000000002cb518 LR: 80000000035f82a8 CTR: c0000000002cb4fc
{0}.1.0: REGS: c0000003e4c27290 TRAP: 0300 Not tainted (2.6.32-220.23.3.bgq.el6_V1R1M2_0.ppc64)
{0}.1.0: MSR: 0000000080029000 <EE,ME,CE> CR: 24228480 XER: 20000000
{0}.1.0: DEAR: 0000000000000138, ESR: 0000000000000000
{0}.1.0: TASK = c0000003ec085b20[2170] 'modprobe' THREAD: c0000003e4c24000 CPU: 4
{0}.1.0: GPR00: 80000000035f82a8 c0000003e4c27510 c0000000006bc1c0 0000000000000000
{0}.1.0: GPR04: 0000000000000000 0000000000000000 c0000003ebdd0118 c000000002342408
{0}.1.0: GPR08: 0000000000000001 0000000000000000 0000000000008000 c0000000002cb4fc
{0}.1.0: GPR12: 8000000003612280 c000000000725900 0000000000000400 0000000000000730
{0}.1.0: GPR16: 0000000000000000 8000000000dd24b8 0000000000000000 0000000000000010
{0}.1.0: GPR20: 80000000007607c0 0000000000003590 8000000000754608 800000000361b050
{0}.1.0: GPR24: c0000003efa99400 c0000003e4c27610 c0000003e4c27600 c0000003e4c27620
{0}.1.0: GPR28: c0000003ebdd00c0 c0000003ebc60600 800000000362c568 0000000000000000
{0}.1.0: NIP [c0000000002cb518] .ib_alloc_pd+0x1c/0x6c
{0}.1.0: LR [80000000035f82a8] .kiblnd_dev_failover+0x228/0xc30 [ko2iblnd]
{0}.1.0: Call Trace:
{0}.1.0: [c0000003e4c27510] [c0000003e4c27670] 0xc0000003e4c27670 (unreliable)
{0}.1.0: [c0000003e4c27590] [80000000035f82a8] .kiblnd_dev_failover+0x228/0xc30 [ko2iblnd]
{0}.1.0: [c0000003e4c276f0] [80000000035f8e2c] .kiblnd_create_dev+0x17c/0x650 [ko2iblnd]
{0}.1.0: [c0000003e4c277e0] [80000000035fff58] .kiblnd_startup+0x3a8/0x6e0 [ko2iblnd]
{0}.1.0: [c0000003e4c278d0] [8000000000d962fc] .lnet_startup_lndnis+0x1bc/0xa50 [lnet]
{0}.1.0: [c0000003e4c27a00] [8000000000d96d24] .LNetNIInit+0x194/0x2b0 [lnet]
{0}.1.0: [c0000003e4c27ac0] [80000000031351e4] .ptlrpc_ni_init+0x84/0x260 [ptlrpc]
{0}.1.0: [c0000003e4c27b80] [8000000003135794] .ptlrpc_init_portals+0x34/0x1c0 [ptlrpc]
{0}.1.0: [c0000003e4c27c30] [80000000031825e8] .init_module+0x158/0x7dc8 [ptlrpc]
{0}.1.0: [c0000003e4c27cd0] [c000000000000e74] .do_one_initcall+0x88/0x1bc
{0}.1.0: [c0000003e4c27d80] [c00000000006c910] .SyS_init_module+0x11c/0x2b4
{0}.1.0: [c0000003e4c27e30] [c000000000000580] syscall_exit+0x0/0x2c
{0}.1.0: Instruction dump:
{0}.1.0: 38210080 e8010010 ebe1fff8 7c0803a6 4e800020 7c0802a6 fbe1fff8 38800000
{0}.1.0: 38a00000 7c7f1b78 f8010010 f821ff81 <e9230138> e8090000 f8410028 7c0903a6
{0}.1.0: Kernel panic - not syncing: Fatal exception
{0}.1.0: Call Trace:
{0}.1.0: [c0000003e4c26fc0] [c000000000008190] .show_stack+0x7c/0x184 (unreliable)
{0}.1.0: [c0000003e4c27070] [c000000000423614] .panic+0x80/0x1a8
{0}.1.0: [c0000003e4c27100] [c000000000018d58] .die+0x1a4/0x1bc
{0}.1.0: [c0000003e4c271a0] [c00000000001e9e0] .bad_page_fault+0xb8/0xd4
{0}.1.0: [c0000003e4c27220] [c000000000013e4c] storage_fault_common+0x48/0x4c
{0}.1.0: --- Exception: 300 at .ib_alloc_pd+0x1c/0x6c
{0}.1.0: LR = .kiblnd_dev_failover+0x228/0xc30 [ko2iblnd]
{0}.1.0: [c0000003e4c27510] [c0000003e4c27670] 0xc0000003e4c27670 (unreliable)
{0}.1.0: [c0000003e4c27590] [80000000035f82a8] .kiblnd_dev_failover+0x228/0xc30 [ko2iblnd]
{0}.1.0: [c0000003e4c276f0] [80000000035f8e2c] .kiblnd_create_dev+0x17c/0x650 [ko2iblnd]
{0}.1.0: [c0000003e4c277e0] [80000000035fff58] .kiblnd_startup+0x3a8/0x6e0 [ko2iblnd]
{0}.1.0: [c0000003e4c278d0] [8000000000d962fc] .lnet_startup_lndnis+0x1bc/0xa50 [lnet]
{0}.1.0: [c0000003e4c27a00] [8000000000d96d24] .LNetNIInit+0x194/0x2b0 [lnet]
{0}.1.0: [c0000003e4c27ac0] [80000000031351e4] .ptlrpc_ni_init+0x84/0x260 [ptlrpc]
{0}.1.0: [c0000003e4c27b80] [8000000003135794] .ptlrpc_init_portals+0x34/0x1c0 [ptlrpc]
{0}.1.0: [c0000003e4c27c30] [80000000031825e8] .init_module+0x158/0x7dc8 [ptlrpc]
{0}.1.0: [c0000003e4c27cd0] [c000000000000e74] .do_one_initcall+0x88/0x1bc
{0}.1.0: [c0000003e4c27d80] [c00000000006c910] .SyS_init_module+0x11c/0x2b4
{0}.1.0: [c0000003e4c27e30] [c000000000000580] syscall_exit+0x0/0x2c
|
| Comments |
| Comment by Doug Oucharek (Inactive) [ 28/Aug/12 ] |
|
LNet currently does not support IPv6. Isaac Huang gave a presentation at this year's LUG on the large challenge of supporting IPv6. No one has taken ownership of this task yet (I believe it was on the OpenSFS list of projects to fund). |
| Comment by Ian Colle (Inactive) [ 28/Aug/12 ] |
|
Doug, it seems reasonable that we don't support IPv6, but we also probably shouldn't kernel panic just because we see an IPv6 address. |
| Comment by Ned Bass [ 28/Aug/12 ] |
|
Indeed the desired outcome from this issue is to fix the panic
LNetError: 3588:0:(linux-tcpip.c:137:libcfs_ipif_query()) Can't get IP address for interface ib0
LNetError: 3588:0:(o2iblnd.c:2569:kiblnd_create_dev()) Can't query IPoIB interface ib0: -99
LNetError: 105-4: Error -100 starting up LNI o2ib
LustreError: 3588:0:(events.c:737:ptlrpc_init_portals()) network initialisation failed
I believe our IO nodes on the LLNL BGQ clusters have both IPv4 and IPv6 addresses and we didn't run into this, so I'm not sure why it was a problem on IBM's system. |
| Comment by Doug Oucharek (Inactive) [ 28/Aug/12 ] |
|
I wonder if this is an endian issue (since the IBM system is PPC64)? Are any of the IO nodes on the LLNL BGQ cluster PPC? |
| Comment by Ned Bass [ 28/Aug/12 ] |
|
Yes they are all PPC. |
| Comment by Prakash Surya (Inactive) [ 28/Aug/12 ] |
|
Personally, I wouldn't describe those console messages as "handled gracefully", but it's better than a panic. If the issue is IPv6 compatibility, stating that on the console would be nice:
LNetError: ib0 configured for IPv6, which is not supported by LNet. Network initialization failed as a result. |
| Comment by Ned Bass [ 28/Aug/12 ] |
|
Good point. It was only the EADDRNOTAVAIL return value that suggested it might be an IPv6 issue, but that's not exactly user-friendly. |
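A minimal sketch of the friendlier message discussed above, assuming the only signal available is the negative errno returned by the interface query. The function name `format_ipif_error` and the message wording are hypothetical and are not taken from the Lustre source; the point is only that -EADDRNOTAVAIL (-99 on Linux) can be turned into a hint instead of being printed bare.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of LNet): build a console message for a
 * failed interface query. -EADDRNOTAVAIL means no IPv4 address was found
 * on the interface, which may indicate an IPv6-only configuration.
 * Returns the snprintf() result.
 */
int format_ipif_error(char *buf, size_t len, const char *ifname, int rc)
{
    if (rc == -EADDRNOTAVAIL)
        return snprintf(buf, len,
                        "%s has no IPv4 address; IPv6 is not supported "
                        "by LNet (rc = %d)", ifname, rc);

    /* Any other error: fall back to a generic message. */
    return snprintf(buf, len, "Can't query interface %s: rc = %d",
                    ifname, rc);
}
```

A caller would pass the value returned by the query, for example `format_ipif_error(buf, sizeof(buf), "ib0", -EADDRNOTAVAIL)`, and print the buffer at error level.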
| Comment by Liang Zhen (Inactive) [ 29/Aug/12 ] |
|
I suspect we got a NULL pointer (cmid->device) here: kiblnd_dev_failover()->ib_alloc_pd(cmid->device). I'm wondering how it could happen, because we check the return value of every call, and we also specify both the address and AF_INET before calling rdma_bind_addr(), which can attach the cmid to a device: rdma_bind_addr()->cma_acquire_dev()->cma_attach_to_dev() |
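One way to check the NULL pointer hypothesis against the oops in the description is to decode the faulting instruction, the word marked `<e9230138>` in the instruction dump. The sketch below is only an illustration of that decoding, assuming the word is a PowerPC DS-form load (primary opcode 58); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of a PowerPC DS-form load (primary opcode 58: ld, ldu, lwa). */
struct ds_form {
    unsigned opcode; /* primary opcode, top 6 bits */
    unsigned rt;     /* target register number */
    unsigned ra;     /* base register number */
    int32_t  disp;   /* signed byte displacement added to RA */
    unsigned xo;     /* low 2 bits: 0 = ld, 1 = ldu, 2 = lwa */
};

/* Split a 32-bit instruction word into its DS-form fields. */
struct ds_form decode_ds_form(uint32_t insn)
{
    struct ds_form f;
    int32_t d = (int32_t)(insn & 0xfffc); /* displacement, low 2 bits cleared */

    if (d & 0x8000)                       /* sign-extend from 16 bits */
        d -= 0x10000;

    f.opcode = insn >> 26;
    f.rt     = (insn >> 21) & 0x1f;
    f.ra     = (insn >> 16) & 0x1f;
    f.disp   = d;
    f.xo     = insn & 0x3;
    return f;
}
```

For 0xe9230138 this yields opcode 58, RT 9, RA 3, displacement 0x138 and XO 0, that is `ld r9,0x138(r3)`. The register dump shows GPR03 as zero and DEAR as 0x138, which fits a NULL first argument being dereferenced in ib_alloc_pd().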
| Comment by Ned Bass [ 29/Aug/12 ] |
|
Perhaps address scope is a factor. IBM's system had a site-scoped v6 address whereas the LLNL systems have only automatically derived link-local addresses. |
| Comment by Doug Oucharek (Inactive) [ 30/Aug/12 ] |
|
When we do a kiblnd_create_dev(), it calls libcfs_ipif_query() to get the address which is used in the rdma_bind_addr(). If libcfs_ipif_query() were to get an IPv6 address (struct sockaddr_in6) rather than an IPv4 address (struct sockaddr_in), then we would misinterpret what was returned and pass a badly formed address to rdma_bind_addr(). I'm wondering if that is what happened here.
In kiblnd_create_dev(), we make use of the ioctl SIOCGIFADDR with the address type set to AF_INET. In theory, this should only return an IPv4 address; we need to use AF_INET6 for IPv6. There seems to be a lot of confusion in networking forums as to how SIOCGIFADDR works, which is made worse by the fact that the IPv6 struct, sockaddr_in6, does not fit into the generic sockaddr structure which is returned by SIOCGIFADDR. I will need to look at the specific source code implementation of SIOCGIFADDR for the distro IBM is running.
Ned: what distro and version is IBM running?
Liang: Another way we can do this is to use SIOCGIFCONF to get a list of all interfaces and addresses. That way we can be smart in our error messages when we can see there is only an IPv6 address. However, due to the structure size differences I mention above, some implementations do not return IPv6 addresses at all. |
| Comment by Ned Bass [ 30/Aug/12 ] |
seqio33-ib0@root:uname -a
Linux seqio33-ib0 2.6.32-220.23.3.bgq.el6_V1R1M2_0.ppc64 #1 SMP Sat Aug 25 18:22:10 CDT 2012 ppc64 ppc64 ppc64 GNU/Linux
seqio33-ib0@root:cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago) |
| Comment by Doug Oucharek (Inactive) [ 28/Sep/12 ] |
|
Fix on master: http://review.whamcloud.com/#change,3815 |
| Comment by Isaac Huang (Inactive) [ 04/Oct/12 ] |
|
Doug, was that a fix? I thought it was just a debugging patch. Perhaps I've missed something. |
| Comment by Doug Oucharek (Inactive) [ 04/Oct/12 ] |
|
Whoops! My bad. Doing bulk cleanup of Jira tickets and did not see "debug patch" in patch title. Change 3815 is only a debug patch and not a fix. |