[LU-8362] page fault: exception RIP: lnet_mt_match_md+135 Created: 02/Jul/16  Updated: 14/Jun/18  Resolved: 22/Sep/16

| Status:             | Resolved |
| Project:            | Lustre |
| Component/s:        | None |
| Affects Version/s:  | Lustre 2.7.0 |
| Fix Version/s:      | None |
| Type:               | Bug | Priority: | Critical |
| Reporter:           | Mahmoud Hanafi | Assignee: | Bruno Faccini (Inactive) |
| Resolution:         | Duplicate | Votes: | 0 |
| Labels:             | None |
| Environment:        | lustre 2.7.1-fe |
| Attachments:        | lnet_mt_match_md.dis, lnet_mt_match_md.withlinenumbers.dis, lnet_msg-lnet_match_table.data, lu-8362.20160725, lu8362.20160802, lu8362-29169893 |
| Epic/Theme:         | lnet |
| Severity:           | 2 |
| Rank (Obsolete):    | 9223372036854775807 |
| Description |
|
OSS console errors:

LNet: Can't send to 17456000@<65535:34821>: src 0@<0:0> is not a local nid
LNet: 46045:0:(lib-move.c:2241:LNetPut()) Error sending PUT to 0-17456000@<65535:34821>: -22
LNet: Can't send to 17456000@<65535:34821>: src 0@<0:0> is not a local nid
LNet: 56154:0:(lib-move.c:2241:LNetPut()) Error sending PUT to 0-17456000@<65535:34821>: -22
------------[ cut here ]------------
WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
Hardware name: SUMMIT
list_del corruption. prev->next should be ffff881d63ead4d0, but was (null)
Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_ldiskfs(U) lquota(U) ldiskfs(U) jbd2 acpi_cpufreq freq_table mperf lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) dm_round_robin scsi_dh_rdac lpfc scsi_transport_fc scsi_tgt sunrpc bonding ib_ucm(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) configfs ib_ipoib(U) ib_cm(U) ib_uverbs(U) ib_umad(U) dm_mirror dm_region_hash dm_log dm_multipath dm_mod iTCO_wdt iTCO_vendor_support microcode sg wmi igb hwmon dca i2c_algo_bit ptp pps_core i2c_i801 i2c_core lpc_ich mfd_core shpchp tcp_bic ext3 jbd sd_mod crc_t10dif isci libsas mpt2sas scsi_transport_sas raid_class mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) ipv6 mlx4_core(U) mlx_compat(U) ahci gru [last unloaded: scsi_wait_scan]
Pid: 8603, comm: kiblnd_sd_02_01 Not tainted 2.6.32-504.30.3.el6.20151008.x86_64.lustre271 #1
Call Trace:
[<ffffffff81074127>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff81074216>] ? warn_slowpath_fmt+0x46/0x50
[<ffffffff812bda6e>] ? list_del+0x6e/0xa0
[<ffffffffa052c5c9>] ? lnet_me_unlink+0x39/0x140 [lnet]
[<ffffffffa05303f8>] ? lnet_md_unlink+0x2f8/0x3e0 [lnet]
[<ffffffffa0531b9f>] ? lnet_try_match_md+0x22f/0x310 [lnet]
[<ffffffffa0a1f727>] ? kiblnd_recv+0x107/0x780 [ko2iblnd]
[<ffffffffa0531d1c>] ? lnet_mt_match_md+0x9c/0x1c0 [lnet]
[<ffffffffa0532621>] ? lnet_ptl_match_md+0x281/0x870 [lnet]
[<ffffffffa05396e7>] ? lnet_parse_local+0x307/0xc60 [lnet]
[<ffffffffa053a6da>] ? lnet_parse+0x69a/0xcf0 [lnet]
[<ffffffffa0a1ff3b>] ? kiblnd_handle_rx+0x19b/0x620 [ko2iblnd]
[<ffffffffa0a212be>] ? kiblnd_scheduler+0xefe/0x10d0 [ko2iblnd]
[<ffffffff81064f90>] ? default_wake_function+0x0/0x20
[<ffffffffa0a203c0>] ? kiblnd_scheduler+0x0/0x10d0 [ko2iblnd]
[<ffffffff8109dc8e>] ? kthread+0x9e/0xc0
[<ffffffff8100c28a>] ? child_rip+0xa/0x20
[<ffffffff8109dbf0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
---[ end trace 1063d2ffc2578a2f ]---
------------[ cut here ]------------
From the crash dump, the backtrace looks like this:

PID: 8603 TASK: ffff8810271fa040 CPU: 11 COMMAND: "kiblnd_sd_02_01"
#0 [ffff880ff8b734f0] machine_kexec at ffffffff8103b5db
#1 [ffff880ff8b73550] crash_kexec at ffffffff810c9412
#2 [ffff880ff8b73620] kdb_kdump_check at ffffffff812973d7
#3 [ffff880ff8b73630] kdb_main_loop at ffffffff8129a5c7
#4 [ffff880ff8b73740] kdb_save_running at ffffffff8129472e
#5 [ffff880ff8b73750] kdba_main_loop at ffffffff8147cd68
#6 [ffff880ff8b73790] kdb at ffffffff812978c6
#7 [ffff880ff8b73800] kdba_entry at ffffffff8147c687
#8 [ffff880ff8b73810] notifier_call_chain at ffffffff81568515
#9 [ffff880ff8b73850] atomic_notifier_call_chain at ffffffff8156857a
#10 [ffff880ff8b73860] notify_die at ffffffff810a44fe
#11 [ffff880ff8b73890] __die at ffffffff815663e2
#12 [ffff880ff8b738c0] no_context at ffffffff8104c822
#13 [ffff880ff8b73910] __bad_area_nosemaphore at ffffffff8104cad5
#14 [ffff880ff8b73960] bad_area_nosemaphore at ffffffff8104cba3
#15 [ffff880ff8b73970] __do_page_fault at ffffffff8104d29c
#16 [ffff880ff8b73a90] do_page_fault at ffffffff8156845e
#17 [ffff880ff8b73ac0] page_fault at ffffffff81565765
[exception RIP: lnet_mt_match_md+135]
RIP: ffffffffa0531d07 RSP: ffff880ff8b73b70 RFLAGS: 00010286
RAX: ffff881d88420000 RBX: ffff880ff8b73c70 RCX: 0000000000000007
RDX: 0000000000000004 RSI: ffff880ff8b73c70 RDI: ffffffffffffffff
RBP: ffff880ff8b73bb0 R8: 0000000000000001 R9: d400000000000000
R10: 0000000000000001 R11: 0000000000000012 R12: 0000000000000000
R13: ffff881730ca6200 R14: 00d100120be91b91 R15: 0000000000000008
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#18 [ffff880ff8b73bb8] lnet_ptl_match_md at ffffffffa0532621 [lnet]
#19 [ffff880ff8b73c38] lnet_parse_local at ffffffffa05396e7 [lnet]
#20 [ffff880ff8b73cd8] lnet_parse at ffffffffa053a6da [lnet]
#21 [ffff880ff8b73d68] kiblnd_handle_rx at ffffffffa0a1ff3b [ko2iblnd]
#22 [ffff880ff8b73db8] kiblnd_scheduler at ffffffffa0a212be [ko2iblnd]
#23 [ffff880ff8b73ee8] kthread at ffffffff8109dc8e
#24 [ffff880ff8b73f48] kernel_thread at ffffffff8100c28a
|
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 02/Jul/16 ] |
|
Hi Mahmoud, can you confirm that this is a severity 1 (production filesystem out of service)? Thanks. |
| Comment by Bruno Faccini (Inactive) [ 02/Jul/16 ] |
|
This problem looks similar to a previously reported one. |
| Comment by Mahmoud Hanafi [ 02/Jul/16 ] |
|
should be severity 2 or 3 |
| Comment by Peter Jones [ 02/Jul/16 ] |
|
ok Mahmoud. Then we'll assess this more fully on Monday. |
| Comment by Bruno Faccini (Inactive) [ 04/Jul/16 ] |
|
Mahmoud, I know that as a non-US citizen I am not allowed to work on the crash-dump, but can you at least provide some pieces of memory content to help qualify the corruption? |
| Comment by Mahmoud Hanafi [ 04/Jul/16 ] |
|
Here is the info. BTW, how did you come up with those address locations?

crash> rd 0xffff881d63ead440 32
ffff881d63ead440:  3463346339376238 65342d616630622d   8b79c4c4-b0fa-4e
ffff881d63ead450:  2d393762652d3134 3162326230316464   41-eb79-dd10b2b1
ffff881d63ead460:  0000000064616535 000000230199f7ea   5ead........#...
ffff881d63ead470:  000574241b1d26fc 0000000000000000   .&..$t..........
ffff881d63ead480:  0000000000000000 0000000000000000   ................
ffff881d63ead490:  0000000000000000 0000000000000000   ................
ffff881d63ead4a0:  0000000000000000 0000000000000000   ................
ffff881d63ead4b0:  0000000000000000 0000000000000000   ................
ffff881d63ead4c0:  ffff8815da7214c0 ffff881dbcd19740   ..r.....@.......
ffff881d63ead4d0:  ffff881b6dcd3cd0 ffffc900220983f0   .<.m......."....
ffff881d63ead4e0:  0000001490b5d3fa ffffffffffffffff   ................
ffff881d63ead4f0:  ffff881dffffffff 000001000000001c   ................
ffff881d63ead500:  0000000000000000 ffffffffffffffff   ................
ffff881d63ead510:  0000000000000001 ffff881a20848840   ........@.. ....
ffff881d63ead520:  0000000000000000 0000000000000000   ................
ffff881d63ead530:  0000000000000000 0000000000000000   ................

crash> rd ffff881d63ead448 32
ffff881d63ead448:  65342d616630622d 2d393762652d3134   -b0fa-4e41-eb79-
ffff881d63ead458:  3162326230316464 0000000064616535   dd10b2b15ead....
ffff881d63ead468:  000000230199f7ea 000574241b1d26fc   ....#....&..$t..
ffff881d63ead478:  0000000000000000 0000000000000000   ................
ffff881d63ead488:  0000000000000000 0000000000000000   ................
ffff881d63ead498:  0000000000000000 0000000000000000   ................
ffff881d63ead4a8:  0000000000000000 0000000000000000   ................
ffff881d63ead4b8:  0000000000000000 ffff8815da7214c0   ..........r.....
ffff881d63ead4c8:  ffff881dbcd19740 ffff881b6dcd3cd0   @........<.m....
ffff881d63ead4d8:  ffffc900220983f0 0000001490b5d3fa   ..."............
ffff881d63ead4e8:  ffffffffffffffff ffff881dffffffff   ................
ffff881d63ead4f8:  000001000000001c 0000000000000000   ................
ffff881d63ead508:  ffffffffffffffff 0000000000000001   ................
ffff881d63ead518:  ffff881a20848840 0000000000000000   @.. ............
ffff881d63ead528:  0000000000000000 0000000000000000   ................
ffff881d63ead538:  0000000000000000 ffff881d36d68d40   ........@..6.... |
| Comment by Bruno Faccini (Inactive) [ 05/Jul/16 ] |
|
Thanks Mahmoud. The address comes from the message just before the crash: "list_del corruption. prev->next should be ffff881d63ead4d0, but was (null)". Well, according to the memory content, there is no real evidence of that earlier issue. Can you provide me with the "log" sub-cmd output, the assembly listing of lnet_mt_match_md() (you will need to load the kernel and Lustre modules' debuginfo before using the "mod -S" crash sub-cmd), and the memory content from "rd ffff881d88420000 100"? |
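To make the origin of that pointer concrete: the CONFIG_DEBUG_LIST version of list_del() reports the entry being unlinked itself as the "should be" value, so 0xffff881d63ead4d0 is the address of the ME's embedded list linkage, and dumping memory around it shows the containing structure. Below is a minimal user-space sketch of that check (a paraphrase of the 2.6.32-era lib/list_debug.c; the real function also validates next->prev and then unlinks and poisons the entry):

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* sketch of the CONFIG_DEBUG_LIST prev->next check that fired above;
 * the "should be" pointer is the entry being deleted itself */
static void checked_list_del(struct list_head *entry)
{
	if (entry->prev->next != entry)
		printf("list_del corruption. prev->next should be %p, but was %p\n",
		       (void *)entry, (void *)entry->prev->next);
	/* the real kernel code then unlinks and poisons the entry */
}

int main(void)
{
	struct list_head a, b, c;

	/* build the ring a <-> b <-> c, then corrupt a.next to NULL,
	 * reproducing the pattern in the console log */
	a.next = &b; a.prev = &c;
	b.next = &c; b.prev = &a;
	c.next = &a; c.prev = &b;
	a.next = NULL;

	checked_list_del(&b);	/* glibc prints "(nil)" where the kernel printed "(null)" */
	return 0;
}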
| Comment by Bruno Faccini (Inactive) [ 06/Jul/16 ] |
|
Thanks Mahmoud, but I had asked you for the assembly listing of lnet_mt_match_md() instead of lnet_ptl_match_md(); can you also provide it? |
| Comment by Mahmoud Hanafi [ 06/Jul/16 ] |
|
attaching lnet_mt_match_md.dis and lnet_mt_match_md.withlinenumbers.dis |
| Comment by Bruno Faccini (Inactive) [ 08/Jul/16 ] |
|
Thanks, so it seems we crash here in the code:

int
lnet_mt_match_md(struct lnet_match_table *mtable,
                 struct lnet_match_info *info, struct lnet_msg *msg)
{
        struct list_head *head;
        lnet_me_t        *me;
        lnet_me_t        *tmp;
        int               exhausted = 0;
        int               rc;

        /* any ME with ignore bits? */
        if (!list_empty(&mtable->mt_mhash[LNET_MT_HASH_IGNORE]))
                head = &mtable->mt_mhash[LNET_MT_HASH_IGNORE];
        else
                head = lnet_mt_match_head(mtable, info->mi_id, info->mi_mbits);
 again:
        /* NB: only wildcard portal needs to return LNET_MATCHMD_EXHAUSTED */
        if (lnet_ptl_is_wildcard(the_lnet.ln_portals[mtable->mt_portal]))
                exhausted = LNET_MATCHMD_EXHAUSTED;

        list_for_each_entry_safe(me, tmp, head, me_list) {
                /* ME attached but MD not attached yet */
                if (me->me_md == NULL)
                        continue;

                LASSERT(me == me->me_md->md_me);   <<<<<<<<<<<<<<<<<

                rc = lnet_try_match_md(me->me_md, info, msg);
                if ((rc & LNET_MATCHMD_EXHAUSTED) == 0)
                        exhausted = 0; /* mlist is not empty */

                if ((rc & LNET_MATCHMD_FINISH) != 0) {
                        /* don't return EXHAUSTED bit because we don't know
                         * whether the mlist is empty or not */
                        return rc & ~LNET_MATCHMD_EXHAUSTED;
                }
        }

When me->me_md is dereferenced (it is read via rax), it turns out to be non-NULL but in fact 0xf...f! |
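To make the failure mode concrete, here is a minimal user-space sketch; the struct layouts are simplified assumptions (the real lnet_me_t and lnet_libmd_t carry many more fields), but it shows why a stale ME whose me_md holds 0xf...f sails past the NULL guard and then faults in the LASSERT:

#include <stdio.h>

/* simplified stand-ins: only the two pointers involved in the crash */
struct lnet_me_s;

typedef struct lnet_libmd_s {
	struct lnet_me_s *md_me;	/* back-pointer to the attached ME */
} lnet_libmd_t;

typedef struct lnet_me_s {
	lnet_libmd_t *me_md;		/* attached MD, or NULL */
} lnet_me_t;

int main(void)
{
	/* a stale/corrupted ME: me_md is garbage instead of NULL or a
	 * valid MD, matching rax = 0xf...f observed in the dump */
	lnet_me_t me = { .me_md = (lnet_libmd_t *)-1L };

	if (me.me_md == NULL)		/* the "MD not attached yet" guard passes */
		return 0;

	/* LASSERT(me == me->me_md->md_me) would now load md_me through
	 * this wild pointer -- the page fault at lnet_mt_match_md+135;
	 * we only print the pointer here instead of dereferencing it */
	printf("would dereference MD at %p\n", (void *)me.me_md);
	return 0;
}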
| Comment by Bruno Faccini (Inactive) [ 13/Jul/16 ] |
|
Yes, only for the first value/pointer, in fact! Well, this 0xffff88201fc9b000 value should be "head" and should thus point to the chosen hash-list head in the mtable/match-table; it also seems wrong (I mean strangely page-aligned, like the next entry on the list, 0xffff881d88420000, which is also the current one causing the crash!). Thus, I need more from the current stack frame to confirm my current thoughts, so can you also dump its full content using the "rd ffff880ff8b73b70 20" cmd? |
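For context on why page alignment alone is suspicious but not conclusive: the match table is essentially an array of list heads indexed by a hash of the match bits, so "head" is an interior pointer into that array and can legitimately land on a page boundary when the table is large. A minimal stand-in (MT_HASH_SIZE and the indexing here are illustrative assumptions, not the real LNET_MT_HASH_* sizing):

#include <stdio.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define MT_HASH_SIZE 4096		/* illustrative, not the real size */

struct match_table {
	struct list_head mt_mhash[MT_HASH_SIZE];
};

int main(void)
{
	struct match_table *mt = malloc(sizeof(*mt));
	unsigned long long mbits = 0x00d100120be91b91ULL;	/* r14 in the dump */
	struct list_head *head;

	if (mt == NULL)
		return 1;

	/* "head" is just &mt->mt_mhash[hash]; with 4096 16-byte heads
	 * the array spans 16 pages, so one slot in every 256 is
	 * page-aligned -- alignment alone does not prove corruption */
	head = &mt->mt_mhash[mbits % MT_HASH_SIZE];
	printf("head = %p (table base %p)\n", (void *)head, (void *)mt);

	free(mt);
	return 0;
}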
| Comment by Mahmoud Hanafi [ 14/Jul/16 ] |
|
Here you go:

crash> rd ffff880ff8b73b70 20
ffff880ff8b73b70:  ffff88201fa43e00 ffff88201fc9b000   .>.. ....... ...
ffff880ff8b73b80:  ffff880ff8b73c50 ffff880ff8b73c70   P<......p<......
ffff880ff8b73b90:  ffff881730ca6200 ffff880ffb3c6140   .b.0....@a<.....
ffff880ff8b73ba0:  ffff88201fa43e00 0000000000000004   .>.. ...........
ffff880ff8b73bb0:  ffff880ff8b73c30 ffffffffa0532621   0<......!&S.....
ffff880ff8b73bc0:  0000000000000000 ffffffff00000000   ................
ffff880ff8b73bd0:  ffff881000000000 ffff880f000000e0   ................
ffff880ff8b73be0:  000000000000ec80 ffff881026e79ac0   ...........&....
ffff880ff8b73bf0:  ffff881730ca6200 0000000000000002   .b.0............
ffff880ff8b73c00:  ffff880ff8b73c30 ffff881730ca6200   0<.......b.0.... |
| Comment by Bruno Faccini (Inactive) [ 20/Jul/16 ] |
|
Hello Mahmoud, thanks for this latest information (it confirms the wrong "head" value and helps the analysis move forward), and sorry for the late feedback. At this point of the crash-dump analysis I am almost convinced that this crash is not related to the issue I first suspected. In fact, I am now more inclined to suspect it could be related to a different known problem. |
| Comment by Jay Lan (Inactive) [ 20/Jul/16 ] |
|
We added that patch. We did have two such occurrences. |
| Comment by Jay Lan (Inactive) [ 20/Jul/16 ] |
|
The lnet_msg-lnet_match_table.data has been attached to this ticket. |
| Comment by Bruno Faccini (Inactive) [ 22/Jul/16 ] |
|
Well, lnet_msg and lnet_match_table look OK. So can you now get "kmem 0xffff88204b64a000" and "rd 0xffff88204b64a000 1024" (to confirm that the head pointer 0xffff88201fc9b000 is in fact OK, even if strangely aligned, as I stated before; this can happen due to hash-table sizing), and also "kmem 0xffff881d88420000" and "rd ffff88138ecf5c40 32"? |
| Comment by Jay Lan (Inactive) [ 22/Jul/16 ] |
|
The first two failed. The third did not return for > 5 minutes before I terminated it.

crash> kmem 0xffff88204b64a000
crash> rd ffff88138ecf5c40 32 |
| Comment by Bruno Faccini (Inactive) [ 25/Jul/16 ] |
|
Hmm, I made a mistake with the first address I asked you to check and dump; it has nothing to do with your crash-dump, because it comes from the dump I was using to mimic what I wanted extracted from yours. So replace the first two commands with "kmem 0xffff88201fc9a000" and "rd 0xffff88201fc9a000 1024", and also let the third command, "kmem 0xffff881d88420000", run to completion. |
| Comment by Jay Lan (Inactive) [ 25/Jul/16 ] |
|
Requested information in attachment lu-8362.20160725. |
| Comment by Bruno Faccini (Inactive) [ 29/Jul/16 ] |
|
There is still no evidence of the suspected issue. |
| Comment by Jay Lan (Inactive) [ 03/Aug/16 ] |
|
Uploaded information in lu8362.20160802. The list contains 24564 entries and ended with a duplicate. |
| Comment by Bruno Faccini (Inactive) [ 03/Aug/16 ] |
|
Many thanks again, Jay. With this new data I still cannot definitively conclude on the exact content/cause of the corruption. This is mainly due to the fact that several list corruptions had been detected (and partly "corrected" during the unlink mechanism, which is very likely what created the loop in the "prev" linked-list of MEs shown by "list -o 8 0xffff881d8d8bff40") preceding the crash. But what I can say now is that it finally looks very similar to several occurrences I examined during an earlier investigation. Also, during that earlier work I requested the same kind of crash data. |
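As an aside, what crash's "list -o 8 <addr>" is doing here can be sketched as follows: it follows the pointer stored 8 bytes into each node (the prev field of an embedded list_head on x86_64) and remembers the addresses already visited, so a corrupted chain shows up as a revisited address; that is how the walk stopped after 24564 entries with a duplicate. A user-space stand-in (the small linear "seen" table is an assumption kept for brevity; crash itself uses a proper hash table):

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* follow the ->prev chain (offset 8 on x86_64) and report the first
 * address seen twice, like crash's duplicate-entry detection */
static long walk_prev_chain(struct list_head *start, const void **seen, long cap)
{
	struct list_head *p = start;
	long n = 0;

	while (p != NULL && n < cap) {
		for (long i = 0; i < n; i++) {
			if (seen[i] == p) {
				printf("duplicate entry %p after %ld nodes\n",
				       (void *)p, n);
				return n;
			}
		}
		seen[n++] = p;
		p = p->prev;
	}
	return n;
}

int main(void)
{
	struct list_head node[4];
	const void *seen[16];

	/* chain node[3] -> node[2] -> node[1] -> node[0], then corrupt
	 * node[0].prev to loop back, as suspected in the ME prev-list */
	for (int i = 1; i < 4; i++)
		node[i].prev = &node[i - 1];
	node[0].prev = &node[2];

	walk_prev_chain(&node[3], seen, 16);	/* reports node[2] as duplicate */
	return 0;
}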
| Comment by Jay Lan (Inactive) [ 03/Aug/16 ] |
|
File lu8362-29169893 is attached that contains crash data you requested. |
| Comment by Bruno Faccini (Inactive) [ 04/Aug/16 ] |
|
Thanks one more time for this data. |
| Comment by Mahmoud Hanafi [ 22/Sep/16 ] |