Details
Bug
Resolution: Fixed
Major
Description
When I try to mount Lustre with GSS (SSK) enabled, I get checksum errors on a multi-rail client; with only a single interface the mount works. My guess is that the NID is encoded in the checksum, though I haven't dug into the cause yet. I also saw lots of errors when using GSS on multi-rail servers, although those errors were different.
[154311.786639] LustreError: 194908:0:(gss_sk_mech.c:388:sk_verify_hmac()) checksum mismatch
[154311.798154] LustreError: 194908:0:(sec_gss.c:242:gss_verify_msg()) mic verify error: 00060000
[154311.810015] LustreError: 194908:0:(sec_gss.c:2125:gss_svc_verify_request()) failed to verify request: 60000
Activity
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45277/
Subject: LU-15047 gss: gss integrity check with multi-rail
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c8301a65c5672a1d081669343466746df983eabc
If the primary NID were deleted by a user command, I think that would trigger it (via lnet_peer_del_nid()). In the code it looks like this could replace lp_primary_nid, but when I ran "lnetctl peer del" it seemed to delete the whole peer, not just the primary NID. I'm not sure whether that is the intended behavior, but this was my concern.
ssmirnov ashehata: in patch https://review.whamcloud.com/45277, Jeremy is asking whether the primary NID for a node can change. What events could trigger such a change?
"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/45277
Subject: LU-15047 gss: gss integrity check with multi-rail
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bcc1cc38a2286b39c78464f7fd34f237a66fd2be
Indeed, GSS must make use of primary NIDs on both ends of the communication channel, so that the computed HMAC is based on these unique identifiers rather than the actual NIDs being used for the current request.
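To illustrate the point above, here is a minimal sketch of why an HMAC that covers endpoint identifiers breaks when the receiver uses the NID seen on the wire instead of the primary NID. This is not the Lustre SSK code: the message layout, the key, the function names, and the server NID are all hypothetical, chosen only to show the effect.

```python
import hashlib
import hmac


def compute_mac(key: bytes, payload: bytes, src_nid: str, dst_nid: str) -> bytes:
    """HMAC-SHA256 over both endpoint identifiers plus the payload.

    The field layout (NUL-separated) is an assumption for this sketch only.
    """
    msg = b"\0".join([src_nid.encode(), dst_nid.encode(), payload])
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify_mac(key: bytes, payload: bytes, src_nid: str, dst_nid: str, mac: bytes) -> bool:
    """Recompute the MAC with the identifiers the verifier believes in."""
    expected = compute_mac(key, payload, src_nid, dst_nid)
    return hmac.compare_digest(expected, mac)


KEY = b"hypothetical-shared-key"           # placeholder, not a real SSK key
PAYLOAD = b"example request"
CLIENT_PRIMARY = "192.168.56.206@tcp"      # illustrative primary NID
CLIENT_OTHER_RAIL = "192.168.57.206@tcp"   # illustrative second interface
SERVER_PRIMARY = "192.168.56.204@tcp"      # hypothetical server NID

# The sender signs with its primary NID.
mac = compute_mac(KEY, PAYLOAD, CLIENT_PRIMARY, SERVER_PRIMARY)

# Verifier uses the NID the request actually arrived from: mismatch.
wire_nid_ok = verify_mac(KEY, PAYLOAD, CLIENT_OTHER_RAIL, SERVER_PRIMARY, mac)

# Verifier uses the peer's primary NID: match.
primary_nid_ok = verify_mac(KEY, PAYLOAD, CLIENT_PRIMARY, SERVER_PRIMARY, mac)

print(wire_nid_ok, primary_nid_ok)
```

In this toy model the check only succeeds when both sides agree on a stable identifier, which is the property the primary NID is meant to provide.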
Hi Sebastien,
I think I misunderstood your question earlier. While primary NID is used as a node identifier, MR does introduce the ability to select the local net/NID as well as remote net/NID, so that if all options have equal health and priority, LNET is going to round-robin across them. The primary NID is going to stay the same on both sides of the transaction, but different NIDs are going to be actually sending/receiving. After discovery is complete, you can observe that behaviour by using "lnetctl ping" to initiate communication and "lnetctl net show -v 4" to see individual NID send/receive counts on both sides.
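The behaviour Serguei describes can be pictured with a toy model: the peer keeps one primary NID as its identity while traffic rotates over all of its NIDs. This is only a sketch under simplifying assumptions. Real LNet selection also weighs health, priority and other criteria, and the `Peer` class, `simulate` helper, and the convention that the first NID is the primary are hypothetical.

```python
from collections import Counter
from itertools import cycle


class Peer:
    """Toy multi-rail peer: a stable identity plus several usable NIDs."""

    def __init__(self, nids):
        # Assumption for this sketch: the first NID is the primary one.
        self.primary_nid = nids[0]
        self._rotation = cycle(nids)

    def pick_nid(self):
        """Return the next NID in plain round-robin order."""
        return next(self._rotation)


def simulate(nids, messages):
    """Send `messages` messages and count how often each NID is used."""
    peer = Peer(nids)
    counts = Counter(peer.pick_nid() for _ in range(messages))
    return peer.primary_nid, counts


primary, counts = simulate(["192.168.56.206@tcp", "192.168.57.206@tcp"], 10)
print(primary)
for nid, used in counts.items():
    print(nid, used)
```

The identity never changes, but any code that keys on the NID of an individual message sees two different values over time.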
How does encryption work in this case? If it is unaware of what LNet does with NID selection but depends on what NIDs are selected, that does sound like a problem.
Thanks,
Serguei
I managed to reproduce a similar issue on my test cluster. After properly tuning Linux routing as explained on the wiki page at https://wiki.whamcloud.com/display/LNet/MR+Cluster+Setup , I formatted a simple Lustre file system made of 3 servers (1 MGS, 1 MDS, 1 OSS) and 1 client. All nodes use Ethernet and have the same network configuration:
- 2 network interfaces, enp0s8 on 192.168.56.0/24 and enp0s9 on 192.168.57.0/24;
- /etc/modprobe.d/lustre.conf contains options lnet networks="tcp0(enp0s8,enp0s9)";
- LNet auto discovery is enabled.
With this configuration, we enable LNet Multirail on tcp0.
This file system works fine without SSK enabled. When I enable SSK (skpi flavor for cli2ost connections), the client fails to mount, and we can see the following messages on OSS side:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: handling request
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: readline: read 1378 chars into buffer of size 2048: \x02000200 \xce39a8c000000200 \xa7c7df3fb8a605b2 \x64656661756c7400 \x \x343a000000012c343a4f6825ee2c3235363affd19cfee97a53c7928649dd3245347ae724b53b59ccbb7fb50c1dfe93b726510a27a08dacf539bce044c62c56878719dab244396990d477 b6be879153a42e6529ca0d4192154592aebcea3e72709f2133b565229304974d243e36b5b176bccced176a280ee0727623871508406eff4120172ddf3601521fe8ce2c1a139234b17284242cd219189393a0cee481417ee89e0e2422a30e322c301077a07c917d5941bef9d942f164...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: handling req: svc 2, nid 00020000c0a839ce, idx b205a6b83fdfc7a7 nodemap default
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: in_handle:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: length 0
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: in_tok:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: length 652
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0000: 343a 0000 0001 2c34 3a4f 6825 ee2c 3235 4:....,4:Oh%.,25
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0010: 363a ffd1 9cfe e97a 53c7 9286 49dd 3245 6:.....zS...I.2E
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0020: 347a e724 b53b 59cc bb7f b50c 1dfe 93b7 4z.$.;Y.........
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0030: 2651 0a27 a08d acf5 39bc e044 c62c 5687 &Q.'....9..D.,V.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0040: 8719 dab2 4439 6990 d477 b6be 8791 53a4 ....D9i..w....S.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0050: 2e65 29ca 0d41 9215 4592 aebc ea3e 7270 .e)..A..E....>rp
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0060: 9f21 33b5 6522 9304 974d 243e 36b5 b176 .!3.e"...M$>6..v
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0070: bccc ed17 6a28 0ee0 7276 2387 1508 406e ....j(..rv#...@n
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0080: ff41 2017 2ddf 3601 521f e8ce 2c1a 1392 .A .-.6.R...,...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0090: 34b1 7284 242c d219 1893 93a0 cee4 8141 4.r.$,.........A
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00a0: 7ee8 9e0e 2422 a30e 322c 3010 77a0 7c91 ~...$"..2,0.w.|.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00b0: 7d59 41be f9d9 42f1 64fd 6f67 5b3f 9c4c }YA...B.d.og[?.L
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00c0: 3ae0 dc89 568d 961a 6d85 fc70 f1da c3f8 :...V...m..p....
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00d0: e5fd 035f a530 cbb8 5c9b 11ad 79c4 ff4d ..._.0..\...y..M
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00e0: 1d0c e95e 0e95 4725 06d5 5689 95a1 f765 ...^..G%..V....e
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 00f0: 0540 eb78 c5a6 4f69 ac1f fe30 024b 6dda .@.x..Oi...0.Km.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0100: 5e6d 1457 d72a 2236 6b8a 97ca 52d1 ffcd ^m.W.*"6k...R...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0110: 98a3 2c32 3536 3afa ca0c 67e8 89c0 5aa7 ..,256:...g...Z.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0120: ca9c bc91 d844 be46 7fe4 59c0 abc0 9028 .....D.F..Y....(
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0130: dd72 857b 98a8 8614 9fa4 ed1f fcc1 0c7a .r.{...........z
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0140: 6de1 e356 ed15 8f80 d717 ee8b 9be9 0783 m..V............
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0150: e2ac 8e7a 5940 3aa4 7aaa ac32 df62 2bce ...zY@:.z..2.b+.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0160: f208 1e1b 5f39 e22e c741 0c98 e5e5 c846 ...._9...A.....F
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0170: 82c9 0832 7c08 7635 2b0b a5c9 ab60 cbeb ...2|.v5+....`..
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0180: 2212 c7de 3bfd dfd4 9eb2 e461 768f b1e6 "...;......av...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0190: d592 a22e b9c6 3d2a 5f2c 7be5 4b57 d60f ......=*_,{.KW..
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01a0: 0134 78bc 2648 9854 e600 1e28 7197 d119 .4x.&H.T...(q...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01b0: 6c69 eeb0 e592 ea78 f9e1 509f 1ac9 6b06 li.....x..P...k.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01c0: b2b6 fe8c 59b5 59a2 8e5e 0a53 8403 f6e3 ....Y.Y..^.S....
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01d0: c8eb 31b5 c0f6 1c28 a07c cdcd 6dbc 98c4 ..1....(.|..m...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01e0: bf51 8cb4 676d 8823 3224 eea1 7dfa 8c3d .Q..gm.#2$..}..=
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 01f0: 462d 1775 aab0 a6f1 a01d 8cfe 8a5c f2c8 F-.u.........\..
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0200: a092 ae4a 89c5 8d15 8529 614a 5af3 1f26 ...J.....)aJZ..&
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0210: 7544 a6f4 e8ad 812c 3333 3a73 6562 2d4f uD.....,33:seb-O
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0220: 5354 3030 3030 2d6f 7363 2d66 6666 6639 ST0000-osc-ffff9
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0230: 6532 3338 3433 3962 3830 3000 2c33 323a e238439b800.,32:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0240: 37a8 eec1 ce19 687d 132f e290 51dc a629 7.....h}./..Q..)
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0250: d164 e2c4 958b a141 d5f4 133a 33f0 688f .d.....A...:3.h.
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0260: 2c34 3a00 0000 812c 3332 3ad6 4c7d 91a5 ,4:....,32:.L}..
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0270: b8f4 85f5 404e 4a47 a695 072a e965 5f74 ....@NJG...*.e_t
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: 0280: b53b 5416 6ea8 3f10 692b bd2c .;T.n.?.i+.,
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Handling sk request
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Decoded netstring of 652 bytes
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Creating credentials for target: seb-OST0000-osc-ffff9e238439b800 with nodemap: default
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Searching for key with description: lustre:seb:default
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Encoded netstring of 311 bytes
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Created netstring of 311 bytes
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: Serialized buffer of 400 bytes for kernel
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: doing downcall
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: \xa7c7df3fb8a605b2 1634229849 0 1 0 0 -1 0 0 sk \x0100000073686132353600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000006374722861657329000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000...
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]:
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: sk returning success
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: sending reply
Oct 14 17:44:09 lnode-vm4.makalu lsvcgssd[196651]: writing message: \x \x343a000000012c343a4f6825ee2c3235363affd19cfee97a53c7928649dd3245347ae724b53b59ccbb7fb50c1dfe93b726510a27a08dacf539bce044c62c56878719dab244396990d477b6be879153a42e6529ca0d4192154592aebcea3e72709f2133b565229304974d243e36b5b176bccced176a280ee0727623871508406eff4120172ddf3601521fe8ce2c1a139234b17284242cd219189393a0cee481417ee89e0e2422a30e322c301077a07c917d5941bef9d942f164fd6f675b3f9c4c3ae0dc89568d961a6d85fc70f1dac3f8e5fd035fa530cbb85c9b11ad79c4ff4d1d0ce95e0e95472506d556899...
Oct 14 17:44:09 lnode-vm4.makalu kernel: Lustre: 196603:0:(sec_gss.c:2066:gss_svc_handle_init()) create svc ctx 000000003a027476: user from 192.168.56.206@tcp authenticated as root
Oct 14 17:44:09 lnode-vm4.makalu kernel: LustreError: 196602:0:(gss_sk_mech.c:388:sk_verify_hmac()) checksum mismatch
Oct 14 17:44:09 lnode-vm4.makalu kernel: LustreError: 196602:0:(sec_gss.c:283:gss_unseal_msg()) unwrap message error: 00060000
Oct 14 17:44:09 lnode-vm4.makalu kernel: LustreError: 196602:0:(sec_gss.c:2196:gss_svc_unseal_request()) failed to unwrap request: d0000
Oct 14 17:44:09 lnode-vm4.makalu kernel: LustreError: 196602:0:(sec_gss.c:2288:gss_svc_handle_data()) svc 3 failed: major 0x000d0000: req xid 1713610397058304 ctx 000000003a027476 idx 0xb205a6b83fdfc7a7(0->192.168.56.206@tcp)
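The lsvcgssd output above reports the request's NID as the raw value 00020000c0a839ce, while the kernel messages name the peer as 192.168.56.206@tcp. The sketch below decodes the raw value so the two can be compared. It assumes the usual 64-bit LNet NID layout (network in the upper 32 bits, IPv4 address in the lower 32 bits, with the network split into a 16-bit LND type and a 16-bit network number) and that LND type 2 is the socket LND shown as "tcp"; the `decode_nid` helper is hypothetical and is not part of any Lustre tool.

```python
import ipaddress

# Assumption: LND type 2 is the socket LND, displayed as "tcp".
LND_NAMES = {2: "tcp"}


def decode_nid(hex_nid: str) -> str:
    """Turn a 64-bit hex NID into the familiar address@net string."""
    value = int(hex_nid, 16)
    net = value >> 32              # upper 32 bits: network
    addr = value & 0xFFFFFFFF      # lower 32 bits: IPv4 address
    lnd_type = net >> 16           # upper 16 bits of the network: LND type
    net_num = net & 0xFFFF         # lower 16 bits: network number
    name = LND_NAMES.get(lnd_type, f"lnd{lnd_type}")
    suffix = f"{name}{net_num}" if net_num else name
    return f"{ipaddress.IPv4Address(addr)}@{suffix}"


decoded = decode_nid("00020000c0a839ce")
print(decoded)  # 192.168.57.206@tcp
```

Under these assumptions the upcall carries an address on the 192.168.57.0/24 network (the enp0s9 side), whereas the kernel context refers to 192.168.56.206@tcp. That is consistent with the two ends working from different NIDs of the same client.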
After running git bisect I identified the commit that introduces this problem:
7d309d57fd LU-9121 lnet: select best peer and local net
This commit is part of the merge of the origin/multi-rail branch, which landed just after the 2.14.50 tag was created. So the master branch has been affected since shortly after 2.14.0 was released. The good news is that 2.14.0 itself is not impacted.
ashehata ssmirnov do you see how this patch could affect the way peers present themselves to others? My understanding was that the primary NID was always used as the unique identifier of the connection, do you think this commit could change this paradigm? Or maybe this commit could make the multi-rail implementation more effective, by switching between rails more often for instance?
Thanks,
Sebastien.
When I brought the servers up with multi-rail, they were failing with GSS as well, so I moved them to a single interface and they came up fine. Adding the client with multi-rail then fails with the checksum error, while a single-interface client works fine.
I'm pretty sure the ARP settings Whamcloud keeps recommending are wrong. I went through this with Amir a few months ago: arp_filter and rp_filter should be set to 1 for multi-rail to function correctly. In every other configuration it worked only intermittently.
Are GSS and multi-rail actually being tested together? I was somewhat assuming when I filed this that they were only being tested independently. If I get a chance this week I'll try to dig into it more to get to the bottom of what's happening.
Sebastien,
Yes, there were LNet patches that went into 2.14.54 which could be related (LU-14668, LU-14661), but I just didn't think these patches would cause LNet to switch the peer's primary NID somehow. Perhaps ashehata can confirm.
Not sure if this can affect SSK, but another thing to check for an MR client would be the Linux routing setup, to make sure that the intended interface is actually used for sending. For example:
sysctl -w net.ipv4.conf.all.rp_filter=0
sysctl -w net.ipv4.conf.all.arp_filter=0
sysctl -w net.ipv4.conf.ib0.arp_ignore=1
sysctl -w net.ipv4.conf.ib0.arp_filter=0
sysctl -w net.ipv4.conf.ib0.arp_announce=2
sysctl -w net.ipv4.conf.ib0.rp_filter=0
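Since the right values for these settings are debated in this ticket, it can help to record what a node is actually running before comparing results. The following is a small sketch that reads the current values from the standard /proc/sys/net/ipv4/conf/<interface>/ files on Linux. The `read_settings` helper and its `root` parameter are hypothetical conveniences for illustration, and the sketch only reports values without judging them.

```python
from pathlib import Path

# The four per-interface settings mentioned in the sysctl commands above.
KEYS = ("rp_filter", "arp_filter", "arp_ignore", "arp_announce")


def read_settings(iface, root="/proc/sys/net/ipv4/conf"):
    """Return {setting: int or None} for one interface.

    None means the file was missing or unreadable (for example, the
    interface does not exist on this node).
    """
    base = Path(root) / iface
    result = {}
    for key in KEYS:
        try:
            result[key] = int((base / key).read_text().strip())
        except (OSError, ValueError):
            result[key] = None
    return result


if __name__ == "__main__":
    # "all" always exists on Linux; add interface names such as "ib0" as needed.
    for iface in ("all",):
        print(iface, read_settings(iface))
```

Running this on each client and server gives a record of the routing-related settings in effect when a mount attempt succeeds or fails.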
If tcp is used, the routes also need to be added. Manual steps are described here: https://wiki.whamcloud.com/display/LNet/MR+Cluster+Setup
The following patch (still under review) automates adding the routes for tcp interfaces:
https://review.whamcloud.com/#/c/44065/
Landed for 2.15