[LU-445] Send timestamps with LNet counters Created: 22/Jun/11 Updated: 01/Aug/12 Resolved: 12/Jul/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0 |
| Fix Version/s: | Lustre 2.3.0 |
| Type: | Improvement | Priority: | Blocker |
| Reporter: | Wally Wang (Inactive) | Assignee: | Doug Oucharek (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Bugzilla ID: | 22,639 |
| Rank (Obsolete): | 4569 |
| Description |
|
We should send the timestamp data from the node along with the performance This is originally discussed in Oracle Bug 22639. |
| Comments |
| Comment by Wally Wang (Inactive) [ 22/Jun/11 ] |
|
This is a change for lst: http://review.whamcloud.com/#change,1000 We use a millisecond timer on a per-session basis. Each node then sends the milliseconds since session start over with the LNet counters. With 32bits and millisecond resolution, this should allow ~49 days of session age before the timers would wrap. There might be a bit of an issue with using 'jiffies_to_msecs()' - but I didn't find anything in cfs_time_XXX land that'd map to it easily. This will change the wire attribute and is not backward compatible. |
| Comment by Andreas Dilger [ 22/Jun/11 ] |
|
I commented on this patch in bugzilla, since I want to make it clear to Oracle that this patch should not land for b1_8. |
| Comment by Liang Zhen (Inactive) [ 19/Jan/12 ] |
|
I think the simplest way is reuse sfw_counters_t::active_tests because nobody does/could really use it at all, for example, change it to active_tests::running. I will post a patch for this, also I will file a new bug for larger LST message. Liang |
| Comment by Bruce Korb (Inactive) [ 25/Jan/12 ] |
|
One of our (Xyratex's) customers is bumping into this limitation. Silently losing data is not nice, but they have a higher priority in being able to do the testing with a homogeneously versioned cluster. I upgraded your old patch to current source and see you've upgraded it yourself. I'll take a peek at it. Thank you! |
| Comment by Alexey Lyashkov [ 25/Jan/12 ] |
|
Liang, what about a fix let to avoid to use "sizeof(srpc_msg_t)" as send size in LST code? |
| Comment by Bruce Korb (Inactive) [ 26/Jan/12 ] |
|
As a "hint", this is what I already implemented. SRPC_STRUCT_TYPE_TABLE is a #define of SRpc() elements that Anyway, this seems to work okay and would mean that the version #define SRPC_STRUCT_TYPE_TABLE \ diff --git a/lnet/selftest/rpc.c b/lnet/selftest/rpc.c -int ; ev->ev_fired = 0; rc = srpc_post_passive_rdma(SRPC_RDMA_PORTAL, *id,
switch (wi->swi_state) { |
| Comment by Liang Zhen (Inactive) [ 31/Jan/12 ] |
|
Shadow, yes it's exactly what I want to do in the future, this might require quite some changes to LNet to avoid silent message dropping on old versions nodes because LST doesn't have handshake. As I mentioned in previous comment, we might want to create another ticket for this change, and just reuse old field to satisfy requirement on this ticket. Liang |
| Comment by Alexey Lyashkov [ 27/Feb/12 ] |
|
Hm.. we have a version field in any LST message - so we able to have several handlers for one message type. |
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Peter Jones [ 06/Mar/12 ] |
|
Landed for 2.3 |
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 06/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Alexander Boyko [ 28/Mar/12 ] |
|
Can this be land to b1_8? |
| Comment by Liang Zhen (Inactive) [ 30/Mar/12 ] |
|
Sure, I will post a patch for b1_8 |
| Comment by Alexander Boyko [ 12/Apr/12 ] |
|
Liang, can you tell LU number or gerrit request for b1_8 patch? |
| Comment by Liang Zhen (Inactive) [ 12/Apr/12 ] |
|
I just posted a patch at here: http://review.whamcloud.com/#change,2522 |
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Doug Oucharek (Inactive) [ 26/Jun/12 ] |
|
With Liang's recommendation, I am reopening this ticket to add a patch to make the change previously done backwards compatible. With the change as is, a 2.3 system running against a 2.2 or 2.1 system will have an invalid timestamp for doing bandwidth calculations. I plan to add a new flag to "lst stat" which will trigger the use of the remote timestamps. If the flag is not given, then the previous behaviour, using the local timestamp, will be done. This change will be set up to change the default from using local timestamps to using remote timestamp when the Lustre version hits 2.8. |
| Comment by Doug Oucharek (Inactive) [ 26/Jun/12 ] |
|
The patch for making the original work backward compatible is: http://review.whamcloud.com/#change,3192 |
| Comment by Andreas Dilger [ 27/Jun/12 ] |
|
Doug, |
| Comment by Liang Zhen (Inactive) [ 27/Jun/12 ] |
|
Yes we can distinguish this by value, the minimum interval of stat request would be 1 second which is 1000 milliseconds, but number of running tests on a node is almost impossible to be larger than 100. |
| Comment by Liang Zhen (Inactive) [ 27/Jun/12 ] |
|
Yes we can distinguish this by value, the minimum interval of stat request would be 1 second which is 1000 milliseconds, but number of running tests on a node is almost impossible to be larger than 100. I thought about this at the beginning, but felt it could be confusing for code maintainers (OK, we are the maintainers), that's the reason I suggested Doug to fix by current way. However, I agree user will like it more if it can automatically decide which timestamp to choose, so if you think it's kind of acceptable style, I will not object to choose the easier way. |
| Comment by Doug Oucharek (Inactive) [ 27/Jun/12 ] |
|
Ok, checking programmatically makes sense. Based on Liang's comment, I'm assuming I can use 100 as the value to check the timestamp against to determine whether to use local or remote timestamps. |
| Comment by Liang Zhen (Inactive) [ 27/Jun/12 ] |
|
Please be more cautious about this and use at least 500 as the value, we can't use interval less than one second, even milliseconds stamp could not be so accurate but it shouldn't be less than 500. |
| Comment by Liang Zhen (Inactive) [ 01/Jul/12 ] |
|
Change it to blocker |
| Comment by Doug Oucharek (Inactive) [ 12/Jul/12 ] |
|
Change http://review.whamcloud.com/#change,3192 has landed thereby addressing the backwards compatibility issue. |
| Comment by Wally Wang (Inactive) [ 01/Aug/12 ] |
|
http://review.whamcloud.com/#change,3514 is the 1st patch for b2_2. Will submit the 2nd patch after landing this one. |