Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.5.0
-
None
-
3
-
11102
Description
In a patch cleaning up calls to mkdir in the Lustre tests (http://review.whamcloud.com/#/c/5022), Andreas asked for a separate patch to fix replay-dual test 18. He commented on this line of the test 18 code:
dmesg | grep "entering recovery in server" && error "client not evicted" || true
His request:
This error message as written doesn't exist in the Lustre code anywhere. Checking back in b1_8 it comes from ldlm_expired_completion_wait(), and I see that this string does exist in master, but is split across multiple lines... The other problem is that it is LDLM_DEBUG() now instead of LDLM_ERROR(), since it was changed in http://review.whamcloud.com/2201 (commit 57373a29) "Quiet/cleanup various common console message".
The string "not entering recovery" is present in all releases, but has been in LDLM_DEBUG() since 2.3.59. If this test enabled D_DLMTRACE at the start, it could consistently find this in the MDS debug log. I also observe that this test is only checking the local console log instead of the MDS console log, so it has probably been broken for multi-node testing for a long time (though I've never seen it in my local node testing either). In a separate patch, could you please fix this to be:
local DLMTRACE=$(do_facet $SINGLEMDS lctl get_param debug)
do_facet $SINGLEMDS lctl set_param debug=+dlmtrace
mkdir $MOUNT1/$tdir ...
:
:
wait $OPENPID
do_facet $SINGLEMDS lctl debug_kernel |
grep "not entering recovery" && error "client not evicted"
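One detail worth noting about the requested snippet: unlike the original line, it has no trailing "|| true", so when grep finds nothing (the passing case) the pipeline exits non-zero. A minimal sketch of one way to make the check explicit is below. The helper name check_client_evicted and the stub error() are hypothetical and are not part of the Lustre test framework; the real error() aborts the test, while this stand-in only prints the message.

```shell
#!/bin/bash
# Hypothetical stand-in for the test framework's error(); it only
# prints the message and returns non-zero.
error() {
	echo "FAIL: $*"
	return 1
}

# Hypothetical helper: reads a debug log on stdin.
# Returns 0 when "not entering recovery" is absent (client was evicted),
# and reports an error and returns 1 when it is present.
check_client_evicted() {
	if grep -q "not entering recovery"; then
		error "client not evicted"
		return 1
	fi
	return 0
}
```

Under these assumptions, the last step of the test could be written as "do_facet $SINGLEMDS lctl debug_kernel | check_client_evicted", so that a missing message yields a zero exit status instead of grep's non-zero one.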
Issue Links
- is related to LU-6652 replay-dual test 18 statmany wrong file (Resolved)
Activity
Link | Original: This issue is related to LDEV-14 [ LDEV-14 ] |
Link | New: This issue is related to LDEV-14 [ LDEV-14 ] |
Fix Version/s | New: Lustre 2.6.0 [ 10595 ] | |
Resolution | New: Fixed [ 1 ] | |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |
Landed to master