Details
Bug
Resolution: Fixed
Critical
Lustre 2.12.0
Description
It appears that CDEBUG_LIMIT is not working properly, since test logs in Maloo are full of repeated error messages. This would be a nightmare for a large cluster if there are many clients and servers spewing repeated messages.
For example, sanity test_60a on the MDS console:
[ 2786.154265] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.155427] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.156482] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.157628] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.158671] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.159789] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.160824] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.161934] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.162977] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.164074] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[ 2786.165111] Lustre: 30107:0:(llog_cat.c:98:llog_cat_new_log()) MGS: there are no more free slots in catalog e973af35
[repeats hundreds of times]
It might relate to ktime_t patches that James landed previously, but that is just speculation as I haven't investigated it yet.
The sanity test_60b should be catching the failure of CDEBUG_LIMIT(), but it checks the logs on the client while the test is being run on the MGS.
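For reference, the behavior expected from a console rate limiter such as CDEBUG_LIMIT() can be sketched in user-space C as below. This is only an illustration of the general technique (a per-call-site deadline with exponential backoff), not the Lustre implementation: struct limit_state, limit_allow(), simulate(), MIN_DELAY and MAX_DELAY are hypothetical names, and time is an abstract tick counter rather than jiffies or ktime_t.

```c
#include <stdio.h>

/* Hypothetical per-call-site state; NOT the actual Lustre structure. */
struct limit_state {
	long next;        /* earliest tick at which the next message may print */
	long delay;       /* current suppression window, in ticks */
	unsigned skipped; /* messages suppressed since the last printed one */
};

#define MIN_DELAY 1   /* assumed initial window, in ticks */
#define MAX_DELAY 600 /* assumed upper bound for the window, in ticks */

/* Return 1 if a message arriving at tick `now` should be printed, 0 if not. */
static int limit_allow(struct limit_state *st, long now)
{
	/* Still inside the suppression window: count the message and drop it. */
	if (st->next != 0 && now < st->next) {
		st->skipped++;
		return 0;
	}

	/* First use, or a long quiet period: start from the minimum window.
	 * Otherwise the site is still noisy, so double the window (capped). */
	if (st->next == 0 || now > st->next + MAX_DELAY) {
		st->delay = MIN_DELAY;
	} else {
		st->delay *= 2;
		if (st->delay > MAX_DELAY)
			st->delay = MAX_DELAY;
	}

	st->next = now + st->delay;
	st->skipped = 0;
	return 1;
}

/* Feed one identical message per tick for ticks 1..n; return how many print. */
static int simulate(int n)
{
	struct limit_state st = { 0, 0, 0 };
	int printed = 0;

	for (long t = 1; t <= n; t++)
		printed += limit_allow(&st, t);
	return printed;
}
```

With a working limiter of this shape, a burst of identical messages produces only a handful of console lines, with the gaps between them growing each time. The console output above shows the opposite: every message is printed, roughly one per millisecond. If the deadline comparison in a limiter like this were done in mismatched time units, the "still inside the window" branch would never be taken, which would be consistent with the speculation about the time-handling patches, though that has not been verified here.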