I keep hitting this fairly regularly, so I did some digging and here's what I uncovered.
On a client, in the log (for the failed request):
I found op_data for this request and it's:
So op_data holds a different name than the one reported in the log, but the length is the same.
What I think really happened is that a rename succeeded between the time we got the revalidate request from the kernel and the time we sent the message over the wire, and that rename replaced the name in the dentry (I also found the dentry).
In fact here it is:
This all happens because op_data does not store the name itself, only a pointer to some other location (the dentry in this case; in 3270 there is a bug of a similar nature where op_data points to an sai entry that gets freed unexpectedly).
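To make the pointer-versus-copy problem concrete, here is a minimal userspace sketch. It is not the Lustre code: the `demo_*` structs, the helper, and `DEMO_NAME_MAX` are all hypothetical stand-ins, and the "rename" is just an overwrite of the owner's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_MAX 255   /* hypothetical limit, for illustration only */

/* Hypothetical stand-in for op_data as described above: it borrows the
 * name from whoever owns it (e.g. a dentry) and records the length once. */
struct demo_op_borrowed {
	const char *name;    /* points into storage owned by someone else */
	size_t      namelen; /* captured at setup time, never refreshed */
};

/* Hypothetical alternative: the request owns a private copy of the name,
 * so a later change to the owner's buffer cannot alter what gets packed. */
struct demo_op_owned {
	char   name[DEMO_NAME_MAX + 1];
	size_t namelen;
};

/* Snapshot the name into the request. Returns 0 on success, -1 if the
 * name does not fit. */
static int demo_op_owned_init(struct demo_op_owned *op,
			      const char *name, size_t namelen)
{
	if (namelen > DEMO_NAME_MAX)
		return -1;
	memcpy(op->name, name, namelen);
	op->name[namelen] = '\0';
	op->namelen = namelen;
	return 0;
}
```

With the borrowed variant, if the owner's buffer changes from a 2-character name to a 1-character name after `namelen` was recorded, the request goes out claiming length 2 while carrying a 1-character string. The owned variant keeps name and length consistent, at the cost of a copy per request.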
So after this malformed request is sent, our string check on the server side fails (can't unpack short string) because we expect a string of length 2 but get only 1. This in turn returns an error to mdt_reint_internal:
That sets err_serious on the rc.
Now we return all the way through the stack to mdt_enqueue, and there:
So we call err_serious(rc) AGAIN, which triggers the assertion.
I think we should just remove the assertion and allow err_serious to be set twice. Otherwise we would need to go through all the call chains to ensure it is called exactly once wherever it is needed and never twice, which might be quite cumbersome.
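For illustration, here is a small sketch of why a second call trips the assertion and what a tolerant version could look like. This is a simplified model, not the actual Lustre source: I am assuming the "serious" marker is a flag bit ORed into the negated errno, and `DEMO_ERR_SERIOUS` and all the `demo_*` names are made up for the example.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bit, assumed to sit above every plain errno value. */
#define DEMO_ERR_SERIOUS 0x1000

/* Strict variant: asserts that the flag is not set yet, so calling it on
 * an rc that was already marked serious fails the second assert. */
static inline int demo_err_serious_strict(int rc)
{
	assert(rc < 0);
	assert(-rc < DEMO_ERR_SERIOUS);
	return -(-rc | DEMO_ERR_SERIOUS);
}

/* Tolerant variant: ORing the flag in is idempotent, so marking an rc
 * twice yields the same value as marking it once. */
static inline int demo_err_serious(int rc)
{
	assert(rc < 0);
	return -(-rc | DEMO_ERR_SERIOUS);
}

/* Nonzero if rc carries the serious flag. */
static inline int demo_is_serious(int rc)
{
	return rc < 0 && (-rc & DEMO_ERR_SERIOUS) != 0;
}

/* Strip the flag and recover the plain negative errno. */
static inline int demo_clear_serious(int rc)
{
	return rc < 0 ? -(-rc & ~DEMO_ERR_SERIOUS) : rc;
}
```

In this model `demo_err_serious(demo_err_serious(-EPROTO))` equals `demo_err_serious(-EPROTO)`, so an outer caller can mark the error again without knowing whether an inner one already did, while feeding an already marked rc to the strict variant aborts.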
Closing ticket as the patch has landed on master. Please let me know if more work is needed on this ticket and I will reopen it.