Ned Bass (Inactive)
added a comment - Andreas, we hit this last night on a production MDS still running 2.3.63. In case you're still interested, I'm attaching the console log and Lustre debug log. Not sure what the client load was like at the time. I see a lot of changelog and fid2path activity, which is probably from a RobinHood scan.
Andreas Dilger
added a comment - Andriy, any information on how this bug was triggered? Was it under testing, or some user load? MDS recovery, network errors, etc?
Andreas Dilger
added a comment - Dropping this from the blocker list. The patch is incorrect and we have no information about how this bug was hit or the symptoms of the failure (stack trace, error logs, etc), or how often it is hit, so no way to know how common or rare the problem is.