[LU-4705] LustreError: 89827:0:(mdc_locks.c:916:mdc_enqueue()) ldlm_cli_enqueue: -2 Created: 04/Mar/14 Updated: 26/Oct/17 Resolved: 24/Oct/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.1 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.2 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Brett Lee (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Running tip of Lustre b2_5, 1 MGS, 1 MDS, 2 OSS, 12 clients. |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 12942 |
| Description |
|
Unexpected MDC LustreError messages on most clients. Client 10: Client 11: Client 12: Client 13: Client 14: Client 16: Client 17: Client 18: |
| Comments |
| Comment by Keith Mannthey (Inactive) [ 10/Mar/14 ] |
|
I see these same errors with a Lustre 2.5.0 client. They do not seem to impact the usability of the filesystem, but they are logged as errors, so something could be going wrong. |
| Comment by Andreas Dilger [ 10/Mar/14 ] |
|
Is the filesystem re-exported via NFS, or are there possibly concurrent threads that are accessing and unlinking files? These messages mean that the client was looking up some file, but it was deleted by the time the client tried to access it. -116 = -ESTALE, -2 = -ENOENT. The errors are not really fatal, and could probably be quieted from the console. |
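
As a rough illustration of the return codes mentioned in the comment above, the sketch below maps the two negative values to their Linux errno names and reproduces an -ENOENT by opening a path after it has been unlinked. It is only an analogy that runs on a local filesystem; it does not exercise Lustre or `mdc_enqueue()`, and the names `decode_rc`, `open_after_unlink`, and `LINUX_ERRNO_NAMES` are hypothetical helpers introduced here, not part of any Lustre tool.

```python
import os
import tempfile

# Linux errno values quoted in the comment above (assumption: Linux numbering;
# ESTALE has a different number on some other platforms).
LINUX_ERRNO_NAMES = {2: "ENOENT", 116: "ESTALE"}


def decode_rc(rc: int) -> str:
    """Map a negative kernel-style return code to its Linux errno name."""
    return LINUX_ERRNO_NAMES.get(-rc, f"unknown ({rc})")


def open_after_unlink() -> int:
    """Create a file, unlink it, then open it by path; return the negated errno."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "victim")
        with open(path, "w"):
            pass
        # Stands in for another thread or client deleting the file
        # between the lookup and the access.
        os.unlink(path)
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError as exc:
            return -exc.errno
        os.close(fd)
        return 0


if __name__ == "__main__":
    print(decode_rc(-2))
    print(decode_rc(-116))
    print(decode_rc(open_after_unlink()))
```

An application that tolerates this race would typically treat the "file is already gone" result as benign, which is consistent with the suggestion that the console message could be quieted.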
| Comment by Keith Mannthey (Inactive) [ 10/Mar/14 ] |
|
I have seen this error with IOR and no NFS. I am not sure whether the errors were generated during a single-shared-file run or a file-per-process run. |
| Comment by Brett Lee (Inactive) [ 11/Mar/14 ] |
|
No, there was no re-exporting, but each Lustre client did have four (4) mounts of the file system - each mount appearing active via the stats files in /proc. |
| Comment by Andreas Dilger [ 13/Mar/14 ] |
|
Brett, what was the workload being run here? Something that is creating and deleting files concurrently (e.g. racer), or possibly multiple threads doing "rm -r" on the same tree? Either this is "normal" and maybe we should quiet the error messages, or it might imply some sort of bug on the MDS with inode lookup or files unexpectedly being deleted. Are there application-visible errors that are unexpected ("No such file or directory")? |
| Comment by Brett Lee (Inactive) [ 27/Mar/14 ] |
|
Andreas, the workload was a mix of real jobs with varying IO patterns, the most prominent of which was many small reads from large files. There was no artificial creating/deleting of files. As for the application, I am now noticing that a setting disabled printing of "some" error and warning messages during this run; however, each job completed successfully. No unexpected application-visible errors were seen. |
| Comment by Mike O'Connor [ 30/Jan/16 ] |
|
This is being seen at Gulfstream. In their environment, there doesn't appear to be any operational consequence to it. But, it scared them. It'd be nice if we could mute these errors, as discussed in https://jira.hpdd.intel.com/browse/LU-4705?focusedCommentId=79255&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-79255 |
| Comment by Kurt J. Strosahl (Inactive) [ 18/Feb/16 ] |
|
I just saw an instance of this error in the Lustre file system at TJNAF. It is the only instance I can recall seeing here; we are running pristine Lustre 2.5.3. To expand a bit more: I have a test environment that I'm using to benchmark OSS systems. Presently I have three OSTs on a single server running Lustre 2.5.3. I've mounted it on a single client and am running IOR tests with the following parameters: mpirun -np 12 -bynode -machinefile ./nodelist ./ior -F -e -m -g -i 10 -t 1024k -b 42G -o /testL/benchmark/test where nodelist contains a single node. |
| Comment by Gerrit Updater [ 13/Sep/17 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/28978 |
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28978/ |
| Comment by Peter Jones [ 24/Oct/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29736 |
| Comment by Gerrit Updater [ 26/Oct/17 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29736/ |