[LU-10237] "ls" hangs on a particular directory Created: 13/Nov/17 Updated: 09/Feb/18 Resolved: 14/Jan/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3, Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Andreas Dilger | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
OLCF Atlas production system: clients running 2.8.0+ (with patches), server running 2.5.5+ (with patches) |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
On the atlas2 file system, there is a particular directory on which any operation, such as "ls" or "stat", hangs the process completely. No OS error or Lustre error is reported on the client side. On the server side, we observed OI scrub messages a few times, which may suggest that there is some MDS data inconsistency that the scrub is trying to fix, but to no avail. We cannot correlate the two yet. The ops team collected traces on the client side as follows:
mount -t lustre 10.36.226.77@o2ib:/atlas2 /lustre/atlas2 -o rw,flock,nosuid,nodev
Step1: lctl dk > /dev/null
Step2: cd /lustre/atlas2/path/to/offending_directory/
The log is attached. |
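For anyone reproducing the trace collection described above, the following is a minimal sketch, not the procedure the ops team actually ran. It assumes GNU coreutils `timeout` and `stat` are available on the client. The function names `probe_dir` and `collect_trace`, the time limits, and the final step that dumps the debug log to a file are assumptions added for illustration; the report itself only shows the buffer being cleared with `lctl dk > /dev/null`.

```shell
#!/bin/sh
# Probe a directory with a time limit so that a hang does not block the caller.
# Prints "ok", "hung", or "error". The 10-second default is an arbitrary choice.
probe_dir() {
    dir=$1
    limit=${2:-10}
    timeout "$limit" stat -- "$dir" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)       echo ok ;;
        124|137) echo hung ;;    # GNU timeout: 124 = timed out, 137 = killed with KILL
        *)       echo error ;;
    esac
}

# Hypothetical wrapper: clear the Lustre debug buffer, trigger the operation,
# then save whatever the client logged while the probe was running.
collect_trace() {
    dir=$1
    out=$2
    lctl dk > /dev/null              # discard old entries, as in the report
    result=$(probe_dir "$dir" 30)
    lctl dk > "$out"                 # assumed final step: dump the new entries
    echo "$result"
}
```

Example use on a client with the file system mounted: `collect_trace /lustre/atlas2/path/to/offending_directory/ /tmp/lu-10237.dk`. Note that a process blocked in uninterruptible sleep may not react to the signal that `timeout` sends, in which case the probe itself can still hang.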
| Comments |
| Comment by Andreas Dilger [ 13/Nov/17 ] |
|
While |
| Comment by Gerrit Updater [ 19/Nov/17 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/30166 |
| Comment by Gerrit Updater [ 14/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30166/ |
| Comment by Peter Jones [ 14/Jan/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 17/Jan/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30903 |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30903/ |