[LU-1628] du for a certain set of users seems to take an inordinate amount of time to complete Created: 13/Jul/12  Updated: 07/Jun/16

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Story
Priority: Minor
Reporter: Joe Mervini
Assignee: Oleg Drokin
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

lustre 1.8.5


Rank (Obsolete): 10702

 Description   

This is just a question about something we observed:

We recently ran into a problem with mballoc that was causing OSTs to misbehave. We suspected that the capacity/usage of the file system was the culprit (which ultimately appears to be the case: once we got below 92%, everything was happy), and as a result I started examining every user's usage with du. Beyond the fact that large files are reported better than small files, I am running into a situation where du for the last 1% of users is taking an extraordinary amount of time to complete, on the order of 2 or more days.

At first I thought that the problem was the number of files in each user's directory (and in fact the users that remain have > 5M files in their directories), but other users have a similar number of files, yet their du runs complete in hours. (Caveat: I haven't checked the stripe sizes on the directories of the users whose du runs are still going.)
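
To compare the slow and fast users on equal terms, something like the following minimal sketch could tally file count, allocated space, and elapsed time per directory tree, so the rate (files per second) can be compared rather than just total runtime. The function name scan_tree is hypothetical, nothing here is Lustre-specific, and the totals only approximate du (hard links are counted once per link, and the root directory's own blocks are not included).

```python
import os
import time


def scan_tree(root):
    """Walk root and return (file_count, allocated_bytes, elapsed_seconds).

    Hypothetical helper: a rough stand-in for du that also reports how
    many non-directory entries were visited and how long the walk took.
    """
    start = time.monotonic()
    file_count = 0
    allocated = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        st = entry.stat(follow_symlinks=False)
                    except OSError:
                        continue  # entry vanished or is unreadable; skip it
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    else:
                        file_count += 1
                    # st_blocks is in 512-byte units on POSIX systems
                    allocated += st.st_blocks * 512
        except OSError:
            continue  # directory unreadable; skip it
    return file_count, allocated, time.monotonic() - start
```

Running this on one slow and one fast user directory and dividing file_count by the elapsed seconds would show whether the slow users are slow per file or simply have more entries than expected.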

So my question is: has this behavior been observed on other systems, and if so, what were the conclusions?

This is not high priority. Just a matter of curiosity.



 Comments   
Comment by Peter Jones [ 13/Jul/12 ]

Oleg

What do you think about this one?

Thanks

Peter
