[LU-3631] file stats are slow on filesystem start up Created: 24/Jul/13  Updated: 30/Apr/18  Resolved: 02/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.9
Fix Version/s: None

Type: Bug Priority: Trivial
Reporter: Kit Westneat (Inactive) Assignee: Oleg Drokin
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10967 MDT page cache management improvements Open
Severity: 3
Rank (Obsolete): 9351

 Description   

NOAA had a question about an observed behavior. As part of their change management validation, they rerun a battery of tests after they make a change. One of those is mdtest. What has been observed is that after a downtime, file stat performance is terrible, around 400 stats/s. After a while, it climbs to around 10-15k/s. The improvement seems to be a function of load rather than time: the more the filesystem is used (due to other testing), the faster stat performance recovers.

Are there any thoughts on why this might happen? I have tried preloading the OST metadata, but that didn't seem to have any effect. I thought that IB routing might be an issue, but even when the IB fabric is untouched, we see this issue. It seems like it must be a cache issue, but I am unsure what caches are being warmed up. Any insight on why we are seeing this and how to preload the cache would be great.

Thanks.



 Comments   
Comment by Peter Jones [ 25/Jul/13 ]

Oleg

What do you recommend?

Peter

Comment by Andreas Dilger [ 29/Jul/13 ]

This is a fairly typical scenario as the filesystem cache is warmed up after a restart. In particular, the inode bitmaps need to be read from disk. If there are partially allocated itable blocks, they are also read from disk at allocation time (unallocated blocks are not read from disk).

The inode bitmaps can be prefetched at boot time by running:

dumpe2fs /dev/mdsdev 2> /dev/null
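
To run the command above from a boot script, it could be wrapped as in the sketch below. The function name prefetch_bitmaps is hypothetical and /dev/mdsdev is a placeholder for the real MDT block device. The block-device check and the discarding of standard output are my additions, not part of the original suggestion.

```shell
#!/bin/sh
# Hypothetical helper (name and checks are assumptions, not a Lustre tool).
# Runs dumpe2fs over the whole device so the group metadata it reads
# is pulled from disk; the printed report itself is not needed.
prefetch_bitmaps() {
    dev="$1"
    # Refuse anything that is not a block device, e.g. a mistyped path.
    if [ ! -b "$dev" ]; then
        echo "prefetch_bitmaps: $dev is not a block device" >&2
        return 1
    fi
    # Discard both the report and any warnings; only the reads matter here.
    dumpe2fs "$dev" > /dev/null 2>&1
}

# Example call (placeholder device, left commented out):
# prefetch_bitmaps /dev/mdsdev
```

The helper returns non-zero when the device is missing, so a boot script can log the failure instead of silently skipping the warm-up.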

The problem of loading the inode bitmaps at startup time is avoided with Lustre 2.x formatted filesystems by using the flex_bg feature.
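
One way to check whether an existing MDT was formatted with this feature (spelled flex_bg by e2fsprogs) is to look at the "Filesystem features:" line of the superblock summary printed by dumpe2fs -h. The sketch below assumes that output format. The helper name has_flex_bg is hypothetical and /dev/mdsdev is again a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: reads a superblock summary on stdin and succeeds
# only if its "Filesystem features:" line lists flex_bg.
has_flex_bg() {
    grep -i '^Filesystem features:' | grep -qw 'flex_bg'
}

# Example call (placeholder device, left commented out):
# dumpe2fs -h /dev/mdsdev 2> /dev/null | has_flex_bg && echo "flex_bg enabled"
```

Reading from stdin keeps the check separate from the device access, so it can also be run against a saved copy of the dumpe2fs -h output.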

Comment by Kit Westneat (Inactive) [ 29/Jul/13 ]

Ok great, thanks!

Generated at Sat Feb 10 01:35:34 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.