[LU-16331] "lfs find -O UUID" does not match first OST after gap Created: 22/Nov/22  Updated: 17/Aug/23  Resolved: 19/May/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0, Lustre 2.12.9
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17037 Tests should run with high and sparse... In Progress
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Running "lfs find -O testfs-OST000a -type f /mnt/testfs" does not find files with objects on OST000a if there is a gap in the OST numbering and OST000a is the first OST after the gap. Finding files on the same OST by index number works properly:

# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
testfs-MDT0000_UUID       125056        4924      108896   5% /mnt/testfs[MDT:0]
testfs-MDT0001_UUID       125056        7832      105988   7% /mnt/testfs[MDT:1]
testfs-MDT0002_UUID       125056        6700      107120   6% /mnt/testfs[MDT:2]
testfs-MDT0003_UUID       125056        6088      107732   6% /mnt/testfs[MDT:3]
testfs-OST0000_UUID       313104       12476      273468   5% /mnt/testfs[OST:0]
testfs-OST0001_UUID       313104        9636      276308   4% /mnt/testfs[OST:1]
testfs-OST0002_UUID       313104       11444      274500   5% /mnt/testfs[OST:2]
testfs-OST0003_UUID       313104       13156      272788   5% /mnt/testfs[OST:3]
testfs-OST000a_UUID       313104       15520      270424   6% /mnt/testfs[OST:10]
testfs-OST000b_UUID       313104        8972      276972   4% /mnt/testfs[OST:11]
testfs-OST000c_UUID       313104       11752      274192   5% /mnt/testfs[OST:12]
testfs-OST000d_UUID       313104       12500      273444   5% /mnt/testfs[OST:13]

filesystem_summary:      2504832       95456     2192096   5% /mnt/testfs
# lfs find /mnt/testfs -O testfs-OST000a_UUID
# lfs find /mnt/testfs -O 0xa
/mnt/testfs/etc/lvm/profile/thin-generic.profile
/mnt/testfs/etc/lvm/archive/centos_00001-283055415.vg
:


 Comments   
Comment by Andreas Dilger [ 22/Nov/22 ]

The problem appears to be in llapi_get_target_uuids():

# lctl get_param lov.*.target_obd
lov.testfs-clilov-ffff99bbd7239000.target_obd=
0: testfs-OST0000_UUID ACTIVE
1: testfs-OST0001_UUID ACTIVE
2: testfs-OST0002_UUID ACTIVE
3: testfs-OST0003_UUID ACTIVE
10: testfs-OST000a_UUID ACTIVE
11: testfs-OST000b_UUID ACTIVE
12: testfs-OST000c_UUID ACTIVE
13: testfs-OST000d_UUID ACTIVE

        int index = 0;

        while (fgets(buf, sizeof(buf), fp) != NULL) {
                if (uuidp && (index < *ost_count)) {
                        if (sscanf(buf, format, &index, uuidp[index].uuid) < 2)
                                break;
                }
                index++;
        }

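This loop fills the uuidp[] array using index, which starts at 0, but sscanf() also overwrites index with the target number parsed from each target_obd line. Because the destination uuidp[index].uuid is evaluated before sscanf() stores the parsed number, each UUID lands in the slot given by the previous line's index plus one, not by its own index. The OST000a line is therefore stored in uuidp[4]; index=10 is then parsed and incremented to index=11, so OST000b is loaded into uuidp[11]. Later OSTs land in the right slots, but uuidp[10] is never filled, which matches the symptom that only the first OST after the gap fails to match by UUID.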
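One way to avoid the mismatch described above is to parse each line into temporaries first and only then store the UUID in the slot named by its own parsed index. The following is a minimal sketch of that idea, not the patch that was landed: demo_fill_uuids(), demo_find_uuid(), struct demo_uuid, and DEMO_UUID_MAX are hypothetical names invented for illustration, and the input is passed as an array of strings instead of being read from the target_obd file.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_UUID_MAX 40        /* hypothetical size, for this sketch only */

struct demo_uuid {
        char uuid[DEMO_UUID_MAX];
};

/*
 * Hypothetical helper, not the Lustre API: fill uuidp[] from lines shaped
 * like "10: testfs-OST000a_UUID ACTIVE", storing each UUID in the slot
 * named by its own parsed index. Returns the highest index seen plus one,
 * or -1 on a malformed line.
 */
static int demo_fill_uuids(const char *const *lines, int nlines,
                           struct demo_uuid *uuidp, int max_slots)
{
        int highest = 0;
        int i;

        for (i = 0; i < nlines; i++) {
                char uuid[DEMO_UUID_MAX];
                int index;

                /* Parse into temporaries first; %39s bounds the copy. */
                if (sscanf(lines[i], "%d: %39s", &index, uuid) < 2)
                        return -1;
                if (index < 0 || index >= max_slots)
                        continue;       /* no room for this target */

                /* The slot comes from the parsed index, not a running counter. */
                strcpy(uuidp[index].uuid, uuid);
                if (index + 1 > highest)
                        highest = index + 1;
        }
        return highest;
}

/* Return the slot holding 'name', or -1 if it is not present. */
static int demo_find_uuid(const struct demo_uuid *uuidp, int nslots,
                          const char *name)
{
        int i;

        for (i = 0; i < nslots; i++)
                if (strcmp(uuidp[i].uuid, name) == 0)
                        return i;
        return -1;
}
```

With a zero-initialized array of 16 slots and the target_obd lines shown earlier, this sketch stores testfs-OST000a_UUID in slot 10 and leaves slots 4 through 9 empty, so a lookup by UUID yields the same index as a lookup by number.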

Comment by Gerrit Updater [ 22/Nov/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49207
Subject: LU-16331 utils: fix 'lfs find -O <uuid>' with gaps
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cd443908ad09ea429d290ef3181562ad2eb7d17f

Comment by Gerrit Updater [ 19/May/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49207/
Subject: LU-16331 utils: fix 'lfs find -O <uuid>' with gaps
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 05334b90a5d3ddd6c8eabc3683fd487f47df6e35

Comment by Peter Jones [ 19/May/23 ]

Landed for 2.16
