[LU-17037] Tests should run with high and sparse index numbers for OSTs and MDTs Created: 17/Aug/23  Updated: 26/Oct/23

Status: In Progress
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.3
Fix Version/s: Lustre 2.17.0

Type: Task Priority: Minor
Reporter: Colin Faber Assignee: Jian Yu
Resolution: Unresolved Votes: 0
Labels: tests

Issue Links:
Related
is related to LU-17034 memory corruption caused by bug in qm... Resolved
is related to LU-16331 "lfs find -O UUID" does not match fir... Resolved
Severity: 4
Rank (Obsolete): 9223372036854775807

 Description   

As a long-term effort to improve the overall stability of Lustre, the test suite should be evaluated and modified to allow testing against OST and MDT sets whose index numbers are both high and sparse.

What I mean by this is that in many cases we're seeing more sites choosing to deploy flash OSTs within the first 100 slots, and then moving all HDD OSTs to index slots > 100 or vice-versa.

This has been shown to introduce issues with some features (most recently a memory corruption issue within pool quotas: LU-17034).

Right now the test suite assumes that OSTs exist at certain low index numbers, which are hard-coded into it.

This makes it unlikely to catch cases such as LU-17034. As a result, we will need to modify the test suite to allow testing these cases for all features.



 Comments   
Comment by Andreas Dilger [ 17/Aug/23 ]

Note that there are some test cases which are already testing sparse OST indexes, for example conf-sanity.sh test_81, test_82a.

There is already test-framework.sh support for non-sequential OST numbers by using OST_INDEX_LIST to specify the index values, but it might make sense to improve this support as needed. This is "documented" in lustre/tests/cfg/local.sh:

# OST indices can be specified as follows:
# OSTINDEX1="1"
# OSTINDEX2="2"
# OSTINDEX3="4"
# ......
# or            
# OST_INDEX_LIST="[1,2,4-6,8]"  # [n-m,l-k,...], where n < m and l < k, etc.
#               
# The default index value of an individual OST is its facet number minus 1.
# More specific ones override more general ones. See facet_index().
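
To make the list syntax concrete, here is a minimal bash sketch of how a string in the OST_INDEX_LIST format could be expanded into individual index values. The function name expand_index_list is hypothetical and this is not the parsing code in test-framework.sh; it only illustrates the "[n-m,l-k,...]" notation described in the comment above.

```shell
# Hypothetical helper, not part of test-framework.sh: expand an
# OST_INDEX_LIST-style string such as "[1,2,4-6,8]" into "1 2 4 5 6 8".
expand_index_list() {
	local list=${1#\[}	# strip the leading "["
	list=${list%\]}		# strip the trailing "]"
	local items item i out=()

	# split the comma-separated entries into an array
	IFS=, read -ra items <<< "$list"
	for item in "${items[@]}"; do
		if [[ $item == *-* ]]; then
			# a range "n-m" expands to every index from n to m
			for ((i = ${item%-*}; i <= ${item#*-}; i++)); do
				out+=("$i")
			done
		else
			out+=("$item")
		fi
	done
	echo "${out[*]}"
}

expand_index_list "[1,2,4-6,8]"    # prints: 1 2 4 5 6 8
```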

What needs to be done here is to fix the many, many subtests that assume ost1 == OST0000, ost2 == OST0001, etc. (often using the facet number - 1 as the index, or the index number + 1 as the facet name), and instead use helpers that map the facet name/number to the OST number in OST_INDEX_LIST (which is mapped internally to an associative array $OST_INDICES).

There are some helper functions that exist, but might need to be updated, and definitely need to be used more widely:

  • facet_number() converts a facet name like ostN to the facet number N, looks OK
  • facet_type() converts a facet name like ostN to the type OST, looks OK
  • facet_svc() converts a facet name like ostN to the service name via ${facet}_svc variables, likely $fsname-$typeXXXX (not sure)
  • facet_index() converts a facet name like ostN to the OST index number n (whatever it is), via OSTINDEXN or OST_INDICES[N] variables (though I'm not sure why we have both). This should be used in many places but is not.

It might be useful to add some more helper functions to simplify the remapping, like:

  • facet_ost_name() converts an index number like "n" to the OST facet name ostN
  • facet_mdt_name() converts an index number like "n" to the MDT facet name mdsN

There is currently no support for non-contiguous MDT index numbers in test-framework.sh, and I don't think this has been tested anywhere. Until we get MDT pools, I'm not sure if there is much motivation to configure discontiguous index numbers, but I'm sure it will happen somewhere eventually. However, I don't think implementing support for testing this and fixing the many resulting bugs is a priority compared to fixing discontiguous OST support.

Comment by Andreas Dilger [ 17/Aug/23 ]

Because so many subtests need to be fixed, it would probably make sense to split the patches by file (or maybe into multiple patches for large scripts like sanity.sh) so that they can land independently, unless there are only a few changes in a single file.

It might be possible to find which subtests have obvious problems by running "env=OST_INDEX_LIST=[0,10,20,40,55,60,80]" (for OSTCOUNT=7) or similar in autotest (or just set OST_INDEX_LIST in your local test environment) and running through the test scripts multiple times to fix failures as they are hit. A huge number of test failures would probably be hit if there is no OST0000, so that case might be the last to test, after other subtests are fixed.
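
Before starting a long local run, it can help to confirm that the list has one entry per OST. The bash sketch below is a hypothetical pre-flight check, not something test-framework.sh provides; count_plain_indices is an invented name, the values are examples, and the check assumes a list of plain comma-separated indices with no "n-m" ranges.

```shell
# Example values; in a real run these come from the test configuration.
OSTCOUNT=7
OST_INDEX_LIST="[0,10,20,40,55,60,80]"

# Hypothetical check: count the entries of a list without "n-m" ranges.
count_plain_indices() {
	local list=${1#\[}	# strip the leading "["
	list=${list%\]}		# strip the trailing "]"
	local items

	IFS=, read -ra items <<< "$list"
	echo "${#items[@]}"
}

if (( $(count_plain_indices "$OST_INDEX_LIST") != OSTCOUNT )); then
	echo "OST_INDEX_LIST does not have $OSTCOUNT entries" >&2
fi
```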

Comment by Jian Yu [ 29/Aug/23 ]

In conf-sanity test_82a(), the random sparse indices for OSTs are generated as follows:

        # Format OSTs with random sparse indices.
        local i
        local index
        local ost_indices
        local LOV_V1_INSANE_STRIPE_COUNT=65532
        for i in $(seq $OSTCOUNT); do
                index=$(((RANDOM * 2) % LOV_V1_INSANE_STRIPE_COUNT))
                ost_indices+=" $index"
        done
        ost_indices=$(comma_list $ost_indices)

        stack_trap "restore_ostindex" EXIT
        echo -e "\nFormat $OSTCOUNT OSTs with sparse indices $ost_indices"
        OST_INDEX_LIST=[$ost_indices] formatall

As a quick experiment, I used the same approach in cfg/local.sh to set OST_INDEX_LIST with random sparse indices, and then ran runtests. It passed:
https://testing.whamcloud.com/test_sets/b4443be0-b163-4ec9-9271-f36464ac41c4

UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID        95248        4340       82252   6% /mnt/lustre[MDT:0]
lustre-OST090c_UUID       142216        7288      120928   6% /mnt/lustre[OST:2316]
lustre-OST4b24_UUID       142216        9088      119128   8% /mnt/lustre[OST:19236]
lustre-OST9234_UUID       142216        9416      118800   8% /mnt/lustre[OST:37428]
lustre-OST986c_UUID       142216       14568      113648  12% /mnt/lustre[OST:39020]

filesystem_summary:       568864       40360      472504   8% /mnt/lustre
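
In the output above, the four hex digits in each OST name are the decimal index shown in brackets, for example OST090c for index 2316. A small bash sketch of the conversion in both directions follows; ost_target_name is a hypothetical helper written for illustration, not a test-framework.sh function.

```shell
# Hypothetical helper: build the target name for an OST index, using
# the four-digit lowercase hex form seen in the "lfs df" output.
ost_target_name() {
	local fsname=$1 index=$2

	printf '%s-OST%04x\n' "$fsname" "$index"
}

ost_target_name lustre 2316     # prints: lustre-OST090c
ost_target_name lustre 19236    # prints: lustre-OST4b24

# Going the other way: parse the hex suffix back to a decimal index.
echo $((16#090c))               # prints: 2316
```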

I'm going to push a fortestonly patch to run the full test group by autotest with the above change to see which subtests are failing.

Comment by Gerrit Updater [ 29/Aug/23 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52158
Subject: LU-17037 tests: full group testing with sparse OST indexes
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a5a1a2346e87afe5914b8ab098583ae8920e4598

Comment by Jian Yu [ 30/Aug/23 ]

I'm vetting the test results in https://review.whamcloud.com/52158 on master branch.
With sparse OST indexes "OST_INDEX_LIST=[0,10,20,40,55,60,80]" (for OSTCOUNT=7) and "ENABLE_QUOTA=yes" specified, at least performance-sanity test 2 and sanity-benchmark test dbench crashed on master branch. I just updated LU-17034 with the detailed LBUG info.
 

Comment by Andreas Dilger [ 30/Aug/23 ]

I think of particular interest is also sanity-quota and ost-pools, since pools + quota + sparse OST index was the source of the problem.

Comment by Jian Yu [ 30/Aug/23 ]

Here are the full-dne-part-{1,2,3} test results with "OST_INDEX_LIST=[0,10,20,40,55,60,80]" and "ENABLE_QUOTA=yes":
https://testing.whamcloud.com/test_sessions/551b2e1f-2493-411f-9cbd-bb28dd0b1607
https://testing.whamcloud.com/test_sessions/f7c3d574-e349-42ee-99f3-35c761205148
https://testing.whamcloud.com/test_sessions/a1fffb9a-56b2-439a-ab48-20f9c5fcd5eb
sanity-quota hit the LBUG at test 0. ost-pools didn't crash, and 38 of its 56 subtests passed.

Comment by Jian Yu [ 30/Aug/23 ]

I just removed the "ENABLE_QUOTA=yes" test parameter and triggered the full group testing again so that the LBUG does not block other test suites. After LU-17034 is fixed, I'll add the parameter back and test again.
Now I'm looking into the non-LBUG failures and trying to fix them.

Generated at Sat Feb 10 03:32:05 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.