[LU-11524] sanity-sec test 31 fails with '/usr/bin/lfs setquota -u quota_usr -b 1024 -B 1075 -i -I 0 /mnt/lustre FAILED!' Created: 15/Oct/18  Updated: 29/Oct/18  Resolved: 29/Oct/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Major
Reporter: James Nunez (Inactive) Assignee: Sebastien Buisson
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

sanity-sec test_31 fails when ENABLE_QUOTA=yes, which is set when running the full test group. The test has failed in the full test group since lustre-master build #3805 on October 13, 2018, the first full test session run since test 31 landed.

An example of this failure is at https://testing.whamcloud.com/test_sets/9b92b186-cf12-11e8-9238-52540065bddc . In the client test_log, we see some issues setting osc.*.idle_timeout=debug and getting mdc.*.connect_flags, but I think those can be ignored because the real issue is that we don’t have any value for inode-softlimit (lfs setquota -i value) and the inode-hardlimit is 0 (lfs setquota -I value):

CMD: onyx-30vm4 lctl get_param -n timeout
Using TIMEOUT=20
CMD: onyx-30vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
CMD: onyx-30vm1.onyx.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
error: set_param: param_path 'osc/*/idle_timeout': No such file or directory
error: get_param: param_path 'mdc/*/connect_flags': No such file or directory
jobstats not supported by server
enable quota as required
CMD: onyx-30vm4 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
CMD: onyx-30vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
[HOST:onyx-30vm1.onyx.whamcloud.com] [old_mdt_qtype:none] [old_ost_qtype:none] [new_qtype:ug3]
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.ost=ug3
Total disk size:   block-softlimit: 1024 block-hardlimit: 1075 inode-softlimit:  inode-hardlimit: 0
Setting up quota on onyx-30vm1.onyx.whamcloud.com:/mnt/lustre for quota_usr...
+ /usr/bin/lfs setquota -u quota_usr -b 1024 -B 1075 -i  -I 0 /mnt/lustre
lfs setquota: warning: block softlimit '1024' smaller than minimum qunit size
See 'lfs help setquota' or Lustre manual for details
lfs: invalid limit '-I'
Set filesystem quotas.
usage: setquota <-u|-g|-p> <uname>|<uid>|<gname>|<gid>|<projid>
                -b <block-softlimit> -B <block-hardlimit>
                -i <inode-softlimit> -I <inode-hardlimit> <filesystem>

In setup_quota(), we calculate the inode-softlimit from:

        # get_filesystem_size
        local disksz=$(lfs_df $mntpt | grep "summary" | awk '{print $2}')
        local blk_soft=$((disksz + 1024))
        local blk_hard=$((blk_soft + blk_soft / 20)) # Go 5% over

        local inodes=$(lfs_df -i $mntpt | grep "summary" | awk '{print $2}')
        local i_soft=$inodes
        local i_hard=$((i_soft + i_soft / 20))

        echo "Total disk size: $disksz  block-softlimit: $blk_soft" \
                "block-hardlimit: $blk_hard inode-softlimit: $i_soft" \
                "inode-hardlimit: $i_hard"

        local cmd
        for usr in $quota_usrs; do
                echo "Setting up quota on $HOSTNAME:$mntpt for $usr..."
                for type in u g; do
                        cmd="$LFS setquota -$type $usr -b $blk_soft"
                        cmd="$cmd -B $blk_hard -i $i_soft -I $i_hard $mntpt"
                        echo "+ $cmd"
                        eval $cmd || error "$cmd FAILED!"
                done
                # display the quota status
                echo "Quota settings for $usr : "
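
The failing command has -i followed directly by -I, so $i_soft expanded to nothing and lfs took '-I' as the inode soft limit. A minimal sketch of a check that would surface this before the command is built is shown below. The helper name all_uint is hypothetical and is not part of test-framework.sh or of the eventual fix.

```shell
#!/bin/bash
# Hypothetical helper (not in test-framework.sh): succeed only if every
# argument is a non-empty string of decimal digits.
all_uint() {
	local v
	for v in "$@"; do
		case $v in
			''|*[!0-9]*) return 1 ;;	# empty, or contains a non-digit
		esac
	done
	return 0
}

# The four limits from the failing log: the inode soft limit is empty.
if ! all_uint 1024 1075 "" 0; then
	echo "quota limits are not all numeric"
fi
```

In setup_quota() such a check would sit between the calculations and the loop, for example all_uint "$blk_soft" "$blk_hard" "$i_soft" "$i_hard" || error "bad quota limits", so the test fails with a message that names the real problem instead of an lfs usage dump.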

The strange thing is, sanity-sec test 25 enables quota just as test 31 does and it succeeds. From the suite_log, we see

osc.lustre-OST0005-osc-ffff9c13bc588000.idle_timeout=debug
osc.lustre-OST0006-osc-ffff9c13bc588000.idle_timeout=debug
enable quota as required
CMD: onyx-30vm4 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
CMD: onyx-30vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
[HOST:onyx-30vm1.onyx.whamcloud.com] [old_mdt_qtype:ug] [old_ost_qtype:ug] [new_qtype:ug3]
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.ost=ug3
Total disk size: 13532932  block-softlimit: 13533956 block-hardlimit: 14210653 inode-softlimit: 838864 inode-hardlimit: 880807
Setting up quota on onyx-30vm1.onyx.whamcloud.com:/mnt/lustre for quota_usr...
+ /usr/bin/lfs setquota -u quota_usr -b 13533956 -B 14210653 -i 838864 -I 880807 /mnt/lustre
+ /usr/bin/lfs setquota -g quota_usr -b 13533956 -B 14210653 -i 838864 -I 880807 /mnt/lustre
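
Both logs are consistent with the arithmetic quoted from setup_quota(), assuming that in test 31 the two lfs_df pipelines printed nothing. Bash arithmetic treats an empty variable as 0, so an empty disksz gives 1024 and 1075, and an empty inodes gives an empty soft limit and a hard limit of 0. The sketch below reproduces this. The function name quota_args is hypothetical, and the values are passed in directly rather than read from lfs_df.

```shell
#!/bin/bash
# Hypothetical wrapper around the arithmetic from setup_quota(). It takes the
# two values lfs_df would have supplied and prints the resulting limit flags.
quota_args() {
	local disksz=$1
	local inodes=$2
	local blk_soft=$((disksz + 1024))		# empty disksz counts as 0
	local blk_hard=$((blk_soft + blk_soft / 20))	# go 5% over
	local i_soft=$inodes				# stays empty if inodes is empty
	local i_hard=$((i_soft + i_soft / 20))		# empty i_soft counts as 0
	echo "-b $blk_soft -B $blk_hard -i $i_soft -I $i_hard"
}

quota_args "" ""		# test 31: -b 1024 -B 1075 -i  -I 0
quota_args 13532932 838864	# test 25: -b 13533956 -B 14210653 -i 838864 -I 880807
```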

Unfortunately, after sanity-sec test 31 fails, almost every other test suite fails to run any tests, for various reasons. Looking in the suite_log for each test suite:
In sanity-pfl, the client fails with

CMD: onyx-30vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
CMD: onyx-30vm1.onyx.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
error: set_param: param_path 'osc/*/idle_timeout': No such file or directory

lustre-rsync-test, metadata-updates, ost-pools, … fail with

Starting mds1:   /dev/mapper/mds1_flakey /mnt/lustre-mds1
CMD: onyx-30vm4 mkdir -p /mnt/lustre-mds1; mount -t lustre   /dev/mapper/mds1_flakey /mnt/lustre-mds1
onyx-30vm4: mount.lustre: according to /etc/mtab /dev/mapper/mds1_flakey is already mounted on /mnt/lustre-mds1


 Comments   
Comment by Sebastien Buisson [ 16/Oct/18 ]

The problem stems from the fact that init_param_vars() tries to apply Lustre client-specific tunings (including quota settings) even when no Lustre client is mounted.
I will push a patch to address this problem.
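
As an illustration of the kind of guard described here, a minimal sketch follows. It is not the code from the patch. The function name client_tunings_applicable, its arguments, and the way the mount table is checked are all assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical guard, not the code from the actual patch.
# Succeeds only when client-side tuning makes sense:
#   $1  "true" if the run is server-only (no client involved)
#   $2  client mount point, e.g. /mnt/lustre
#   $3  mount table to inspect (defaults to /proc/mounts)
client_tunings_applicable() {
	local server_only=$1
	local mntpt=$2
	local mounts=${3:-/proc/mounts}

	[ "$server_only" = "true" ] && return 1
	# Mount table fields: device, mount point, filesystem type, ...
	awk -v m="$mntpt" \
		'$2 == m && $3 == "lustre" { found = 1 } END { exit !found }' \
		"$mounts"
}

if client_tunings_applicable "false" /mnt/lustre; then
	echo "client mounted: apply client tunings and quota setup"
else
	echo "no client mounted: skip client tunings"
fi
```

Taking the mount table path as a parameter keeps the check testable without a real Lustre mount.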

Comment by Gerrit Updater [ 16/Oct/18 ]

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/33380
Subject: LU-11524 tests: make init_param_vars() aware of server_only
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: fbd14dd0a08a80892be29ef73e6a8a478df9c919

Comment by Gerrit Updater [ 19/Oct/18 ]

Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/33380/
Subject: LU-11524 tests: fix sanity-sec test_31 for all situations
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: fde0e290a8cc370f6eb7986e9ada8b5bcc41fef7

Comment by Peter Jones [ 29/Oct/18 ]

Seems to have landed for 2.12
