Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Lustre 2.12.0
Description
sanity-sec test_31 is failing when ENABLE_QUOTA=yes, which is the case when running the full test group. The test has failed in full test group runs since October 13, 2018, lustre-master build #3805, which was the first full test session run since test 31 landed.
An example of this failure is at https://testing.whamcloud.com/test_sets/9b92b186-cf12-11e8-9238-52540065bddc . In the client test_log, we see some issues setting osc.*.idle_timeout=debug and getting mdc.*.connect_flags, but I think those can be ignored because the real issue is that we don't have any value for the inode-softlimit (the 'lfs setquota -i' value) and the inode-hardlimit (the 'lfs setquota -I' value) is 0.
CMD: onyx-30vm4 lctl get_param -n timeout
Using TIMEOUT=20
CMD: onyx-30vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
CMD: onyx-30vm1.onyx.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
error: set_param: param_path 'osc/*/idle_timeout': No such file or directory
error: get_param: param_path 'mdc/*/connect_flags': No such file or directory
jobstats not supported by server
enable quota as required
CMD: onyx-30vm4 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
CMD: onyx-30vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
[HOST:onyx-30vm1.onyx.whamcloud.com] [old_mdt_qtype:none] [old_ost_qtype:none] [new_qtype:ug3]
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.ost=ug3
Total disk size:  block-softlimit: 1024 block-hardlimit: 1075 inode-softlimit:  inode-hardlimit: 0
Setting up quota on onyx-30vm1.onyx.whamcloud.com:/mnt/lustre for quota_usr...
+ /usr/bin/lfs setquota -u quota_usr -b 1024 -B 1075 -i  -I 0 /mnt/lustre
lfs setquota: warning: block softlimit '1024' smaller than minimum qunit size
See 'lfs help setquota' or Lustre manual for details
lfs: invalid limit '-I'
Set filesystem quotas.
usage: setquota <-u|-g|-p> <uname>|<uid>|<gname>|<gid>|<projid>
                -b <block-softlimit> -B <block-hardlimit>
                -i <inode-softlimit> -I <inode-hardlimit> <filesystem>
In setup_quota(), we calculate the inode-softlimit from:
    # get_filesystem_size
    local disksz=$(lfs_df $mntpt | grep "summary" | awk '{print $2}')
    local blk_soft=$((disksz + 1024))
    local blk_hard=$((blk_soft + blk_soft / 20)) # Go 5% over

    local inodes=$(lfs_df -i $mntpt | grep "summary" | awk '{print $2}')
    local i_soft=$inodes
    local i_hard=$((i_soft + i_soft / 20))

    echo "Total disk size: $disksz block-softlimit: $blk_soft" \
        "block-hardlimit: $blk_hard inode-softlimit: $i_soft" \
        "inode-hardlimit: $i_hard"

    local cmd
    for usr in $quota_usrs; do
        echo "Setting up quota on $HOSTNAME:$mntpt for $usr..."
        for type in u g; do
            cmd="$LFS setquota -$type $usr -b $blk_soft"
            cmd="$cmd -B $blk_hard -i $i_soft -I $i_hard $mntpt"
            echo "+ $cmd"
            eval $cmd || error "$cmd FAILED!"
        done
        # display the quota status
        echo "Quota settings for $usr : "
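A minimal sketch of what I think happens when the lfs_df pipelines above return nothing (for example because the client has no usable OSC devices, which would also match the 'osc/*/idle_timeout' errors earlier in the log). The empty variable values below are assumed, not taken from the log, but the arithmetic reproduces the exact limits printed in the failing run:

    # Assumed reproduction: lfs_df printed no "summary" line, so both
    # awk pipelines came back empty.
    disksz=""
    inodes=""

    blk_soft=$((disksz + 1024))              # empty expands to 0 -> 1024
    blk_hard=$((blk_soft + blk_soft / 20))   # 1024 + 51 -> 1075
    i_soft=$inodes                           # plain assignment, stays empty
    i_hard=$((i_soft + i_soft / 20))         # empty expands to 0 -> 0

    echo "lfs setquota -u quota_usr -b $blk_soft -B $blk_hard -i $i_soft -I $i_hard /mnt/lustre"
    # -> lfs setquota -u quota_usr -b 1024 -B 1075 -i  -I 0 /mnt/lustre
    # With $i_soft empty, -i has no argument, so lfs consumes '-I' as the
    # -i limit and fails with "lfs: invalid limit '-I'", as in the test_31 log.

In other words, the broken setquota command appears to fall straight out of lfs_df returning nothing for $mntpt, not out of the quota code itself.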
The strange thing is that sanity-sec test 25 enables quota just as test 31 does, and it succeeds. From the suite_log, we see:
osc.lustre-OST0005-osc-ffff9c13bc588000.idle_timeout=debug
osc.lustre-OST0006-osc-ffff9c13bc588000.idle_timeout=debug
enable quota as required
CMD: onyx-30vm4 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
CMD: onyx-30vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
[HOST:onyx-30vm1.onyx.whamcloud.com] [old_mdt_qtype:ug] [old_ost_qtype:ug] [new_qtype:ug3]
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
CMD: onyx-30vm4 /usr/sbin/lctl conf_param lustre.quota.ost=ug3
Total disk size: 13532932 block-softlimit: 13533956 block-hardlimit: 14210653 inode-softlimit: 838864 inode-hardlimit: 880807
Setting up quota on onyx-30vm1.onyx.whamcloud.com:/mnt/lustre for quota_usr...
+ /usr/bin/lfs setquota -u quota_usr -b 13533956 -B 14210653 -i 838864 -I 880807 /mnt/lustre
+ /usr/bin/lfs setquota -g quota_usr -b 13533956 -B 14210653 -i 838864 -I 880807 /mnt/lustre
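As a quick cross-check, plugging the numbers from this successful test 25 run back into the same setup_quota() expressions reproduces the limits printed above (an illustrative calculation, not part of the original log):

    disksz=13532932; inodes=838864
    echo $((disksz + 1024))              # 13533956 = block-softlimit
    echo $((13533956 + 13533956 / 20))   # 14210653 = block-hardlimit
    echo $((inodes + inodes / 20))       # 880807   = inode-hardlimit

So the visible difference between the two runs is simply that lfs_df produced real block and inode totals for test 25 and nothing at all for test 31.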
Unfortunately, after sanity-sec test 31 fails, almost every other test suite fails to run any tests, for various reasons. Looking in the suite_log for each test suite:
In sanity-pfl, the client fails with:
CMD: onyx-30vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
CMD: onyx-30vm1.onyx.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
error: set_param: param_path 'osc/*/idle_timeout': No such file or directory
lustre-rsync-test, metadata-updates, ost-pools, … fail with:
Starting mds1: /dev/mapper/mds1_flakey /mnt/lustre-mds1
CMD: onyx-30vm4 mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1
onyx-30vm4: mount.lustre: according to /etc/mtab /dev/mapper/mds1_flakey is already mounted on /mnt/lustre-mds1