[LU-7312] sanity-hsm: $? verification is not valid with set -e Created: 19/Oct/15 Updated: 10/Oct/21 Resolved: 10/Oct/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Bhagyesh Dudhediya (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Project: | HSM | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
The part of the fix for

    while (( SECONDS < end_wait )); do
        sleep 2
        do_nodesv $agents "pgrep -x $HSMTOOL_BASE"
        if [ $? -ne 0 ]; then
            echo "Copytool is stopped on $agents"
            break
        fi
        echo "Copytool still running on $agents"
    done
    if do_nodesv $agents "pgrep -x $HSMTOOL_BASE"; then
        error "Copytool failed to stop in ${TIMEOUT}s ..."
    else
        echo "Copytool has stopped in " \
             "$((TIMEOUT - (end_wait - SECONDS)))s."
    fi

causes failure in sanity-hsm. |
| Comments |
| Comment by Gerrit Updater [ 19/Oct/15 ] |
|
Bhagyesh Dudhediya (bhagyesh.dudhediya@seagate.com) uploaded a new patch: http://review.whamcloud.com/16866 |
| Comment by Bruno Faccini (Inactive) [ 19/Oct/15 ] |
|
Hello Bhagyesh,
OTOH, re-reading the Bash "set" documentation (https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html), I understand that when "set -e" is in use, a simple command failure can cause the shell/script to exit, even if this does not seem to occur ... Is this what you mean? And, if so, that the construct:

    do_nodesv $agents "pgrep -x $HSMTOOL_BASE"
    if [ $? -ne 0 ]; then

should be replaced by:

    if ! do_nodesv $agents "pgrep -x $HSMTOOL_BASE"; then

to avoid any problem? |
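The difference between the two constructs can be reproduced outside the test suite. The following is a minimal sketch, not the sanity-hsm code: check_stopped is a hypothetical stand-in for do_nodesv $agents "pgrep -x $HSMTOOL_BASE" that always returns 1, and each pattern runs in its own subshell with "set -e" enabled explicitly. Whether the same exit happens at a given call site depends on errexit actually being in effect there; Bash ignores it, for example, while evaluating an if condition or the left-hand side of && or ||.

```shell
#!/bin/bash
# check_stopped is a hypothetical stand-in for
#   do_nodesv $agents "pgrep -x $HSMTOOL_BASE"
# It always returns 1, as pgrep does when no process matches.
check_stopped() { return 1; }

# Pattern 1: bare command followed by a test of $?.
# With set -e the subshell exits at the failing command,
# so the test on the next line is never reached.
old_output=$(
    set -e
    check_stopped
    if [ $? -ne 0 ]; then
        echo "Copytool is stopped"
    fi
)
old_status=$?

# Pattern 2: the command is the if condition itself.
# set -e is ignored for commands tested by if, so the branch runs.
new_output=$(
    set -e
    if ! check_stopped; then
        echo "Copytool is stopped"
    fi
)
new_status=$?

echo "old: status=$old_status output='$old_output'"
# prints: old: status=1 output=''
echo "new: status=$new_status output='$new_output'"
# prints: new: status=0 output='Copytool is stopped'
```

The second form also keeps the command and its check in a single statement, so nothing can run in between and overwrite $?.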
| Comment by Bhagyesh Dudhediya (Inactive) [ 19/Oct/15 ] |
|
Yes, IMO the replacement should be done as mentioned.

    if [ $? -ne 0 ]; then

leads to exit.

[root@Bhagyesh tests]# ONLY=400 ./sanity-hsm.sh
Logging to shared log directory: /tmp/test_logs/1445246722
Starting client Bhagyesh: -o user_xattr,flock Bhagyesh@tcp:/lustre /mnt/lustre2
Started clients Bhagyesh:
Bhagyesh@tcp:/lustre on /mnt/lustre2 type lustre (rw,user_xattr,flock)
Bhagyesh: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients Bhagyesh environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
[touch] [/mnt/lustre/d0_runas_test/f3594]
excepting tests: 34 35 36
Killing existing copytools on Bhagyesh
Set HSM on and start
Start copytool
Purging archive on Bhagyesh
Starting copytool agt1 on Bhagyesh
Set sanity-hsm HSM policy
== sanity-hsm test 400: Single request is sent to the right MDT == 14:55:43 (1445246743)
SKIP: sanity-hsm test_400 needs >= 2 MDTs
Resetting fail_loc on all nodes...done.
SKIP 400 (0s)
[root@Bhagyesh tests]#

After the patch:

[root@Bhagyesh tests]# ONLY=400 ./sanity-hsm.sh
Logging to shared log directory: /tmp/test_logs/1445246819
Bhagyesh: Checking config lustre mounted on /mnt/lustre2
Checking servers environments
Checking clients Bhagyesh environments
Bhagyesh: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients Bhagyesh environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
[touch] [/mnt/lustre/d0_runas_test/f4943]
excepting tests: 34 35 36
Killing existing copytools on Bhagyesh
Set HSM on and start
Start copytool
Purging archive on Bhagyesh
Starting copytool agt1 on Bhagyesh
Set sanity-hsm HSM policy
== sanity-hsm test 400: Single request is sent to the right MDT == 14:57:20 (1445246840)
SKIP: sanity-hsm test_400 needs >= 2 MDTs
Resetting fail_loc on all nodes...done.
SKIP 400 (0s)
Copytool is stopped on Bhagyesh
Copytool has stopped in 2s.
mdt.lustre-MDT0000.hsm_control=shutdown
Waiting 20 secs for update
mdt.lustre-MDT0000.hsm_control=enabled
== sanity-hsm test complete, duration 24 sec == 14:57:23 (1445246843)
Stopping clients: Bhagyesh /mnt/lustre2 (opts:)
Stopping client Bhagyesh /mnt/lustre2 opts:
[root@Bhagyesh tests]#
|
| Comment by jacques-charles lafoucriere [ 20/Oct/15 ] |
|
It seems there are other places in sanity-hsm.sh and sanity.sh with the same error test (and maybe in other test scripts). |
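One way to locate such places is to search for tests of $? inside [ ] or [[ ]]. The sketch below is only an illustration: it builds a small sample file instead of assuming a Lustre source tree, and the regular expression is an assumption about how these tests are usually written. Each hit would still need a manual review, since a $? test is harmless where errexit is not in effect.

```shell
#!/bin/bash
# Build a small sample file so the search can be tried anywhere.
sample=$(mktemp)
cat > "$sample" <<'EOF'
do_nodesv $agents "pgrep -x $HSMTOOL_BASE"
if [ $? -ne 0 ]; then
    echo "stopped"
fi
if ! do_nodesv $agents "pgrep -x $HSMTOOL_BASE"; then
    echo "stopped"
fi
EOF

# -n prints line numbers; -E enables extended regular expressions.
# The pattern matches "[ $? " or "[[ $? ", i.e. a test whose first
# operand is the exit status of the preceding command.
matches=$(grep -n -E '\[\[? *\$\? ' "$sample")
echo "$matches"
# prints: 2:if [ $? -ne 0 ]; then

rm -f "$sample"
```

Against the real scripts, the same grep command could be pointed at sanity-hsm.sh and sanity.sh instead of the sample file.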