Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.8.0, Lustre 2.11.0
Labels: None
Environment: Hard Failover; EL6.7 Server / SLES11 SP4 Clients; Tag 2.7.66, master, build# 3316
Severity: 3
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/203c5f0c-ce41-11e5-90aa-5254006e85c2.
The sub-test test_53f failed with the following error:
close_pid should not exist
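For context, the assertion behind this message can be sketched as follows: test_53f in replay-single.sh backgrounds a close operation (via the multiop helper), records its PID in close_pid, and after MDS recovery expects that process to have exited. This is a minimal stand-in, not the real test; `sleep` replaces the multiop close:

```shell
# Minimal sketch of the failing check: a backgrounded operation whose
# PID must be gone after recovery. 'sleep' stands in for the real
# multiop close done by test_53f.
sleep 30 &
close_pid=$!
kill -9 "$close_pid"           # stand-in for the close completing during replay
wait "$close_pid" 2>/dev/null  # reap the child so the liveness check is meaningful
if kill -0 "$close_pid" 2>/dev/null; then
    echo "FAIL: close_pid should not exist"   # the outcome seen in this run
else
    echo "close_pid gone as expected"         # the expected (passing) outcome
fi
```

In the failing run the first branch was taken: the backgrounded close was still alive after failover and recovery.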
Test log:
== replay-single test 53f: |X| open reply and close reply while two MDC requests in flight =========== 23:18:48 (1454915928)
CMD: shadow-49vm3 lctl set_param fail_loc=0x119
fail_loc=0x119
CMD: shadow-49vm3 lctl set_param fail_loc=0x8000013b
fail_loc=0x8000013b
CMD: shadow-49vm3 sync; sync; sync
Replay barrier on lustre-MDT0000
CMD: shadow-49vm3 /usr/sbin/lctl --device lustre-MDT0000 notransno
CMD: shadow-49vm3 /usr/sbin/lctl --device lustre-MDT0000 readonly
CMD: shadow-49vm3 /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
CMD: shadow-49vm3 /usr/sbin/lctl dl
Failing mds1 on shadow-49vm3
+ pm -h powerman --off shadow-49vm3
Command completed successfully
reboot facets: mds1
+ pm -h powerman --on shadow-49vm3
Command completed successfully
Failover mds1 to shadow-49vm7
23:19:04 (1454915944) waiting for shadow-49vm7 network 900 secs ...
23:19:04 (1454915944) network interface is UP
CMD: shadow-49vm7 hostname
mount facets: mds1
CMD: shadow-49vm7 test -b /dev/lvm-Role_MDS/P1
Starting mds1: /dev/lvm-Role_MDS/P1 /mnt/mds1
CMD: shadow-49vm7 mkdir -p /mnt/mds1; mount -t lustre /dev/lvm-Role_MDS/P1 /mnt/mds1
shadow-49vm7: mount.lustre: increased /sys/block/dm-0/queue/max_sectors_kb from 1024 to 16384
shadow-49vm7: mount.lustre: increased /sys/block/sda/queue/max_sectors_kb from 1024 to 16384
CMD: shadow-49vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck\" \"all -lnet -lnd -pinger\" 4
CMD: shadow-49vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: shadow-49vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: shadow-49vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null
Started lustre-MDT0000
replay-single test_53f: @@@@@@ FAIL: close_pid should not exist
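The fail_loc values set at the top of the log are Lustre fault-injection points, and the high bit is a flag rather than part of the fault number. Assuming the OBD_FAIL_ONCE convention from lustre/include/obd_support.h (0x80000000 = fire once, then disarm), the second value decomposes as:

```shell
# Decompose the injected fail_loc from the log. 0x8000013b is fault
# point 0x13b with the one-shot flag (0x80000000, assumed to be
# OBD_FAIL_ONCE) set; 0x119 was set without the flag, so it stays
# armed until cleared.
fail_loc=0x8000013b
printf 'fault point: 0x%x\n' $(( fail_loc & 0x0000ffff ))
printf 'flags:       0x%x\n' $(( fail_loc & 0xffff0000 ))
```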
Client1 dmesg:
[290105.069249] Lustre: DEBUG MARKER: == replay-single test 53f: |X| open reply and close reply while two MDC requests in flight =========== 23:18:48 (1454915928)
[290108.561999] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
[290138.404135] LustreError: 27898:0:(mgc_request.c:529:do_requeue()) failed processing log: -5
[290194.907325] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_53f: @@@@@@ FAIL: close_pid should not exist
[290195.039656] Lustre: DEBUG MARKER: replay-single test_53f: @@@@@@ FAIL: close_pid should not exist
[290195.273829] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2016-02-04/lustre-master-el6_7-x86_64-vs-lustre-master-sles11sp4-x86_64--failover--2_3_1__3316__-70200093201000-150616/replay-single.test_53f.debug_log.$(hostname -s).1454916018.log;
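For reference, the "Replay barrier" sequence in the log (sync, notransno, readonly, lctl mark) corresponds to the replay_barrier() helper in lustre/tests/test-framework.sh. A simplified sketch, with do_facet replaced by a local echo stub standing in for remote execution over ssh/pdsh:

```shell
# Simplified sketch of replay_barrier() from test-framework.sh.
# do_facet normally runs the command on the facet's node; here it is
# an echo stub so the sequence can be shown standalone.
do_facet() { local facet=$1; shift; echo "CMD: $facet $*"; }

replay_barrier() {
    local facet=$1 dev=$2
    do_facet "$facet" "sync; sync; sync"                      # flush dirty state to disk
    do_facet "$facet" "lctl --device $dev notransno"          # stop assigning transaction numbers
    do_facet "$facet" "lctl --device $dev readonly"           # discard subsequent writes
    do_facet "$facet" "lctl mark \"REPLAY BARRIER on $dev\""  # stamp the debug log
}

replay_barrier mds1 lustre-MDT0000
```

After the barrier, the target is power-cycled (the powerman off/on above) and remounted on the failover node, so any state past the barrier must come back via client replay, which is exactly what the open/close replay in this test exercises.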