Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.12.3, Lustre 2.12.4, Lustre 2.12.6, Lustre 2.12.7
-
3
Description
stdout
== replay-single test 65a: AT: verify early replies ================================================== 11:07:04 (1473160024)
at_history=8
at_history=8
debug=other
fail_val=6000
fail_loc=0x8000050a
replay-single test_65a: @@@@@@ FAIL: No early reply
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4863:error()
  = /usr/lib64/lustre/tests/replay-single.sh:1701:test_65a()
  = /usr/lib64/lustre/tests/test-framework.sh:5123:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:5161:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4965:run_test()
  = /usr/lib64/lustre/tests/replay-single.sh:1708:main()
Dumping lctl log to /tmp/test_logs/1473160016/replay-single.test_65a.*.1473160052.log
fre1337: Warning: Permanently added 'fre1339,192.168.113.39' (ECDSA) to the list of known hosts.
fre1340: Warning: Permanently added 'fre1339,192.168.113.39' (ECDSA) to the list of known hosts.
fre1338: Warning: Permanently added 'fre1339,192.168.113.39' (ECDSA) to the list of known hosts.
debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck
debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck
debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck
debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck
Resetting fail_loc on all nodes...done.
FAIL 65a (29s)
cmd
SLOW=YES NAME=ncli mgs_HOST=fre1337 MGSDEV=/dev/vdb NETTYPE=tcp mds1_HOST=fre1337 MDSDEV1=/dev/vdc mds_HOST=fre1337 MDSDEV=/dev/vdc mds2_HOST=fre1337 MDSDEV2=/dev/vdd MDSCOUNT=2 ost1_HOST=fre1338 OSTDEV1=/dev/vdb ost2_HOST=fre1338 OSTDEV2=/dev/vdc OSTCOUNT=2 CLIENTS=fre1339 RCLIENTS="fre1340" PDSH="/usr/bin/pdsh -R ssh -S -w " ONLY=65a MDS_MOUNT_OPTS="-o rw,user_xattr" OST_MOUNT_OPTS="-o user_xattr" MDSSIZE=0 OSTSIZE=0
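For triage across many runs, the failure line and the closing summary line in the stdout above can be pulled out mechanically. The following is a minimal sketch, not part of the Lustre test framework: the function name `parse_failure` and both regular expressions are hypothetical and are derived only from the log format shown in this ticket, so they may need adjusting for other log layouts.

```python
import re

# Hypothetical patterns, based only on the log excerpt in this ticket.
# Failure line, e.g. "replay-single test_65a: @@@@@@ FAIL: No early reply"
FAIL_RE = re.compile(
    r"(?P<suite>[\w-]+) test_(?P<test>\w+): @@@@@@ FAIL: "
    r"(?P<msg>.+?)(?=\s+Trace dump:|\s*$)",
    re.M,
)
# Summary line, e.g. "FAIL 65a (29s)"
SUMMARY_RE = re.compile(r"\bFAIL (\w+) \((\d+)s\)")


def parse_failure(log_text):
    """Return suite, test, message and duration for the first failure, or None."""
    m = FAIL_RE.search(log_text)
    if m is None:
        return None
    s = SUMMARY_RE.search(log_text)
    return {
        "suite": m.group("suite"),
        "test": m.group("test"),
        "message": m.group("msg").strip(),
        # Duration in seconds from the summary line, if one is present.
        "seconds": int(s.group(2)) if s else None,
    }
```

The lookahead stops the message at either ` Trace dump:` or the end of the line, so the same pattern works whether the log is stored with its original line breaks or flattened onto one line.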
Please note, this issue was seen while testing the patch for LU-8062 for SEA-101.
Looking at failures over the past year, replay-single test_65a is still failing with the 'No early reply' message, mostly on the b2_12 branch. When test 65a fails on other branches, many other replay-single tests fail as well, so those failures may have a different cause.
Here is a sample of test failures from the past year:
2.12.7 RC1 - https://testing.whamcloud.com/test_sets/4cf924bb-bfb3-4c60-ba5e-62121562e68d
2.12.6.69 ppc - https://testing.whamcloud.com/test_sets/d0f3af4d-d159-426b-8ad5-efe4412cf8b4
2.12.6.55 ppc - https://testing.whamcloud.com/test_sets/3ec45aa6-13ef-49f7-ac81-fe6a5105ef02
2.12.6.51 ppc - https://testing.whamcloud.com/test_sets/6885c8ec-74b6-44f7-8bf6-a83ba7c6ee0e
2.12.5.83 - https://testing.whamcloud.com/test_sets/7bc00870-4ba5-4421-9a05-e6b4e9766b61
2.12.5.60 ppc - https://testing.whamcloud.com/test_sets/ea3af64e-1750-477f-a939-3e8cf5e40474
2.12.5.32 ppc - https://testing.whamcloud.com/test_sets/98bc51af-75b0-4442-8d8a-561c13b2a108
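To see how the sampled failures are distributed, the list above can be tallied by label and by release series. This is a rough sketch under the assumption that every entry has the form `<version> [label] - <url>`, as in the list; the names `parse_entry`, `by_label`, and `by_series` are illustrative only, and the sample is too small to draw firm conclusions from.

```python
from collections import Counter

# The sample entries, copied verbatim from this ticket.
SAMPLE = """\
2.12.7 RC1 - https://testing.whamcloud.com/test_sets/4cf924bb-bfb3-4c60-ba5e-62121562e68d
2.12.6.69 ppc - https://testing.whamcloud.com/test_sets/d0f3af4d-d159-426b-8ad5-efe4412cf8b4
2.12.6.55 ppc - https://testing.whamcloud.com/test_sets/3ec45aa6-13ef-49f7-ac81-fe6a5105ef02
2.12.6.51 ppc - https://testing.whamcloud.com/test_sets/6885c8ec-74b6-44f7-8bf6-a83ba7c6ee0e
2.12.5.83 - https://testing.whamcloud.com/test_sets/7bc00870-4ba5-4421-9a05-e6b4e9766b61
2.12.5.60 ppc - https://testing.whamcloud.com/test_sets/ea3af64e-1750-477f-a939-3e8cf5e40474
2.12.5.32 ppc - https://testing.whamcloud.com/test_sets/98bc51af-75b0-4442-8d8a-561c13b2a108
"""


def parse_entry(line):
    """Split '<version> [label] - <url>' into (version, label, url)."""
    left, url = line.split(" - ", 1)
    version, _, label = left.partition(" ")
    return version, label.strip(), url.strip()


entries = [parse_entry(line) for line in SAMPLE.splitlines() if line.strip()]

# Count entries per label (e.g. "ppc", "RC1") and per release series (x.y.z).
by_label = Counter(label or "(none)" for _, label, _ in entries)
by_series = Counter(".".join(v.split(".")[:3]) for v, _, _ in entries)

print(dict(by_label))
print(dict(by_series))
```

In this sample, five of the seven entries carry the `ppc` label, and the 2.12.5 and 2.12.6 series account for three entries each.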