[LU-17229] replay-dual test_33: import is not in REPLAY_WAIT state Created: 26/Oct/23 Updated: 04/Feb/24 Resolved: 04/Feb/24 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Etienne Aujames |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
This issue was created by maloo for Sergey Cheremencev <scherementsev@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/5937b834-7be4-4f39-b44f-9be432e94b5f

test_33 failed with the following error:

lctl dl | grep ' ST ' || true
error: read_param: '/sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff9a37051e0000/ping': Transport endpoint is not connected
error: read_param: '/sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff9a372060e800/ping': Transport endpoint is not connected
...
rpc test_33: @@@@@@ FAIL: can't put import for mdc.lustre-MDT0000-mdc-*.mds_server_uuid into REPLAY_WAIT state after 1475 sec, have FULL
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6686:error()
= /usr/lib64/lustre/tests/test-framework.sh:8086:_wait_import_state()
= /usr/lib64/lustre/tests/test-framework.sh:8108:wait_import_state()
= /usr/lib64/lustre/tests/test-framework.sh:8118:wait_import_state_mount()
= rpc.sh:20:main()
CMD: onyx-82vm10,onyx-82vm1.onyx.whamcloud.com,onyx-82vm2,onyx-82vm5,onyx-82vm9 /usr/sbin/lctl dk > /autotest/autotest-2/2023-10-19/lustre-reviews_review-dne-part-2_99525_4_c51e2a6f-6696-46b6-b871-c03b2779b9df//rpc.test_33.debug_log.\$(hostname -s).1697711480.log; dmesg > /autotest/autotest-2/2023-10-19/lustre-reviews_review-dne-part-2_99525_4_c51e2a6f-6696-46b6-b871-c03b2779b9df//rpc.test_33.dmesg.\$(hostname -s).1697711480.log
onyx-82vm1.onyx.whamcloud.com: Dumping lctl log to /autotest/autotest-2/2023-10-19/lustre-reviews_review-dne-part-2_99525_4_c51e2a6f-6696-46b6-b871-c03b2779b9df//rpc.test_33.*.1697711480.log
replay-dual test_33: @@@@@@ FAIL: import is not in REPLAY_WAIT state
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6686:error()
= /usr/lib64/lustre/tests/test-framework.sh:8352:wait_clients_import_state()
= /usr/lib64/lustre/tests/replay-dual.sh:1303:test_33()
= /usr/lib64/lustre/tests/test-framework.sh:7026:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:7082:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6912:run_test()
= /usr/lib64/lustre/tests/replay-dual.sh:1323:main()

Test session details:
<<Please provide additional information about the failure here>>

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
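For readers unfamiliar with the failing check: the error above comes from a wait loop that polls the import state until it matches the expected value or a timeout expires. The following is only a minimal sketch of that pattern, not the test-framework.sh implementation. The function name `wait_for_state` and its argument order are hypothetical, and the state-reporting command is passed in by the caller rather than assumed.

```shell
#!/bin/bash
# Hypothetical helper illustrating a poll-until-state-or-timeout loop.
# This is NOT the code of _wait_import_state() in test-framework.sh.
#
# usage: wait_for_state EXPECTED TIMEOUT_SEC COMMAND [ARGS...]
# COMMAND must print the current state on stdout.
wait_for_state() {
	local expected=$1
	local timeout=$2
	shift 2
	local state=""
	local waited=0

	while (( waited <= timeout )); do
		# Run the caller-supplied command and capture the state it prints.
		state=$("$@" 2>/dev/null)
		if [[ "$state" == "$expected" ]]; then
			echo "reached $expected after $waited sec"
			return 0
		fi
		sleep 1
		waited=$((waited + 1))
	done

	# Same shape as the message in the log: expected state, timeout, actual state.
	echo "can't reach $expected after $timeout sec, have $state" >&2
	return 1
}
```

In this ticket's failure the polled state stayed at FULL for the whole timeout, so a loop of this shape would fall through to the error branch. A caller would supply a command that prints the import state, for example one built on `lctl get_param` for `mdc.lustre-MDT0000-mdc-*.mds_server_uuid`; the exact output parsing is left out here because it is an assumption.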
| Comments |
| Comment by Andreas Dilger [ 27/Oct/23 ] |
|
Hi Etienne, could you please look at this failure? It was first hit with your patch https://review.whamcloud.com/50434 |
| Comment by Nikitas Angelinas [ 03/Nov/23 ] |
|
+1 on master: https://testing.whamcloud.com/test_sets/a4c57330-8b26-48df-ab88-29d50600dd6c |
| Comment by Andreas Dilger [ 27/Nov/23 ] |
|
Still being hit regularly, also in replay-single test_135 |
| Comment by Gerrit Updater [ 28/Nov/23 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53267 |
| Comment by Etienne Aujames [ 28/Nov/23 ] |
|
I have tried something, but I am blind here. I cannot reproduce this test failure on my VMs. |
| Comment by Aurelien Degremont [ 08/Jan/24 ] |
|
+2 on master:
|
| Comment by Arshad Hussain [ 25/Jan/24 ] |
|
+1 on Master https://testing.whamcloud.com/sub_tests/697b09ea-deeb-4e84-a108-c15cb8808e99 |
| Comment by Alex Zhuravlev [ 28/Jan/24 ] |
|
Hitting this one quite often: Failure Rate: 17.33% of most recent 75 runs, 25 skipped (all branches) |
| Comment by Gerrit Updater [ 04/Feb/24 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/53267/ |
| Comment by Peter Jones [ 04/Feb/24 ] |
|
Merged for 2.16 |