[LU-7675] replay-single test_101 times out after aborting recovery on mount of the mds1 Created: 15/Jan/16 Updated: 14/Dec/21 Resolved: 14/Dec/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
autotest review-dne-part-2 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
replay-single test 101 times out on mount of the mds1 with the abort recovery flag. The last information in the test_log is

01:57:12 (1452765432) waiting for onyx-34vm7 network 900 secs ...
01:57:12 (1452765432) network interface is UP
CMD: onyx-34vm7 hostname
CMD: onyx-34vm7 test -b /dev/lvm-Role_MDS/P1
Starting mds1: -o abort_recovery /dev/lvm-Role_MDS/P1 /mnt/mds1
CMD: onyx-34vm7 mkdir -p /mnt/mds1; mount -t lustre -o abort_recovery /dev/lvm-Role_MDS/P1 /mnt/mds1

From the MDS1 console, we see:

01:57:22:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
01:57:22:LustreError: 14301:0:(mdt_handler.c:5605:mdt_iocontrol()) lustre-MDT0000: Aborting recovery for device
01:57:44:LustreError: 14301:0:(ldlm_lib.c:2479:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery
01:57:44:Lustre: 14377:0:(ldlm_lib.c:1945:target_recovery_overseer()) recovery is aborted, evict exports in recovery
01:57:44:Lustre: 14377:0:(ldlm_lib.c:1945:target_recovery_overseer()) Skipped 2 previous similar messages
01:57:44:Lustre: lustre-MDT0000: disconnecting 5 stale clients
01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) master transno = 382252089401 batchid = 373662154835 flags = 0 ops = 19 params = 9
01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) master transno = 382252089401 batchid = 373662154836 flags = 0 ops = 28 params = 24
01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) master transno = 382252089401 batchid = 377957122268 flags = 0 ops = 19 params = 9
01:57:44: Press any key to continue.
01:57:44: Press any key to continue.
01:57:44: Press any key to continue.
01:57:44: Press any key to continue.
01:57:44: Press any key to continue.
01:57:44: [H [J
01:57:44: GNU GRUB version 0.97 (631K lower / 2096116K upper memory)

We’ve seen this error four times in the past two months during review-dne-part-2 testing. Logs are at |
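For anyone comparing this failure against other review-dne-part-2 runs, the update_records_dump() lines quoted above can be pulled out of a console log with a few lines of Python. This is only a sketch: the regular expression is written against the three lines in this ticket, the function name parse_update_records and the SAMPLE text are made up for illustration, and it assumes every field is printed in decimal as it is here.

```python
import re

# Matches the update_records_dump() console lines quoted in this ticket.
# Assumption: all five fields are printed as plain decimal integers.
DUMP_RE = re.compile(
    r"update_records_dump\(\)\)\s+master transno = (?P<transno>\d+)\s+"
    r"batchid = (?P<batchid>\d+)\s+flags = (?P<flags>\d+)\s+"
    r"ops = (?P<ops>\d+)\s+params = (?P<params>\d+)"
)


def parse_update_records(log_text):
    """Return one dict of integer fields per update_records_dump() line."""
    return [
        {name: int(value) for name, value in match.groupdict().items()}
        for match in DUMP_RE.finditer(log_text)
    ]


# The three lines from the MDS1 console output in the description.
SAMPLE = (
    "01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) "
    "master transno = 382252089401 batchid = 373662154835 flags = 0 ops = 19 params = 9\n"
    "01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) "
    "master transno = 382252089401 batchid = 373662154836 flags = 0 ops = 28 params = 24\n"
    "01:57:44:LustreError: 14377:0:(update_records.c:72:update_records_dump()) "
    "master transno = 382252089401 batchid = 377957122268 flags = 0 ops = 19 params = 9\n"
)

records = parse_update_records(SAMPLE)
for record in records:
    print(record)
```

On the lines in this ticket, all three dumped records carry the same master transno but different batchids, which may make it easier to match them against dumps from the other failed runs.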
| Comments |
| Comment by Di Wang [ 20/Jan/16 ] |
|
This will probably be fixed by http://review.whamcloud.com/#/c/17885/ |
| Comment by James Nunez (Inactive) [ 20/Jan/16 ] |
|
Another failure on master: |