Details
- Type: Bug
- Resolution: Unresolved
- Priority: Minor
- Fix Version/s: None
- Affects Version/s: Lustre 2.15.4
- Component/s: None
- Severity: 3
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/bfbdbdae-6c4b-4ef0-abb4-ba956983c322
test_3a failed with the following error:
Timeout occurred after 576 minutes, last suite running was obdfilter-survey
Test session details:
clients: https://build.whamcloud.com/job/lustre-b2_15/77 - 4.18.0-477.15.1.el8_8.x86_64
servers: https://build.whamcloud.com/job/lustre-b2_15/77 - 4.18.0-477.15.1.el8_lustre.x86_64
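For context, test_3a exercises obdfilter-survey's network case, which surveys LNET throughput between a client and a server NID rather than the backend disks. A minimal sketch of such an invocation follows, per the Lustre Operations Manual; the NID, object/thread counts, and size are placeholder assumptions, not values taken from this ticket, and the command is printed rather than executed since it requires a live Lustre setup:

```shell
#!/bin/sh
# Sketch only (assumptions, not from this ticket): a typical invocation of
# obdfilter-survey's network case. SERVER_NID is a placeholder; in a real
# run it would be the NID of the server under test. We echo the command
# instead of running it, because it needs a configured Lustre cluster.
SERVER_NID="10.240.40.14@tcp"   # placeholder server NID

echo "nobjhi=2 thrhi=2 size=1024 case=network targets=\"${SERVER_NID}\" sh obdfilter-survey"
```

In this mode the survey generates pure network traffic via the echo client/server layers, so a hang here points at LNET or node state rather than the OST storage path.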
<<Please provide additional information about the failure here>>
MDS console
[34451.919649] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == obdfilter-survey test 3a: Network survey ============== 23:18:40 \(1701299920\)
[34452.111000] Lustre: DEBUG MARKER: == obdfilter-survey test 3a: Network survey ============== 23:18:40 (1701299920)
[34453.201758] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
[34453.514311] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
[34457.041800] Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
[34457.044694] Lustre: Skipped 30 previous similar messages
[34457.045979] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
[34457.047440] Lustre: Skipped 48 previous similar messages
[34460.050142] LustreError: 1067372:0:(lprocfs_jobstats.c:137:job_stat_exit()) should not have any items
[34460.051957] LustreError: 1067372:0:(lprocfs_jobstats.c:137:job_stat_exit()) Skipped 40 previous similar messages
[34460.108661] Lustre: server umount lustre-MDT0000 complete
...
[34461.559320] Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
[34461.882390] Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
[34462.161732] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[34462.164874] LustreError: Skipped 64 previous similar messages
[34462.196821] Lustre: DEBUG MARKER: modprobe -r dm-flakey
[34462.987368] LustreError: 1060464:0:(ldlm_lockd.c:2521:ldlm_cancel_handler()) ldlm_cancel from 10.240.39.170@tcp arrived at 1701299931 with bad export cookie 10225034501323090856
[34462.990356] LustreError: 1060464:0:(ldlm_lockd.c:2521:ldlm_cancel_handler()) Skipped 14 previous similar messages
[34463.185819] LustreError: 11-0: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.39.170@tcp failed: rc = -107
[34463.187938] LustreError: Skipped 4 previous similar messages
[34469.328952] Lustre: 12617:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1701299931/real 1701299931] req@0000000037606504 x1783906650521728/t0(0) o400->MGC10.240.40.14@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1701299938 ref 1 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'kworker/u4:3.0'
[34469.334831] Lustre: 12617:0:(client.c:2295:ptlrpc_expire_one_request()) Skipped 13 previous similar messages
[34469.336851] LustreError: 166-1: MGC10.240.40.14@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[34469.339375] LustreError: Skipped 8 previous similar messages
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
obdfilter-survey test_3a - Timeout occurred after 576 minutes, last suite running was obdfilter-survey