Lustre / LU-13670

sanity-quota test_6: FAIL: dd not finished in 240 secs

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Affects Version/s: Lustre 2.16.0, Lustre 2.15.6
    • Severity: 3

    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/37495c46-a593-48c0-a5d8-28bc0e342fc3

      test_6 failed with the following error:

      dd not finished in 240 secs
      

      This happened with ZFS backend.
      In the OSS dmesg, we can see complaints about lost connection to the MDS:

      [ 3066.737564] Lustre: 17933:0:(client.c:2245:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1591889766/real 1591889766]  req@ffff8c0c6e17f180 x1669214324881792/t0(0) o601->lustre-MDT0000-lwp-OST0001@10.9.6.237@tcp:23/10 lens 336/336 e 0 to 1 dl 1591889773 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'ll_ost_io00_005.0'
      [ 3066.742583] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.6.237@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [ 3075.071151] Lustre: 17933:0:(client.c:2245:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1591889774/real 1591889774]  req@ffff8c0c6e17df80 x1669214324882240/t0(0) o601->lustre-MDT0000-lwp-OST0001@10.9.6.237@tcp:23/10 lens 336/336 e 0 to 1 dl 1591889781 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'ll_ost_io00_005.0'
      [ 3075.076488] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.6.237@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [ 3083.079772] Lustre: 17933:0:(client.c:2245:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1591889782/real 1591889782]  req@ffff8c0c6e17c480 x1669214324882752/t0(0) o601->lustre-MDT0000-lwp-OST0001@10.9.6.237@tcp:23/10 lens 336/336 e 0 to 1 dl 1591889789 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'ll_ost_io00_005.0'
      [ 3083.084896] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.6.237@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [ 3091.610366] Lustre: 17933:0:(client.c:2245:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1591889791/real 1591889791]  req@ffff8c0c7be6b180 x1669214324883264/t0(0) o601->lustre-MDT0000-lwp-OST0001@10.9.6.237@tcp:23/10 lens 336/336 e 0 to 1 dl 1591889798 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'ll_ost_io00_005.0'
      [ 3091.615435] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.6.237@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [ 3098.919547] Lustre: DEBUG MARKER: lctl set_param at_max=600
      [ 3099.698478] Lustre: DEBUG MARKER: dmesg
      [ 3101.618918] Lustre: 17933:0:(client.c:2245:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1591889801/real 1591889801]  req@ffff8c0c5d30fa80 x1669214324883776/t0(0) o601->lustre-MDT0000-lwp-OST0001@10.9.6.237@tcp:23/10 lens 336/336 e 0 to 1 dl 1591889808 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'ll_ost_io00_005.0'
      [ 3101.623866] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.6.237@tcp) was lost; in progress operations using this service will wait for recovery to complete
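      For reference, a minimal sketch of how this subtest is typically rerun on its own, assuming a standard lustre/tests environment (FSTYPE and ONLY are the usual test-framework knobs, not something specific to this report):

      # from a configured Lustre test node
      cd lustre/tests
      FSTYPE=zfs ONLY=6 bash sanity-quota.sh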
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-quota test_6 - dd not finished in 240 secs

          Activity

            emoly.liu Emoly Liu added a comment - +1 on master: https://testing.whamcloud.com/test_sets/faeae133-7c45-448e-8ad9-2206c19288b1
            stancheff Shaun Tancheff added a comment - edited

            This looks to be a related failure with a slightly different signature, from: https://testing.whamcloud.com/test_sets/39097f0a-1e73-427d-bb04-5d47b09bbf36

            [10368.372670] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0007.quota_slave.enabled
            [10413.491561] Lustre: ll_ost00_008: service thread pid 156469 was inactive for 43.952 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
            [10413.494794] Pid: 156469, comm: ll_ost00_008 4.18.0-553.27.1.el8_lustre.x86_64 #1 SMP Mon Nov 18 21:14:09 UTC 2024
            [10413.496545] Call Trace TBD:
            [10413.497215] [<0>] cv_wait_common+0xaf/0x130 [spl]
            [10413.498111] [<0>] txg_wait_synced_impl+0xc6/0x110 [zfs]
            [10413.499426] [<0>] txg_wait_synced+0xc/0x40 [zfs]
            [10413.500345] [<0>] osd_trans_stop+0x515/0x550 [osd_zfs]
            [10413.501338] [<0>] ofd_attr_set+0xaf0/0x10e0 [ofd]
            [10413.502245] [<0>] ofd_setattr_hdl+0x4b7/0x8f0 [ofd]
            [10413.503132] [<0>] tgt_request_handle+0x3f4/0x1a30 [ptlrpc]
            [10413.504585] [<0>] ptlrpc_server_handle_request+0x2aa/0xcf0 [ptlrpc]
            [10413.505753] [<0>] ptlrpc_main+0xc9e/0x15c0 [ptlrpc]
            [10413.506695] [<0>] kthread+0x134/0x150
            [10413.507392] [<0>] ret_from_fork+0x35/0x40
            [10427.005026] Lustre: ll_ost00_008: service thread pid 156469 completed after 57.466s. This likely indicates the system was overloaded (too many service threads, or not enough hardware resources).
            [10432.477003] Lustre: DEBUG MARKER: lctl get_param -n at_max
            [10432.778084] Lustre: DEBUG MARKER: lctl set_param at_max=20
            [10487.850642] Lustre: DEBUG MARKER: lctl set_param -n osd*.*OS*.force_sync=1
            [10488.495676] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param osd-*.lustre-OST*.quota_slave.timeout=10
            [10515.378550] Lustre: 181682:0:(service.c:1456:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply  req@ffff9337b6921380 x1817656977237248/t0(0) o4->1ded8de4-41c7-46ed-be64-b6d256d2b6e0@10.240.27.60@tcp:130/0 lens 488/448 e 1 to 0 dl 1733460500 ref 2 fl Interpret:/200/0 rc 0/0 job:'dd.60000' uid:60000 gid:60000
            [10516.402536] Lustre: 109211:0:(client.c:2364:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1733460480/real 1733460480]  req@ffff9337b6923400 x1817654230660736/t0(0) o601->lustre-MDT0000-lwp-OST0003@10.240.27.69@tcp:23/10 lens 336/336 e 0 to 1 dl 1733460496 ref 2 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'ll_ost_io00_001.0' uid:0 gid:0
            [10516.407923] Lustre: lustre-MDT0000-lwp-OST0003: Connection to lustre-MDT0000 (at 10.240.27.69@tcp) was lost; in progress operations using this service will wait for recovery to complete
            [10516.410677] Lustre: Skipped 7 previous similar messages
            [10516.415472] Lustre: lustre-MDT0000-lwp-OST0003: Connection restored to 10.240.27.69@tcp (at 10.240.27.69@tcp)
            [10516.417211] Lustre: Skipped 8 previous similar messages
            [10532.274379] Lustre: 109211:0:(client.c:2364:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1733460496/real 1733460496]  req@ffff9337a0f9cd00 x1817654230677760/t0(0) o601->lustre-MDT0000-lwp-OST0003@10.240.27.69@tcp:23/10 lens 336/336 e 0 to 1 dl 1733460512 ref 2 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'ll_ost_io00_001.0' uid:0 gid:0
            [10540.371558] Lustre: DEBUG MARKER: lctl set_param at_max=600
            [10541.022332] Lustre: DEBUG MARKER: dmesg
            [10541.386020] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-quota test_6: @@@@@@ FAIL: [10413.491561] Lustre: ll_ost00_008: service thread pid 156469 was inactive for 43.952 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for de
            [10541.561909] Lustre: DEBUG MARKER: sanity-quota test_6: @@@@@@ FAIL: [10413.491561] Lustre: ll_ost00_008: service thread pid 156469 was inactive for 43.952 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
            [10541.798922] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2024-12-06/lustre-reviews_review-dne-zfs-part-4_109434_15_58bdf9be-0c10-406b-b4e5-4a4e7607fd60//sanity-quota.test_6.debug_log.$(hostname -s).1733460521.log;
                           		dmesg > /autotest/autotest-2/2024-12-06/lustre-reviews_
            

            scherementsev Sergey Cheremencev added a comment -

            From https://testing.whamcloud.com/test_sets/8102d3fc-f790-4053-9e6b-51923ad0e251 :

            serega@Sergeys-MacBook-Pro lustre-release % grep "qsd_op_begin0.*op_begin" ~/Downloads/sanity-quota.test_6.debug_log.onyx-59vm6.1731512940.log 
            00040000:04000000:0.0:1731512927.747059:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2832 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512929.344954:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2836 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512931.629564:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2840 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512932.901844:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2844 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512933.823209:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2848 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512935.789237:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2852 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512937.531182:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2856 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:0.0:1731512938.691124:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2860 qunit:1024 qtune:512 edquot:1 default:no
            00040000:04000000:1.0:1731512940.366300:0:232570:0:(qsd_handler.c:743:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0001 qtype:usr id:60000 enforced:1 granted: 3072 pending:0 waiting:0 req:0 usage: 2864 qunit:1024 qtune:512 edquot:1 default:no 

            The usage is increasing by only 4 bytes every few seconds (2832 → 2864 over the ~13 seconds shown above), while it is trying to write 12 bytes (space:12) - which looks weird.

            On the client side I see that edquot is being set several times, and at the end of the test it gets EDQUOT in chkdq:

            00000008:04000000:0.0:1731512918.620394:0:99472:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512920.347211:0:99472:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512920.371962:0:99472:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512925.884689:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512926.506032:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512935.788543:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512937.530517:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512938.690475:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000008:04000000:0.0:1731512940.365420:0:99471:0:(osc_request.c:2138:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 404100
            00000001:04000000:0.0:1731512940.365672:0:666845:0:(osc_quota.c:63:osc_quota_chkdq()) chkdq found noquota for user 60000 

            There are no requests from the OST to the QMT. It is clear that qsd has already set edquot for this user, so there is no need, at least right now, to send an acquire request and grant more space. There is also a chance that it could acquire some space through qsd_intent_lock: even when the QMT is prohibited from handling QUOTA_DQACQ, it is still able to handle ldlm IT_QUOTA_DQACQ requests. Could that break the logic of the test?
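            For what it's worth, the quota-slave state during such a hang could be inspected with something like the following (quota_slave.info is the usual osd parameter; the user name and mount point are assumptions based on sanity-quota defaults):

            # on the OST node: quota-slave status (enforcement on/off, connection to the quota master)
            lctl get_param osd-*.*-OST*.quota_slave.info

            # on the client: usage and limits as seen for the test user
            lfs quota -u quota_usr /mnt/lustre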

            I found that $TESTFILE is removed only if at_is_enabled. I'm not sure it would help with these failures, but do we need to set the stripe ("-i 0 -c 1") for the file we are going to write? If all OSTs are on the same node, that shouldn't be the reason it failed; it only matters when the OSTs are on different nodes.
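            As a sketch of that stripe setting (treating $LFS, $TESTFILE and $TSTUSR as the usual sanity-quota names), it would be something like:

            # put the whole file on a single OST so every write hits the same quota slave
            $LFS setstripe -c 1 -i 0 $TESTFILE
            chown $TSTUSR:$TSTUSR $TESTFILE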

            If the patch gets updated, stack_trap needs to be added in a few places (a consolidated sketch follows this list):

            1. local old_qst=$($LCTL get_param osd-*.$FSNAME-OST*.quota_slave.timeout)
               do_facet ost1 $LCTL set_param \
                  osd-*.$FSNAME-OST*.quota_slave.timeout=$((TIMEOUT / 2))
               stack_trap "do_facet ost1 $LCTL set_param osd-*.$FSNAME-OST*.quota_slave.timeout=$old_qst"
            2. do_facet ost1 dmesg > $TMP/lustre-log-${TESTNAME}.log
               stack_trap "rm -f $TMP/lustre-log-${TESTNAME}.log"
               watchdog=$(awk '/[Ss]ervice thread pid/ && /was inactive/ \
               ...
            3. at_max_set $TIMEOUT ost1
               stack_trap "at_max_set $at_max_saved ost1"
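            A consolidated sketch of how those three pieces could sit together in test_6 (do_facet, stack_trap, at_max_get/at_max_set are existing test-framework helpers; taking the first value with head -n1 and the exact parameter paths are assumptions):

            # save and later restore quota_slave.timeout on ost1
            local old_qst=$(do_facet ost1 $LCTL get_param -n \
                osd-*.$FSNAME-OST*.quota_slave.timeout | head -n1)
            do_facet ost1 $LCTL set_param \
                osd-*.$FSNAME-OST*.quota_slave.timeout=$((TIMEOUT / 2))
            stack_trap "do_facet ost1 $LCTL set_param osd-*.$FSNAME-OST*.quota_slave.timeout=$old_qst"

            # capture dmesg for the watchdog check and clean it up on exit
            do_facet ost1 dmesg > $TMP/lustre-log-${TESTNAME}.log
            stack_trap "rm -f $TMP/lustre-log-${TESTNAME}.log"

            # save/restore at_max on ost1
            local at_max_saved=$(at_max_get ost1)
            at_max_set $TIMEOUT ost1
            stack_trap "at_max_set $at_max_saved ost1"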

             

            I'm not sure why we need to wait 240 seconds at the end of the test. We could probably change it to something like 40-60 seconds, which would give us a chance to capture earlier debug logs and better understand the problem.
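            As a rough sketch of that shorter wait (the loop structure and the DDPID variable name here are assumptions, not the actual test code; quota_error is the existing sanity-quota helper):

            local max_wait=60       # instead of 240 seconds
            local waited=0
            # DDPID: pid of the backgrounded dd (assumed name)
            while kill -0 $DDPID 2>/dev/null; do
                if (( ++waited > max_wait )); then
                    quota_error u $TSTUSR "dd not finished in $max_wait secs"
                    break
                fi
                sleep 1
            done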

            yujian Jian Yu added a comment - +1 on Lustre b2_15 branch: https://testing.whamcloud.com/test_sets/8102d3fc-f790-4053-9e6b-51923ad0e251
            yujian Jian Yu added a comment -

            The subtest failed many times in zfs test sessions in the past 6 months.

            yujian Jian Yu added a comment - +1 on master branch: https://testing.whamcloud.com/test_sets/0d7fdd5a-b404-4b82-bd32-aaf538cee475

            People

              Assignee: wc-triage WC Triage
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 6