Description
We are seeing a single message on one of our MDS nodes time out over and over, with no end in sight. The message is destined for an OST, so each time it times out the connection to the OST is assumed dead and the MDS reconnects. Here is a sample of the timeout:
00000100:00000100:15.0:1437780834.157231:0:12825:0:(client.c:1976:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1437780188/real 1437780188] req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:Xru/2/ffffffff rc -11/-1
00000100:00000200:15.0:1437780834.157246:0:12825:0:(events.c:99:reply_in_callback()) @@@ type 6, status 0 req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:Xru/2/ffffffff rc -11/-1
00000100:00000200:15.0:1437780834.157250:0:12825:0:(events.c:120:reply_in_callback()) @@@ unlink req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
00000400:00000200:15.0:1437780834.157254:0:12825:0:(lib-md.c:73:lnet_md_unlink()) Unlinking md ffff88072424e1c0
00000100:02000400:15.0:1437780834.157258:0:12825:0:(import.c:179:ptlrpc_set_import_discon()) lsd-OST003a-osc-MDT0000: Connection to lsd-OST003a (at 172.19.2.163@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
Thread 12825 (ptlrpcd_14) is repeatedly timing out an opcode 6 (OST_DESTROY) request to OST lsd-OST003a. Interestingly, the request has the same xid (x1506778035514940) each time. I cannot, however, see any resend of the request, so the OSS never really gets an opportunity to reply to the message. It is especially strange that the send time is updated each time even though the message does not appear to be resent.
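For what it's worth, the "no resend" observation above can be checked mechanically by filtering a debug-log dump for the request's xid and seeing whether any actual send events appear between the expirations. This is only a sketch: the log path and the inline sample are assumptions (on a live node the dump would come from `lctl dk`), and the `ptl_send_rpc` marker is just the send-side function one would look for.

```shell
# Sketch: does xid x1506778035514940 ever get resent, or only expired?
# LOG path and sample contents are assumptions for illustration; a real
# dump would be captured with something like: lctl dk > /tmp/lustre.dk
LOG=/tmp/sample.dk

# Inline stand-in for real `lctl dk` output (two expiry-side lines,
# no send-side lines), mirroring the trace in this ticket.
cat > "$LOG" <<'EOF'
00000100:00000100:15.0:1437780834.157231:0:12825:0:(client.c:1976:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: req@ffff880c55ff7400 x1506778035514940/t0(0)
00000100:00000200:15.0:1437780834.157246:0:12825:0:(events.c:99:reply_in_callback()) @@@ type 6, status 0 req@ffff880c55ff7400 x1506778035514940/t0(0)
EOF

# How many times the xid shows up at all:
grep -c 'x1506778035514940' "$LOG"

# How many of those are send-side events (ptl_send_rpc); zero here,
# consistent with the request timing out without ever being resent.
# (grep -c prints 0 but exits nonzero on no match, hence the || true.)
grep -c 'ptl_send_rpc' "$LOG" || true
```

With the sample above this prints 2 and then 0: the xid is seen only in expiry/callback paths, never in a send path.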
This problem has persisted through reboots of both the MDS and OSTs. The sysadmins tried failing over the OST, and the problem followed the OST to the failover OSS (that part probably isn't terribly surprising).
This has persisted for weeks now.
Currently, the servers are running Lustre version 2.5.4-4chaos.
Issue Links
- is related to LU-5242 "Test hang sanity test_132, test_133: umount ost" (Resolved)