[LU-3295] ASSERTION(list_empty(&service->srv_active_rqbds)) failed Created: 08/May/13  Updated: 09/May/14  Resolved: 09/May/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.9
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Kit Westneat (Inactive) Assignee: Hongchao Zhang
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL5: 2.6.18-348.1.1.el5_lustre


Attachments: File May10.kern.log.mds-2-1     File May10.kern.log.oss-2-09    
Severity: 3
Rank (Obsolete): 8161

 Description   

Recently we have been hitting an LBUG during Lustre startup. It appears to have happened before 1.8.9 as well, but it seems to be much more frequent since the upgrade. During startup our HA software attempts to mount all the resources in parallel. It looks like occasionally an OST attempts to contact the MGS before it is mounted, and we get messages like:

May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_config.c:372:class_setup()) setup OSS failed (-22)
May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_mount.c:483:lustre_start_simple()) OSS setup error -22
May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_mount.c:1096:server_start_targets()) failed to start OSS: -22
May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_mount.c:1672:server_fill_super()) Unable to start targets: -22
May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_mount.c:1455:server_put_super()) no obd scratch2-OST00a7
May 7 17:36:35 lfs-oss-2-16 kernel: LustreError: 8596:0:(obd_mount.c:149:server_deregister_mount()) scratch2-OST00a7 not registered
May 7 17:36:40 lfs-oss-2-16 kernel: Lustre: 19173:0:(client.c:1529:ptlrpc_expire_one_request()) @@@ Request x1431216012696932 sent from MGC10.175.31.242@o2ib4 to NID 10.175.31.242@o2ib4 5s ago has timed out (5s prior to deadline).
May 7 17:36:40 lfs-oss-2-16 kernel: req@ffff8105831f2c00 x1431216012696932/t0 o250->MGS@MGC10.175.31.242@o2ib4_0:26/25 lens 368/584 e 0 to 1 dl 1367948200 ref 1 fl Rpc:N/0/0 rc 0/0
May 7 17:36:41 lfs-oss-2-16 kernel: Lustre: server umount scratch2-OST00a7 complete

The MGS then starts, and the other OSTs on the OSS begin starting up. Then we hit:
May 7 17:37:38 lfs-oss-2-16 kernel: LustreError: 10235:0:(service.c:2041:ptlrpc_unregister_service()) ASSERTION(list_empty(&service->srv_active_rqbds)) failed
May 7 17:37:38 lfs-oss-2-16 kernel: LustreError: 10235:0:(service.c:2041:ptlrpc_unregister_service()) LBUG

My guess is that it is some kind of race between mount and umount. There is the obvious workaround of making sure that the MGS is started, and we will do that in the future, but the HA isn't perfect and sometimes it just happens that the MGS isn't there.



 Comments   
Comment by Peter Jones [ 09/May/13 ]

Hongchao

Could you please look into this one?

Thanks

Peter

Comment by Hongchao Zhang [ 10/May/13 ]

Hi, could you please attach the full logs for this issue, especially the logs around the ASSERTION? Thanks!

Comment by Kit Westneat (Inactive) [ 10/May/13 ]

I need to raise this to a Sev 1; we are unable to start the filesystem due to these LBUGs.

Comment by Kit Westneat (Inactive) [ 10/May/13 ]

Some more info: we were able to manually mount the OSTs and get everything working. While mounting manually, we ran into the "not registered" error on some of the OSTs. It always seemed to be the first OST to be mounted, and it seemed as if I had to run lustre_rmmod to get it to work. We were unable to get more debugging information, but I think this helps to verify the timeline:

1) Linux HA attempts to do a parallel mount of several OSTs.
2) The first OST fails due to a new bug involving registration and/or MGS communication.
3) While the first OST is going through a umount, the second OST successfully mounts.
4) This triggers the LBUG in ptlrpc_unregister_service.

Let me know if there is any more information we can gather.

Comment by Oleg Drokin [ 10/May/13 ]

It looks like the OST threads failed to start, or there were already enough started. We probably should add a bit more debug to see exactly why it failed to start those threads.

--- a/lustre/ost/ost_handler.c
+++ b/lustre/ost/ost_handler.c
@@ -2109,8 +2109,10 @@ static int ost_setup(struct obd_device *obd, obd_count len, void *buf)
         }
 
         rc = ptlrpc_start_threads(obd, ost->ost_service);
-        if (rc)
+        if (rc) {
+		CERROR("Failed to start ost threads %d\n", rc);
                 GOTO(out_service, rc = -EINVAL);
+	}
 
         if (oss_num_create_threads) {
                 if (oss_num_create_threads > OSS_MAX_CREATE_THREADS)
@@ -2139,8 +2141,10 @@ static int ost_setup(struct obd_device *obd, obd_count len, void *buf)
         }
 
         rc = ptlrpc_start_threads(obd, ost->ost_create_service);
-        if (rc)
+        if (rc) {
+		CERROR("Failed to start ost create threads %d\n", rc);
                 GOTO(out_create, rc = -EINVAL);
+	}
 
         ost->ost_io_service =
                 ptlrpc_init_svc(OST_NBUFS, OST_BUFSIZE, OST_MAXREQSIZE,
@@ -2159,8 +2163,10 @@ static int ost_setup(struct obd_device *obd, obd_count len, void *buf)
         ost->ost_io_service->srv_done = ost_thread_done;
         ost->ost_io_service->srv_cpu_affinity = 1;
         rc = ptlrpc_start_threads(obd, ost->ost_io_service);
-        if (rc)
+        if (rc) {
+		CERROR("Failed to start ost io threads %d\n", rc);
                 GOTO(out_io, rc = -EINVAL);
+	}
 
         ping_evictor_start();

Something like that, as a first step.

Comment by Kit Westneat (Inactive) [ 10/May/13 ]

Here are the decoded lustre logs. There are also some from Tuesday's downtime (May 07):
http://eu.ddn.com:8080/lustre/LU-3295-llogs.tgz

Comment by Hongchao Zhang [ 13/May/13 ]

This issue is caused by a conflict between recovery and ptlrpc_unregister_service (see the sketch below):
1. Recovery queues the ptlrpc_request onto "obd->obd_delayed_reply_queue" for the last replay reply.
2. In ptlrpc_unregister_service, all ptlrpc_request_buffer_descs are unlinked, but one of them still holds a ptlrpc_request (the last replay reply), so it cannot be released, which triggers this assertion.
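
To make this concrete, here is a minimal sketch of the teardown invariant involved. It is a paraphrase, not the actual lustre/ptlrpc/service.c code; release_rqbd_if_unused() is a hypothetical helper, and the list member names follow my reading of 1.8:

/* Sketch only (not the actual service.c code).  Each
 * ptlrpc_request_buffer_desc (rqbd) sits on srv_active_rqbds while any
 * request received into its buffer is still alive; it only moves back to
 * srv_idle_rqbds once the last such request is freed. */
static void unregister_service_sketch(struct ptlrpc_service *svc)
{
        struct ptlrpc_request_buffer_desc *rqbd, *tmp;

        /* ... unlink the network buffers and drain the request queue ... */

        list_for_each_entry_safe(rqbd, tmp, &svc->srv_active_rqbds,
                                 rqbd_list) {
                /* hypothetical helper: frees the rqbd only once no request
                 * still references its buffer */
                release_rqbd_if_unused(rqbd);
        }

        /* A reply parked on obd->obd_delayed_reply_queue (the last replay
         * reply queued by recovery) still references its rqbd, so that rqbd
         * never leaves srv_active_rqbds and the assertion fires: */
        LASSERT(list_empty(&svc->srv_active_rqbds));
}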

This issue can be reproduced by manually queueing a last replay reply and then launching a kernel thread to stop the OST service, as sketched below.
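
Roughly, the reproduction looks like the following sketch (illustrative only, not the actual test code; class_manual_cleanup and kthread_run are real, but fake_queue_last_replay_reply() stands in for whatever hook is used to park the reply on obd_delayed_reply_queue, and the Lustre internal headers are assumed to be included):

#include <linux/kthread.h>

static int stop_ost_service_thread(void *data)
{
        struct obd_device *obd = data;

        /* Tearing the OST service down while the delayed reply is still
         * queued races with it and trips
         * ASSERTION(list_empty(&service->srv_active_rqbds)). */
        class_manual_cleanup(obd);
        return 0;
}

static void reproduce_lu3295(struct obd_device *obd,
                             struct ptlrpc_request *req)
{
        /* 1. Park a request on obd->obd_delayed_reply_queue, as recovery
         *    does for the last replay reply (hypothetical helper). */
        fake_queue_last_replay_reply(obd, req);

        /* 2. Stop the OST service from a separate kernel thread. */
        kthread_run(stop_ost_service_thread, obd, "lu3295_stop_ost");
}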

There is one more question: what caused ost_setup to fail? There is a CERROR at each possible failure point, ptlrpc_init_svc and
ptlrpc_start_threads (which emits a CERROR if a thread fails to start, except for -EMFILE).

Hi Kit, do you specify ost_num_threads, oss_num_threads or oss_num_create_threads when you load ost.ko (which could trigger -EMFILE)?
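
For context, the -EMFILE case comes from the thread-start path when a service is already running its maximum number of threads, roughly like this (a simplified paraphrase, not the exact 1.8 source):

/* Paraphrased sketch of why -EMFILE is the one start failure that does not
 * get a CERROR: the service is simply already at its maximum thread count,
 * which is sized from oss_num_threads at module load. */
static int start_one_thread_sketch(struct ptlrpc_service *svc)
{
        if (svc->srv_threads_started >= svc->srv_threads_max)
                return -EMFILE;  /* already at the limit */

        /* ... otherwise actually create and start the service thread ... */
        return 0;
}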

Comment by Kit Westneat (Inactive) [ 13/May/13 ]

Hi Hongchao,

It looks like we are specifying:
options ost oss_num_threads=256

Do you think removing this line will work around the issue?

Thanks.

Comment by Hongchao Zhang [ 15/May/13 ]

Hi, Kit

Sorry for the delayed response!
Yes, removing this line should work around the issue.

Comment by Kit Westneat (Inactive) [ 15/May/13 ]

Hi Hongchao,

Were there any changes between 1.8.8 (or perhaps 1.8.7) and 1.8.9 that would cause this bug to happen more often? We never used to see it until we upgraded to 1.8.9.

Thanks.

Comment by Hongchao Zhang [ 17/May/13 ]

I have checked the related code in git and found nothing that would explain this issue. Do you use any additional patches on top of 1.8.9 (especially ones ported from master)?
Normally, setting "options ost oss_num_threads=256" cannot cause "ptlrpc_start_threads" to fail with -EMFILE. Could you please test it again with "rpctrace" and "trace" enabled,
by running 'echo "+rpctrace +trace" > /proc/sys/lnet/debug'? Thanks in advance!

Comment by Kit Westneat (Inactive) [ 17/May/13 ]

We are using WC-built Lustre with no patches. The only other module option we use is "options ko2iblnd peer_timeout=0".

I will try to see if I can reproduce the issue on their test system.

Comment by Kit Westneat (Inactive) [ 21/May/13 ]

Also, are there plans to write a patch to fix the conflict between the recovery and ptlrpc_unregister_service?

Thanks.

Comment by Hongchao Zhang [ 24/May/13 ]

Okay, I'll create a patch to fix the conflict first, but that patch may not be enough if the reason ptlrpc_unregister_service ends up being triggered
from ost_setup remains unclear; that may need a separate patch. One possible direction is sketched below.
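
Purely as a sketch (not necessarily what the final patch will do): complete or discard any replies still parked on obd->obd_delayed_reply_queue before the service frees its request buffer descriptors, so every rqbd can return to srv_idle_rqbds. The queue and request fields are Lustre's, but the locking shown (obd_dev_lock) and the release_delayed_reply() helper are assumptions for illustration:

/* Sketch only: drain delayed replies before tearing the service down. */
static void drain_delayed_replies_sketch(struct obd_device *obd)
{
        struct ptlrpc_request *req, *tmp;

        spin_lock(&obd->obd_dev_lock);
        list_for_each_entry_safe(req, tmp, &obd->obd_delayed_reply_queue,
                                 rq_list) {
                list_del_init(&req->rq_list);
                spin_unlock(&obd->obd_dev_lock);
                /* Releasing the request lets its buffer go back to its rqbd,
                 * which can then return to srv_idle_rqbds.
                 * (helper name assumed, not necessarily the 1.8 API) */
                release_delayed_reply(req);
                spin_lock(&obd->obd_dev_lock);
        }
        spin_unlock(&obd->obd_dev_lock);
}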

Comment by Hongchao Zhang [ 30/May/13 ]

Status update:
the patch to fix the conflict between recovery and ptlrpc_unregister_service has been created, but it is hitting some problems in testing. I will attach it
as soon as it is complete.

Comment by Hongchao Zhang [ 31/May/13 ]

The patch is tracked at http://review.whamcloud.com/#change,6512
