Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Fix Version: Lustre 2.2.0
- Labels: None
- Severity: 3
- Rank: 4687
Description
Found during IR (Imperative Recovery) testing at ORNL.
Shortly after MDS startup, once clients start hitting it, all mdt_xx threads begin consuming every CPU available.
We took a sysrq-t dump, and all of them are in grow_rqbd.
I checked the code: once a thread is in that state, it runs an unbreakable loop that performs 64 * num_online_cpus() (= 16 here, so 1024) allocations of 16k each, about 16 MB per pass.
The condition to enter the loop is racy: it only checks that the number of posted rqbds is below nbuf_group/2.
So if 1000 threads pass that check at the same time, we get 1000 threads each doing 1024 of those allocations.
We have a kdump log, but it still needs to be transferred.
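The failure mode described above is a classic check-then-act race. The following is a minimal userspace model, not the Lustre code: all names (posted, NBUFS, GROW_BATCH, NTHREADS) and the thread/batch counts are invented for illustration. It shows how an unserialized "posted rqbds < half the target" check lets every service thread decide to grow the pool at once, and how one possible remedy, re-checking the condition under a lock so only a single thread grows the pool, bounds the growth. This is a sketch of the general fix pattern, not the actual patch; the real change is in the commit referenced under "Integrated in" below.

/* Minimal userspace model of the race described above; NOT the Lustre code.
 * All names and constants here are invented for illustration. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS   64          /* stand-in for the mdt_xx service threads  */
#define NBUFS      1024        /* stand-in for the buffer-group target     */
#define GROW_BATCH 1024        /* 64 * num_online_cpus() per the report    */

static atomic_int posted;              /* rqbds currently posted           */
static atomic_long allocations;        /* total 16k allocations performed  */
static pthread_mutex_t grow_lock = PTHREAD_MUTEX_INITIALIZER;

/* Racy version: every thread that sees the pool half-empty grows it. */
static void *grow_racy(void *arg)
{
    (void)arg;
    if (atomic_load(&posted) < NBUFS / 2) {       /* check ...             */
        for (int i = 0; i < GROW_BATCH; i++) {    /* ... then act, unserialized */
            atomic_fetch_add(&allocations, 1);    /* models one 16k allocation  */
            atomic_fetch_add(&posted, 1);
        }
    }
    return NULL;
}

/* One possible remedy (a sketch, not the actual patch): re-check the
 * condition under a lock so only one thread grows the pool at a time. */
static void *grow_serialized(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&grow_lock);
    if (atomic_load(&posted) < NBUFS / 2) {
        for (int i = 0; i < GROW_BATCH; i++) {
            atomic_fetch_add(&allocations, 1);
            atomic_fetch_add(&posted, 1);
        }
    }
    pthread_mutex_unlock(&grow_lock);
    return NULL;
}

static long run(void *(*fn)(void *))
{
    pthread_t tid[NTHREADS];
    atomic_store(&posted, 0);
    atomic_store(&allocations, 0);
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, fn, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&allocations);
}

int main(void)
{
    printf("racy:       %ld allocations\n", run(grow_racy));       /* up to 64 * 1024 */
    printf("serialized: %ld allocations\n", run(grow_serialized)); /* exactly 1024    */
    return 0;
}

Built with cc -pthread, the racy run typically reports many multiples of one batch (every thread that won the race did its own 1024 allocations), while the serialized run always reports exactly one batch. Scaled to 1000 MDS service threads each allocating roughly 16 MB, this is the CPU and memory churn seen in the sysrq-t dump.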
Attachments
Issue Links
Trackbacks
- Changelog 2.1: Changes from version 2.1.1 to version 2.1.2. Server support for kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1....
- Changelog 2.2: version 2.2.0. Support for networks: o2iblnd (OFED 1.5.4). Server support for kernels: 2.6.32-220.4.2.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-274.18.1.el5 (RHEL5), 2.6.32-220.4.2.el6 (RHEL6), 2.6.32.36-0....
Integrated in
lustre-b2_1 » x86_64,client,el5,ofa #41
LU-1212 ptlrpc: ptlrpc_grow_req_bufs is racy (Revision 67b5f9305a080885c9a2a2bc08d07e4e227308e4)
Result = SUCCESS
Oleg Drokin : 67b5f9305a080885c9a2a2bc08d07e4e227308e4
Files: