[LU-11693] Soft lockups on Lustre clients Created: 22/Nov/18  Updated: 28/Feb/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.2
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Campbell Mcleay (Inactive) Assignee: Jian Yu
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File build.log    
Issue Links:
Duplicate
duplicates LU-11391 soft lockup in ldlm_prepare_lru_list() Open
duplicates LU-9230 soft lockup on v2.9 Lustre clients (l... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We get quite a few soft lockups on our Lustre gateways (Lustre clients that export Lustre filesystems over NFS). Example:

Nov 13 00:26:06 foxtrot2 kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [nfsd:11973]
Nov 13 00:26:06 foxtrot2 kernel: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [rsync:36079]
Nov 13 00:26:06 foxtrot2 kernel: Modules linked in: vfat fat dm_service_time mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase nfsv3 nfs fscache osc(OE) mgc(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) dell_rbu libcfs(OE) bonding sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel iTCO_wdt iTCO_vendor_support kvm joydev dcdbas irqbypass sg shpchp ipmi_si ipmi_devintf ipmi_msghandler lpc_ich mei_me mei acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc ip_tables xfs sd_mod crc_t10dif crct10dif_generic 8021q garp stp llc mrp mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi bnx2x syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ahci drm ghash_clmulni_intel
Nov 13 00:26:06 foxtrot2 kernel: libahci aesni_intel dm_multipath libata lrw gf128mul glue_helper ablk_helper cryptd megaraid_sas i2c_core ptp pps_core mdio libcrc32c wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: usb_storage]
Nov 13 00:26:06 foxtrot2 kernel: CPU: 1 PID: 36079 Comm: rsync Tainted: G W OE ------------ 3.10.0-693.5.2.el7_lustre.x86_64 #1
Nov 13 00:26:06 foxtrot2 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.5.4 01/22/2016
Nov 13 00:26:06 foxtrot2 kernel: task: ffff883ff8a04f10 ti: ffff8815a1200000 task.ti: ffff8815a1200000
Nov 13 00:26:06 foxtrot2 kernel: RIP: 0010:[<ffffffff810fa332>] [<ffffffff810fa332>] native_queued_spin_lock_slowpath+0x112/0x1e0
Nov 13 00:26:06 foxtrot2 kernel: RSP: 0018:ffff8815a1203700 EFLAGS: 00000246
Nov 13 00:26:06 foxtrot2 kernel: RAX: 0000000000000000 RBX: ffff883fff017880 RCX: 0000000000090000
Nov 13 00:26:06 foxtrot2 kernel: RDX: ffff883fff4d7880 RSI: 0000000001390101 RDI: ffff881ff99da818
Nov 13 00:26:06 foxtrot2 kernel: RBP: ffff8815a1203700 R08: ffff883fff017880 R09: 0000000000000000
Nov 13 00:26:06 foxtrot2 kernel: R10: 0004c5dab524ba0b R11: 0000000000000000 R12: 0004c5dab524ba0b
Nov 13 00:26:06 foxtrot2 kernel: R13: 0000000000000000 R14: 0004c5dab39dc857 R15: ffff8815a12036e8
Nov 13 00:26:06 foxtrot2 kernel: FS: 00007f0ff1094740(0000) GS:ffff883fff000000(0000) knlGS:0000000000000000
Nov 13 00:26:06 foxtrot2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 13 00:26:06 foxtrot2 kernel: CR2: 00007fd6cb1e9000 CR3: 000000163eff9000 CR4: 00000000001407e0
Nov 13 00:26:06 foxtrot2 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 13 00:26:06 foxtrot2 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 13 00:26:06 foxtrot2 kernel: Stack:
Nov 13 00:26:06 foxtrot2 kernel: ffff8815a1203710 ffffffff8169e6bf ffff8815a1203720 ffffffff816abbf0
Nov 13 00:26:06 foxtrot2 kernel: ffff8815a12037a0 ffffffffc0c2d421 ffff8815a12037e0 ffffffffc0c2ba60
Nov 13 00:26:06 foxtrot2 kernel: 0000000000000000 00000161000ab602 0004c5dab524ba0b ffff88130fb65c00
Nov 13 00:26:06 foxtrot2 kernel: Call Trace:
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff8169e6bf>] queued_spin_lock_slowpath+0xb/0xf
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff816abbf0>] _raw_spin_lock+0x20/0x30
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c2d421>] ldlm_prepare_lru_list+0x361/0x4e0 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c2ba60>] ? ldlm_cancel_aged_no_wait_policy+0x70/0x70 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c30c5a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c30e8e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c31128>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc07c67a3>] mdc_intent_getattr_pack.isra.15+0x93/0x280 [mdc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc07c8f3b>] mdc_enqueue_base+0x9fb/0x18f0 [mdc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff810c45a3>] ? try_to_wake_up+0x183/0x340
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc07ca6cb>] mdc_intent_lock+0x26b/0x520 [mdc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c66243>] ? reply_in_callback+0x143/0x5e0 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0972e30>] ? ll_invalidate_negative_children+0x1d0/0x1d0 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0c2c7a0>] ? ldlm_expired_completion_wait+0x240/0x240 [ptlrpc]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0910e4f>] lmv_intent_lock+0x5cf/0x1b50 [lmv]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff810b8a01>] ? in_group_p+0x31/0x40
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc09738c5>] ? ll_i2suppgid+0x15/0x40 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0973914>] ? ll_i2gids+0x24/0xb0 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff81114b02>] ? from_kgid+0x12/0x20
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0972e30>] ? ll_invalidate_negative_children+0x1d0/0x1d0 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0974feb>] ll_lookup_it+0x29b/0xee0 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff810c8f28>] ? __enqueue_entity+0x78/0x80
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffffc0976fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff8120b3dd>] lookup_real+0x1d/0x50
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff8120bcb2>] __lookup_hash+0x42/0x60
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff816a13e2>] lookup_slow+0x42/0xa7
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff8120f25b>] path_lookupat+0x77b/0x7b0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff811df623>] ? kmem_cache_alloc+0x193/0x1e0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff81211c9f>] ? getname_flags+0x4f/0x1a0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff8120f2bb>] filename_lookup+0x2b/0xc0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff81212e37>] user_path_at_empty+0x67/0xc0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff81212ea1>] user_path_at+0x11/0x20
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff812063e3>] vfs_fstatat+0x63/0xc0
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff812069b1>] SYSC_newlstat+0x31/0x60
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff81206c3e>] SyS_newlstat+0xe/0x10
Nov 13 00:26:06 foxtrot2 kernel: [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b

 Comments   
Comment by Campbell Mcleay (Inactive) [ 22/Nov/18 ]

Versions are:

lustre-2.10.2-1.el7

kernel-3.10.0-693.5.2.el7_lustre

Comment by Andreas Dilger [ 22/Nov/18 ]

This looks like a duplicate of a previously-reported issue. Please try:

# lctl set_param ldlm.namespaces.*.lru_size=10000

on the affected clients to see whether it avoids the issue.

Comment by Campbell Mcleay (Inactive) [ 22/Nov/18 ]

Hi Andreas,

I think we set this already, when I run:

lctl get_param 'ldlm.namespaces.*.lru_size'

I get:

ldlm.namespaces.MGC10.21.22.10@tcp.lru_size=10000

where 10.21.22.10 is our MDS

Thanks,

Campbell

Comment by Campbell Mcleay (Inactive) [ 22/Nov/18 ]

I should mention that it also prints values for the MDT and all of the OSTs, which are larger than that setting, e.g.,

ldlm.namespaces.foxtrot-MDT0000-mdc-ffff883ff9b89000.lru_size=246531
ldlm.namespaces.foxtrot-OST0000-osc-ffff883ff9b89000.lru_size=17716
ldlm.namespaces.foxtrot-OST0001-osc-ffff883ff9b89000.lru_size=17472
ldlm.namespaces.foxtrot-OST0002-osc-ffff883ff9b89000.lru_size=17561
ldlm.namespaces.foxtrot-OST0003-osc-ffff883ff9b89000.lru_size=17628
ldlm.namespaces.foxtrot-OST0004-osc-ffff883ff9b89000.lru_size=17492
ldlm.namespaces.foxtrot-OST0005-osc-ffff883ff9b89000.lru_size=17555
ldlm.namespaces.foxtrot-OST0006-osc-ffff883ff9b89000.lru_size=17334
ldlm.namespaces.foxtrot-OST0007-osc-ffff883ff9b89000.lru_size=17511
ldlm.namespaces.foxtrot-OST0008-osc-ffff883ff9b89000.lru_size=17534
ldlm.namespaces.foxtrot-OST0009-osc-ffff883ff9b89000.lru_size=17689
ldlm.namespaces.foxtrot-OST000a-osc-ffff883ff9b89000.lru_size=17609
ldlm.namespaces.foxtrot-OST000b-osc-ffff883ff9b89000.lru_size=17144
ldlm.namespaces.foxtrot-OST000c-osc-ffff883ff9b89000.lru_size=17438

etc

in case that is important
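Editorial note on the exchange above: lru_size is a per-namespace tunable, so the cap has to land on the MDC and every OSC namespace, not only the MGC. A minimal sketch of applying and verifying the cap (assuming lctl is in PATH on a mounted client; note that plain set_param is not persistent across remounts, so it needs re-applying at mount time, or "lctl set_param -P" on the MGS on 2.5+):

```shell
# The wildcard matches every ldlm namespace on the client (MGC, MDC,
# and all OSCs), capping each LRU at 10000 cached locks.
lctl set_param 'ldlm.namespaces.*.lru_size=10000'

# Verify that the cap took effect on every namespace, not just the MGC.
lctl get_param 'ldlm.namespaces.*.lru_size'
```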

Comment by Andreas Dilger [ 22/Nov/18 ]

The patch https://review.whamcloud.com/33130 "LU-9230 ldlm: speed up preparation for list of lock cancel" should resolve the CPU contention in ldlm_prepare_lru_list() that you are seeing here.

This patch landed on the master branch (for 2.12) over six months ago and has already seen a lot of testing. It is in the process of landing on the b2_10 branch for the next 2.10.x release, so there is not yet a release package available that includes it. It is a client-only patch, so it can be installed on the affected nodes without taking down the whole system.

Comment by Campbell Mcleay (Inactive) [ 05/Dec/18 ]

Thanks Andreas. I was looking for a compatibility matrix to see whether 2.12 on the client is compatible with 2.10 on the server. Is there something available online that shows compatibility of releases? 

regards,

Campbell

Comment by Campbell Mcleay (Inactive) [ 05/Dec/18 ]

Actually, I see that 2.12 is not listed as supported by Whamcloud. I'll patch it then.

Comment by Peter Jones [ 05/Dec/18 ]

Campbell

2.12 is very close to release - we tagged the first RC yesterday. So, upon GA, another option besides patching will be to use a 2.12 client, as it interoperates fine with 2.10.x servers.

Peter

Comment by Campbell Mcleay (Inactive) [ 05/Dec/18 ]

Thanks Peter. I applied the patches for both the crashes and the lockups to the 2.10.2-1 source and the build fails. Can you tell me what I need to do here? The build log is attached (build.log).

Comment by Peter Jones [ 05/Dec/18 ]

Jian

Could you please assist Campbell in porting the fix for LU-9230 to 2.10.2?

thanks

Peter

Comment by Campbell Mcleay (Inactive) [ 05/Dec/18 ]

I'm also happy to apply the patches to a later supported release if that is easier...

Comment by Peter Jones [ 05/Dec/18 ]

Jian

The port already exists to b2_10 - https://review.whamcloud.com/#/c/33130/ - but does it need refreshing to apply to the tip of b2_10 or else to the 2.10.5 release?

Peter

Comment by Jian Yu [ 05/Dec/18 ]

Hi Peter,
I just checked that the patch can be applied cleanly to both the tip of b2_10 and 2.10.5 release.

Comment by Jian Yu [ 05/Dec/18 ]

Hi Campbell,
Patch https://review.whamcloud.com/33130 is now on the tip of Lustre b2_10 branch. Please find the el7 builds in https://build.whamcloud.com/job/lustre-reviews/60456/.

Comment by Campbell Mcleay (Inactive) [ 06/Dec/18 ]

Thanks Jian. I still have to add a patch for a kernel panic issue (LU-11692), so might grab the src rpm for 2.10.5 and try to patch that.

-Campbell

Comment by Campbell Mcleay (Inactive) [ 06/Dec/18 ]

2.10.5 fails to build. Should I send the build log?

Comment by Andreas Dilger [ 06/Dec/18 ]

Campbell, I'm not sure what build problem you are seeing (we build this branch daily), but I've cherry-picked the LU-11692 patch on top of 33130. It looks like the builders are a bit backed up, but there should be a link to a build reported in https://review.whamcloud.com/33798 in a couple of hours. Feel free to attach your build logs here, in case it is a trivial problem to fix.

Comment by Campbell Mcleay (Inactive) [ 06/Dec/18 ]

Hi Andreas,

I'm doing something wrong here. I cloned git://git.whamcloud.com/fs/lustre-release.git and checked out the b2_10 branch, but the files are unpatched, and I'm not quite sure how to add that patch via git; I can't find it to cherry-pick. Or can I just apply the patches manually with diff and patch? I was doing it that way before, but the build fails (whereas an unpatched tree compiles fine). Sorry for my ignorance here.

regards,

Campbell
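Editorial note: the cherry-pick Campbell is looking for lives on Gerrit, not on any branch, which is why it does not appear in a plain clone. A hedged sketch of pulling a single Gerrit change onto a local b2_10 checkout (the patchset number below is an assumption; check the download links on the review page for the current one):

```shell
# Gerrit change refs have the form
# refs/changes/<last-two-digits-of-change>/<change-number>/<patchset>.
# The patchset number here (9) is an assumption -- check the download
# links on https://review.whamcloud.com/#/c/33130/ for the current one.
change=33130
patchset=9
ref="refs/changes/$(printf '%02d' $((change % 100)))/${change}/${patchset}"
echo "$ref"

# Then, inside a clone of git://git.whamcloud.com/fs/lustre-release.git
# with the b2_10 branch checked out:
#   git fetch https://review.whamcloud.com/fs/lustre-release "$ref"
#   git cherry-pick FETCH_HEAD
```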

 

Comment by Jian Yu [ 06/Dec/18 ]

Hi Campbell,
Build https://build.whamcloud.com/job/lustre-reviews/60480/ in https://review.whamcloud.com/33798 is ready. It contains both the patches for LU-11693/LU-9230 and LU-11692/LU-11647 applied on the tip of Lustre b2_10 branch (tag 2.10.6-RC3).

Comment by Andreas Dilger [ 07/Dec/18 ]

Campbell, what process are you using to build, and what files are "unpatched"? I'd recommend following e.g. https://wiki.whamcloud.com/pages/viewpage.action?pageId=52104622 or http://wiki.lustre.org/Compiling_Lustre if you've never done this before. At its simplest, doing "sh autogen.sh; ./configure; make rpms" is all that is needed once you have the kernel source RPMs, but it can become more complex if you are using OFED, ZFS, etc.

As Jian wrote, it is a lot easier to use a pre-built package if that has the features you need.
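Editorial note: the minimal flow Andreas describes can be sketched as follows, run from a checkout of the b2_10 branch. The `--disable-server` flag (a client-only build) is an addition here, not part of his quoted command, and OFED/ZFS setups need extra configure options:

```shell
# Regenerate the configure script from the checked-out sources.
sh autogen.sh

# Configure a client-only build against the running kernel's headers.
./configure --disable-server

# Build binary RPMs (lustre-client, kmod-lustre-client, etc.).
make rpms
```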

Comment by Campbell Mcleay (Inactive) [ 07/Dec/18 ]

Hi Andreas,

I cloned the lustre repo and then checked out the b2_10 branch. I then ran autogen, copied the spec file to my rpmbuild tree, tarred up the source, and copied it to rpmbuild/SOURCES. I was expecting b2_10 to already be patched, but a comparison showed it hadn't been. I created a patch file from a recursive diff and modified the spec file to apply that patch. I then built a source rpm and tried an rpm rebuild. I was getting build errors, e.g.,

 /u/cmcl/rpmbuild/BUILD/lustre-2.10.2/lustre/include/lustre_lib.h:357:9: error: implicit declaration of function 'is_bl_done' [-Werror=implicit-function-declaration]
 struct l_wait_info *__info = (info); \
 ^
/u/cmcl/rpmbuild/BUILD/lustre-2.10.2/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.c:2330:3: note: in expansion of macro 'l_wait_event'
 l_wait_event(lock->l_waitq, is_bl_done(lock), &lwi)

I'm doing something wrong and/or doing this in an overly complicated way. I thought the b2_10 branch would have already been patched.
I'd seen the Whamcloud wiki page you mentioned but thought it was for servers rather than clients; the wiki.lustre.org page I hadn't seen.
Anyway, I found a build on https://review.whamcloud.com/33798 linked by Jian that has the patches in it, so I'll build from that. Sorry for wasting your time with this, but hopefully I'll be on the right track from here on in.

Cheers,

Campbell

Comment by Peter Jones [ 07/Dec/18 ]

Glad to hear that you've got this sorted out. Let us know whether the fix works as expected.

Comment by Campbell Mcleay (Inactive) [ 07/Dec/18 ]

I've built the rpms fine, but I have another question: the client has the Lustre kernel package installed (I'm told the Lustre kernel has better performance than a vanilla kernel), which provides the fs and net kernel modules. The kmod-lustre-client package also provides kernel modules, though it installs them in /lib/modules/`uname -r`/extra/lustre-client rather than /lib/modules/`uname -r`/extra/lustre. Will having both installed cause any kind of issue, or is it better to install, e.g., a vanilla kernel and rebuild the packages against that?

Thanks,

Campbell

Comment by Jian Yu [ 07/Dec/18 ]

Hi Campbell,
The Lustre client is patchless, which means that when building Lustre code for a client, we do not need to patch the vendor or vanilla Linux kernel. All of the regression testing is performed on patchless Lustre clients, so we suggest using the vendor kernel.

Comment by Peter Jones [ 07/Dec/18 ]

Campbell

Even the servers only need to be patched if you are using the project quotas feature. The patches that gave performance improvements in past versions have now been upstreamed, and many customers prefer the simplified administration over the project quotas feature...

Peter

Comment by Campbell Mcleay (Inactive) [ 28/Feb/19 ]

Just some feedback: we got some soft lockups on one of our clients, though it only happened once. The other clients have been fine.

Generated at Sat Feb 10 02:46:07 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.