[LU-4832] verbose warnings logged at client umount time Created: 28/Mar/14  Updated: 19/Aug/15  Resolved: 14/May/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Minor
Reporter: Bob Glossman (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None
Environment:

sles11sp3 clients


Issue Links:
Related
Severity: 3
Rank (Obsolete): 13319

 Description   

Every time I unmount a Lustre filesystem on a client, I see a very verbose warning like this:

Mar 28 11:06:59 susesp3-3 kernel: [  105.665817] Lustre: Unmounted lustre-client
Mar 28 11:06:59 susesp3-3 kernel: [  105.667023] ------------[ cut here ]------------
Mar 28 11:06:59 susesp3-3 kernel: [  105.667031] WARNING: at fs/proc/generic.c:809 remove_proc_entry+0x22d/0x280()
Mar 28 11:06:59 susesp3-3 kernel: [  105.667032] Hardware name: VMware Virtual Platform
Mar 28 11:06:59 susesp3-3 kernel: [  105.667033] name 'nrs_tbf_quantum'
Mar 28 11:06:59 susesp3-3 kernel: [  105.667034] Modules linked in: osc(FN) mgc(FN) lustre(FN) lov(FN) mdc(FN) fid(FN) lmv(FN) fld(FN) ksocklnd(FN) ptlrpc(FN) obdclass(FN) lnet(FN) sha512_generic(FN) sha1_generic(FN) md5(FN) crc32c(FN) libcfs(FN) lp(FN) binfmt_misc(FN) snd_pcm_oss(FN) snd_mixer_oss(FN) snd_seq_midi(FN) snd_seq_midi_event(FN) snd_seq(FN) edd(FN) rdma_ucm(FN) rdma_cm(FN) iw_cm(FN) ib_addr(FN) ib_srp(FN) scsi_transport_srp(FN) scsi_tgt(FN) ib_ipoib(FN) ib_cm(FN) ib_uverbs(FN) ib_umad(FN) iw_cxgb3(FN) cxgb3(FN) mdio(FN) mlx4_en(FN) mlx4_ib(FN) ib_sa(FN) mlx4_core(FN) ib_mthca(FN) ib_mad(FN) ib_core(FN) mperf(FN) acpiphp(FN) microcode(FN) fuse(FN) loop(FN) dm_mod(FN) snd_ens1371(FN) gameport(FN) snd_rawmidi(FN) snd_seq_device(FN) ipv6(FN) snd_ac97_codec(FN) ipv6_lib(FN) btusb(FN) bluetooth(FN) ac97_bus(FN) snd_pcm(FN) snd_timer(FN) ppdev(FN) rfkill(FN) snd(FN) vmw_balloon(FN) usbhid(FN) parport_pc(FN) hid(FN) e1000(FN) floppy(FN) sr_mod(FN) soundcore(FN) rtc_cmos(FN) crc16(FN) parport(FN) i2c_piix4(FN) 
Mar 28 11:06:59 susesp3-3 kernel: sg(FN) shpchp(FN) pciehp(FN) pcspkr(FN) snd_page_alloc(FN) mptctl(FN) acpi_memhotplug(FN) intel_agp(FN) pci_hotplug(FN) cdrom(FN) i2c_core(FN) container(FN) button(FN) ac(FN) intel_gtt(FN) ext3(FN) jbd(FN) mbcache(FN) uhci_hcd(FN) ehci_hcd(FN) sd_mod(FN) crc_t10dif(FN) usbcore(FN) usb_common(FN) processor(FN) thermal_sys(FN) hwmon(FN) scsi_dh_hp_sw(FN) scsi_dh_rdac(FN) scsi_dh_alua(FN) scsi_dh_emc(FN) scsi_dh(FN) vmw_pvscsi(FN) vmxnet3(FN) ata_generic(FN) ata_piix(FN) ahci(FN) libahci(FN) libata(FN) mptspi(FN) mptscsih(FN) mptbase(FN) scsi_transport_spi(FN) scsi_mod(FN)
Mar 28 11:06:59 susesp3-3 kernel: [  105.667091] Supported: No, Unsupported modules are loaded
Mar 28 11:06:59 susesp3-3 kernel: [  105.667094] Pid: 4811, comm: obd_zombid Tainted: GF    U     N  3.0.101-0.18-default #1
Mar 28 11:06:59 susesp3-3 kernel: [  105.667095] Call Trace:
Mar 28 11:06:59 susesp3-3 kernel: [  105.667103]  [<ffffffff81004935>] dump_trace+0x75/0x310
Mar 28 11:06:59 susesp3-3 kernel: [  105.667107]  [<ffffffff8145fcd3>] dump_stack+0x69/0x6f
Mar 28 11:06:59 susesp3-3 kernel: [  105.667113]  [<ffffffff8106063b>] warn_slowpath_common+0x7b/0xc0
Mar 28 11:06:59 susesp3-3 kernel: [  105.667115]  [<ffffffff81060735>] warn_slowpath_fmt+0x45/0x50
Mar 28 11:06:59 susesp3-3 kernel: [  105.667118]  [<ffffffff811bf14d>] remove_proc_entry+0x22d/0x280
Mar 28 11:06:59 susesp3-3 kernel: [  105.667159]  [<ffffffffa0a8fdc7>] ptlrpc_service_nrs_cleanup+0x97/0xc0 [ptlrpc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667224]  [<ffffffffa0a57bcd>] ptlrpc_unregister_service+0xdd/0x1f0 [ptlrpc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667264]  [<ffffffffa0a22a09>] ldlm_cleanup+0x379/0x630 [ptlrpc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667292]  [<ffffffffa0a22de5>] ldlm_put_ref+0x125/0x1a0 [ptlrpc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667320]  [<ffffffffa0a123fa>] client_obd_cleanup+0xda/0x2e0 [ptlrpc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667329]  [<ffffffffa0df8938>] mgc_cleanup+0x38/0xe0 [mgc]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667357]  [<ffffffffa07f4f2f>] class_decref+0x11f/0x550 [obdclass]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667387]  [<ffffffffa07d67fe>] class_export_destroy+0xfe/0x480 [obdclass]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667408]  [<ffffffffa07d6c4d>] obd_zombie_impexp_cull+0xcd/0x1e0 [obdclass]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667429]  [<ffffffffa07d6db5>] obd_zombie_impexp_thread+0x55/0x1a0 [obdclass]
Mar 28 11:06:59 susesp3-3 kernel: [  105.667438]  [<ffffffff810828a6>] kthread+0x96/0xa0
Mar 28 11:06:59 susesp3-3 kernel: [  105.667441]  [<ffffffff8146bb64>] kernel_thread_helper+0x4/0x10
Mar 28 11:06:59 susesp3-3 kernel: [  105.667444] ---[ end trace b99a6b85a455c4ed ]---

This comes from the following Lustre code in ptlrpc/nrs_tbf.c:

void nrs_tbf_lprocfs_fini(struct ptlrpc_service *svc)
{
        if (svc->srv_procroot == NULL)
                return;

        lprocfs_remove_proc_entry("nrs_tbf_quantum", svc->srv_procroot);
}

I'm certain this warning is only seen on SLES kernels, because the SLES version of the kernel API remove_proc_entry() is much more verbose than the RHEL one.
In SLES11SP3 the error return in remove_proc_entry() when the entry is not found is:

        if (!de) {
                WARN(1, "name '%s'\n", name);
                return;
        }

In RHEL6.5 it is:

        if (!de)
                return;

I'm pretty sure the bad call to lprocfs_remove_proc_entry() always happens, but it is only noisy and only logs warnings on SLES.
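
To illustrate the difference between the two error paths quoted above, here is a minimal user-space sketch. It is not kernel code: struct fake_proc_dir, fake_find() and fake_remove() are hypothetical stand-ins invented for this example, and the `noisy` flag is only an assumption used to model the SLES11SP3-style WARN versus the RHEL6.5-style silent return.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for a proc directory: a flat list of entry names. */
struct fake_proc_dir {
        const char *names[MAX_ENTRIES];
        int         count;
        int         warnings;   /* number of WARN-style messages emitted */
};

/* Return the index of `name` in the directory, or -1 if it is absent. */
static int fake_find(const struct fake_proc_dir *dir, const char *name)
{
        for (int i = 0; i < dir->count; i++)
                if (strcmp(dir->names[i], name) == 0)
                        return i;
        return -1;
}

/*
 * Remove `name` from the directory.
 * noisy != 0 models the SLES11SP3-style path (warn when the entry is missing);
 * noisy == 0 models the RHEL6.5-style path (return silently).
 * Returns 0 on success, -1 if the entry was not found.
 */
static int fake_remove(struct fake_proc_dir *dir, const char *name, int noisy)
{
        int idx = fake_find(dir, name);

        if (idx < 0) {
                if (noisy) {
                        fprintf(stderr, "WARNING: name '%s'\n", name);
                        dir->warnings++;
                }
                return -1;
        }
        /* Fill the hole with the last entry and shrink the list. */
        dir->names[idx] = dir->names[--dir->count];
        return 0;
}
```

In this model, removing a name that was never registered fails the same way in both modes; only the noisy mode leaves a trace in the log, which matches the observation that the bad call is probably present everywhere but only visible on SLES.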



 Comments   
Comment by Jodi Levi (Inactive) [ 01/Apr/14 ]

James,
Is this something that you are working on?
Thank you!

Comment by James A Simmons [ 01/Apr/14 ]

This is good to know. LU-3319 hasn't gotten around to handling the NRS TBF properly yet, so this is something to keep in mind.

Comment by Jeff Mahoney [ 30/Apr/14 ]

This looks like it's because the file "nrs_tbf_rule" was added, but we're trying to remove "nrs_tbf_quantum".

It's a bug in the original commit: 33e35c0bf2 (LU-3558 ptlrpc: Add the NRS TBF policy)
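
The name mismatch can be sketched in user space as follows. This is an illustrative model only, not the Lustre code or the eventual patch: struct fake_service, fake_init(), fake_remove_entry(), fake_fini_buggy() and fake_fini_fixed() are hypothetical names, and the sketch assumes the fix is simply to remove the same name that was registered.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a service with a single registered proc entry. */
struct fake_service {
        const char *procroot_entry;   /* registered entry name, NULL if none */
};

/* Setup registers the entry under the name "nrs_tbf_rule". */
static void fake_init(struct fake_service *svc)
{
        svc->procroot_entry = "nrs_tbf_rule";
}

/* Remove the entry only if `name` matches; return 1 if removed, 0 otherwise. */
static int fake_remove_entry(struct fake_service *svc, const char *name)
{
        if (svc->procroot_entry == NULL ||
            strcmp(svc->procroot_entry, name) != 0)
                return 0;
        svc->procroot_entry = NULL;
        return 1;
}

/* Mirrors the reported behaviour: cleanup asks for a name that was never added. */
static int fake_fini_buggy(struct fake_service *svc)
{
        return fake_remove_entry(svc, "nrs_tbf_quantum");
}

/* Assumed correction: cleanup uses the same name that setup registered. */
static int fake_fini_fixed(struct fake_service *svc)
{
        return fake_remove_entry(svc, "nrs_tbf_rule");
}
```

With the buggy variant the registered entry is left behind and the lookup misses, which is the condition that triggers the WARN on SLES. With the name corrected, the entry is found and removed.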

Comment by Bob Glossman (Inactive) [ 06/May/14 ]

While there may be a more complete rework of lproc handling for LU-3319 coming along later, as James says, just fixing the flaw Jeff pointed out is worth the trouble in the meantime.

Pushing a patch to just correct the bogus string:
http://review.whamcloud.com/10241

Comment by James A Simmons [ 07/May/14 ]

Thank you for fixing this. I didn't get a chance to get to this.

Comment by James A Simmons [ 14/May/14 ]

Patch landed to master. Ticket can be closed.

Comment by Peter Jones [ 14/May/14 ]

Landed for 2.6

Generated at Sat Feb 10 01:46:14 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.