[LU-1687] Unloading lustre modules and reloading again leaves MDS with an empty /proc/fs/lustre Created: 27/Jul/12  Updated: 19/Apr/19  Resolved: 06/Aug/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.2
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Jay Lan (Inactive) Assignee: Lai Siyao
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

https://github.com/jlan/lustre-nas/tree/nas-2.1.2
sanity-quota test_32
mds: service337
oss1: service261
oss2: service262
client: service331, service332


Attachments: File nas-config.sh.rhel62.212     File nas-make.sh.rhel62.212     File sanity-quota.test_32.tgz    
Issue Links:
Related
Severity: 3
Rank (Obsolete): 6179

 Description   

The sanity-quota test_32 failed in my testing.
After the modules were unloaded and reloaded, the MDS was left with an empty /proc/fs/lustre/: no files or subdirectories under that directory. I had to reboot the MDS and the clients to get them back. There were no error messages on service337 (the MDS) about problems unloading or loading the modules.

The problem is reproducible. Fortunately, we do not need to perform this sequence of operations on the MDS often.



 Comments   
Comment by Peter Jones [ 27/Jul/12 ]

Lai

Could you please look into this one?

Thanks

Peter

Comment by Lai Siyao [ 30/Jul/12 ]

I can't reproduce this in my setup; I'll look further into the debug log and the code to find the cause.

Comment by Jay Lan (Inactive) [ 30/Jul/12 ]

Fortunately I am still able to reproduce it. Let me know what I can do to help debug this problem.

BTW, our lustre source can be git cloned from
https://github.com/jlan/lustre-nas/commits/nas-2.1.2

Comment by Lai Siyao [ 31/Jul/12 ]

I can't build lustre against your git code because LUSTRE_KERNEL_VERSION is undefined. Git log shows it was removed in commit c2751b31e55518d1791cd5b87adc842f4fbbee83; could you help verify that? And if the code builds on your system, could you post the output of /proc/fs/lustre/version?

Comment by Jay Lan (Inactive) [ 31/Jul/12 ]

I checked again; commit c2751b3 is in the nas-2.1.2 branch. It was not removed or reverted.

service337 ~ # cat /proc/fs/lustre/version
lustre: 2.1.2
kernel: 2.6.32-220.4.1.el6.20120607.x86_64.lustre212
build: 2nasS_ofed154
service337 ~ #

I built the 2.6.32-220.4.1.el6 kernel with the el6 kernel_patches from the 2.1.2 branch. Your kernel may be named differently. Also, I used the 1.5.4.1 ofa_kernel modules.

I have attached the two script files I used for my build: nas-config.sh.rhel62.212 and nas-make.sh.rhel62.212.
They are for your reference; you should adjust them for your target system.

Comment by Lai Siyao [ 01/Aug/12 ]

Yes, I can compile with your scripts, but this test still passes on my system. After the test fails, are you able to successfully mount lustre on the MDS?

Comment by Jay Lan (Inactive) [ 01/Aug/12 ]

I am able to reproduce "good" and "bad" cases without running sanity-quota test_32.

After a lustre system has been set up, the "good" operation sequence is:

  1. Start to tear down the MDS:
     - lustre_rmmod
     - umount /mnt/mds1
  2. Now recover the MDS:
     - modprobe lustre
     - mount -t lustre -o errors=panic,acl,noextents /dev/sdb1 /mnt/mds1

The filesystem will be in good shape and usable.

If I 'umount /mnt/mds1' before running 'lustre_rmmod', the empty /proc/fs/lustre problem happens after I perform the same MDS recovery operations (both sequences are condensed in the sketch below). The mount command returns and 'mount' shows mds1 mounted; however, the filesystem is not usable.

Can you try the operation sequence and let me know if you can reproduce the problem?
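
For reference, here is a condensed sketch of both sequences as I run them on service337 (the device and mount point are from my setup; adjust for yours):

# "Good" sequence: remove the modules first, then unmount, then restart
lustre_rmmod
umount /mnt/mds1
modprobe lustre
mount -t lustre -o errors=panic,acl,noextents /dev/sdb1 /mnt/mds1

# "Bad" sequence: unmount first, then remove the modules, then restart;
# after this, /proc/fs/lustre appears empty even though the mount returns
umount /mnt/mds1
lustre_rmmod
modprobe lustre
mount -t lustre -o errors=panic,acl,noextents /dev/sdb1 /mnt/mds1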

Comment by Lai Siyao [ 01/Aug/12 ]

No, I can't reproduce it here. BTW, could you explain why you need to do MDS recovery? IMO it is shut down and started up normally in your case.

Comment by Jay Lan (Inactive) [ 01/Aug/12 ]

Ah, I meant to say restart.

Somehow the restart after shutdown does not behave the same way as the initial start. I will do more debugging tomorrow.

Comment by Jay Lan (Inactive) [ 02/Aug/12 ]

I added printk calls to lprocfs_seq_create():

int lprocfs_seq_create(cfs_proc_dir_entry_t *parent, char *name, mode_t mode,
                       struct file_operations *seq_fops, void *data)
{
        struct proc_dir_entry *entry;
        ENTRY;

        LPROCFS_WRITE_ENTRY();
        entry = create_proc_entry(name, mode, parent);
        if (entry) {
                entry->proc_fops = seq_fops;
                entry->data = data;
        }
        LPROCFS_WRITE_EXIT();

        if (entry == NULL) {
                printk("lprocfs_seq_create: failed to create %s\n", name);
                RETURN(-ENOMEM);
        } else {
                printk("lprocfs_seq_create: successfully created %s\n", name);
        }

        RETURN(0);
}

And the syslog showed:
Aug 2 15:42:32 kern:info:service337 Lustre: Lustre: Build Version: 2.1nasS_ofed154
Aug 2 15:42:32 kern:warning:service337 class_procfs_init: registering /proc/fs/lustre
Aug 2 15:42:32 kern:warning:service337 lprocfs_seq_create: successfully created devices
Aug 2 15:42:32 kern:warning:service337 class_procfs_init: successfully created lustre/devices

So lprocfs_seq_create() thought /proc/fs/lustre/devices was created successfully. Yet `ls /proc/fs/lustre` returned nothing. This is weird.

I will continue looking into this.
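
For reference, the check I run after each reload is roughly the following (the grep patterns match the printk strings quoted above):

# Reload the modules, remount the MDT, then compare the kernel log with the visible proc tree
modprobe lustre
mount -t lustre -o errors=panic,acl,noextents /dev/sdb1 /mnt/mds1
dmesg | grep -E 'class_procfs_init|lprocfs_seq_create'   # shows the "successfully created" messages
ls -la /proc/fs/lustre                                   # yet this comes back empty in the bad case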

Comment by Jay Lan (Inactive) [ 02/Aug/12 ]

Oh, in addition to the changes to lprocfs_seq_create(), I also made changes to class_procfs_init(); without them, the syslog output I quoted above would not make sense.

diff --git a/lustre/obdclass/linux/linux-module.c b/lustre/obdclass/linux/linux-module.c
index 06cda1f..05b0390 100644
--- a/lustre/obdclass/linux/linux-module.c
+++ b/lustre/obdclass/linux/linux-module.c
@@ -421,13 +421,17 @@ int class_procfs_init(void)
         int rc;
         ENTRY;
+        printk("class_procfs_init: registering /proc/fs/lustre\n");
         obd_sysctl_init();
         proc_lustre_root = lprocfs_register("fs/lustre", NULL,
                                             lprocfs_base, NULL);
         rc = lprocfs_seq_create(proc_lustre_root, "devices", 0444,
                                 &obd_device_list_fops, NULL);
-        if (rc)
+        if (rc) {
+                printk("class_procfs_init: Failed to add lustre/devices, rc=%d\n", rc);
                 CERROR("error adding /proc/fs/lustre/devices file\n");
+        }
         else
+                printk("class_procfs_init: successfully created lustre/devices\n");
 #else
         ENTRY;
 #endif

Comment by Jay Lan (Inactive) [ 03/Aug/12 ]

In the test case, when /proc/fs/lustre appeared to have been removed, it actually was not.

service337 /proc/fs # ls
fscache jbd2 nfs nfsd nfsfs
service337 /proc/fs # ls -lid lustre
4026532497 dr-xr-xr-x 17 root root 0 Aug 3 16:30 lustre

You need to use 'ls -lid' to see it. So the next time we restart the MDS, another /proc/fs/lustre is created (with a different inode number). All the other entries are created successfully under the new /proc/fs/lustre; unfortunately, the system cannot see it.
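
A quick way to confirm the stale directory, assuming the same sequence as above, is to watch the inode number across the restart:

ls -lid /proc/fs/lustre    # before the teardown: note the inode number (4026532497 in my case)
umount /mnt/mds1
lustre_rmmod
ls /proc/fs                # 'lustre' is no longer listed here...
ls -lid /proc/fs/lustre    # ...but a direct lookup still finds it, with the same inode number
modprobe lustre
mount -t lustre -o errors=panic,acl,noextents /dev/sdb1 /mnt/mds1
ls /proc/fs/lustre         # empty: the reloaded modules registered their entries under a new
                           # /proc/fs/lustre with a different inode that is not reachable from here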

What caused the original /proc/fs/lustre to hang around? From lustre's perspective, the removal completed:
Aug 3 16:40:27 kern:warning:service337 lprocfs_remove: removing lustre, parent=fs
Aug 3 16:40:27 kern:warning:service337 about to remove lustre
Aug 3 16:40:27 kern:warning:service337 removing devices from lustre
Aug 3 16:40:27 kern:warning:service337 removing health_check from lustre
Aug 3 16:40:27 kern:warning:service337 removing pinger from lustre
Aug 3 16:40:27 kern:warning:service337 removing version from lustre
Aug 3 16:40:27 kern:warning:service337 removing lustre from fs

SystemTap probes into the kernel showed:
remove_proc_entry: lustre/devices
free_proc_entry: devices, inode=4026532501
remove_proc_entry: lustre/health_check
free_proc_entry: health_check, inode=4026532500
remove_proc_entry: lustre/pinger
free_proc_entry: pinger, inode=4026532499
remove_proc_entry: lustre/version
remove_proc_entry: fs/lustre

So, /proc/fs/lustre/version was not removed for some reason. remove_proc_entry() did not call free_proc_entry() in the case of lustre/version; consequently, I suppose /proc/fs/lustre cannot be removed.

service337 /proc/fs # ls -lid lustre/version
4026532498 -r--r--r-- 1 root root 0 Aug 3 16:30 lustre/version

Why? I will investigate more...
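
A likely next step is to check whether some process still holds the leftover entry open, for example:

fuser -v /proc/fs/lustre/version    # list any PIDs that still have the file open
lsof /proc/fs/lustre/version        # same check with full process details, if lsof is available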

Comment by Jay Lan (Inactive) [ 06/Aug/12 ]

service337 /proc/fs # fuser lustre/version
lustre/version: 4937
service337 /proc/fs # ps -ef |grep 4937
root 4937 4933 0 Aug03 ? 00:00:01 perl /var/lib/pcp/pmdas/lustre/pmdalustre.pl
root 28872 28473 0 10:35 pts/0 00:00:00 grep 4937
service337 /proc/fs #

Please close this ticket. The problem was caused by /proc/fs/lustre/version being used by a NASA admin script.

Comment by Peter Jones [ 06/Aug/12 ]

ok thanks Jay!
