[LU-6619] lustre/obdclass does not get cleared while stopping/cleanup Lustre if setup had additional MDT present Created: 19/May/15  Updated: 28/Feb/20  Resolved: 28/Feb/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Paramita varma (Inactive) Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: mdt
Environment:

Scientific Linux release 6.6 (Carbon)


Attachments: Text File dmesg-proc-cleaning.txt     Text File log-message-with-disk-device-MDT.txt     Text File log-message-with-disk-device-MDT.txt     Text File proc-cleaning-log-messages.txt    
Epic/Theme: Lustre-2.5.2, test
Severity: 3
Epic: client, metadata, mount, server, test
Project: Test Infrastructure
Rank (Obsolete): 9223372036854775807

 Description   

Hi,

Prerequisites to reproduce this bug:

1. A single Scientific Linux (release 6.6) VM with at least 1 GB of memory and
50 GB of disk space.
2. A Lustre 2.7.51 setup up and running on that VM; in my case all
Lustre components are configured on the same VM.
3. One extra 15 GB MDT added to the Lustre setup; the MDT was
created on a loop device.
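
One plausible way to add such an MDT is sketched below. The exact commands used for this setup were not recorded, so the image path, mount point, and loop device name are assumptions; only the 15 GB size, the fsname "lustre", and the MGS NID come from this ticket. The helper name print_add_mdt_plan is made up for illustration, and it only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) one possible command sequence
# for adding a second MDT backed by a loop device.
# Assumptions: image path, mount point, and /dev/loop0 are illustrative only.
print_add_mdt_plan() {
    img="${1:-/tmp/lustre-mdt2.img}"   # assumed backing file
    mnt="${2:-/mnt/mds2}"              # assumed mount point
    mgs="${3:-192.168.102.13@tcp}"     # MGS NID as shown in the trace
    echo "truncate -s 15G $img"
    echo "losetup -f --show $img"
    echo "mkfs.lustre --mdt --fsname=lustre --index=1 --mgsnode=$mgs /dev/loop0"
    echo "mkdir -p $mnt"
    echo "mount -t lustre /dev/loop0 $mnt"
}
```

Calling print_add_mdt_plan with no arguments prints the plan with the defaults above; losetup reports the loop device it actually picked, which should replace /dev/loop0 in the later commands.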
===================================

Steps to reproduce the issue:
===================================
    1. Run dd to generate some I/O on the Lustre filesystem
    ( dd if=/dev/zero of=/mnt/lustre/test bs=512M count=10 ).
    2. Once the I/O has completed, stop the Lustre filesystem. I ran the
    llmountcleanup.sh script (../lustre-release/lustre/tests/llmountcleanup.sh)
    to unmount/stop Lustre.
    3. After the unmount completes, Lustre prints an error message on the
    terminal: module lustre/obdclass still loaded
    =======================================
    Command prompt trace:
    =============================
    [root@localhost ~]# sh /var/lib/jenkins/jobs/Lustre-New-Test/workspace/default/lustre-release/lustre/tests/llmountcleanup.sh
    Stopping clients: localhost /mnt/lustre (opts:-f)
    Stopping client localhost /mnt/lustre opts:-f
    Stopping clients: localhost /mnt/lustre2 (opts:-f)
    Stopping /mnt/mds1 (opts:-f) on localhost
    Stopping /mnt/ost1 (opts:-f) on localhost
    Stopping /mnt/ost2 (opts:-f) on localhost
    2 UP mgc MGC192.168.102.13@tcp 9918b9be-ec01-ce40-5dc1-d4ebb297e839 5
    3 UP mds MDS MDS_uuid 3
    23 UP osd-ldiskfs lustre-MDT0001-osd lustre-MDT0001-osd_UUID 9
    24 UP lod lustre-MDT0001-mdtlov lustre-MDT0001-mdtlov_UUID 4
    25 UP mdt lustre-MDT0001 lustre-MDT0001_UUID 5
    26 UP mdd lustre-MDD0001 lustre-MDD0001_UUID 4
    27 UP osp lustre-MDT0000-osp-MDT0001 lustre-MDT0001-mdtlov_UUID 5
    28 UP osp lustre-OST0000-osc-MDT0001 lustre-MDT0001-mdtlov_UUID 5
    29 UP osp lustre-OST0001-osc-MDT0001 lustre-MDT0001-mdtlov_UUID 5
    30 UP lwp lustre-MDT0000-lwp-MDT0001 lustre-MDT0000-lwp-MDT0001_UUID 5
    Modules still loaded: ************
    lustre/osp/osp.o lustre/lod/lod.o lustre/mdt/mdt.o lustre/mdd/mdd.o ldiskfs/ldiskfs.o lustre/quota/lquota.o lustre/lfsck/lfsck.o lustre/mgc/mgc.o lustre/fid/fid.o lustre/fld/fld.o lustre/ptlrpc/ptlrpc.o lustre/obdclass/obdclass.o lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o
    ==================================================
    4. After this message, I manually unmounted the additional MDT, which resulted in call traces.
    5. /var/log/messages shows: proc_dir_entry 'lustre/lov' already registered, proc_dir_entry 'lustre/osc' already registered, followed by a call trace.
    6. After this, if a Lustre mount/start is attempted, the process sometimes hangs,
    again resulting in call traces on the backend.
    7. A reboot is needed to clean everything up and start afresh.
    ====================================================
    Attaching /var/log/messages and dmesg output from the Lustre setup.
    ====================================================
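
    The devices left behind after step 3 can be listed before attempting a manual unmount. A minimal sketch, assuming the device list keeps the column layout shown in the trace above (index, state, type, name, UUID, refcount); the helper name leftover_devices is made up for illustration and is not part of the Lustre test scripts:

```shell
#!/bin/sh
# Hypothetical helper (not part of lustre/tests): reads a device list on
# stdin in the layout "index state type name uuid refcount" and prints
# "type name" for every device that is still UP and whose name contains
# the given target string, e.g. MDT0001.
leftover_devices() {
    target="$1"
    awk -v t="$target" '$2 == "UP" && index($4, t) > 0 { print $3, $4 }'
}

# Usage on a live system (assumes lctl is in PATH):
#   lctl dl | leftover_devices MDT0001
```

    Run against the trace above, this would list the eight devices numbered 23 to 30, all of which belong to the additional MDT.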
    Thanks,
    Paramita Varma


 Comments   
Comment by Paramita varma (Inactive) [ 25/May/15 ]

Hi,
This issue is also reproducible with the additional MDT created on a disk device.
I retested the scenario with a disk device and hit the issue again.

Thanks & Regards,

Paramita Varma

Comment by Andreas Dilger [ 28/Feb/20 ]

Close old bug that hasn't been seen in a long time.

Generated at Sat Feb 10 02:01:47 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.