LU-13980: Kernel panic on OST after removing files under '/O' folder

Details

    • Type: Task
    • Resolution: Unresolved
    • Priority: Trivial
    • Affects Version/s: Lustre 2.10.8, Lustre 2.12.4
    • Environment: CentOS Linux release 7.7.1908 (Core) with kernel 3.10.0-957.1.3.el7_lustre.x86_64 for Lustre 2.10.8, and CentOS Linux release 7.7.1908 (Core) with kernel 3.10.0-1062.9.1.el7_lustre.x86_64 for Lustre 2.12.4.

    Description

      I removed some data stripes under the '/O' folder on an OST and started LFSCK. The OST was then forced to reboot because of a kernel panic. By looking into the vmcore, I found the specific error lines:

      [ 1057.367833] Lustre: lustre-OST0000: new disk, initializing
      [ 1057.367877] Lustre: srv-lustre-OST0000: No data found on store. Initialize space
      [ 1057.417121] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 300-900
      [ 1062.018722] Lustre: lustre-OST0000: Connection restored to lustre-MDT0000-mdtlov_UUID (at 10.0.0.122@tcp)
      [ 1089.010284] Lustre: lustre-OST0000: Connection restored to 89c68bff-12c8-9f48-f01e-f6306c666eb9 (at 10.0.0.98@tcp)
      [ 1281.516928] LustreError: 10410:0:(osd_handler.c:1982:osd_object_release()) LBUG
      [ 1281.516939] Pid: 10410, comm: ll_ost_out00_00 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Mon May 27 03:45:37 UTC 2019
      [ 1281.516944] Call Trace:
      [ 1281.516960]  [<ffffffffc05fd7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 1281.516986]  [<ffffffffc05fd87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 1281.517004]  [<ffffffffc0b93820>] osd_get_ldiskfs_dirent_param+0x0/0x130 [osd_ldiskfs]
      [ 1281.517173]  [<ffffffffc07442b0>] lu_object_put+0x190/0x3e0 [obdclass]
      [ 1281.517244]  [<ffffffffc09d8bc3>] out_handle+0x1503/0x1bc0 [ptlrpc]
      [ 1281.517369]  [<ffffffffc09ce7ca>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [ 1281.517481]  [<ffffffffc097705b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [ 1281.517582]  [<ffffffffc097a7a2>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      

      (The full dmesg log collected from the vmcore is attached.)
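
      (For reference, this is roughly how those lines can be pulled out of the vmcore with the 'crash' utility; the vmlinux and vmcore paths below are only placeholders, not taken from this system:)

      crash /usr/lib/debug/lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/vmlinux /var/crash/<date>/vmcore
      crash> log                    # kernel ring buffer, i.e. the messages quoted above
      crash> bt 10410               # back trace of the ll_ost_out00_00 thread that hit the LBUG
      crash> mod -S                 # load module debuginfo, if installed
      crash> sym ffffffffc05fd87c   # resolve a trace address, e.g. lbug_with_loc+0x4c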

      I then found that after removing files under '/O' on the OST, even a simple write operation can result in the same kernel panic.

      I'm just curious why 'osd_object_release' is invoked in the above situations, and about the position of the LFSCK functions in the error call trace.

      Thanks a lot! 

      Attachments

        Issue Links

          Activity

            [LU-13980] Kernel panic on OST after removing files under '/O' folder

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40738/
            Subject: LU-13980 osd-ldiskfs: print label instead of device
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8f793f14bf9928352623e61122f005252605b136


            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40738
            Subject: LU-13980 osd-ldiskfs: print label instead of device
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 1e1137a66c634288e40f3a2a017ce6d5c003fa2d


            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40058
            Subject: LU-13980 osd: remove osd_object_release LASSERT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 50d7b40a457a733130f346515ab37ad3e1b54424

            rzhan Runzhou Han (Inactive) added a comment - edited

            I see. Maybe I should not mount them at the same time.

            Actually, I was trying to emulate some special cases in which the underlying file system is corrupted by accident while the system is still running. I want to see how Lustre reacts to these unexpected failures (especially LFSCK's reaction).

            In fact, to help develop a more robust fsck for PFS/DFS, I'm also doing the same thing to other systems (e.g., BeeGFS, OrangeFS, and Ceph). Since Lustre relies heavily on kernel modules, I observed more kernel crashes in Lustre when injecting faults. That's why I'm here to learn more about Lustre. If possible, I'm willing to help with moderate solutions to unexpected crashes.


            adilger Andreas Dilger added a comment -

            Mounting the OST filesystem as both "lustre" and "ldiskfs" at the same time is not supported, since (as you can see with this assertion) the state is being changed from underneath the filesystem in an unexpected manner. It would be the same as modifying the blocks underneath ext4 while it is mounted.

            I mistakenly thought that the "lustre-OST0000: new disk, initializing" message was caused by a large number of files being deleted from the filesystem before startup, but I can now see from the low OST object numbers in your "lfs getstripe" output that this is a new filesystem, so this message is expected.

            I agree that it would be good to handle this error more gracefully (e.g. return an error instead of LBUG). Looking elsewhere in Jira, it seems that this LBUG is hit often enough that the error handling should really be more tolerant, since the design policy is that the server should not LASSERT() on bad values that come from the client or disk.
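
            A minimal sketch of that supported sequence (the device and mount-point names here are only examples, not from this ticket): stop the Lustre target first, do the ldiskfs work, then remount before running LFSCK:

            umount /mnt/ost0                                # stop the "lustre" OST mount first
            mount -t ldiskfs /dev/sdb /mnt/ost0_ldiskfs     # then work on the backend filesystem
            # ... inspect or (for fault injection) modify objects under /mnt/ost0_ldiskfs/O ...
            umount /mnt/ost0_ldiskfs
            mount -t lustre /dev/sdb /mnt/ost0              # remount as Lustre before running LFSCK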

            rzhan Runzhou Han (Inactive) added a comment - edited

            Thank you for your reply.

            I mounted it as both type "lustre" and type "ldiskfs" at the same time.

            The removed file is a data stripe of a client file. In my configuration, the stripe setting is:

            lfs setstripe -i 0 -c -1 -S 64K /lustre
            

            For example, on client node I create a file with the following command:

            dd if=/dev/zero of=/lustre/10M bs=1M count=10
            

            Then I use "lfs getstripe /lustre/10M" to locate its data stripes on the OSTs:

            [root@mds Desktop]# lfs getstripe 10M 
            10M
            lmm_stripe_count:  3
            lmm_stripe_size:   65536
            lmm_pattern:       1
            lmm_layout_gen:    0
            lmm_stripe_offset: 0
            	obdidx		 objid		 objid		 group
            	     0	             2	          0x2	             0
            	     1	             2	          0x2	             0
            	     2	             2	          0x2	             0
            

            Next, I remove one of them under one OST's "ldiskfs" mount point.

            [root@oss0 osboxes]# rm -f /ost0_ldiskfs/O/0/d2/2
            

            Then running LFSCK on the MDT will trigger the kernel panic caused by the LBUG.
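
            (For completeness, a minimal sketch of starting and checking that LFSCK run on the MDS, assuming the filesystem name "lustre" from the logs above:)

            lctl lfsck_start -M lustre-MDT0000 -t layout          # start the layout LFSCK
            lctl get_param -n mdd.lustre-MDT0000.lfsck_layout     # check its status on the MDS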

            I'm able to reproduce the LBUG. However, after the kernel panic takes place, I'm not able to manipulate the VM any more (I was using a virtual machine cluster). The VM either freezes, or I configure the kernel to reboot x seconds after the panic; on the next boot I'm not able to find /tmp/lustre_log.<timestamp>.
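
            (The auto-reboot mentioned above is the usual panic sysctl; the 10-second value is only an example:)

            sysctl -w kernel.panic=10                  # reboot 10 seconds after a kernel panic
            echo 'kernel.panic = 10' >> /etc/sysctl.conf   # make it persistent across reboots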

             


            adilger Andreas Dilger added a comment -

            Can you please provide some more information about this problem?

            How did you remove the objects under the /O directory? Was the filesystem mounted as both type "lustre" and type "ldiskfs" at the same time, or was the "lustre" OST unmounted first?

            Which files were removed? The log messages make it appear that the filesystem was sufficiently corrupted that the OST startup process wasn't able to detect the Lustre configuration files.

            If you are able to reproduce this, please enable full debugging with "lctl set_param debug=-1" on the OST before starting LFSCK, and then attach the debug log, which should be written to /tmp/lustre_log.<timestamp> when the LBUG is triggered, or can be dumped manually with "lctl dk /tmp/lustre_log.txt".
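
            (Put together, roughly; the debug_mb size and the panic_on_lbug setting are assumptions to make the dump easier to capture, not part of the request above:)

            lctl set_param debug=-1             # full debugging, as requested
            lctl set_param debug_mb=1024        # assumed: enlarge the trace buffer
            lctl set_param panic_on_lbug=0      # assumed: keep the node up after the LBUG
            # ... reproduce the problem, then ...
            lctl dk /tmp/lustre_log.txt         # manual dump if no /tmp/lustre_log.<timestamp> appears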


            People

              Assignee: adilger Andreas Dilger
              Reporter: rzhan Runzhou Han (Inactive)
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated: