LU-11122: NULL pointer dereference in fld_local_lookup()

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • None
    • Environment: lustre-2.8.2_2.chaos-1.ch6.x86_64
      4 MDTs used in DNE-1 fashion (remote dirs, no striped dirs)
      RHEL 7.5
    • Severity: 1

    Description

      MDS nodes were power cycled during hardware maintenance. After they came back up, we got the oops below (some material redacted; see the comments below for the full console log contents):

      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffffc0cdc392>] fld_local_lookup+0x52/0x270 [fld]
      CPU: 17 PID: 180501 Comm: orph_cleanup_ls Kdump: loaded Tainted: P OE ------------ 3.10.0-862.3.2.1chaos.ch6.x86_64 #1
      Call Trace:
      [<ffffffffc06e8f6c>] ? dmu_tx_hold_object_impl+0x6c/0xc0 [zfs]
      [<ffffffffc109ff28>] osd_fld_lookup+0x48/0xd0 [osd_zfs]
      [<ffffffffc10a008a>] fid_is_on_ost+0xda/0x2f0 [osd_zfs]
      [<ffffffffc10a02e9>] osd_get_name_n_idx+0x49/0xd00 [osd_zfs]
      [<ffffffffc109902c>] ? osd_declare_attr_set+0x14c/0x730 [osd_zfs]
      [<ffffffffc0753b7e>] ? zap_lookup_by_dnode+0x2e/0x30 [zfs]
      [<ffffffffc1097510>] osd_declare_object_destroy+0xe0/0x3e0 [osd_zfs]
      [<ffffffffc1139ffe>] lod_sub_object_declare_destroy+0xce/0x2d0 [lod]
      [<ffffffffc1129700>] lod_declare_object_destroy+0x170/0x4a0 [lod]
      [<ffffffffc1513689>] ? orph_declare_index_delete+0x179/0x460 [mdd]
      [<ffffffffc1513f66>] orph_key_test_and_del+0x5f6/0xd30 [mdd]
      [<ffffffffc1514c57>] __mdd_orphan_cleanup+0x5b7/0x840 [mdd]
      [<ffffffffc15146a0>] ? orph_key_test_and_del+0xd30/0xd30 [mdd]
      [<ffffffffbb2c05f1>] kthread+0xd1/0xe0
      [<ffffffffbb2c0520>] ? insert_kthread_work+0x40/0x40
      [<ffffffffbb9438b7>] ret_from_fork_nospec_begin+0x21/0x21
      [<ffffffffbb2c0520>] ? insert_kthread_work+0x40/0x40

      Attachments

        Activity

          [LU-11122] NULL pointer dereference in fld_local_lookup()
          ofaaland Olaf Faaland added a comment -

          Thank you for responding so quickly.

          ofaaland Olaf Faaland added a comment -

          John, thanks for the suggestion.  I'll try it if this comes up again.

          ofaaland Olaf Faaland added a comment -

          The file system is back up.  This is no longer an emergency.

          In the course of rebooting the nodes and mounting manually to get more information, the messages about orphan cleanup failing stopped appearing after recovery.  It's been about 6 minutes now since the MDTs completed recovery.

          ofaaland Olaf Faaland added a comment -

          Lustre only, no IML.

          wcjohnso Will Johnson added a comment -

          Hi ofaaland,

          Is this a Lustre only system or do you also have IML installed?

          Regards,

          Will

          jhammond John Hammond added a comment -

          > Is there a way we can tell the MDT to skip orphan cleanup once, so it forgets about these orphaned objects? We don't really care if some unused objects are orphaned.

          You could mount the backing FS and remove the files from the PENDING directory.

          ofaaland Olaf Faaland added a comment -

          Is there a way we can tell the MDT to skip orphan cleanup once, so it forgets about these orphaned objects?  We don't really care if some unused objects are orphaned.

          ofaaland Olaf Faaland added a comment -

          For our code, including the tags we build from, see the lustre-release-fe-llnl project in gerrit.
          ofaaland Olaf Faaland added a comment -

          Top of stack:

          [2507417.726696] [<ffffffffc109ff28>] osd_fld_lookup+0x48/0xd0 [osd_zfs]
          [2507417.726700] [<ffffffffc10a008a>] fid_is_on_ost+0xda/0x2f0 [osd_zfs]
          [2507417.726703] [<ffffffffc10a02e9>] osd_get_name_n_idx+0x49/0xd00 [osd_zfs]
          

          And point of NULL ptr dereference:

          [2507417.726655] RIP: 0010:[<ffffffffc0cdc392>] [<ffffffffc0cdc392>] fld_local_lookup+0x52/0x270 [fld]
          
          (gdb) l *(fld_local_lookup+0x52)
          0x53c2 is in fld_local_lookup (/usr/src/debug/lustre-2.8.2_2.chaos/lustre/fld/fld_handler.c:218).
          213         info = lu_context_key_get(&env->le_ctx, &fld_thread_key);
          214         LASSERT(info != NULL);
          215         erange = &info->fti_lrange;
          216
          217         /* Lookup it in the cache. */
          218         rc = fld_cache_lookup(fld->lsf_cache, seq, erange);
          219         if (rc == 0) {
          220                 if (unlikely(fld_range_type(erange) != fld_range_type(range) &&
          221                              !fld_range_is_any(range))) {
          222                         CERROR("%s: FLD cache range "DRANGE" does not match"
          

          Variable info cannot be NULL: it is local, not shared, and the LASSERT verified it was non-NULL after it was last assigned. So it seems line 218 really is where the NULL pointer dereference is occurring:

          218         rc = fld_cache_lookup(fld->lsf_cache, seq, erange);
          

          fld was passed in and is never checked for NULL, and it is the only pointer dereferenced here. So it seems fld must be NULL.
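          As a rough illustration of that reasoning: the faulting address 0000000000000018 (CR2 in the console log below) is exactly what a read of fld->lsf_cache produces when fld == NULL, provided lsf_cache sits 0x18 bytes into struct lu_server_fld. The layout in the sketch below is hypothetical, chosen only to reproduce that offset; it is not the real Lustre definition.

          /* Standalone userspace sketch -- hypothetical layout, NOT the real
           * struct lu_server_fld.  It only illustrates how a NULL "fld"
           * pointer turns the load of fld->lsf_cache into a fault at 0x18. */
          #include <stddef.h>
          #include <stdio.h>

          struct fld_cache;                          /* opaque stand-in */

          struct lu_server_fld {                     /* hypothetical layout */
                  void               *lsf_obj;       /* offset 0x00 */
                  unsigned long long  lsf_pad1;      /* offset 0x08 */
                  void               *lsf_pad2;      /* offset 0x10 */
                  struct fld_cache   *lsf_cache;     /* offset 0x18 on x86_64 */
          };

          int main(void)
          {
                  /* fld_cache_lookup(fld->lsf_cache, ...) compiles to a load
                   * from fld + offsetof(struct lu_server_fld, lsf_cache); with
                   * fld == NULL that is a read from address 0x18, matching CR2. */
                  printf("offsetof(lsf_cache) = %#zx\n",
                         offsetof(struct lu_server_fld, lsf_cache));
                  return 0;
          }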

           

          ofaaland Olaf Faaland added a comment -

          More complete console log contents:

          [2507417.597913] Lustre: lsrza-MDT0003: Recovery over after 1:45, of 1011 clients 0 recovered and 1011 were evicted.
          [2507417.634717] LustreError: 162832:0:(client.c:1164:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff9eaed88fef00 x1602545796916964/t0(0) o5->lsrza-OST0003-osc-MDT0003@172.21.3.8@o2ib700:28/4 lens 432/432 e 0 to 0 dl 0 ref 2 fl Rpc:N/0/ffffffff rc 0/-1           
          [2507417.634723] LustreError: 162879:0:(osp_precreate.c:900:osp_precreate_cleanup_orphans()) lsrza-OST0005-osc-MDT0003: cannot cleanup orphans: rc = -5
          [2507417.634725] LustreError: 162963:0:(osp_precreate.c:900:osp_precreate_cleanup_orphans()) lsrza-OST0007-osc-MDT0003: cannot cleanup orphans: rc = -5
          [2507417.634726] LustreError: 162879:0:(osp_precreate.c:900:osp_precreate_cleanup_orphans()) Skipped 1 previous similar message
          [2507417.634727] LustreError: 162963:0:(osp_precreate.c:900:osp_precreate_cleanup_orphans()) Skipped 1 previous similar message
          [2507417.724392] LustreError: 162832:0:(client.c:1164:ptlrpc_import_delay_req()) Skipped 926 previous similar messages
          [2507417.726598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
          [2507417.726603] IP: [<ffffffffc0cdc392>] fld_local_lookup+0x52/0x270 [fld]
          [2507417.726604] PGD 0
          [2507417.726605] Oops: 0000 [#1] SMP
          <<Long line listing Modules linked in redacted>>
          [2507417.726650] CPU: 17 PID: 180501 Comm: orph_cleanup_ls Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.3.2.1chaos.ch6.x86_64 #1
          [2507417.726651] Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0019.101220160604 10/12/2016
          [2507417.726651] task: ffff9eacbe858000 ti: ffff9ea134cdc000 task.ti: ffff9ea134cdc000
          [2507417.726655] RIP: 0010:[<ffffffffc0cdc392>]  [<ffffffffc0cdc392>] fld_local_lookup+0x52/0x270 [fld]
          [2507417.726655] RSP: 0018:ffff9ea134cdfc00  EFLAGS: 00010286
          [2507417.726656] RAX: ffff9e7059e223c0 RBX: ffff9e7059e223c0 RCX: ffff9eaff5e9d430
          [2507417.726657] RDX: ffff9e6e32fa4800 RSI: ffffffffc0ce4ec0 RDI: ffff9eb0007e9240
          [2507417.726657] RBP: ffff9ea134cdfc38 R08: 00000002c000f1db R09: ffff9e32bfc07a00
          [2507417.726658] R10: ffff9e32bfc07a00 R11: ffffffffc03d4208 R12: ffff9eaff5e9d430
          [2507417.726658] R13: 0000000000000000 R14: 00000002c000f1e7 R15: ffff9eaf7318c000
          [2507417.726659] FS:  0000000000000000(0000) GS:ffff9e70be840000(0000) knlGS:0000000000000000
          [2507417.726660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          [2507417.726660] CR2: 0000000000000018 CR3: 0000003f7e378000 CR4: 00000000001607e0
          [2507417.726661] Call Trace:
          [2507417.726689]  [<ffffffffc06e8f6c>] ? dmu_tx_hold_object_impl+0x6c/0xc0 [zfs]
          [2507417.726696]  [<ffffffffc109ff28>] osd_fld_lookup+0x48/0xd0 [osd_zfs]
          [2507417.726700]  [<ffffffffc10a008a>] fid_is_on_ost+0xda/0x2f0 [osd_zfs]
          [2507417.726703]  [<ffffffffc10a02e9>] osd_get_name_n_idx+0x49/0xd00 [osd_zfs]
          [2507417.726706]  [<ffffffffc109902c>] ? osd_declare_attr_set+0x14c/0x730 [osd_zfs]
          [2507417.726729]  [<ffffffffc0753b7e>] ? zap_lookup_by_dnode+0x2e/0x30 [zfs]
          [2507417.726733]  [<ffffffffc1097510>] osd_declare_object_destroy+0xe0/0x3e0 [osd_zfs]
          [2507417.726742]  [<ffffffffc1139ffe>] lod_sub_object_declare_destroy+0xce/0x2d0 [lod]
          [2507417.726748]  [<ffffffffc1129700>] lod_declare_object_destroy+0x170/0x4a0 [lod]
          [2507417.726756]  [<ffffffffc1513689>] ? orph_declare_index_delete+0x179/0x460 [mdd]
          [2507417.726760]  [<ffffffffc1513f66>] orph_key_test_and_del+0x5f6/0xd30 [mdd]
          [2507417.726776]  [<ffffffffc1514c57>] __mdd_orphan_cleanup+0x5b7/0x840 [mdd]
          [2507417.726780]  [<ffffffffc15146a0>] ? orph_key_test_and_del+0xd30/0xd30 [mdd]
          [2507417.726782]  [<ffffffffbb2c05f1>] kthread+0xd1/0xe0
          [2507417.726783]  [<ffffffffbb2c0520>] ? insert_kthread_work+0x40/0x40
          [2507417.726786]  [<ffffffffbb9438b7>] ret_from_fork_nospec_begin+0x21/0x21
          [2507417.726786]  [<ffffffffbb2c0520>] ? insert_kthread_work+0x40/0x40
          
          ofaaland Olaf Faaland added a comment -

          This is a production file system; it's currently down due to this issue.


          People

            Assignee: bzzz Alex Zhuravlev
            Reporter: ofaaland Olaf Faaland
            Votes: 0
            Watchers: 5
