Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.6.0, Lustre 2.7.0

    Description

      I have a crashed FS with 4 MDTs. On each restart I see the following GPF at the end of recovery:

      [  239.572023] Lustre: lustre-MDT0001-osp-MDT0002: Connection restored to lustre-MDT0001 (at 0@lo)
      [  239.572031] Lustre: lustre-MDT0002: Recovery over after 1:10, of 5 clients 3 recovered and 2 we\
      re evicted.
      [  239.593697] general protection fault: 0000 [#1] SMP
      [  239.594184] last sysfs file: /sys/devices/system/cpu/possible
      [  239.594184] CPU 0
      [  239.594184] Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) mgs(U) nodem\
      ap(U) osd_ldiskfs(U) ldiskfs(U) exportfs lquota(U) lfsck(U) jbd obdecho(U) mgc(U) lov(U) osc(U) md\
      c(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic \
      libcfs(U) autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 microcode virtio_balloon virti\
      o_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_\
      generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      [  239.594184]
      [  239.594184] Pid: 3307, comm: orph_cleanup_lu Not tainted 2.6.32-431.5.1.el6.lustre.x86_64 #1 Bo\
      chs Bochs
      [  239.594184] RIP: 0010:[<ffffffffa0b9b4dc>]  [<ffffffffa0b9b4dc>] ldiskfs_htree_lock+0xc/0xe0 [l\
      diskfs]
      [  239.594184] RSP: 0018:ffff8801e2f21d80  EFLAGS: 00010246
      [  239.594184] RAX: ffff8801e2ef0000 RBX: ffff8801e2ef0570 RCX: 0000000000000003
      [  239.594184] RDX: 6b6b6b6b6b6b6b6b RSI: 6b6b6b6b6b6b6b6b RDI: ffff8801f7f76720
      [  239.594184] RBP: ffff8801e2f21d90 R08: 0000000000000000 R09: ffff8801f52b5c70
      [  239.594184] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801e38f9488
      [  239.594184] R13: ffff8801f7f76720 R14: 6b6b6b6b6b6b6b6b R15: ffff8801e2ef0578
      [  239.594184] FS:  0000000000000000(0000) GS:ffff88002f800000(0000) knlGS:0000000000000000
      [  239.594184] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      [  239.594184] CR2: 00007f4526ddd000 CR3: 0000000217e23000 CR4: 00000000000006f0
      [  239.594184] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  239.594184] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  239.594184] Process orph_cleanup_lu (pid: 3307, threadinfo ffff8801e2f20000, task ffff8801e2f1e\
      2c0)
      [  239.594184] Stack:
      [  239.594184]  ffff8801e2ef0570 ffff8801e38f9488 ffff8801e2f21de0 ffffffffa0bfe43c
      [  239.594184] <d> ffff8801e2ef0570 ffffffffa0455c4d ffff88021908a4b0 ffff8801e2ef0570
      [  239.594184] <d> ffff88021908a4b0 ffffffffa0d71380 ffff8801f52b46e8 ffff8801f50dc9e8
      [  239.594184] Call Trace:
      [  239.594184]  [<ffffffffa0bfe43c>] osd_ldiskfs_it_fill+0x9c/0x240 [osd_ldiskfs]
      [  239.594184]  [<ffffffffa0455c4d>] ? lu_object_put+0xad/0x330 [obdclass]
      [  239.594184]  [<ffffffffa0bfe779>] osd_it_ea_next+0x99/0x190 [osd_ldiskfs]
      [  239.594184]  [<ffffffffa0d4e941>] lod_it_next+0x21/0x90 [lod]
      [  239.594184]  [<ffffffffa081c0da>] __mdd_orphan_cleanup+0x30a/0xfb0 [mdd]
      [  239.594184]  [<ffffffffa081bdd0>] ? __mdd_orphan_cleanup+0x0/0xfb0 [mdd]
      [  239.594184]  [<ffffffff8109eab6>] kthread+0x96/0xa0
      [  239.594184]  [<ffffffff8100c30a>] child_rip+0xa/0x20
      [  239.594184]  [<ffffffff81554710>] ? _spin_unlock_irq+0x30/0x40
      [  239.594184]  [<ffffffff8100bb10>] ? restore_args+0x0/0x30
      [  239.594184]  [<ffffffff8109ea20>] ? kthread+0x0/0xa0
      [  239.594184]  [<ffffffff8100c300>] ? child_rip+0x0/0x20
      [  239.594184] Code: 4c 89 e7 e8 37 ee ff ff 8b 55 cc 89 55 b0 e9 7c ff ff ff c7 45 b0 00 00 00 00\
       e9 68 ff ff ff 55 48 89 e5 41 54 53 0f 1f 44 00 00 <48> 8b 82 18 02 00 00 48 89 fb 49 89 d4 48 8b\
       80 f0 04 00 00 48
      [  239.594184] RIP  [<ffffffffa0b9b4dc>] ldiskfs_htree_lock+0xc/0xe0 [ldiskfs]
      [  239.594184]  RSP <ffff8801e2f21d80>
      

      Messages leading up to this are unremarkable.

      It seems noteworthy that at the bottom of orph_index_iterate() we clobber the rc returned by iops->get() and iops->load():

                      /* after index delete reset iterator */
                      if (rc == 0)
                              rc = iops->get(env, it, (const void *)"");
                      else
                              rc = iops->load(env, it, cookie);
      next:
                      rc = iops->next(env, it);
              } while (rc == 0);
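
      For illustration, one way to avoid losing those return codes would be to check rc before advancing the iterator. This is only a sketch of the idea (assuming, as elsewhere in this code, that a negative rc means an error), not the actual fix:

                /* after index delete reset iterator */
                if (rc == 0)
                        rc = iops->get(env, it, (const void *)"");
                else
                        rc = iops->load(env, it, cookie);
                /* sketch: bail out rather than overwrite an error from
                 * iops->get()/iops->load() with the result of iops->next() */
                if (rc < 0)
                        break;
      next:
                rc = iops->next(env, it);
        } while (rc == 0);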
      

      Attachments

        Issue Links

          Activity

            [LU-5353] GPF in __mdd_orphan_cleanup()

            yong.fan nasf (Inactive) added a comment -

            Andreas, we have hit similar trouble before, and introduced osd_thread_info::oti_it_inline so that osd_it_ea can be allocated dynamically, via this patch:
            http://review.whamcloud.com/#/c/11847/
            http://review.whamcloud.com/#/c/11847/12/lustre/osd-ldiskfs/osd_handler.c,cm

            But that patch is not enough, because osd_thread_info::oti_it_dentry/oti_it_ea_buf may still be shared by multiple iterations. Wangdi has a patch to fix that:
            http://review.whamcloud.com/#/c/13223/
            http://review.whamcloud.com/#/c/13223/7/lustre/osd-ldiskfs/osd_handler.c,cm
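
            To illustrate the direction of those patches (hypothetical, simplified names only, not the actual Lustre code): instead of handing every caller a pointer into the same per-thread scratch structure, the iterator state is allocated per call, so a nested init cannot alias the outer iterator's dentry/buffer.

            #include <stdlib.h>

            /* Hypothetical, simplified types -- not the real Lustre structures. */
            struct it_ea {
                    char *ie_buf;               /* directory-entry buffer */
            };

            struct thread_info {
                    struct it_ea oti_it_ea;     /* per-thread scratch iterator */
            };

            /* Before: every caller on this thread gets the same object, so a
             * nested iteration silently reuses (and later tears down) the
             * outer iteration's state. */
            static struct it_ea *it_init_shared(struct thread_info *info)
            {
                    return &info->oti_it_ea;
            }

            /* After: allocate per iteration; outer and nested iterators no
             * longer alias each other, and each is freed by its own fini. */
            static struct it_ea *it_init_private(void)
            {
                    return calloc(1, sizeof(struct it_ea));
            }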


            adilger Andreas Dilger added a comment -

            Fan Yong, did this problem with the duplicate iterator usage get fixed in another patch of yours? If yes, please link this to the other bug as a duplicate and then close this one as a duplicate. Please also paste the Gerrit patch URL here.

            jhammond John Hammond added a comment -

            This is because the same osd_it_ea is being used twice in orph_index_iterate(). Once to start iterating through the orphans:

            00000004:00000001:2.0:1405700574.465351:0:6666:0:(mdd_orphans.c:451:orph_index_iterate()) Process entered
            00080000:00000001:2.0:1405700574.468385:0:6666:0:(osd_handler.c:4735:osd_it_ea_init()) Process entered
            00080000:00000001:2.0:1405700574.468392:0:6666:0:(osd_handler.c:4762:osd_it_ea_init()) Process leaving (rc=18446612139451008368 : -131934258543248 : ffff8801a962c570)
            

            and once to destroy an orphaned striped directory:

            orph_key_test_and_del()
             orphan_object_destroy()
              lod_declare_object_destroy()
               lod_load_striping_locked()
                lod_load_lmv_shards()
                 osd_it_ea_init()
            00080000:00000001:2.0:1405700574.468853:0:6666:0:(osd_handler.c:4762:osd_it_ea_init()) Process leaving (rc=18446612139451008368 : -131934258543248 : ffff8801a962c570)
            
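
            Both osd_it_ea_init() calls above return the same pointer (ffff8801a962c570), i.e. the nested lod_load_lmv_shards() iteration is handed the very iterator the outer orphan scan is still using; the 0x6b6b... values in the oops registers look like slab-poisoned (freed) memory, consistent with the outer iterator's state having been torn down underneath it. A toy user-space model of that aliasing (hypothetical names, nothing Lustre-specific):

            #include <stdio.h>
            #include <string.h>

            /* Toy model: a single per-thread iterator object handed out to both
             * the outer orphan scan and the nested striped-directory load. */
            struct it { int pos; char buf[16]; };

            static struct it thread_it;     /* stands in for per-thread scratch state */

            static struct it *it_init(const char *dir)
            {
                    memset(&thread_it, 0, sizeof(thread_it));
                    snprintf(thread_it.buf, sizeof(thread_it.buf), "%s", dir);
                    return &thread_it;
            }

            int main(void)
            {
                    struct it *outer = it_init("PENDING");      /* orphan index scan */
                    outer->pos = 3;                             /* outer scan is mid-iteration */

                    struct it *inner = it_init("striped-dir");  /* nested shard iteration */
                    (void)inner;

                    /* the outer iterator has been silently reset by the nested init */
                    printf("outer: pos=%d buf=%s\n", outer->pos, outer->buf);
                    return 0;
            }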
            jhammond John Hammond added a comment -

            To reproduce, it's enough to have an orphaned striped directory.

            export MDSCOUNT=4
            sh lustre/tests/llmount.sh
            cd /mnt/lustre
            lfs mkdir -c4 d0                       # directory striped across all 4 MDTs
            exec 7<d0                              # hold the directory open ...
            rmdir d0                               # ... so removing it leaves an orphan
            echo 'Hi Alex!' > /proc/sys/lnet/force_lbug    # crash before the orphan is cleaned up
            ...
            export MDSCOUNT=4
            export NOFORMAT=1                      # remount without reformatting
            sh lustre/tests/llmount.sh             # orphan cleanup during recovery hits the GPF
            

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: jhammond John Hammond
              Votes: 0
              Watchers: 4
