Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8907

LBUG: (namei.c:816:ll_create_node()) ASSERTION( list_empty(&inode->i_dentry) ) failed

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • Lustre 2.5.3
    • RHEL 6 Bull kernel 2.6.32-642.4.2.el6.Bull.100.x86_64
      Lustre build based on 2.5.3.90
    • 3
    • 9223372036854775807

    Description

      One Lustre client crashed right after a recovery with the following messages:

      2016-10-29 16:00:01 Lustre: DEBUG MARKER: Sat Oct 29 16:00:01 2016
       2016-10-29 16:00:01
       2016-10-29 16:01:07 Lustre: 17766:0:(client.c:1942:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1477749640/real 1477749640] req@ffff8811e1b9a400 x1547883906151520/t0(0) 
       o101->scratch3-MDT0000-mdc-ffff88205ec91400@JO.BOO.AL.IL@o2ib2:12/10 lens 632/1136 e 0 to 1 dl 1477749667 ref 2 fl Rpc:XP/0/ffffffff rc 0/-1
       2016-10-29 16:01:07 Lustre: 17766:0:(client.c:1942:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
       2016-10-29 16:01:07 Lustre: scratch3-MDT0000-mdc-ffff88205ec91400: Connection to scratch3-MDT0000 (at JO.BOO.AL.IL@o2ib2) was lost; in progress operations using this service will wait for recovery to complet
       e
       2016-10-29 16:01:07 Lustre: Skipped 12 previous similar messages
       2016-10-29 16:01:07 Lustre: scratch3-MDT0000-mdc-ffff88205ec91400: Connection restored to scratch3-MDT0000 (at JO.BOO.AL.IL@o2ib2)
       2016-10-29 16:01:07 Lustre: Skipped 12 previous similar messages
       2016-10-29 16:01:07 LustreError: 17766:0:(namei.c:816:ll_create_node()) ASSERTION( list_empty(&inode->i_dentry) ) failed:
       2016-10-29 16:01:07 LustreError: 17766:0:(namei.c:816:ll_create_node()) LBUG
      
      

      The stack is as follows:

      PID: 17766 TASK: ffff8819ab5b0ab0 CPU: 6 COMMAND: "rsync"
       #0 [ffff88189c697b20] machine_kexec at ffffffff8103ff4b
       #1 [ffff88189c697b80] crash_kexec at ffffffff810cfce2
       #2 [ffff88189c697c50] panic at ffffffff81546ce9
       #3 [ffff88189c697cd0] lbug_with_loc at ffffffffa067beeb [libcfs]
       #4 [ffff88189c697cf0] ll_create_nd at ffffffffa0dbf854 [lustre]
       #5 [ffff88189c697d70] vfs_create at ffffffff811a7946
       #6 [ffff88189c697db0] do_filp_open at ffffffff811ab75e
       #7 [ffff88189c697f20] do_sys_open at ffffffff81194e87
       #8 [ffff88189c697f70] sys_open at ffffffff81194f90
       #9 [ffff88189c697f80] system_call_fastpath at ffffffff8100b0d2
       RIP: 0000003c88adb480 RSP: 00007fffd85b6bc8 RFLAGS: 00010246
       RAX: 0000000000000002 RBX: ffffffff8100b0d2 RCX: 0000000000000001
       RDX: 0000000000000180 RSI: 00000000000000c2 RDI: 00007fffd85b7d70
       RBP: 00007fffd85b7d9b R8: 7a672e7761722e30 R9: 00000000ffffffff
       R10: 0000000000000001 R11: 0000000000000246 R12: ffffffff81194f90
       R13: ffff88189c697f78 R14: 00007fffd85b7d9c R15: 0000000000000000
       ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
      

       

      Even though the intent is to create the inode with a dentry in the filesystem namespace (/somepath/.tmpfile), the inode structure field i_dentry is populated with one dentry pointing at /.lustre/fid/[0x298cd542a:0x3b3d:0x0].

      Strangely enough, even though the crash occured in ll_create_nd() -> ll_create_it() -> ll_create_node, before the call to d_instantiate() in ll_create_nd() -> ll_create_it(), the fid actually really points back to the requested location:

      # lfs fid2path /mountpoint [0x298cd542a:0x3b3d:0x0]
      /somepath/.tmpfile
      

      I can't upload the crashdump due to site restriction, but will be happy to give more informations as you request them.

      Attachments

        Activity

          [LU-8907] LBUG: (namei.c:816:ll_create_node()) ASSERTION( list_empty(&inode->i_dentry) ) failed

          Thanks Bruno.

          The patch was backported to our branch.

          This ticket can be closed.

          spiechurski Sebastien Piechurski added a comment - Thanks Bruno. The patch was backported to our branch. This ticket can be closed.

          Hello Seb,
          As you can see https://review.whamcloud.com/25296/ has just landed on master.
          And as per its content (at least its pure fix part in core code, i.e. simply remove LBUG), it should be pretty easy for you to back-port it on top of your Lustre branch based on 2.5.3.90.

          bfaccini Bruno Faccini (Inactive) added a comment - Hello Seb, As you can see https://review.whamcloud.com/25296/ has just landed on master. And as per its content (at least its pure fix part in core code, i.e. simply remove LBUG), it should be pretty easy for you to back-port it on top of your Lustre branch based on 2.5.3.90.

          Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25296/
          Subject: LU-8907 llite: handle client racy case during create
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: ee25babfe72378f9496a9732742984f26eb7d4a5

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25296/ Subject: LU-8907 llite: handle client racy case during create Project: fs/lustre-release Branch: master Current Patch Set: Commit: ee25babfe72378f9496a9732742984f26eb7d4a5

          Hello Bruno,

           

          What is the next step for this patch ?

          Are we waiting for reviewers ?

          spiechurski Sebastien Piechurski added a comment - Hello Bruno,   What is the next step for this patch ? Are we waiting for reviewers ?

          Yes, new sanity/test_161d sub-test is also a reproducer in auto-tests environment and it also demonstrate that problem is still alive in master!
          This has been proofed by patch-set #2 testing of https://review.whamcloud.com/25296, and now patch-set #3 introduces a first attempt to fix simply removing the inaccurate

          LASSERT(ll_d_hlist_empty(&inode->i_dentry));

          in ll_create_node() and exposes it to full tests suites.

          bfaccini Bruno Faccini (Inactive) added a comment - Yes, new sanity/test_161d sub-test is also a reproducer in auto-tests environment and it also demonstrate that problem is still alive in master! This has been proofed by patch-set #2 testing of https://review.whamcloud.com/25296 , and now patch-set #3 introduces a first attempt to fix simply removing the inaccurate LASSERT(ll_d_hlist_empty(&inode->i_dentry)); in ll_create_node() and exposes it to full tests suites.

          Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/25296
          Subject: LU-8907 llite: handle client racy case during create
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 3d35f58c3b6c736e5db3d22dea2451bf634cdb95

          gerrit Gerrit Updater added a comment - Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/25296 Subject: LU-8907 llite: handle client racy case during create Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 3d35f58c3b6c736e5db3d22dea2451bf634cdb95

          Hi Bruno, did you see the problem reproduced in the auto tests environment ?

          spiechurski Sebastien Piechurski added a comment - Hi Bruno, did you see the problem reproduced in the auto tests environment ?

          I have just pushed https://review.whamcloud.com/24976 to b2_5 branch. It introduces sanity/test_161d sub-test in order to check that the reproducer I have identified also fails in the auto-tests environment in a first time. And then, with next patch-set that removing LBUG could fix and not introduce any regression.

          Next, will also check, using the same way, if problem also exists in master.

          bfaccini Bruno Faccini (Inactive) added a comment - I have just pushed https://review.whamcloud.com/24976 to b2_5 branch. It introduces sanity/test_161d sub-test in order to check that the reproducer I have identified also fails in the auto-tests environment in a first time. And then, with next patch-set that removing LBUG could fix and not introduce any regression. Next, will also check, using the same way, if problem also exists in master.

          Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/24976
          Subject: LU-8907 llite: handle client racy case during create
          Project: fs/lustre-release
          Branch: b2_5
          Current Patch Set: 1
          Commit: 376e27c4ac97d31f9729f351317891bff3f54147

          gerrit Gerrit Updater added a comment - Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/24976 Subject: LU-8907 llite: handle client racy case during create Project: fs/lustre-release Branch: b2_5 Current Patch Set: 1 Commit: 376e27c4ac97d31f9729f351317891bff3f54147

          Thanks Seb, I don't see any Kernel/Lustre patch that could be related to the problem.

          Concerning the other LBUG, related to LU-2048, that I have triggered during my first attempts to reproduce, it is a pure problem related to statahead code where some fixes appear to be missing in 2.5.3.90.

          On the other hand, I have been able to reproduce your LBUG by introducing the following tempo just before the LBUG :

          [root@eagle-28 lustre-release]# git status
          # On branch b2_5
          # Changed but not updated:
          #   (use "git add <file>..." to update what will be committed)
          #   (use "git checkout -- <file>..." to discard changes in working directory)
          #
          #       modified:   lustre/include/obd_support.h
          #       modified:   lustre/llite/namei.c
          #
          # Untracked files:
          #   (use "git add <file>..." to include in what will be committed)
          #
          #       lustre/include/obd_support.h.bfi
          #       lustre/llite/namei.c.bfi
          #       lustre/scripts/make_META.pl
          no changes added to commit (use "git add" and/or "git commit -a")
          [root@eagle-28 lustre-release]# 
          [root@eagle-28 lustre-release]# git diff
          diff --git a/lustre/include/obd_support.h b/lustre/include/obd_support.h
          index 37bee22..a21f8a0 100644
          --- a/lustre/include/obd_support.h
          +++ b/lustre/include/obd_support.h
          @@ -430,6 +430,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type,
           #define OBD_FAIL_MDC_GETATTR_ENQUEUE     0x803
           #define OBD_FAIL_MDC_RPCS_SEM           0x804
           #define OBD_FAIL_MDC_LIGHTWEIGHT        0x805
          +#define OBD_FAIL_MDC_CREATE_PAUSE       0x805
           
           #define OBD_FAIL_MGS                     0x900
           #define OBD_FAIL_MGS_ALL_REQUEST_NET     0x901
          diff --git a/lustre/llite/namei.c b/lustre/llite/namei.c
          index 39cd6d5..1cf3518 100644
          --- a/lustre/llite/namei.c
          +++ b/lustre/llite/namei.c
          @@ -813,6 +813,8 @@ static struct inode *ll_create_node(struct inode *dir, const char *name,
                   if (rc)
                           GOTO(out, inode = ERR_PTR(rc));
           
          +       OBD_FAIL_TIMEOUT(OBD_FAIL_MDC_CREATE_PAUSE, 5);
          +
                  LASSERT(ll_d_hlist_empty(&inode->i_dentry));
           
                   /* We asked for a lock on the directory, but were granted a
          [root@eagle-28 lustre-release]# 
          

          and by running the following snippet :

          [root@eagle-28 lustre-release]# date; lctl set_param fail_loc=0x80000805 ; echo foofour > /mnt/lustre/foo8 & sleep 2 ; date ; fid=$(lfs changelog lustre-MDT0000 | grep ' foo8' | sed 's/.* t=//' | sed 's/ p=.*//') ; date ; cat /mnt/lustre/.lustre/fid/$fid ; date
          Tue Jan 17 16:31:09 UTC 2017
          fail_loc=0x80000805
          [1] 27152
          Tue Jan 17 16:31:11 UTC 2017
          Tue Jan 17 16:31:11 UTC 2017
          Tue Jan 17 16:31:11 UTC 2017
          [root@eagle-28 lustre-release]# 
          Message from syslogd@eagle-28 at Jan 17 16:31:14 ...
           kernel:LustreError: 27152:0:(namei.c:818:ll_create_node()) ASSERTION( list_empty(&inode->i_dentry) ) failed: 
                            
          Message from syslogd@eagle-28 at Jan 17 16:31:14 ...
           kernel:LustreError: 27152:0:(namei.c:818:ll_create_node()) LBUG
          

          So this seems to mean that, as I have suspected from the start of our joint analysis, there is a potential race during file creation (including directory/dentry populate) and concurrent access using .lustre/fid method that is allowed to a ChangeLog reader, some kind of a
          "crime of insider trading". Or may be the LBUG is finally inaccurate and could be simply removed?

          More to come from crash-dump/debug-log analysis, and will also check if problem is still alive (like the same LBUG!) in master.

          bfaccini Bruno Faccini (Inactive) added a comment - Thanks Seb, I don't see any Kernel/Lustre patch that could be related to the problem. Concerning the other LBUG, related to LU-2048 , that I have triggered during my first attempts to reproduce, it is a pure problem related to statahead code where some fixes appear to be missing in 2.5.3.90. On the other hand, I have been able to reproduce your LBUG by introducing the following tempo just before the LBUG : [root@eagle-28 lustre-release]# git status # On branch b2_5 # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # # modified: lustre/include/obd_support.h # modified: lustre/llite/namei.c # # Untracked files: # (use "git add <file>..." to include in what will be committed) # # lustre/include/obd_support.h.bfi # lustre/llite/namei.c.bfi # lustre/scripts/make_META.pl no changes added to commit (use "git add" and/or "git commit -a") [root@eagle-28 lustre-release]# [root@eagle-28 lustre-release]# git diff diff --git a/lustre/include/obd_support.h b/lustre/include/obd_support.h index 37bee22..a21f8a0 100644 --- a/lustre/include/obd_support.h +++ b/lustre/include/obd_support.h @@ -430,6 +430,7 @@ int obd_alloc_fail(const void *ptr, const char *name, const char *type, #define OBD_FAIL_MDC_GETATTR_ENQUEUE 0x803 #define OBD_FAIL_MDC_RPCS_SEM 0x804 #define OBD_FAIL_MDC_LIGHTWEIGHT 0x805 +#define OBD_FAIL_MDC_CREATE_PAUSE 0x805 #define OBD_FAIL_MGS 0x900 #define OBD_FAIL_MGS_ALL_REQUEST_NET 0x901 diff --git a/lustre/llite/namei.c b/lustre/llite/namei.c index 39cd6d5..1cf3518 100644 --- a/lustre/llite/namei.c +++ b/lustre/llite/namei.c @@ -813,6 +813,8 @@ static struct inode *ll_create_node(struct inode *dir, const char *name, if (rc) GOTO(out, inode = ERR_PTR(rc)); + OBD_FAIL_TIMEOUT(OBD_FAIL_MDC_CREATE_PAUSE, 5); + LASSERT(ll_d_hlist_empty(&inode->i_dentry)); /* We asked for a lock on the directory, but were granted a [root@eagle-28 lustre-release]# and by running the following snippet : [root@eagle-28 lustre-release]# date; lctl set_param fail_loc=0x80000805 ; echo foofour > /mnt/lustre/foo8 & sleep 2 ; date ; fid=$(lfs changelog lustre-MDT0000 | grep ' foo8' | sed 's/.* t=//' | sed 's/ p=.*//') ; date ; cat /mnt/lustre/.lustre/fid/$fid ; date Tue Jan 17 16:31:09 UTC 2017 fail_loc=0x80000805 [1] 27152 Tue Jan 17 16:31:11 UTC 2017 Tue Jan 17 16:31:11 UTC 2017 Tue Jan 17 16:31:11 UTC 2017 [root@eagle-28 lustre-release]# Message from syslogd@eagle-28 at Jan 17 16:31:14 ... kernel:LustreError: 27152:0:(namei.c:818:ll_create_node()) ASSERTION( list_empty(&inode->i_dentry) ) failed: Message from syslogd@eagle-28 at Jan 17 16:31:14 ... kernel:LustreError: 27152:0:(namei.c:818:ll_create_node()) LBUG So this seems to mean that, as I have suspected from the start of our joint analysis, there is a potential race during file creation (including directory/dentry populate) and concurrent access using .lustre/fid method that is allowed to a ChangeLog reader, some kind of a "crime of insider trading". Or may be the LBUG is finally inaccurate and could be simply removed? More to come from crash-dump/debug-log analysis, and will also check if problem is still alive (like the same LBUG!) in master.

          Hi Bruno, 

           

          Here is the list of patches we apply to our servers kernel:

          0003-Applied-Bull_lmigr-x86_64.el6.0.7.patch-for-RHEL-2.6.patch
          Patch which delays migration of pages between numa nodes

          0006-Applied-Bull_311740-GENERIC_ALLOCATOR-4.patch
          Include generic memory allocator for x86 architecture.

          0008-Applied-Bull_Dynamic_Power_Limitation_2.6.32-573.patch
          CState manipulation interface through /sys

          0011-Applied-Bull_mce75xx_2.6.32-504.diff.patch
          Intel Xeon 7500 series corrected memory error driver

          0012-Applied-Bull_improve-x86_64.csum.patch.patch
          Performance improvement on TCP/IP checksumming

          0017-Applied-raid5-mmp-unplug-dev-rhel6.Bull.3.patch
          Forces sync reads on MD devices' disks for MMP compatibility

          0018-lustre-blkdev_tunables-2.6-rhel6.patch.patch
          Some block device parameters value modifications

          0021-lustre-bh_lru_size_config.patch.patch
          Buffer head per-cpu LRU size increased from 8 to 16

          0022-quota-replace-dqptr-sem.patch
          Replaces dqptr semaphore by spin locks (lustre server kernel patch)

          0029-Applied-proc_pid_limits.patch.patch
          Ease use of writes to /proc/<pid>/limits file

          0030-Applied-patch-Bull_oomprotect.diff.patch
          Patch to protect processes from root to be killed by oom

          0055-quota-avoid-dqget-calls.patch
          (lustre server kernel patch)

          0058-fix-CONFIG_XEN.patch
          Some modifications to have kvm work when XEN is not enabled

          0059-ocfs2-getname.patch
          fix getname/putname changes in ocfs2

          0090-kernel_H-2.6.32-573.1.1-config-x86_64.patch
          0091-kernel_H-2.6.32-573.1.1-config-x86_64.debug.patch
          Some modifications to the default RHEL kernel configuration.

           

          For the lustre code, this is based on 2.5.3.90 (commit 35bb8577c83f787451cd9842a07ac670cb6f4d3f), with some patches that I will send you by mail.

          spiechurski Sebastien Piechurski added a comment - Hi Bruno,    Here is the list of patches we apply to our servers kernel: 0003-Applied-Bull_lmigr-x86_64.el6.0.7.patch-for-RHEL-2.6.patch Patch which delays migration of pages between numa nodes 0006-Applied-Bull_311740-GENERIC_ALLOCATOR-4.patch Include generic memory allocator for x86 architecture. 0008-Applied-Bull_Dynamic_Power_Limitation_2.6.32-573.patch CState manipulation interface through /sys 0011-Applied-Bull_mce75xx_2.6.32-504.diff.patch Intel Xeon 7500 series corrected memory error driver 0012-Applied-Bull_improve-x86_64.csum.patch.patch Performance improvement on TCP/IP checksumming 0017-Applied-raid5-mmp-unplug-dev-rhel6.Bull.3.patch Forces sync reads on MD devices' disks for MMP compatibility 0018-lustre-blkdev_tunables-2.6-rhel6.patch.patch Some block device parameters value modifications 0021-lustre-bh_lru_size_config.patch.patch Buffer head per-cpu LRU size increased from 8 to 16 0022-quota-replace-dqptr-sem.patch Replaces dqptr semaphore by spin locks (lustre server kernel patch) 0029-Applied-proc_pid_limits.patch.patch Ease use of writes to /proc/<pid>/limits file 0030-Applied-patch-Bull_oomprotect.diff.patch Patch to protect processes from root to be killed by oom 0055-quota-avoid-dqget-calls.patch (lustre server kernel patch) 0058-fix-CONFIG_XEN.patch Some modifications to have kvm work when XEN is not enabled 0059-ocfs2-getname.patch fix getname/putname changes in ocfs2 0090-kernel_H-2.6.32-573.1.1-config-x86_64.patch 0091-kernel_H-2.6.32-573.1.1-config-x86_64.debug.patch Some modifications to the default RHEL kernel configuration.   For the lustre code, this is based on 2.5.3.90 (commit 35bb8577c83f787451cd9842a07ac670cb6f4d3f), with some patches that I will send you by mail.

          People

            bfaccini Bruno Faccini (Inactive)
            spiechurski Sebastien Piechurski
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: