  1. Lustre
  2. LU-17736

LTest: Sanityn104 suite - test_73 failed (getxattr should not cause xattr lock cancellation)

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      2021-09-20 23:21:35,549 INF == sanityn test 73: getxattr should not cause xattr lock cancellation ================================ 23:21:35 (1632180095)
      2021-09-20 23:21:35,956 INF llite.lustre-ffff888ee1790000.xattr_cache=1
      2021-09-20 23:21:35,956 INF llite.lustre-ffff88903ccbb000.xattr_cache=1
      2021-09-20 23:21:36,007 INF getfattr: Removing leading '/' from absolute path names
      2021-09-20 23:21:36,008 INF # file: mnt/lustre2/f73.sanityn
      2021-09-20 23:21:36,008 INF user.attr1="value1"
      2021-09-20 23:21:36,008 INF 
      2021-09-20 23:21:36,011 INF getfattr: Removing leading '/' from absolute path names
      2021-09-20 23:21:36,011 INF # file: mnt/lustre/f73.sanityn
      2021-09-20 23:21:36,011 INF user.attr1="value1"
      2021-09-20 23:21:36,011 INF 
      2021-09-20 23:21:36,015 INF getfattr: Removing leading '/' from absolute path names
      2021-09-20 23:21:36,015 INF # file: mnt/lustre/f73.sanityn
      2021-09-20 23:21:36,015 INF user.attr1="value1"
      2021-09-20 23:21:36,016 INF 
      2021-09-20 23:21:36,022 INF  sanityn test_73: @@@@@@ FAIL: not cached in /mnt/lustre 
      2021-09-20 23:21:36,415 INF   Trace dump:
      2021-09-20 23:21:36,415 INF   = /usr/lib/lustre/tests/test-framework.sh:6273:error()
      2021-09-20 23:21:36,415 INF   = /usr/lib/lustre/tests/sanityn.sh:3555:test_73()
      2021-09-20 23:21:36,416 INF   = /usr/lib/lustre/tests/test-framework.sh:6581:run_one()
      2021-09-20 23:21:36,416 INF   = /usr/lib/lustre/tests/test-framework.sh:6628:run_one_logged()
      2021-09-20 23:21:36,416 INF   = /usr/lib/lustre/tests/test-framework.sh:6455:run_test()
      2021-09-20 23:21:36,416 INF   = /usr/lib/lustre/tests/sanityn.sh:3565:main()
      2021-09-20 23:21:36,493 INF Dumping lctl log to /opt/results/1632179694.150497/2021-09-20/231729/sanityn.test_73.*.1632180096.log
      2021-09-20 23:21:38,206 INF Resetting fail_loc on all nodes...done.
      2021-09-20 23:21:38,213 INF FAIL 73 (3s

      This is caused by auditd rules, which trigger additional getxattr calls within the kernel.  I root-caused this by getting eBPF tools working on Ubuntu.  As shown below, a single call to getfattr produces five calls to ll_xattr_cache_get across four distinct stacks (the last stack is hit twice):
       
      root@0b51536c-9f59-4c72-ae4b-0fc8e1cbc430-lc-a0-g0-vm:~# stackcount-bpfcc ll_xattr_cache_get

      Tracing 1 functions for "ll_xattr_cache_get"... Hit Ctrl-C to end.
      ^C
        ll_xattr_cache_get
        ll_xattr_get_common
        __vfs_getxattr
        vfs_getxattr
        getxattr
        path_getxattr
        __x64_sys_getxattr
        do_syscall_64
        entry_SYSCALL_64_after_hwframe
        [unknown]
        [unknown]
          getfattr [19698]
          1  
       
        ll_xattr_cache_get
        ll_xattr_get_common
        __vfs_getxattr
        vfs_getxattr
        getxattr
        path_getxattr
        __x64_sys_getxattr
        do_syscall_64
        entry_SYSCALL_64_after_hwframe
        [unknown]
        [unknown]
          getfattr [19698]
          1  
       
        ll_xattr_cache_get
        ll_xattr_get_common
        __vfs_getxattr
        get_vfs_caps_from_disk
        audit_copy_inode
        __audit_inode
        filename_lookup
        user_path_at_empty
        vfs_statx
        __do_sys_newlstat
        __x64_sys_newlstat
        do_syscall_64
        entry_SYSCALL_64_after_hwframe
        [unknown]
          getfattr [19698]
          1  
       
        ll_xattr_cache_get
        ll_xattr_get_common
        __vfs_getxattr
        get_vfs_caps_from_disk
        audit_copy_inode
        __audit_inode
        filename_lookup
        user_path_at_empty
        path_getxattr
        __x64_sys_getxattr
        do_syscall_64
        entry_SYSCALL_64_after_hwframe
        [unknown]
        [unknown]
          getfattr [19698]
          2

      Only the stacks without "__audit_inode" in them are doing "real" work.  The remaining three calls simply perform getxattrs against the path being operated on as part of filename_lookup.  As long as auditctl reports auditing as enabled AND at least one rule relates to the filesystem (e.g., audit all modifications to /etc/audit), every single lookup results in a getxattr to fetch VFS caps.  The relevant call path through the kernel is:
       
      filename_lookup
      audit_inode
      __audit_inode
      audit_copy_inode
      audit_copy_fcaps
      get_vfs_caps_from_disk
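
The split between "real" and audit-driven calls can be tallied mechanically from saved stackcount output. The following is only a rough sketch: classify_getxattr_stacks is a hypothetical helper, and it assumes each stack ends with a line holding nothing but its hit count, as in the output above.

```shell
# Hypothetical helper (not part of the Lustre test framework).
# Reads stackcount output on stdin and sums the per-stack hit counts,
# split by whether the stack passed through __audit_inode.
classify_getxattr_stacks() {
	awk '
		# Any frame named __audit_inode marks the current stack as audit-driven.
		/__audit_inode/ { audit = 1 }
		# A line holding only a number is the hit count that closes a stack.
		/^[[:space:]]*[0-9]+[[:space:]]*$/ {
			if (audit)
				audit_total += $1
			else
				direct_total += $1
			audit = 0
		}
		END { printf "direct=%d audit=%d\n", direct_total, audit_total }
	'
}
```

For example, after saving the stackcount output to a file (here a made-up name, stacks.txt), run classify_getxattr_stacks < stacks.txt.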
       
      Disabling auditd via auditctl -e0 returns the total number of getxattrs to the expected value of 2.
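
To confirm that auditing is actually switched on for a node showing this failure, the status reported by auditctl -s can be checked. This is a diagnostic sketch only: audit_is_enabled is a hypothetical helper, and it assumes the status format that prints one "name value" pair per line (e.g. "enabled 1"). It does not determine whether any loaded rule is filesystem-related.

```shell
# Hypothetical helper: reads "auditctl -s" output on stdin and succeeds
# when the "enabled" field is non-zero (1 = enabled, 2 = enabled and locked).
# Assumes one "name value" pair per line; older single-line formats are not handled.
audit_is_enabled() {
	awk '$1 == "enabled" && $2 + 0 != 0 { on = 1 } END { exit !on }'
}
```

Typical usage, as root, would be: auditctl -s | audit_is_enabled && echo "auditing is on". The loaded rules themselves can be listed with auditctl -l.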
       
      I suspect the simplest fix is to set the number expected to be a minimum of 2, rather than exactly 2.  Trying to do system detection of auditd rules that the kernel will detect as filesystem-oriented seems like a lot of work for little payoff.
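
A minimal sketch of the relaxed comparison suggested above. It is not the actual sanityn.sh code or the eventual patch: check_min_getxattr_count is a hypothetical helper, and the caller is assumed to have already collected the observed count.

```shell
# Hypothetical helper: succeed when the observed getxattr count is at least
# the expected minimum (default 2), instead of requiring an exact match.
check_min_getxattr_count() {
	local observed=$1
	local minimum=${2:-2}

	if [ "$observed" -lt "$minimum" ]; then
		echo "not cached: saw $observed getxattrs, expected at least $minimum" >&2
		return 1
	fi
	return 0
}
```

With auditing enabled the observed count was 5 and with it disabled it was 2; both pass this check, while a count below 2 still fails.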

      Patch to be sent shortly.

          People

            elliswilson Ellis Wilson