Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.12.0, Lustre 2.10.5
    • Affects Version/s: Lustre 2.10.4
    • Labels: None
    • Environment: centos 7.5, x86_64, OPA, zfs 0.7.9
    • Severity: 3
    • 9223372036854775807

    Description

      2.10.4 client seems to have introduced a regression from 2.10.3.

      we now see messages like this from clients:

      Jun  7 06:33:32 john73 kernel: Invalid argument reading file caps for /home/fstars/dwf_prepipe/dwf_prepipe_processccd.py
      Jun  7 10:55:40 bryan8 kernel: Invalid argument reading file caps for /bin/date
      Jun  7 11:05:29 john75 kernel: Invalid argument reading file caps for /usr/bin/basename
      Jun  7 11:51:29 john97 kernel: Invalid argument reading file caps for /usr/bin/id
      Jun  7 11:51:29 john97 kernel: Invalid argument reading file caps for /apps/lmod/lmod/lmod/libexec/addto
      

      the upshot of which is that those files then can't be exec'd by the kernel.
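      As a quick check of where the EINVAL comes from, reading the security.capability xattr directly should show whether the xattr read itself is what fails (that read is what the kernel does just before exec when it prints that message). A minimal sketch, assuming getfattr and getcap are installed, using one of the paths from the messages above:

      getfattr -n security.capability -e hex /usr/bin/basename   # raw xattr read; an EINVAL here matches the kernel message
      getcap /usr/bin/basename                                    # human-readable view of the same file capabilities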

      all our servers are now centos 7.4 and 2.10.4 + the LU-10988 lfsck patch, zfs 0.7.9.
      we have 4 lustre filesystems in the cluster and this 'file caps' issue happens on all of them, more often on the root filesystem because there are more executables there.

      for some files it seems to happen on all clients and be persistent, e.g. all the 2.10.4 client nodes see this:

      [root@john72 ~]# g++
      -bash: /usr/bin/g++: Invalid argument
      [root@john72 ~]# dmesg | tail -1
      [616489.562465] Invalid argument reading file caps for /usr/bin/g++
      

      and for other files it's transient, e.g. the executables on the nodes listed above all work again now:

      [root@john97 ~]# /usr/bin/id
      uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)
      

      g++ is interesting because it's hard-linked 4 times (to c++, ...), which might be part of why it persists? copying each of c++, g++, etc. to a separate (non-hardlinked) file is a workaround and lets it be exec'd again, but that doesn't explain all the other files that sometimes work and sometimes don't.
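      For what it's worth, a sketch of checking the hard-link count and applying the copy workaround described above (the c++ path and the .copy name are assumptions based on a standard CentOS layout, not taken from this report):

      stat -c '%h %i %n' /usr/bin/g++ /usr/bin/c++   # link count and inode; hard links share the inode
      cp -a /usr/bin/g++ /usr/bin/g++.copy            # the copy is a separate (non-hardlinked) inode
      /usr/bin/g++.copy --version                     # the copied file execs again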

      apart from things like g++, the problem is rare, less than once per client per day.

      as a workaround (so we can get all clients onto the more secure centos7.5) we'd like to run 2.10.3 on centos7.5 for a while, but it doesn't seem to work (it appears to mount, but then ls says 'not a directory'). I don't suppose there's a patch or two that'll let 2.10.3 be functional on centos7.5? thanks.

      cheers,
      robin

          Activity

            [LU-11074] Invalid argument reading file caps
            pjones Peter Jones added a comment -

            Sorry - Lai, I intended that comment for another ticket

            pjones Peter Jones added a comment -

            Lai

            Can you please investigate?

            Thanks

            Peter

            adilger Andreas Dilger added a comment -

            If you can't find which patch is the source of the problem, I'd suggest using git bisect with your "good" reproducer (possibly run multiple times to ensure you don't get a false pass) to isolate the issue to a single patch. That will allow us to identify which patch introduced the problem and possibly see how it is interacting badly with overlayfs.
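            In case it's useful, a minimal sketch of that bisect workflow on b2_10, assuming the v2_10_3/v2_10_4 tags referenced elsewhere in this ticket and the g++ failure as the reproducer:

            git bisect start v2_10_4 v2_10_3   # v2_10_4 is bad, v2_10_3 is good
            # at each step: build and install the client, mount, run the reproducer several
            # times to avoid a false pass, then mark the result
            git bisect good                     # or: git bisect bad
            git bisect reset                    # once the first bad commit has been reported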

            scadmin SC Admin added a comment -

            we're using the 862.3.2 kernel, the latest AFAIK.

            I'm being hesitant about debug logs 'cos I'm not 100% convinced it's a lustre bug. we definitely don't see this issue with rhel7.4 + 2.10.3, but the complication is that we use overlayfs over our root lustre filesystem.

            overlayfs changed a lot between 7.4 and 7.5 and I've re-patched it etc, but it might still be an overlayfs bug, or an overlayfs interaction with lustre that's now different vs. a pure lustre bug.

            the thing that indicates it's maybe a real lustre issue is that we see the 'file caps' problem on all filesystems - /home, /apps, /fred(dagg) - and not just on /images (which is the only one with overlayfs over it).

            AFAIK the only thing these 4 filesystems share is the root inode, which is on overlayfs. it seems really unlikely that the node would be healthy for all accesses via the root inode/dentry and at the same time see 'file caps' fail on one of the pure lustre filesystems, but I wanted to try a few things first, e.g. patching the rhel 7.5 kernel with a bunch of stable capabilities namespace backports that rhel seems to have omitted... unfortunately that didn't fix it.

            the g++ 'file caps' bug (the one that's trivial to reproduce) doesn't happen if I go directly to lustre, so there's definitely something wrong with overlayfs. I was sure I'd tried this before making this bug report, but I guess not.
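            (For reference, the comparison that shows this is roughly the following, where /mnt/images is a hypothetical placeholder for wherever the underlying lustre filesystem is mounted directly, outside the overlay:)

            /usr/bin/g++ --version              # via the overlayfs root: fails with "Invalid argument"
            /mnt/images/usr/bin/g++ --version   # same file via the direct lustre mount: runs normally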

            however, g++ failing via overlayfs and working via lustre doesn't explain the much rarer failures direct to lustre on the other 3 filesystems (+/- that shared root inode). but I can't reproduce those at will - they are rare - so I don't see how I can get you a debug trace for those.

            I can't figure out from 'git log v2_10_3..v2_10_4' on b2_10 which patch(es) make the lustre client work with rhel7.5's kernel. if there are one or two that you can point me at, that would help: if 2.10.3 plus those patches is still busted with rhel7.5, then it's a rhel7.5 kernel issue and nothing to do with lustre.
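            In case it helps narrow things down, a rough filter over that range, assuming the rhel7.5 compatibility patches mention the release or its 3.10.0-862 kernel in their commit subjects (which may not hold for all of them):

            git log --oneline v2_10_3..v2_10_4 | grep -iE 'rhel ?7\.5|el7\.5|3\.10\.0-862'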

            cheers,
            robin

            pjones Peter Jones added a comment -

            Robin

            Any idea how long it will take to get the debug logs?

            Peter

            jhammond John Hammond added a comment -

            Which 7.5 kernel are you using?

            scadmin SC Admin added a comment -

            Hey John,

            no, not using any LSM.

            I'll gather the debug log for e.g. g++ when a node clears of jobs; otherwise there'll be lots of noise.

            cheers,
            robin

            jhammond John Hammond added a comment -

            Hi Robin,

            Are you using any Linux Security Modules? Could you enable full debugging, clear the debug log, reproduce this, dump the log and attach? (You may need to increase the debug_mb parameter to get a full capture.)
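            A minimal sketch of that capture, assuming the g++ reproducer and with the debug_mb value and output path as placeholders:

            lctl set_param debug=-1 debug_mb=1024   # enable full debugging, enlarge the debug buffer
            lctl clear                               # clear the existing debug log
            /usr/bin/g++                             # reproduce the failure
            lctl dk > /tmp/lu-11074-debug.log        # dump the debug log to a file to attach here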

            People

              Assignee: jhammond John Hammond
              Reporter: scadmin SC Admin
              Votes: 0
              Watchers: 6
