Lustre / LU-17323

fork() leaks ERESTARTNOINTR (errno 513) to user application

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.11.0, Lustre 2.12.5, Lustre 2.12.6, Lustre 2.12.9
    • Labels: None
    • Environment: RHEL6, RHEL7/CentOS7 (various kernels)
    • Severity: 3

    Description

      When using file locks on a Lustre mount with the 'flock' mount option, fork()
      can leak ERESTARTNOINTR to a user application.  The fork() system call checks
      whether a signal is pending and, if so, cleans up everything it did and returns
      ERESTARTNOINTR.  The kernel is then supposed to restart the fork() from scratch
      transparently; the user application should never see ERESTARTNOINTR as an errno.
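ERESTARTNOINTR is a kernel-internal restart code, so userspace <errno.h> does not define it. As a minimal sketch of how an application might recognize the leak, the helper below checks for the numeric value reported in this ticket (513). The names KERNEL_ERESTARTNOINTR, is_leaked_restart and checked_fork are hypothetical and are not part of the attached reproducer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel-internal restart code. Userspace headers do not define it, so the
 * value seen in this ticket (errno 513) is spelled out here. */
#define KERNEL_ERESTARTNOINTR 513

/* Hypothetical helper: returns 1 if err is the leaked restart code. */
int is_leaked_restart(int err)
{
    return err == KERNEL_ERESTARTNOINTR;
}

/* Hypothetical wrapper around fork(). Sets *leaked to 1 only when fork()
 * failed with the restart code that should never reach userspace.
 * Returns whatever fork() returned; errno is left untouched. */
pid_t checked_fork(int *leaked)
{
    assert(leaked != NULL);
    *leaked = 0;

    pid_t pid = fork();
    if (pid == -1 && is_leaked_restart(errno))
        *leaked = 1;
    return pid;
}
```

On an unaffected system, checked_fork() behaves like plain fork() and always leaves *leaked at 0.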

      The fork() cleanup code calls exit_files(), which calls into Lustre code.  I'm
      not sure what goes wrong at a low level.  It may be that the Lustre code
      clears the TIF_SIGPENDING flag, which prevents the kernel from restarting the
      fork(), so the ERESTARTNOINTR errno leaks to the user application.

      It seems there have to be multiple threads involved.  My reproducer has two
      threads. Thread 1 does fork() calls in an infinite loop, spawning children
      that exit after a random number of seconds.  Thread 2 sleeps for a random
      number of seconds in an infinite loop.  There is a SIGCHLD handler set up and
      both threads can handle SIGCHLD signals.  The fork() gets interrupted by
      pending SIGCHLD signals from exiting children.  I think thread 2 has to handle
      the SIGCHLD signal for the problem to happen.  If thread 2 has SIGCHLD signals
      blocked, the problem never happens.
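The two-thread structure described above could look roughly like the sketch below. It is not the attached repro.c: the function names, the stop flag and the bounded iteration count are assumptions made so the example terminates, and it presumes the process already holds a read lock on a file on the Lustre mount.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set to 1 to make the sleeper thread return (hypothetical addition). */
atomic_int stop_sleeper;

/* SIGCHLD handler: reap every child that has exited so far. */
static void on_sigchld(int sig)
{
    int saved = errno;
    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved;
}

/* Install the handler process-wide. Neither thread blocks SIGCHLD, so the
 * kernel may deliver the signal to either of them. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Thread 2: sleep for a random 0..max_sleep seconds, repeatedly. */
void *sleeper_thread(void *arg)
{
    unsigned seed = 12345u;
    int max_sleep = *(const int *)arg;
    while (!atomic_load(&stop_sleeper))
        sleep((unsigned)(rand_r(&seed) % (max_sleep + 1)));
    return NULL;
}

/* Thread 1: fork `iterations` children, each sleeping a random
 * 0..max_sleep seconds before exiting. Returns 0, or the errno of the
 * first failed fork() (513 would be the leak reported here). */
int fork_loop(int iterations, int max_sleep)
{
    unsigned seed = (unsigned)getpid();
    assert(iterations >= 0 && max_sleep >= 0);

    for (int i = 0; i < iterations; i++) {
        /* Choose the delay before fork(): the child of a multithreaded
         * process should only call async-signal-safe functions. */
        unsigned delay = (unsigned)(rand_r(&seed) % (max_sleep + 1));
        pid_t pid = fork();
        if (pid == 0) {
            sleep(delay);
            _exit(0);
        }
        if (pid == -1)
            return errno;
    }
    return 0;
}
```

A caller would install the handler, start sleeper_thread with pthread_create(), and then run fork_loop() from the main thread. The real reproducer loops forever rather than for a fixed number of iterations.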

      The problem doesn't reproduce with the 'localflock' mount option, so we
      believe 'localflock' is safe from this issue.

      We've seen this on RHEL6 and RHEL7/CentOS7 kernels with Lustre 2.11.0,
      2.12.5, and 2.12.6.  Lustre 2.12.0 does not reproduce the issue.

      Steps to reproduce:

      1) Lustre mount must be using the 'flock' mount option.
      2) gcc -o repro ./repro.c -lpthread
      3) Run the reproducer.  The problem usually reproduces within 5-60 seconds.
         The reproducer runs indefinitely or until the issue occurs; press Ctrl-C
         to quit.

      > touch /lustre_mnt/testfile.txt
      > ./repro /lustre_mnt/testfile.txt
      Fork returned -1, errno = 513, exiting...

      Use a POSIX-style read lock:
      > ./repro /lustre_mnt/testfile.txt posix
      Fork returned -1, errno = 513, exiting...

      Use a BSD-style read lock:
      > ./repro /lustre_mnt/testfile.txt flock
      Fork returned -1, errno = 513, exiting...

      Don't lock at all (this won't reproduce and will run indefinitely)
      > ./repro /lustre_mnt/testfile.txt none

      NOTE: be aware the reproducer can exhaust your maxprocs limit
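For readers without the attachment, the three lock modes accepted on the command line ('posix', 'flock', 'none') plausibly map to system calls as in the sketch below. The function name take_read_lock and the choice of a blocking, whole-file lock are assumptions; the attached repro.c may differ.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: take a shared (read) lock on fd.
 * mode is "posix", "flock", or "none".
 * Returns 0 on success, -1 on error or for an unknown mode. */
int take_read_lock(int fd, const char *mode)
{
    assert(mode != NULL);

    if (strcmp(mode, "none") == 0)
        return 0;                       /* no locking at all */

    if (strcmp(mode, "flock") == 0)
        return flock(fd, LOCK_SH);      /* BSD-style shared lock */

    if (strcmp(mode, "posix") == 0) {
        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_RDLCK;            /* POSIX-style read lock */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;                   /* 0 means "to end of file" */
        return fcntl(fd, F_SETLKW, &fl);
    }

    return -1;
}
```

The descriptor passed in must be open for reading, since a POSIX read lock requires it.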


            People

              Assignee: wc-triage WC Triage
              Reporter: mikedoo4 Mike D