Lustre / LU-9735

SLES12 SP2 and Lustre 2.9: getcwd() sometimes fails


      This is a duplicate of LU-9208; I am opening this case for tracking at NASA. We started to see this after we updated the clients to SLES12 SP2 and Lustre 2.9.

      Using the test code provided in LU-9208 (miranda), I was able to reproduce the bug on a single node.

       

      Iteration =    868, Run Time =     0.9614 sec., Transfer Rate =   120.7790 10e+06 Bytes/sec/proc
      Iteration =    869, Run Time =     1.5308 sec., Transfer Rate =    75.8561 10e+06 Bytes/sec/proc
      forrtl: severe (121): Cannot access current working directory for unit 10012, file "Unknown"
      Image              PC                Routine            Line        Source             
      miranda            0000000000409F29  Unknown               Unknown  Unknown
      miranda            00000000004169D2  Unknown               Unknown  Unknown
      miranda            0000000000404045  Unknown               Unknown  Unknown
      miranda            0000000000402FDE  Unknown               Unknown  Unknown
      libc.so.6          00002AAAAB5B96E5  Unknown               Unknown  Unknown
      miranda            0000000000402EE9  Unknown               Unknown  Unknown
      MPT ERROR: MPI_COMM_WORLD rank 12 has terminated without calling MPI_Finalize()
      	aborting job
      
      

       I captured some debug logs and attached them to the case. I was unable to reproduce the failure with "+trace" enabled, but I will continue to try.
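
       For anyone who wants a quick check without the Fortran runtime, a minimal sketch of a getcwd() probe follows. It is not the attached getcwdHack.c and not the miranda test; the function name `count_getcwd_failures` and the iteration count are hypothetical. It assumes the script is started from a directory on the affected Lustre mount while an I/O load is running, and whether it triggers the failure at all is untested.

```python
import errno
import os


def count_getcwd_failures(iterations):
    """Call os.getcwd() repeatedly and record every failure.

    Returns a list of (iteration, errno_name) tuples, one per failed call.
    """
    failures = []
    for i in range(iterations):
        try:
            os.getcwd()
        except OSError as exc:
            # Keep the symbolic errno name (e.g. ENOENT) for the report.
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((i, name))
    return failures


if __name__ == "__main__":
    # Hypothetical iteration count; raise it if the failure is rare.
    failed = count_getcwd_failures(100000)
    print(f"{len(failed)} getcwd() failures")
    for iteration, name in failed[:10]:
        print(f"  iteration {iteration}: {name}")
```

       An empty result on a healthy mount is expected; any entry in the returned list gives the iteration and errno of a failed call, which may help when matching against the debug logs.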

        1. getcwdHack.c
          6 kB
        2. miranda.debug.1499341246.gz
          84.13 MB
        3. miranda.dis
          9.19 MB
        4. r481i7n17.dump1.log.gz
          13.86 MB
        5. unoptimize-atomic_open-of-negative-dentry.patch
          2 kB

            simmonsja James A Simmons
            mhanafi Mahmoud Hanafi
            Votes: 1
            Watchers: 24
