Lustre / LU-15195

IOR SSF: ior ERROR: write() failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.15.0
    • Affects Version/s: None
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      bluepill-client09: Commencing write performance test: Thu Aug 5 09:35:24 2021
      bluepill-client09: ior ERROR: write(15, 0x7f4e2e015000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
      bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
      bluepill-client09: ior ERROR: write(17, 0x7f8026d57000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
      bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 6
      pdsh@bluepill-client02: bluepill-client09: ssh exited with exit code 255
      pdsh@bluepill-client02: bluepill-client02: ssh exited with exit code 1

      EAGAIN comes from ofd_lvbo_init():

      		/* Object could be recreated during the first
      		 * CLEANUP_ORPHAN request. */
      		if (rc == -ENOENT) {
      			seq = fid_seq(&info->fti_fid);
      			oseq = ofd_seq_load(env, ofd, fid_seq_is_idif(seq) ?
      					    FID_SEQ_OST_MDT0 : seq);
      			if (!IS_ERR_OR_NULL(oseq)) {
      				if (!oseq->os_last_id_synced)
      					rc = -EAGAIN;
      				ofd_seq_put(env, oseq);
      			}
      		}
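The remapping in the snippet above can be modeled in user space roughly as follows. This is only a sketch of the decision logic: `struct seq_state` and `remap_lookup_error()` are hypothetical stand-ins introduced for illustration, not Lustre functions, and the real code obtains the sequence through `ofd_seq_load()` rather than taking it as an argument.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-sequence state; only the flag that
 * the quoted snippet checks (os_last_id_synced) is modeled. */
struct seq_state {
	bool last_id_synced;
};

/*
 * Model of the quoted branch: a lookup that failed with -ENOENT is
 * reported as -EAGAIN while the sequence is not yet marked synced.
 * Every other return code, and -ENOENT with no usable sequence or a
 * synced sequence, passes through unchanged.
 */
static int remap_lookup_error(int rc, const struct seq_state *oseq)
{
	if (rc == -ENOENT && oseq != NULL && !oseq->last_id_synced)
		return -EAGAIN;
	return rc;
}
```

In this model, `remap_lookup_error(-ENOENT, &s)` with `s.last_id_synced == false` yields `-EAGAIN`, which matches the errno 11 reported by ior in the log above.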
      

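      This code was added in LU-11765, which returns EAGAIN instead of ENOENT on the expectation that EAGAIN would lead to a resend. A resend does happen for enqueue, due to (a hack?) cl_glimpse_size0(), but not for other requests, so there the EAGAIN is passed back to the application, as in the failed write() calls above.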
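The asymmetry described here can be illustrated with a toy model. It is not Lustre client code: `enum req_kind`, `client_retries()` and `result_seen_by_caller()` are hypothetical names, and the model assumes for simplicity that a retried request eventually succeeds.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request kinds; only the distinction the ticket draws
 * (enqueue versus everything else, e.g. write) is modeled. */
enum req_kind { REQ_ENQUEUE, REQ_WRITE };

/* In this simplified model, only the enqueue path treats -EAGAIN as
 * "try again"; other paths have no such handling. */
static bool client_retries(enum req_kind kind, int rc)
{
	return rc == -EAGAIN && kind == REQ_ENQUEUE;
}

/* Error finally seen by the caller. Assumption: a retried request
 * succeeds, so it is reported as 0; otherwise rc is passed through. */
static int result_seen_by_caller(enum req_kind kind, int rc)
{
	return client_retries(kind, rc) ? 0 : rc;
}
```

Under these assumptions a write that receives `-EAGAIN` returns it unchanged to the caller, which is the failure ior reports, while an enqueue that receives the same code is retried.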

          People

            Assignee: vitaly_fertman Vitaly Fertman
            Reporter: vitaly_fertman Vitaly Fertman