Details
Bug
Resolution: Fixed
Minor
Description
bluepill-client09: Commencing write performance test: Thu Aug 5 09:35:24 2021
bluepill-client09: ior ERROR: write(15, 0x7f4e2e015000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
bluepill-client09: ior ERROR: write(17, 0x7f8026d57000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 6
pdsh@bluepill-client02: bluepill-client09: ssh exited with exit code 255
pdsh@bluepill-client02: bluepill-client02: ssh exited with exit code 1
EAGAIN comes from ofd_lvbo_init():
	/* Object could be recreated during the first
	 * CLEANUP_ORPHAN request. */
	if (rc == -ENOENT) {
		seq = fid_seq(&info->fti_fid);
		oseq = ofd_seq_load(env, ofd, fid_seq_is_idif(seq) ?
				    FID_SEQ_OST_MDT0 : seq);
		if (!IS_ERR_OR_NULL(oseq)) {
			if (!oseq->os_last_id_synced)
				rc = -EAGAIN;
			ofd_seq_put(env, oseq);
		}
	}
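The code was added in LU-11765, which returns EAGAIN instead of ENOENT on the expectation that EAGAIN leads to a resend. That does happen for enqueue, due to (a hack?) cl_glimpse_size0(), but not for other requests, so for write the EAGAIN reaches the application as errno 11, as in the IOR log above.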
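To make the mismatch concrete, here is a minimal standalone model of the behaviour described above. It is only a sketch, not Lustre code: struct seq_model, remap_lookup_rc(), enum req_kind and caller_resends() are hypothetical names invented for illustration, and the resend policy is reduced to the single statement in this ticket (enqueue resends on EAGAIN, other requests do not).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-sequence state; only the flag
 * tested by the ofd_lvbo_init() snippet is modeled. */
struct seq_model {
	bool last_id_synced;
};

/* Hypothetical helper: turns -ENOENT into -EAGAIN while the sequence's
 * last id is not yet synced, as the snippet does. Other return codes
 * pass through unchanged. */
static int remap_lookup_rc(int rc, const struct seq_model *oseq)
{
	if (rc == -ENOENT && oseq != NULL && !oseq->last_id_synced)
		rc = -EAGAIN;
	return rc;
}

/* Hypothetical request kinds; only the two mentioned in the ticket. */
enum req_kind { REQ_ENQUEUE, REQ_WRITE };

/* Hypothetical model of the caller side as described in the ticket:
 * only enqueue treats -EAGAIN as "resend". */
static bool caller_resends(enum req_kind kind, int rc)
{
	return rc == -EAGAIN && kind == REQ_ENQUEUE;
}

/* Walks through the failing case: the object is missing and the last
 * id is not synced, so the server-side code returns -EAGAIN. */
void model_self_check(void)
{
	struct seq_model unsynced = { .last_id_synced = false };
	int rc = remap_lookup_rc(-ENOENT, &unsynced);

	assert(rc == -EAGAIN);
	assert(caller_resends(REQ_ENQUEUE, rc));  /* retried */
	assert(!caller_resends(REQ_WRITE, rc));   /* surfaces to the app */
}
```

In this model the write path falls through with -EAGAIN, which corresponds to the "errno 11, Resource temporarily unavailable" that IOR reports.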
Landed for 2.15