[LU-15195] IOR SSF: ior ERROR: write() failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535) Created: 04/Nov/21  Updated: 29/Jun/22  Resolved: 23/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Vitaly Fertman Assignee: Vitaly Fertman
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

bluepill-client09: Commencing write performance test: Thu Aug 5 09:35:24 2021
bluepill-client09: ior ERROR: write(15, 0x7f4e2e015000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
bluepill-client09: ior ERROR: write(17, 0x7f8026d57000, 1705984) failed, errno 11, Resource temporarily unavailable (aiori-POSIX.c:535)
bluepill-client09: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 6
pdsh@bluepill-client02: bluepill-client09: ssh exited with exit code 255
pdsh@bluepill-client02: bluepill-client02: ssh exited with exit code 1

EAGAIN comes from ofd_lvbo_init():

		/* Object could be recreated during the first
		 * CLEANUP_ORPHAN request. */
		if (rc == -ENOENT) {
			seq = fid_seq(&info->fti_fid);
			oseq = ofd_seq_load(env, ofd, fid_seq_is_idif(seq) ?
					    FID_SEQ_OST_MDT0 : seq);
			if (!IS_ERR_OR_NULL(oseq)) {
				if (!oseq->os_last_id_synced)
					rc = -EAGAIN;
				ofd_seq_put(env, oseq);
			}
		}

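This code was added in LU-11765, which returns EAGAIN instead of ENOENT on the expectation that EAGAIN leads to a resend. That does happen for enqueue, due to (a hack in?) cl_glimpse_size0(), but not for other requests, so there the EAGAIN is passed up to the application, as in the write() failure above (errno 11 is EAGAIN on Linux).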
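The asymmetry described above can be sketched as a small user-space model. This is only an illustration, not Lustre code: struct seq_state, enum req_kind, map_lookup_rc(), client_retries() and app_visible_rc() are hypothetical names, and treating a retried request as returning 0 is a simplifying assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-sequence state; only this flag matters. */
struct seq_state {
	bool last_id_synced;
};

/* Hypothetical request kinds, just enough to show the asymmetry. */
enum req_kind { REQ_ENQUEUE, REQ_WRITE };

/* Server side (model): ENOENT becomes EAGAIN while LAST_ID is not synced. */
static int map_lookup_rc(int rc, const struct seq_state *oseq)
{
	if (rc == -ENOENT && oseq != NULL && !oseq->last_id_synced)
		rc = -EAGAIN;
	return rc;
}

/* Client side (model): only the enqueue path retries on EAGAIN. */
static bool client_retries(enum req_kind kind, int rc)
{
	return kind == REQ_ENQUEUE && rc == -EAGAIN;
}

/*
 * Error the application would see. Assumption: a retried request
 * eventually succeeds, which is modelled here as returning 0.
 */
static int app_visible_rc(enum req_kind kind, int rc,
			  const struct seq_state *oseq)
{
	rc = map_lookup_rc(rc, oseq);
	return client_retries(kind, rc) ? 0 : rc;
}

/* Walks through the three cases discussed in this ticket. */
void self_check(void)
{
	struct seq_state unsynced = { .last_id_synced = false };
	struct seq_state synced = { .last_id_synced = true };

	/* Enqueue: EAGAIN is retried, the application sees no error. */
	assert(app_visible_rc(REQ_ENQUEUE, -ENOENT, &unsynced) == 0);
	/* Write: nothing retries, EAGAIN reaches the application. */
	assert(app_visible_rc(REQ_WRITE, -ENOENT, &unsynced) == -EAGAIN);
	/* LAST_ID synced: the object is really missing, ENOENT stays. */
	assert(app_visible_rc(REQ_WRITE, -ENOENT, &synced) == -ENOENT);
}
```

In this model the second case corresponds to the IOR failure in the description: the write path has no retry on EAGAIN, so the error surfaces from write().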



 Comments   
Comment by Gerrit Updater [ 04/Nov/21 ]

"Vitaly Fertman <vitaly.fertman@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45459
Subject: LU-15195 ofd: missing OST object
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4c4f67feb9d0becd9d569b3ad396c006be8e5da7

Comment by Gerrit Updater [ 23/Dec/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45459/
Subject: LU-15195 ofd: missing OST object
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 28769c65987cb1546918fe12d6f34b95ab9c5507

Comment by Peter Jones [ 23/Dec/21 ]

Landed for 2.15
