[LU-4061] replay-single test_73c: MDS returns error when no objects available Created: 04/Oct/13 Updated: 09/Jan/20 Resolved: 09/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 10884 |
| Description |
|
This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com> This issue relates to the following test suite run: The sub-test test_73c failed with the following error:
Info required for matching: replay-single 73c |
| Comments |
| Comment by Andreas Dilger [ 04/Oct/13 ] |
|
It looks like the MDS timed out trying to create any objects on the OSTs (which are very slow due to ZFS on a single VM disk), and returned an error back to multiop doing open(O_CREAT|O_RDWR). The MDS should never return an error during create when all of the OSTs are out of objects. Instead, the MDS should keep trying indefinitely to create the OST objects, and it can return -EINPROGRESS (instead of the current -EIO) to the client, so that the client retries without blocking up an MDS thread.

15:14:44:Lustre: DEBUG MARKER: == replay-single test 73c: open(O_CREAT), unlink, replay, reconnect at last_replay, close == 15:13:31 (1380838411)
15:14:44:Lustre: lustre-OST0006-osc-MDT0000: slow creates, last=[0x0:0x1:0x0], next=[0x0:0x1:0x0], reserved=0, syn_changes=1, syn_rpc_in_progress=8, status=0
15:14:44:Lustre: lustre-OST0000-osc-MDT0000: slow creates, last=[0x0:0x1:0x0], next=[0x0:0x1:0x0], reserved=0, syn_changes=0, syn_rpc_in_progress=17, status=0
15:14:44:Lustre: lustre-OST0001-osc-MDT0000: slow creates, last=[0x0:0x1:0x0], next=[0x0:0x1:0x0], reserved=0, syn_changes=17, syn_rpc_in_progress=8, status=0
15:14:44:LNet: Service thread pid 3621 completed after 60.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
15:14:44:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_73c: @@@@@@ FAIL: test_73c failed with 3 |
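The retry behaviour proposed above can be illustrated with a small simulation. This is only a sketch of the idea, not Lustre code: `SimulatedMDS`, `open_with_retry`, the backoff values, and the `max_tries` bound are all hypothetical names and parameters invented for illustration. The only assumption taken from the comment is that the server answers with -EINPROGRESS while no OST objects are available, and that the client treats this as "try again later" rather than as a failure.

```python
import errno


class SimulatedMDS:
    """Hypothetical stand-in for an MDS whose OSTs are slow to precreate objects."""

    def __init__(self, ready_after):
        # Number of create requests that arrive before any objects exist.
        self.ready_after = ready_after
        self.attempts = 0

    def create(self):
        """Return 0 on success, or -EINPROGRESS while no objects are available."""
        self.attempts += 1
        if self.attempts <= self.ready_after:
            # Answer immediately instead of holding a service thread.
            return -errno.EINPROGRESS
        return 0


def open_with_retry(mds, max_tries=None, base_delay=1.0, max_delay=30.0,
                    sleep=lambda seconds: None):
    """Client-side loop: resend the create while the server says -EINPROGRESS.

    max_tries=None retries indefinitely; a bound is only useful for testing.
    Returns (return code, list of delays waited between attempts).
    """
    delay = base_delay
    delays = []
    tries = 0
    while True:
        rc = mds.create()
        tries += 1
        if rc != -errno.EINPROGRESS:
            return rc, delays  # success, or a genuine error
        if max_tries is not None and tries >= max_tries:
            return rc, delays  # bound reached, still in progress
        sleep(delay)
        delays.append(delay)
        delay = min(delay * 2, max_delay)  # capped exponential backoff


if __name__ == "__main__":
    mds = SimulatedMDS(ready_after=3)
    rc, delays = open_with_retry(mds)
    print("rc:", rc, "attempts:", mds.attempts, "delays:", delays)
```

The `sleep` argument is injected so the sketch runs instantly; a real client would pass `time.sleep` or rely on its own resend machinery. The point of the design is that each -EINPROGRESS reply frees the server thread immediately, so slow OSTs delay the client without tying up the MDS.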
| Comment by Andreas Dilger [ 09/Jan/20 ] |
|
Close old bug |