Lustre / LU-8787

zpool containing MDT0000 out of space

Details

    • Type: Question/Request
    • Resolution: Done
    • Priority: Minor
    • Environment: Lustre: Build Version: 2.8.0_5.chaos

    Description

      On a DNE file system, MDT0000 ran out of space while one or more other MDTs were in recovery.

         2016-10-31 18:26:53 [20537.964631] Lustre: Skipped 1 previous similar message
         2016-10-31 18:26:58 [20542.793836] LustreError: 31561:0:(osd_handler.c:223:osd_trans_start()) lsh-MDT0000: failed to start transaction due to ENOSPC. Metadata overhead is underestimated or grant_ratio is too low.
         2016-10-31 18:26:58 [20542.815473] LustreError: 31561:0:(osd_handler.c:223:osd_trans_start()) Skipped 39 previous similar messages
         2016-10-31 18:26:58 [20542.827434] LustreError: 31561:0:(llog_cat.c:744:llog_cat_cancel_records()) lsh-OST0009-osc-MDT0000: fail to cancel 1 of 1 llog-records: rc = -28
         2016-10-31 18:26:58 [20542.843771] LustreError: 31561:0:(osp_sync.c:1031:osp_sync_process_committed()) lsh-OST0009-osc-MDT0000: can't cancel record: -28
      

      Obviously the first step is to increase the capacity of the pool. However, after that is done, is further action required? Should I run lfsck, or do anything else?
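
      A minimal sketch of that first step, assuming a ZFS mirror layout and that spare devices are available; the pool name, device paths, and client mount point below are illustrative, not taken from this system:

         # Names below are hypothetical: pool mdt0pool, new mirror devices, client mount /mnt/lsh
         zpool list mdt0pool                  # current size / free space of the MDT pool
         zpool add mdt0pool mirror \
             /dev/disk/by-id/new-disk-A /dev/disk/by-id/new-disk-B
         # If the existing LUNs were grown instead, let the pool expand into the new space
         zpool set autoexpand=on mdt0pool
         # From a client, confirm MDT0000 now reports free space and free inodes
         lfs df -h /mnt/lsh
         lfs df -i /mnt/lsh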


          Activity


            ofaaland Olaf Faaland added a comment -

            The basic advice, that we should delete the update logs and then run lfsck, is a sufficient answer.

            This occurred during DNE2 testing with Lustre 2.8, which we have decided not to work on any further. Instead we will test DNE2 when we start testing Lustre 2.10.x. So we will test the advice only if we encounter the problem again, and in that case we will file a new ticket.
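
            For reference, a minimal sketch of that procedure as commands on the MDS hosting MDT0000, assuming the update logs have already been removed per the LU-8753 procedure; the target name lsh-MDT0000 comes from the logs above, everything else is generic:

               # Start a full LFSCK (namespace + layout) on MDT0000
               lctl lfsck_start -M lsh-MDT0000 -t all
               # Watch progress; both components should eventually report "completed"
               lctl get_param mdd.lsh-MDT0000.lfsck_namespace
               lctl get_param mdd.lsh-MDT0000.lfsck_layout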


            yong.fan nasf (Inactive) added a comment -

            Olaf,
            Have you tried removing the huge llogs as Wangdi suggested? Any feedback on that?
            Thanks

            di.wang Di Wang added a comment -

            Olaf:
            It might be affected, but the chance is normally low, as nasf said. In any case, I would suggest running lfsck to check and fix consistency after you delete the update_logs.


            yong.fan nasf (Inactive) added a comment -

            Do you expect that the file system is still in a consistent state, and after deleting update_log* I don't need to do anything else? That's my main question for this ticket.
            ...
            I unfortunately don't know the workload at the time the MDT filled up, nor can I get debug logs. The problem was discovered after the servers had been rebooted.

            Generally, the llog growing after the reboot means there is something to be recovered. But as long as your recovery completed successfully after the reboot, then even if you removed the llogs, your namespace should be in a consistent state, unless there was some inconsistency before your reboot (which, for a ZFS backend, should be a very rare case).
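
            A small sketch of checking that recovery did complete, using the standard recovery_status parameter; the target name is illustrative:

               # On each MDS: every MDT should report status COMPLETE (or INACTIVE)
               lctl get_param mdt.*.recovery_status
               # MDT0000 specifically
               lctl get_param -n mdt.lsh-MDT0000.recovery_status | grep -E 'status|completed'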

            ofaaland Olaf Faaland added a comment -

            nasf,
            I unfortunately don't know the workload at the time the MDT filled up, nor can I get debug logs. The problem was discovered after the servers had been rebooted.

            ofaaland Olaf Faaland added a comment -

            Di,
            Do you expect that the file system is still in a consistent state, and after deleting update_log* I don't need to do anything else? That's my main question for this ticket.
            thanks,
            Olaf

            di.wang Di Wang added a comment -

            You can delete update_log* manually, as we did on LU-8753, for which I need further logs.
            There is also another ticket, LU-8714, about deleting update_log efficiently. I will try to work out the patch.
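
            The exact removal steps from LU-8753 are not reproduced here, but as a rough, hedged sketch, the llog catalogs on MDT0000 can at least be inspected from an interactive lctl session before anything is deleted; the device name is illustrative, and availability of these llog commands on this build is an assumption:

               # Interactive lctl session on the MDS
               lctl
               lctl > device lsh-MDT0000       # select MDT0000 as the current device
               lctl > llog_catlist             # list llog catalogs on this device
               lctl > llog_info <catalog-id>   # header and record count for one catalog
               lctl > quit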


            yong.fan nasf (Inactive) added a comment -

            In the current DNE implementation, cross-MDT operations are recorded in detail as llogs under update_log_dir for recovery purposes. For most use cases the llog is append-only, so if there are too many cross-MDT operations the llog will become huge. If you can describe the operations performed before the out-of-space condition, that may help us judge the issue. And if there are any Lustre kernel debug logs from MDT0000, that would be even better.
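
            To make the cross-MDT case concrete, a hedged sketch of client-side operations of the kind that generate these update llog records under DNE; the mount point, directory names, and MDT index are illustrative:

               # From a client: create a directory whose inode lives on MDT0001
               # while its parent stays on MDT0000 (mount point is illustrative)
               lfs mkdir -i 1 /mnt/lsh/remote_dir
               # Show which MDT(s) back the new directory
               lfs getdirstripe /mnt/lsh/remote_dir
               # A rename across the two MDTs is likewise a distributed, llog-recorded operation
               mv /mnt/lsh/dir_on_mdt0/file /mnt/lsh/remote_dir/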

            ofaaland Olaf Faaland added a comment -

            Created a separate ticket LU-8794 for the large amount of space occupied by update_log_dir.

            This ticket is only for the procedure to be followed when an MDT fills up, since it could happen in production and we need to know the procedure for recovering.

            thanks,
            Olaf

            ofaaland Olaf Faaland added a comment -

            There are 158 files in update_log_dir:
            68 files with size > 10 GB
            29 files with 1 GB <= size < 10 GB
            7 files with 1 MB <= size < 1 GB
            44 files with size < 1 MB
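
            A small sketch of how such a size histogram can be generated, assuming update_log_dir is reachable at a hypothetical backend path:

               # Path is hypothetical; buckets match the ones reported above
               find /mnt/mdt0-backend/update_log_dir -type f -printf '%s\n' | awk '
                   $1 >= 10 * 2^30 { ge10g++; next }
                   $1 >=      2^30 { ge1g++;  next }
                   $1 >=      2^20 { ge1m++;  next }
                                   { lt1m++ }
                   END { printf ">10GB: %d\n1GB-10GB: %d\n1MB-1GB: %d\n<1MB: %d\n", ge10g, ge1g, ge1m, lt1m }'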


            People

              Assignee: yong.fan nasf (Inactive)
              Reporter: ofaaland Olaf Faaland
              Votes: 0
              Watchers: 4
