Lustre issue LU-4290

osp_sync_threads encounters EIO on mount

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: Lustre 2.6.0
    • Affects Version: Lustre 2.4.1
    • Labels: None
    • Environment: RHEL 6.4/distro IB
    • Severity: 2
    • Rank (Obsolete): 11773

    Description

      We encountered this assertion in production; libcfs_panic_on_lbug was set to 1, so the server rebooted. On mount, the same assertion and LBUG would occur. The filesystem will mount with panic_on_lbug set to 0. We've captured a crash dump and Lustre log messages with the following debug flags:

      [root@atlas-mds3 ~]# cat /proc/sys/lnet/debug
      trace ioctl neterror warning other error emerg ha config console

      We ran e2fsck:
      e2fsck -f -j /dev/mapper/atlas2-mdt1-journal /dev/mapper/atlas2-mdt1

      which fixed only the quota inconsistencies it found.

      At the moment, we are back to production after the osp_sync_threads lbugs on mount. There are hung task messages about osp_sync_threads as would be expected. We want to fix the root issue that is causing the assertions.

      kernel messages during one of the failed mounts
      Nov 21 21:16:44 atlas-mds3 kernel: [ 911.319839] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
      Nov 21 21:16:44 atlas-mds3 kernel: [ 911.986208] Lustre: mdt_num_threads module parameter is deprecated, use mds_num_threads instead or unset both for dynamic thread startup
      Nov 21 21:16:46 atlas-mds3 kernel: [ 913.069371] Lustre: atlas2-MDT0000: used disk, loading
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.261572] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 0 changes, 0 in progress, 0 in flight: -5
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.278318] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) LBUG
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.286036] Pid: 18945, comm: osp-syn-256
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290841]
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290844] Call Trace:

      We also see this message:
      Nov 21 23:01:01 atlas-mds3 kernel: [ 1512.633528] ERST: NVRAM ERST Log Address Range is not implemented yet
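      For reference, the trailing "-5" in the assertion message above is a kernel-style negative errno. A minimal standalone C check (not Lustre code) maps it back to its symbolic name:

      ```c
      #include <errno.h>
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
      	/* Kernel code reports failures as negative errno values, so the
      	 * "-5" in the osp_sync_thread() assertion message is -EIO. */
      	printf("errno %d is %s\n", EIO, strerror(EIO));
      	return 0;
      }
      ```

      That is, the llog processing loop hit an I/O error from the underlying disk, which matches the storage errors discussed later in this ticket.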

      Attachments

        1. 6305.llog_out
          73 kB
        2. 6306.llog_out
          73 kB
        3. 6307.llog_out
          73 kB
        4. 6308.llog_out
          73 kB
        5. lustre-log.1385095225.19969.gz
          38 kB
        6. lustre-log.1385095225.19971.gz
          7 kB
        7. lustre-log.1385095225.19973.gz
          2.88 MB
        8. lustre-log.1385095225.19975.gz
          5 kB

        Activity

          pjones Peter Jones added a comment -

          I checked with Alex and he agrees to close this ticket and track any further similar work under new tickets.


          blakecaldwell Blake Caldwell added a comment -

          Great! This ticket can be closed now.

          jamesanunez James Nunez (Inactive) added a comment -

          The patch http://review.whamcloud.com/#/c/9281/ landed for 2.6.

          The patch that landed was "preliminary". Does anything else need to be done to complete this ticket?
          bzzz Alex Zhuravlev added a comment -

          A preliminary patch: http://review.whamcloud.com/#/c/9281/

          bzzz Alex Zhuravlev added a comment -

          Jason, I've been working on an automated test for this and similar issues.

          hilljjornl Jason Hill (Inactive) added a comment -

          So the last discussion point was about a future-looking fix. Has any work occurred? Should we go ahead and create another LU for that effort and close this issue out?

          -Jason

          adilger Andreas Dilger added a comment -

          Blake,
          The timestamps on the logs are not updated by the Lustre code, so that is why it appears they are not modified after mount. Also, logs are only used once and then deleted, so new ones are created on each mount.

          Alex,
          I think that if there is an error looking up a record in the llog, that unlink should be skipped and the next record processed. Once all the records are processed (for good or bad) the log file will be deleted anyway. I don't think this should be handled by the llog code internally, since we don't necessarily want to delete a config file if there is a bad block on disk or some other such problem. For the object unlink case, it would eventually be cleaned up by LFSCK, so I don't think it is terrible if some records are not processed.
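          The skip-and-continue behavior Andreas describes can be sketched in plain C. This is a standalone simulation, not actual Lustre code: the record count, the process_record() stand-in, and the single -EIO failure are all invented for illustration.

          ```c
          #include <errno.h>
          #include <stdio.h>

          /* Hypothetical stand-in for one llog record lookup/processing
           * step. Record 2 simulates the bad block that returns -EIO. */
          static int process_record(int idx)
          {
          	return (idx == 2) ? -EIO : 0;
          }

          int main(void)
          {
          	int processed = 0, skipped = 0;

          	for (int idx = 0; idx < 5; idx++) {
          		int rc = process_record(idx);

          		if (rc < 0) {
          			/* Instead of LASSERT()ing on the error, log it,
          			 * skip this record, and move on to the next one.
          			 * The whole log file is deleted once the scan
          			 * completes, and LFSCK can eventually clean up
          			 * any unlinks that were skipped here. */
          			fprintf(stderr, "skipping record %d: rc = %d\n",
          				idx, rc);
          			skipped++;
          			continue;
          		}
          		processed++;
          	}
          	printf("processed %d, skipped %d\n", processed, skipped);
          	return 0;
          }
          ```

          The key design point is that the error handling stays in the caller: the llog layer reports the failure, and the sync thread decides that a lost unlink record is tolerable while a lost config record might not be.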

          bzzz Alex Zhuravlev added a comment -

          Yes, definitely. I'm thinking about what would be a good reaction here. Just skip such a log? Remove it?

          adilger Andreas Dilger added a comment -

          Alex, it is never good to assert on IO errors from the disk. Should this be converted to an error and handled more gracefully?

          blakecaldwell Blake Caldwell added a comment -

          When looking at the mtimes of the other llogs, we noticed that they correspond to the time of the last successful mount. Note that 6305 was Nov 5 12:07. The output of llog_reader shows the same:
          Time : Tue Nov 5 12:07:45 2013

          Now the question is why these logs got truncated during normal operation. We identified that the MDT returned an error code not handled by the SCSI layer:
          Nov 26 10:58:27 atlas-mds3 kernel: [ 3820.152740] mpt2sas0: #011handle(0x000c), ioc_status(scsi data underrun)(0x0045), smid(1750)

          So if the other llogs were consumed and cleared on Lustre mount (is that correct?), they don't appear to get appended/committed to in normal operation. Why would an EIO affect the llogs?

          [root@atlas-mds3 d1]# ls -l
          total 740
          -rw-r--r-- 1 root root  19776 Nov 21 23:40 10017
          -rw-r--r-- 1 root root  19584 Nov 21 23:40 10049
          -rw-r--r-- 1 root root  19712 Nov 21 23:40 10081
          -rw-r--r-- 1 root root 229120 Nov  5 12:07 6305
          -rw-r--r-- 1 root root   8320 Nov 21 22:42 8321
          -rw-r--r-- 1 root root  24192 Nov 21 23:40 9345
          -rw-r--r-- 1 root root  24192 Nov 21 23:40 9377

          People

            bzzz Alex Zhuravlev
            blakecaldwell Blake Caldwell
            Votes: 0
            Watchers: 9
