[LU-4290] osp_sync_threads encounters EIO on mount - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.6.0
Affects Version/s: Lustre 2.4.1
Labels:
None
Environment:
RHEL 6.4/distro IB

Severity:
2
Rank (Obsolete):
11773

Description

We encountered this assertion in production. libcfs_panic_on_lbug was set to 1, so the server rebooted. On mount, the same assertion and LBUG would occur. The filesystem will mount with panic_on_lbug set to 0. We've captured a crash dump and Lustre log messages with the following debug flags:

[root@atlas-mds3 ~]# cat /proc/sys/lnet/debug
trace ioctl neterror warning other error emerg ha config console

Ran e2fsck:
e2fsck -f -j /dev/mapper/atlas2-mdt1-journal /dev/mapper/atlas2-mdt1

and only fixed the quota inconsistencies it found.

At the moment, we are back to production after the osp_sync_threads lbugs on mount. There are hung task messages about osp_sync_threads as would be expected. We want to fix the root issue that is causing the assertions.

Kernel messages during one of the failed mounts:
Nov 21 21:16:44 atlas-mds3 kernel: [ 911.319839] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
Nov 21 21:16:44 atlas-mds3 kernel: [ 911.986208] Lustre: mdt_num_threads module parameter is deprecated, use mds_num_threads instead or unset both for dynamic thread startup
Nov 21 21:16:46 atlas-mds3 kernel: [ 913.069371] Lustre: atlas2-MDT0000: used disk, loading
Nov 21 21:16:47 atlas-mds3 kernel: [ 914.261572] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 0 changes, 0 in progress, 0 in flight: -5
Nov 21 21:16:47 atlas-mds3 kernel: [ 914.278318] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) LBUG
Nov 21 21:16:47 atlas-mds3 kernel: [ 914.286036] Pid: 18945, comm: osp-syn-256
Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290841]
Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290844] Call Trace:

We also see this message:
Nov 21 23:01:01 atlas-mds3 kernel: [ 1512.633528] ERST: NVRAM ERST Log Address Range is not implemented yet

Attachments

6305.llog_out
73 kB
22/Nov/13 6:50 PM
6306.llog_out
73 kB
22/Nov/13 6:50 PM
6307.llog_out
73 kB
22/Nov/13 6:50 PM
6308.llog_out
73 kB
22/Nov/13 6:50 PM
lustre-log.1385095225.19969.gz
38 kB
22/Nov/13 5:47 AM
lustre-log.1385095225.19971.gz
7 kB
22/Nov/13 5:47 AM
lustre-log.1385095225.19973.gz
2.88 MB
22/Nov/13 5:47 AM
lustre-log.1385095225.19975.gz
5 kB
22/Nov/13 5:47 AM

Activity

Andreas Dilger added a comment - 29/Nov/13 6:09 PM

Blake,
The timestamps on the logs are not updated by the Lustre code, so that is why it appears they are not modified after mount. Also, logs are only used once and then deleted, so new ones are created on each mount.

Alex,
I think that if there is an error looking up a record in the llog, the unlink should be skipped and the next record processed. Once all the records are processed (for good or bad), the log file will be deleted anyway. I don't think this should be handled by the llog code internally, since we don't necessarily want to delete a config file if there is a bad block on disk or some other toor set problem. For the object unlink case, it would eventually be cleaned up by LFSCK, so I don't think it is terrible if some records are not processed.
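
The policy described above (skip a record that fails, keep going, and let the caller delete the log once every record has been visited) can be sketched as follows. This is only an illustration in Python, not the Lustre C code: `process_llog` and `handle_record` are hypothetical names, and I/O failures are assumed to surface as `OSError`.

```python
from typing import Callable, Iterable, List, Tuple


def process_llog(records: Iterable[object],
                 handle_record: Callable[[object], None]) -> List[Tuple[int, int]]:
    """Visit every record; note and skip failures instead of asserting.

    Returns a list of (record index, errno) pairs for the skipped records,
    so the caller can report them before deleting the log.
    """
    skipped = []
    for index, record in enumerate(records):
        try:
            handle_record(record)
        except OSError as exc:
            # An I/O error on one record must not abort the whole log.
            skipped.append((index, exc.errno))
    return skipped
```

Any records left unprocessed this way correspond to the objects that, per the comment above, would eventually be cleaned up by LFSCK.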

Alex Zhuravlev added a comment - 28/Nov/13 5:48 AM

yes, definitely. I'm thinking on what would be a good reaction here. just skip such a log? remove it?

Andreas Dilger added a comment - 27/Nov/13 7:03 PM

Alex, it is never good to assert on IO errors from the disk. Should this be converted to an error and handled more gracefully?

Blake Caldwell added a comment - 26/Nov/13 9:15 PM

When looking at the mtimes of other llogs we noticed that it corresponded to the time of the last successful mount. Note that 6305 was Nov 5 12:07. The output of llog_reader contains the same:
Time : Tue Nov 5 12:07:45 2013

Now the question of why these logs got truncated during normal operation. We identified that the MDT returned an error code not handled by the scsi layer.
Nov 26 10:58:27 atlas-mds3 kernel: [ 3820.152740] mpt2sas0: #011handle(0x000c), ioc_status(scsi data underrun)(0x0045), smid(1750)

So if the other llogs were consumed and cleared on lustre mount (is that correct?), they don't appear to get appended/committed to in normal operation. Why would an EIO affect the llogs?

[root@atlas-mds3 d1]# ls -l
total 740
-rw-r--r-- 1 root root 19776 Nov 21 23:40 10017
-rw-r--r-- 1 root root 19584 Nov 21 23:40 10049
-rw-r--r-- 1 root root 19712 Nov 21 23:40 10081
-rw-r--r-- 1 root root 229120 Nov 5 12:07 6305
-rw-r--r-- 1 root root 8320 Nov 21 22:42 8321
-rw-r--r-- 1 root root 24192 Nov 21 23:40 9345
-rw-r--r-- 1 root root 24192 Nov 21 23:40 9377

Blake Caldwell added a comment - 26/Nov/13 9:00 PM

Renaming the files in ldiskfs resolved the mount issues. The mount reported a failure reading 2 llogs and then succeeded. The messages for the other 2 were likely rate-limited by rsyslog.

Nov 26 09:33:17 atlas-mds3 kernel: [ 251.406572] Lustre: Lustre: Build Version: 2.4.1--CHANGED-2.6.32-358.18.1.el6.atlas.x86_64
Nov 26 09:33:27 atlas-mds3 kernel: [ 261.291971] LDISKFS-fs (dm-3): recovery complete
Nov 26 09:33:27 atlas-mds3 kernel: [ 261.343137] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
Nov 26 09:33:28 atlas-mds3 kernel: [ 262.967910] Lustre: atlas2-MDT0000: used disk, loading
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.695531] LustreError: 14567:0:(llog_cat.c:192:llog_cat_id2handle()) atlas2-OST00ff-osc-MDT0000: error opening log id 0x18a1:1:0: rc = -2
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.709646] LustreError: 14567:0:(llog_cat.c:795:cat_cancel_cb()) atlas2-OST00ff-osc-MDT0000: cannot find handle for llog 0x18a1:1: -2
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.723280] LustreError: 14567:0:(llog_cat.c:833:llog_cat_init_and_process()) atlas2-OST00ff-osc-MDT0000: llog_process() with cat_cancel_cb failed: rc = -2
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.746477] LustreError: 14567:0:(llog_cat.c:192:llog_cat_id2handle()) atlas2-OST0100-osc-MDT0000: error opening log id 0x18a2:1:0: rc = -2
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.760683] LustreError: 14567:0:(llog_cat.c:795:cat_cancel_cb()) atlas2-OST0100-osc-MDT0000: cannot find handle for llog 0x18a2:1: -2
Nov 26 09:33:29 atlas-mds3 kernel: [ 263.774522] LustreError: 14567:0:(llog_cat.c:833:llog_cat_init_and_process()) atlas2-OST0100-osc-MDT0000: llog_process() with cat_cancel_cb failed: rc = -2
Nov 26 09:33:31 atlas-mds3 kernel: [ 265.647379] LustreError: 11-0: atlas2-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
Nov 26 09:33:31 atlas-mds3 kernel: [ 265.655134] Lustre: atlas2-MDT0000: Imperative Recovery enabled, recovery window shrunk from 1800-5400 down to 900-2700
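
The return codes in these messages are negated Linux errno values: -2 here, -11 for mds_connect, and -5 in the original assertion. A small helper to decode them, as a sketch that assumes it runs on a Linux host so Python's errno table matches the kernel's (`describe_rc` is a hypothetical name):

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Render a kernel-style negative return code with its errno name."""
    if rc >= 0:
        return f"{rc} (not an error)"
    # errno.errorcode maps the positive number to its symbolic name.
    name = errno.errorcode.get(-rc, "unknown")
    return f"{rc} = -{name} ({os.strerror(-rc)})"


for rc in (-2, -5, -11):
    print(describe_rc(rc))
```

On Linux, -2 decodes to ENOENT (the renamed llog files are no longer found) and -5 to EIO, the error named in the issue title.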

Alex Zhuravlev added a comment - 25/Nov/13 4:39 PM

Looks so. I suggest renaming them using a direct ldiskfs mount and restarting the MDS.

Blake Caldwell added a comment - 22/Nov/13 8:13 PM

If someone could please confirm whether the attached llogs contain the expected information, we will proceed with renaming them. Until then we will operate in this non-optimal state. Thanks.

Blake Caldwell added a comment - 22/Nov/13 6:49 PM

Alex,

We pulled the files below from debugfs and put them through llog_reader. The output of each (0x18a4, 0x18a3, 0x18a2, 0x18a1) is attached. We compared it to another log file that was processed successfully, which was empty.

We will wait to hear back if these look good and will make a copy and remove.

Alex Zhuravlev added a comment - 22/Nov/13 5:54 PM

0x18a3:1:0 - 1 is a sequence, so /O/1 is a hierarchy storing all the objects from sequence 1.

(gdb) p 0x18a3
$1 = 6307
(gdb) p 0x18a3 & 31
$1 = 3

/O/1/d3/6307

I'd suggest renaming the files to something like 6307.B rather than just removing them, so they can be recovered if needed.
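
The mapping worked through in gdb above can be repeated for the other log ids with a short script. This is a minimal sketch inferred from the single worked example (object id in decimal, directory `d<id & 31>`, sequence used as written); `llog_id_to_path` is a hypothetical helper, not a Lustre tool.

```python
def llog_id_to_path(logid: str) -> str:
    """Map an llog id such as '0x18a3:1:0' to its path on the ldiskfs mount."""
    fields = logid.split(":")
    oid = int(fields[0], 16)   # object id, hex in the log messages
    seq = fields[1]            # sequence, assumed to appear in the path as written
    # Objects are spread over 32 subdirectories, d0 through d31.
    return f"/O/{seq}/d{oid & 31}/{oid}"


for logid in ("0x18a1:1:0", "0x18a2:1:0", "0x18a3:1:0", "0x18a4:1:0"):
    print(logid, "->", llog_id_to_path(logid))
```

For 0x18a3:1:0 this gives /O/1/d3/6307, matching the path above.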

David Dillow added a comment - 22/Nov/13 5:51 PM

Here are the logs being processed by the four threads that LBUG'd:
00000040:00080000:15.0:1385095225.247128:0:19969:0:(llog_cat.c:558:llog_cat_process_cb()) processing log 0x18a1:1:0 at index 6 of catalog 0x200:1
00000040:00080000:15.0:1385095225.253880:0:19971:0:(llog_cat.c:558:llog_cat_process_cb()) processing log 0x18a2:1:0 at index 6 of catalog 0x202:1
00000040:00080000:15.0:1385095225.257044:0:19973:0:(llog_cat.c:558:llog_cat_process_cb()) processing log 0x18a3:1:0 at index 6 of catalog 0x204:1
00000040:00080000:15.0:1385095225.261700:0:19975:0:(llog_cat.c:558:llog_cat_process_cb()) processing log 0x18a4:1:0 at index 6 of catalog 0x206:1
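
Lines like these can be pulled out of a decoded debug log mechanically. The following is a sketch only: the regular expression is fitted to the four lines quoted above rather than to any documented log format, and `parse_processing_line` is a hypothetical name.

```python
import re
from typing import Optional

# Matches "...:<pid>:0:(llog_cat.c:<line>:llog_cat_process_cb()) processing log ..."
LINE_RE = re.compile(
    r":(?P<pid>\d+):0:\(llog_cat\.c:\d+:llog_cat_process_cb\(\)\) "
    r"processing log (?P<log>\S+) at index (?P<index>\d+) "
    r"of catalog (?P<catalog>\S+)"
)


def parse_processing_line(line: str) -> Optional[dict]:
    """Return pid, log id, index and catalog id, or None if the line differs."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "log": match.group("log"),
        "index": int(match.group("index")),
        "catalog": match.group("catalog"),
    }
```

The pid field ties each log id back to the thread that hit the assertion; the pids here (19969 through 19975) match the suffixes of the attached lustre-log files.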

David Dillow added a comment - 22/Nov/13 5:46 PM

Alex, can you comment on the mapping from 0x18a3:1:0 to /O/1/d3/6307 so we can replicate it for the other logs?

People

Assignee: Alex Zhuravlev

Reporter: Blake Caldwell

Votes: 0

Watchers: 9

Dates

Created: 22/Nov/13 4:48 AM

Updated: 20/Mar/14 2:28 PM

Resolved: 20/Mar/14 2:28 PM