Lustre issue LU-4290

osp_sync_threads encounters EIO on mount

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: Lustre 2.6.0
    • Affects Version: Lustre 2.4.1
    • Labels: None
    • Environment: RHEL 6.4/distro IB
    • Severity: 2
    • Rank (Obsolete): 11773

    Description

      We encountered this assertion in production; libcfs_panic_on_lbug was set to 1, so the server rebooted. On mount, the same assertion and LBUG would occur. The filesystem will mount with panic_on_lbug set to 0. We've captured a crash dump and Lustre log messages with the following debug flags:

      [root@atlas-mds3 ~]# cat /proc/sys/lnet/debug
      trace ioctl neterror warning other error emerg ha config console

      We ran e2fsck:
      e2fsck -f -j /dev/mapper/atlas2-mdt1-journal /dev/mapper/atlas2-mdt1

      which fixed only the quota inconsistencies it found.

      At the moment, we are back to production after the osp_sync_threads lbugs on mount. There are hung task messages about osp_sync_threads as would be expected. We want to fix the root issue that is causing the assertions.

      kernel messages during one of the failed mounts
      Nov 21 21:16:44 atlas-mds3 kernel: [ 911.319839] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
      Nov 21 21:16:44 atlas-mds3 kernel: [ 911.986208] Lustre: mdt_num_threads module parameter is deprecated, use mds_num_threads instead or unset both for dynamic thread startup
      Nov 21 21:16:46 atlas-mds3 kernel: [ 913.069371] Lustre: atlas2-MDT0000: used disk, loading
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.261572] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 0 changes, 0 in progress, 0 in flight: -5
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.278318] LustreError: 18945:0:(osp_sync.c:862:osp_sync_thread()) LBUG
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.286036] Pid: 18945, comm: osp-syn-256
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290841]
      Nov 21 21:16:47 atlas-mds3 kernel: [ 914.290844] Call Trace:

      We also see this message:
      Nov 21 23:01:01 atlas-mds3 kernel: [ 1512.633528] ERST: NVRAM ERST Log Address Range is not implemented yet
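      For reference, the trailing "-5" in the assertion message above is a kernel-style negative errno. A minimal standalone C check (not Lustre code) maps it back to its symbolic name:

      ```c
      #include <errno.h>
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
      	/* Kernel code reports failures as negative errno values, so the
      	 * "-5" in the osp_sync_thread() assertion message is -EIO. */
      	printf("errno %d is %s\n", EIO, strerror(EIO));
      	return 0;
      }
      ```

      That is, the llog processing loop hit an I/O error from the underlying disk, which matches the storage errors discussed later in this ticket.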

      Attachments

        1. 6305.llog_out
          73 kB
        2. 6306.llog_out
          73 kB
        3. 6307.llog_out
          73 kB
        4. 6308.llog_out
          73 kB
        5. lustre-log.1385095225.19969.gz
          38 kB
        6. lustre-log.1385095225.19971.gz
          7 kB
        7. lustre-log.1385095225.19973.gz
          2.88 MB
        8. lustre-log.1385095225.19975.gz
          5 kB

        Activity

          pjones Peter Jones added a comment -

          I checked with Alex and he agrees to close this ticket and track any further similar work under new tickets.


          blakecaldwell Blake Caldwell added a comment -

          Great! This ticket can be closed now.

          jamesanunez James Nunez (Inactive) added a comment -

          The patch http://review.whamcloud.com/#/c/9281/ landed for 2.6.

          The patch that landed was "preliminary". Does anything else need to be done to complete this ticket?
          bzzz Alex Zhuravlev added a comment -

          A preliminary patch: http://review.whamcloud.com/#/c/9281/

          bzzz Alex Zhuravlev added a comment -

          Jason, I've been working on an automated test for this and similar issues.

          hilljjornl Jason Hill (Inactive) added a comment -

          So the last discussion point was about a future-looking fix. Has any work occurred? Should we go ahead and create another LU for that effort and close this issue out?

          -Jason

          adilger Andreas Dilger added a comment -

          Blake,
          The timestamps on the logs are not updated by the Lustre code, so that is why it appears they are not modified after mount. Also, logs are only used once and then deleted, so new ones are created on each mount.

          Alex,
          I think that if there is an error looking up a record in the llog, that unlink should be skipped and the next record processed. Once all the records are processed (for good or bad) the log file will be deleted anyway. I don't think this should be handled by the llog code internally, since we don't necessarily want to delete a config file if there is a bad block on disk or some other such problem. For the object unlink case, it would eventually be cleaned up by LFSCK, so I don't think it is terrible if some records are not processed.
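          The skip-and-continue behavior Andreas describes can be sketched in plain C. This is a standalone simulation, not actual Lustre code: the record count, the process_record() stand-in, and the single -EIO failure are all invented for illustration.

          ```c
          #include <errno.h>
          #include <stdio.h>

          /* Hypothetical stand-in for one llog record lookup/processing
           * step. Record 2 simulates the bad block that returns -EIO. */
          static int process_record(int idx)
          {
          	return (idx == 2) ? -EIO : 0;
          }

          int main(void)
          {
          	int processed = 0, skipped = 0;

          	for (int idx = 0; idx < 5; idx++) {
          		int rc = process_record(idx);

          		if (rc < 0) {
          			/* Instead of LASSERT()ing on the error, log it,
          			 * skip this record, and move on to the next one.
          			 * The whole log file is deleted once the scan
          			 * completes, and LFSCK can eventually clean up
          			 * any unlinks that were skipped here. */
          			fprintf(stderr, "skipping record %d: rc = %d\n",
          				idx, rc);
          			skipped++;
          			continue;
          		}
          		processed++;
          	}
          	printf("processed %d, skipped %d\n", processed, skipped);
          	return 0;
          }
          ```

          The key design point is that the error handling stays in the caller: the llog layer reports the failure, and the sync thread decides that a lost unlink record is tolerable while a lost config record might not be.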

          bzzz Alex Zhuravlev added a comment -

          Yes, definitely. I'm thinking about what would be a good reaction here. Just skip such a log? Remove it?

          adilger Andreas Dilger added a comment -

          Alex, it is never good to assert on IO errors from the disk. Should this be converted to an error and handled more gracefully?

          blakecaldwell Blake Caldwell added a comment -

          When looking at the mtimes of the other llogs, we noticed that they correspond to the time of the last successful mount. Note that 6305 was Nov 5 12:07. The output of llog_reader shows the same:
          Time : Tue Nov 5 12:07:45 2013

          Now the question is why these logs got truncated during normal operation. We identified that the MDT returned an error code not handled by the SCSI layer:
          Nov 26 10:58:27 atlas-mds3 kernel: [ 3820.152740] mpt2sas0: #011handle(0x000c), ioc_status(scsi data underrun)(0x0045), smid(1750)

          So if the other llogs were consumed and cleared on Lustre mount (is that correct?), they don't appear to get appended/committed to in normal operation. Why would an EIO affect the llogs?

          [root@atlas-mds3 d1]# ls -l
          total 740
          -rw-r--r-- 1 root root  19776 Nov 21 23:40 10017
          -rw-r--r-- 1 root root  19584 Nov 21 23:40 10049
          -rw-r--r-- 1 root root  19712 Nov 21 23:40 10081
          -rw-r--r-- 1 root root 229120 Nov  5 12:07 6305
          -rw-r--r-- 1 root root   8320 Nov 21 22:42 8321
          -rw-r--r-- 1 root root  24192 Nov 21 23:40 9345
          -rw-r--r-- 1 root root  24192 Nov 21 23:40 9377

          People

            bzzz Alex Zhuravlev
            blakecaldwell Blake Caldwell
            Votes: 0
            Watchers: 9
