Details

    • Bug
    • Resolution: Fixed
    • Critical
    • None
    • Lustre 2.1.3
    • None
    • 3
    • 6149

    Description

      Running ls gives the following error:
      ls: reading directory d4_stats/: Input/output error

      client error:
      [5237686.818045] LustreError: 77522:0:(dir.c:648:ll_readdir()) error reading dir [0x4488b6ced74:0x1edb5:0x0] at 0: rc -5
      [5237686.849844] LustreError: 77522:0:(dir.c:648:ll_readdir()) Skipped 51 previous similar messages

      MDT Error:
      Jan 16 11:18:37 nbp1-mds kernel: Lustre: 15390:0:(mdd_object.c:2412:__mdd_readpage()) build page failed: -5!
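Both the client and the MDT report -5, which is -EIO, the "Input/output error" that ls prints. When many such messages are logged, it can help to pull out the affected directory FIDs. The following is a minimal sketch, assuming client kernel messages in the format quoted above are supplied on standard input. The function name readdir_errors is illustrative and not part of Lustre.

```shell
#!/bin/sh
# readdir_errors: read client kernel log text on stdin and print one
# "FID rc" pair for every ll_readdir() failure line it finds.
# Lines that do not match (e.g. "Skipped N previous similar messages")
# are ignored.
readdir_errors() {
    sed -n 's/.*ll_readdir()) error reading dir \(\[[^]]*]\) at [0-9]*: rc \(-\{0,1\}[0-9][0-9]*\).*/\1 \2/p'
}

# Example usage on a client:
#   dmesg | readdir_errors | sort -u
```

Each output line pairs a directory FID with its return code, so repeated failures on the same directory collapse to one entry after `sort -u`.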

      Please advise on debug flags to use to gather logs.

      Attachments

        1. fsck.2.8.2012.nbp1.out.gz
          1.63 MB
        2. mdtsnap.fsck.out.gz
          1.10 MB
        3. nbp1FSCK.out.gz
          4.56 MB

          Activity

            [LU-2627] /bin/ls gets Input/output error
            nedbass Ned Bass (Inactive) made changes -
            Link New: This issue is duplicated by LU-3519 [ LU-3519 ]
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones added a comment -

            As per NASA ok to close ticket


            johann Johann Lombardi (Inactive) added a comment -

            There is nothing new in the fsck output compared to last time. I think you should go ahead and run fsck.
            mhanafi Mahmoud Hanafi made changes -
            Attachment New: fsck.2.8.2012.nbp1.out.gz [ 12248 ]

            mhanafi Mahmoud Hanafi added a comment -

            Uploading fsck output for review before we run it for real.

            adilger Andreas Dilger added a comment -

            This problem will persist for large 1.8 directories that are renamed until a version of the LU-2638 patch http://review.whamcloud.com/5179 is applied. As a short-term workaround until then, it is possible to disable the dirdata feature on the unmounted MDT filesystem:

            tune2fs -O ^dirdata /dev/mdtdev

            though this will have some negative performance impact for all newly-created files when doing name lookups and "ls -l".
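Before and after changing the feature flag, it may be worth confirming whether dirdata is currently set on the MDT. The following is a minimal sketch, assuming the Lustre-patched e2fsprogs is installed and that `dumpe2fs -h` prints its usual "Filesystem features:" line. The helper name has_feature is hypothetical, and /dev/mdtdev is the placeholder device name used in the comment above.

```shell
#!/bin/sh
# has_feature NAME: read `dumpe2fs -h` output on stdin and succeed only if
# NAME appears as a whole word on the "Filesystem features:" line.
has_feature() {
    grep '^Filesystem features:' | tr -s ' \t' '\n\n' | grep -x -- "$1" >/dev/null
}

# Example (run as root against the unmounted MDT; placeholder device name):
#   if dumpe2fs -h /dev/mdtdev 2>/dev/null | has_feature dirdata; then
#       echo "dirdata is enabled"
#   else
#       echo "dirdata is not enabled"
#   fi
```

The helper only parses text, so it can also be run against a saved copy of the dumpe2fs output rather than the live device.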

            mhanafi Mahmoud Hanafi added a comment -

            We seem to have hit this issue again on the same filesystem.

            pfe1 ~ # ls -l /nobackupp1/xmeng/run_sc_anisopi/run06_dipole_semiimpl_nohyp_taugr_60000ss/SC
            ls: reading directory /nobackupp1/xmeng/run_sc_anisopi/run06_dipole_semiimpl_nohyp_taugr_60000ss/SC: Input/output error
            total 0

            From the MDT:
            Feb 8 06:50:58 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Unrecognised inode hash code 18 for directory #17309149
            Feb 8 06:50:58 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Corrupt dir inode 17309149, running e2fsck is recommended.
            Feb 8 06:51:57 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Unrecognised inode hash code 8 for directory #17309159
            Feb 8 06:51:57 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Corrupt dir inode 17309159, running e2fsck is recommended.
            Feb 8 08:35:12 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Unrecognised inode hash code 15 for directory #130557236
            Feb 8 08:35:12 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Corrupt dir inode 130557236, running e2fsck is recommended.
            Feb 8 11:45:38 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Unrecognised inode hash code 3 for directory #157287952
            Feb 8 11:45:39 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Corrupt dir inode 157287952, running e2fsck is recommended.
            Feb 8 11:46:07 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Unrecognised inode hash code 4 for directory #157331367
            Feb 8 11:46:07 nbp1-mds kernel: LDISKFS-fs warning (device dm-4): dx_probe: Corrupt dir inode 157331367, running e2fsck is recommended.
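The dx_probe warnings in this comment name each affected directory inode. The following is a small sketch for collecting the distinct inode numbers from the MDS kernel log, assuming the message format shown in the comment. The function name corrupt_dir_inodes is illustrative.

```shell
#!/bin/sh
# corrupt_dir_inodes: read MDS kernel log text on stdin and print each
# directory inode number reported by "dx_probe: Corrupt dir inode N",
# one per line, numerically sorted with duplicates removed.
corrupt_dir_inodes() {
    sed -n 's/.*dx_probe: Corrupt dir inode \([0-9][0-9]*\),.*/\1/p' | sort -n -u
}

# Example usage on the MDS (the log path is an assumption; adjust as needed):
#   corrupt_dir_inodes < /var/log/messages
```

The resulting inode numbers could then be mapped back to path names with `debugfs -c -R "ncheck 17309149" /dev/mdtdev`. That step is a suggestion only and was not part of this ticket.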

            cliffw Cliff White (Inactive) added a comment -

            Is the issue closed, or is there some other help we can give you?

            mhanafi Mahmoud Hanafi added a comment -

            At this point we have been able to run fsck on the MDT and have recovered from the errors.

            People

              cliffw Cliff White (Inactive)
              mhanafi Mahmoud Hanafi