<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:15:15 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1281] e2fsck: Inodes that were part of a corrupted orphan linked list found.</title>
                <link>https://jira.whamcloud.com/browse/LU-1281</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;During some maintenance today we ran a read-only fsck on all the OSTs in one of our file systems after shutting the file system down. (All OSTs are unmounted.)&lt;/p&gt;

&lt;p&gt;On some OSTs we see these errors:&lt;/p&gt;

&lt;p&gt;sudo e2fsck -v -f -n  /dev/mapper/ost_lustre01_1&lt;br/&gt;
e2fsck 1.41.90.wc4 (01-Sep-2011)&lt;br/&gt;
MMP interval is 13 seconds and total wait time is 54 seconds. Please wait...&lt;br/&gt;
Pass 1: Checking inodes, blocks, and sizes&lt;br/&gt;
Inodes that were part of a corrupted orphan linked list found.  Fix? no&lt;/p&gt;

&lt;p&gt;Inode 393896873 was part of the orphaned inode list.  IGNORED.&lt;br/&gt;
Inode 393896873 is in use, but has dtime set.  Fix? no&lt;/p&gt;

&lt;p&gt;Inode 393896873, i_size is 0, should be 12288.  Fix? no&lt;/p&gt;

&lt;p&gt;Pass 2: Checking directory structure&lt;br/&gt;
Pass 3: Checking directory connectivity&lt;br/&gt;
Pass 4: Checking reference counts&lt;br/&gt;
Pass 5: Checking group summary information&lt;/p&gt;

&lt;p&gt;lustre01-OST0001: ********** WARNING: Filesystem still has errors **********&lt;/p&gt;


&lt;p&gt;  890452 inodes used (0.18%)&lt;br/&gt;
  257084 non-contiguous files (28.9%)&lt;br/&gt;
      32 non-contiguous directories (0.0%)&lt;br/&gt;
         # of inodes with ind/dind/tind blocks: 32/0/0&lt;br/&gt;
         Extent depth histogram: 778159/111004/1240&lt;br/&gt;
1437482870 blocks used (73.59%)&lt;br/&gt;
       0 bad blocks&lt;br/&gt;
      55 large files&lt;/p&gt;


&lt;p&gt;  890404 regular files&lt;br/&gt;
      39 directories&lt;br/&gt;
       0 character device files&lt;br/&gt;
       0 block device files&lt;br/&gt;
       0 fifos&lt;br/&gt;
       0 links&lt;br/&gt;
       0 symbolic links (0 fast symbolic links)&lt;br/&gt;
       0 sockets&lt;br/&gt;
--------&lt;br/&gt;
  890443 files&lt;/p&gt;

&lt;p&gt;We have not yet attempted to mount these OSTs. Can these errors be ignored? Or should we just let fsck fix them?&lt;/p&gt;</description>
                <environment>RHEL5 OSSes, DDN 9900 controller.</environment>
        <key id="13864">LU-1281</key>
            <summary>e2fsck: Inodes that were part of a corrupted orphan linked list found.</summary>
                <type id="3" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11318&amp;avatarType=issuetype">Task</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="adilger">Andreas Dilger</assignee>
                                    <reporter username="ferner">Frederik Ferner</reporter>
                        <labels>
                    </labels>
                <created>Tue, 3 Apr 2012 15:04:43 +0000</created>
                <updated>Mon, 16 Jul 2012 12:10:00 +0000</updated>
                            <resolved>Mon, 16 Jul 2012 12:10:00 +0000</resolved>
                                    <version>Lustre 1.8.x (1.8.0 - 1.8.5)</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="33384" author="adilger" created="Tue, 3 Apr 2012 16:22:58 +0000"  >&lt;p&gt;This is a relatively benign error message.  It means some object was being truncated, the OST crashed, and the truncate didn&apos;t complete correctly on recovery.  Running e2fsck will repair the object to a &quot;safe&quot; size (12kB in this case), which might result in NUL padding at the end of the file.&lt;/p&gt;

&lt;p&gt;You can locate which MDS file this belongs to with the following commands (this is much easier on 2.x).  On the OST run:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;ost&amp;gt; debugfs -c -R &quot;stat &amp;lt;393896873&amp;gt;&quot; /dev/mapper/ost_lustre01_1

debugfs 1.41.90.wc3 (28-May-2011)
/dev/vgmyth/lvmythost0: catastrophic mode - not reading inode or group bitmaps
Inode: 63681   Type: regular    Mode:  0666   Flags: 0x80000
Generation: 2393149953    Version: 0x0000002a:00005f81
User:  1000   Group:  1000   Size: 25165824
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 49152
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x4bad91f0:00000000 -- Fri Mar 26 23:04:48 2010
 atime: 0x4bad91f0:00000000 -- Fri Mar 26 23:04:48 2010
 mtime: 0x4bad91d1:00000000 -- Fri Mar 26 23:04:17 2010
crtime: 0x4bad87f0:d2d42b84 -- Fri Mar 26 22:22:08 2010
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  fid = &quot;b9 da 24 00 00 00 00 00 6a fa 0d 3f 01 00 00 00 eb 5b 0b 00 00 00 00 00 00 00 00 00 00 00 00 00 &quot; (32)
  fid: objid=744427 seq=0 parent=[0x24dab9:0x3f0dfa6a:0x0] stripe=1
EXTENTS:
(0-255):4620544-4620799, (256-6143):4621312-4627199
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Of interest is the &quot;fid:&quot; line, where it reports the parent inode number (the first number inside the square brackets, 0x24dab9 in this example).&lt;/p&gt;
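
&lt;p&gt;As a rough illustration of the step above, the parent inode can be pulled out of the &quot;fid:&quot; line with a short script. This is only a sketch, not part of the original procedure: the helper names parent_inode_from_fid_line and ncheck_request are hypothetical, and it assumes the &quot;fid:&quot; line has the same layout as in the debugfs output shown here.&lt;/p&gt;

```python
import re


def parent_inode_from_fid_line(line):
    """Return the parent inode number (as an int) from a debugfs 'fid:' line.

    Assumes the layout 'parent=[0xINODE:0xGENERATION:0x0]', as in the
    example output above; the first number in the brackets is the inode.
    """
    match = re.search(r"parent=\[(0x[0-9a-fA-F]+):", line)
    if match is None:
        raise ValueError("no parent=[...] field in line: " + repr(line))
    return int(match.group(1), 16)


def ncheck_request(inodes):
    """Build a single debugfs 'ncheck' request listing every given inode."""
    return "ncheck " + " ".join(hex(inode) for inode in inodes)


# The 'fid:' line copied from the example debugfs output.
fid_line = "  fid: objid=744427 seq=0 parent=[0x24dab9:0x3f0dfa6a:0x0] stripe=1"

inode = parent_inode_from_fid_line(fid_line)
print(inode)                    # decimal form of 0x24dab9
print(ncheck_request([inode]))  # request string for debugfs -R
```

&lt;p&gt;The resulting string is what would be passed to debugfs -R on the MDT. Giving ncheck_request several inodes puts them all into one request.&lt;/p&gt;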

&lt;p&gt;On the MDT run:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mdt&amp;gt; debugfs -c -R &quot;ncheck 0x24dab9&quot; /dev/{MDT}
debugfs 1.41.90.wc3 (28-May-2011)
/dev/vgmyth/lvmythmdt0.ssd: catastrophic mode - not reading inode or group bitmaps
Inode	Pathname
2415289	/ROOT/tmp/4stripe
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If your filesystem is large, ncheck may take some time.  If there are multiple inodes affected from multiple OSTs, you can list them all on the same &quot;ncheck&quot; line to avoid scanning the filesystem multiple times.  In Lustre 2.x this &quot;ncheck&quot; step can be avoided, and the pathname can be resolved directly from the FID (e.g. &quot;lfs fid2path [0x24dab9:0x3f0dfa6a:0x0]&quot;).&lt;/p&gt;</comment>
                            <comment id="33437" author="ferner" created="Wed, 4 Apr 2012 06:53:45 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;thanks for your comment. Based on this, I have managed to identify 5 (out of 6) files; for one FID, ncheck did not return any file on the MDT. I suspect lfsck would report that as an orphaned object?&lt;/p&gt;

&lt;p&gt;In addition to this, I ran e2fsck -v -f -p on the affected OSTs. e2fsck managed to repair only one of them; in all other cases it stopped without repairing the corrupted orphan linked list:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[bnh65367@cs04r-sc-oss01-05 ~]$ sudo e2fsck -v -f -p  /dev/mapper/ost_lustre01_1

lustre01-OST0001: Inodes that were part of a corrupted orphan linked list found.  

lustre01-OST0001: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
	(i.e., without -a or -p options)
[bnh65367@cs04r-sc-oss01-05 ~]$ 
[bnh65367@cs04r-sc-oss01-05 ~]$ sudo e2fsck -v -f -n  /dev/mapper/ost_lustre01_1 
e2fsck 1.41.90.wc4 (01-Sep-2011)
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix? no

Inode 393896873 was part of the orphaned inode list.  IGNORED.
Inode 393896873 is in use, but has dtime set.  Fix? no

Inode 393896873, i_size is 0, should be 12288.  Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

lustre01-OST0001: ********** WARNING: Filesystem still has errors **********


  890452 inodes used (0.18%)
  257084 non-contiguous files (28.9%)
      32 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 32/0/0
         Extent depth histogram: 778159/111004/1240
1437482870 blocks used (73.59%)
       0 bad blocks
      55 large files

  890404 regular files
      39 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
  890443 files
[bnh65367@cs04r-sc-oss01-05 ~]$ 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After checking the file names of the affected files, I&apos;m not too concerned as they are either no longer important or should be relatively easy to recreate. I would still prefer to fix the OSTs.&lt;/p&gt;</comment>
                            <comment id="33438" author="ferner" created="Wed, 4 Apr 2012 06:57:54 +0000"  >&lt;p&gt;(Sorry, forgot to add this.)&lt;/p&gt;

&lt;p&gt;I&apos;m also usually very cautious when it comes to running fsck manually without knowing what it does. &lt;/p&gt;

&lt;p&gt;On one OST, e2fsck managed to fix it:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[bnh65367@cs04r-sc-oss01-05 ~]$ sudo e2fsck -v -f -n  /dev/mapper/ost_lustre01_0
e2fsck 1.41.90.wc4 (01-Sep-2011)
Pass 1: Checking inodes, blocks, and sizes
Inode 2 creation time (Thu Jan  1 02:08:16 1970) invalid.
Clear? no

Inodes that were part of a corrupted orphan linked list found.  Fix? no

Inode 22242677 was part of the orphaned inode list.  IGNORED.
Inode 22242677 is in use, but has dtime set.  Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

lustre01-OST0000: ********** WARNING: Filesystem still has errors **********


  930294 inodes used (0.19%)
  271214 non-contiguous files (29.2%)
      32 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 32/0/0
         Extent depth histogram: 813541/115276/1428
1398216469 blocks used (71.58%)
       0 bad blocks
      36 large files

  930246 regular files
      39 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
  930285 files
[bnh65367@cs04r-sc-oss01-05 ~]$ sudo e2fsck -v -f -p  /dev/mapper/ost_lustre01_0
lustre01-OST0000: Truncating orphaned inode 22242677 (uid=499, gid=499, mode=0100666, size=0)
lustre01-OST0000: Truncating orphaned inode 55623793 (uid=499, gid=101902, mode=0100666, size=0)
lustre01-OST0000: Inode 2 creation time (Thu Jan  1 02:08:16 1970) invalid.
CLEARED.


  930294 inodes used (0.19%)
  271214 non-contiguous files (29.2%)
      32 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 32/0/0
         Extent depth histogram: 813541/115276/1428
1398216467 blocks used (71.58%)
       0 bad blocks
      36 large files

  930246 regular files
      39 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
  930285 files
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="33450" author="adilger" created="Wed, 4 Apr 2012 09:46:33 +0000"  >&lt;p&gt;Running e2fsck -fp will repair only straightforward errors. That is a reasonable approach until one knows what the error is. In this case, you should run e2fsck -fy so that it answers &quot;yes&quot; to repairing these errors.&lt;/p&gt;</comment>
                            <comment id="33545" author="ferner" created="Thu, 5 Apr 2012 14:08:02 +0000"  >&lt;p&gt;Thanks for your reply. Unfortunately I had to mount the file system before I received the reply. I&apos;ve identified all affected files and will schedule another maintenance window to fix it properly.&lt;/p&gt;</comment>
                            <comment id="34643" author="adilger" created="Thu, 12 Apr 2012 17:06:30 +0000"  >&lt;p&gt;Are there any questions related to this issue, or can this bug be closed?&lt;/p&gt;</comment>
                            <comment id="41881" author="ferner" created="Mon, 16 Jul 2012 12:04:21 +0000"  >&lt;p&gt;Apologies for the delay; I missed the update.&lt;/p&gt;

&lt;p&gt;Yes, the ticket can be closed. Thanks.&lt;/p&gt;</comment>
                            <comment id="41884" author="pjones" created="Mon, 16 Jul 2012 12:10:00 +0000"  >&lt;p&gt;Thanks Frederik!&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvzy7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10070</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>