<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:40:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4226] MDS unable to locate swabbed FID SEQ in FLDB</title>
                <link>https://jira.whamcloud.com/browse/LU-4226</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Our sysadmins updated one of our Lustre 2.1 filesystems to Lustre &lt;a href=&quot;https://github.com/chaos/lustre/tree/2.4.0-19chaos&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;2.4.0-19chaos&lt;/a&gt;.  Note that this filesystem was likely originally formatted under 1.8.  It looks like oi_scrub ran automatically this time, but failed to make any updates:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt; cat osd-ldiskfs/lsd-MDT0000/oi_scrub
name: OI_scrub
magic: 0x4c5fd252
oi_files: 1
status: completed
flags:
param:
time_since_last_completed: 505891 seconds
time_since_latest_start: 521998 seconds
time_since_last_checkpoint: 505891 seconds
latest_start_position: 12
last_checkpoint_position: 991133697
first_failure_position: N/A
checked: 200636112
updated: 0
failed: 0
prior_updated: 0
noscrub: 3090
igif: 15492100
success_count: 2
run_time: 16107 seconds
average_speed: 12456 objects/sec
real-time_speed: N/A
current_position: N/A
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You&apos;ll recall that we had OI scrub problems when we tried to upgrade the first ldiskfs filesystem to 2.4 in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3934&quot; title=&quot;Directories gone missing after 2.4 update&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3934&quot;&gt;&lt;del&gt;LU-3934&lt;/del&gt;&lt;/a&gt;.  This time we are using a version of Lustre with the suggested patches included.&lt;/p&gt;

&lt;p&gt;We are seeing symptoms similar to last time.  For example, directory listings show ????????? for the permission flags of some subdirectories, and we are seeing errors on the MDS console like this:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(fld_handler.c:169:fld_server_lookup()) srv-lsd-MDT0000: Cannot find sequence 0x607000002000000: rc = -5
Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(fld_handler.c:169:fld_server_lookup()) Skipped 20 previous similar messages
Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(osd_handler.c:2125:osd_fld_lookup()) lsd-MDT0000-osd: cannot find FLD range for [0x607000002000000:0x8a0:0x0]: rc = -5
Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(osd_handler.c:2125:osd_fld_lookup()) Skipped 14 previous similar messages
Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(osd_handler.c:3317:osd_remote_fid()) lsd-MDT0000-osd: Can not lookup fld for [0x607000002000000:0x8a0:0x0]
Nov  7 08:06:19 momus-mds1 kernel: LustreError: 7326:0:(osd_handler.c:3317:osd_remote_fid()) Skipped 14 previous similar messages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The filesystem is unusable for many of our users.&lt;/p&gt;</description>
                <environment></environment>
        <key id="21917">LU-4226</key>
            <summary>MDS unable to locate swabbed FID SEQ in FLDB</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                            <label>llnl</label>
                            <label>ppc</label>
                    </labels>
                <created>Thu, 7 Nov 2013 19:18:36 +0000</created>
                <updated>Tue, 13 Dec 2016 22:45:08 +0000</updated>
                            <resolved>Tue, 13 Dec 2016 22:45:08 +0000</resolved>
                                    <version>Lustre 1.8.9</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="71011" author="pjones" created="Thu, 7 Nov 2013 20:15:53 +0000"  >&lt;p&gt;Di is looking into this issue&lt;/p&gt;</comment>
                            <comment id="71015" author="adilger" created="Thu, 7 Nov 2013 20:37:55 +0000"  >&lt;p&gt;It looks like the FID sequences for your objects are very strange, and the node cannot figure out on which server those objects are located.  The FID sequence of &lt;span class=&quot;error&quot;&gt;&amp;#91;0x607000002000000:0x8a0:0x0&amp;#93;&lt;/span&gt; is way outside the range of IGIF FIDs (0x0000000c-0xffffffff) reserved for 1.8 objects, and also way outside the range of objects that would normally be allocated for 2.x MDT objects (starting at 0x200000400).  It almost looks like some kind of endian bug?  If that was swabbed it would be 0x200000706, which would be a very reasonable FID sequence.&lt;/p&gt;
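
&lt;p&gt;A minimal sketch of that byte swap, assuming the sequence is held as a 64-bit unsigned value. The helper name swap_bytes is hypothetical and is not a Lustre or e2fsprogs function:&lt;/p&gt;

```python
def swap_bytes(value, width):
    """Reverse the byte order of an unsigned integer stored in `width` bytes."""
    return int.from_bytes(value.to_bytes(width, "big"), "little")


# Sequence taken from the "Cannot find sequence" error on the MDS console.
seq = 0x607000002000000

# Swapping all 8 bytes gives a value just above 0x200000400,
# where normal 2.x MDT sequences start.
print(hex(swap_bytes(seq, 8)))  # 0x200000706
```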

&lt;p&gt;What does the FLDB (FID-&amp;gt;server location mapping table) look like on your MDS?  The following will dump out the FID sequence allocation tables on the node:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mds# lctl get_param seq.*.*
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In particular, the &quot;fldb&quot; entry will show the global mapping table of sequence numbers (first part of the FID) to the server:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;seq.ctl-testfs-MDT0000.fldb=
[0x000000000000000c-0x0000000100000000):0:mdt
[0x0000000200000002-0x0000000200000003):0:mdt
[0x0000000200000007-0x0000000200000008):0:mdt
[0x0000000200000400-0x0000000240000400):0:mdt
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this case (single MDT filesystem) all of the allocated sequences map to MDT0000.  It also shows that the rest of the unallocated &quot;space&quot; is reserved by the sequence controller (always on MDT0) for future assignment:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;seq.ctl-testfs-MDT0000.space=[0x240000400 - 0xffffffffffffffff]:0:mdt
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="71017" author="adilger" created="Thu, 7 Nov 2013 21:13:49 +0000"  >&lt;p&gt;I don&apos;t necessarily suspect LFSCK as the culprit here.  There are only about 15M inodes that were created with a 1.x MDS, while the remaining 185M were created with a 2.x MDS (either 1.8 or 2.1 clients).&lt;/p&gt;

&lt;p&gt;It seems more likely that the DNE code is refusing to process these strange FIDs because it thinks that they belong to a remote MDT.  That probably wasn&apos;t being checked in the 2.1 code, since it only could handle objects on MDT0.&lt;/p&gt;

&lt;p&gt;Could you please provide a sample of FIDs that are reporting errors (e.g. grep for &quot;Cannot find sequence&quot; in syslog) and attach here?  I suspect that they are all byte-swapped FID sequences.&lt;/p&gt;

&lt;p&gt;The other important question is whether this problem is due to legacy objects (e.g. files created with 1.8 clients on PPC nodes), or if they are still actively being created?  Could you please try on a PPC client node and on an x86 client node:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;client$ touch /path/to/lustre/testfile
client$ ls -li /path/to/lustre/testfile
client$ lfs getstripe -v /path/to/lustre/testfile
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If this shows that 2.4 PPC clients (or 2.1 PPC clients, if you still have them) are still creating these objects, then this would be the first problem to find and fix.&lt;/p&gt;

&lt;p&gt;It might be possible to make a workaround for this by adding an FLDB entry to cover the sequence range &lt;span class=&quot;error&quot;&gt;&amp;#91;0x0000N0002000000-0xffffN0002000000):0:mdt&amp;#93;&lt;/span&gt;, but I&apos;m not sure how confused the FLDB code would get if the unallocated &quot;space&quot; didn&apos;t extend to 0xffffffffffffffff.  It also depends on how many FID sequences were allocated in swabbed order.  If there were a large number of these sequences allocated (i.e. &quot;0x0000002000N0000&quot; is very large, and hence &quot;0x0000N0002000000&quot; is) then this could encroach on the unallocated sequence space and potentially cause problems in the future.  That said, with the FLDB workaround entry in place, and if no new ones were being created, it would be possible to find and migrate those inodes to have &quot;normal&quot; FIDs, and then remove the FLDB workaround entry to avoid issues in the future.&lt;/p&gt;</comment>
                            <comment id="71021" author="adilger" created="Thu, 7 Nov 2013 21:31:56 +0000"  >&lt;p&gt;It might also be useful to get the FLDB and sequence allocation information from a PPC client (both 2.4 and 2.1 if you are still running both):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;client$ lctl get_param seq.*.* | grep ffff
seq.cli-cli-testfs-MDT0000-mdc-ffff8800464e6800.fid=[0x200000401:0x1:0x0]
seq.cli-cli-testfs-MDT0000-mdc-ffff8800464e6800.server=testfs-MDT0000_UUID
seq.cli-cli-testfs-MDT0000-mdc-ffff8800464e6800.space=[0x200000402 - 0x200000402]:0:mdt
seq.cli-cli-testfs-MDT0000-mdc-ffff8800464e6800.width=131072
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="71026" author="morrone" created="Thu, 7 Nov 2013 21:57:21 +0000"  >&lt;p&gt;Andreas, I am not aware of any PPC systems mounting this filesystem.  Perhaps at some time in the past, I don&apos;t really know.  But not now to the best of my knowledge.&lt;/p&gt;

&lt;p&gt;We are also having trouble with top-level directories in lustre, which were certainly never created from PPC nodes.&lt;/p&gt;

&lt;p&gt;Here are some of the problem FID sequences reported on the MDS console:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;0x105a949837030000
0x22f78a0102000000
0x22f78a0102000003
0x2390f96017010000
0x2570c30002000000
0x2d67c37f37030000
0x2d67c37f37030001
0x2d67c37f37030002
0x2d67c37f37030003
0x2f5e2f0102000000
0x607000002000000
0x6897260102000000
0x716c574202000000
0x82d7a31e0e070013
0x8ba9660102000000
0x969f00d6c7080009
0x978f48ef0d070002
0x978f48ef0d070007
0xaf97d14acf070009
0xaf97d14acf070017
0xaf97d14acf070018
0xaf97d14acf070022
0xb7539d4402000000
0xf104000002000000
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;EDIT: fixed sequence list&lt;/p&gt;</comment>
                            <comment id="71032" author="morrone" created="Thu, 7 Nov 2013 22:11:29 +0000"  >&lt;p&gt;More requested info:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;seq.cli-cli-lsd-OST0000-osc-MDT0000.fid=[0x0:0x0:0x0]
seq.cli-cli-lsd-OST0000-osc-MDT0000.server=lsd-OST0000_UUID
seq.cli-cli-lsd-OST0000-osc-MDT0000.space=[0x0 - 0x0]:0:mdt
seq.cli-cli-lsd-OST0000-osc-MDT0000.width=4294967295
seq.cli-cli-lsd-OST0001-osc-MDT0000.fid=[0x0:0x0:0x0]
seq.cli-cli-lsd-OST0001-osc-MDT0000.server=lsd-OST0001_UUID
seq.cli-cli-lsd-OST0001-osc-MDT0000.space=[0x0 - 0x0]:0:mdt
seq.cli-cli-lsd-OST0001-osc-MDT0000.width=4294967295
[cut, they all look the same]
seq.cli-cli-lsd-OST0257-osc-MDT0000.fid=[0x0:0x0:0x0]
seq.cli-cli-lsd-OST0257-osc-MDT0000.server=lsd-OST0257_UUID
seq.cli-cli-lsd-OST0257-osc-MDT0000.space=[0x0 - 0x0]:0:mdt
seq.cli-cli-lsd-OST0257-osc-MDT0000.width=4294967295
seq.cli-ctl-lsd-MDT0000.fid=[0x0:0x0:0x0]
seq.cli-ctl-lsd-MDT0000.server=ctl-lsd-MDT0000
seq.cli-ctl-lsd-MDT0000.space=[0x0 - 0x0]:0:mdt
seq.cli-ctl-lsd-MDT0000.width=131072
seq.ctl-lsd-MDT0000.server=&amp;lt;none&amp;gt;
seq.ctl-lsd-MDT0000.space=[0x8c800000400 - 0xffffffffffffffff]:0:mdt
seq.ctl-lsd-MDT0000.width=1073741824
seq.srv-lsd-MDT0000.server=ctl-lsd-MDT0000
seq.srv-lsd-MDT0000.space=[0x8c7da139f01 - 0x8c800000400]:0:mdt
seq.srv-lsd-MDT0000.width=1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="71033" author="morrone" created="Thu, 7 Nov 2013 22:12:53 +0000"  >&lt;p&gt;Andreas, the only fldb file is under the fld tree, not seq like you demonstrated above.  Does that mean anything?&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;fld.srv-lsd-MDT0000.fldb=
[0x0000000000000001-0x0000000100000000):0:mdt
[0x0000000200000002-0x0000000200000003):0:mdt
[0x0000000200000007-0x0000000200000008):0:mdt
[0x0000000200000400-0x000008c800000400):0:mdt
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="71035" author="di.wang" created="Thu, 7 Nov 2013 22:24:19 +0000"  >&lt;p&gt;Ah, so &lt;span class=&quot;error&quot;&gt;&amp;#91;0x607000002000000:0x8a0:0x0&amp;#93;&lt;/span&gt; is outside the allocated sequence space, but I do not understand how top-level directories could end up with wrong-sequence FIDs if they were not created from a PPC client. I do not recall any bug that could trigger this problem, but I might be missing something.&lt;/p&gt;</comment>
                            <comment id="71040" author="adilger" created="Thu, 7 Nov 2013 22:36:22 +0000"  >&lt;p&gt;Location of fldb file isn&apos;t important.  The &quot;fldb&quot; entry looks consistent with the &quot;space&quot; entry for MDT0.&lt;/p&gt;

&lt;p&gt;What it does show is that the MDS thinks the FID sequences are being allocated &lt;em&gt;somewhat&lt;/em&gt; normally, from 0x200000400 onward on MDT0.  It is somewhat &lt;em&gt;unusual&lt;/em&gt; in that it appears to have allocated 0x8c7da139f01-0x200000400 = 9645860297473 (~= 2^43 or 9 trillion) sequences for MDT0000.  Each sequence would typically be used by one client per mount, though I recall you had bug &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1632&quot; title=&quot;FID sequence numbers not working properly with filesystems formatted using 1.8?&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1632&quot;&gt;&lt;del&gt;LU-1632&lt;/del&gt;&lt;/a&gt;, and even in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3318&quot; title=&quot;mdc_set_lock_data() ASSERTION( old_inode-&amp;gt;i_state &amp;amp; I_FREEING ) &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3318&quot;&gt;&lt;del&gt;LU-3318&lt;/del&gt;&lt;/a&gt; it reports that you have 2.1.3 clients that do not have the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1632&quot; title=&quot;FID sequence numbers not working properly with filesystems formatted using 1.8?&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1632&quot;&gt;&lt;del&gt;LU-1632&lt;/del&gt;&lt;/a&gt; patch and are consuming FID sequences like crazy.  That isn&apos;t totally critical at this point, however, since even 9T sequences isn&apos;t a huge dent in the 2^64 sequence space if they are being consumed one-at-a-time.&lt;/p&gt;
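
&lt;p&gt;The arithmetic above can be reproduced with a short sketch. The variable names are illustrative only; the two hex values are the start of the srv &quot;space&quot; entry and the start of the last fldb range reported earlier in this ticket:&lt;/p&gt;

```python
import math

# Start of the normal 2.x MDT sequence range (from the fldb output).
first_seq = 0x200000400
# Next unallocated sequence on MDT0000 (start of seq.srv-lsd-MDT0000.space).
next_seq = 0x8c7da139f01

allocated = next_seq - first_seq
print(allocated)  # 9645860297473

# Order of magnitude as a power of two, and the share of the 2^64 space used.
print(math.log2(allocated))
print(allocated / 2**64)
```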

&lt;p&gt;My theory about byte-swabbed FID sequences is out the window I guess.  The values look to be all over the place, and I can&apos;t see any obvious pattern about what would be causing this.&lt;/p&gt;</comment>
                            <comment id="71044" author="di.wang" created="Thu, 7 Nov 2013 22:49:22 +0000"  >&lt;p&gt;Another possibility might be that OI scrub corrupted the FIDs during the upgrade, but that is just a wild guess. Fan Yong, please comment here.&lt;/p&gt;

&lt;p&gt;Chris: if you create a new file on this system right now, does it get a reasonable FID? You can get the FID with &quot;lfs path2fid xxxx&quot;. Thanks.&lt;/p&gt;</comment>
                            <comment id="71045" author="morrone" created="Thu, 7 Nov 2013 22:53:30 +0000"  >&lt;p&gt;I have some more information about the PPC situation.  We &lt;em&gt;did&lt;/em&gt; have PPC clients mount this filesystem in the past.  The PPC clients were running a 1.8 flavor while the servers were at 2.1.  It &lt;em&gt;is&lt;/em&gt; possible that some top-level user directories were created from a PPC login node.&lt;/p&gt;

&lt;p&gt;We are not mounting the filesystem from any PPC nodes today.&lt;/p&gt;</comment>
                            <comment id="71047" author="morrone" created="Thu, 7 Nov 2013 22:55:38 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ touch bar 
$ lfs path2fid bar
[0x8c7da1389ec:0x4:0x0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Yes, that is working.&lt;/p&gt;</comment>
                            <comment id="71051" author="adilger" created="Thu, 7 Nov 2013 23:28:27 +0000"  >&lt;p&gt;Chris, would you say that the number of broken files is a large fraction of files in the filesystem, or only isolated to specific files/directories?&lt;/p&gt;

&lt;p&gt;If you know of a specific directory in the filesystem suffering this problem, could you please check on the MDS with a relatively new version of e2fsprogs:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mds# debugfs -c -R &quot;stat /ROOT/path/to/bad/file&quot; /dev/mdsdev
debugfs 1.42.7.wc1 (12-Apr-2013)                                                
/dev/vg_sookie/lvmdt1: catastrophic mode - not reading inode or group bitmaps   
Inode: 117   Type: regular    Mode:  0644   Flags: 0x0                          
Generation: 2384158001    Version: 0x00000002:00000008                          
User:     0   Group:     0   Size: 0                                            
File ACL: 0    Directory ACL: 0                                                 
Links: 1   Blockcount: 0                                                        
Fragment:  Address: 0    Number: 0    Size: 0                                   
 ctime: 0x527c1e95:00000000 -- Thu Nov  7 16:13:25 2013                         
 atime: 0x527c1e95:00000000 -- Thu Nov  7 16:13:25 2013                         
 mtime: 0x527c1e95:00000000 -- Thu Nov  7 16:13:25 2013                         
crtime: 0x527c1e95:d604a370 -- Thu Nov  7 16:13:25 2013                         
Size of extra inode fields: 28                                                  
Extended attributes stored in inode body:                                       
  lma = &quot;00 00 00 00 00 00 00 00 00 04 00 00 02 00 00 00 04 00 00 00 00 00 00 00
 &quot; (24)                                                                         
  lma: fid=[0x0000000200000400:0x4:0x0] compat=0 incompat=0                  
  lov = &quot;d0 0b d1 0b 01 00 00 00 04 00 00 00 00 00 00 00 00 04 00 00 02 00 00 00
 00 00 10 00 01 00 00 00 22 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
0 00 01 00 00 00 &quot; (56)                                                         
  link = &quot;df f1 ea 11 01 00 00 00 2d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
0 00 15 00 00 00 02 00 00 04 00 00 00 00 03 00 00 00 00 66 6f 6f &quot; (45)         
BLOCKS:

mds# debugfs -c -R &quot;ls -lD /ROOT/path/to/bad&quot; /dev/mdsdev                                                                  
debugfs 1.42.7.wc1 (12-Apr-2013)                                                
/dev/vg_sookie/lvmdt1: catastrophic mode - not reading inode or group bitmaps   
    116   40755 (2)  0   0  4096  7-Nov-2013 16:13 .                   
 229388   40755 (18) 0   0  4096  7-Nov-2013 16:13 [0x200000007:0x1:0x0] ..                                                                           
    117  100644 (17) 0   0     0  7-Nov-2013 16:13 [0x200000400:0x4:0x0] foo                                                                         
    118  100644 (17) 0   0     0  7-Nov-2013 16:13 [0x200000400:0x5:0x0] bar                                                                     
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will show us the FID in the LMA xattr, in the LOV xattr, and in the directory entry data.  It will also tell us if the file was recently created, or is old.  It would also be useful to use the &quot;stat&quot; command to check the crtime of several of the files with bad FIDs, to see if there is any consistency between new/old creation date.&lt;/p&gt;

&lt;p&gt;What Lustre version(s) are the clients on this system?&lt;/p&gt;</comment>
                            <comment id="71052" author="adilger" created="Thu, 7 Nov 2013 23:29:41 +0000"  >&lt;p&gt;Ah, ignore the path2fid request in my previous comment, I see you already did that for Di.&lt;/p&gt;</comment>
                            <comment id="71053" author="di.wang" created="Thu, 7 Nov 2013 23:36:44 +0000"  >&lt;p&gt;I just made a temporary patch &lt;a href=&quot;http://review.whamcloud.com/8213&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8213&lt;/a&gt; to make the FLDB &quot;work&quot;, i.e. every nonexistent sequence will point to MDT0. Since you are using a single MDT, this can make the FLDB work.&lt;/p&gt;

&lt;p&gt;But since we do not know the real problem yet, I am not sure whether this temporary patch will make the FS &quot;work&quot; again (i.e. I do not know whether the OI table is good or not right now). It will help us understand the problem, though.&lt;/p&gt;</comment>
                            <comment id="71058" author="morrone" created="Fri, 8 Nov 2013 00:25:04 +0000"  >&lt;p&gt;The problematic top-level directories are easiest to find:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# &amp;gt; debugfs.ldiskfs -c -R &quot;stat /ROOT/dkp&quot; /dev/sda
debugfs.ldiskfs 1.42.7.wc1.1chaos (12-Apr-2013)
/dev/sda: catastrophic mode - not reading inode or group bitmaps
Inode: 66529648   Type: directory    Mode:  0700   Flags: 0x80000
Generation: 607576858    Version: 0x0000002e:0ff37cd7
User: 41679   Group: 41679   Size: 4096
File ACL: 0    Directory ACL: 0
Links: 2   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x4f31ae1d:00000000 -- Tue Feb  7 15:05:01 2012
 atime: 0x524106c8:00000000 -- Mon Sep 23 20:28:08 2013
 mtime: 0x4f31ae1d:00000000 -- Tue Feb  7 15:05:01 2012
crtime: 0x4f31ae1d:e22ed204 -- Tue Feb  7 15:05:01 2012
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  lma = &quot;00 00 00 00 00 00 00 00 00 00 00 02 00 00 07 06 4f 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 &quot; (64)
  lma: fid=[0x200000706:0x4f090000:0x0] compat=0 incompat=0
  link = &quot;df f1 ea 11 01 00 00 00 2d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 15 00 00 00 00 01 e6 00 01 fb 01 ac dd 00 00 00 00 64 6b 70 &quot; (45)
  lov = &quot;d0 0b d1 0b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 ff ff &quot; (32)
EXTENTS:
(0):33264747

# &amp;gt; debugfs.ldiskfs -c -R &quot;ls -lD /ROOT&quot; /dev/sda |grep dkp
debugfs.ldiskfs 1.42.7.wc1.1chaos (12-Apr-2013)
/dev/sda: catastrophic mode - not reading inode or group bitmaps
 66529648   40700 (2)  41679  41679    4096  7-Feb-2012 15:05 dkp
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This directory on a lustre client looks like:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;d??????????    ? ?        ?               ?            ? dkp
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;Clients are mostly 2.1 flavor still, although there may be some 2.4 out there.  We&apos;re in the (much longer than hoped) process of upgrading the servers before we do the client upgrades.&lt;/p&gt;</comment>
                            <comment id="71060" author="adilger" created="Fri, 8 Nov 2013 01:00:44 +0000"  >&lt;p&gt;In this case &quot;fid=&lt;span class=&quot;error&quot;&gt;&amp;#91;0x200000706:0x4f090000:0x0&amp;#93;&lt;/span&gt;&quot; has a sequence that looks reasonable &quot;0x200000706&quot;, though the OID looks strange &quot;0x4f090000&quot;.  It should normally be below 128k.  In any case, this in itself shouldn&apos;t be causing the &quot;can&apos;t find sequence&quot; errors, since this particular sequence is valid and should map to MDT0 properly.&lt;/p&gt;

&lt;p&gt;The FID in the LMA does not match the one in the LOV EA (which appears to be all zero) since this is a directory and not a regular file.  This is not a problem, but I just wanted to see if this data was inconsistent.  Could you do this same step with a regular file?&lt;/p&gt;

&lt;p&gt;It also looks like your version of debugfs.ldiskfs either does not implement the &quot;-D&quot; option, or the top level directory does not have dirdata that holds the FID.&lt;/p&gt;

&lt;p&gt;Since I also have an MDT that has existed since Lustre 1.6 or earlier (currently running 2.1.3 server with 2.4 clients), I&apos;m going to try upgrading this to 2.4.1 to see what happens.&lt;/p&gt;</comment>
                            <comment id="71062" author="di.wang" created="Fri, 8 Nov 2013 01:24:03 +0000"  >&lt;p&gt;Hmm, the sequence seems correct, so Andreas&apos;s idea of a byte-swapping problem might be correct. But the OID seems too big here. (The maximum OID of an MDT FID should be 0x20000ULL.)&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lma: fid=[0x200000706:0x4f090000:0x0] compat=0 incompat=0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Chris, could you please tell me the client version (2.1.6?), and whether it has a tag?  What is the output of &quot;lfs path2fid dkp&quot;?&lt;/p&gt;

&lt;p&gt;Could you please provide a -1 debug log from the client and MDT when running &quot;stat dkp&quot;? (Please clear the client cache before doing this with &quot;lctl set_param ldlm.*.*MDT*-mdc*.lru_size=0&quot;.) Thanks.&lt;/p&gt;</comment>
                            <comment id="71068" author="morrone" created="Fri, 8 Nov 2013 01:43:33 +0000"  >&lt;p&gt;The client I am on happens to be the same lustre version 2.4.0-19chaos.  There are probably lustre 2.1.4-&lt;span class=&quot;error&quot;&gt;&amp;#91;45&amp;#93;&lt;/span&gt;chaos clients mounting the filesystem as well.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt; lfs path2fid dkp
can&apos;t get fid for dkp: Input/output error
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I&apos;ll work on getting logs.&lt;/p&gt;</comment>
                            <comment id="71072" author="morrone" created="Fri, 8 Nov 2013 02:10:09 +0000"  >&lt;p&gt;I spot checked some working top level directories, and large OID numbers look pretty common.  Here are some fid EAs from working directories:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;  lma: fid=[0xd6e35b0102000000:0xa3b20000:0x0] compat=0 incompat=0
  lma: fid=[0x7c83e80100000000:0x716125c4:0x0] compat=0 incompat=0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Using path2fid, those two are reported on the client, respectively, as:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[0x2015be3d6:0xb2a3:0x0]
[0x1e8837c:0xc4256171:0x0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So it would appear that debugfs is printing the fid info in the wrong byte order.&lt;/p&gt;
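
&lt;p&gt;A minimal sketch of that comparison, assuming f_seq is 64 bits wide and f_oid is 32 bits wide. The helper swap_bytes is hypothetical, and the version field is simply printed as 0x0:&lt;/p&gt;

```python
def swap_bytes(value, width):
    """Reverse the byte order of an unsigned integer stored in `width` bytes."""
    return int.from_bytes(value.to_bytes(width, "big"), "little")


# (sequence, OID) pairs as printed by debugfs for the two working directories.
debugfs_fids = [
    (0xd6e35b0102000000, 0xa3b20000),
    (0x7c83e80100000000, 0x716125c4),
]

for seq, oid in debugfs_fids:
    # Swap the 8-byte sequence and the 4-byte OID independently.
    print("[{:#x}:{:#x}:0x0]".format(swap_bytes(seq, 8), swap_bytes(oid, 4)))
# Prints:
# [0x2015be3d6:0xb2a3:0x0]
# [0x1e8837c:0xc4256171:0x0]
```

&lt;p&gt;Both lines match what path2fid reports on the client.&lt;/p&gt;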

&lt;p&gt;(Both client and server are x86_64, and running the same 2.4.0-19chaos version of lustre).&lt;/p&gt;</comment>
                            <comment id="71076" author="di.wang" created="Fri, 8 Nov 2013 02:44:05 +0000"  >&lt;p&gt;Interesting. Hmm, what is your debugfs version? debugfs is supposed to print things as little-endian on x86_64.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;static inline void lfsck_swab_fid(struct lu_fid *fid)
{                                     
        fid-&amp;gt;f_seq = ext2fs_le64_to_cpu(fid-&amp;gt;f_seq);
        fid-&amp;gt;f_oid = ext2fs_le32_to_cpu(fid-&amp;gt;f_oid);
        fid-&amp;gt;f_ver = ext2fs_le32_to_cpu(fid-&amp;gt;f_ver);
}       

static void print_lmastr(FILE *out, ext2_ino_t inode_num, void *data, int len)
{
        struct lustre_mdt_attrs *lma = data;

        if (len &amp;lt; sizeof(*lma)) {
                fprintf(stderr, &quot;%s: error: lma for inode %u smaller than &quot;
                        &quot;expected (%d bytes).\n&quot;,
                        debug_prog_name, inode_num, len);
                return;
        }
        lfsck_swab_fid(&amp;amp;lma-&amp;gt;lma_self_fid);
        fprintf(out, &quot;  lma: fid=&quot;DFID&quot;\n&quot;, PFID(&amp;amp;lma-&amp;gt;lma_self_fid));
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;Hmm, debugfs is wrong here, then the fid we got from the previous comment is wrong.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;fid=[0x200000706:0x4f090000:0x0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then it means the FID (&lt;span class=&quot;error&quot;&gt;&amp;#91;0x200000706:0x4f090000:0x0&amp;#93;&lt;/span&gt;) stored in the LMA is wrong.&lt;/p&gt;</comment>
                            <comment id="71077" author="morrone" created="Fri, 8 Nov 2013 02:56:13 +0000"  >&lt;p&gt;See attached client_log.txt and server_log.txt.bz2.  Note that the client nid is 192.168.115.67@o2ib10, and the server nid is 172.16.64.141@tcp.&lt;/p&gt;</comment>
                            <comment id="71078" author="morrone" created="Fri, 8 Nov 2013 02:56:18 +0000"  >&lt;p&gt;There is too much MDS traffic to reasonably catch that one RPC with -1 debugging.  I backed it off to our defaults + rpctrace on the server side.  Let me know if there is some other conservative debug setting that would work for you.&lt;/p&gt;

</comment>
                            <comment id="71079" author="morrone" created="Fri, 8 Nov 2013 02:58:14 +0000"  >&lt;p&gt;debugfs version is 1.42.7.wc1.1chaos&lt;/p&gt;</comment>
                            <comment id="71081" author="di.wang" created="Fri, 8 Nov 2013 03:12:53 +0000"  >&lt;p&gt;Unfortunately, the server log here is not very helpful:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00100000:9.0:1383878901.757322:0:7277:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1451068745273592
00000100:00100000:9.0:1383878901.757330:0:7277:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-192.168.115.67@o2ib10, seq: 610971421
00000100:00100000:9.0:1383878901.757333:0:7277:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt02_044:249dc688-8ad3-1d27-4460-e739bfdc22f5+5:68652:x1451068745273592:12345-192.168.115.67@o2ib10:101
80000000:00020000:9.0:1383878901.757376:0:7277:0:(fld_handler.c:169:fld_server_lookup()) srv-lsd-MDT0000: Cannot find sequence 0x607000002000000: rc = -5
00000004:00020000:9.0:1383878901.757380:0:7277:0:(osd_handler.c:2125:osd_fld_lookup()) lsd-MDT0000-osd: cannot find FLD range for [0x607000002000000:0x94f:0x0]: rc = -5
00000004:00020000:9.0:1383878901.757382:0:7277:0:(osd_handler.c:3317:osd_remote_fid()) lsd-MDT0000-osd: Can not lookup fld for [0x607000002000000:0x94f:0x0]
80000000:00020000:9.0:1383878901.757394:0:7277:0:(fld_handler.c:169:fld_server_lookup()) srv-lsd-MDT0000: Cannot find sequence 0x607000002000000: rc = -5
00000100:00100000:9.0:1383878901.757418:0:7277:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt02_044:249dc688-8ad3-1d27-4460-e739bfdc22f5+5:68652:x1451068745273592:12345-192.168.115.67@o2ib10:101 Request procesed in 85us (105us total) trans 0 rc 301/301
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is the log I found; it is almost the same as what we got from the console. Could you please also add &quot;+info +inode&quot; (lctl set_param debug=&quot;+info +inode&quot;) on the server side? I hope that would not be too much.&lt;/p&gt;</comment>
                            <comment id="71082" author="morrone" created="Fri, 8 Nov 2013 03:14:19 +0000"  >&lt;p&gt;What is &lt;em&gt;your&lt;/em&gt; version of e2fsprogs?  The code at 1.42.7.wc1 is:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        fid_be_to_cpu(&amp;amp;lma-&amp;gt;lma_self_fid, &amp;amp;lma-&amp;gt;lma_self_fid);
        fprintf(out, &lt;span class=&quot;code-quote&quot;&gt;&quot;  lma: fid=&quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot; compat=%x incompat=%x\n&quot;&lt;/span&gt;,
                PFID(&amp;amp;lma-&amp;gt;lma_self_fid), ext2fs_le32_to_cpu(lma-&amp;gt;lma_compat),
                ext2fs_le32_to_cpu(lma-&amp;gt;lma_incompat));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That does the opposite swab of the one that you showed:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; inline void fid_be_to_cpu(struct lu_fid *dst, struct lu_fid *src)
{
	dst-&amp;gt;f_seq = ext2fs_be64_to_cpu(src-&amp;gt;f_seq);
	dst-&amp;gt;f_oid = ext2fs_be32_to_cpu(src-&amp;gt;f_oid);
	dst-&amp;gt;f_ver = ext2fs_be32_to_cpu(src-&amp;gt;f_ver);
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="71083" author="di.wang" created="Fri, 8 Nov 2013 03:18:51 +0000"  >&lt;p&gt;Ah, in 1.42.7 debugfs is somewhat wrong.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;static void print_lmastr(FILE *out, ext2_ino_t inode_num, void *data, int len)
{
        struct lustre_mdt_attrs *lma = data;

        if (len &amp;lt; offsetof(typeof(*lma), lma_self_fid) +
                  sizeof(lma-&amp;gt;lma_self_fid)) {
                fprintf(stderr, &quot;%s: error: LMA for inode %u smaller than &quot;
                        &quot;expected (%d bytes).\n&quot;,
                        debug_prog_name, inode_num, len);
                return;
        }
        fid_be_to_cpu(&amp;amp;lma-&amp;gt;lma_self_fid, &amp;amp;lma-&amp;gt;lma_self_fid);
        fprintf(out, &quot;  lma: fid=&quot;DFID&quot; compat=%x incompat=%x\n&quot;,
                PFID(&amp;amp;lma-&amp;gt;lma_self_fid), ext2fs_le32_to_cpu(lma-&amp;gt;lma_compat),
                ext2fs_le32_to_cpu(lma-&amp;gt;lma_incompat));
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It tries to convert the FID from big endian, but the LMA actually stores the FID as little endian.&lt;/p&gt;</comment>
                            <comment id="71085" author="di.wang" created="Fri, 8 Nov 2013 03:22:32 +0000"  >&lt;p&gt;Chris: Do you still have a PPC client (2.1) attached to this server? Could you create a directory from the PPC client and then use debugfs to see whether the FID in the LMA is correct?&lt;/p&gt;</comment>
                            <comment id="71086" author="di.wang" created="Fri, 8 Nov 2013 03:24:05 +0000"  >&lt;p&gt;Oh, my debugfs version is 1.42.3wc3. I created ticket &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4228&quot; title=&quot;debugfs swab the FID in LMA as big endian&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4228&quot;&gt;&lt;del&gt;LU-4228&lt;/del&gt;&lt;/a&gt; to track this problem.&lt;/p&gt;</comment>
                            <comment id="71087" author="morrone" created="Fri, 8 Nov 2013 03:24:24 +0000"  >&lt;p&gt;No.  We have not had PPC clients attached to this filesystem.&lt;/p&gt;</comment>
                            <comment id="71088" author="yong.fan" created="Fri, 8 Nov 2013 03:29:05 +0000"  >&lt;p&gt;It seems that the &quot;oid&quot; in the FID is correct, but the &quot;seq&quot; in the FID is in the wrong byte order. It is quite possible that the FID in the LMA, in the dir-ent entry, and in the OI mapping are all consistent with one another, but with the wrong &quot;seq&quot; order. The issue should have been there since Lustre-2.1, but because the Lustre-2.1 MDS did not verify the FLDB, it worked before.&lt;/p&gt;

&lt;p&gt;On the other hand, NOT all the FIDs have the wrong &quot;seq&quot; order. The FIDs &lt;span class=&quot;error&quot;&gt;&amp;#91;0x2015be3d6:0xb2a3:0x0&amp;#93;&lt;/span&gt; and &lt;span class=&quot;error&quot;&gt;&amp;#91;0x1e8837c:0xc4256171:0x0&amp;#93;&lt;/span&gt; are valid. I suspect that only some special client, such as a PPC client, ever generated an invalid FID and sent it to the Lustre-2.1 MDT at create time (such as for &quot;/ROOT/dkp&quot;). So if possible, we can downgrade the MDS to Lustre-2.1 and test whether that is the case.&lt;/p&gt;</comment>
                            <comment id="71089" author="di.wang" created="Fri, 8 Nov 2013 03:39:09 +0000"  >&lt;p&gt;During the upgrade (2.1 to 2.4), OI scrub is unlikely to touch the FID (either in the LMA or in the name entry). But the FID (in the LMA) of &quot;dkp&quot; is clearly wrong. (Note: the FID in the name entry should be wrong too, as we can see from the debug message.)&lt;/p&gt;

&lt;p&gt;So it is probably not a problem with 2.4 or with upgrading. It is more likely a bug that already exists in 2.1, and we suspect it is related to the PPC client. As Fan Yong said, we do not do the FLD lookup (check) in 2.1, so it &quot;works&quot; on 2.1.&lt;/p&gt;</comment>
                            <comment id="71090" author="di.wang" created="Fri, 8 Nov 2013 03:46:42 +0000"  >&lt;p&gt;One possible solution here might be the following (there are probably better ones):&lt;br/&gt;
1. Apply patch &lt;a href=&quot;http://review.whamcloud.com/8213&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8213&lt;/a&gt; to the MDT, then rebuild and remount the system.&lt;br/&gt;
2. Find all files/directories with the wrong sequence, and &quot;migrate&quot; them to a new dir/file.&lt;br/&gt;
    By migrate I mean mkdir (touch), cp, then remove the old ones.&lt;br/&gt;
3. Unmount the MDT server and remove the patch for 8213.&lt;/p&gt;

&lt;p&gt;But we still need to find out why these &quot;big endian&quot; FIDs were stored in the LMA or name entry. Is it related to the PPC client? Does it still exist on 2.1? Andreas, do you recall any related bugs?&lt;/p&gt;</comment>
                            <comment id="71091" author="morrone" created="Fri, 8 Nov 2013 03:47:13 +0000"  >&lt;p&gt;And the MDT just happily wrote the bogus sequence to disk...that sounds like at least two bugs.&lt;/p&gt;</comment>
                            <comment id="71092" author="morrone" created="Fri, 8 Nov 2013 03:47:47 +0000"  >&lt;p&gt;I have logs if you still want them.  Where can I upload large files these days?  Is ftp.whamcloud.com still the place?&lt;/p&gt;</comment>
                            <comment id="71093" author="di.wang" created="Fri, 8 Nov 2013 03:49:30 +0000"  >&lt;p&gt;According to the timestamps of dkp:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; ctime: 0x4f31ae1d:00000000 -- Tue Feb  7 15:05:01 2012
 atime: 0x524106c8:00000000 -- Mon Sep 23 20:28:08 2013
 mtime: 0x4f31ae1d:00000000 -- Tue Feb  7 15:05:01 2012
crtime: 0x4f31ae1d:e22ed204 -- Tue Feb  7 15:05:01 2012
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Which Lustre version was running in Feb 2012?&lt;/p&gt;</comment>
                            <comment id="71094" author="di.wang" created="Fri, 8 Nov 2013 03:57:26 +0000"  >&lt;p&gt;Clearly, b2_1 did not do a good job here. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; It seems to me ftp.whamcloud.com still works. At least I can access it.&lt;/p&gt;</comment>
                            <comment id="71095" author="morrone" created="Fri, 8 Nov 2013 03:59:03 +0000"  >&lt;p&gt;In Dec 2012 it was running 2.1.2-3chaos.  So either that or something earlier.&lt;/p&gt;</comment>
                            <comment id="71096" author="morrone" created="Fri, 8 Nov 2013 04:00:32 +0000"  >&lt;p&gt;I just uploaded &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4226&quot; title=&quot;MDS unable to locate swabbed FID SEQ in FLDB&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4226&quot;&gt;&lt;del&gt;LU-4226&lt;/del&gt;&lt;/a&gt;_client_log2.txt and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4226&quot; title=&quot;MDS unable to locate swabbed FID SEQ in FLDB&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4226&quot;&gt;&lt;del&gt;LU-4226&lt;/del&gt;&lt;/a&gt;_server_log2.txt.bz2 to ftp.whamcloud.com.&lt;/p&gt;</comment>
                            <comment id="71103" author="di.wang" created="Fri, 8 Nov 2013 05:55:04 +0000"  >&lt;p&gt;Chris: Thanks for the debug log. It is clear that the FID (of dkp) in the LMA and in the name entry is wrong:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LU-4226_server_log2.txt:00000020:00000040:8.0:1383880710.013496:0:7320:0:(lustre_handles.c:114:class_handle_hash()) added object ffff88051775fc00 with handle 0x57ab4f35e04c8dd2 to hash
LU-4226_server_log2.txt:00010000:00000040:8.0:1383880710.013497:0:7320:0:(ldlm_resource.c:1423:ldlm_resource_dump()) --- Resource: ffff8805ece16d00 (31850497/4211191005/0/2365253) (rc: 1)
LU-4226_server_log2.txt:80000000:00020000:8.0:1383880710.013533:0:7320:0:(fld_handler.c:169:fld_server_lookup()) srv-lsd-MDT0000: Cannot find sequence 0x607000002000000: rc = -5
LU-4226_server_log2.txt:00000004:00020000:8.0:1383880710.046488:0:7320:0:(osd_handler.c:2125:osd_fld_lookup()) lsd-MDT0000-osd: cannot find FLD range for [0x607000002000000:0x94f:0x0]: rc = -5
LU-4226_server_log2.txt:00000004:00020000:8.0:1383880710.086690:0:7320:0:(osd_handler.c:3317:osd_remote_fid()) lsd-MDT0000-osd: Can not lookup fld for [0x607000002000000:0x94f:0x0]
LU-4226_server_log2.txt:00000004:00000040:8.0:1383880710.119477:0:7320:0:(mdt_handler.c:2384:mdt_object_find()) Find object for [0x607000002000000:0x94f:0x0]
LU-4226_server_log2.txt:00000004:00000040:8.0:1383880710.119485:0:7320:0:(mdt_handler.c:5018:mdt_object_init()) object init, fid = [0x607000002000000:0x94f:0x0]
LU-4226_server_log2.txt:80000000:00020000:8.0:1383880710.119493:0:7320:0:(fld_handler.c:169:fld_server_lookup()) srv-lsd-MDT0000: Cannot find sequence 0x607000002000000: rc = -5
LU-4226_server_log2.txt:00000004:00000040:8.0:1383880710.119500:0:7320:0:(mdt_handler.c:5038:mdt_object_free()) object free, fid = [0x607000002000000:0x94f:0x0]
LU-4226_server_log2.txt:00010000:00000040:8.0:1383880710.119505:0:7320:0:(ldlm_lock.c:888:ldlm_lock_decref_internal()) forcing cancel of local lock
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It seems the OID is correct, but sequence 0x607000002000000 is wrong (likely wrongly swapped). So the error might have happened during the seq allocation process; otherwise both the OID and the sequence would be wrong, since they are always swapped at the same time except during seq allocation. I checked the 2.1.2 code and did not find anything wrong there. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; If you can tell me the exact Lustre version you used in Feb 2012, it will be very helpful. Also, please tell me whether there are any other special patches you might have added in that version.&lt;/p&gt;
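
&lt;p&gt;The byte-order theory can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the helper name swab is made up here, and the values are the ones quoted in this ticket (the sequence and OID from the server log, and the FID that debugfs printed earlier).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
def swab(value, width):
    """Reverse the byte order of an unsigned integer that is `width` bytes wide."""
    return int.from_bytes(value.to_bytes(width, "little"), "big")


# Values quoted in this ticket.
seq_in_log = 0x607000002000000   # sequence the MDS could not find in the FLDB
oid_in_log = 0x94f               # OID from the same server log lines
seq_from_debugfs = 0x200000706   # sequence as printed by debugfs
oid_from_debugfs = 0x4f090000    # OID as printed by debugfs

# debugfs swabbed both fields, so undoing that swab recovers the logged values.
assert swab(seq_from_debugfs, 8) == seq_in_log
assert swab(oid_from_debugfs, 4) == oid_in_log

# A 64-bit swab of the logged sequence, and the untouched OID.
print(hex(swab(seq_in_log, 8)), hex(oid_in_log))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A 64-bit swab of 0x607000002000000 gives 0x200000706, which has the same form as the valid sequence 0x2015be3d6 seen earlier, while the OID 0x94f already looks sane without any swab. That is consistent with only the sequence having been stored in the wrong byte order.&lt;/p&gt;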

&lt;p&gt;I will keep digging into the history here. Thanks.&lt;/p&gt;



</comment>
                            <comment id="71114" author="adilger" created="Fri, 8 Nov 2013 11:37:15 +0000"  >&lt;p&gt;I upgraded my home system from 2.1.3 to 2.4.1 (RHEL6 with RPMs from the b2_4 &quot;last_successful_build&quot; yum repo on build.whamcloud.com to include Fan Yong&apos;s scrub fix from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3934&quot; title=&quot;Directories gone missing after 2.4 update&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3934&quot;&gt;&lt;del&gt;LU-3934&lt;/del&gt;&lt;/a&gt;).  The filesystem was originally formatted with Lustre 1.4 (not sure which version) and has been upgraded through 1.6, 1.8, and 2.1.  It performed an automatic initial scrub on the MDS mount.  I enabled &quot;dirdata&quot; and performed another manual scrub without problems.  I&apos;ve verified that creating new files creates entries with dirdata and that old files are accessible (both IGIF and normal FIDs), and have not seen any sign of problems.  I also verified that the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4228&quot; title=&quot;debugfs swab the FID in LMA as big endian&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4228&quot;&gt;&lt;del&gt;LU-4228&lt;/del&gt;&lt;/a&gt; fix for debugfs works properly to print FIDs in the LMA and &quot;ls -lD&quot; to print them from dirdata.&lt;/p&gt;

&lt;p&gt;It would be useful to know if these files with the crazy FIDs are still being created, or if they have stopped being created when that happened.  If there are a limited number of them, then it may be possible to &quot;lfs_migrate&quot; them to new files with Di&apos;s patch applied on the MDS.  Alternately, it may also be possible to mount the filesystem as ldiskfs and delete the trusted.lma and trusted.link xattrs from the file (essentially turning the file into a 1.8 upgrade object with IGIF) and then re-run LFSCK on it to clean up the dirdata entries and re-add the files into the OI.  I haven&apos;t tested that yet, so I&apos;m not 100% sure of what effect it will have.  Clients would probably need to be remounted, or at a minimum have their cache flushed.&lt;/p&gt;</comment>
                            <comment id="71141" author="adilger" created="Fri, 8 Nov 2013 17:57:32 +0000"  >&lt;p&gt;Chris, have you considered downgrading to 2.1 again to get the system back up and usable?&lt;/p&gt;</comment>
                            <comment id="71143" author="di.wang" created="Fri, 8 Nov 2013 18:20:34 +0000"  >&lt;p&gt;I checked the code since 2.1.0, and did not find anything unusual which can trigger the problem. Unfortunately, I do not have any big endian machine to try this on 2.1.0 to see whether the problem is still there. If the system is still working on 2.1, and if you can find some PPC clients attached to it, that might help us understand the issue. Thanks.&lt;/p&gt;</comment>
                            <comment id="71145" author="morrone" created="Fri, 8 Nov 2013 18:21:56 +0000"  >&lt;p&gt;I think I like the idea to remove the trusted.lma and trusted.link xattrs the best.  Walking the filesystem through the direct ldiskfs mount won&apos;t be too time consuming.&lt;/p&gt;

&lt;p&gt;I&apos;ll give that a try on some files in a test filesystem.&lt;/p&gt;</comment>
                            <comment id="71146" author="morrone" created="Fri, 8 Nov 2013 18:22:57 +0000"  >&lt;p&gt;I fear the downgrade option.  Upgrades don&apos;t work well, and those are at least somewhat tested.  Downgrades aren&apos;t even tested.&lt;/p&gt;</comment>
                            <comment id="71147" author="pjones" created="Fri, 8 Nov 2013 18:28:57 +0000"  >&lt;p&gt;Chris&lt;/p&gt;

&lt;p&gt;We do test downgrades, but certainly nothing as complex as this situation&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="71164" author="adilger" created="Fri, 8 Nov 2013 20:26:14 +0000"  >&lt;p&gt;Chris, Di&apos;s patch in &lt;a href=&quot;http://review.whamcloud.com/8213&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8213&lt;/a&gt; may also be a quick workaround.  It weakens the added &quot;FID sanity check&quot; that was introduced in 2.4 so that invalid FIDs are assumed to be on MDT0000.  They will still fail if the object doesn&apos;t exist, but it might allow the bad files to become accessible again, and would not involve any permanent change to the filesystem, like deleting the LMA xattr does.&lt;/p&gt;

&lt;p&gt;It would also make sense to do a &quot;dd&quot; backup of the MDT if possible.  That should be relatively fast even if the target is just a single 4TB SATA drive (est. 5.5h for a 100MB/s source/target, extrapolating a 2TB MDT from the ~1B inodes that LFSCK scanned), and could be done with the filesystem live if necessary (a backup that needs an e2fsck is better than none).&lt;/p&gt;</comment>
                            <comment id="71186" author="nedbass" created="Sat, 9 Nov 2013 00:38:17 +0000"  >&lt;p&gt;Andreas, we may try your method of deleting trusted.lma and trusted.link xattrs.  One thing I&apos;m unsure about is the semantics of the lfsck_start command with regard to the supported repair types.  Can we simultaneously repair the OI and namespace (i.e. FID-in-Dirent and LinkEA)?  That is, should we run&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl lfsck_start -M &amp;lt;mdt&amp;gt;
lctl lfsck_start -M &amp;lt;mdt&amp;gt; -t namespace
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;in immediate succession, or should we wait until the OI scrub completes before starting the namespace repair, or can both types be started with one command?&lt;/p&gt;</comment>
                            <comment id="71206" author="yong.fan" created="Sun, 10 Nov 2013 00:46:09 +0000"  >&lt;p&gt;&quot;lctl lfsck_start -M &amp;lt;MDT&amp;gt;&quot; will only trigger the OI_scrub check/repair in the background. The command will return immediately, but how long the check/repair lasts depends on how many files are on the MDT.&lt;/p&gt;

&lt;p&gt;&quot;lctl lfsck_start -M &amp;lt;MDT&amp;gt; -t namespace&quot; will trigger the OI_scrub and namespace check/repair simultaneously. The two components run in parallel in the background. As above, the command will return immediately, and the scanning time depends on the file count on the MDT.&lt;/p&gt;

&lt;p&gt;Some things to clarify:&lt;br/&gt;
1) The FID-in-LMA is trusted by OI_scrub. If you drop the FID-in-LMA, then OI_scrub will regard the file as an old file and generate an IGIF-mode FID, which is different from the original dropped FID, and insert the IGIF&amp;lt;=&amp;gt;ino# mapping into the OI table.&lt;/p&gt;

&lt;p&gt;2) Namespace scanning also trusts the FID-in-LMA. If the FID-in-LMA does not exist, it will append the IGIF-mode FID after the name entry in the directory block, and also use the IGIF for the linkEA.&lt;/p&gt;

&lt;p&gt;3) Because the files&apos; FIDs are changed on the server side, you have to re-mount the clients to purge all related cached but stale information.&lt;/p&gt;
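
&lt;p&gt;To illustrate points 1) and 2), here is a small Python sketch of what an IGIF-mode FID looks like. It assumes the usual Lustre convention that an IGIF FID uses the inode number as the sequence and the inode generation as the OID, and that IGIF sequences lie between 12 and 0xffffffff. Those constants and all helper names are assumptions made for illustration, not something taken from this ticket, so check them against the Lustre headers of the release in use.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
# Assumed constants (not from this ticket); verify against the Lustre headers.
FID_SEQ_IGIF = 12              # lowest sequence treated as an IGIF
FID_SEQ_IGIF_MAX = 0xFFFFFFFF  # IGIF sequences fit in 32 bits


def parse_fid(text):
    """Parse '[0xSEQ:0xOID:0xVER]' into a (seq, oid, ver) tuple of ints."""
    seq, oid, ver = (int(part, 16) for part in text.strip("[]").split(":"))
    return seq, oid, ver


def format_fid(fid):
    """Format a (seq, oid, ver) tuple the way FIDs are printed in this ticket."""
    seq, oid, ver = fid
    return f"[{seq:#x}:{oid:#x}:{ver:#x}]"


def igif_build(ino, generation):
    """Build an IGIF-mode FID: inode number as sequence, generation as OID."""
    return (ino, generation, 0)


def is_igif(fid):
    """Return True if the sequence falls inside the assumed IGIF range."""
    seq = fid[0]
    return seq >= FID_SEQ_IGIF and FID_SEQ_IGIF_MAX >= seq


# The bad FID from the server log is far outside the IGIF range.
bad = parse_fid("[0x607000002000000:0x94f:0x0]")
print(format_fid(bad), is_igif(bad))

# Hypothetical inode number and generation, purely for illustration.
rebuilt = igif_build(0x4d2, 0x162e)
print(format_fid(rebuilt), is_igif(rebuilt))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The point of the sketch is only that the regenerated FID is derived from the inode itself, so it will not match the FID the clients have cached, which is why the re-mount in 3) is needed.&lt;/p&gt;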

&lt;p&gt;To be safe, it is better to first find a test system to verify whether it works as we expect.&lt;/p&gt;</comment>
                            <comment id="71598" author="adilger" created="Thu, 14 Nov 2013 23:37:12 +0000"  >&lt;p&gt;This is a simple script to check a filesystem for strange looking fids.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;usage: checkfid.sh [-v] /path/to/lustre
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It will print tons of errors on the filesystem discussed in this bug, but it would be useful to run it on other filesystems (preferably before upgrade to 2.4) to see if they suffer from the same problems.&lt;/p&gt;</comment>
                            <comment id="71602" author="nedbass" created="Fri, 15 Nov 2013 01:19:13 +0000"  >&lt;p&gt;Thanks Andreas, that&apos;s quite helpful (though I don&apos;t think this does what you intended):&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;	[[ -n &lt;span class=&quot;code-quote&quot;&gt;&quot;$LAST&quot;&lt;/span&gt; &amp;amp;&amp;amp; &lt;span class=&quot;code-quote&quot;&gt;&quot;$F&quot;&lt;/span&gt; == &lt;span class=&quot;code-quote&quot;&gt;&quot;$LAST&quot;&lt;/span&gt; ]] &amp;amp;&amp;amp; LAST=&lt;span class=&quot;code-quote&quot;&gt;&quot;&quot; &amp;amp;&amp;amp; echo &quot;&lt;/span&gt;found&quot; ||
		&lt;span class=&quot;code-keyword&quot;&gt;continue&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;Incidentally, we carried out the &quot;remove trusted.{lma,link} from bad files + lfsck&quot; recovery procedure, and it worked pretty much as expected.&lt;/p&gt;</comment>
                            <comment id="71611" author="adilger" created="Fri, 15 Nov 2013 09:19:40 +0000"  >&lt;p&gt;Updated version of checkfid.sh program.  The &quot;restart&quot; mechanism was added at the last minute and looked like it was working, but wasn&apos;t.&lt;/p&gt;</comment>
                            <comment id="71757" author="adilger" created="Mon, 18 Nov 2013 09:22:46 +0000"  >&lt;p&gt;Any chance to run the checkfid.sh script on any of your other filesystems?&lt;/p&gt;</comment>
                            <comment id="72269" author="morrone" created="Mon, 25 Nov 2013 22:16:09 +0000"  >&lt;p&gt;Yes, it was run (maybe still running) on four of our ldiskfs systems on the SCF.  Of the four, only one had bad fids, and that filesystem was the one that BG/P used exclusively.  That filesystem has in excess of 1 million files/directories with bad fids.&lt;/p&gt;

&lt;p&gt;So that would appear to be another strong correlation pointing to ppc clients and lack of checking on the servers.&lt;/p&gt;</comment>
                            <comment id="72923" author="nedbass" created="Thu, 5 Dec 2013 21:13:46 +0000"  >&lt;p&gt;Andreas, in case checkfid.sh is needed again, it needs to handle sequence numbers that compare as negative integers:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;-       [[ ${SFID[1]} -ge $MAXFID ]] &amp;amp;&amp;amp; echo &quot;$F: bad SEQ $FFID&quot; &amp;amp;&amp;amp; continue
+       if [[ ${SFID[1]} -ge $MAXFID || ${SFID[1]} -lt 0 ]] ; then
+               echo &quot;$F: bad SEQ $FFID&quot;
+               continue
+       fi
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="81682" author="di.wang" created="Tue, 15 Apr 2014 21:55:39 +0000"  >&lt;p&gt;Ned, Chris: Could you please tell me if OI_scrub fixed these bad FIDs? Is there anything else I should do for this ticket? Thanks.&lt;/p&gt;</comment>
                            <comment id="81683" author="di.wang" created="Tue, 15 Apr 2014 21:58:11 +0000"  >&lt;p&gt;Btw: we will do more FID validation on the server side in &lt;a href=&quot;https://jira.hpdd.intel.com/browse/LU-4232&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jira.hpdd.intel.com/browse/LU-4232&lt;/a&gt;. Ah, already attached that ticket to the sub-tasks.&lt;/p&gt;</comment>
                            <comment id="82017" author="adilger" created="Sat, 19 Apr 2014 09:14:29 +0000"  >&lt;p&gt;Di, I don&apos;t think there was any way for LFSCK to fix the bad FIDs directly. My understanding is that the LMA xattr was removed from the inodes, and then LFSCK treated this as an upgraded 1.8 filesystem with IGIF FIDs and recreated the LMA.  &lt;/p&gt;</comment>
                            <comment id="82076" author="morrone" created="Mon, 21 Apr 2014 18:00:10 +0000"  >&lt;p&gt;The problem was handled as Andreas explained.  If the servers now have code to prevent this problem in the first place, then the ticket is complete.&lt;/p&gt;</comment>
                            <comment id="161853" author="simmonsja" created="Sun, 14 Aug 2016 17:24:45 +0000"  >&lt;p&gt;time to close this out.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="25650">LU-5369</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21941">LU-4232</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21942">LU-4233</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="13834" name="checkfid.sh" size="949" author="adilger" created="Fri, 15 Nov 2013 09:19:40 +0000"/>
                            <attachment id="13805" name="client_log.txt" size="717575" author="morrone" created="Fri, 8 Nov 2013 02:56:13 +0000"/>
                            <attachment id="13806" name="server_log.txt.bz2" size="231" author="morrone" created="Fri, 8 Nov 2013 02:56:13 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 18 Jul 2014 19:18:36 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw8br:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11501</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10020"><![CDATA[1]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 7 Nov 2013 19:18:36 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>