[LU-3696] sanity test_17m, test_17n: e2fsck unattached inodes failure Created: 05/Aug/13  Updated: 01/Dec/14  Resolved: 05/Jun/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.6.0
Fix Version/s: Lustre 2.6.0, Lustre 2.5.4

Type: Bug Priority: Critical
Reporter: James Nunez (Inactive) Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: mn4

Issue Links:
Related
is related to LU-4690 sanity test_4: Expect error removing ... Resolved
is related to LU-4140 Volatile files have CREATE changelog ... Resolved
is related to LU-4293 lfs_migrate is failing with a volatil... Resolved
Severity: 3
Rank (Obsolete): 9541

 Description   

I'm trying to find what test/environment/circumstances fill an OST during autotest. I ran sanity three times in a row on Toro: https://maloo.whamcloud.com/test_sessions/90d23e6c-fbe4-11e2-aaad-52540035b04c . I didn't hit the full OST problem, but I did run into sanity test 17m failures.

On the second and, not surprisingly, third run of sanity, test 17m failed with:
sanity test_17m: @@@@@@ FAIL: e2fsck should not report error upon short/long symlink MDT: rc=4

The first, successful run of test 17m has the following output:

01:55:18:stop and checking mds1: e2fsck -fnvd /dev/lvm-MDS/P1
01:55:18:CMD: client-24vm3 grep -c /mnt/mds1' ' /proc/mounts
01:55:18:Stopping /mnt/mds1 (opts:-f) on client-24vm3
01:55:18:CMD: client-24vm3 umount -d -f /mnt/mds1
01:55:18:CMD: client-24vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
01:55:18:CMD: client-24vm3 e2fsck -fnvd /dev/lvm-MDS/P1
01:55:18:client-24vm3: e2fsck 1.42.7.wc1 (12-Apr-2013)
01:55:18:Pass 1: Checking inodes, blocks, and sizes
01:55:18:Pass 2: Checking directory structure
01:55:18:Pass 3: Checking directory connectivity
01:55:18:Pass 4: Checking reference counts
01:55:18:Pass 5: Checking group summary information
01:55:18:
01:55:18:        1324 inodes used (0.13%, out of 1048576)
01:55:18:           7 non-contiguous files (0.5%)
01:55:18:           1 non-contiguous directory (0.1%)
01:55:18:             # of inodes with ind/dind/tind blocks: 2/0/0
01:55:18:      154573 blocks used (29.48%, out of 524288)
01:55:18:           0 bad blocks
01:55:18:           1 large file
01:55:18:
01:55:18:         127 regular files
01:55:18:         137 directories
01:55:18:           0 character device files
01:55:18:           0 block device files
01:55:18:           0 fifos
01:55:18:           0 links
01:55:18:        1051 symbolic links (526 fast symbolic links)
01:55:18:           0 sockets
01:55:18:------------
01:55:18:        1315 files

This second run of test 17m has the following output:

== sanity test 17m: run e2fsck against MDT which contains short/long symlink == 04:23:23 (1375442603)
CMD: client-24vm3 /usr/sbin/lctl get_param -n version
CMD: client-24vm3 /usr/sbin/lctl get_param -n version
create 512 short and long symlink files under /mnt/lustre/d0.sanity/d17m
erase them
Waiting for local destroys to complete
recreate the 512 symlink files with a shorter string
stop and checking mds1: e2fsck -fnvd /dev/lvm-MDS/P1
CMD: client-24vm3 grep -c /mnt/mds1' ' /proc/mounts
Stopping /mnt/mds1 (opts:-f) on client-24vm3
CMD: client-24vm3 umount -d -f /mnt/mds1
CMD: client-24vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
CMD: client-24vm3 e2fsck -fnvd /dev/lvm-MDS/P1
client-24vm3: e2fsck 1.42.7.wc1 (12-Apr-2013)
client-24vm3: e2fsck_pass1:1500: increase inode 32773 badness 0 to 2
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 635
Connect to /lost+found? no

Unattached inode 636
Connect to /lost+found? no

Unattached inode 638
Connect to /lost+found? no

Unattached inode 639
Connect to /lost+found? no

Unattached inode 641
Connect to /lost+found? no

Unattached inode 645
Connect to /lost+found? no

Unattached inode 1841
Connect to /lost+found? no

Unattached inode 1842
Connect to /lost+found? no

Unattached inode 1843
Connect to /lost+found? no

Unattached inode 1844
Connect to /lost+found? no

Unattached inode 1845
Connect to /lost+found? no

Unattached inode 1846
Connect to /lost+found? no

Unattached inode 1847
Connect to /lost+found? no

Unattached inode 1848
Connect to /lost+found? no

Unattached inode 1849
Connect to /lost+found? no

Unattached inode 1850
Connect to /lost+found? no

Unattached inode 1851
Connect to /lost+found? no

Unattached inode 1852
Connect to /lost+found? no

Unattached inode 1855
Connect to /lost+found? no

Unattached inode 1894
Connect to /lost+found? no

Unattached inode 1895
Connect to /lost+found? no

Unattached inode 1896
Connect to /lost+found? no

Unattached inode 1897
Connect to /lost+found? no

Unattached inode 1898
Connect to /lost+found? no

Unattached inode 1899
Connect to /lost+found? no

Unattached inode 1900
Connect to /lost+found? no

Unattached inode 1901
Connect to /lost+found? no

Unattached inode 1902
Connect to /lost+found? no

Unattached inode 1903
Connect to /lost+found? no

Unattached inode 1904
Connect to /lost+found? no

Unattached inode 1905
Connect to /lost+found? no

Unattached inode 1908
Connect to /lost+found? no

Pass 5: Checking group summary information

lustre-MDT0000: ********** WARNING: Filesystem still has errors **********


        1396 inodes used (0.13%, out of 1048576)
          40 non-contiguous files (2.9%)
           2 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 18/2/0
      158490 blocks used (30.23%, out of 524288)
           0 bad blocks
           1 large file

         197 regular files
         139 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
        1051 symbolic links (526 fast symbolic links)
           0 sockets
------------
        1355 files

Because the same test names were run in one test session, it looks like Maloo is mixing the output of one run with another, which makes the logs a little confusing to read. The time stamps also seem to be later than when the results were reported. Hopefully, I'm just misreading the logs and time stamps.



 Comments   
Comment by James Nunez (Inactive) [ 16/Aug/13 ]

I ran sanity twice and got the same reported error https://maloo.whamcloud.com/test_sets/eb3e8820-0642-11e3-a405-52540035b04c .

From the client console, I see trouble communicating with the MDT. The following happens three times:

13:55:23:Lustre: DEBUG MARKER: == sanity test 17m: run e2fsck against MDT which contains short/long symlink == 13:55:20 (1376600120)
13:55:44:LustreError: 11-0: lustre-MDT0000-mdc-ffff88007ab08400: Communicating with 10.10.4.150@tcp, operation obd_ping failed with -107.
13:55:46:Lustre: lustre-MDT0000-mdc-ffff88007ab08400: Connection to lustre-MDT0000 (at 10.10.4.150@tcp) was lost; in progress operations using this service will wait for recovery to complete
13:55:46:LustreError: 11-0: MGC10.10.4.150@tcp: Communicating with 10.10.4.150@tcp, operation obd_ping failed with -107.
13:55:46:LustreError: 166-1: MGC10.10.4.150@tcp: Connection to MGS (at 10.10.4.150@tcp) was lost; in progress operations using this service will fail
13:55:46:Lustre: Evicted from MGS (at 10.10.4.150@tcp) after server handle changed from 0xbbcb7768cf2f2565 to 0xbbcb7768cf310459
13:55:46:Lustre: MGC10.10.4.150@tcp: Connection restored to MGS (at 10.10.4.150@tcp)
13:56:02:LustreError: 167-0: lustre-MDT0000-mdc-ffff88007ab08400: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
13:56:02:LustreError: 508:0:(mdc_locks.c:920:mdc_enqueue()) ldlm_cli_enqueue: -5
13:56:02:LustreError: 508:0:(file.c:3026:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -5
13:56:02:LustreError: 509:0:(file.c:171:ll_close_inode_openhandle()) inode 144115205255725102 mdc close failed: rc = -108
13:56:02:LustreError: 508:0:(mdc_locks.c:920:mdc_enqueue()) ldlm_cli_enqueue: -108
13:56:02:LustreError: 508:0:(file.c:3026:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -108
13:56:02:Lustre: lustre-MDT0000-mdc-ffff88007ab08400: Connection restored to lustre-MDT0000 (at 10.10.4.150@tcp)
13:56:03:Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 2>/dev/null || true
13:56:03:Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 17n: run e2fsck against master\/slave MDT which contains remote dir == 13:55:49 \(1376600149\)
14:55:34:Lustre: DEBUG MARKER: == sanity test 17m: run e2fsck against MDT which contains short/long symlink == 14:55:32 (1376603732)
14:55:55:LustreError: 11-0: lustre-MDT0000-mdc-ffff88007da19400: Communicating with 10.10.4.150@tcp, operation obd_ping failed with -107.

From the MDT console, I see a few different errors:

13:55:23:Lustre: DEBUG MARKER: == sanity test 17m: run e2fsck against MDT which contains short/long symlink == 13:55:20 (1376600120)
13:55:23:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version
13:55:23:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version
13:55:34:Lustre: DEBUG MARKER: lctl get_param -n osc.*MDT*.sync_*
13:55:34:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
13:55:34:Lustre: DEBUG MARKER: umount -d -f /mnt/mds1
13:55:35:LustreError: 2745:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88005be45400 x1443469761447100/t0(0) o13->lustre-OST0002-osc-MDT0000@10.10.4.151@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
13:55:35:LustreError: 2745:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff880037e00800 x1443469761447108/t0(0) o13->lustre-OST0004-osc-MDT0000@10.10.4.151@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
13:55:35:Lustre: lustre-MDT0000: Not available for connect from 10.10.4.152@tcp (stopping)
13:55:35:Lustre: lustre-MDT0000: Not available for connect from 10.10.4.151@tcp (stopping)
13:55:35:LustreError: 2746:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff880063ccb000 x1443469761447124/t0(0) o13->lustre-OST0006-osc-MDT0000@10.10.4.151@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
13:55:35:LustreError: 2746:0:(client.c:1048:ptlrpc_import_delay_req()) Skipped 3 previous similar messages
13:55:46:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.10.4.152@tcp (no target)
13:55:47:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.10.4.153@tcp (no target)
13:55:47:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.10.4.152@tcp (no target)
13:55:47:Lustre: 9801:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1376600136/real 1376600136]  req@ffff880064c01400 x1443469761447144/t0(0) o251->MGC10.10.4.150@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1376600142 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
13:55:47:Lustre: server umount lustre-MDT0000 complete
13:55:47:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
13:55:47:Lustre: DEBUG MARKER: e2fsck -fnvd /dev/lvm-MDS/P1
13:55:47:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
13:55:48:Lustre: DEBUG MARKER: test -b /dev/lvm-MDS/P1
13:55:48:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl  		                   /dev/lvm-MDS/P1 /mnt/mds1
13:55:48:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
13:55:48:Lustre: lustre-MDT0000: used disk, loading
13:55:52:LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.

The OSTs are also having problems communicating with the MDT:

13:55:27:Lustre: DEBUG MARKER: == sanity test 17m: run e2fsck against MDT which contains short/long symlink == 13:55:20 (1376600120)
13:55:38:LustreError: 11-0: lustre-MDT0000-lwp-OST0001: Communicating with 10.10.4.150@tcp, operation obd_ping failed with -107.
13:55:38:Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.10.4.150@tcp) was lost; in progress operations using this service will wait for recovery to complete
13:55:38:LustreError: 11-0: lustre-MDT0000-lwp-OST0003: Communicating with 10.10.4.150@tcp, operation obd_ping failed with -107.
13:55:39:Lustre: lustre-MDT0000-lwp-OST0003: Connection to lustre-MDT0000 (at 10.10.4.150@tcp) was lost; in progress operations using this service will wait for recovery to complete
13:55:52:Lustre: lustre-OST0000: deleting orphan objects from 0x0:11 to 0x0:33
13:55:52:Lustre: lustre-OST0001: deleting orphan objects from 0x0:11 to 0x0:33
13:55:52:Lustre: lustre-OST0002: deleting orphan objects from 0x0:11 to 0x0:33
13:55:52:Lustre: lustre-OST0003: deleting orphan objects from 0x0:12 to 0x0:33
13:55:52:Lustre: lustre-OST0004: deleting orphan objects from 0x0:12 to 0x0:33
13:55:53:Lustre: lustre-OST0005: deleting orphan objects from 0x0:12 to 0x0:33
13:55:53:Lustre: lustre-OST0006: deleting orphan objects from 0x0:12 to 0x0:33
13:55:53:Lustre: 4969:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1376600138/real 1376600138]  req@ffff88006a3f3000 x1443469771933360/t0(0) o400->MGC10.10.4.150@tcp@10.10.4.150@tcp:26/25 lens 224/224 e 0 to 1 dl 1376600145 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
13:55:53:LustreError: 166-1: MGC10.10.4.150@tcp: Connection to MGS (at 10.10.4.150@tcp) was lost; in progress operations using this service will fail
13:55:55:Lustre: Evicted from MGS (at 10.10.4.150@tcp) after server handle changed from 0xbbcb7768cf2f22c5 to 0xbbcb7768cf31047c
13:55:55:Lustre: MGC10.10.4.150@tcp: Connection restored to MGS (at 10.10.4.150@tcp)
Comment by Sarah Liu [ 26/Oct/13 ]

I also hit this error when running the rolling upgrade test; after all of the servers and clients were upgraded from 2.4.1 to 2.5, I ran sanity and hit this:

https://maloo.whamcloud.com/test_sets/2a999af6-3e0a-11e3-83f5-52540035b04c
test log shows:

== sanity test 17m: run e2fsck against MDT which contains short/long symlink == 22:14:48 (1382764488)
create 512 short and long symlink files under /mnt/lustre/d0.sanity/d17m
erase them
Waiting for local destroys to complete
recreate the 512 symlink files with a shorter string
stop and checking mds1: e2fsck -fnvd /dev/sdb1
Stopping /mnt/mds1 (opts:-f) on fat-amd-1
fat-amd-1: e2fsck 1.42.7.wc1 (12-Apr-2013)
fat-amd-1: e2fsck_pass1:1500: increase inode 204 badness 0 to 2
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 632
Connect to /lost+found? no

Unattached inode 633
Connect to /lost+found? no

Unattached inode 644
Connect to /lost+found? no

Unattached inode 645
Connect to /lost+found? no

Unattached inode 652
Connect to /lost+found? no

Unattached inode 653
Connect to /lost+found? no

Unattached inode 1689
Connect to /lost+found? no

Unattached inode 1690
Connect to /lost+found? no

Unattached inode 1691
Connect to /lost+found? no

Unattached inode 1692
Connect to /lost+found? no

Unattached inode 1693
Connect to /lost+found? no

Unattached inode 1694
Connect to /lost+found? no

Unattached inode 1695
Connect to /lost+found? no

Unattached inode 1696
Connect to /lost+found? no

Unattached inode 1697
Connect to /lost+found? no

Unattached inode 1698
Connect to /lost+found? no

Unattached inode 1699
Connect to /lost+found? no

Unattached inode 1700
Connect to /lost+found? no

Unattached inode 1701
Connect to /lost+found? no

Unattached inode 1702
Connect to /lost+found? no

Unattached inode 1703
Connect to /lost+found? no

Unattached inode 1704
Connect to /lost+found? no

Unattached inode 1705
Connect to /lost+found? no

Unattached inode 1709
Connect to /lost+found? no

Unattached inode 1710
Connect to /lost+found? no

Unattached inode 1711
Connect to /lost+found? no

Unattached inode 1712
Connect to /lost+found? no

Unattached inode 1713
Connect to /lost+found? no

Unattached inode 1714
Connect to /lost+found? no

Unattached inode 1715
Connect to /lost+found? no

Unattached inode 1716
Connect to /lost+found? no

Unattached inode 1717
Connect to /lost+found? no

Unattached inode 1718
Connect to /lost+found? no

Unattached inode 1719
Connect to /lost+found? no

Unattached inode 1720
Connect to /lost+found? no

Unattached inode 1721
Connect to /lost+found? no

Unattached inode 1722
Connect to /lost+found? no

Unattached inode 1723
Connect to /lost+found? no

Unattached inode 1732
Connect to /lost+found? no

Unattached inode 1733
Connect to /lost+found? no

Unattached inode 1734
Connect to /lost+found? no

Unattached inode 1735
Connect to /lost+found? no

Unattached inode 1736
Connect to /lost+found? no

Unattached inode 1737
Connect to /lost+found? no

Unattached inode 1738
Connect to /lost+found? no

Unattached inode 1739
Connect to /lost+found? no

Unattached inode 1740
Connect to /lost+found? no

Unattached inode 1741
Connect to /lost+found? no

Unattached inode 1742
Connect to /lost+found? no

Unattached inode 1743
Connect to /lost+found? no

Unattached inode 1744
Connect to /lost+found? no

Unattached inode 1745
Connect to /lost+found? no

Unattached inode 1746
Connect to /lost+found? no

Unattached inode 1747
Connect to /lost+found? no

Pass 5: Checking group summary information

lustre-MDT0000: ********** WARNING: Filesystem still has errors **********


        1396 inodes used (0.14%, out of 1000184)
          46 non-contiguous files (3.3%)
           2 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 23/3/0
      152902 blocks used (30.58%, out of 500000)
           0 bad blocks
           1 large file

         197 regular files
         139 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
        1051 symbolic links (526 fast symbolic links)
           0 sockets
------------
        1333 files
Starting mds1: -o user_xattr,acl  /dev/sdb1 /mnt/mds1
Started lustre-MDT0000
 sanity test_17m: @@@@@@ FAIL: e2fsck should not report error upon  short/long symlink MDT: rc=4 
  Trace dump:
Comment by Andreas Dilger [ 28/Feb/14 ]

Hit this again on master on test_17n: https://maloo.whamcloud.com/test_sets/4c95bf56-9fce-11e3-ab91-52540035b04c

This has failed 10x in the past week on master, so I suspect some regression.

Comment by Andreas Dilger [ 04/Mar/14 ]

One option for debugging this is to do something like "do_facet $SINGLEMDS debugfs -c -R "stat <644>" $MDSDEV" to print out the FID information, and then use "lfs fid2path $FID" to figure out where the file was supposed to be linked.
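For example, a minimal sketch of that two-step lookup, reusing the MDT device name from the logs above and inode 635 from the first e2fsck report (illustrative only, not output from this session):

# debugfs -c -R "stat <635>" /dev/lvm-MDS/P1
(the FID is in the "lma: fid=[...]" line of the output)
# lfs fid2path /mnt/lustre <FID-from-lma>
(run on a client; prints the pathname if the inode is still linked in the namespace)

The comments below walk through exactly this procedure.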

Comment by James Nunez (Inactive) [ 07/Mar/14 ]

Once sanity test_4 was disabled, test_17n stopped failing. So, although 17n isn't failing anymore, there may still be issues here.

Comment by James Nunez (Inactive) [ 19/Mar/14 ]

After running the sanity suite once and then test 17m, 17m fails with "e2fsck should not report error upon short/long symlink MDT: rc=4". The output from e2fsck starts with:

# e2fsck -fnvd /dev/sda3
e2fsck 1.42.7.wc2 (07-Nov-2013)
device /dev/sda3 mounted by lustre per /proc/fs/lustre/mgs/MGS/mntdev
Warning!  /dev/sda3 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
e2fsck_pass1:1500: increase inode 50043 badness 0 to 2
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 605
Connect to /lost+found? no

Unattached inode 606
Connect to /lost+found? no

Unattached inode 607
Connect to /lost+found? no

Unattached inode 608
Connect to /lost+found? no

Unattached inode 609
Connect to /lost+found? no

Unattached inode 611
Connect to /lost+found? no

Unattached inode 2071
Connect to /lost+found? no

Unattached inode 2072
Connect to /lost+found? no

Unattached inode 2073
Connect to /lost+found? no

Following Andreas' suggestion above, I got the FIDs for the inodes that were reported by e2fsck:

# debugfs -c /dev/sda3 -R "stat <605>"
debugfs 1.42.7.wc2 (07-Nov-2013)
/dev/sda3: catastrophic mode - not reading inode or group bitmaps
Inode: 605   Type: regular    Mode:  0600   Flags: 0x0
Generation: 728859788    Version: 0x00000008:00006684
User:     0   Group:     0   Size: 0
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
 atime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
 mtime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
crtime: 0x5329d6db:4bb5e868 -- Wed Mar 19 10:41:47 2014
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  lma = "00 00 00 00 00 00 00 00 71 1b 00 00 02 00 00 00 c5 1f 00 00 00 00 00 00
 " (24)
  lma: fid=[0x200001b71:0x1fc5:0x0] compat=0 incompat=0
  lov = "d0 0b d1 0b 01 00 00 00 c5 1f 00 00 00 00 00 00 71 1b 00 00 02 00 00 00
 00 00 10 00 01 00 00 00 6b 56 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
0 00 00 00 00 00 " (56)
BLOCKS:

# debugfs -c /dev/sda3 -R "stat <606>"
debugfs 1.42.7.wc2 (07-Nov-2013)
/dev/sda3: catastrophic mode - not reading inode or group bitmaps
Inode: 606   Type: regular    Mode:  0600   Flags: 0x0
Generation: 728859789    Version: 0x00000008:00006688
User:     0   Group:     0   Size: 0
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
 atime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
 mtime: 0x5329d6db:00000000 -- Wed Mar 19 10:41:47 2014
crtime: 0x5329d6db:4e554a00 -- Wed Mar 19 10:41:47 2014
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  lma = "00 00 00 00 00 00 00 00 71 1b 00 00 02 00 00 00 c6 1f 00 00 00 00 00 00
 " (24)
  lma: fid=[0x200001b71:0x1fc6:0x0] compat=0 incompat=0
  lov = "d0 0b d1 0b 01 00 00 00 c6 1f 00 00 00 00 00 00 71 1b 00 00 02 00 00 00
 00 00 10 00 01 00 00 00 34 4e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
0 00 01 00 00 00 " (56)
BLOCKS:

...

Note that the Type is "regular". I then tried to get the file name using "lfs fid2path":

# lfs fid2path /lustre/lscratch [0x200001b71:0x1fc6:0x0]
fid2path: error on FID [0x200001b71:0x1fc6:0x0]: No such file or directory
# lfs fid2path /lustre/lscratch 0x200001b71:0x1fc5:0x0
fid2path: error on FID 0x200001b71:0x1fc6:0x0: No such file or directory

So, there is no file name available? I tried this for the first 10 inodes listed in the e2fsck output and got the same result: "No such file or directory".

Just to make sure I got the commands right, I used an inode that did not come up on the e2fsck list:

# debugfs -c /dev/sda3 -R "stat <604>"
debugfs 1.42.7.wc2 (07-Nov-2013)
/dev/sda3: catastrophic mode - not reading inode or group bitmaps
Inode: 604   Type: symlink    Mode:  0777   Flags: 0x0
Generation: 596733023    Version: 0x00000017:000009fa
User:     0   Group:     0   Size: 10
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5329ff53:00000000 -- Wed Mar 19 13:34:27 2014
 atime: 0x5329ff53:00000000 -- Wed Mar 19 13:34:27 2014
 mtime: 0x5329ff53:00000000 -- Wed Mar 19 13:34:27 2014
crtime: 0x5329ff53:b6e3205c -- Wed Mar 19 13:34:27 2014
Size of extra inode fields: 28
Extended attributes stored in inode body: 
  lma = "00 00 00 00 00 00 00 00 c0 61 00 00 02 00 00 00 f5 0f 00 00 00 00 00 00
 " (24)
  lma: fid=[0x2000061c0:0xff5:0x0] compat=0 incompat=0
  link = "df f1 ea 11 01 00 00 00 33 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
0 00 1b 00 00 00 02 00 00 61 c0 00 00 0c 09 00 00 00 00 73 68 6f 72 74 2d 32 34 
35 " (51)
Fast_link_dest: 0123456789

# lfs fid2path /lustre/lscratch [0x2000061c0:0xff5:0x0]
/lustre/lscratch/d17m.sanitym/short-245

Note that the type for the "good" inode is "symlink".

So, what does it mean that there is no path information for the Unattached inodes?

Comment by James Nunez (Inactive) [ 20/Mar/14 ]

I looked at all the files in the Lustre file system and, on the client, got the FID for each file:

# lfs path2fid /lustre/lscratch
[0x200000007:0x1:0x0]

# lfs path2fid /lustre/lscratch/d17m.sanitym
[0x2000061c0:0xc09:0x0]

# lfs path2fid /lustre/lscratch/d17m.sanitym/short-*
/lustre/lscratch/d17m.sanitym/short-0: [0x2000061c0:0xe0b:0x0]
/lustre/lscratch/d17m.sanitym/short-1: [0x2000061c0:0xe0d:0x0]
/lustre/lscratch/d17m.sanitym/short-2: [0x2000061c0:0xe0f:0x0]
…
/lustre/lscratch/d17m.sanitym/short-510: [0x2000061c0:0x1207:0x0]
/lustre/lscratch/d17m.sanitym/short-511: [0x2000061c0:0x1209:0x0]

# lfs path2fid /lustre/lscratch/d17m.sanitym/long-* 
/lustre/lscratch/d17m.sanitym/long-0: [0x2000061c0:0xe0a:0x0]
/lustre/lscratch/d17m.sanitym/long-1: [0x2000061c0:0xe0c:0x0]
/lustre/lscratch/d17m.sanitym/long-2: [0x2000061c0:0xe0e:0x0]
…
/lustre/lscratch/d17m.sanitym/long-510: [0x2000061c0:0x1206:0x0]
/lustre/lscratch/d17m.sanitym/long-511: [0x2000061c0:0x1208:0x0]

Those are all the files and directories in the file system.

So the FIDs for the Unattached inodes found by running e2fsck do not belong to files in the file system. Is this the correct interpretation? If so, maybe those inodes are left over from the first run of sanity that didn't get cleaned up?

Comment by Andreas Dilger [ 20/Mar/14 ]

If you add something like:

lctl get_param seq.*.fid

after each test in run_one() you will know which test generated the FIDs for the orphan files. Just comparing the FID sequence numbers, the test_17m sequence is (0x2000061c0 - 0x2000000400) = 24000 higher than the starting sequence, while the orphan sequence is (0x200001b71 - 0x2000000400) = 6001 higher, and by linear extrapolation this puts it 25% of the way to test 17m, which is around test_5 (or test_4 which is the likely candidate).

It is a bit confusing that the client would be getting a new sequence so often, since that should only happen every 128000 files or if the client is remounted, which I don't think should be the case? In any case, you could also try running:

ONLY=4 sh sanity.sh

on an already-mounted filesystem and then run e2fsck -f on the MDS to check if this test is causing the leak.
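A sketch of that check, reusing the MDT device and mount point names from the logs above (a different setup will use different names):

# lctl get_param seq.*.fid                  (on the client, record the current FID sequence)
# ONLY=4 sh sanity.sh
# lctl get_param seq.*.fid                  (compare against the value recorded before the test)

Then on the MDS:

# umount /mnt/mds1
# e2fsck -fn /dev/lvm-MDS/P1                (new "Unattached inode" lines would implicate test_4)
# mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1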

Comment by James Nunez (Inactive) [ 20/Mar/14 ]

I started with a fresh file system and ran sanity. After sanity completed, there were 16 unattached inodes found by e2fsck. The offending tests are 56w, 56x, 185, and 187b.

Comment by James Nunez (Inactive) [ 20/Mar/14 ]

For sanity test 56x, I've confirmed that the migration (a call to "lfs migrate") from a two-stripe file to a single-stripe file creates one unattached inode. I suspect the same is true for test 56w, since it calls lfs migrate on a file and a directory.

Migrating from a single-stripe file to a two-stripe file also creates an unattached inode.
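A minimal reproducer sketch for the migrate case, assuming a client mounted at /mnt/lustre (the file name is made up, and the exact lfs migrate options may vary between releases):

# lfs setstripe -c 2 /mnt/lustre/migrate-src        (create a two-stripe file)
# dd if=/dev/zero of=/mnt/lustre/migrate-src bs=1M count=1
# lfs migrate -c 1 /mnt/lustre/migrate-src          (restripe to a single stripe)

After stopping the MDT, a read-only "e2fsck -fn" on the MDT device should then report one additional unattached inode.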

Comment by James Nunez (Inactive) [ 20/Mar/14 ]

For sanity tests 185 and 187b, the calls to multiop with the V option, which opens a volatile file, are causing the remaining unattached inodes.

Comment by James Nunez (Inactive) [ 24/Mar/14 ]

What's common to all of these tests is that they create volatile files. This is related to or is a case of LU-4140.

Comment by James Nunez (Inactive) [ 31/Mar/14 ]

Closing as duplicate of LU-4140. We can reopen this ticket if LU-4140 does not solve this problem.

Comment by Andreas Dilger [ 22/Apr/14 ]

James, I don't think that the orphan inode problem is caused by the presence or absence of a changelog record for the volatile file. It seems to me that this problem could be fixed entirely independently of the changelog issue. I think this is particularly important because, since 2.4.0, any file migration using "lfs migrate" has been leaking space in the filesystem, and that space can only be reclaimed by an offline e2fsck of the MDT to link the files into /lost+found, followed by manually mounting the filesystem to move the files from /lost+found to e.g. /ROOT/lost+found or similar and delete them.

I suspect that there is some simple fix here like decrementing the nlink count on the volatile file after open so that it is properly cleaned up at close time. Then, the changelog issue in LU-4140 can be fixed independently.

Comment by Bruno Faccini (Inactive) [ 24/Apr/14 ]

Ok Andreas, I agree this seems appropriate.
But are you, or James, able to reproduce this with a recent master? I am asking because I just tried/checked this as part of the LU-4140 work, but I am no longer able to get any orphan inodes with migrate or release+restore operations, as I was able to before ...

Comment by James Nunez (Inactive) [ 24/Apr/14 ]

I just checked the current master and, yes, I can still trigger this problem.

Just running multiop to create a volatile file will create an unattached inode. For example, running the following command from sanity test 185 triggers the problem:

# ./multiop /lustre/scratch/vfile_test VFw4096c
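
To check for the resulting leak after a command like the one above, the MDT can be stopped and checked with a read-only e2fsck; a sketch, with the device and mount point names taken from the earlier comments (they will differ on other systems):

# umount /mnt/mds1
# e2fsck -fn /dev/sda3                              (each leaked volatile file shows up as an "Unattached inode" in pass 4)
# mount -t lustre -o user_xattr,acl /dev/sda3 /mnt/mds1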
Comment by Andreas Dilger [ 29/May/14 ]

I posted a prototype patch at http://review.whamcloud.com/10179 that is at least an attempt at fixing this. I didn't test it, so it likely needs some help before it can land.

Comment by Bruno Faccini (Inactive) [ 03/Jun/14 ]

I did some extended testing, and it appears that with this patch there is no longer an "i_am_nobody" orphan inode left for each hsm_release operation, nor a ".^L^S^T^R:VOLATILE:: ..." one for each hsm_restore operation.
I also checked that it fixes/avoids the unnecessary reporting of ChangeLog events for these volatile files, as tracked in LU-4140.

Comment by Jodi Levi (Inactive) [ 05/Jun/14 ]

Patch landed to Master.

Comment by Andreas Dilger [ 26/Sep/14 ]

Patch for b2_5: http://review.whamcloud.com/12076

Comment by Gerrit Updater [ 01/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12076/
Subject: LU-3696 mdd: decref volatile object after creation
Project: fs/lustre-release
Branch: b2_5
Current Patch Set:
Commit: f6b32c44806aa37c0d43fd4e01b4fecb7fa60c1c
