[LU-8071] lvcreate --snapshot of MDT hangs in ldiskfs_journal_start_sb Created: 26/Apr/16 Updated: 14/Jun/18 Resolved: 02/Jun/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Nathan Dauchy (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS-6.7 |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Similar to LU-7616 "creation of LVM snapshot on ldiskfs based MDT hangs until MDT activity/use is halted", but opening a new case for tracking. The goal is to use LVM snapshots and tar to make file level MDT backups. Procedure worked fine 2 or 3 times, then we triggered the following problem on a recent attempt. The MDS became extremely sluggish, and all MDT threads went into D state, when running the following command:

lvcreate -l95%FREE -s -p r -n mdt_snap /dev/nbp9-vg/mdt9

(the command never returned, and any further lv* commands hung as well)

In the logs...

Apr 25 17:09:35 nbp9-mds kernel: WARNING: at /usr/src/redhat/BUILD/lustre-2.5.3/ldiskfs/super.c:280 ldiskfs_journal_start_sb+0xce/0xe0 [ldiskfs]() (Not tainted)
Apr 25 17:14:45 nbp9-mds ]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa0e5c4e5>] ? mds_readpage_handle+0x15/0x20 [mdt]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa08a90c5>] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa05d18d5>] ? lc_watchdog_touch+0x65/0x170 [libcfs]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa08a1a69>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa08ab89d>] ? ptlrpc_main+0xafd/0x1780 [ptlrpc]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffff8100c28a>] ? child_rip+0xa/0x20
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffffa08aada0>] ? ptlrpc_main+0x0/0x1780 [ptlrpc]
Apr 25 17:14:45 nbp9-mds kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Apr 25 17:14:45 nbp9-mds kernel: ---[ end trace c9f3339c0e103edf ]---
Apr 25 17:14:57 nbp9-mds kernel: WARNING: at /usr/src/redhat/BUILD/lustre-2.5.3/ldiskfs/super.c:280 ldiskfs_journal_start_sb+0xce/0xe0 [ldiskfs]() (Tainted: G W --------------- )
Apr 25 17:14:57 nbp9-mds kernel: Hardware name: AltixXE270
Apr 25 17:14:57 nbp9-mds kernel: Modules linked in: dm_snapshot dm_bufio osp(U) mdd(U) lfsck(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) ldiskfs(U) jbd2 lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic crc32c_intel libcfs(U) sunrpc bonding ib_ucm(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) configfs ib_ipoib(U) ib_cm(U) ib_uverbs(U) ib_umad(U) dm_round_robin scsi_dh_rdac dm_multipath microcode iTCO_wdt iTCO_vendor_support i2c_i801 lpc_ich mfd_core shpchp sg igb dca ptp pps_core tcp_bic ext3 jbd sd_mod crc_t10dif sr_mod cdrom ahci pata_acpi ata_generic pata_jmicron mptfc scsi_transport_fc scsi_tgt mptsas mptscsih mptbase scsi_transport_sas mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) ipv6 mlx4_core(U) mlx_compat(U) memtrack(U) usb_storage radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod gru [last unloaded: scsi_wait_scan]
Apr 25 17:14:57 nbp9-mds kernel: Pid: 85906, comm: mdt_rdpg00_042 Tainted: G W --------------- 2.6.32-504.30.3.el6.20151008.x86_64.lustre253 #1
Apr 25 17:14:57 nbp9-mds kernel: Call Trace:
Apr 25 17:14:57 nbp9-mds kernel: [<ffffffff81074127>] ? warn_slowpath_common+0x87/0xc0
Apr 25 17:14:57 nbp9-mds kernel: [<ffffffff8107417a>] ? warn_slowpath_null+0x1a/0x20
Apr 25 17:14:57 nbp9-mds kernel: [<ffffffffa0a1c33e>] ? ldiskfs_journal_start_sb+0xce/0xe0 [ldiskfs]
Apr 25 17:14:57 nbp9-mds kernel: [<ffffffffa0d6069f>] ? osd_trans_start+0x1df/0x660 [osd_ldiskfs]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0ef3619>] ? lod_trans_start+0x1b9/0x250 [lod]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0f7af07>] ? mdd_trans_start+0x17/0x20 [mdd]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0f61ece>] ? mdd_close+0x6be/0xb80 [mdd]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0e48be9>] ? mdt_mfd_close+0x4a9/0x1bc0 [mdt]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0899525>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa08c07f6>] ? __req_capsule_get+0x166/0x710 [ptlrpc]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa089a53e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa06ebf05>] ? class_handle2object+0x95/0x190 [obdclass]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0e4b6a2>] ? mdt_close+0x642/0xa80 [mdt]
Apr 25 17:15:06 nbp9-mds kernel: [<ffffffffa0e1fada>] ? mdt_handle_common+0x52a/0x1470 [mdt]
Apr 25 17:15:10 nbp9-mds multipathd: nbp9_MGS_MDS: sdc - rdac checker reports path is down
Apr 25 17:15:10 nbp9-mds multipathd: checker failed path 8:32 in map nbp9_MGS_MDS
Apr 25 17:15:10 nbp9-mds multipathd: nbp9_MGS_MDS: remaining active paths: 1
Apr 25 17:15:10 nbp9-mds multipathd: sdd: remove path (uevent)
Apr 25 17:15:10 nbp9-mds multipathd: nbp9_MGS_MDS: failed in domap for removal of path sdd
Apr 25 17:15:10 nbp9-mds multipathd: uevent trigger error
Apr 25 17:15:10 nbp9-mds multipathd: sdc: remove path (uevent)
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa0e5c4e5>] ? mds_readpage_handle+0x15/0x20 [mdt]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa08a90c5>] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa05d18d5>] ? lc_watchdog_touch+0x65/0x170 [libcfs]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa08a1a69>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa08ab89d>] ? ptlrpc_main+0xafd/0x1780 [ptlrpc]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffff8100c28a>] ? child_rip+0xa/0x20
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffffa08aada0>] ? ptlrpc_main+0x0/0x1780 [ptlrpc]
Apr 25 17:15:10 nbp9-mds kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Apr 25 17:15:16 nbp9-mds multipathd: nbp9_MGS_MDS: map in use
Apr 25 17:15:16 nbp9-mds multipathd: nbp9_MGS_MDS: can't flush

The server had to be rebooted and e2fsck run to get it back into production. |
| Comments |
| Comment by Nathan Dauchy (Inactive) [ 26/Apr/16 ] |
|
Not sure how closely tied this is, but... after the reboot we ran e2fsck on the MDT. The fsck was able to replay the journal, but during the scan it found lots of unattached inodes and moved them to /lost+found. Interestingly, most of those inodes were almost sequential:

Unattached inode 584092565 Connect to /lost+found<y>? yes
Unattached inode 584092567 Connect to /lost+found<y>? yes
Unattached inode 584092637 Connect to /lost+found<y>? yes
Unattached inode 584092640 Connect to /lost+found<y>? yes
Unattached inode 584092643 Connect to /lost+found<y>? yes
Unattached inode 584092645 Connect to /lost+found<y>? yes
Unattached inode 584092648 Connect to /lost+found<y>? yes
Unattached inode 584092715 Connect to /lost+found<y>? yes
Unattached inode 584092717 Connect to /lost+found<y>? yes
Unattached inode 584092720 Connect to /lost+found<y>? yes
Unattached inode 584092723 Connect to /lost+found<y>? yes
Unattached inode 584092784

Mounted as ldiskfs and saw that all but about 20 files were owned by root and had surprisingly old date stamps:

nbp9-mds ~ # ls -l /mnt/lustre/nbp9-mdt/lost+found/
total 6816
-rw------- 1 root root 5792 Jul 26 1970 #17831719
-rw------- 1 root root 5792 Jul 26 1970 #17831723
-rw------- 1 root root 5792 Feb 4 1988 #571005637
-rw------- 1 root root 5792 Feb 4 1988 #571016437
-rw------- 1 root root 5792 Feb 4 1988 #571029133
...
-rw------- 1 root root 4832 Jul 5 1988 #584109225
-rw------- 1 root root 4832 Jul 5 1988 #584109227
-rw------- 1 root root 4832 Jul 5 1988 #584109231
-rw------- 1 root root 4832 Jul 5 1988 #584109233
-rw------- 1 root root 4832 Jul 5 1988 #584109235
nbp9-mds ~ # ls -l /mnt/lustre/nbp9-mdt/lost+found | grep -v " root "
total 47328
-rwxr----- 1 dmenemen g26209 0 Feb 10 2014 #570436464
-rw-r--r-- 1 dmenemen g26209 0 Feb 10 2014 #570436465
-rwxr----- 1 dmenemen g26209 0 Feb 10 2014 #570436466
-rw-r--r-- 1 dmenemen g26209 0 Feb 10 2014 #570436467
-rw-r--r-- 1 dmenemen g26209 0 Feb 10 2014 #570436468
-rw-r--r-- 1 dmenemen g26209 0 Feb 10 2014 #570436469
...
(another dozen or so more similar files)
Ran "ll_recover_lost_found_objs" on the MDT mounted as ldiskfs, and it reported results like: Object /mnt/lustre/nbp9-mdt/O/22d0c13d/d23/cb410817 restored. Object /mnt/lustre/nbp9-mdt/O/22d0c13f/d27/cb41083b restored. Object /mnt/lustre/nbp9-mdt/O/22d0c145/d1/cb504661 restored. Object /mnt/lustre/nbp9-mdt/O/22d0c147/d27/cb50467b restored. Object /mnt/lustre/nbp9-mdt/O/22d0c153/d16/cb504810 restored. Object /mnt/lustre/nbp9-mdt/O/22d0c155/d20/cb504814 restored. (It got all the non-root owned files) I believe that the rest of the files in lost+found were probably either useless or were never pointing to user data... they have no xattrs: # getfattr -d -m ".*" -e hex /mnt/lustre/nbp9-mdt/lost+found/*
(returned nothing!) So, we ran another e2fsck (which came back clean!) and put the filesystem back into production. |
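For reference, a minimal sketch of the kind of invocation used above, assuming the MDT block device named in this ticket and that your ll_recover_lost_found_objs accepts the documented -d <lost+found directory> option:

mount -t ldiskfs /dev/nbp9-vg/mdt9 /mnt/lustre/nbp9-mdt            # MDT mounted as ldiskfs while Lustre is stopped
ll_recover_lost_found_objs -v -d /mnt/lustre/nbp9-mdt/lost+found   # moves recognizable objects back under /O/...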
| Comment by Nathan Dauchy (Inactive) [ 26/Apr/16 ] |
|
Uploading backtrace and dmesg from the MDS. |
| Comment by Mahmoud Hanafi [ 28/Apr/16 ] |
|
The recovery of the MDT has resulted in 25776 files losing their stripe info. We need help to recover the stripe info! |
| Comment by Peter Jones [ 28/Apr/16 ] |
|
Fan Yong is looking into this. |
| Comment by nasf (Inactive) [ 28/Apr/16 ] |
|
Mahmoud, usually, for a file with a small LOV EA (four stripes or so), the LOV EA is stored inside the inode space (block). So if the inode is there, the LOV EA should be there as well, unless the inode itself is corrupted. But since e2fsck completed cleanly, that does not seem to be the case, so the lost LOV EAs were probably stored in separate blocks. Do you have some idea of the stripe counts of the files that lost their stripe info? Have you ever enabled large EA support for thousands of stripes?

On the other hand, the files in /lost+found look abnormal. The time "Jul 26 1970" is near the epoch, which is impossible for normal Lustre objects, and the time "Feb 4 1988" is not a reasonable Lustre timestamp either. So I think that e2fsck scanned some non-initialized area; the inode bitmap may have been corrupted.
As for recovering objects from /lost+found via "ll_recover_lost_found_objs": it is meant for OST-objects, so I am surprised that it recovered something on your MDT! That makes me suspect your system corruption even more. Back to your direct question: how to recover the lost stripe info. Our new layout LFSCK can recover a lost (MDT-object's) LOV EA from the orphan OST-object(s); that functionality has been available since Lustre 2.6. Your system appears to be Lustre 2.5 based, so unless you upgrade, you would have to recover it manually, for example:

1) Get the "temperature_energy.dat" FID via "lfs path2fid temperature_energy.dat"

The above process is very inefficient and error-prone. So unless you are confident in such manual operations and have the time, I would suggest upgrading your system and running the layout LFSCK for recovery. |
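Not part of the original procedure, just a sketch of how step 1 could be combined with a per-OST object dump (the dump path is a placeholder; the parent-FID column format matches the e2scan-style output shown later in this ticket):

fid=$(lfs path2fid temperature_energy.dat)   # e.g. [0x20001d596:0x97:0x0]
# OST objects record the parent FID as [seq:oid:stripe-index], so match on the
# seq:oid prefix only when searching each OST's dump:
grep -F "${fid%:*}" /tmp/ostNN_object_dump.out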
| Comment by Mahmoud Hanafi [ 28/Apr/16 ] |
|
I have a modified e2scan that will dump OST objects and look up the PFID for each; we use it to quickly identify users that are filling up an OST. So if I can run that and get a list of all OST objects, what would be the next step to put things back together? We could also upgrade to 2.7.1. How fast does the lfsck run? |
| Comment by nasf (Inactive) [ 28/Apr/16 ] |
|
If you can get the OST-objects list, then I can make a patch to re-generate the MDT-objects' LOV EA from that list. As for "how fast the LFSCK runs", it depends on your system environment. I have tested the layout LFSCK on the OpenSFS cluster; for the single-MDT case, the routine scan (without repairing) speed can be up to 100K objects/sec, and the repair speed is about 30K objects/sec. |
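For watching progress once a scan is started, something like the following should work on the MDS (a sketch; lfsck_layout is the layout-LFSCK status file as I understand it in 2.6+, so adjust the target name to your setup):

lctl get_param -n mdd.nbp9-MDT0000.lfsck_layout   # shows status, checked/repaired counts and average speed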
| Comment by Mahmoud Hanafi [ 28/Apr/16 ] |
|
Since we need to install a patch, we may want to just upgrade. I'll update the case tomorrow as to what we decide to do. |
| Comment by nasf (Inactive) [ 28/Apr/16 ] |
|
Sorry, I did not say that clearly. I mean that if you can get the OST-objects list, then I will make a tool to re-generate the stripe info from your OST-objects list. You do not need to patch your current system. |
| Comment by Nathan Dauchy (Inactive) [ 28/Apr/16 ] |
|
Prior to the hang, when I had (what I thought was) a successful LVM snapshot, I happened to be looking for files with dates out of the expected range:

nohup find /mnt/snap/ROOT/ -mtime +10000 -o -mtime -0 > /tmp/nbp9_big_time_t.out 2>&1 < /dev/null &

Looking at those files from a client, the results are like:

pfe20 /u/ndauchy/tmp # xargs -a nbp9_big_time_t.txt -d '\n' ls -ldtr
-rwxr-xr-- 1 sviken a1426 0 Dec 31 1969 /nobackupp9/sviken/rev3mod3/grid3b/sally_grid_info/?c}x;?<+?"?:8??.??K?????????f?$Z??????????O?????c^+?c?????Dol??:H?N5?#??;G??Z???Vq?s??VX?i??????O?L?
-rwxr-xr-- 1 sviken a1426 0 Dec 31 1969 /nobackupp9/sviken/rev3mod3/grid3b/sally_grid_info/??#???a??O????\?a?????QU?4???W?~??I?k???????w[?~???????"??K?}???????e??a??{???g?|?|?/??k????1??.?lXD
-rw-r--r-- 1 yfan2 s0969 7663656 Feb 21 2018 /nobackupp9/yfan2/mfe_f90_mpi3d/polysph_3dfulltherm/run.15.3/eng_1
drwx------ 18 yfan2 s0969 94208 Feb 21 2018 /nobackupp9/yfan2/mfe_f90_mpi3d/polysph_3dfulltherm
-rwxr-xr-- 1 sviken a1540 1703069 Jun 20 2018 /nobackupp9/sviken/rev1wing/40deg/old_wrong_BL/m096a4b0p3TA/flow.o3964429
-rwxr-xr-- 1 sviken a1540 14925662688 Jun 20 2018 /nobackupp9/sviken/rev1wing/40deg/old_wrong_BL/m096a4b0p3TA/gaw-215-flap40.flow
-rwxr-xr-- 1 sviken a1540 14055821 Jun 20 2018 /nobackupp9/sviken/rev1wing/40deg/old_wrong_BL/m096a4b0p3TA/gaw-215-flap40.grid_info
-rwxr-xr-- 1 sviken a1540 10454535 Jun 20 2018 /nobackupp9/sviken/rev1wing/40deg/old_wrong_BL/m096a4b0p3TA/source_grid_00000.p3d
...
-rw-r--r-- 1 tsandstr vistech 234 Oct 5 2280 /nobackupp9/chenze/data/rogers/dta_cfd/met120_dta_ET128/m3.913a5.129b0.080/SRBcut_1.map
-rw-r--r-- 1 tsandstr vistech 1424 Oct 5 2280 /nobackupp9/chenze/data/rogers/dta_cfd/met120_dta_ET128/m3.913a5.129b0.080/SRBcut.fvx
-rw-r--r-- 1 tsandstr vistech 394 Oct 10 2280 /nobackupp9/chenze/data/rogers/dta_cfd/met120_dta_ET128/m3.913a5.129b0.080/sslv_top.vct
-rw-r--r-- 1 tsandstr vistech 208629828 Feb 28 2281 /nobackupp9/chenze/data/rogers/dta_cfd/met100_dta_ET128/m3.370a2.558b-0.041/q.surf
Of course, I don't know whether these are A) from the same event, B) from taking the previous LVM snapshot, or C) have been that way for a long time and are completely unrelated. I can upload the full file with all (590) such files if needed. |
| Comment by Mahmoud Hanafi [ 29/Apr/16 ] |
|
We have upgraded to 2.7.1 and started a dryrun lfsck. This is the command we used:

lctl lfsck_start --dryrun -M nbp9-MDT0000 -A

Is that correct? |
| Comment by nasf (Inactive) [ 29/Apr/16 ] |
|
The list of files in nbp9_big_time_t.txt looks abnormal, even though you said they were from a previously successful LVM backup. For further verification, you can dump their LMA EA ("trusted.lma") and linkEA ("trusted.link").

As for the layout LFSCK, if you want to run it in "dryrun" mode, please try:

lctl lfsck_start -M nbp9-MDT0000 -t layout -o --dryrun |
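A possible way to dump those EAs, assuming the MDT (or an LVM snapshot of it) is mounted as ldiskfs; the mount point and file path are examples only, and the client-visible namespace sits under ROOT/ on the MDT:

mount -t ldiskfs -o ro /dev/nbp9-vg/mdt9 /mnt/mdt-ldiskfs
getfattr -n trusted.lma  -e hex "/mnt/mdt-ldiskfs/ROOT/<path of a suspect file>"
getfattr -n trusted.link -e hex "/mnt/mdt-ldiskfs/ROOT/<path of a suspect file>"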
| Comment by Mahmoud Hanafi [ 29/Apr/16 ] |
|
Doesn't -A do both layout and orphan repair? |
| Comment by nasf (Inactive) [ 29/Apr/16 ] |
|
"-A" is enabled automatically if "-o" specified. |
| Comment by Nathan Dauchy (Inactive) [ 29/Apr/16 ] |
|
The lfsck completed last night after the upgrade to 2.7.1. Initial indications are that there was a lot of improvement: out of the initial list of 24150 problem files, only 125 remain with no stripe info. They appear to mostly be grouped into a handful of directories (or their subdirs) for a small set of users:

# sed 's/ has no .*//' /nobackupnfs2/mhanafi/nbp9_ost_scan/nbp9.no_stripe_files.out.after_lfsck | xargs -d '\n' ls -l | awk '{print $3}' | sort | uniq -c
ls: cannot access /nobackupp9/jchang/newfile: No such file or directory
65 bduda
45 bmauer
1 cbrehm1
1 dhixon
11 mchoudha
1 nfeather
# sed 's/ has no .*//' /nobackupnfs2/mhanafi/nbp9_ost_scan/nbp9.no_stripe_files.out.after_lfsck | xargs -d '\n' -L 1 dirname | xargs -L 1 dirname | sort | uniq -c
1 /nobackupp9
65 /nobackupp9/bduda
1 /nobackupp9/cbrehm1/OpenRotor/6848rpm_ibio_with_Plate_newRun
1 /nobackupp9/dhixon
45 /nobackupp9/gmao_SIteam/ObservingSystem/reanalysis/aqua_disc/hdf
6 /nobackupp9/mchoudha/powerflow
1 /nobackupp9/mchoudha/powerflow/fine
4 /nobackupp9/mchoudha/powerflow/medium_aoa0_copy
1 /nobackupp9/nfeather/marti/Builds/Scaling/Dynamo
It also looks like the recovered files still lost their ACL information. Is it expected that lfsck is unable to recover those attributes? |
| Comment by nasf (Inactive) [ 29/Apr/16 ] |
|
There are several possible cases for the remaining 125 files with no stripe info. 1) The file was never written after it was created, so the OST-object has no parent FID information. In my experience the first case is the most likely. The layout LFSCK can recover file owner information, but it cannot recover ACLs. Sorry for the inconvenience. |
| Comment by Nathan Dauchy (Inactive) [ 29/Apr/16 ] |
|
There are indeed about 464350 files in the .lustre/lost+found/MDT0000/ directory. The vast majority of them are owned by root, size zero, and date stamped around when the lfsck was run... so I'm assuming those are useless. In going through the remaining ~969 "interesting" ones, several have IO errors:

pfe27 /nobackupp9/.lustre/lost+found/MDT0000 # file [0x2000a1e14:0x10e4b:0x0]-R-0
[0x2000a1e14:0x10e4b:0x0]-R-0: ERROR: cannot read `' (Input/output error)
pfe27 /nobackupp9/.lustre/lost+found/MDT0000 # ls -l [0x2000a1e14:0x10e4b:0x0]-R-0
-r-------- 1 dlrodrig g28100 20971520 Apr 28 20:31 [0x2000a1e14:0x10e4b:0x0]-R-0
pfe27 /nobackupp9/.lustre/lost+found/MDT0000 # lfs getstripe !$
lfs getstripe [0x2000a1e14:0x10e4b:0x0]-R-0
[0x2000a1e14:0x10e4b:0x0]-R-0
lmm_stripe_count:   4
lmm_stripe_size:    1048576
lmm_pattern:        40000001
lmm_layout_gen:     0
lmm_stripe_offset:  0
        obdidx           objid          objid            group
             0               0              0                0
             0               0              0                0
             0               0              0                0
            23         7103687       0x6c64c7                0

Would I be correct in assuming that those are non-recoverable? |
| Comment by nasf (Inactive) [ 29/Apr/16 ] |
|
For the files under .lustre/lost+found/MDT0000/, handle them as follows:

1) If the file is accessible (no IO error) and its size is NOT zero, then it contains some valid data. But since we do not know its original position in the namespace, you need to move it back into the namespace according to your memory or some new rule.

2) If the file is accessible and its size is zero, then it is safe to remove it directly; it contains no valid data, although removing it may affect the related inode quota a bit.

3) If the file is inaccessible with an IO error, it means only part of the stripes were found and there is a 'hole' in the re-generated LOV EA. You can dump the readable part of the data from such a file via "dd conv=sync,noerror ...". If that partial data is still valuable, save it in a new file; otherwise, you have to discard it. |
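For case 3), a minimal sketch of the dd salvage described above, using one of the IO-error files mentioned earlier in this ticket as an example; the destination path is a placeholder. conv=noerror keeps reading past failed stripes and conv=sync pads the unreadable blocks with zeros so the surviving data keeps its offsets:

cd /nobackupp9/.lustre/lost+found/MDT0000
dd if='[0x2000a1e14:0x10e4b:0x0]-R-0' of=/some/recovery/area/partial_copy bs=1M conv=sync,noerror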
| Comment by Nathan Dauchy (Inactive) [ 02/May/16 ] |
|
Note: we do eventually still want to get to the bottom of the snapshot journal call trace and hang issue. But first, another possible corruption scenario... A file used for hourly I/O tests grew from 960M before the lfsck to 57G afterward. It appears the extra size is mostly zeros, embedded in the middle of the file. # stat /nobackupp9/dtalcott/nagtest.toobig File: `/nobackupp9/dtalcott/nagtest.toobig' Size: 60150513664 Blocks: 3899408 IO Block: 4194304 regular file Device: 162deh/90846d Inode: 144117204932100247 Links: 1 Access: (0444/-r--r--r--) Uid: (10491/dtalcott) Gid: ( 1179/ cstaff) Access: 2016-04-29 12:36:38.000000000 -0700 Modify: 2016-04-28 14:06:34.000000000 -0700 Change: 2016-04-29 11:55:56.000000000 -0700 Birth: - # stat /nobackupp9/dtalcott/nagtest File: `/nobackupp9/dtalcott/nagtest' Size: 1006632960 Blocks: 1966080 IO Block: 4194304 regular file Device: 162deh/90846d Inode: 144134705246109726 Links: 1 Access: (0644/-rw-r--r--) Uid: (10491/dtalcott) Gid: ( 1179/ cstaff) Access: 2016-05-02 10:06:50.000000000 -0700 Modify: 2016-05-02 10:06:56.000000000 -0700 Change: 2016-05-02 10:06:56.000000000 -0700 Birth: - (file was simply renamed, and a new one created for other testing) # lfs getstripe /nobackupp9/dtalcott/nagtest.toobig /nobackupp9/dtalcott/nagtest.toobig lmm_stripe_count: 240 lmm_stripe_size: 1048576 lmm_pattern: 1 lmm_layout_gen: 236 lmm_stripe_offset: 65 obdidx objid objid group 65 9682882 0x93bfc2 0 104 10385635 0x9e78e3 0 121 9870950 0x969e66 0 87 9941827 0x97b343 0 170 332727 0x513b7 0 13 332418 0x51282 0 6 332591 0x5132f 0 171 332503 0x512d7 0 33 332294 0x51206 0 21 332406 0x51276 0 195 332547 0x51303 0 126 332574 0x5131e 0 162 332560 0x51310 0 103 332341 0x51235 0 144 332517 0x512e5 0 196 332547 0x51303 0 184 332443 0x5129b 0 239 332564 0x51314 0 89 332596 0x51334 0 60 332234 0x511ca 0 26 332362 0x5124a 0 61 332282 0x511fa 0 134 332304 0x51210 0 187 332391 0x51267 0 209 332400 0x51270 0 213 333037 0x514ed 0 211 332515 0x512e3 0 30 332396 0x5126c 0 50 332564 0x51314 0 151 332187 0x5119b 0 32 332532 0x512f4 0 20 332721 0x513b1 0 72 332643 0x51363 0 143 332666 0x5137a 0 73 332940 0x5148c 0 92 332507 0x512db 0 10 332659 0x51373 0 109 332645 0x51365 0 166 332316 0x5121c 0 203 332306 0x51212 0 193 332351 0x5123f 0 149 332595 0x51333 0 3 332815 0x5140f 0 94 332773 0x513e5 0 18 332809 0x51409 0 183 332545 0x51301 0 80 332312 0x51218 0 212 332258 0x511e2 0 120 332442 0x5129a 0 191 332265 0x511e9 0 9 332443 0x5129b 0 108 332580 0x51324 0 122 332412 0x5127c 0 173 332369 0x51251 0 38 332405 0x51275 0 123 332523 0x512eb 0 1 332561 0x51311 0 5 332717 0x513ad 0 131 332752 0x513d0 0 158 332602 0x5133a 0 210 332550 0x51306 0 167 332499 0x512d3 0 64 332551 0x51307 0 116 332689 0x51391 0 24 332897 0x51461 0 111 332624 0x51350 0 169 332665 0x51379 0 220 332578 0x51322 0 186 332801 0x51401 0 205 332960 0x514a0 0 70 332644 0x51364 0 75 332327 0x51227 0 225 332395 0x5126b 0 133 332705 0x513a1 0 147 332449 0x512a1 0 46 332438 0x51296 0 114 332486 0x512c6 0 231 332310 0x51216 0 112 332828 0x5141c 0 52 332709 0x513a5 0 136 332738 0x513c2 0 95 332467 0x512b3 0 105 332314 0x5121a 0 172 332398 0x5126e 0 218 332224 0x511c0 0 189 332112 0x51150 0 86 332331 0x5122b 0 27 332303 0x5120f 0 161 332587 0x5132b 0 117 332643 0x51363 0 179 332506 0x512da 0 238 332437 0x51295 0 226 332206 0x511ae 0 23 332669 0x5137d 0 208 332492 0x512cc 0 228 332685 0x5138d 0 216 332582 0x51326 0 223 332566 0x51316 0 185 332800 0x51400 0 28 332903 0x51467 0 138 332639 0x5135f 0 157 332528 0x512f0 0 22 332328 0x51228 0 
235 332309 0x51215 0 113 332504 0x512d8 0 181 332486 0x512c6 0 227 332562 0x51312 0 174 332442 0x5129a 0 178 332552 0x51308 0 55 332324 0x51224 0 48 332698 0x5139a 0 148 332589 0x5132d 0 152 332613 0x51345 0 159 332651 0x5136b 0 41 332617 0x51349 0 156 332609 0x51341 0 202 332762 0x513da 0 125 332840 0x51428 0 230 332751 0x513cf 0 43 332886 0x51456 0 145 332769 0x513e1 0 229 332760 0x513d8 0 83 332440 0x51298 0 14 332357 0x51245 0 82 332559 0x5130f 0 119 332679 0x51387 0 16 332860 0x5143c 0 180 332702 0x5139e 0 200 332568 0x51318 0 175 332710 0x513a6 0 153 332796 0x513fc 0 124 332438 0x51296 0 58 332344 0x51238 0 141 332188 0x5119c 0 182 332413 0x5127d 0 91 332378 0x5125a 0 81 332441 0x51299 0 69 332581 0x51325 0 51 332391 0x51267 0 190 332423 0x51287 0 98 332557 0x5130d 0 199 332471 0x512b7 0 160 332528 0x512f0 0 84 332693 0x51395 0 88 332739 0x513c3 0 79 332455 0x512a7 0 57 332518 0x512e6 0 44 332375 0x51257 0 90 332369 0x51251 0 45 332629 0x51355 0 214 332659 0x51373 0 107 332818 0x51412 0 49 332966 0x514a6 0 37 332933 0x51485 0 99 332822 0x51416 0 110 332478 0x512be 0 130 332654 0x5136e 0 71 332394 0x5126a 0 96 332281 0x511f9 0 4 332616 0x51348 0 104 332652 0x5136c 0 15 332590 0x5132e 0 121 332463 0x512af 0 76 332578 0x51322 0 42 332582 0x51326 0 29 332589 0x5132d 0 102 332444 0x5129c 0 219 332330 0x5122a 0 129 332467 0x512b3 0 85 332737 0x513c1 0 19 332804 0x51404 0 206 332591 0x5132f 0 34 332583 0x51327 0 7 332498 0x512d2 0 176 332351 0x5123f 0 164 332292 0x51204 0 8 332694 0x51396 0 47 332683 0x5138b 0 217 332318 0x5121e 0 188 333078 0x51516 0 74 332354 0x51242 0 77 332476 0x512bc 0 198 332637 0x5135d 0 11 332451 0x512a3 0 177 333010 0x514d2 0 101 332725 0x513b5 0 163 332432 0x51290 0 62 332414 0x5127e 0 146 332412 0x5127c 0 135 332565 0x51315 0 128 332862 0x5143e 0 100 332828 0x5141c 0 168 332484 0x512c4 0 127 332413 0x5127d 0 201 332432 0x51290 0 140 332907 0x5146b 0 154 332681 0x51389 0 221 332738 0x513c2 0 54 332759 0x513d7 0 155 332498 0x512d2 0 17 332498 0x512d2 0 165 332624 0x51350 0 67 332566 0x51316 0 142 332525 0x512ed 0 194 332587 0x5132b 0 87 332708 0x513a4 0 224 332711 0x513a7 0 132 332822 0x51416 0 232 332903 0x51467 0 63 332711 0x513a7 0 233 332720 0x513b0 0 12 332866 0x51442 0 234 332943 0x5148f 0 93 332571 0x5131b 0 118 332352 0x51240 0 59 332427 0x5128b 0 65 332296 0x51208 0 197 332254 0x511de 0 35 332315 0x5121b 0 222 332360 0x51248 0 66 332418 0x51282 0 215 332613 0x51345 0 192 332397 0x5126d 0 36 332430 0x5128e 0 40 332322 0x51222 0 31 332412 0x5127c 0 25 332424 0x51288 0 204 332576 0x51320 0 106 332455 0x512a7 0 237 332340 0x51234 0 150 332376 0x51258 0 139 332451 0x512a3 0 97 332461 0x512ad 0 53 332565 0x51315 0 115 332749 0x513cd 0 78 332803 0x51403 0 2 332885 0x51455 0 39 332810 0x5140a 0 0 333030 0x514e6 0 68 332710 0x513a6 0 So far I have not been able to figure out any reasonable way to scan for other files that grew a lot and got filled with zeros. The best I can think of is a find for all files modified since 4/28 that are more than ~1GB. Then compare the alleged size (ls -l) and the blocks used (du -s), and if they are more than ~10x off it could be a problem, and look at those files individually. Ugh! Any other ideas? |
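For what it's worth, a rough sketch of the size-vs-blocks scan described above; the start date, size threshold, and 10x ratio are guesses to be tuned, and legitimately sparse files will also show up, so treat the output as a candidate list only:

find /nobackupp9 -type f -size +1G -newermt 2016-04-28 -print0 |
while IFS= read -r -d '' f; do
    # %s = apparent size in bytes, %b = allocated 512-byte blocks
    read -r size blocks <<< "$(stat --format '%s %b' "$f")"
    if [ $((blocks * 512)) -lt $((size / 10)) ]; then
        printf '%s apparent=%s allocated=%s\n' "$f" "$size" "$((blocks * 512))"
    fi
done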
| Comment by nasf (Inactive) [ 03/May/16 ] |
|
How many OSTs are in your system? Did you ever create the file nagtest.toobig with 240 stripes? If not, then it may be because some OST-object(s) contained a bad PFID EA and claimed that their parent is the file nagtest.toobig, but LFSCK could NOT know whether that is right or not. Since nothing conflicted with them, nagtest.toobig's LOV EA was enlarged to contain those OST-object(s), which is why the size increased. According to the "lfs getstripe" output, we know every OST-object's ID, and we can use such an ID to check on the related OST. For example, for the first stripe (obdidx 65, objid 9682882):

Here "65" corresponds to OST0041 (hex), and "9682882" corresponds to the sub-dir "9682882 % 32 = 2" under /O, so you can check that OST-object via:

debugfs -c -R "stat /O/0/d2/9682882" $OST0041_dev

On the other hand, the OST-objects with IDs like "332xxx" look abnormal. Usually, OST-object IDs on different OSTs are quite scattered, but in your case many are adjacent numbers. That makes me suspect that the LOV EA was over-written by some pattern data. |
| Comment by Mahmoud Hanafi [ 03/May/16 ] |
|
Is there a way to log which files lfsck is making changes to? As for /nobackupp9/dtalcott/nagtest.toobig, I am attaching an object scan of all OSTs for this file. You can see that there were 2 stripe-0 objects. All the stripes for this file are 4194304 bytes in size, but the stripe on OST-65 is 251658240 bytes. The scan line format is:

#filename:type,blockgroup,uid,gid,inode,size,type,fid,stripe,ctime,atime,mtime

and OST 65 has stripe 62. |
| Comment by nasf (Inactive) [ 04/May/16 ] |
|
By default, the repairing behaviour is recorded in the Lustre debug log under the label "D_LFSCK". But because the Lustre kernel debug log lives in RAM only, it will have been overwritten unless you dumped it periodically; I am not sure whether you have such logs.

On the other hand, the log "nagtest.toobig.stripes" shows that there are 244 OST-objects claiming to be stripes of [0x20001d596:0x97:0x0], while the "lfs getstripe" output shows that the file "nagtest.toobig" only has 240 stripes. So at least 4 OST-objects are fake. I found that 4 OST-objects with size 251658240 bytes conflict with another 4 OST-objects with size 4194304 bytes, as shown below:

A) 65.out.service162:F,2833,10491,1179,362692,251658240,lma,[0x100000000:0x93bfc2:0x0],0,fid,[0x20001d596:0x97:0x0],0,1461812807,1461812805,1461812807
B) 104.out.service169:F,240,10491,1179,30814,251658240,lma,[0x100000000:0x9e78e3:0x0],0,fid,[0x20001d596:0x97:0x1],1,1461812807,1461812805,1461812807
C) 121.out.service170:F,2375,10491,1179,304121,251658240,lma,[0x100000000:0x969e66:0x0],0,fid,[0x20001d596:0x97:0x2],2,1461812807,1461812805,1461812807
D) 87.out.service168:F,4361,10491,1179,558333,251658240,lma,[0x100000000:0x97b343:0x0],0,fid,[0x20001d596:0x97:0x3],3,1461812807,1461812805,1461812807
a) 56.out.service169:F,32528,10491,1179,4163662,4194304,lma,[0x100000000:0x512a2:0x0],0,fid,[0x20001d596:0x97:0x0],0,1461629209,1461629204,1461629209
b) 207.out.service176:F,37845,10491,1179,4844182,4194304,lma,[0x100000000:0x5139a:0x0],0,fid,[0x20001d596:0x97:0x1],1,1461629209,1461629204,1461629209
c) 137.out.service170:F,40964,10491,1179,5243393,4194304,lma,[0x100000000:0x512d2:0x0],0,fid,[0x20001d596:0x97:0x2],2,1461629209,1461629204,1461629209
d) 236.out.service173:F,35030,10491,1179,4483887,4194304,lma,[0x100000000:0x51320:0x0],0,fid,[0x20001d596:0x97:0x3],3,1461629209,1461629204,1461629209

The OST-objects A-D with size 251658240 bytes claim to be the first 4 stripes of the file "nagtest.toobig". The reason may be as follows: the file "nagtest.toobig" lost its LOV EA for some unknown reason, and either the OST-objects A-D or a-d have a corrupted PFID EA. Judging from the size attribute, the OST-objects A-D seem to be the ones with the corrupted PFID EA. During the LFSCK processing, the layout LFSCK tried to re-generate the "nagtest.toobig" LOV EA from these 244 orphan OST-objects. Unfortunately, the OST-objects A-D were found earlier than the OST-objects a-d, so the regenerated LOV EA contains the OST-objects A-D information. Then, when the orphan OST-objects a-d were handled, the layout LFSCK found that they conflicted with already-claimed slots, so it should have created some new files with names like "[0x20001d596:0x97:0x0]-[0x100000000:0x512a2:0x0]-C-0" or similar under the directory ".lustre/lost+found/MDT0000/". Please check for verification. |
| Comment by Nathan Dauchy (Inactive) [ 04/May/16 ] |
|
To follow up on the question from yesterday, which you probably already guessed... there are 240 OSTs on the corrupted file system (on 16 OSS nodes). |
| Comment by Nathan Dauchy (Inactive) [ 04/May/16 ] |
|
The closest match I could find in lost+found was:

[0x20011c02e:0x1e13:0x0]-[0x20001d596:0x97:0x0]-3-C-0
[0x20011c02e:0x69c7:0x0]-[0x20001d596:0x97:0x0]-0-C-0
[0x20011c030:0x1219c:0x0]-[0x20001d596:0x97:0x0]-2-C-0
[0x20011c030:0x13ffc:0x0]-[0x20001d596:0x97:0x0]-1-C-0

And the file ownership is the same as /nobackupp9/dtalcott/nagtest.toobig |
| Comment by Nathan Dauchy (Inactive) [ 04/May/16 ] |
|
Full listing of nonzero files in lost+found, gathered as:

cd /nobackupp9/.lustre/lost+found/MDT0000
find . -type f -a ! -size 0 | xargs ls -ld > /u/ndauchy/tmp/nbp9_diag/lostfound_nonzero.lst |
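To separate the readable files from the ones with IO errors (the cases nasf distinguishes in the next comment), a simple sketch; note it reads every non-empty file once, which may be a fair amount of IO:

cd /nobackupp9/.lustre/lost+found/MDT0000
find . -type f ! -size 0 -print0 |
while IFS= read -r -d '' f; do
    dd if="$f" of=/dev/null bs=1M >/dev/null 2>&1 || echo "IO error: $f"
done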
| Comment by nasf (Inactive) [ 04/May/16 ] |
|
There are 3 kinds of files under your Lustre lost+found directory:

1) The file name contains the infix "-C-". That means multiple OST-objects claimed the same MDT-object and the same slot in the LOV EA, so the layout LFSCK created new MDT-object(s) to hold the conflicting OST-object(s). Searching your "lostfound_nonzero.lst" file, there are 4 files with "-C-" names. As described in the former comment, they correspond to the file "nagtest.toobig" and conflict with the OST-objects A-D. If you think that the layout LFSCK made the wrong decision when re-generating the "nagtest.toobig" LOV EA, we need to make a new patch to recover it; but since there is only one file with conflicting OST-objects, you could also recover "nagtest.toobig" manually.

2) The file name contains the infix "-R-". That means the orphan OST-object knows its parent MDT-object FID, but not its position (the file name) in the namespace, so the LFSCK had to create a file with the known PFID. In such a case, if you can recognize something from the file data, you can rename the file back into the normal namespace.

3) The file name contains the infix "-O-". These are orphan MDT-objects. The backend e2fsck put them into the local /lost+found directory, which is invisible to the client. The LFSCK scans the MDT-objects under the local /lost+found directory; if an MDT-object has a valid linkEA, the LFSCK moves it back into the normal namespace, otherwise it is put under the global Lustre lost+found directory that is visible to the client. As in case 2), if you can recognize something from the file data, you can rename the file back into the normal namespace.

On the other hand, I noticed that the files with "-O-" names all have the size 5792, which is mysterious; unless e2fsck found some corrupted inodes, that is almost impossible. So you have to check these files with more human knowledge, and if you think they are invalid, remove them directly. |
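A quick way to count the three categories described above (paths are the ones from this ticket):

cd /nobackupp9/.lustre/lost+found/MDT0000
for tag in -C- -R- -O-; do
    printf '%s\t%d\n' "$tag" "$(find . -maxdepth 1 -name "*${tag}*" | wc -l)"
done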
| Comment by Nathan Dauchy (Inactive) [ 04/May/16 ] |
|
OK, I think we have a decent handle on the lost+found contents now, and the e2fsck and lfsck behaviour. It sounds like perhaps there is room for an improvement in lfsck to handle the conflicts more elegantly? What is still a significant unknown is how this corruption happened to begin with and what we can do to avoid it in the future. For now, we are no longer performing LVM snapshots of the MDT. Do you have a more complete explanation for the source of the lost striping information on the MDT? It is probably not just the LVM snapshot or e2fsck, as each has been used widely before. The server reboot while "lvcreate --snapshot" was running could have something to do with it, I suppose. Do the stack traces and ldiskfs_journal_start_sb messages provide any clues? |
| Comment by nasf (Inactive) [ 05/May/16 ] |
If two orphan OST-objects conflict with each other when rebuilding the lost LOV EA, it is NOT easy, from the LFSCK's point of view, to know which one is right, because the LFSCK cannot analyze the OST-object data. We only found the trouble here because of the abnormal file size, based on human judgement; from the LFSCK's point of view, since the file/object size is within the valid range, it cannot say that the current OST-objects A-D are invalid. So to that degree, the LFSCK needs some human assistance to make the right choice.

As for the reason the stripe information was lost, I do not know; I only have some guesses and suspicions. Rebooting the system during "lvcreate --snapshot" is the main suspect. For a large LOV EA (with tens or hundreds of stripes), the LOV EA is stored in a separate block. If something interrupts the process in the interval between inode creation and LOV EA storage, then the inode may lose its LOV EA. On the other hand, the implementation of the LVM snapshot may also be a factor. For example, there may be some internal data relocation during "lvcreate --snapshot"; if something interrupts that process, the system may be left in an internally inconsistent state. According to the stack traces in this ticket, there were users accessing the system during "lvcreate --snapshot", right? If yes, I think that would cause some copy-on-write, which may be blocked by the in-progress "lvcreate --snapshot" or handled concurrently with it, depending on the LVM implementation. Unfortunately, that is outside the Lustre scope, and I am not familiar with it. Honestly, I am not a fan of LVM snapshots because of the many criticisms about their performance. Andreas, do you have any idea about the LVM/e2fsck trouble? |
| Comment by Andreas Dilger [ 05/May/16 ] |
|
It appears that we may be missing calls in osd-ldiskfs to check if the block device is currently (being) frozen before submitting IO to the underlying storage (i.e. all modifications blocked at a barrier while in the middle of creating a snapshot). In newer kernels (3.5+, e.g. RHEL7) this is handled with sb_start_write() and sb_end_write() at the upper layers, and sb_start_intwrite() and sb_end_intwrite() for filesystem internal writes, and should be taken before i_mutex (see e.g. kernel commit 8e8ad8a57c75f3b for examples). In older kernels (e.g. RHEL6) it uses vfs_check_frozen() to see if the device is currently frozen, with level=SB_FREEZE_WRITE to block new modifications from starting, and level=SB_FREEZE_TRANS to only block internal threads if the freeze barrier is complete. |
| Comment by Nathan Dauchy (Inactive) [ 06/May/16 ] |
|
At some point in the future we plan to migrate to RHEL/CentOS-7, but right now CentOS-6.7 is the priority. It looks like there is additional code that uses vfs_check_frozen in newer ext4/super.c source, but that it is not in our 2.6.32-504.30 kernel, nor is it mentioned in the changelog for a very recent CentOS-6.7 2.6.32-573.22 kernel. Are you saying that we just need to incorporate all the existing fixes, or that a new lustre patch is needed to perform a check which is unique to ldiskfs? If it would be more expedient to use some sort of external pause/resume commands before and after the lvm snapshot is created, then we could work that into the process as well. Also, just curious if this should work for a running MDT or ldiskfs:

service322 ~ # fsfreeze -f /mnt/lustre/nbptest-mdt
fsfreeze: /mnt/lustre/nbptest-mdt: freeze failed: Operation not supported |
| Comment by Andreas Dilger [ 06/May/16 ] |
|
The fsfreeze is the same as what LVM does internally, so it wouldn't change the problem being seen. We don't plumb that ioctl through the MDT mountpoint to the underlying filesystem, which is why there is an error reported. As for fixing this problem, I suspect that some of the upstream ext4 patches need to be backported (in particular kernel commits 437f88cc031ffe7f and 6b0310fbf087ad6) but there is also likely a need to add checks into the osd-ldiskfs code to block new IO submissions when the device is freezing/frozen, since these do not have the VFS-level checks. |
| Comment by Gerrit Updater [ 09/May/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/20062 |
| Comment by Jay Lan (Inactive) [ 10/May/16 ] |
|
The last six patches in ldiskfs-2.6-rhel6.6.series of the master branch are:

rhel6.3/ext4-drop-inode-from-orphan-list-if-ext4_delete_inode-fails.patch

Only the first 2 patches have already been picked into b2_7_fe. None of the six have been picked into b2_5_fe. We are running CentOS 6.6 and it seems to me these patches are important to have as well. Some of our servers run 2.5.3 and the rest run 2.7.1. Is it safe for us to pick up those missing ldiskfs kernel patches? Please advise. |
| Comment by Andreas Dilger [ 18/May/16 ] |
|
I would recommend against using the ext4-give-warning-with-dir-htree-growing.patch as this also requires other changes to the Lustre code. The other changes are OK to use on other kernels. Also, are the following patches from the upstream kernel already applied on your systems? commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f
Author: Eric Sandeen <sandeen@sandeen.net>
AuthorDate: Sun Aug 1 17:33:29 2010 -0400
Commit: Theodore Ts'o <tytso@mit.edu>
CommitDate: Sun Aug 1 17:33:29 2010 -0400
ext4: fix freeze deadlock under IO
Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks
when freezing a filesystem which had active IO; the vfs_check_frozen
level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing
through. Duh.
Changing the test to FREEZE_TRANS should let the normal freeze
syncing get through the fs, but still block any transactions from
starting once the fs is completely frozen.
I tested this by running fsstress in the background while periodically
snapshotting the fs and running fsck on the result. I ran into
occasional deadlocks, but different ones. I think this is a
fine fix for the problem at hand, and the other deadlocky things
will need more investigation.
Reported-by: Phillip Susi <psusi@cfl.rr.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
commit 6b0310fbf087ad6e9e3b8392adca97cd77184084
Author: Eric Sandeen <sandeen@redhat.com>
AuthorDate: Sun May 16 02:00:00 2010 -0400
Commit: Theodore Ts'o <tytso@mit.edu>
CommitDate: Sun May 16 02:00:00 2010 -0400
ext4: don't return to userspace after freezing the fs with a mutex held
ext4_freeze() used jbd2_journal_lock_updates() which takes
the j_barrier mutex, and then returns to userspace. The
kernel does not like this:
================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
lvcreate/1075 is leaving the kernel with locks still held!
1 lock held by lvcreate/1075:
#0: (&journal->j_barrier){+.+...}, at: [<ffffffff811c6214>]
jbd2_journal_lock_updates+0xe1/0xf0
Use vfs_check_frozen() added to ext4_journal_start_sb() and
ext4_force_commit() instead.
Addresses-Red-Hat-Bugzilla: #568503
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
I would guess yes, since they originated from Red Hat, but just wanted to confirm. |
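One quick way to check (a sketch, assuming the distro kernel source is unpacked from the SRPM; the path pattern is only an example): both commits touch fs/ext4/super.c, and if the second fix is in, the checks use SB_FREEZE_TRANS rather than SB_FREEZE_WRITE.

grep -n 'vfs_check_frozen' /usr/src/redhat/BUILD/kernel-*/linux-*/fs/ext4/super.c
# vfs_check_frozen(sb, SB_FREEZE_TRANS) in ext4_journal_start_sb() => both fixes applied;
# SB_FREEZE_WRITE there => only 6b0310fbf; no match => neither.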
| Comment by Gerrit Updater [ 02/Jun/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20062/ |
| Comment by Peter Jones [ 02/Jun/16 ] |
|
Now landed for 2.9 and queued up for maintenance releases. Is there anything further you need on this ticket or can it be closed? |
| Comment by Jay Lan (Inactive) [ 02/Jun/16 ] |
|
Yes, this ticket can be closed. Thanks! |
| Comment by Peter Jones [ 02/Jun/16 ] |
|
Thanks Jay! |