[LU-8569] Sharded DNE directory full of files that don't exist Created: 30/Aug/16 Updated: 10/Aug/17 Resolved: 18/Jan/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl | ||
| Severity: | 3 |
| Description |
|
On our DNE testbed, one of our sharded directories seems to contain files that are all in a broken state. Currently both servers and clients are running 2.8.0_0.0.llnlpreview.40 (see the lustre-release-fe-llnl repo). We can get a directory listing, but nothing listed is actually accessible. Here is an excerpt from running ls -l:

# pwd
/p/lquake/casses1/opal-jet/simul_2
# ls -l
ls: cannot access simul_link.2243: No such file or directory
ls: cannot access simul_link.3161: No such file or directory
ls: cannot access simul_link.3129: No such file or directory
ls: cannot access simul_link.3893: No such file or directory
ls: cannot access simul_link.691: No such file or directory
ls: cannot access simul_link.3233: No such file or directory
ls: cannot access simul_link.235: No such file or directory
ls: cannot access simul_link.1653: No such file or directory
ls: cannot access simul_link.3167: No such file or directory
ls: cannot access simul_link.681: No such file or directory
ls: cannot access simul_link.835: No such file or directory
ls: cannot access simul_link.3857: No such file or directory
ls: cannot access simul_link.1591: No such file or directory
ls: cannot access simul_link.1175: No such file or directory
[cut]
-????????? ? ? ? ? ? simul_link.937
-????????? ? ? ? ? ? simul_link.94
-????????? ? ? ? ? ? simul_link.940
-????????? ? ? ? ? ? simul_link.941
-????????? ? ? ? ? ? simul_link.942
-????????? ? ? ? ? ? simul_link.943
-????????? ? ? ? ? ? simul_link.944
-????????? ? ? ? ? ? simul_link.947
[cut]

Here is the striping information:

# lfs getdirstripe .
.
lmv_stripe_count: 16
lmv_stripe_offset: 12
mdtidx FID[seq:oid:ver]
12 [0x50000996c:0x14fed:0x0]
13 [0x54000919d:0x14fed:0x0]
14 [0x58000a086:0x14fed:0x0]
15 [0x5c000996b:0x14fed:0x0]
0 [0x200006b03:0x14fed:0x0]
1 [0x3000089cc:0x14fed:0x0]
2 [0x38000996d:0x14fed:0x0]
3 [0x4c000b0df:0x14fed:0x0]
4 [0x2c000a142:0xec09:0x0]
5 [0x3c000b8b2:0xec09:0x0]
6 [0x34000a143:0xec09:0x0]
7 [0x40000a143:0xec09:0x0]
8 [0x44000a142:0xec09:0x0]
9 [0x24000a143:0xec09:0x0]
10 [0x2800091a4:0xec09:0x0]
11 [0x4800091a3:0xec09:0x0]
I ran lfsck on all services (at least those started by the "--all" option), but that did not address this situation. The problem files cannot be unlinked:

# rm simul_link.999
rm: cannot remove 'simul_link.999': No such file or directory |
| Comments |
| Comment by Andreas Dilger [ 31/Aug/16 ] |
|
Can you check "lfs getstripe" on a few of the broken files, to see if the FIDs of the OST objects are unusual? I suspect that the directory is OK, but the error is coming from the OST, which does not have the objects in the MDT file's layout. That may still indicate a problem with the MDT or OST, but it will give us a starting point. |
| Comment by Andreas Dilger [ 31/Aug/16 ] |
|
Can you please check "lfs getstripe" on a few of the broken files. It may be that the error is coming from the OST and not the directory at all. |
| Comment by Peter Jones [ 31/Aug/16 ] |
|
Assigning to Fan Yong for further investigation |
| Comment by Christopher Morrone [ 31/Aug/16 ] |
|
Here is the result of lfs getstripe for files in that directory:

# lfs getstripe simul_link.2280
error opening simul_link.2280: Bad address (14)
llapi_semantic_traverse: Failed to open 'simul_link.2280': Bad address (14)
error: getstripe failed for simul_link.2280. |
| Comment by nasf (Inactive) [ 01/Sep/16 ] |
|
Would you please collect the -1 level Lustre debug log on both the client and the MDTs when you hit the "lfs getstripe simul_link.2280" failure? Since we do NOT know on which MDT the file resides (if you know, even better), we have to collect the logs on all MDTs. Thanks! |
| Comment by Giuseppe Di Natale (Inactive) [ 19/Sep/16 ] |
|
I collected -1 level Lustre logs on the client and for each MDT. They are in the tar file 'getstripelogs.tar.gz' which I attached to this issue. The command I logged is:

lfs getstripe simul_link.898

The output of the command was:

error opening simul_link.898: Bad address (14)
llapi_semantic_traverse: Failed to open 'simul_link.898': Bad address (14)
error: getstripe failed for simul_link.898.

A grep seems to indicate that jet2 may be the log of interest, but I included all of them for completeness. Let me know if you need any other information. |
| Comment by nasf (Inactive) [ 20/Sep/16 ] |
|
The log on the client (client-getstripe.log) shows that:

00800000:00000001:3.0:1474324139.597219:0:117923:0:(lmv_intent.c:276:lmv_intent_open()) Process entered
00800000:00000040:3.0:1474324139.597221:0:117923:0:(lustre_lmv.h:170:lmv_name_to_stripe_index()) name simul_link.898 hash_type 2 idx 1
00800000:00000040:3.0:1474324139.597223:0:117923:0:(lmv_obd.c:1715:lmv_locate_target_for_name()) locate on mds 1 [0x30000cf20:0x1:0x0]
00800000:00000002:3.0:1474324139.597224:0:117923:0:(lmv_intent.c:316:lmv_intent_open()) OPEN_INTENT with fid1=[0x30000cf20:0x1:0x0], fid2=[0x0:0x0:0x0], name='simul_link.898' -> mds #1
...

This means the client sent the intent open ([0x30000cf20:0x1:0x0]/simul_link.898) RPC to mds#1.

00000004:00000001:7.0:1474324139.598512:0:38638:0:(mdt_open.c:1198:mdt_reint_open()) Process entered
00000020:00000001:7.0:1474324139.598514:0:38638:0:(lprocfs_jobstats.c:272:lprocfs_job_stats_log()) Process entered
00000020:00000001:7.0:1474324139.598517:0:38638:0:(lprocfs_jobstats.c:323:lprocfs_job_stats_log()) Process leaving (rc=0 : 0 : 0)
00000004:00000002:7.0:1474324139.598518:0:38638:0:(mdt_open.c:1226:mdt_reint_open()) I am going to open [0x30000cf20:0x1:0x0]/(simul_link.898->[0x0:0x0:0x0]) cr_flag=01 mode=0100000 msg_flag=0x0
...
00080000:00000001:7.0:1474324139.598600:0:38638:0:(osd_index.c:395:osd_dir_lookup()) Process entered
00080000:00000001:7.0:1474324139.598639:0:38638:0:(osd_index.c:415:osd_dir_lookup()) Process leaving (rc=1 : 1 : 1)
...
00000004:00000001:7.0:1474324139.599521:0:38638:0:(osp_trans.c:469:osp_remote_sync()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
00000004:00000001:7.0:1474324139.599522:0:38638:0:(osp_object.c:591:osp_attr_get()) Process leaving via out (rc=18446744073709551614 : -2 : 0xfffffffffffffffe)
...

This means mds#1 received the intent open RPC. It did the lookup first and found the name entry "simul_link.898" on this MDT, but its FID is remote, so it triggered osp_attr_get() to fetch the object's attributes when initialising the object. Unfortunately, the remote MDT returned -2 (-ENOENT) to this MDT: "simul_link.898" is a dangling name entry. That is why the subsequent operation got the -14 (-EFAULT) failure.

Currently, I do not know what caused the dangling name entry, but I would suggest running namespace LFSCK to fix the related Lustre inconsistency. To be safe, you can run namespace LFSCK without the "-C" option first; that will detect how many dangling name entries are in the system but NOT auto-repair them. Then you can check whether they need to be fixed. If you think it is necessary to re-create the related lost MDT-objects, re-run the namespace LFSCK with "-C" specified. |
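The two-pass LFSCK run suggested above can be sketched as follows. This is a sketch only: the filesystem name (lquake) is taken from later comments, the MDT device name is a placeholder for whichever target each MDS hosts, and the exact lctl lfsck_start syntax should be checked against the locally installed version before running anything:

```shell
#!/bin/sh
# Sketch of the suggested two-pass namespace LFSCK run.
# Assumptions: fsname "lquake", one MDT shown; repeat per MDT on each MDS.
FSNAME=lquake
MDT=${FSNAME}-MDT0000

# Pass 1: detect dangling name entries without repairing them (no -C).
DRYRUN="lctl lfsck_start -M ${MDT} -t namespace"
# Inspect the counters (e.g. dangling_repaired) before deciding to repair:
CHECK="lctl get_param -n mdd.${MDT}.lfsck_namespace"
# Pass 2: re-run with -C to re-create lost MDT-objects for dangling entries.
REPAIR="lctl lfsck_start -M ${MDT} -t namespace -C"

echo "$DRYRUN"
echo "$CHECK"
echo "$REPAIR"
```

The commands are assembled as strings here rather than executed, since they only make sense on a live MDS.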
| Comment by Giuseppe Di Natale (Inactive) [ 21/Sep/16 ] |
|
We came up with an easier reproducer for this issue in case you need to collect more information. Details are below.

Create a striped directory for this test, cd to that directory, and create a simple file:

echo "hello world" > afile

From there, create a script called 'linkme.sh' with the following contents:

#!/bin/bash
filename=$(hostname)_${RANDOM}
ln afile $filename

Now, using srun, we can run the script across many nodes/cores with no timeout. Example below:

srun -W 0 -N 47 -n $((47*36)) linkme.sh

The script ran for a bit, but eventually we started seeing "bad address" errors. I'll continue to try and collect more information. |
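For reference, the same workload shape can be sketched without SLURM as a plain serial loop. This only mirrors the hard-link storm; it will not reproduce the bug outside a striped DNE Lustre directory, and the temporary directory below is a stand-in for that directory:

```shell
#!/bin/bash
# Local sketch of the reproducer workload: many hard links to one file.
# Assumption: $TESTDIR stands in for the striped (DNE) Lustre directory;
# on a local filesystem this just exercises the naming pattern.
TESTDIR=$(mktemp -d)
cd "$TESTDIR" || exit 1

echo "hello world" > afile

# Each srun task does one "ln afile $(hostname)_$RANDOM"; here we make
# 100 links serially, with $i appended to guarantee unique names.
for i in $(seq 1 100); do
    ln afile "$(hostname)_${i}_${RANDOM}" || echo "ln failed: $i"
done

# afile now has 1 + 100 links.
stat -c %h afile
```

On Lustre the interesting part is that every link lands in a shard of the striped directory while the linkEA accumulates on the one target file.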
| Comment by Giuseppe Di Natale (Inactive) [ 22/Sep/16 ] |
|
Ran an lfsck namespace with -C and got the following LBUG on multiple MDTs:

2016-09-22 10:04:23 [493341.943717] LustreError: 127771:0:(lfsck_namespace.c:4452:lfsck_namespace_double_scan()) ASSERTION( list_empty(&lad->lad_req_list) ) failed:
2016-09-22 10:04:23 [493341.958848] LustreError: 127771:0:(lfsck_namespace.c:4452:lfsck_namespace_double_scan()) LBUG
2016-09-22 10:04:23 [493341.968781] Pid: 127771, comm: lfsck

Have the following call stack on two MDTs:

[493315.464373] Kernel panic - not syncing: LBUG
[493315.470430] CPU: 2 PID: 111809 Comm: lfsck Tainted: P OE ------------ 3.10.0-327.28.2.1chaos.ch6.x86_64 #1
[493315.484175] Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
[493315.497715] ffffffffa079be0f 0000000055805053 ffff882757e4fc78 ffffffff8164cae7
[493315.507701] ffff882757e4fcf8 ffffffff81645adf ffffffff00000008 ffff882757e4fd08
[493315.517684] ffff882757e4fca8 0000000055805053 ffffffffa1070e70 0000000000000246
[493315.527666] Call Trace:
[493315.532060]  [<ffffffff8164cae7>] dump_stack+0x19/0x1b
[493315.539478]  [<ffffffff81645adf>] panic+0xd8/0x1e7
[493315.546501]  [<ffffffffa077fdeb>] lbug_with_loc+0xab/0xc0 [libcfs]
[493315.555082]  [<ffffffffa102c2a6>] lfsck_namespace_double_scan+0x106/0x140 [lfsck]
[493315.565122]  [<ffffffffa10234f9>] lfsck_double_scan+0x59/0x200 [lfsck]
[493315.574086]  [<ffffffffa0d88fc4>] ? osd_zfs_otable_it_fini+0x64/0x110 [osd_zfs]
[493315.583931]  [<ffffffffa0d88fc4>] ? osd_zfs_otable_it_fini+0x64/0x110 [osd_zfs]
[493315.593765]  [<ffffffff811c8bad>] ? kfree+0x12d/0x170
[493315.601075]  [<ffffffffa1028044>] lfsck_master_engine+0x434/0x1310 [lfsck]
[493315.610415]  [<ffffffff81015588>] ? __switch_to+0xf8/0x4d0
[493315.618212]  [<ffffffff810bd4f0>] ? wake_up_state+0x20/0x20
[493315.626108]  [<ffffffffa1027c10>] ? lfsck_master_oit_engine+0x1430/0x1430 [lfsck]
[493315.636145]  [<ffffffff810a99bf>] kthread+0xcf/0xe0
[493315.642238]  [<ffffffff810a98f0>] ? kthread_create_on_node+0x140/0x140
[493315.650187]  [<ffffffff8165d9d8>] ret_from_fork+0x58/0x90
[493315.656864]  [<ffffffff810a98f0>] ? kthread_create_on_node+0x140/0x140
[493315.711916] drm_kms_helper: panic occurred, switching back to text console
[493315.720378] ------------[ cut here ]------------
[493315.726202] WARNING: at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5f/0x70()
[493315.735902] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) rpcsec_gss_krb5 ko2iblnd(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) nfsv3 iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl kvm mlx5_ib pcspkr mlx5_core sb_edac lpc_ich edac_core mfd_core mei_me mei zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) ses enclosure ipmi_devintf spl(OE) zlib_deflate sg i2c_i801 ioatdma shpchp ipmi_si ipmi_msghandler acpi_power_meter acpi_cpufreq binfmt_misc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr nfsd nfs_acl ip_tables auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache dm_round_robin sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul mgag200 crc32c_intel syscopyarea sysfillrect sysimgblt ghash_clmulni_intel i2c_algo_bit drm_kms_helper mxm_wmi ttm aesni_intel ixgbe lrw gf128mul ahci drm dca glue_helper mpt3sas libahci ptp i2c_core ablk_helper cryptd libata raid_class pps_core scsi_transport_sas mdio wmi sunrpc dm_mirror dm_region_hash dm_log scsi_transport_iscsi dm_multipath dm_mod
[493315.859970] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P OE ------------ 3.10.0-327.28.2.1chaos.ch6.x86_64 #1
[493315.872734] Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
[493315.885407] 0000000000000000 bcf7d7e5812e0014 ffff883f7e683d78 ffffffff8164cae7
[493315.894536] ffff883f7e683db0 ffffffff8107d6d0 0000000000000000 ffff883f7e6967c0
[493315.903668] 000000011d5cacb8 ffff883f7e6167c0 0000000000000002 ffff883f7e683dc0
[493315.912796] Call Trace:
[493315.916347] <IRQ>  [<ffffffff8164cae7>] dump_stack+0x19/0x1b
[493315.923621]  [<ffffffff8107d6d0>] warn_slowpath_common+0x70/0xb0
[493315.931168]  [<ffffffff8107d81a>] warn_slowpath_null+0x1a/0x20
[493315.938512]  [<ffffffff81048fdf>] native_smp_send_reschedule+0x5f/0x70
[493315.946646]  [<ffffffff810cb04d>] trigger_load_balance+0x18d/0x250
[493315.954390]  [<ffffffff810bbdd3>] scheduler_tick+0x103/0x150
[493315.961553]  [<ffffffff810e5800>] ? tick_sched_handle.isra.14+0x60/0x60
[493315.969775]  [<ffffffff81091a06>] update_process_times+0x66/0x80
[493315.977304]  [<ffffffff810e57c5>] tick_sched_handle.isra.14+0x25/0x60
[493315.985310]  [<ffffffff810e5841>] tick_sched_timer+0x41/0x70
[493315.992432]  [<ffffffff810adeda>] __hrtimer_run_queues+0xea/0x2c0
[493316.000042]  [<ffffffff810ae4e0>] hrtimer_interrupt+0xb0/0x1e0
[493316.007351]  [<ffffffff8104be47>] local_apic_timer_interrupt+0x37/0x60
[493316.015442]  [<ffffffff8166000f>] smp_apic_timer_interrupt+0x3f/0x60
[493316.023338]  [<ffffffff8165e6dd>] apic_timer_interrupt+0x6d/0x80
[493316.030848] <EOI>  [<ffffffff810dd69c>] ? ktime_get+0x4c/0xd0
[493316.038194]  [<ffffffff810b8da6>] ? finish_task_switch+0x56/0x180
[493316.045803]  [<ffffffff81651df0>] __schedule+0x2e0/0x940
[493316.052533]  [<ffffffff81653709>] schedule_preempt_disabled+0x39/0x90
[493316.060533]  [<ffffffff810db1f4>] cpu_startup_entry+0x184/0x2d0
[493316.067949]  [<ffffffff81049eea>] start_secondary+0x1ca/0x240
[493316.075162] ---[ end trace 28897805122ddeee ]---

Filesystem info:

Also worth noting, once we have a directory with files that exhibit this "bad address" error, the directory cannot be removed. Let me know if you need more info. |
| Comment by Giuseppe Di Natale (Inactive) [ 23/Sep/16 ] |
|
I'm going to attempt to bring our filesystem back up this afternoon, if you could let me know if you have everything you need, that'd be great! Thanks! |
| Comment by Peter Jones [ 23/Sep/16 ] |
|
Joe, Fan Yong is based in China so may not see this question until Sunday evening by this time of day. Peter |
| Comment by Giuseppe Di Natale (Inactive) [ 23/Sep/16 ] |
|
Ah, thanks for letting me know, Peter. We are able to reproduce it if necessary, so I think it's safe to reboot our filesystem. |
| Comment by nasf (Inactive) [ 25/Sep/16 ] |
|
I will make a patch to fix the namespace LFSCK assertion.
As for the directory: there are dangling name entries under the parent directory, and dangling name entries cannot be removed via normal unlink/rmdir commands, so the parent directory is never empty. That is why the parent directory cannot be removed in such a case. To resolve this, you have to use the namespace LFSCK with the "-C" option to fix the dangling name entries first, and then remove them. |
| Comment by Gerrit Updater [ 25/Sep/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/22723 |
| Comment by Christopher Morrone [ 26/Sep/16 ] |
I had done that, but it did not fix the problem. lfsck ran to completion and did not assert when I ran it. The assertion is a new thing. I have no idea why it crashed this time around. |
| Comment by nasf (Inactive) [ 27/Sep/16 ] |
|
I think you specified the "-C" option when you ran the namespace LFSCK to completion, right? Thanks! |
| Comment by Giuseppe Di Natale (Inactive) [ 28/Sep/16 ] |
|
I went ahead and attached a log file called "lfsck_namespace_state-9-28-2016.log" which was obtained by running the following on each MDS:

pdsh -g mds 'lctl get_param -n mdd.$(ldev -l | grep lquake-MDT).lfsck_namespace' | dshbak -c

Worth noting, when I restarted the filesystem, I had to stop the lfsck namespace check because the kernel panics would continue to occur as lfsck tried picking up where it left off. Also, we can currently create the dangling name issue at will with the reproduction steps I provided in my Sept 21, 2016 comment (the one with linkme.sh); I think that still needs to be addressed. I'm also going to break the lfsck call stack issue out to a separate ticket, since it is unclear whether or not it is related. |
| Comment by nasf (Inactive) [ 30/Sep/16 ] |
|
According to the namespace LFSCK status, some dangling name entries have been fixed:

# grep dangling lfsck_namespace_state-9-28-2016.log
33:dangling_repaired: 423
92:dangling_repaired: 442
151:dangling_repaired: 431
210:dangling_repaired: 437
269:dangling_repaired: 406
328:dangling_repaired: 437
387:dangling_repaired: 440
446:dangling_repaired: 403
505:dangling_repaired: 511
564:dangling_repaired: 434
623:dangling_repaired: 432
682:dangling_repaired: 434
741:dangling_repaired: 540
800:dangling_repaired: 429
859:dangling_repaired: 435
918:dangling_repaired: 411

But there are still some failures when trying to repair the striped directories:

# grep failed lfsck_namespace_state-9-28-2016.log | grep -v 0
5:48:striped_shards_failed: 6
75:874:striped_shards_failed: 1

Unfortunately, without the detailed LFSCK Lustre kernel debug logs, we cannot know what caused the LFSCK failure. If you can re-run the namespace LFSCK, please enable "lfsck" debug on the MDTs and collect the Lustre kernel debug logs. Since the namespace LFSCK failed to fix some inconsistencies, if you have to remove those dangling entries soon, one possible solution is to mount the backend as ZFS and remove those entries under ZFS directly. That will leave some stale OI mappings in the system, but that is almost harmless apart from some wasted space. |
| Comment by Andreas Dilger [ 30/Sep/16 ] |
|
It may also be possible to use "lfs rm" to remove dangling remote directory entries without trying to unlink the remote inode. That is intended for use in case of an MDT becoming permanently unavailable, but should also work in this case. |
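For reference, in later Lustre releases the subcommand Andreas is describing is spelled "lfs rm_entry"; that spelling, and whether it exists in the 2.8 preview deployed here, are assumptions to verify against lfs help first. A minimal sketch using the paths from this ticket:

```shell
#!/bin/sh
# Sketch: remove a dangling remote directory entry by name only,
# without touching the (missing) remote MDT inode.
# Assumption: the subcommand is "lfs rm_entry" on this Lustre version.
DIR=/p/lquake/casses1/opal-jet/simul_2   # directory from the report
ENTRY=simul_link.999                     # a known-dangling entry

CMD="lfs rm_entry ${DIR}/${ENTRY}"
echo "$CMD"
```

As with the LFSCK commands, the command is only assembled here, since it is destructive and only meaningful on a Lustre client.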
| Comment by nasf (Inactive) [ 30/Sep/16 ] |
I have tried the above script with multiple clients running in parallel, but cannot reproduce the trouble. Giuseppe, would you please reproduce the issue the way you mentioned, with "-1" level Lustre kernel debug logs collected on the MDTs? Thanks! |
| Comment by Giuseppe Di Natale (Inactive) [ 05/Oct/16 ] |
|
I can get you some logs soon. Our test system isn't happy right now. Working on getting it back up so I can reproduce this to get those logs. Stay tuned. |
| Comment by Brad Hoagland (Inactive) [ 12/Oct/16 ] |
|
Hi dinatale2, |
| Comment by Giuseppe Di Natale (Inactive) [ 13/Oct/16 ] |
|
Still having issues with it. Will attempt to reproduce this ASAP. |
| Comment by Giuseppe Di Natale (Inactive) [ 14/Oct/16 ] |
|
Logs are now attached to this incident. The file names are jet-link-logs-part[1-4].tar.gz. The part 1 gzip has errors.log in it, which has a sampling of what shows up on the console, so you can use that to track down a specific file in the logs. Let me know if you need anything else. |
| Comment by Gerrit Updater [ 20/Oct/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22723/ |
| Comment by Peter Jones [ 20/Oct/16 ] |
|
Landed for 2.9 |
| Comment by Peter Jones [ 20/Oct/16 ] |
|
Actually, perhaps I was premature to mark this as resolved. Fan Yong, what did the patch tracked under this ticket that just landed to master address? Is there still work to be tracked under this ticket? |
| Comment by Giuseppe Di Natale (Inactive) [ 21/Oct/16 ] |
|
Peter, There is still work being tracked under this ticket. The logs I posted last week are to help find a resolution to this issue. The patch that landed was for |
| Comment by Peter Jones [ 22/Oct/16 ] |
|
So |
| Comment by nasf (Inactive) [ 23/Oct/16 ] |
|
Peter, As you can see in the comment history, to make So we can close the ticket |
| Comment by Peter Jones [ 24/Oct/16 ] |
|
Got it. For future reference it is possible to make adjustments to git commit messages when landing, so it would have been possible to use the correct JIRA reference without delaying things. |
| Comment by Di Wang [ 27/Oct/16 ] |
|
Just looked at the debug log; it looks like the update log record is too long, which seems not right:

.............
0x23:47025: 200000020:00000040:9.0:1476399235.972447:0:154190:0:(update_trans.c:93:top_multiple_thandle_dump()) cookie 0x23:47025: 1
.............

There are too many log cookies (> 1k) for this transaction, and each cookie can hold 32k of update records, so I do not understand why link can generate such a big record size. Hmm, even though the linkea size might be big in your test (do we limit linkea size for ZFS?), the problem might be in
I suspect this test might reproduce the problem; sigh, I do not have a ZFS environment here:

diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh
index c61e3bc..0a3a82c 100755
--- a/lustre/tests/sanity.sh
+++ b/lustre/tests/sanity.sh
@@ -15196,6 +15196,29 @@ test_300q() {
}
run_test 300q "create remote directory under orphan directory"
+test_300r() {
+ [ $PARALLEL == "yes" ] && skip "skip parallel run" && return
+ [ $(lustre_version_code $SINGLEMDS) -lt $(version_code 2.7.55) ] &&
+ skip "Need MDS version at least 2.7.55" && return
+ [ $MDSCOUNT -lt 2 ] && skip "needs >= 2 MDTs" && return
+ local stripe_count
+ local file
+
+ mkdir $DIR/$tdir
+
+ $LFS setdirstripe -i1 -c3 $DIR/$tdir/remote_dir ||
+ error "set striped dir error"
+
+ touch $DIR/$tdir/$tfile
+ for ((i = 0; i < 50000; i++)); do
+ ln $DIR/$tdir/$tfile $DIR/$tdir/remote_dir/fffffffffffffffffffffffffffffffffffffffff-$i ||
+ error "ln remote file fails"
+ done
+
+ return 0
+}
+run_test 300r "test remote ln under striped directory"
+
prepare_remote_file() {
mkdir $DIR/$tdir/src_dir ||
error "create remote source failed"
|
| Comment by Di Wang [ 27/Oct/16 ] |
|
Just did some tests on ZFS, and it looks like the problem is that the linkEA on ZFS grows above the llog chunk size (32768), which our current update llog system cannot handle; i.e., one update operation (update op + its parameters) cannot be larger than the llog chunk size (32KB). So is it OK to limit the linkea size here? |
| Comment by Andreas Dilger [ 27/Oct/16 ] |
|
Yes, I think it is reasonable to limit linkEA size in this case. The Linux kernel xattr API is also similarly limited by the size of individual xattrs, and ldiskfs has a 4KB limit for xattrs, so the Lustre code is already expecting that not all links will be stored for a given file. |
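A rough back-of-the-envelope check of why the reproducer trips this limit: each linkEA entry costs roughly the link name length plus some header and packed-FID overhead. The ~24-byte per-entry overhead and the ~45-character name length (the run of 'f's plus "-<i>") are assumptions for illustration; the real layout is in struct link_ea_entry:

```shell
#!/bin/sh
# Rough estimate: how many hard links fit in one 32KB llog chunk before
# the linkEA update alone exceeds it. Overhead and name length are
# assumptions, not taken from the Lustre source.
CHUNK=32768      # llog chunk size quoted in the comments above
NAMELEN=45       # approx. length of the test's link names (assumption)
OVERHEAD=24      # assumed per-entry header + packed FID

FIT=$(awk -v c=$CHUNK -v n=$NAMELEN -v o=$OVERHEAD 'BEGIN { print int(c / (n + o)) }')
echo "$FIT"
```

Under these assumptions only a few hundred links fit, so the 50000-link loop in test_300r comfortably pushes the linkEA past one chunk, consistent with the failure Di Wang observed.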
| Comment by Gerrit Updater [ 01/Nov/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23500 |
| Comment by Gerrit Updater [ 14/Nov/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23741 |
| Comment by Gerrit Updater [ 01/Jan/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23500/ |
| Comment by Giuseppe Di Natale (Inactive) [ 11/Jan/17 ] |
|
Before this closes, can these patches also be ported to the 2.8 FE branch? |
| Comment by Peter Jones [ 11/Jan/17 ] |
|
Giuseppe, the ticket will be marked resolved when the patches land to master, but it will remain on the LLNL priority list until the equivalent patches have been ported and landed to the 2.8 FE branch. Peter |
| Comment by Gerrit Updater [ 18/Jan/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23741/ |
| Comment by Peter Jones [ 18/Jan/17 ] |
|
All patches landed to master for 2.10. Ports to 2.8 and 2.9 FE branches will be tracked separately. |
| Comment by Giuseppe Di Natale (Inactive) [ 19/Jan/17 ] |
|
Peter, Are there tasks created so I can keep track of the 2.8 FE port? Joe |
| Comment by Peter Jones [ 19/Jan/17 ] |
|
We'll post the links on the ticket and mark with llnlfixready when it's ready for you to pick up |
| Comment by Giuseppe Di Natale (Inactive) [ 20/Jan/17 ] |
|
Apologies Peter, I went ahead and created |