[LU-12704] racer test_1: Invalid layout: The component end must be aligned by the stripe size Created: 27/Aug/19 Updated: 23/Dec/20 Resolved: 12/Nov/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Vladimir Saveliev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
This issue was created by maloo for jianyu <yujian@whamcloud.com> This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4c310714-c728-11e9-9fc9-52540065bddc test_1 failed with the following error: layout: raid0 raid0 pfl pfl pfl dom dom dom flr flr flr Invalid layout: The component end must be aligned by the stripe size <<Please provide additional information about the failure here>> VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |
| Comment by Andreas Dilger [ 12/Sep/19 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36174 |
| Comment by Andreas Dilger [ 12/Sep/19 ] |
|
Note that on 2.10/2.12 (between commits v2_9_55_0-14-g89693927f0..v2_12_55-83-gff5eb304fa) it isn't allowed to create files with stripe_size that isn't a power-of-two value due to:
if (stripe_size == 0 ||
(prev_end != LUSTRE_EOF &&
(prev_end & (stripe_size - 1)))) {
CDEBUG(D_LAYOUT, "stripe size isn't aligned. "
" stripe_sz: %u, [%llu, %llu)\n",
stripe_size, ext->e_start, prev_end);
RETURN(-EINVAL);
}
This was fixed as part of SEL to check only that they are a multiple, not assuming a power-of-two value:
if (prev_end % stripe_size) {
CDEBUG(D_LAYOUT, "stripe size isn't aligned, "
"stripe_sz: %u, [%llu, %llu)\n",
stripe_size, ext->e_start, prev_end);
RETURN(-EINVAL);
}
so if this patch is backported to b2_12 we should include that part of the change, otherwise setstripe will still be failing with random 64KiB-multiple stripe_size. |
| Comment by James Nunez (Inactive) [ 25/Sep/19 ] |
|
We're seeing racer test 1 crash with the 'Invalid layout' messages in the client test_log and the following call trace: [50194.820231] 2[26110]: segfault at 0 ip (null) sp 00007ffd1d570c28 error 14 in 2[400000+6000] [50260.478242] 13[17503]: segfault at 8 ip 00007f5cedcb5718 sp 00007fff9f4f5d70 error 4 in ld-2.17.so[7f5cedcaa000+22000] [50261.303734] 13[17172]: segfault at 8 ip 00007fc73cbf4718 sp 00007ffffb9ec790 error 4 in ld-2.17.so[7fc73cbe9000+22000] [50280.191268] 16[9651]: segfault at 8 ip 00007f7cbdbdd718 sp 00007ffe4ab86420 error 4 in ld-2.17.so[7f7cbdbd2000+22000] [50283.020027] 19[13428]: segfault at 8 ip 00007f9ce34c0718 sp 00007ffc51e417c0 error 4 in ld-2.17.so[7f9ce34b5000+22000] [50297.598388] general protection fault: 0000 [#1] SMP [50297.599253] Modules linked in: lustre(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) nfsd nfs_acl brd loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr virtio_balloon i2c_piix4 parport_pc parport ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk ata_piix libata 8139too crct10dif_pclmul crct10dif_common crc32c_intel [50297.607808] serio_raw 8139cp virtio_pci virtio_ring mii virtio floppy [last unloaded: libcfs] [50297.608709] CPU: 1 PID: 16082 Comm: ldlm_bl_03 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.27.2.el7.x86_64 #1 [50297.609823] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [50297.610384] task: ffff95157aa35140 ti: ffff95155a928000 task.ti: ffff95155a928000 [50297.611111] RIP: 0010:[<ffffffffc082c368>] [<ffffffffc082c368>] cl_object_flush+0x48/0x120 [obdclass] [50297.612168] RSP: 0018:ffff95155a92bb20 EFLAGS: 00010292 [50297.612680] RAX: 5a5a5a5a5a5a5a5a RBX: 5a5a5a5a5a5a5a42 RCX: 0000000000000000 [50297.613366] RDX: ffff95154c051bd8 RSI: 0000000000000246 RDI: 0000000000000246 [50297.614050] RBP: ffff95155a92bb40 R08: 0000000000000000 R09: 0000000180180017 [50297.614735] R10: 0000000000000001 R11: ffffe84bc1bdc2c0 R12: ffff95156f70b430 [50297.615430] R13: ffff951549f8ac68 R14: ffff95155f7ae6c0 R15: 0000000000000000 [50297.616120] FS: 0000000000000000(0000) GS:ffff95157fd00000(0000) knlGS:0000000000000000 [50297.616899] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [50297.617451] CR2: 00007fde9a099f34 CR3: 0000000078754000 CR4: 00000000000606e0 [50297.618146] Call Trace: [50297.618436] [<ffffffffc0cde4ec>] lov_flush_composite+0x17c/0x1d0 [lov] [50297.619091] [<ffffffffc0cdd2e2>] lov_object_flush+0x22/0x60 [lov] [50297.619713] [<ffffffffc082c383>] cl_object_flush+0x63/0x120 [obdclass] [50297.620440] [<ffffffffc0d5affe>] ll_lock_cancel_bits+0x3ce/0x9f0 [lustre] [50297.621121] [<ffffffffc0d5bd68>] ll_md_blocking_ast+0x248/0x2b0 [lustre] [50297.621895] [<ffffffffc0a704aa>] ldlm_cancel_callback+0x8a/0x330 [ptlrpc] [50297.622789] [<ffffffffc067e425>] ? cfs_trace_unlock_tcd+0x35/0x90 [libcfs] [50297.623828] [<ffffffffc0a7b3c0>] ldlm_cli_cancel_local+0xa0/0x3f0 [ptlrpc] [50297.625013] [<ffffffffc0a81121>] ldlm_cli_cancel+0x161/0x650 [ptlrpc] [50297.626050] [<ffffffffc0d5bc0a>] ll_md_blocking_ast+0xea/0x2b0 [lustre] [50297.627204] [<ffffffffc0a8593d>] ldlm_handle_bl_callback+0xed/0x4e0 [ptlrpc] [50297.627945] [<ffffffffc0a86530>] ldlm_bl_thread_main+0x800/0xa40 [ptlrpc] [50297.628653] [<ffffffff854d7c40>] ? wake_up_state+0x20/0x20 [50297.629234] [<ffffffffc0a85d30>] ? ldlm_handle_bl_callback+0x4e0/0x4e0 [ptlrpc] [50297.629962] [<ffffffff854c2e81>] kthread+0xd1/0xe0 [50297.630458] [<ffffffff854c2db0>] ? insert_kthread_work+0x40/0x40 [50297.631082] [<ffffffff85b76c37>] ret_from_fork_nospec_begin+0x21/0x21 [50297.631728] [<ffffffff854c2db0>] ? insert_kthread_work+0x40/0x40 [50297.632319] Code: 4c 8b 26 74 0d f6 05 d3 7c e7 ff 20 0f 85 91 00 00 00 49 8b 44 24 40 49 83 c4 40 49 39 c4 48 8d 58 e8 74 33 0f 1f 80 00 00 00 00 <48> 8b 43 28 48 8b 40 68 48 85 c0 74 12 4c 89 f2 48 89 de 4c 89 [50297.635385] RIP [<ffffffffc082c368>] cl_object_flush+0x48/0x120 [obdclass] [50297.636104] RSP <ffff95155a92bb20> Logs are at |
| Comment by Vladimir Saveliev [ 26/Sep/19 ] |
[50297.612680] RAX: 5a5a5a5a5a5a5a5a RBX: 5a5a5a5a5a5a5a42 RCX: 0000000000000000 [50297.613366] RDX: ffff95154c051bd8 RSI: 0000000000000246 RDI: 0000000000000246 [50297.614050] RBP: ffff95155a92bb40 R08: 0000000000000000 R09: 0000000180180017 [50297.614735] R10: 0000000000000001 R11: ffffe84bc1bdc2c0 R12: ffff95156f70b430 [50297.615430] R13: ffff951549f8ac68 R14: ffff95155f7ae6c0 R15: 0000000000000000 [50297.616120] FS: 0000000000000000(0000) GS:ffff95157fd00000(0000) knlGS:0000000000000000 [50297.616899] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [50297.617451] CR2: 00007fde9a099f34 CR3: 0000000078754000 CR4: 00000000000606e0 [50297.618146] Call Trace: [50297.618436] [<ffffffffc0cde4ec>] lov_flush_composite+0x17c/0x1d0 [lov] [50297.619091] [<ffffffffc0cdd2e2>] lov_object_flush+0x22/0x60 [lov] This is introduced by 707bab62f5 (" cl_object_flush needs i/o initialized. Otherwise, it races with layout change. |
| Comment by Gerrit Updater [ 26/Sep/19 ] |
|
Vladimir Saveliev (c17830@cray.com) uploaded a new patch: https://review.whamcloud.com/36300 |
| Comment by Gerrit Updater [ 04/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36174/ |
| Comment by Gerrit Updater [ 04/Oct/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36368 |
| Comment by Mikhail Pershin [ 04/Oct/19 ] |
|
I was fixing DOM entry checking in lov_flush_composite() and thought that taking LSM reference should protect us from layout change. Could you check if patch above fixes that problem? |
| Comment by Vladimir Saveliev [ 04/Oct/19 ] |
[ 2091.709375] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2091.712428] IP: [< (null)>] (null) ... [ 2091.736582] CPU: 0 PID: 20963 Comm: ldlm_bl_08 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.5.1.el7.x86_64 #1 ... [ 2091.773122] Call Trace: [ 2091.775002] [<ffffffffc0c9a1a2>] ? lov_object_flush+0x22/0x60 [lov] [ 2091.777577] [<ffffffffc093e193>] cl_object_flush+0x63/0x120 [obdclass] [ 2091.780136] [<ffffffffc0e1c408>] ll_lock_cancel_bits+0x9b8/0xc00 [lustre] According to crash dump (from Cray's test system), the BUG happened when lov_object_flush() tried to dispatch to lov_flush_composite(), so probably the change in lov_flush_composite() will not help. crash> lov_object.lo_type,lo_lsm 0xffff8f46e1af42e0 lo_type = LLT_EMPTY lo_lsm = 0x0 That probably happened earlier when layout lock was canceled: 00000080:00010000:0.0:1569576242.662719:0:21009:0:(namei.c:248:ll_lock_cancel_bits()) ### to cancel bits 0x19 ns: lustre-MDT0001-mdc-ffff8f4777b0e000 lock: ffff8f4765861d40/0x7909e24de166dec lrc: 3/0,0 mode: PR/PR res: [0x240000406:0x5278:0x0].0x0 bits 0x19/0x19 rrc: 3 type: IBT flags: 0x429400000000 nid: local remote: 0x96dffd219df32662 expref: -99 pid: 3158 timeout: 0 lvb_type: 0 00000080:00200000:0.0:1569576242.662727:0:21009:0:(vvp_object.c:140:vvp_conf_set()) [0x240000406:0x5278:0x0]: losing layout lock Then on canceling DoM lock ldlm_bl_08 thread faced with empty layout: 00000080:00010000:0.0:1569576242.728877:0:20963:0:(namei.c:248:ll_lock_cancel_bits()) ### to cancel bits 0x40 ns: lustre-MDT0001-mdc-ffff8f4777b0e000 lock: ffff8f47663ee000/0x7909e24de166f19 lrc: 2/0,0 mode: PR/PR res: [0x240000406:0x5278:0x0].0x0 bits 0x48/0x40 rrc: 3 type: IBT flags: 0x460400000000 nid: local remote: 0x96dffd219df32c12 expref: -99 pid: 3288 timeout: 0 lvb_type: 3 |
| Comment by Vladimir Saveliev [ 04/Oct/19 ] |
|
It seems that all elements of lov_dispatch[] need all members of struct lov_layout_operations initialized. So, how about defining lov_dispatch[LLT_EMPTY].llo_flush and lov_dispatch[LLT_RELEASED].llo_flush to a function like static int lov_flush_empty() { return 0; } Mike, would that work? |
| Comment by Mikhail Pershin [ 04/Oct/19 ] |
|
Vladimir, yes, I think it is what we need to add P.S. I've updated patch with that code |
| Comment by Vladimir Saveliev [ 04/Oct/19 ] |
|
Mike, wouldn't it be better to have https://review.whamcloud.com/36368 on top of https://review.whamcloud.com/#/c/36300? |
| Comment by Mikhail Pershin [ 04/Oct/19 ] |
|
Vladimir, do you think https://review.whamcloud.com/#/c/36300 is still needed? I was thinking that we are safe from layout change while LSM is referenced and no need to initialize IO in llite. |
| Comment by Andreas Dilger [ 18/Oct/19 ] |
|
Mike, Oleg is still hitting crashes with https://review.whamcloud.com/36300 so it can't land as-is. Does it make sense to rebase your patch to be directly on master so that it can land independently, or does it depend on 36300 in order to work properly? |
| Comment by Peter Jones [ 18/Oct/19 ] |
|
Moving to 2.14 until we can understand the crashes in Olegtest |
| Comment by Vladimir Saveliev [ 18/Oct/19 ] |
Mike, yes, you are right, io->ci_ignore_layout set to 1 leads to race between layout change and cl_io_init. Something like 00020000:00040000:0.0:1571007482.482526:0:28488:0:(lov_io.c:318:lov_io_mirror_init()) ASSERTION( comp->lo_preferred_mirror == 0 ) failed: has been seen few times. However, there is also a problem with io->ci_ignore_layout set to 0. Namely, the below lockup has beed observed: [<ffffffffc096f065>] ldlm_completion_ast+0x4e5/0x860 [ptlrpc] [<ffffffffc0970e2a>] ldlm_cli_enqueue_fini+0x63a/0xef0 [ptlrpc] [<ffffffffc0973b71>] ldlm_cli_enqueue+0x451/0xa60 [ptlrpc] [<ffffffffc0ba7730>] mdc_enqueue_base+0x330/0x1c40 [mdc] [<ffffffffc0ba9a85>] mdc_intent_lock+0x135/0x560 [mdc] [<ffffffffc0be6742>] lmv_intent_lock+0x402/0xa20 [lmv] [<ffffffffc0c0eb1d>] ll_layout_intent+0x1dd/0x720 [lustre] [<ffffffffc0c1fa6c>] ll_layout_refresh+0x30c/0x900 [lustre] [<ffffffffc0c62ea7>] vvp_io_init+0x347/0x460 [lustre] [<ffffffffc076bf4b>] cl_io_init0.isra.15+0x8b/0x160 [obdclass] [<ffffffffc076c0e3>] cl_io_init+0x43/0x80 [obdclass] [<ffffffffc0c41fe5>] ll_lock_cancel_bits+0x625/0xca0 [lustre] [<ffffffffc0c42a5c>] ll_md_blocking_ast+0x24c/0x2b0 [lustre] [<ffffffffc09626ba>] ldlm_cancel_callback+0x8a/0x330 [ptlrpc] [<ffffffffc096e311>] ldlm_cli_cancel_local+0xd1/0x420 [ptlrpc] [<ffffffffc097294a>] ldlm_cli_cancel_list_local+0xea/0x280 [ptlrpc] [<ffffffffc0972c6b>] ldlm_cancel_resource_local+0x18b/0x2a0 [ptlrpc] [<ffffffffc0ba02ac>] mdc_resource_get_unused_res+0x10c/0x250 [mdc] [<ffffffffc0bb1057>] mdc_enqueue_send+0x557/0x710 [mdc] [<ffffffffc0bb14b2>] mdc_lock_enqueue+0x2a2/0x6f2 [mdc] [<ffffffffc07696d5>] cl_lock_enqueue+0x65/0x120 [obdclass] [<ffffffffc0ab51e5>] lov_lock_enqueue+0x95/0x150 [lov] [<ffffffffc07696d5>] cl_lock_enqueue+0x65/0x120 [obdclass] [<ffffffffc0769c67>] cl_lock_request+0x67/0x1f0 [obdclass] [<ffffffffc076d9cb>] cl_io_lock+0x2bb/0x3d0 [obdclass] [<ffffffffc076dcfa>] cl_io_loop+0xba/0x1c0 [obdclass] [<ffffffffc0c5981f>] cl_setattr_ost+0x25f/0x3d0 [lustre] [<ffffffffc0c34b28>] ll_setattr_raw+0xcc8/0x1060 [lustre] [<ffffffffc0c34f23>] ll_setattr+0x63/0xc0 [lustre] [<ffffffffbc260524>] notify_change+0x2c4/0x420 [<ffffffffbc23f335>] do_truncate+0x75/0xc0 mdc_enqueue_send() tries to do early cancel, ll_lock_cancel_bits()->ll_dom_lock_cancel() initializes cl_io->ci_ignore_layout to 0, so vvp_io_init() does ll_layout_refresh() and takes lli->lli_layout_mutex and sends enqueue rpc. Server sends blocking ast back to the client, so another ll_lock_cancel_bits() gets to run and stuck on trying to lock the mutex lli->lli_layout_mutex:
int ll_layout_refresh(struct inode *inode, __u32 *gen)
{
...
mutex_lock(&lli->lli_layout_mutex);
...
[<ffffffffc0c1f94e>] ll_layout_refresh+0x1ee/0x900 [lustre] [<ffffffffc0c62ea7>] vvp_io_init+0x347/0x460 [lustre] [<ffffffffc076bf4b>] cl_io_init0.isra.15+0x8b/0x160 [obdclass] [<ffffffffc076c0e3>] cl_io_init+0x43/0x80 [obdclass] [<ffffffffc0c41fe5>] ll_lock_cancel_bits+0x625/0xca0 [lustre] [<ffffffffc0c42a99>] ll_md_blocking_ast+0x289/0x2b0 [lustre] [<ffffffffc0978bdd>] ldlm_handle_bl_callback+0xed/0x4e0 [ptlrpc] [<ffffffffc09797d0>] ldlm_bl_thread_main+0x800/0xa40 [ptlrpc] [<ffffffffbc0c1c71>] kthread+0xd1/0xe0 |
| Comment by Vladimir Saveliev [ 22/Oct/19 ] |
Mike, it looks like we are in trouble with this patch as io->ci_ignore_layout = 0; as well as io->ci_ignore_layout = 1; leads to a problem. Any idea? |
| Comment by Gerrit Updater [ 12/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36368/ |
| Comment by Peter Jones [ 12/Nov/19 ] |
|
Landed for 2.14 |