[LU-2910] racer test_1, sanity test_54c: ASSERTION( cio->cui_tot_nrsegs >= cio->cui_nrsegs ) failed Created: 05/Mar/13  Updated: 18/Mar/13  Resolved: 18/Mar/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: LB

Severity: 3
Rank (Obsolete): 7001

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/3fa2a912-85b2-11e2-9f8d-52540035b04c.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

client console shows:

18:47:35:Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-22-ib,client-23-ib.lab.whamcloud.com DURATION=900 == 18:47:28 (1362451648)
18:47:35:Lustre: DEBUG MARKER: DURATION=900 MDSCOUNT=1 				   /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre2/racer 
18:47:35:Lustre: DEBUG MARKER: DURATION=900 MDSCOUNT=1 				   /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre/racer 
18:51:19:LustreError: 8649:0:(file.c:2601:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000bd0:0x9095:0x0] error: rc = -116
18:56:24:LustreError: 28058:0:(lcommon_cl.c:821:ccc_io_advance()) ASSERTION( cio->cui_tot_nrsegs >= cio->cui_nrsegs ) failed: 
18:56:24:LustreError: 28058:0:(lcommon_cl.c:821:ccc_io_advance()) LBUG
18:56:24:Pid: 28058, comm: cat
18:56:24:
18:56:24:Call Trace:
18:56:24: [<ffffffffa03fe895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
18:56:24: [<ffffffffa03fee97>] lbug_with_loc+0x47/0xb0 [libcfs]
18:56:24: [<ffffffffa1264c0e>] ccc_io_advance+0x11e/0x190 [lustre]
18:56:24: [<ffffffffa05a96d6>] cl_io_rw_advance+0x66/0x150 [obdclass]
18:56:24: [<ffffffffa05ae602>] cl_io_loop+0xe2/0x1b0 [obdclass]
18:56:24: [<ffffffffa1215323>] ll_file_io_generic+0x223/0x570 [lustre]
18:56:24: [<ffffffffa12157af>] ll_file_aio_read+0x13f/0x2c0 [lustre]
18:56:25: [<ffffffffa121635c>] ll_file_read+0x16c/0x2a0 [lustre]
18:56:25: [<ffffffff81176cb5>] vfs_read+0xb5/0x1a0
18:56:25: [<ffffffff81176df1>] sys_read+0x51/0x90
18:56:25: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
18:56:25:
18:56:25:Kernel panic - not syncing: LBUG
18:56:25:Pid: 28058, comm: cat Not tainted 2.6.32-279.19.1.el6.x86_64 #1
18:56:25:Call Trace:
18:56:25: [<ffffffff814e9541>] ? panic+0xa0/0x168
18:56:25: [<ffffffffa03feeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
18:56:25: [<ffffffffa1264c0e>] ? ccc_io_advance+0x11e/0x190 [lustre]
18:56:25: [<ffffffffa05a96d6>] ? cl_io_rw_advance+0x66/0x150 [obdclass]
18:56:25: [<ffffffffa05ae602>] ? cl_io_loop+0xe2/0x1b0 [obdclass]
18:56:25: [<ffffffffa1215323>] ? ll_file_io_generic+0x223/0x570 [lustre]
18:56:25: [<ffffffffa12157af>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
18:56:25: [<ffffffffa121635c>] ? ll_file_read+0x16c/0x2a0 [lustre]
18:56:25: [<ffffffff81176cb5>] ? vfs_read+0xb5/0x1a0
18:56:25: [<ffffffff81176df1>] ? sys_read+0x51/0x90
18:56:26: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
18:56:26:Initializing cgroup subsys cpuset
18:56:26:Initializing cgroup subsys cpu


 Comments   
Comment by Niu Yawei (Inactive) [ 07/Mar/13 ]

The client debug log in the above Maloo link is unfortunately missing. Sarah, could you try to reproduce it and collect the client debug log (with D_VFSTRACE enabled)? Thanks in advance.

Comment by Niu Yawei (Inactive) [ 08/Mar/13 ]

It looks like the LASSERT can be triggered when we restart the IO without restoring the iovec: http://review.whamcloud.com/5652
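
To make the failure mode concrete, here is a minimal userspace model of the iovec bookkeeping (hedged: this is not the Lustre source; only the field names cui_iov/cui_nrsegs/cui_tot_nrsegs and the asserted condition are taken from lcommon_cl.c, everything else is a simplification):

#include <assert.h>
#include <sys/uio.h>

/* Model of the 2.x ccc_io iovec state; field names follow the real
 * structure, the rest is illustrative. */
struct cio_model {
        struct iovec *cui_iov;        /* remaining iovec segments */
        unsigned long cui_nrsegs;     /* segments used by the current chunk */
        unsigned long cui_tot_nrsegs; /* segments left in the whole IO */
};

/* Model of ccc_io_advance(): step past the segments consumed so far. */
static void model_advance(struct cio_model *cio)
{
        assert(cio->cui_tot_nrsegs >= cio->cui_nrsegs); /* the failed LASSERT */
        cio->cui_iov        += cio->cui_nrsegs;
        cio->cui_tot_nrsegs -= cio->cui_nrsegs;
        cio->cui_nrsegs      = 0;
}

int main(void)
{
        char buf[8];
        struct iovec iov[2] = { { buf, 4 }, { buf + 4, 4 } };
        struct cio_model cio = { iov, 2, 2 };

        model_advance(&cio);  /* whole request consumed; tot_nrsegs is now 0 */

        /* A restart that rebuilds cui_nrsegs from the original request but
         * does not restore cui_iov/cui_tot_nrsegs re-enters with
         * tot_nrsegs (0) < nrsegs (2), and the assertion fires: */
        cio.cui_nrsegs = 2;
        model_advance(&cio);
        return 0;
}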

Comment by Jinshan Xiong (Inactive) [ 08/Mar/13 ]

Is file_swap.sh enabled in this run?

Comment by Niu Yawei (Inactive) [ 08/Mar/13 ]

Xiong, I think file_swap.sh should be disabled by default. Sarah knows it better.

Comment by Jinshan Xiong (Inactive) [ 08/Mar/13 ]

I'm trying to think of situations in which it would make the IO restart. Also, it may be incorrect to restart the whole IO in some circumstances.

Comment by Jinshan Xiong (Inactive) [ 08/Mar/13 ]

Now that the log is missing, I'd like to wait for a while to see if this issue can be reproduced. I'd like to understand it clearly before making a fix.

Comment by Niu Yawei (Inactive) [ 10/Mar/13 ]

Xiong, in the racer test the file size can change at any time (mostly from rename, I think), so it's possible to trigger an IO restart when ccc_prep_size() thinks that ppos is beyond the file size, isn't it?

From the current code, it seems we only support restarting the whole IO when no read/write has been done yet; we don't support restarting after a partial IO, right?
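
That distinction can be sketched as follows (hedged: ci_need_restart and ci_nob are real cl_io fields, but the control flow is a simplified model, not the body of ll_file_io_generic()):

#include <stddef.h>

/* Illustrative model only. */
struct io_model {
        int    ci_need_restart; /* e.g. set when the layout changed */
        size_t ci_nob;          /* bytes already transferred */
};

static int model_io_loop(struct io_model *io)
{
        for (;;) {
                /* ... run one cl_io iteration here ... */
                if (!io->ci_need_restart)
                        return 0;
                /* A whole-IO restart is only safe before any bytes have
                 * moved; after a partial IO the iovec has already been
                 * consumed and would have to be restored first. */
                if (io->ci_nob != 0)
                        return -1; /* partial restart: unsupported */
                io->ci_need_restart = 0;
        }
}

int main(void)
{
        struct io_model whole   = { 1, 0 }; /* restart before any bytes: ok  */
        struct io_model partial = { 1, 4 }; /* restart after partial IO: not */

        return (model_io_loop(&whole) == 0 &&
                model_io_loop(&partial) == -1) ? 0 : 1;
}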

Comment by Sarah Liu [ 11/Mar/13 ]

I ran this test four times on Friday and cannot reproduce it on my test nodes. BTW Xiong, I cannot find file_swap.sh under tests/racer; could you please point me to it? Thanks.

Comment by Jinshan Xiong (Inactive) [ 11/Mar/13 ]

Are you in the master tree? Try running the following command and see what you get:

[jinxiong@intel lustre]$ git annotate tests/racer/file_swap.sh
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 1)#!/bin/bash
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 2)
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 3)DIR=$1
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 4)MAX=$2
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 5)
4af3ab19 (Jinshan Xiong 2013-02-01 10:33:09 -0800 6)while : ; do

Comment by Sarah Liu [ 11/Mar/13 ]

Xiong, have you added the file to Makefile.am? If not, it will not be built into the RPMs.

Comment by Jinshan Xiong (Inactive) [ 11/Mar/13 ]

Ah no, I didn't.

Comment by Jodi Levi (Inactive) [ 11/Mar/13 ]

Can this be reduced from a blocker since it doesn't sound like it can be reproduced?

Comment by Sarah Liu [ 11/Mar/13 ]

Didn't hit it in the latest tag testing; I think this can be dropped to major.

Comment by Niu Yawei (Inactive) [ 12/Mar/13 ]

Though it's hard to reproduce, from the code we can see that if the IO is restarted, iov_len can be corrupted and this LASSERT will eventually be triggered.

Xiong, if you think a layout change in the racer test is unexpected, then there could be something seriously wrong in the layout code, because I can see lots of IO restarts in my local testing (without file_swap; most IO restarts come from vvp_io_fini(), some from can_populate_pages()). Could they be caused by rename?

Comment by Jinshan Xiong (Inactive) [ 12/Mar/13 ]

Niu, you're right about this. Fake layout changes can occur now that layout swapping has been introduced, and rename can cause them because it revokes the full ibits lock.

Comment by Niu Yawei (Inactive) [ 15/Mar/13 ]

Patch landed for 2.4.

Comment by John Hammond [ 15/Mar/13 ]

I'm hitting this reliably using 2.3.62-47-g3a0bb68 (which includes the patch here) on my home setup. The easiest way for me to reproduce it is to run sanity 54c in a loop. Maybe this should be reopened.

Comment by Niu Yawei (Inactive) [ 18/Mar/13 ]

What I saw is another LASSERT, newly added by the patch:

LustreError: 31446:0:(lcommon_cl.c:794:ccc_io_update_iov()) ASSERTION( cio->cui_tot_nrsegs >= cio->cui_nrsegs ) failed: tot_nrsegs: 0, nrsegs: 1
LustreError: 31446:0:(lcommon_cl.c:794:ccc_io_update_iov()) LBUG
Kernel panic - not syncing: LBUG
Pid: 31446, comm: loop3 Tainted: P           ---------------    2.6.32 #1
Call Trace:
 [<ffffffff814fe08e>] ? panic+0xa0/0x168
 [<ffffffffa10b3f5b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0f6e569>] ? ccc_io_update_iov+0xe9/0xf0 [lustre]
 [<ffffffffa0f77d8c>] ? vvp_io_rw_lock+0x8c/0x690 [lustre]
 [<ffffffffa10c4301>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa0f783c5>] ? vvp_io_write_lock+0x35/0x40 [lustre]
 [<ffffffffa06e03d3>] ? cl_io_lock+0x63/0x560 [obdclass]
 [<ffffffffa10bedf8>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa0f4a2bf>] ? ll_cl_init+0x38f/0x570 [lustre]
 [<ffffffffa0f4a6e3>] ? ll_prepare_write+0x53/0x1a0 [lustre]
 [<ffffffffa0f62c0e>] ? ll_write_begin+0x7e/0x1a0 [lustre]
 [<ffffffff8111346c>] ? pagecache_write_begin+0x1c/0x20
 [<ffffffff8135ef05>] ? do_lo_send_aops+0x155/0x1a0
 [<ffffffff8135f15b>] ? do_bio_filebacked+0x20b/0x2d0
 [<ffffffff8135edb0>] ? do_lo_send_aops+0x0/0x1a0
 [<ffffffff8135f2f1>] ? loop_thread+0xd1/0x270
 [<ffffffff81092180>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8135f220>] ? loop_thread+0x0/0x270
 [<ffffffff81091e16>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81091d80>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20

Actually, tot_nrsegs can sometimes be zero (see ll_cl_init()), so the newly added assertion isn't correct in that case.

Comment by Niu Yawei (Inactive) [ 18/Mar/13 ]

Skip iov update when tot_nrsegs is zero: http://review.whamcloud.com/5747
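
In model form, the fix amounts to a guard before the assertion (a hedged sketch of the idea only, reusing the cio_model fields from the earlier sketch; this is not the landed patch):

/* Continuation of the cio_model sketch above: ccc_io_update_iov() with the
 * guard. Writes driven through the kernel loop device can reach this point
 * with cui_tot_nrsegs == 0 (see ll_cl_init()), so the iov update must be
 * skipped rather than asserted on. */
static void model_update_iov(struct cio_model *cio, size_t tot_count)
{
        if (cio->cui_tot_nrsegs == 0)
                return; /* loop-device write path: nothing to trim, don't LBUG */

        assert(cio->cui_tot_nrsegs >= cio->cui_nrsegs);

        /* ... trim the remaining iovec segments down to tot_count bytes ... */
        (void)tot_count;
}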

Comment by Andreas Dilger [ 18/Mar/13 ]

Increasing priority, since this is causing almost every test to fail.

Comment by Minh Diep [ 18/Mar/13 ]

Hit this: https://maloo.whamcloud.com/test_sets/7e10a3f6-8e87-11e2-a173-52540035b04c

Comment by Andreas Dilger [ 18/Mar/13 ]

Jinshan replied to my comment on Patch Set 1:

"Failed tests always show "tot_nrsegs: 0" in the LASSERT() so I can imagine this will fix the problem. The root cause of this problem is lloop write."

Is there something that should be fixed in the lloop code? There has been more interest in this recently from Robert and Richard for VM provisioning and other cloud applications.

Comment by Jinshan Xiong (Inactive) [ 18/Mar/13 ]

"Is there something that should be fixed in the lloop code? There has been more interest in this recently from Robert and Richard for VM provisioning and other cloud applications."

Did I say Lustre loop device? I meant the Linux kernel loop device, sorry.

First of all, we should figure out what kind of IO the lloop device is receiving; then we can work out a corresponding fix. In the current implementation of lloop, we use direct IO to serve bio requests, so an obvious way to improve performance is to do cached writes - especially if we find there are plenty of small IOs.
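
As a userspace analogy of that trade-off (hedged: this is not loop-driver code; serve_bio, both descriptors, and the 64 KB threshold are all illustrative), the choice looks like this:

#include <sys/types.h>
#include <unistd.h>

/* Serve one block request through either a buffered (cached) or an
 * O_DIRECT file descriptor on the backing file. The page cache absorbs
 * and merges small writes, which is the benefit suggested above; note
 * that a real O_DIRECT path also requires aligned buffers, offsets, and
 * lengths. */
static ssize_t serve_bio(int fd_cached, int fd_direct,
                         const void *buf, size_t len, off_t off)
{
        const size_t SMALL_IO = 64 * 1024; /* illustrative threshold */
        int fd = (len < SMALL_IO) ? fd_cached : fd_direct;

        return pwrite(fd, buf, len, off);
}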

Comment by Peter Jones [ 18/Mar/13 ]

Landed for 2.4
