[LU-5487] Test failure on test suite sanity-lfsck, subtest test_18d Created: 14/Aug/14 Updated: 11/Apr/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 15309 |
| Description |
|
This issue was created by maloo for Minh Diep <minh.diep@intel.com>.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/b04945f0-236c-11e4-84ee-5254006e85c2.
The sub-test test_18d failed with the following error:
Info required for matching: sanity-lfsck 18d |
| Comments |
| Comment by Oleg Drokin [ 15/Aug/14 ] |
|
I suspect this is a case of the background task completing too fast?
Changed after 0s: from 'scanning-phase2' to 'completed'
Waiting 6 secs for update
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
CMD: onyx-51vm3 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
awk '/^status/ { print \$2 }'
Update not seen after 6s: wanted 'scanning-phase2' got 'completed'
sanity-lfsck test_18d: @@@@@@ FAIL: (3.0) MDS1 is not the expected 'scanning-phase2'
|
| Comment by Jian Yu [ 08/Jul/16 ] |
|
Another failure instance on the master branch: https://testing.hpdd.intel.com/test_sets/c53fdc74-403b-11e6-acf3-5254006e85c2 |
| Comment by Dmitry Eremin (Inactive) [ 11/Apr/17 ] |
|
I see the following crash in master:
16:46:36:[14470.127738] LustreError: 22186:0:(vvp_io.c:345:vvp_io_fini()) ASSERTION( io->ci_type == CIT_WRITE || cl_io_is_trunc(io) ) failed:
16:46:36:[14470.132048] LustreError: 22186:0:(vvp_io.c:345:vvp_io_fini()) LBUG
16:46:36:[14470.134194] Pid: 22186, comm: cat
16:46:36:[14470.135970]
16:46:36:[14470.135970] Call Trace:
16:46:36:[14470.139113] [<ffffffffa07107f3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
16:46:36:[14470.141168] [<ffffffffa0710861>] lbug_with_loc+0x41/0xb0 [libcfs]
16:46:36:[14470.143473] [<ffffffffa0c93761>] vvp_io_fini+0x321/0x360 [lustre]
16:46:36:[14470.145411] [<ffffffffa0beaff2>] ? lov_io_fini+0x282/0x460 [lov]
16:46:36:[14470.147499] [<ffffffffa0805165>] cl_io_fini+0x75/0x240 [obdclass]
16:46:36:[14470.149358] [<ffffffffa0c42f73>] ll_file_io_generic+0x2a3/0xb00 [lustre]
16:46:36:[14470.151383] [<ffffffff81219cff>] ? touch_atime+0x12f/0x160
16:46:36:[14470.153202] [<ffffffffa0c4409a>] ll_file_aio_read+0x34a/0x3e0 [lustre]
16:46:36:[14470.155178] [<ffffffffa0c441fe>] ll_file_read+0xce/0x1e0 [lustre]
16:46:36:[14470.157019] [<ffffffff811fe19e>] vfs_read+0x9e/0x170
16:46:36:[14470.158806] [<ffffffff811fed6f>] SyS_read+0x7f/0xe0
16:46:36:[14470.160524] [<ffffffff81696b09>] system_call_fastpath+0x16/0x1b
The code is as follows:
        /**
         * dynamic layout change needed, send layout intent
         * RPC.
         */
        if (io->ci_need_write_intent) {
                loff_t start = 0;
                loff_t end = 0;

                LASSERT(io->ci_type == CIT_WRITE || cl_io_is_trunc(io));
|
| Comment by Dmitry Eremin (Inactive) [ 11/Apr/17 ] |
|
This crash was introduced in https://review.whamcloud.com/25317
static int lov_io_rw_iter_init(const struct lu_env *env,
                               const struct cl_io_slice *ios)
{
        ...
        index = lov_lsm_entry(lsm, lio->lis_endpos - 1);
        if (index > 0 && !lsm_entry_inited(lsm, index)) {
                io->ci_need_write_intent = 1;
                RETURN(io->ci_result = -ENODATA);
        }
So "io->ci_need_write_intent" can also be set to "1" for a read, and the LASSERT in vvp_io_fini() only allows a write or a truncate. |
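The interaction described in the last two comments can be sketched as a small standalone model. This is only an illustration under stated assumptions, not Lustre code: `struct model_io`, `model_rw_iter_init()`, `model_fini_assert_holds()` and `model_demo()` are hypothetical stand-ins for the quoted fragments, the `entry_inited` argument replaces the `lsm_entry_inited()` lookup, and the `cl_io_is_trunc()` half of the assertion is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Lustre types; not the real definitions. */
enum model_io_type { MODEL_CIT_READ, MODEL_CIT_WRITE };

struct model_io {
	enum model_io_type type;  /* models io->ci_type */
	bool need_write_intent;   /* models io->ci_need_write_intent */
	int result;               /* models io->ci_result */
};

/*
 * Models the quoted lov_io_rw_iter_init() fragment: the flag is set
 * whenever the I/O ends in an uninstantiated component with index > 0,
 * without looking at the I/O type.
 */
int model_rw_iter_init(struct model_io *io, int index, bool entry_inited)
{
	if (index > 0 && !entry_inited) {
		io->need_write_intent = true;
		io->result = -ENODATA;
		return io->result;
	}
	return 0;
}

/*
 * Models the condition guarded by the LASSERT quoted from vvp_io_fini().
 * Returns true when the assertion would hold, false when it would fire.
 * Truncate is not modelled (assumption made to keep the sketch short).
 */
bool model_fini_assert_holds(const struct model_io *io)
{
	if (!io->need_write_intent)
		return true;  /* the asserting branch is never entered */
	return io->type == MODEL_CIT_WRITE;
}

/* Usage: a read that ends in an uninstantiated component at index 1. */
void model_demo(void)
{
	struct model_io rd = { MODEL_CIT_READ, false, 0 };

	assert(model_rw_iter_init(&rd, 1, false) == -ENODATA);
	assert(rd.need_write_intent);
	assert(!model_fini_assert_holds(&rd));  /* the reported LBUG case */
}
```

In this model the same call sequence with `MODEL_CIT_WRITE` satisfies the check, which matches the observation that the crash was seen on a read path (`comm: cat`, `ll_file_read`).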