[LU-4513] sanity test_220: prealloc_last_id: Found no match Created: 20/Jan/14  Updated: 27/Jan/14  Resolved: 22/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-3246 fc18: sanity test_133c: @@@@@@ FAIL: ... Resolved
is related to LU-4510 Oops (use after free) in osp_prealloc... Resolved
is related to LU-3319 Adapt to 3.10 upstream kernel proc_di... Resolved
is related to LU-4532 Test failure on test suite sanity, su... Resolved
Severity: 3
Rank (Obsolete): 12350

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run:
http://maloo.whamcloud.com/test_sets/89dcedf6-8144-11e3-81ba-52540035b04c
https://maloo.whamcloud.com/test_sets/6bd28cc0-80d7-11e3-8385-52540035b04c

The sub-test test_220 failed with the following error:

CMD: shadow-5vm4 /usr/sbin/lctl pool_add lustre.test_220 lustre-OST0000
shadow-5vm4: OST lustre-OST0000_UUID added to pool lustre.test_220
pool 'lustre.test_220' has no OSTs
error: setstripe: create stripe file '/mnt/lustre/d220.sanity' failed
CMD: shadow-5vm4 lctl get_param -n osc.lustre-OST0000-osc-MDT0000.prealloc_last_id
shadow-5vm4: error: get_param: /proc/fs/lustre/osc/lustre-OST0000-osc-MDT0000/prealloc_last_id: Found no match
CMD: shadow-5vm4 lctl get_param -n osc.lustre-OST0000-osc-MDT0000.prealloc_next_id
shadow-5vm4: error: get_param: /proc/fs/lustre/osc/lustre-OST0000-osc-MDT0000/prealloc_next_id: Found no match

Info required for matching: sanity 220
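
The "Found no match" errors in the log above are consistent with a lookup that expands a parameter name into a path pattern and finds nothing there. The following is a minimal sketch of that kind of check using POSIX glob(3). It assumes, without confirmation from this ticket, that the tool resolves parameters this way; param_path_exists() is a hypothetical helper, not part of lctl.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if at least one path matches the
 * pattern, 0 if nothing matches, and -1 on any other glob() error.
 */
int param_path_exists(const char *pattern)
{
	glob_t results;
	int rc;

	assert(pattern != NULL);

	rc = glob(pattern, 0, NULL, &results);
	if (rc == GLOB_NOMATCH)
		return 0;	/* the "no match" case */

	/* Release whatever glob() allocated, on success or partial failure. */
	globfree(&results);
	return rc == 0 ? 1 : -1;
}
```

Called with the path from the log, /proc/fs/lustre/osc/lustre-OST0000-osc-MDT0000/prealloc_last_id, such a check would return 0 whenever the osc entry is missing from /proc.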



 Comments   
Comment by Nathaniel Clark [ 20/Jan/14 ]

Started failing on 1/19/14 (in the runs linked above) and has been failing 1/4 of the time since.

Comment by Nathaniel Clark [ 20/Jan/14 ]

I have narrowed the window of the regression:
Last Passing 142b2e4 LU-3467 ofd: remove obsoleted OBD methods
First Failing be41e2c LU-4416 mem: truncate_pagecache oldsize removed
I haven't found any reviews with a parent patch in between those.

Comment by Oleg Drokin [ 20/Jan/14 ]

Can you try with http://review.whamcloud.com/8029 reverted, please?

Comment by Andreas Dilger [ 20/Jan/14 ]

That leaves the following patches for consideration:

commit be41e2ce0d71a707da703e6f8e82d397be839d23
Author: yangsheng <yang.sheng@intel.com>

LU-4416 mem: truncate_pagecache oldsize removed

commit a674871d5f9e4819b3428593e24df6e52096612f
Author: Swapnil Pimpale <spimpale@ddn.com>

LU-4353 strncmp: Replace incorrect strncmp()s with strcmp()

commit b5f3d6db9200e369a68284a8ef85a1205e5905e1
Author: Emoly Liu <emoly.liu@intel.com>

LU-4154 lfsck: skip old lfsck test in DNE mode

commit a97e4898ad9e0b65f457b01bdfa954f7d7cd272d
Author: James Simmons <uja.ornl@gmail.com>

LU-3319 procfs: move osp proc handling to seq_files

commit 04e7562f872092c7a94e6d77fb5d2a7f97594bcf
Author: James Simmons <uja.ornl@gmail.com>

LU-3319 procfs: update shared server side core proc handling to seq_files

commit add1417e6f95a63cd3ed90c968b7b0c260168ce4
Author: Manisha Salve <msalve@ddn.com>

LU-2880 ldiskfs: Added mount option to enable dirdata.

commit 14cbce4b70833ee259427fa5a2ba826b75bb5c58
Author: Mikhail Pershin <mike.pershin@intel.com>

LU-3467 ofd: remove obsoleted OBD methods

My bet would be on one or both of James' patches, since they affect the /proc files, and this bug and the recent popularity of LU-3246 both relate to problems with /proc files.

Comment by James A Simmons [ 20/Jan/14 ]

I know which patch it is. It's 8029, and I see what the problem is. I'm looking into a fix.

Comment by John Hammond [ 20/Jan/14 ]

The logic of osp_lprocfs_init() needs review. If osc_proc_dir is an ERR_PTR() then it shouldn't be used later in the function. If lprocfs_add_symlink() fails then we shouldn't remove osc_proc_dir. There is no reason that I can tell for us to allocate a copy of name. If there is a reason to allocate it then it should be freed regardless of whether or not it has "osc" as a substring.

In osp_init0() the proc cleanup should remove obd_proc_private.
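
The review points above can be illustrated with a small userspace sketch. This is not the osp_lprocfs_init() code: ERR_PTR(), PTR_ERR() and IS_ERR() are reimplemented locally to imitate the kernel helpers, and struct proc_dir, register_dir(), add_symlink(), proc_init_sketch() and proc_fini_sketch() are hypothetical stand-ins. It also reads the symlink point as referring to a symlink failure, which is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace imitations of the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical stand-in for a proc directory entry. */
struct proc_dir {
	int nr_links;
};

/* Hypothetical: returns a new directory, or ERR_PTR(-ENOMEM) on failure. */
static struct proc_dir *register_dir(int fail)
{
	struct proc_dir *dir;

	if (fail)
		return ERR_PTR(-ENOMEM);
	dir = calloc(1, sizeof(*dir));
	return dir ? dir : ERR_PTR(-ENOMEM);
}

/* Hypothetical: adds a symlink under dir; returns 0 or a negative errno. */
static int add_symlink(struct proc_dir *dir, int fail)
{
	assert(dir != NULL && !IS_ERR(dir));
	if (fail)
		return -ENOMEM;
	dir->nr_links++;
	return 0;
}

/* Sketch of the init flow; fail_dir and fail_link inject failures. */
int proc_init_sketch(struct proc_dir **dirp, const char *name,
		     int fail_dir, int fail_link)
{
	struct proc_dir *dir;

	assert(dirp != NULL && name != NULL);
	*dirp = NULL;

	dir = register_dir(fail_dir);
	if (IS_ERR(dir))
		return (int)PTR_ERR(dir);	/* never touch dir after this */
	*dirp = dir;

	/* Check the caller's string directly; no temporary copy to leak. */
	if (strstr(name, "osc") == NULL)
		return 0;

	/* On symlink failure, report it but leave the directory registered. */
	return add_symlink(dir, fail_link);
}

/* Cleanup counterpart: release the directory and clear the pointer. */
void proc_fini_sketch(struct proc_dir **dirp)
{
	assert(dirp != NULL);
	free(*dirp);
	*dirp = NULL;
}
```

The point of the shape is that each failure path leaves a well-defined state: an error pointer is turned into a return code immediately, a symlink failure does not tear down the directory, and the cleanup function is the only place where the directory is released.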

Comment by James A Simmons [ 21/Jan/14 ]

It's been reverted. Let me know if problems still exist.

Comment by Peter Jones [ 21/Jan/14 ]

Nathaniel

Could you please confirm whether the revert of the LU-3319 patch - http://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=b9b4614c1e302058ed9863b1ab73b7def2c5c924 - has indeed removed this problem?

Thanks

Peter

Comment by Andreas Dilger [ 22/Jan/14 ]

Problem was fixed with the revert of commit a97e4898ad9e0b (http://review.whamcloud.com/8029). Only a small number of failures since that was landed, and all of them have parent patches that predate the revert.
