[LU-4221] lctl conf_param <obdname>.ost.writethrough_cache_enable=N does not work anymore Created: 06/Nov/13  Updated: 17/Jan/14  Resolved: 02/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Bug Priority: Blocker
Reporter: Michael MacDonald (Inactive) Assignee: Emoly Liu
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 11487

 Description   

Using 2.5.0 RC1, which I assume is what went GA.

In previous releases of Lustre, it was possible to set this tunable via conf_param, but now it doesn't work:

[root@mgs ~]# lctl conf_param testfs-OST0000.ost.writethrough_cache_enable=0
[root@mgs ~]# cat /proc/fs/lustre/obdfilter/testfs-OST0000/writethrough_cache_enable
1

The command completes successfully, but we see the following in dmesg:

LustreError: 15418:0:(obd_config.c:1341:class_process_proc_param()) testfs-OST0000: unknown param writethrough_cache_enable=0
LustreError: 15418:0:(obd_config.c:1591:class_config_llog_handler()) MGC10.42.42.5@tcp: cfg command failed: rc = -38
Lustre:    cmd=cf00f 0:testfs-OST0000  1:ost.writethrough_cache_enable=0

I understand that conf_param is on its way to being deprecated, and that set_param -P is preferred. However, conf_param should still work, right? It seems that some things still work as they always have, e.g.:

[root@mgs ~]# lctl conf_param testfs-OST0000.ost.client_cache_seconds=4242
[root@mgs ~]# cat /proc/fs/lustre/obdfilter/testfs-OST0000/client_cache_seconds
4242

In dmesg:

Lustre: Modifying parameter testfs-OST0000.ost.client_cache_seconds in log testfs-OST0000


 Comments   
Comment by Michael MacDonald (Inactive) [ 06/Nov/13 ]

For the record, substituting obdfilter for ost in the command doesn't work either:

[root@mgs ~]# lctl conf_param testfs-OST0000.obdfilter.writethrough_cache_enable=0
Make sure cfg_device is set first.
error: conf_param: Function not implemented
Comment by Andreas Dilger [ 06/Nov/13 ]

Strange. The obdfilter parameters exist, though the tunable has actually moved to osd-ldiskfs and no longer lives in obdfilter:

# lctl get_param *.*.writethrough_cache_enable     
obdfilter.testfs-OST0000.writethrough_cache_enable=1
obdfilter.testfs-OST0001.writethrough_cache_enable=1
obdfilter.testfs-OST0002.writethrough_cache_enable=1
osd-ldiskfs.testfs-OST0000.writethrough_cache_enable=1
osd-ldiskfs.testfs-OST0001.writethrough_cache_enable=1
osd-ldiskfs.testfs-OST0002.writethrough_cache_enable=1

# ls -l /proc/fs/lustre/obdfilter/testfs-OST0000/writethrough_cache_enable 
0 lrwxrwxrwx 1 root root 58 Nov  6 14:25 /proc/fs/lustre/obdfilter/testfs-OST0000/writethrough_cache_enable -> ../../osd-ldiskfs/testfs-OST0000/writethrough_cache_enable

This may be the crux of the problem. The "set_param -P" handler just calls a usermode helper to walk the /proc path, so it doesn't care about the symlink. The "conf_param" handler does this inside the kernel and perhaps doesn't understand the symlink?
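To illustrate the user-space half of that guess: a write through a /proc-style path follows a symlink transparently, because open() resolves the link before the write happens. The following is a minimal sketch assuming a POSIX system. param_is_symlink() and param_write() are hypothetical helpers for illustration, not part of lctl or Lustre.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if path itself is a symlink, 0 if not,
 * -errno on failure. lstat() inspects the link, not its target. */
static int param_is_symlink(const char *path)
{
        struct stat st;

        assert(path != NULL);
        if (lstat(path, &st) != 0)
                return -errno;
        return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: write value to path. open() follows symlinks,
 * so a write to the link lands in the link target.
 * Returns 0 on success or -errno on failure. */
static int param_write(const char *path, const char *value)
{
        size_t len;
        ssize_t done;
        int fd;

        assert(path != NULL && value != NULL);
        fd = open(path, O_WRONLY | O_TRUNC);
        if (fd < 0)
                return -errno;

        len = strlen(value);
        done = write(fd, value, len);
        if (done < 0) {
                int err = errno;

                close(fd);
                return -err;
        }
        close(fd);
        return (size_t)done == len ? 0 : -EIO;
}
```

With helpers like these, writing to the obdfilter path and writing to the osd-ldiskfs path would be indistinguishable from user space, which is consistent with "set_param -P" not being affected.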

Comment by Peter Jones [ 06/Nov/13 ]

Emoly

Could you please look into this one?

Thanks

Peter

Comment by Emoly Liu [ 07/Nov/13 ]

I will investigate the symlink issue raised by Andreas.

Comment by Emoly Liu [ 08/Nov/13 ]

The root cause is that these symlink entries are not included in the lproc variable list used by ofd_process_config(). So, when the list is searched, the symlink name can't be matched and ENOSYS is reported.

static int ofd_process_config(const struct lu_env *env, struct lu_device *d,
                              struct lustre_cfg *cfg)
{
...
        switch (cfg->lcfg_command) {
        case LCFG_PARAM: {
                struct lprocfs_static_vars lvars;
...
                /* lvars.obd_vars holds only the statically listed entries */
                lprocfs_ofd_init_vars(&lvars);
                rc = class_process_proc_param(PARAM_OST, lvars.obd_vars, cfg,
                                              d->ld_obd);
...
        }
...
}

void lprocfs_ofd_init_vars(struct lprocfs_static_vars *lvars)
{                               
        lvars->module_vars  = lprocfs_ofd_module_vars;
        lvars->obd_vars     = lprocfs_ofd_obd_vars;
}   

This list is initialized by lprocfs_ofd_init_vars(), whereas the symlink entries are created separately in ofd_procfs_add_brw_stats_symlink().

We need to re-add the missing symlinks to the list.
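The failure mode described in this comment can be modeled in user space. This is a minimal sketch, not the Lustre source: struct param_var, ofd_vars and process_param() are hypothetical stand-ins for the lprocfs variable list and its search, and only the parameter names come from this ticket. A name that exists only as a symlink is absent from the static list, so the search falls through and returns -ENOSYS, matching the "rc = -38" seen in dmesg on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a static variable list. */
struct param_var {
        const char *name;
        int (*write)(const char *value);
};

/* Last value stored by the only registered handler in this sketch. */
static int last_client_cache_seconds;

static int client_cache_seconds_write(const char *value)
{
        last_client_cache_seconds = atoi(value);
        return 0;
}

/* Only statically registered entries appear here. The entry for
 * writethrough_cache_enable is deliberately missing, mirroring a
 * symlink that was created outside the list. */
static const struct param_var ofd_vars[] = {
        { "client_cache_seconds", client_cache_seconds_write },
        { NULL, NULL }
};

/* Split "key=value" and search the list by key.
 * Returns the handler's result, -EINVAL without '=', or -ENOSYS
 * when no entry matches. */
static int process_param(const struct param_var *vars, const char *param)
{
        const char *eq;
        size_t keylen;

        assert(vars != NULL && param != NULL);
        eq = strchr(param, '=');
        if (eq == NULL)
                return -EINVAL;
        keylen = (size_t)(eq - param);

        for (; vars->name != NULL; vars++) {
                if (strlen(vars->name) == keylen &&
                    strncmp(vars->name, param, keylen) == 0)
                        return vars->write(eq + 1);
        }
        return -ENOSYS;
}
```

In this model, process_param(ofd_vars, "client_cache_seconds=4242") succeeds, while process_param(ofd_vars, "writethrough_cache_enable=0") returns -ENOSYS because nothing in the list carries that name.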

Comment by Emoly Liu [ 08/Nov/13 ]

If we can get the parent proc entry, calling lprocfs_srch() to find the param in class_process_proc_param() should be easier than updating the list in each xxx_process_config().

I am working on the patch.

Comment by James A Simmons [ 08/Nov/13 ]

I noticed this ticket, and I have been working on moving all the proc code over to seq_file. With newer kernels you can no longer traverse the tree with lprocfs_srch, so I cached the parent proc entry in struct obd_device. I just tried this with my work. Instead I get:

[76965.678055] LustreError: 23899:0:(mgs_handler.c:744:mgs_iocontrol()) MGS: setparam err: rc = -22
[77095.862258] LustreError: 24014:0:(mgs_llog.c:335:mgs_new_fsdb()) fsname obdfilter is too long

If this is a problem for obdfilter, I bet it is broken for other things as well.

Comment by Emoly Liu [ 08/Nov/13 ]

James, thanks for the information! I know you are working on LU-3319. Did you try the command "lctl conf_param testfs-OST0000.ost.writethrough_cache_enable=0"? Could you please provide more logs or dmesg output?

Thanks.

Comment by Emoly Liu [ 11/Nov/13 ]

I have a tentative patch that does not use lprocfs_srch(). The remaining problem is how to get the real target proc entry, because the symlink entry has no read_proc/write_proc handlers.
I think there should be a better way than resolving the link path stored in entry->data.

Comment by Emoly Liu [ 12/Nov/13 ]

Thanks to fanyong for the advice. We don't need to pass the parent entry to get all the sub-entries, or to resolve the symbolic link path. Instead, we can just pass the conf param down the stack and let the osd process it.

I added a case for LCFG_PARAM to osd_process_config(), and now it recognizes these symlink params.

I will submit the patch later.
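The idea of passing the parameter down the stack can be sketched as follows. This is a simplified user-space model under stated assumptions, not the patch itself: struct layer, layer_set() and stack_process_param() are hypothetical names, and each layer is reduced to a list of keys it knows plus the last value it accepted. A key that the upper (ofd) layer does not match is offered to the lower (osd) layer instead of failing immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one device layer in the ofd -> osd stack. */
struct layer {
        const char *name;            /* layer name, for illustration only */
        const char *const *params;   /* NULL-terminated list of known keys */
        int last_value;              /* last value accepted by this layer */
        struct layer *next;          /* lower layer, or NULL at the bottom */
};

/* Try to apply a key at one layer; -ENOSYS if the key is unknown here. */
static int layer_set(struct layer *l, const char *key, size_t keylen,
                     const char *value)
{
        const char *const *p;

        for (p = l->params; *p != NULL; p++) {
                if (strlen(*p) == keylen && strncmp(*p, key, keylen) == 0) {
                        l->last_value = atoi(value);
                        return 0;
                }
        }
        return -ENOSYS;
}

/* Offer "key=value" to each layer from the top down and stop at the
 * first layer that recognizes the key. */
static int stack_process_param(struct layer *top, const char *param)
{
        const char *eq;
        struct layer *l;

        assert(top != NULL && param != NULL);
        eq = strchr(param, '=');
        if (eq == NULL)
                return -EINVAL;

        for (l = top; l != NULL; l = l->next) {
                int rc = layer_set(l, param, (size_t)(eq - param), eq + 1);

                if (rc != -ENOSYS)
                        return rc;
        }
        return -ENOSYS;
}

/* Key names are taken from this ticket; the split between layers is
 * the one described in the comments above. */
static const char *const ofd_params[] = { "client_cache_seconds", NULL };
static const char *const osd_params[] = { "writethrough_cache_enable", NULL };

static struct layer osd_layer = { "osd", osd_params, -1, NULL };
static struct layer ofd_layer = { "ofd", ofd_params, -1, &osd_layer };
```

In this model, stack_process_param(&ofd_layer, "writethrough_cache_enable=0") is rejected by the ofd layer, accepted by the osd layer, and returns 0. Neither the parent proc entry nor the symlink target is needed.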

Comment by James A Simmons [ 12/Nov/13 ]

Sorry I have been busy with LU-3373/LU-3319 work. Sounds like you have a better solution than what I have. I look forward to applying your solution to my work.

Comment by Emoly Liu [ 12/Nov/13 ]

Patch tracking at: http://review.whamcloud.com/8238

Comment by James A Simmons [ 13/Nov/13 ]

Testing with your patch, I'm getting:

[root@spoon45 tests]# lctl conf_param lustre-OST0000.ost.writethrough_cache_enable=1
[root@spoon45 tests]# dmesg
[ 1383.895301] Lustre: Modifying parameter lustre-OST0000.ost.writethrough_cache_enable in log lustre-OST0000
[ 1392.871703] LustreError: 20254:0:(obd_config.c:1341:class_process_proc_param()) lustre-OST0000: unknown param writethrough_cache_enable=1
[root@spoon45 tests]# cat /proc/fs/lustre/obdfilter/lustre-OST0000/writethrough_cache_enable
1

It's doing the right thing, but I see an error reported in dmesg. Other than that, the patch fixes this problem.

Comment by Andreas Dilger [ 13/Nov/13 ]

I guess the obdfilter code is complaining because it is processing the "ost" parameter, but the named parameter doesn't exist. This is desirable under normal usage.

It probably makes sense to add an explicit skip of this parameter name in the obdfilter config processing code, if that is possible. There may be a couple of other parameters that moved from obdfilter to osd that should also be skipped in obdfilter.
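One way to picture the explicit skip suggested here is a small list of keys that are known to have moved to the osd, consulted before an "unknown param" message is printed. This is a minimal sketch under stated assumptions: moved_to_osd, param_moved_to_osd() and should_warn_unknown() are hypothetical names, and only the parameter named in this ticket is listed, since any further entries would be guesses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Keys that moved from obdfilter to the osd. Only the name from this
 * ticket is listed; other moved parameters would have to be added. */
static const char *const moved_to_osd[] = {
        "writethrough_cache_enable",
        NULL
};

/* Return 1 if the key of "key=value" is on the moved list, else 0. */
static int param_moved_to_osd(const char *param)
{
        const char *const *p;
        size_t keylen;

        assert(param != NULL);
        keylen = strcspn(param, "=");   /* length of the key part */
        for (p = moved_to_osd; *p != NULL; p++) {
                if (strlen(*p) == keylen && strncmp(*p, param, keylen) == 0)
                        return 1;
        }
        return 0;
}

/* Decide whether a key deserves an "unknown param" message at this
 * layer: only when it was not matched here and is not expected to be
 * handled by the lower layer. */
static int should_warn_unknown(const char *param, int matched)
{
        return !matched && !param_moved_to_osd(param);
}
```

The design choice is that truly unknown names still produce a message, while the moved parameter stays silent at the obdfilter level.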

Comment by Emoly Liu [ 14/Nov/13 ]

Yes, James, this error was reported because obdfilter didn't match the parameter in its static lproc var list and then passed the parameter on to the next module, osd.

Andreas, IMHO, I would prefer to pass the parent entry and get the whole list of sub-entries, but it's not so easy to follow the link to get the target.

Comment by James A Simmons [ 22/Nov/13 ]

I tested the latest patch and it works as expected. No more error messages either. Thanks.

Comment by Andreas Dilger [ 25/Nov/13 ]

James, the uncertainty I have is that I think this patch will allow this specific parameter to be set, but it will result in warnings for every other OFD parameter that is not in the OSD. If I'm incorrect in this understanding, then I'm happier to land the patch.

Comment by Emoly Liu [ 25/Nov/13 ]

Andreas, this patch won't result in warnings for every other OFD parameter that is not in the OSD, because the warnings only happen when the parameter isn't matched. But as you said in http://review.whamcloud.com/#/c/8238/3/lustre/obdclass/obd_config.c , this patch will prevent any OFD parameter from ever getting a warning.

I will improve the patch and add a test case.

Comment by Emoly Liu [ 02/Jan/14 ]

The patch for b2_6 has landed.

A backport for b2_5 is at http://review.whamcloud.com/#/c/8618/ .

Comment by Peter Jones [ 02/Jan/14 ]

Landed for 2.6. Should land for 2.5.1 shortly.
