[LU-13030] pcc: auto_attach doesn't work after client cache cleared Created: 28/Nov/19 Updated: 05/Dec/19 Resolved: 05/Dec/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara | Assignee: | Qian Yingjin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | PCC | ||
| Environment: |
pcc |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The PCC auto_attach feature doesn't work: files are not re-attached after caches are cleared on the client.
Enable HSM
[root@c01 ~]# clush -w mds[01-02] lctl set_param mdt.*.hsm_control=enabled
mds01: mdt.ai400-MDT0000.hsm_control=enabled
mds02: mdt.ai400-MDT0001.hsm_control=enabled
Create PCC cache space
[root@c01 ~]# mkfs.ext4 /dev/sdb
[root@c01 ~]# mount /dev/sdb /pcc
Start HSM copytool
[root@c01 ~]# lhsmtool_posix --daemon --hsm-root /pcc --archive=1 /ai400
[root@c01 ~]# ps -ef | grep hsm
root 2872 1 0 15:09 ? 00:00:00 lhsmtool_posix --daemon --hsm-root /pcc --archive=1 /ai400
root 2874 2340 0 15:09 pts/0 00:00:00 grep --color=auto hsm
Enable PCC with projid=100
[root@c01 ~]# lctl pcc add /ai400 /pcc -p "projid={100} auto_attach=1 rwid=1"
[root@c01 ~]# lctl pcc list /ai400
/pcc:
rwid: 1
flags: 1f
autocache: projid={100}
Reproducer
[root@c01 ~]# mkdir /ai400/lpcc
[root@c01 ~]# lfs project -sp 100 /ai400/lpcc/
[root@c01 ~]# echo "QQQ" > /ai400/lpcc/test
[root@c01 ~]# lfs pcc state /ai400/lpcc/test
file: /ai400/lpcc/test, type: readwrite, PCC file: /0082/0000/0403/0000/0002/0000/0x200000403:0x82:0x0, user number: 0, flags: 1c
[root@c01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@c01 ~]# lfs pcc state /ai400/lpcc/test
file: /ai400/lpcc/test, type: none
[root@c01 ~]# lfs pcc state /ai400/lpcc/test
file: /ai400/lpcc/test, type: none
After dropping caches on the client, 'lfs pcc state' should trigger auto_attach, but it doesn't. |
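The check in the reproducer can be scripted. The following is a minimal sketch, assuming the `lfs pcc state` output format shown in this ticket (one `file: <path>, type: <type>` record per line); the function name `pcc_detached_files` is hypothetical and not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): reads `lfs pcc state` output on
# stdin and prints the path of every file reported as "type: none".
# Assumes one "file: <path>, type: <type>[, ...]" record per line, as in the
# transcripts in this ticket.
pcc_detached_files() {
    sed -n 's/^file: \(.*\), type: none$/\1/p'
}
```

For example, `lfs pcc state /ai400/lpcc/test | pcc_detached_files` would print the path only while the file is detached from PCC.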
| Comments |
| Comment by Gerrit Updater [ 28/Nov/19 ] |
|
Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/36892 |
| Comment by Gerrit Updater [ 03/Dec/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36892/ |
| Comment by Peter Jones [ 03/Dec/19 ] |
|
Landed for 2.13 |
| Comment by Shuichi Ihara [ 04/Dec/19 ] |
|
The patch does not seem to be enough: the behavior is still strange when "drop caches" is triggered multiple times on the client. Here is a case.
# /work/tools/bin/fio -name=randread -ioengine=sync -rw=randread -blocksize=4k -iodepth=1 -direct=1 -size=100m -runtime=10 -numjobs=8 -group_reporting -directory=/ai400/proj100 -create_serialize=0 -filename_format='f.$jobnum.$filenum'
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: readwrite, PCC file: /0008/0000/0401/0000/0002/0000/0x200000401:0x8:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.1.0, type: readwrite, PCC file: /0006/0000/0401/0000/0002/0000/0x200000401:0x6:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.2.0, type: readwrite, PCC file: /0003/0000/0401/0000/0002/0000/0x200000401:0x3:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.3.0, type: readwrite, PCC file: /0005/0000/0401/0000/0002/0000/0x200000401:0x5:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.4.0, type: readwrite, PCC file: /0004/0000/0401/0000/0002/0000/0x200000401:0x4:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.5.0, type: readwrite, PCC file: /0007/0000/0401/0000/0002/0000/0x200000401:0x7:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.6.0, type: readwrite, PCC file: /0009/0000/0401/0000/0002/0000/0x200000401:0x9:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.7.0, type: readwrite, PCC file: /000a/0000/0401/0000/0002/0000/0x200000401:0xa:0x0, user number: 0, flags: 0
The state is fine here.
[root@c01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: readwrite, PCC file: /0008/0000/0401/0000/0002/0000/0x200000401:0x8:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.1.0, type: readwrite, PCC file: /0006/0000/0401/0000/0002/0000/0x200000401:0x6:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.2.0, type: readwrite, PCC file: /0003/0000/0401/0000/0002/0000/0x200000401:0x3:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.3.0, type: readwrite, PCC file: /0005/0000/0401/0000/0002/0000/0x200000401:0x5:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.4.0, type: readwrite, PCC file: /0004/0000/0401/0000/0002/0000/0x200000401:0x4:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.5.0, type: readwrite, PCC file: /0007/0000/0401/0000/0002/0000/0x200000401:0x7:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.6.0, type: readwrite, PCC file: /0009/0000/0401/0000/0002/0000/0x200000401:0x9:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.7.0, type: readwrite, PCC file: /000a/0000/0401/0000/0002/0000/0x200000401:0xa:0x0, user number: 0, flags: 0
The state is still fine after the first cache drop. However, if caches are dropped on the client again, all states become none and the files are not re-attached even after the 'lfs pcc state' command.
[root@c01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: none
file: /ai400/proj100/f.1.0, type: none
file: /ai400/proj100/f.2.0, type: none
file: /ai400/proj100/f.3.0, type: none
file: /ai400/proj100/f.4.0, type: none
file: /ai400/proj100/f.5.0, type: none
file: /ai400/proj100/f.6.0, type: none
file: /ai400/proj100/f.7.0, type: none
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: none
file: /ai400/proj100/f.1.0, type: none
file: /ai400/proj100/f.2.0, type: none
file: /ai400/proj100/f.3.0, type: none
file: /ai400/proj100/f.4.0, type: none
file: /ai400/proj100/f.5.0, type: none
file: /ai400/proj100/f.6.0, type: none
file: /ai400/proj100/f.7.0, type: none
The server and client are running the RC2 build.
[root@c01 ~]# clush -w c01,es400nv-vm1 lctl get_param version
es400nv-vm1: version=2.13.0_RC2
c01: version=2.13.0_RC2 |
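The multi-drop case above (drop caches, query the state, repeat) could be automated along these lines. This is only a sketch under stated assumptions: it must run as root on a Lustre client with PCC configured, it relies on the `lfs pcc state` output format shown in this ticket, and the names `drop_caches` and `pcc_drop_cycles` are hypothetical helpers, not Lustre commands.

```shell
#!/bin/sh
# Hypothetical reproducer loop (a sketch, not a Lustre test script).

# Drop the page cache, dentries and inodes on this client (needs root).
drop_caches() {
    echo 3 > /proc/sys/vm/drop_caches
}

# Usage: pcc_drop_cycles <cycles> <file>...
# After each cache drop, query the PCC state once and fail on the first
# cycle in which any file is still reported as "type: none".
pcc_drop_cycles() {
    cycles=$1
    shift
    i=1
    while [ "$i" -le "$cycles" ]; do
        drop_caches
        if lfs pcc state "$@" | grep -q 'type: none$'; then
            echo "cycle $i: file(s) not re-attached" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $cycles cycles re-attached"
}
```

With the behavior reported in this comment, `pcc_drop_cycles 2 /ai400/proj100/f.*` would be expected to fail on the second cycle.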
| Comment by Gerrit Updater [ 04/Dec/19 ] |
|
Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/36923 |
| Comment by Shuichi Ihara [ 04/Dec/19 ] |
|
Yingjin found the root cause and advised the following fix.
diff --git a/lustre/llite/llite_lib.c b/lustre/llite/llite_lib.c
index 36e34ed092..54a0db1fb6 100644
--- a/lustre/llite/llite_lib.c
+++ b/lustre/llite/llite_lib.c
@@ -1007,7 +1007,7 @@ void ll_lli_init(struct ll_inode_info *lli)
 	mutex_init(&lli->lli_pcc_lock);
 	lli->lli_pcc_state = PCC_STATE_FL_NONE;
 	lli->lli_pcc_inode = NULL;
-	lli->lli_pcc_dsflags = PCC_DATASET_NONE;
+	lli->lli_pcc_dsflags = PCC_DATASET_INVALID;
 	lli->lli_pcc_generation = 0;
 	mutex_init(&lli->lli_group_mutex);
 	lli->lli_group_users = 0;
I've just confirmed that this fix finally solves the original problem. Here are the results: even after multiple cache drops on the client, the PCC state comes back automatically after the "lfs pcc state" command.
# /work/tools/bin/fio -name=randread -ioengine=sync -rw=randread -blocksize=4k -iodepth=1 -direct=1 -size=100m -runtime=10 -numjobs=8 -group_reporting -directory=/ai400/proj100 -create_serialize=0 -filename_format='f.$jobnum.$filenum'
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: readwrite, PCC file: /0009/0000/0401/0000/0002/0000/0x200000401:0x9:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.1.0, type: readwrite, PCC file: /000a/0000/0401/0000/0002/0000/0x200000401:0xa:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.2.0, type: readwrite, PCC file: /0003/0000/0401/0000/0002/0000/0x200000401:0x3:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.3.0, type: readwrite, PCC file: /0004/0000/0401/0000/0002/0000/0x200000401:0x4:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.4.0, type: readwrite, PCC file: /0008/0000/0401/0000/0002/0000/0x200000401:0x8:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.5.0, type: readwrite, PCC file: /0005/0000/0401/0000/0002/0000/0x200000401:0x5:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.6.0, type: readwrite, PCC file: /0006/0000/0401/0000/0002/0000/0x200000401:0x6:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.7.0, type: readwrite, PCC file: /0007/0000/0401/0000/0002/0000/0x200000401:0x7:0x0, user number: 0, flags: 0
[root@c01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: readwrite, PCC file: /0009/0000/0401/0000/0002/0000/0x200000401:0x9:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.1.0, type: readwrite, PCC file: /000a/0000/0401/0000/0002/0000/0x200000401:0xa:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.2.0, type: readwrite, PCC file: /0003/0000/0401/0000/0002/0000/0x200000401:0x3:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.3.0, type: readwrite, PCC file: /0004/0000/0401/0000/0002/0000/0x200000401:0x4:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.4.0, type: readwrite, PCC file: /0008/0000/0401/0000/0002/0000/0x200000401:0x8:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.5.0, type: readwrite, PCC file: /0005/0000/0401/0000/0002/0000/0x200000401:0x5:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.6.0, type: readwrite, PCC file: /0006/0000/0401/0000/0002/0000/0x200000401:0x6:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.7.0, type: readwrite, PCC file: /0007/0000/0401/0000/0002/0000/0x200000401:0x7:0x0, user number: 0, flags: 0
[root@c01 ~]# echo 3 > /proc/sys/vm/drop_caches
[root@c01 ~]# lfs pcc state /ai400/proj100/f.*
file: /ai400/proj100/f.0.0, type: readwrite, PCC file: /0009/0000/0401/0000/0002/0000/0x200000401:0x9:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.1.0, type: readwrite, PCC file: /000a/0000/0401/0000/0002/0000/0x200000401:0xa:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.2.0, type: readwrite, PCC file: /0003/0000/0401/0000/0002/0000/0x200000401:0x3:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.3.0, type: readwrite, PCC file: /0004/0000/0401/0000/0002/0000/0x200000401:0x4:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.4.0, type: readwrite, PCC file: /0008/0000/0401/0000/0002/0000/0x200000401:0x8:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.5.0, type: readwrite, PCC file: /0005/0000/0401/0000/0002/0000/0x200000401:0x5:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.6.0, type: readwrite, PCC file: /0006/0000/0401/0000/0002/0000/0x200000401:0x6:0x0, user number: 0, flags: 0
file: /ai400/proj100/f.7.0, type: readwrite, PCC file: /0007/0000/0401/0000/0002/0000/0x200000401:0x7:0x0, user number: 0, flags: 0
Thanks, Yingjin, for your prompt fix! |
| Comment by Gerrit Updater [ 05/Dec/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36923/ |
| Comment by Peter Jones [ 05/Dec/19 ] |
|
Landed for 2.13 |