[LU-15706] "lctl pool_destroy" can't work correctly due to "SKIP" records Created: 30/Mar/22 Updated: 20/Oct/23 Resolved: 27/Jun/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.2 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Emoly Liu | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
"lctl pool_destroy" can't work correctly when MGS and MDT0 is from different node. Then I did the following test on two nodes(centos7-2 and centos7-4) and found it failed because it didn't process "SKIP" pool llog records properly. centos7-2 (MDT+2OST+client): [root@centos7-2 tests]# MGSDEV=/tmp/lustre-mgs mgs_HOST=centos7-4 mdt_HOST=centos7-2 ost1_HOST=centos7-2 ost2_HOST=centos7-2 PDSH="pdsh -S -Rssh -w" sh llmount.sh centos7-4(MGS): [root@centos7-4 tests]# lctl pool_new lustre.testpool Pool lustre.testpool created [root@centos7-4 tests]# lctl pool_add lustre.testpool OST0000 OST lustre-OST0000_UUID added to pool lustre.testpool [root@centos7-4 tests]# lctl pool_add lustre.testpool OST0001 OST lustre-OST0001_UUID added to pool lustre.testpool [root@centos7-4 tests]# lctl pool_destroy lustre.testpool Pool lustre.testpool not empty, please remove all members pool_destroy: Directory not empty [root@centos7-4 tests]# lctl pool_remove lustre.testpool OST0000 OST lustre-OST0000_UUID removed from pool lustre.testpool [root@centos7-4 tests]# lctl pool_destroy lustre.testpool Pool lustre.testpool destroyed "testpool" was destroyed wrongly even OST0001 was still in the pool. After checking the llog records, I found "SKIP pool add 0:lustre-clilov 1:lustre 2:testpool 3:lustre-OST0000_UUID" was skipped. #43 (112)SKIP pool new 0:lustre-clilov 1:lustre 2:testpool #44 (224)SKIP END marker 29 (flags=0x06, v2.15.0.0) lustre-clilov 'new lustre.testpool' Wed Mar 30 11:02:17 2022-Wed Mar 30 11:02:51 2022 #45 (224)SKIP START marker 31 (flags=0x05, v2.15.0.0) lustre-clilov 'add lustre.testpool.lustre-OST0000_UUID' Wed Mar 30 11:02:28 2022-Wed Mar 30 11:02:49 2022 #46 (136)SKIP pool add 0:lustre-clilov 1:lustre 2:testpool 3:lustre-OST0000_UUID #48 (224)SKIP END marker 31 (flags=0x06, v2.15.0.0) lustre-clilov 'add lustre.testpool.lustre-OST0000_UUID' Wed Mar 30 11:02:28 2022-Wed Mar 30 11:02:49 2022 #49 (224)marker 33 (flags=0x01, v2.15.0.0) lustre-clilov 'add lustre.testpool.lustre-OST0001_UUID' Wed Mar 30 11:02:30 2022- #50 (136)pool add 0:lustre-clilov 1:lustre 2:testpool 3:lustre-OST0001_UUID #51 (224)END marker 33 (flags=0x02, v2.15.0.0) lustre-clilov 'add lustre.testpool.lustre-OST0001_UUID' Wed Mar 30 11:02:30 2022- #52 (224)marker 35 (flags=0x01, v2.15.0.0) lustre-clilov 'rem lustre.testpool.lustre-OST0000_UUID' Wed Mar 30 11:02:49 2022- #53 (136)pool remove 0:lustre-clilov 1:lustre 2:testpool 3:lustre-OST0000_UUID #54 (224)END marker 35 (flags=0x02, v2.15.0.0) lustre-clilov 'rem lustre.testpool.lustre-OST0000_UUID' Wed Mar 30 11:02:49 2022- #55 (224)marker 37 (flags=0x01, v2.15.0.0) lustre-clilov 'del lustre.testpool' Wed Mar 30 11:02:51 2022- #56 (112)pool destroy 0:lustre-clilov 1:lustre 2:testpool #57 (224)END marker 37 (flags=0x02, v2.15.0.0) lustre-clilov 'del lustre.testpool' Wed Mar 30 11:02:51 2022- I will look into this issue. |
| Comments |
| Comment by Gerrit Updater [ 30/Mar/22 ] |
|
"Emoly Liu <emoly@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46951 |
| Comment by Emoly Liu [ 01/Apr/22 ] |
|
During Maloo testing and my local testing, I also noticed another issue: if jt_llog_print_iter() needs more than one pass of its for loop to fetch all of the llog records, the record at the "boundary" between two requests is easily mishandled by llog_print_cb()->class_config_yaml_output(), i.e. it is skipped when it should not be, or not skipped when it should be.

For example, the failure at https://testing.whamcloud.com/test_logs/ca4871c9-d298-4a2a-99f6-7449aa6fb803/show_text says "trevis-39vm5: OST lustre-OST0006_UUID is not part of the 'lustre' fs.". I can reproduce this issue locally, and the reason is as follows. The original llog records are:

```
#23 (128)attach 0:lustre-OST0000-osc 1:osc 2:lustre-clilov_UUID
#24 (144)setup 0:lustre-OST0000-osc 1:lustre-OST0000_UUID 2:192.168.0.118@tcp
#25 (128)lov_modify_tgts add 0:lustre-clilov 1:lustre-OST0000_UUID 2:0 3:1
```

But the output of class_config_yaml_output() shows that the 24th and 25th records were wrongly skipped, which is why OST0000 cannot be found:

```
- { index: 22, event: add_uuid, nid: 192.168.0.118@tcp(0x20000c0a80076), node: 192.168.0.118@tcp }
- { index: 23, event: attach, device: lustre-OST0000-osc, type: osc, UUID: lustre-clilov_UUID }
======BOUNDARY======
- { index: 31, event: add_uuid, nid: 192.168.0.118@tcp(0x20000c0a80076), node: 192.168.0.118@tcp }
- { index: 32, event: attach, device: lustre-OST0001-osc, type: osc, UUID: lustre-clilov_UUID }
```

In addition, ost-pools.sh test_23a failed with "Pool lustre.testpool not empty, please remove all members / pool_destroy: Directory not empty", because the 1412th record was not skipped correctly. The original llog records output by debugfs are:

```
#1412 (136)SKIP pool add 0:lustre-clilov 1:lustre 2:testpool 3:lustre-OST0003_UUID
#1413 (224)SKIP END marker 939 (flags=0x06, v2.15.0.0) lustre-clilov 'add lustre.testpool.lustre-OST0003_UUID' Fri Apr 1 08:47:01 2022-Fri Apr 1 08:47:02 2022
```

But class_config_yaml_output() did not mark this record as SKIP, so OST0003 was wrongly considered to still be in testpool:

```
=====BOUNDARY=====
- { index: 1412, event: add_pool, device: lustre-clilov, fsname: lustre, pool: testpool, ost: lustre-OST0003_UUID }
- { index: 1428, event: remove_pool, device: lustre-clilov, fsname: lustre, pool: testpool, ost: lustre-OST0001_UUID }
```

I will look into the following part of class_config_yaml_output() and make a fix:

```c
if (lcfg->lcfg_command == LCFG_MARKER) {
        struct cfg_marker *marker = lustre_cfg_buf(lcfg, 1);

        lustre_swab_cfg_marker(marker, swab, LUSTRE_CFG_BUFLEN(lcfg, 1));
        if (marker->cm_flags & CM_START) {
                *cfg_flags = CFG_F_MARKER;
                if (marker->cm_flags & CM_SKIP)
                        *cfg_flags = CFG_F_SKIP;
        } else if (marker->cm_flags & CM_END) {
                *cfg_flags = 0;
        }
        if (likely(!raw))
                return 0;
}

/* entries outside marker are skipped */
if (!(*cfg_flags & CFG_F_MARKER) && !raw)
        return 0;

/* inside skipped marker */
if (*cfg_flags & CFG_F_SKIP && !raw)
        return 0;
```
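To illustrate the boundary effect, here is a small self-contained sketch (illustrative code only; classify() and the flag handling merely approximate the logic above): when the marker state is rebuilt from zero for every request, a payload record whose CM_START marker was returned by the previous request is treated as being outside any marker, which matches the dropped setup/lov_modify_tgts records above.

```c
/*
 * Self-contained sketch of the boundary problem (illustrative code only;
 * classify() merely approximates what class_config_yaml_output() does).
 * When the marker state is rebuilt from zero for every request, a payload
 * record whose CM_START marker was returned by the previous request is
 * classified as "outside any marker".
 */
#include <stdio.h>

#define CM_START        0x01
#define CM_END          0x02
#define CM_SKIP         0x04

#define CFG_F_MARKER    0x01
#define CFG_F_SKIP      0x02

/* marker flags per record; 0 stands for a payload record such as "setup",
 * "lov_modify_tgts" or "pool add" */
static const unsigned int stream[] = {
        CM_START | CM_SKIP,     /* cancelled block starts          */
        0,                      /* payload under the SKIP marker   */
        CM_END | CM_SKIP,       /* cancelled block ends            */
        CM_START,               /* live block starts               */
        0,                      /* payload that must be printed    */
        CM_END,                 /* live block ends                 */
};

static void classify(int batch_size)
{
        int nrec = sizeof(stream) / sizeof(stream[0]);

        printf("batch size %d:\n", batch_size);
        for (int start = 0; start < nrec; start += batch_size) {
                unsigned int cfg_flags = 0;     /* state lost at each boundary */

                for (int i = start; i < nrec && i < start + batch_size; i++) {
                        if (stream[i] & CM_START)
                                cfg_flags = CFG_F_MARKER |
                                        ((stream[i] & CM_SKIP) ? CFG_F_SKIP : 0);
                        else if (stream[i] & CM_END)
                                cfg_flags = 0;
                        else if (!(cfg_flags & CFG_F_MARKER))
                                printf("  record #%d dropped (no marker seen)\n", i);
                        else if (cfg_flags & CFG_F_SKIP)
                                printf("  record #%d labeled SKIP\n", i);
                        else
                                printf("  record #%d printed\n", i);
                }
        }
}

int main(void)
{
        classify(6);    /* whole log in one request: #1 SKIP, #4 printed   */
        classify(2);    /* #4 is wrongly dropped because its CM_START
                         * marker was returned in the previous request     */
        return 0;
}
```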
|
| Comment by Emoly Liu [ 06/Apr/22 ] |
|
The root cause of this issue is that if the first records returned in one ioctl request are ones "protected" by a marker whose START record was handled in a previous request, their "SKIP or not" state is unknown, so they get mislabeled in class_config_yaml_output(). I'm trying to remember the flag in llog_print_cb() to fix this issue.
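A minimal self-contained sketch of what "remembering the flag" across requests could look like, using hypothetical names (struct print_iter_state, print_batch()) rather than the actual llog_print_cb() change: the state observed at the end of one request seeds the next, so the first records of a batch inherit the right SKIP label.

```c
/*
 * Sketch of the "remember the flag" idea with hypothetical names
 * (struct print_iter_state, print_batch()); this is not the patch that was
 * merged, only an illustration of the approach: the marker/skip state seen
 * at the end of one request seeds the next one, so records whose enclosing
 * marker started in an earlier request are still labeled correctly.
 */
#include <stdio.h>

#define CM_START        0x01
#define CM_END          0x02
#define CM_SKIP         0x04

#define CFG_F_MARKER    0x01
#define CFG_F_SKIP      0x02

struct print_iter_state {
        unsigned int pis_cfg_flags;     /* carried over between requests */
};

static void print_batch(struct print_iter_state *st,
                        const unsigned int *recs, int start, int count)
{
        unsigned int cfg_flags = st->pis_cfg_flags;     /* seed, don't reset */

        for (int i = start; i < start + count; i++) {
                if (recs[i] & CM_START)
                        cfg_flags = CFG_F_MARKER |
                                ((recs[i] & CM_SKIP) ? CFG_F_SKIP : 0);
                else if (recs[i] & CM_END)
                        cfg_flags = 0;
                else if (!(cfg_flags & CFG_F_MARKER) ||
                         (cfg_flags & CFG_F_SKIP))
                        printf("record #%d skipped\n", i);
                else
                        printf("record #%d printed\n", i);
        }

        st->pis_cfg_flags = cfg_flags;  /* remember for the next request */
}

int main(void)
{
        /* same record stream as the previous sketch */
        const unsigned int recs[] = {
                CM_START | CM_SKIP, 0, CM_END | CM_SKIP, CM_START, 0, CM_END,
        };
        struct print_iter_state st = { 0 };

        /* two records per request: record #4 is printed even though its
         * CM_START marker was handled in the previous request, and record
         * #1 under the SKIP marker stays skipped */
        for (int start = 0; start < 6; start += 2)
                print_batch(&st, recs, start, 2);
        return 0;
}
```
|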
| Comment by Gerrit Updater [ 27/Jun/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46951/ |
| Comment by Peter Jones [ 27/Jun/22 ] |
|
Landed for 2.16 |
| Comment by Gerrit Updater [ 24/Nov/22 ] |
|
"Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49230 |
| Comment by Gerrit Updater [ 06/Dec/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49230/ |
| Comment by Gerrit Updater [ 20/Oct/23 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52773 |