[LU-17033] Add RCU protect for export nid operation Created: 16/Aug/23 Updated: 23/Sep/23 Resolved: 23/Sep/23 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Minor |
| Reporter: | Yang Sheng | Assignee: | Yang Sheng |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
|
Several crashes relate to exp_nid_hash. It looks like the table was operated on without RCU protection.
[ 257.896656] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e2
[ 257.897791] IP: [<ffffffffc0cf1eb0>] ldebugfs_rhash_seq_show+0xa0/0x1e0 [obdclass]
[ 257.898814] PGD 21c80e0067 PUD 21bab0c067 PMD 0
[ 257.899472] Oops: 0000 [#1] SMP
[ 257.914018] CPU: 9 PID: 13241 Comm: lctl Kdump: loaded Tainted: G OE ------------ T 3.10.0-1160.95.1.el7_lustre.ddn17.x86_64 #1
[ 257.915601] Hardware name: DDN SFA400NVX2E, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 257.916811] task: ffffa1678707d280 ti: ffffa168c6f54000 task.ti: ffffa168c6f54000
[ 257.917773] RIP: 0010:[<ffffffffc0cf1eb0>] [<ffffffffc0cf1eb0>] ldebugfs_rhash_seq_show+0xa0/0x1e0 [obdclass]
[ 257.919093] RSP: 0018:ffffa168c6f57d78 EFLAGS: 00010246
[ 257.944326] Call Trace:
[ 257.945836] [<ffffffff8c084e93>] ? seq_printf+0x53/0x80
[ 257.947705] [<ffffffffc0cf20b0>] lprocfs_hash_seq_show+0x60/0x90 [obdclass]
[ 257.949770] [<ffffffffc15ff862>] mgs_hash_seq_show+0x12/0x20 [mgs]
[ 257.951731] [<ffffffff8c0857f8>] seq_read+0x138/0x460
[ 257.953549] [<ffffffff8c0d7ad0>] proc_reg_read+0x40/0x80
[ 257.955357] [<ffffffff8c05bb2f>] vfs_read+0x9f/0x170
[ 257.957088] [<ffffffff8c05c9a5>] SyS_read+0x55/0xd0
[ 257.958780] [<ffffffff8c5c639a>] system_call_fastpath+0x25/0x2a
.....
[ 8320.870019] BUG: unable to handle kernel NULL pointer dereference at 00000000000001ca
[ 8320.872531] IP: [<ffffffff98db7459>] rht_deferred_worker+0x209/0x430
[ 8320.874773] PGD 0
[ 8320.876458] Oops: 0000 [#1] SMP
[ 8320.904160] CPU: 13 PID: 3272 Comm: kworker/13:1 Kdump: loaded Tainted: G OE ------------ T 3.10.0-1160.88.1.el7_lustre.ddn17.x86_64 #1
[ 8320.907100] Hardware name: DDN SFA400NVX2E, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 8320.909544] Workqueue: events rht_deferred_worker
[ 8320.911387] task: ffff89c6dfdb3180 ti: ffff89e8c3994000 task.ti: ffff89e8c3994000
[ 8320.913572] RIP: 0010:[<ffffffff98db7459>] [<ffffffff98db7459>] rht_deferred_worker+0x209/0x430
[ 8320.939508] Call Trace:
[ 8320.940810] [<ffffffff98ac32ef>] process_one_work+0x17f/0x440
[ 8320.942542] [<ffffffff98ac4436>] worker_thread+0x126/0x3c0
[ 8320.944188] [<ffffffff98ac4310>] ? manage_workers.isra.26+0x2b0/0x2b0
[ 8320.946001] [<ffffffff98acb621>] kthread+0xd1/0xe0
[ 8320.947555] [<ffffffff98acb550>] ? insert_kthread_work+0x40/0x40
[ 8320.949308] [<ffffffff991c61dd>] ret_from_fork_nospec_begin+0x7/0x21
[ 8320.951057] [<ffffffff98acb550>] ? insert_kthread_work+0x40/0x40 |
| Comments |
| Comment by Gerrit Updater [ 16/Aug/23 ] |
|
"Yang Sheng <ys@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51957 |
| Comment by Yang Sheng [ 16/Aug/23 ] |
|
Hi Neil, as you asked, here is the faddr2line result:
VMrhel7# LANG=C bash ~/git/linux/scripts/faddr2line vmlinux rht_deferred_worker+0x209/0x430
rht_deferred_worker+0x209/0x430:
rhashtable_rehash_one at lib/rhashtable.c:275
(inlined by) rhashtable_rehash_chain at lib/rhashtable.c:315
(inlined by) rhashtable_rehash_table at lib/rhashtable.c:363
(inlined by) rht_deferred_worker at lib/rhashtable.c:464
.........
	rht_for_each(entry, old_tbl, old_hash) {
		err = 0;
		next = rht_dereference_bucket(entry->next, old_tbl, old_hash); <<<--------
		if (rht_is_a_nulls(next))
			break;
		pprev = &entry->next;
	}
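For context on the loop above: an rhashtable chain does not end in NULL but in a "nulls" marker, an odd value that encodes the bucket number. The following is a minimal userspace sketch of that convention, not the kernel code. The `model_*` names are hypothetical stand-ins, and the marker layout `(hash << 1) | 1` assumes a nulls base of 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct rhash_head (not kernel code). */
struct model_head {
	struct model_head *next;
};

/* A chain end is an odd value: bit 0 set means "nulls marker", not a pointer. */
static int model_is_a_nulls(const struct model_head *p)
{
	return ((uintptr_t)p & 1UL) != 0;
}

/* Marker for bucket number `hash`, assuming a nulls base of 0. */
static struct model_head *model_nulls_marker(unsigned long hash)
{
	return (struct model_head *)(uintptr_t)((hash << 1) | 1UL);
}

/* Walk a chain and count its entries, stopping at the nulls marker. */
static size_t model_chain_len(const struct model_head *head)
{
	size_t n = 0;

	while (!model_is_a_nulls(head)) {
		n++;
		head = head->next; /* faults if `head` is a bogus even value */
	}
	return n;
}
```

A value such as 0x142 has bit 0 clear, so a check of this kind treats it as a pointer and the next step of the walk dereferences it.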
The main problem is shown below. For this stack:
[ 471.820067] BUG: unable to handle kernel NULL pointer dereference at 0000000000000142
[ 471.822528] IP: [<ffffffffa07b7536>] rht_deferred_worker+0x226/0x430
[ 471.851583] CPU: 23 PID: 316 Comm: kworker/23:2 Kdump: loaded Tainted: G OE ------------ T 3.10.0-1160.95.1.el7_lustre.ddn17.x86_64 #1
[ 471.854631] Hardware name: DDN SFA400NVX2E, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 471.857301] Workqueue: events rht_deferred_worker
[ 471.859330] task: ffff9ed3a5770000 ti: ffff9ed3b6960000 task.ti: ffff9ed3b6960000
[ 471.861664] RIP: 0010:[<ffffffffa07b7536>] [<ffffffffa07b7536>] rht_deferred_worker+0x226/0x430
[ 471.864180] RSP: 0018:ffff9ed3b6963da0 EFLAGS: 00010246
[ 471.866235] RAX: ffff9ed3e63944b8 RBX: 0000000000000142 RCX: 0000000000000000
[ 471.868508] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ed3d8c46c8c
[ 471.870756] RBP: ffff9ed3b6963e18 R08: ffff9ed5d67608b0 R09: 0000000000000598
[ 471.872993] R10: 00000000a77101d6 R11: 00000000c7893a1b R12: 0000000000000139
[ 471.875213] R13: ffff9ed494bbe000 R14: ffff9ed3e63944b8 R15: ffff9ed457ea2498
[ 471.877431] FS: 0000000000000000(0000) GS:ffff9ed6315c0000(0000) knlGS:0000000000000000
[ 471.879730] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 471.881757] CR2: 0000000000000142 CR3: 00000023001ea000 CR4: 0000000000760fe0
[ 471.883901] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 471.886028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 471.888141] PKRU: 00000000
[ 471.889741] Call Trace:
The table is exp_nid_hash:
crash> bucket_table 0xffff9ed3f1c73000
struct bucket_table {
size = 256,
nest = 0,
rehash = 163,
hash_rnd = 2859063006,
locks_mask = 127,
locks = 0xffff9ed3d8c46c00,
walkers = {
next = 0xffff9ed3f1c73020,
prev = 0xffff9ed3f1c73020
},
rcu = {
next = 0x0,
func = 0x0
},
future_tbl = 0xffff9ed494bbe000,
buckets = 0xffff9ed3f1c73080
}
Then look into the buckets:
.......
ffff9ed3f1c73580: 0000000000000141 0000000000000143 A.......C.......
ffff9ed3f1c73590: 0000000000000145 ffff9ed3e63944b8 <<<<------ E........D9.....
ffff9ed3f1c735a0: 0000000000000149 000000000000014b I.......K.......
crash> rd ffff9ed3e63944b8
ffff9ed3e63944b8: 0000000000000142 <<<<----- this should be 0000000000000147, the marker for a nulls entry, but it was set to 0000000000000142.
A few other instances show the same pattern, so I suspect that exp_nid_hash is missing some locking or a barrier. Thanks, |
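The expected value 0x147 can be cross-checked from the dump with a little arithmetic. The sketch below assumes 8-byte bucket slots and a nulls base of 0, which is consistent with the neighbouring slots in the dump (0x141, 0x143, 0x145, 0x149, 0x14b). The helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Index of the slot at `slot_addr`, given the start of buckets[].
 * Assumes 8-byte pointers, as on x86_64. */
static uint64_t bucket_index(uint64_t buckets_base, uint64_t slot_addr)
{
	return (slot_addr - buckets_base) / sizeof(uint64_t);
}

/* Nulls marker expected at the end of the chain for bucket `index`,
 * assuming a nulls base of 0. */
static uint64_t expected_nulls_marker(uint64_t index)
{
	return (index << 1) | 1;
}
```

With buckets = 0xffff9ed3f1c73080 and the slot at 0xffff9ed3f1c73598, the index is 163, which matches rehash = 163 in the bucket_table output, and the expected chain terminator is 0x147.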