[LU-17453] Use dget_parent/dput during d_revalidate Created: 22/Jan/24 Updated: 29/Jan/24 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Shaun Tancheff | Assignee: | Shaun Tancheff |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
There appears to be a race that can be triggered by parallel-scale-nfsv3. In any case, the use of dget/dput prevents the dentry from disappearing while it is being validated. The race can result in a crash:

[ 1998.665129][T23978] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1998.672417][T23978] #PF: supervisor read access in kernel mode
[ 1998.675733][T23978] #PF: error_code(0x0000) - not-present page
[ 1998.679427][T23978] PGD 0 P4D 0
[ 1998.683876][T23978] Oops: 0000 [#1] PREEMPT SMP PTI
[ 1998.686690][T23978] CPU: 4 PID: 23978 Comm: dd Kdump: loaded Tainted: G W OE N 5.14.21-150400.24.41-default #1 SLE15-SP4 e37e7aadb4e42246eb51815d42fa73d67a617d00
[ 1998.698157][T23978] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 1998.702225][T23978] RIP: 0010:path_openat+0x81a/0x1080
[ 1998.705221][T23978] Code: 00 f0 ff ff 49 89 c3 48 89 44 24 10 48 8b 4c 24 30 0f 86 c6 fa ff ff e9 fb fc ff ff 48 8b 5a 30 8b 15 c6 4c 9f 01 85 d2 75 0d <0f> b7 03 66 25 00 f0 66 3d 00 10 74 28 8b 0d ab 4c 9f 01 85 c9 75
[ 1998.723281][T23978] RSP: 0018:ffffa86744fc7c50 EFLAGS: 00010246
[ 1998.727185][T23978] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000004b00000000
[ 1998.731275][T23978] RDX: 0000000000000000 RSI: 0000000000000064 RDI: ffff8a19d4465418
[ 1998.736350][T23978] RBP: ffffa86744fc7e3c R08: 00000000000090c8 R09: 0000000000000001
[ 1998.742956][T23978] R10: ffff8a19dc9ad190 R11: ffff8a19cb28d300 R12: 0000000000008042
[ 1998.751696][T23978] R13: 0000000000000000 R14: ffffa86744fc7d00 R15: ffff8a19c2101600
[ 1998.757626][T23978] FS: 00007f1fb2ceb740(0000) GS:ffff8a19fbd00000(0000) knlGS:0000000000000000
[ 1998.769153][T23978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1998.774199][T23978] CR2: 0000000000000000 CR3: 000000011e8aa000 CR4: 00000000000406e0
[ 1998.777925][T23978] Call Trace:
[ 1998.780410][T23978] <TASK>
[ 1998.782562][T23978] ? do_filp_open+0xd9/0x140
[ 1998.789075][T23978] do_filp_open+0xc5/0x140
[ 1998.792461][T23978] ? _raw_spin_unlock+0xa/0x30
[ 1998.794971][T23978] ? kmem_cache_alloc+0x4d/0x4c0
[ 1998.797279][T23978] ? _raw_spin_unlock+0xa/0x30
[ 1998.800835][T23978] ? do_sys_openat2+0x23e/0x310
[ 1998.810720][T23978] do_sys_openat2+0x23e/0x310
[ 1998.815004][T23978] do_sys_open+0x57/0x80
[ 1998.818226][T23978] do_syscall_64+0x5b/0x80
[ 1998.821460][T23978] ? syscall_exit_to_user_mode+0x18/0x40
[ 1998.825997][T23978] ? _raw_spin_unlock+0xa/0x30
[ 1998.829523][T23978] ? filp_close+0x51/0x80
[ 1998.836459][T23978] ? syscall_exit_to_user_mode+0x18/0x40
[ 1998.839667][T23978] ? do_syscall_64+0x67/0x80
[ 1998.842921][T23978] ? exc_page_fault+0x67/0x150
[ 1998.846876][T23978] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 1998.850821][T23978] RIP: 0033:0x7f1fb27e077d |
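For illustration only, the idea of pinning a dentry with a reference while it is being validated can be modelled in plain userspace C. This is a single-threaded sketch and not the Lustre patch or the kernel API: `struct model_dentry`, `model_alloc`, `model_dget_parent`, `model_dput`, `model_revalidate` and `model_freed_count` are all hypothetical names that only imitate reference counting in the style of `dget_parent`/`dput`, and the "other task" is simulated by a direct call rather than real concurrency.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a dentry; NOT the kernel struct. */
struct model_dentry {
    int refcount;                /* live references; freed when it hits 0 */
    struct model_dentry *parent; /* parent entry, NULL for the root */
};

static int model_freed_count;    /* how many entries have been freed */

/* Allocate an entry holding one reference for its creator. */
static struct model_dentry *model_alloc(struct model_dentry *parent)
{
    struct model_dentry *d = calloc(1, sizeof(*d));

    if (!d)
        abort();
    d->refcount = 1;
    d->parent = parent;
    return d;
}

/* Model of dget_parent(): return the parent with an extra reference. */
static struct model_dentry *model_dget_parent(struct model_dentry *d)
{
    assert(d->parent != NULL);
    d->parent->refcount++;
    return d->parent;
}

/* Model of dput(): drop one reference and free on the last one. */
static void model_dput(struct model_dentry *d)
{
    if (--d->refcount == 0) {
        model_freed_count++;
        free(d);
    }
}

/*
 * Model of a revalidate step. Returns 1 if the parent stayed alive
 * while it was pinned, 0 if it was freed underneath us.
 */
static int model_revalidate(struct model_dentry *child)
{
    struct model_dentry *parent = model_dget_parent(child); /* pin */
    int freed_before = model_freed_count;
    int alive;

    /* Simulate another task dropping the creator's reference. */
    model_dput(parent);

    /* The pin taken above is what keeps the parent allocated here. */
    alive = (model_freed_count == freed_before);

    child->parent = NULL; /* avoid a dangling pointer after the unpin */
    model_dput(parent);   /* unpin: last reference, so it is freed now */
    return alive;
}
```

In this model, allocating a parent and a child and then calling `model_revalidate(child)` reports the parent as alive for the whole validation window, and the parent is freed only by the final `model_dput`. Without the `model_dget_parent` call, the simulated drop would have freed the parent while it was still being inspected.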
| Comments |
| Comment by Gerrit Updater [ 22/Jan/24 ] |
|
"Shaun Tancheff <shaun.tancheff@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53757 |
| Comment by Patrick Farrell [ 22/Jan/24 ] |
|
Can you give any more details about how this race manifests? What goes wrong? |
| Comment by Shaun Tancheff [ 23/Jan/24 ] |
|
Updated description to include the crash back-trace. |
| Comment by Gerrit Updater [ 29/Jan/24 ] |
|
"Shaun Tancheff <shaun.tancheff@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53850 |