[LU-16745] general protection fault: RIP: 0010:lustre_msg_get_opc+0x6/0xf0 [ptlrpc]
Created: 17/Apr/23  Updated: 22/Jun/23  Resolved: 31/May/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Minor
Reporter: Dongyang Li
Assignee: Dongyang Li
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3

Description
[529973.611661] general protection fault: 0000 [#1] SMP NOPTI
[529973.617424] CPU: 3 PID: 3096 Comm: ptlrpcd_rcv Kdump: loaded Tainted: G          IOE    --------- -  - 4.18.0-425.10.1.el8_7.x86_64 #1
[529973.630082] Hardware name:  /0XFK4K, BIOS 2.5.4 01/13/2020
[529973.635897] RIP: 0010:lustre_msg_get_opc+0x6/0xf0 [ptlrpc]
[529973.641786] Code: c7 05 e2 46 0c 00 00 00 02 00 e8 25 d1 68 ff b8 68 12 00 00 5b e9 5a 00 1c f3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 <81> 7f 08 d3 0b d0 0b 48 89 fb 74 5d 48 b8 00 01 00 00 34 04 00 00
[529973.661122] RSP: 0018:ffffaf100cedfcd0 EFLAGS: 00010286
[529973.666680] RAX: 0000000000000000 RBX: ffff97089c56c800 RCX: ffff971299c1f4c0
[529973.674145] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 5a5a5a5a5a5a5a5a
[529973.681606] RBP: ffff9716482c6300 R08: 00000000000000d8 R09: 0000000000000000
[529973.689065] R10: 0000000000000000 R11: ffff97193c123eda R12: ffff970c49122080
[529973.696527] R13: ffff971299c1f610 R14: 0000000000000000 R15: 0000000000000000
[529973.703986] FS:  0000000000000000(0000) GS:ffff971dcf640000(0000) knlGS:0000000000000000
[529973.712402] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[529973.718473] CR2: 00007f2f0e58d000 CR3: 00000003c1410005 CR4: 00000000007706e0
[529973.725925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[529973.733379] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[529973.740831] PKRU: 55555554
[529973.743857] Call Trace:
[529973.746620]  mdc_replay_open+0xaf/0x410 [mdc]
[529973.751308]  ptlrpc_replay_interpret+0x13d/0x720 [ptlrpc]
[529973.757087]  ? after_reply+0x8cc/0xd90 [ptlrpc]
[529973.761978]  ptlrpc_check_set.part.29+0x41d/0x1e60 [ptlrpc]
[529973.767906]  ? del_timer_sync+0x25/0x40
[529973.772052]  ? schedule_timeout+0x19f/0x300
[529973.776559]  ptlrpcd_check+0x3d9/0x5c0 [ptlrpc]
[529973.781453]  ptlrpcd+0x364/0x490 [ptlrpc]
[529973.785806]  ? wake_up_q+0x70/0x70
[529973.789504]  ? ptlrpcd_check+0x5c0/0x5c0 [ptlrpc]
[529973.794549]  kthread+0x10b/0x130
[529973.798069]  ? set_kthread_struct+0x50/0x50
[529973.802540]  ret_from_fork+0x1f/0x40
crash> struct ptlrpc_request.rq_cli.cr_cb_data ffff97089c56c800
  rq_cli.cr_cb_data = 0xffff970c49122080,
crash> struct md_open_data ffff970c49122080
struct md_open_data {
  mod_och = 0x0, 
  mod_open_req = 0xffff97089c56c800, 
  mod_close_req = 0xffff9716482c6300, 
  mod_refcount = {
    counter = 1
  }, 
  mod_is_create = true
}
crash> rd 0xffff9716482c6300
ffff9716482c6300:  5a5a5a5a5a5a5a5a                    ZZZZZZZZ
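
A reading of the data above, offered as a sketch rather than a confirmed analysis: RDI in the oops and the first word at mod_close_req (ffff9716482c6300) are both 0x5a5a5a5a5a5a5a5a, which looks like a free-poison pattern (the assumption here is that Lustre fills memory with 0x5a when it frees it). That would mean md_open_data still pointed at a close request that had already been freed, and mdc_replay_open handed a pointer read from it to lustre_msg_get_opc. The standalone C model below illustrates that failure mode and the reference-holding approach named in the patch subject. All toy_* names are hypothetical and are not Lustre APIs; the actual change is in the Gerrit patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed free-poison byte; matches the 0x5a ('Z') bytes in the dump. */
#define TOY_FREE_POISON 0x5a

/* Hypothetical stand-in for a refcounted request, not the real layout. */
struct toy_request {
	int refcount;
	int opc;
};

/* Hypothetical stand-in for md_open_data: it remembers the close request. */
struct toy_open_data {
	struct toy_request *close_req;
};

/* Counts how many requests have actually been freed (for illustration). */
static int toy_frees;

/* True if every byte of v is the poison byte, as RDI was in the oops. */
static bool toy_is_free_poison(uint64_t v)
{
	uint64_t pattern;

	memset(&pattern, TOY_FREE_POISON, sizeof(pattern));
	return v == pattern;
}

/* Allocate a request that starts with one reference owned by the caller. */
static struct toy_request *toy_req_alloc(int opc)
{
	struct toy_request *req = calloc(1, sizeof(*req));

	if (req != NULL) {
		req->refcount = 1;
		req->opc = opc;
	}
	return req;
}

/* Take an extra reference on a live request. */
static struct toy_request *toy_req_get(struct toy_request *req)
{
	assert(req->refcount > 0);
	req->refcount++;
	return req;
}

/* Drop a reference; the last put poisons and frees the request. */
static void toy_req_put(struct toy_request *req)
{
	if (--req->refcount > 0)
		return;
	memset(req, TOY_FREE_POISON, sizeof(*req));
	free(req);
	toy_frees++;
}

/* Store the close request AND pin it, so the pointer cannot dangle. */
static void toy_mod_set_close(struct toy_open_data *mod,
			      struct toy_request *req)
{
	mod->close_req = toy_req_get(req);
}

/* Release the pinned close request when the open data lets go of it. */
static void toy_mod_clear_close(struct toy_open_data *mod)
{
	if (mod->close_req != NULL) {
		toy_req_put(mod->close_req);
		mod->close_req = NULL;
	}
}
```

In this model, dropping the sender's reference with toy_req_put() leaves the request alive until toy_mod_clear_close() runs. If toy_mod_set_close() stored the bare pointer without toy_req_get(), the same sequence would leave close_req pointing at poisoned memory, which is the state the crash session appears to show.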


Comments
Comment by Gerrit Updater [ 17/Apr/23 ]

"Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50656
Subject: LU-16745 mdc: md_open_data should keep ref on close_req
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e4bd83f87653e15b7432904b5799b9b5ac5674f3

Comment by Gerrit Updater [ 31/May/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50656/
Subject: LU-16745 mdc: md_open_data should keep ref on close_req
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ca716f763f89af192ab86678ee9d14f49c80cae6

Comment by Peter Jones [ 31/May/23 ]

Landed for 2.16
