[LU-12537] recovery-mds-scale failover_mds: BUG: unable to handle kernel NULL pointer dereference at (null) IP: lnet_attach_rsp_tracker.isra.32+0xad/0x200 [lnet] Created: 11/Jul/19  Updated: 09/Aug/19  Resolved: 09/Aug/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Major
Reporter: Chris Horn Assignee: Chris Horn
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

A race of sorts when starting LNet: the discovery thread is started before the monitor thread, so a PUT/GET may be issued before the monitor thread has initialized its data structures. The result is an oops:

[  297.984001] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  297.987525] IP: [<ffffffffc07db70d>] lnet_attach_rsp_tracker.isra.32+0xad/0x200 [lnet]
[  297.990999] PGD 0 
[  297.991954] Oops: 0000 [#1] SMP 
[  297.993493] Modules linked in: ksocklnd(OE) ptlrpc(OE+) obdclass(OE) lnet(OE) libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma(OE) xprtrdma(OE) ib_isert(OE) ib_iser(OE) ib_srpt(OE) ib_srp(OE) ib_ipoib(OE) rdma_ucm(OE) mlx5_ib(OE) ib_ucm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_core(OE) rdma_cm(OE) ib_cm(OE) iw_cm(OE) mlx4_ib(OE) ib_core(OE) mlx4_en(OE) ptp pps_core mlx4_core(OE) mlx_compat(OE) devlink cirrus ttm drm_kms_helper syscopyarea ppdev sysfillrect sysimgblt fb_sys_fops drm i2c_piix4 pcspkr joydev i2c_core virtio_balloon parport_pc parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_net virtio_blk ata_piix libata serio_raw virtio_pci virtio_ring virtio floppy
[  298.031286] CPU: 1 PID: 11179 Comm: lnet_discovery Tainted: G           OE  ------------   3.10.0-693.21.1.x3.2.152.x86_64 #1
[  298.038606] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  298.042341] task: ffff8800b897cf10 ti: ffff880134cf8000 task.ti: ffff880134cf8000
[  298.047182] RIP: 0010:[<ffffffffc07db70d>]  [<ffffffffc07db70d>] lnet_attach_rsp_tracker.isra.32+0xad/0x200 [lnet]
[  298.053925] RSP: 0018:ffff880134cfbcc0  EFLAGS: 00010296
[  298.057366] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000017
[  298.061963] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800b8b6df80
[  298.066598] RBP: ffff880134cfbcf0 R08: 00000000c565cd78 R09: 0000000000000000
[  298.071218] R10: 0000000000000009 R11: fffffffffffffff0 R12: ffff8800b8b6df80
[  298.075816] R13: ffff8800b8b6df80 R14: 0000004787d10555 R15: ffff880136265060
[  298.080443] FS:  0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  298.085693] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  298.089432] CR2: 0000000000000000 CR3: 00000000359c2000 CR4: 00000000000006e0
[  298.094034] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  298.098707] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  298.103328] Call Trace:
[  298.104940]  [<ffffffffc07e2751>] LNetGet+0x5d1/0xa80 [lnet]
[  298.108750]  [<ffffffffc07e2e05>] lnet_send_ping+0x1a5/0x1d0 [lnet]
[  298.112811]  [<ffffffffc07f24e8>] lnet_peer_send_ping+0xa8/0x250 [lnet]
[  298.117082]  [<ffffffffc07f7a20>] lnet_peer_discovery+0x800/0x11a0 [lnet]
[  298.121481]  [<ffffffff810b4fc0>] ? wake_up_atomic_t+0x30/0x30
[  298.123330]  [<ffffffffc07f7220>] ? lnet_peer_merge_data+0xf90/0xf90 [lnet]
[  298.126429]  [<ffffffff810b4031>] kthread+0xd1/0xe0
[  298.129468]  [<ffffffff810c28a7>] ? finish_task_switch+0x57/0x170
[  298.135321]  [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[  298.141100]  [<ffffffff816c4577>] ret_from_fork+0x77/0xb0
[  298.146445]  [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[  298.152179] Code: 03 00 e8 17 b2 f7 ff 4d 3b 24 24 74 11 4c 89 e7 e8 79 89 b6 c0 4d 89 24 24 4d 89 64 24 08 48 8b 05 01 8b 03 00 48 63 d3 4c 89 e7 <48> 8b 14 d0 48 8b 72 08 e8 96 88 b6 c0 48 8b 3d 9f 88 03 00 89 
[  298.173797] RIP  [<ffffffffc07db70d>] lnet_attach_rsp_tracker.isra.32+0xad/0x200 [lnet]
[  298.182665]  RSP <ffff880134cfbcc0>
[  298.187675] CR2: 0000000000000000

The oops was here:

/usr/src/debug/lustre-2.12.0.1_cray_136_g0e9df14/lnet/lnet/lib-move.c: 4623
0xffffffffc07d670d <lnet_attach_rsp_tracker+173>:   mov    (%rax,%rdx,8),%rdx

Which is:

list_add_tail(&local_rspt->rspt_on_list, the_lnet.ln_mt_rstq[cpt]);

And indeed, the response tracker queue array was never allocated:

crash_x86_64> the_lnet | grep ln_mt_rstq
  ln_mt_rstq = 0x0,
crash_x86_64>

The startup code starts the discovery thread before the monitor thread (which is what allocates ln_mt_rstq):

        rc = lnet_peer_discovery_start();
        if (rc != 0)
                goto err_destroy_push_target;

        rc = lnet_monitor_thr_start();
        if (rc != 0)
                goto err_stop_discovery_thr;

At the time of the crash, the discovery thread is running but the monitor thread has not yet been started:

crash_x86_64> ps | grep '>'
> 11163  11162   1  ffff880134cd8fd0  RU   0.0   18876   1032  modprobe
> 11176      2   0  ffff880135efcf10  RU   0.0       0      0  [lnet_discovery]
crash_x86_64>


 Comments   
Comment by Patrick Farrell (Inactive) [ 11/Jul/19 ]

ashehata, I thought I saw you talking about a similar race recently?

Comment by Gerrit Updater [ 11/Jul/19 ]

Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35478
Subject: LU-12537 lnet: Have discovery wait for monitor thread
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d74ab549081f634b671dc920194abe6886b89e11

Comment by Amir Shehata (Inactive) [ 11/Jul/19 ]

pfarrell, this is a bug that Chris ran into in his testing. Not sure which race you're referring to.

Comment by Patrick Farrell (Inactive) [ 11/Jul/19 ]

I am probably just confused - it sounded to me like something I had seen.

Comment by Gerrit Updater [ 09/Aug/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35478/
Subject: LU-12537 lnet: Sync the start of discovery and monitor threads
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9283e2ed6655e89fe693d35313c9dcf1d5a6703a

Comment by Peter Jones [ 09/Aug/19 ]

Landed for 2.13

Generated at Sat Feb 10 02:53:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.