Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
Lustre 2.10.2
-
Soak test cluster
-
3
Description
Testing https://review.whamcloud.com/29341 (revert patch for LU-9810, to determine whether preferring Fast Reg breaks mounting targets).
The system mounts fine (LU-10068), but after a few hours the routers hit an LBUG:
Oct 5 16:25:31 soak-14 kernel: LNet: 2153:0:(o2iblnd_modparams.c:253:kiblnd_tunables_setup()) Invalid map_on_demand (0), expects 1 - 256. Using default of 256
Oct 5 16:25:31 soak-14 kernel: LNet: Using FMR for registration
Oct 5 16:25:31 soak-14 kernel: LNetError: 4:0:(o2iblnd_cb.c:2304:kiblnd_passive_connect()) Can't accept conn from 192.168.1.121@o2ib on NA (ib1:0:192.168.1.114): bad dst nid 192.168.1.114@o2ib
Oct 5 16:25:31 soak-14 kernel: LNet: Added LNI 192.168.1.114@o2ib [8/256/0/180]
Oct 5 16:25:31 soak-14 kernel: LNet: Added LNI 172.16.1.14@o2ib1 [128/2048/0/180]
Oct 5 16:25:31 soak-14 sshd[2130]: Received disconnect from 10.10.1.116 port 38944:11: disconnected by user
Oct 5 16:25:31 soak-14 sshd[2130]: Disconnected from 10.10.1.116 port 38944
Oct 5 16:25:31 soak-14 sshd[2130]: pam_unix(sshd:session): session closed for user root
Oct 5 16:25:31 soak-14 systemd-logind: Removed session 4.
Oct 5 16:25:31 soak-14 systemd: Removed slice User Slice of root.
Oct 5 16:25:31 soak-14 systemd: Stopping User Slice of root.
Oct 5 16:37:04 soak-14 kernel: LNetError: 1979:0:(lib-move.c:2121:lnet_send()) ASSERTION( msg->msg_txpeer == ((void *)0) ) failed:
Oct 5 16:37:04 soak-14 kernel: LNetError: 1979:0:(lib-move.c:2121:lnet_send()) LBUG
Oct 5 16:37:04 soak-14 kernel: Pid: 1979, comm: lnet_discovery
Oct 5 16:37:05 soak-14 kernel: #012Call Trace:
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc09ec7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc09ec83c>] lbug_with_loc+0x4c/0xb0 [libcfs]
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc0a7179e>] lnet_send+0x17e/0x180 [lnet]
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc0a80ef8>] lnet_peer_discovery_complete+0x178/0x320 [lnet]
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc0a868a8>] lnet_peer_discovery+0x588/0x1030 [lnet]
Oct 5 16:37:05 soak-14 kernel: [<ffffffff810b1910>] ? autoremove_wake_function+0x0/0x40
Oct 5 16:37:05 soak-14 kernel: [<ffffffffc0a86320>] ? lnet_peer_discovery+0x0/0x1030 [lnet]
Oct 5 16:37:05 soak-14 kernel: [<ffffffff810b098f>] kthread+0xcf/0xe0
Oct 5 16:37:05 soak-14 kernel: [<ffffffff810b08c0>] ? kthread+0x0/0xe0
Oct 5 16:37:05 soak-14 kernel: [<ffffffff816b4f58>] ret_from_fork+0x58/0x90
Oct 5 16:37:05 soak-14 kernel: [<ffffffff810b08c0>] ? kthread+0x0/0xe0
Oct 5 16:37:05 soak-14 kernel:
Oct 5 16:37:05 soak-14 kernel: Kernel panic - not syncing: LBUG
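
When checking the other routers for the same failure, a small script can pull the failed assertions out of their syslogs. The following is only a sketch: the regular expression is inferred from the single ASSERTION line in the log above, and the names ASSERT_RE and find_assertions are hypothetical helpers, not part of Lustre or LNet.

```python
import re

# Pattern inferred from the one ASSERTION line in this report (an assumption,
# not an official LNet log grammar):
#   <date> <host> kernel: LNetError: <pid>:<n>:(<file>:<line>:<func>()) ASSERTION( <expr> ) failed:
ASSERT_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) kernel: LNetError: "
    r"(?P<pid>\d+):\d+:\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.*) \) failed"
)


def find_assertions(lines):
    """Return one dict per syslog line that reports a failed LNet assertion."""
    hits = []
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_assertions.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for hit in find_assertions(log):
            print(hit["ts"], hit["host"], hit["file"], hit["line"], hit["func"], hit["expr"])
```

Grouping the results by file, line and function would show whether every router trips the same assertion in lnet_send().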