[LU-10086] LNET_MINOR conflicts with USERIO_MINOR Created: 05/Oct/17 Updated: 27/Nov/17 Resolved: 06/Nov/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.1 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Dmitry Eremin (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
MOFED4 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
When loading the Lustre module we sometimes get this error.
r147i0n0 ~ # modprobe libcfs
dmesg
[ 3067.259489] LNet: HW NUMA nodes: 2, HW CPU cores: 80, npartitions: 2
[ 3067.259497] LNetError: 4183:0:(module.c:742:libcfs_init()) misc_register: error -16
module configs
options lnet networks=o2ib314(ib1) routes="o2ib 10.149.26.[60,140,141,142,143,144,145]@o2ib314 10.149.25.[195,196,197,198,205]@o2ib314" dead_router_check_interval=60 live_router_check_interval=31
options lnet avoid_asym_router_failure=1 check_routers_before_use=1 small_router_buffers=65536 large_router_buffers=8192
lscpu
147i0n0 ~ # lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Stepping: 4
CPU MHz: 1000.000
CPU max MHz: 2401.0000
CPU min MHz: 1000.0000
BogoMIPS: 4799.99
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 28160K
NUMA node0 CPU(s): 0-19,40-59
NUMA node1 CPU(s): 20-39,60-79
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_act_window hwp_epp hwp_pkg_req intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc |
| Comments |
| Comment by Mahmoud Hanafi [ 05/Oct/17 ] |
r147i0n0 ~ # lsmod
Module Size Used by
userio 16384 0
rpcsec_gss_krb5 36864 0
auth_rpcgss 65536 1 rpcsec_gss_krb5
nfsv4 528384 2
dns_resolver 16384 1 nfsv4
8021q 32768 0
garp 16384 1 8021q
mrp 20480 1 8021q
mlx5_ib 241664 0
rdma_ucm 24576 0
rdma_cm 65536 1 rdma_ucm
iw_cm 49152 1 rdma_cm
configfs 40960 2 rdma_cm
ib_ipoib 184320 0
ib_cm 53248 2 rdma_cm,ib_ipoib
ib_uverbs 86016 1 rdma_ucm
ib_umad 24576 0
mlx4_ib 212992 0
ib_core 270336 9 rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_core 376832 1 mlx4_ib
iscsi_ibft 16384 0
iscsi_boot_sysfs 20480 1 iscsi_ibft
intel_rapl 24576 0
x86_pkg_temp_thermal 16384 0
coretemp 16384 0
kvm_intel 184320 0
kvm 593920 1 kvm_intel
irqbypass 16384 1 kvm
crct10dif_pclmul 16384 0
crc32_pclmul 16384 0
joydev 20480 0
ghash_clmulni_intel 16384 0
drbg 28672 1
ansi_cprng 16384 0
msr 16384 0
ast 61440 1
mlx5_core 679936 1 mlx5_ib
ttm 110592 1 ast
aesni_intel 167936 0
drm_kms_helper 155648 1 ast
inet_lro 16384 2 mlx5_core,ib_ipoib
aes_x86_64 20480 1 aesni_intel
mlx_compat 16384 12 rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
lrw 16384 1 aesni_intel
drm 393216 4 ast,ttm,drm_kms_helper
vxlan 49152 1 mlx5_core
gf128mul 16384 1 lrw
syscopyarea 16384 1 drm_kms_helper
iTCO_wdt 16384 0
glue_helper 16384 1 aesni_intel
ip6_udp_tunnel 16384 1 vxlan
sysfillrect 16384 1 drm_kms_helper
ipmi_ssif 28672 0
iTCO_vendor_support 16384 1 iTCO_wdt
ablk_helper 16384 1 aesni_intel
sysimgblt 16384 1 drm_kms_helper
ipmi_devintf 20480 0
cryptd 20480 3 ghash_clmulni_intel,aesni_intel,ablk_helper
fb_sys_fops 16384 1 drm_kms_helper
udp_tunnel 16384 1 vxlan
pcspkr 16384 0
i2c_i801 28672 0
lpc_ich 24576 0
mfd_core 16384 1 lpc_ich
shpchp 36864 0
wmi 16384 0
ipmi_si 61440 0
ipmi_msghandler 53248 3 ipmi_ssif,ipmi_devintf,ipmi_si
acpi_cpufreq 20480 0
nfit 45056 0
processor 45056 1 acpi_cpufreq
libnvdimm 139264 1 nfit
acpi_pad 180224 0
button 16384 0
tcp_bic 16384 21
numatools 65536 0
hwperf 184320 0
xpmem 102400 0
xp 16384 1 xpmem
gru 114688 1 xp
xvma 24576 2 gru,xpmem
sg 40960 0
dm_multipath 28672 0
dm_mod 126976 1 dm_multipath
scsi_dh_rdac 20480 0
scsi_dh_emc 16384 0
scsi_dh_alua 20480 0
efivarfs 16384 1
autofs4 45056 2
nfsv3 45056 9
nfs_acl 16384 1 nfsv3
nfs 274432 14 nfsv3,nfsv4
lockd 102400 2 nfs,nfsv3
grace 16384 1 lockd
sunrpc 364544 77 nfs,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv3,nfsv4,nfs_acl
fscache 77824 2 nfs,nfsv4
bridge 143360 0
stp 16384 2 garp,bridge
llc 16384 3 stp,garp,bridge
hid_generic 16384 0
usbhid 53248 0
igb 212992 0
ahci 36864 0
i2c_algo_bit 16384 2 ast,igb
libahci 36864 1 ahci
xhci_pci 16384 0
dca 16384 1 igb
libata 270336 2 ahci,libahci
ptp 20480 2 igb,mlx5_core
xhci_hcd 192512 1 xhci_pci
pps_core 20480 1 ptp
usbcore 262144 3 usbhid,xhci_hcd,xhci_pci
scsi_mod 266240 6 sg,scsi_dh_alua,scsi_dh_rdac,dm_multipath,scsi_dh_emc,libata
usb_common 16384 1 usbcore
af_packet 45056 0
crc32c_intel 24576 0
fjes 32768 0 |
| Comment by Mahmoud Hanafi [ 05/Oct/17 ] |
|
So this looks like a conflict with the userio module.
lrwxrwxrwx 1 root root 0 Oct 5 12:13 10:240 -> ../../devices/virtual/misc/userio
This worked:
r147i0n0 /sys/devices/virtual/misc/userio # modinfo userio
filename: /lib/modules/4.4.74-92.32.1.20170808-nasa/kernel/drivers/input/serio/userio.ko
license: GPL
description: Virtual Serio Device Support
author: Stephen Chandler Paul <thatslyude@gmail.com>
alias: devname:userio
alias: char-major-10-240
srcversion: 20AD3540D75DF95784E6C13
depends:
supported: no
intree: Y
vermagic: 4.4.74-92.32.1.20170808-nasa SMP mod_unload modversions
r147i0n0 /sys/devices/virtual/misc/userio # rmmod userio
r147i0n0 /sys/devices/virtual/misc/userio # modprobe libcfs
|
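The check above (finding which driver already holds misc minor 240) can also be done by reading /proc/misc, which lists one "minor name" pair per line. The following is a minimal sketch under that assumption; the function names and the sample text in the test are illustrative and not part of this ticket.

```python
def parse_proc_misc(text):
    """Map minor number -> device name from /proc/misc-style text."""
    owners = {}
    for line in text.splitlines():
        parts = line.split()
        # Each valid line is expected to look like "<minor> <name>".
        if len(parts) == 2 and parts[0].isdigit():
            owners[int(parts[0])] = parts[1]
    return owners


def owner_of_minor(text, minor):
    """Return the name registered on `minor`, or None if it is free."""
    return parse_proc_misc(text).get(minor)


if __name__ == "__main__":
    try:
        with open("/proc/misc") as f:
            print(owner_of_minor(f.read(), 240))
    except OSError as exc:
        # Not every system exposes /proc/misc (for example non-Linux hosts).
        print(f"could not read /proc/misc: {exc}")
```

On the node described here, a lookup of minor 240 before `rmmod userio` would be expected to report userio.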
| Comment by Joseph Gmitter (Inactive) [ 06/Oct/17 ] |
|
Hi Dmitry, Can you please advise here? Thanks. |
| Comment by Dmitry Eremin (Inactive) [ 06/Oct/17 ] |
|
The LNet driver and UserIO use the same misc device minor number, so they cannot both be loaded at the same time.
#define USERIO_MINOR 240
#define LNET_MINOR 240 |
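To illustrate why the second registration fails with error -16, here is a toy userspace model of a table of fixed minors. It is only a sketch, not the kernel's misc_register(): the function name `register_fixed_minor` and the dictionary-based table are assumptions made for the example. The two constants mirror the defines quoted above.

```python
import errno

USERIO_MINOR = 240
LNET_MINOR = 240


def register_fixed_minor(table, name, minor):
    """Claim `minor` for `name`; return 0 or a negative errno like the kernel does."""
    if minor in table:
        # The minor is already owned by another driver.
        return -errno.EBUSY
    table[minor] = name
    return 0


table = {}
first = register_fixed_minor(table, "userio", USERIO_MINOR)
second = register_fixed_minor(table, "lnet", LNET_MINOR)
print(first, second, errno.errorcode[-second])
```

On Linux, EBUSY has the value 16, which matches the "misc_register: error -16" line in the dmesg output.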
| Comment by Dmitry Eremin (Inactive) [ 06/Oct/17 ] |
|
How critical is it to have both drivers loaded simultaneously?
| Comment by Mahmoud Hanafi [ 10/Oct/17 ] |
|
Not critical for us. But it caused some confusion when trying to figure it all out.
|
| Comment by John Hammond [ 23/Oct/17 ] |
|
It would be good to remove LNET_MINOR entirely and switch to using MISC_DYNAMIC_MINOR when we register libcfs_dev. At the same time we should remove the (fairly questionable) code in lctl that creates /dev/lnet if it doesn't exist. |
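The effect of the proposed switch can be sketched with a toy model in which a driver that passes MISC_DYNAMIC_MINOR is handed any free minor instead of insisting on a fixed one. This is an illustration only: `misc_register` here is a Python stand-in, `DYNAMIC_POOL` and the highest-free-first policy are assumptions, and the real kernel allocator may behave differently. Only the value 255 for MISC_DYNAMIC_MINOR is taken from the kernel headers.

```python
import errno

MISC_DYNAMIC_MINOR = 255  # value of the kernel constant of the same name
DYNAMIC_POOL = 64         # assumed pool size for this toy model


def misc_register(table, name, minor=MISC_DYNAMIC_MINOR):
    """Return the minor assigned to `name`, or a negative errno on failure."""
    if minor == MISC_DYNAMIC_MINOR:
        # Dynamic request: take the highest free minor in the pool.
        for candidate in range(DYNAMIC_POOL - 1, -1, -1):
            if candidate not in table:
                table[candidate] = name
                return candidate
        return -errno.EBUSY
    if minor in table:
        # Fixed request for a minor that is already taken.
        return -errno.EBUSY
    table[minor] = name
    return minor
```

In this model, registering userio on 240 and then lnet with the default dynamic minor succeeds for both. Because the assigned number is no longer known in advance, userspace cannot assume a hard-coded minor when creating the device node, which is consistent with dropping the node-creation code from lctl.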
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/29741 Project: fs/lustre-release |
| Comment by Gerrit Updater [ 06/Nov/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29741/ Project: fs/lustre-release |
| Comment by Peter Jones [ 06/Nov/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 06/Nov/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29945 Project: fs/lustre-release |
| Comment by Gerrit Updater [ 27/Nov/17 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29945/ Project: fs/lustre-release |