|
Oct 16 09:00:35 lus03-mds1-1g LustreError: 14385:0:(brw_test.c:334:brw_client_done_rpc()) BRW RPC to 12345-172.17.148.5@tcp failed with -4
Oct 16 09:00:35 lus03-mds1-1g LustreError: 14385:0:(brw_test.c:334:brw_client_done_rpc()) Skipped 8 previous similar messages
Oct 16 09:00:35 lus03-mds1-1g LustreError: 14385:0:(brw_test.c:334:brw_client_done_rpc()) BRW RPC to 12345-172.17.148.5@tcp failed with -4
Oct 16 09:00:35 lus03-mds1-1g LustreError: 14385:0:(brw_test.c:334:brw_client_done_rpc()) Skipped 3 previous similar messages
Oct 16 09:05:32 lus03-mds1-1g LustreError: 166-1: MGC172.17.148.4@tcp: Connection to MGS (at 172.17.148.4@tcp) was lost; in progress operations using this service will fail
Oct 16 09:27:31 lus03-mds1-1g LustreError: 14384:0:(brw_test.c:334:brw_client_done_rpc()) BRW RPC to 12345-172.17.148.5@tcp failed with -4
Oct 16 09:27:31 lus03-mds1-1g LustreError: 14384:0:(brw_test.c:334:brw_client_done_rpc()) Skipped 8 previous similar messages
Oct 16 09:38:06 lus03-mds1-1g LustreError: 18334:0:(lmv_obd.c:1289:lmv_statfs()) can't stat MDS #0 (lus04-MDT0000-mdc-ffff8804a7607000), error -4
Oct 16 09:39:33 lus03-mds1-1g LustreError: 3711:0:(lov_obd.c:937:lov_cleanup()) lov tgt 0 not cleaned! deathrow=0, lovrc=1
Oct 16 09:39:33 lus03-mds1-1g LustreError: 18334:0:(obd_mount.c:1275:lustre_fill_super()) Unable to mount (-4)
Oct 16 09:45:10 lus03-mds1-1g LustreError: 19644:0:(lmv_obd.c:1289:lmv_statfs()) can't stat MDS #0 (lus04-MDT0000-mdc-ffff8804b3657c00), error -4
Oct 16 09:45:10 lus03-mds1-1g general protection fault: 0000 [#1] SMP
Oct 16 09:45:10 lus03-mds1-1g last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Oct 16 09:45:10 lus03-mds1-1g CPU 1
Oct 16 09:45:10 lus03-mds1-1g Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) netconsole configfs 8021q garp bridge stp llc acpi_cpufreq mperf cpufreq_powersave cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table autofs4 dm_round_robin nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 radeon ttm i5000_edac ibmpex drm_kms_helper edac_core ibmaem drm dm_multipath dm_mod ioatdma i5k_amb i2c_algo_bit ics932s401 shpchp i2c_core dca serio_raw ext4 jbd2 mbcache sr_mod cdrom ata_generic ses pata_acpi sd_mod crc_t10dif enclosure qla2xxx scsi_transport_fc ata_piix bnx2 scsi_tgt bnx2x libcrc32c mdio aacraid [last unloaded: lnet_selftest]
Oct 16 09:45:10 lus03-mds1-1g Pid: 19644, comm: mount.lustre Not tainted 2.6.32-jb23-358.18.1.el6-lustre-2.4.1 #1
Oct 16 09:45:10 lus03-mds1-1g IBM IBM System x3650 [7979B9G]/System Planar
Oct 16 09:45:10 lus03-mds1-1g RIP: 0010:[<ffffffff81274922>]  [<ffffffff81274922>] strlen+0x2/0x20
Oct 16 09:45:10 lus03-mds1-1g RSP: 0018:ffff880465975c90 EFLAGS: 00010246
Oct 16 09:45:10 lus03-mds1-1g RAX: 0000000000000000 RBX: ffff8804b3a6e6c0 RCX: 0000000000000008
Oct 16 09:45:10 lus03-mds1-1g RDX: 0000000000000001 RSI: 0000000000000008 RDI: 5a5a5a5a5a5a5a5a
Oct 16 09:45:10 lus03-mds1-1g RBP: ffff880465975ce8 R08: 0000000000000000 R09: 000000000000006f
Oct 16 09:45:10 lus03-mds1-1g R10: 0000000000000001 R11: 0000000000000000 R12: ffff88045f0e4b80
Oct 16 09:45:10 lus03-mds1-1g R13: 00000000fffffffc R14: ffff88045ee51f80 R15: ffff8804674b42c0
Oct 16 09:45:10 lus03-mds1-1g FS: 00007fc50d598700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
Oct 16 09:45:10 lus03-mds1-1g CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 16 09:45:10 lus03-mds1-1g CR2: 00007fff09fc6eb8 CR3: 000000045f4e6000 CR4: 00000000000407e0
Oct 16 09:45:10 lus03-mds1-1g DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 16 09:45:10 lus03-mds1-1g DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 16 09:45:10 Process mount.lustre (pid: 19644, threadinfo ffff880465974000, task ffff8804b4c93500)
Oct 16 09:45:10 lus03-mds1-1g Stack:
Oct 16 09:45:10 lus03-mds1-1g  ffffffffa0c0f3e6 ffff8804a4cb12e0 ffff8804b3657c00 ffff880467438000
Oct 16 09:45:10 lus03-mds1-1g <d> ffff880464494800 ffff880465975ce8 0000000000000000 ffff8804b3a6ed40
Oct 16 09:45:10 lus03-mds1-1g <d> ffff8804b51fa000 ffff8804a4cb12e0 ffff8804b3657c00 ffff880465975d88
Oct 16 09:45:10 Call Trace:
Oct 16 09:45:10 lus03-mds1-1g [<ffffffffa0c0f3e6>] ? ll_fill_super+0xd96/0x15b0 [lustre]
Oct 16 09:45:10 lus03-mds1-1g [<ffffffffa06dec81>] lustre_fill_super+0x771/0x24c0 [obdclass]
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8117b843>] ? sget+0x3a3/0x410
Oct 16 09:45:10 lus03-mds1-1g [<ffffffffa06de510>] ? lustre_fill_super+0x0/0x24c0 [obdclass]
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8117b90f>] get_sb_nodev+0x5f/0xa0
Oct 16 09:45:10 lus03-mds1-1g [<ffffffffa06d7955>] lustre_get_sb+0x25/0x30 [obdclass]
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8117b214>] vfs_kern_mount+0x74/0x1c0
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8117b3d4>] do_kern_mount+0x54/0x120
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8119a9eb>] do_mount+0x1cb/0x930
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff81158833>] ? alloc_pages_current+0xa3/0x110
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8111b19e>] ? __get_free_pages+0xe/0x50
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8119a69a>] ? copy_mount_options+0x3a/0x170
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8119b450>] sys_mount+0x90/0xe0
Oct 16 09:45:10 lus03-mds1-1g [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Oct 16 09:45:10 lus03-mds1-1g Code: 48 89 e5 f6 82 80 11 af 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 80 11 af 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00
Oct 16 09:45:10 lus03-mds1-1g RIP [<ffffffff81274922>] strlen+0x2/0x20
Oct 16 09:45:10 lus03-mds1-1g RSP <ffff880465975c90>
Oct 16 09:45:11 lus03-mds1-1g --[ end trace 0a0429a1702b83ae ]--
Oct 16 09:45:11 Kernel panic - not syncing: Fatal exception
Oct 16 09:45:11 lus03-mds1-1g Pid: 19644, comm: mount.lustre Tainted: G D --------------- 2.6.32-jb23-358.18.1.el6-lustre-2.4.1 #1
Oct 16 09:45:11 Call Trace:
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff81501d57>] ? panic+0xa7/0x167
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8150d0c4>] ? oops_end+0xe4/0x100
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8100f438>] ? die+0x58/0x90
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8150c932>] ? do_general_protection+0x152/0x160
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8150c3a5>] ? general_protection+0x25/0x30
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff81274922>] ? strlen+0x2/0x20
Oct 16 09:45:11 lus03-mds1-1g [<ffffffffa0c0f3e6>] ? ll_fill_super+0xd96/0x15b0 [lustre]
Oct 16 09:45:11 lus03-mds1-1g [<ffffffffa06dec81>] ? lustre_fill_super+0x771/0x24c0 [obdclass]
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8117b843>] ? sget+0x3a3/0x410
Oct 16 09:45:11 lus03-mds1-1g [<ffffffffa06de510>] ? lustre_fill_super+0x0/0x24c0 [obdclass]
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8117b90f>] ? get_sb_nodev+0x5f/0xa0
Oct 16 09:45:11 lus03-mds1-1g [<ffffffffa06d7955>] ? lustre_get_sb+0x25/0x30 [obdclass]
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8117b214>] ? vfs_kern_mount+0x74/0x1c0
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8117b3d4>] ? do_kern_mount+0x54/0x120
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8119a9eb>] ? do_mount+0x1cb/0x930
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff81158833>] ? alloc_pages_current+0xa3/0x110
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8111b19e>] ? __get_free_pages+0xe/0x50
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8119a69a>] ? copy_mount_options+0x3a/0x170
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8119b450>] ? sys_mount+0x90/0xe0
Oct 16 09:45:11 lus03-mds1-1g [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Oct 16 09:45:11 panic occurred, switching back to text console
root@lus03-mds2:~# uname -a
Linux lus03-mds2 2.6.32-jb23-358.18.1.el6-lustre-2.4.1 #1 SMP Fri Sep 20 10:06:06 BST 2013 x86_64 x86_64 x86_64 GNU/Linux
root@lus03-mds2:~#
|
|
We have one Lustre 2.4.1 server system running in production with 1.8.9wc1 clients. We are now trying to use a previous system as a test Lustre 2.4.1 client.
This system can mount our working 2.4.1 server system; however, an ls always fails.
Oct 16 13:12:32 lus03-mds1-1g BUG: unable to handle kernel paging request at 0000082d00000019
Oct 16 13:12:32 lus03-mds1-1g IP: [<ffffffffa0c033ff>] ll_show_options+0x2f/0x180 [lustre]
Oct 16 13:12:32 lus03-mds1-1g PGD 0
Oct 16 13:12:32 lus03-mds1-1g Oops: 0000 [#1] SMP
Oct 16 13:12:32 last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:1a:00.0/0000:1b:01.0/0000:24:00.0/host5/rport-5:0-2/target5:0:1/5:0:1:0/state
Oct 16 13:12:32 lus03-mds1-1g CPU 5
Oct 16 13:12:32 lus03-mds1-1g Modules linked in: mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) netconsole configfs 8021q garp bridge stp llc acpi_cpufreq mperf cpufreq_powersave cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table autofs4 nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 dm_round_robin radeon ttm drm_kms_helper i5000_edac drm edac_core i2c_algo_bit ibmpex ibmaem dm_multipath ics932s401 shpchp dm_mod ioatdma i5k_amb serio_raw i2c_core dca ext4 jbd2 mbcache sr_mod cdrom ata_generic pata_acpi ses sd_mod enclosure crc_t10dif ata_piix bnx2x qla2xxx scsi_transport_fc libcrc32c bnx2 mdio scsi_tgt aacraid
Oct 16 13:12:32 lus03-mds1-1g Pid: 17711, comm: ls Not tainted 2.6.32-jb23-358.18.1.el6-lustre-2.4.1 #1
Oct 16 13:12:32 lus03-mds1-1g IBM IBM System x3650 [7979B9G]/System Planar
Oct 16 13:12:32 lus03-mds1-1g RIP: 0010:[<ffffffffa0c033ff>]  [<ffffffffa0c033ff>] ll_show_options+0x2f/0x180 [lustre]
Oct 16 13:12:32 lus03-mds1-1g RSP: 0018:ffff8804b5375e18 EFLAGS: 00010282
Oct 16 13:12:32 lus03-mds1-1g RAX: 0000082d00000001 RBX: ffff88048e0b4380 RCX: 0000000000000000
Oct 16 13:12:32 lus03-mds1-1g RDX: 0000000000000000 RSI: ffff8804b65da8c0 RDI: ffff88048e0b4380
Oct 16 13:12:32 lus03-mds1-1g RBP: ffff8804b5375e28 R08: ffff8804b5375d78 R09: ffffffff8179faa9
Oct 16 13:12:32 lus03-mds1-1g R10: 6e2c38363732333d R11: 0000000000000246 R12: ffff8804b65da928
Oct 16 13:12:32 lus03-mds1-1g R13: ffff8804b65da8c0 R14: 0000000000000000 R15: 000000000000009c
Oct 16 13:12:32 lus03-mds1-1g FS: 00007feefeaaa7c0(0000) GS:ffff880028340000(0000) knlGS:0000000000000000
Oct 16 13:12:32 lus03-mds1-1g CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 16 13:12:32 lus03-mds1-1g CR2: 0000082d00000019 CR3: 00000004ac991000 CR4: 00000000000407e0
Oct 16 13:12:32 lus03-mds1-1g DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 16 13:12:32 lus03-mds1-1g DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 16 13:12:32 Process ls (pid: 17711, threadinfo ffff8804b5374000, task ffff8804b3e36040)
Oct 16 13:12:32 lus03-mds1-1g Stack:
Oct 16 13:12:32 lus03-mds1-1g  ffff88048e0b4380 ffff8804b65da928 ffff8804b5375e68 ffffffff81198965
Oct 16 13:12:32 lus03-mds1-1g <d> ffff8804b65da8c0 ffff8804a9398e00 ffff8804b3a85200 ffff88048e0b4380
Oct 16 13:12:32 lus03-mds1-1g <d> ffff8804b65da928 00007feefeac739d ffff8804b5375ee8 ffffffff8119c1ab
Oct 16 13:12:32 Call Trace:
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81198965>] show_vfsmnt+0x105/0x120
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8119c1ab>] seq_read+0x12b/0x3b0
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81212136>] ? security_file_permission+0x16/0x20
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff811792d5>] vfs_read+0xa5/0x180
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff811793fa>] sys_read+0x4a/0x90
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Oct 16 13:12:32 lus03-mds1-1g Code: 41 54 53 0f 1f 44 00 00 48 85 f6 48 89 fb 0f 84 2e 01 00 00 48 85 ff 0f 84 25 01 00 00 48 8b 86 90 00 00 00 48 8b 80 90 02 00 00 <4c> 8b 60 18 41 8b 44 24 70 a8 01 0f 85 f0 00 00 00 a8 04 0f 85
Oct 16 13:12:32 lus03-mds1-1g RIP [<ffffffffa0c033ff>] ll_show_options+0x2f/0x180 [lustre]
Oct 16 13:12:32 lus03-mds1-1g RSP <ffff8804b5375e18>
Oct 16 13:12:32 lus03-mds1-1g CR2: 0000082d00000019
Oct 16 13:12:32 lus03-mds1-1g --[ end trace a3fb76990639f714 ]--
Oct 16 13:12:32 Kernel panic - not syncing: Fatal exception
Oct 16 13:12:32 lus03-mds1-1g Pid: 17711, comm: ls Tainted: G D --------------- 2.6.32-jb23-358.18.1.el6-lustre-2.4.1 #1
Oct 16 13:12:32 Call Trace:
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81501d57>] ? panic+0xa7/0x167
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8150d0c4>] ? oops_end+0xe4/0x100
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81501655>] ? no_context+0x209/0x218
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff811ded50>] ? proc_delete_inode+0x0/0x60
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff815017e5>] ? __bad_area_nosemaphore+0x181/0x1a4
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8111463e>] ? find_get_page+0x1e/0xa0
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81501862>] ? bad_area+0x45/0x4e
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81044be5>] ? __do_page_fault+0x345/0x430
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81275dfd>] ? string.isra.3+0x3d/0xf0
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8150f0eb>] ? do_page_fault+0x3b/0xa0
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8150c3d5>] ? page_fault+0x25/0x30
Oct 16 13:12:32 lus03-mds1-1g [<ffffffffa0c033ff>] ? ll_show_options+0x2f/0x180 [lustre]
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81198965>] ? show_vfsmnt+0x105/0x120
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8119c1ab>] ? seq_read+0x12b/0x3b0
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff81212136>] ? security_file_permission+0x16/0x20
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff811792d5>] ? vfs_read+0xa5/0x180
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff811793fa>] ? sys_read+0x4a/0x90
Oct 16 13:12:32 lus03-mds1-1g [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
root@lus03-mds2:~#
|
|
In the first crash it looks like you were running lnet_selftest on the client before it had mount troubles? Is there a problem mounting if you don't run this first?
In the second crash, it looks like it crashes in ll_show_options() where sbi is set, but it may be slightly different since you compiled your own Lustre code.
It isn't at all clear why this would be crashing, since we've of course tested 2.4.1 clients+servers together. Is this a newly formatted filesystem, or is the server also upgraded? Did you try writeconf to rebuild the config logs for 2.4?
|
|
Bob, can you please take a look at this?
|
|
"In the first crash it looks like you were running lnet_selftest on the client before it had mount troubles? Is there a problem mounting if you don't run this first?"
No, the problem is independent of running lnet_selftest. We just had a system we were having real problems with, and we were proving the network was not an issue. (We are now moving towards thinking that the problem was we had done a writeconf and not specified an --mgs in the tunefs.lustre.)
"It isn't at all clear why this would be crashing, since we've of course tested 2.4.1 clients+servers together. Is this a newly formatted filesystem, or is the server also upgraded? Did you try writeconf to rebuild the config logs for 2.4?"
Note this was a client system that had never had an MGS, MDT, or OST mounted on it before being used as a client. We had gone through the loop of using writeconf multiple times on the servers serving the file system we were mounting.
We do have an in-production 2.4.1 filesystem; however, while I can mount it from a 2.4.1 client (either a system installed but never used as a server, or one of our dkms builds), both clients fail if I try to do an ls. I suspect that should be a separate issue, but I have been trying to rule out most of my stupidity.
|
|
Could you please attach the .config from your kernel build and the config.h from your lustre build? I'd like to look for anomalies or departures from expected values there.
|
|
Kernel config file
|
|
config.h
|
|
I think I've spotted at least one anomaly in your lustre build. Your config.h has
#define HAVE_SUPEROPS_USE_DENTRY 1
Normally, building against rhel/centos 2.6.32-358.18.1 kernel source, this is #undef'ed. As far as I know this is controlled by the definition of struct super_operations.show_options in include/linux/fs.h of the kernel source. In standard Red Hat kernel source of this version, the 2nd argument of that method is a struct vfsmount *, not a struct dentry *. The fact that your config doesn't match the includes in proper kernel source suggests that something is wrong in your configure or build of lustre.
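To make the mechanism concrete, here is a reduced sketch of what such a configure probe effectively does (stub types and file names only, not the actual Lustre autoconf test): a handler using the struct dentry * signature is assigned into a vfsmount-style show_options slot. With -Werror the pointer-type mismatch fails the compile, so configure correctly leaves HAVE_SUPEROPS_USE_DENTRY undefined.

```shell
# Sketch only: stand-in stubs for the real kernel types. The dentry-signature
# handler is assigned into the vfsmount-signature slot; -Werror turns the
# resulting mismatch diagnostic into a compile failure, which is what the
# probe relies on.
cat > conftest.c <<'EOF'
struct seq_file; struct vfsmount; struct dentry;
struct super_operations {
        int (*show_options)(struct seq_file *, struct vfsmount *);
};
static int my_show_options(struct seq_file *m, struct dentry *d) { return 0; }
static struct super_operations ops = { .show_options = my_show_options };
EOF
if gcc -Werror -c conftest.c 2>/dev/null; then
        echo "probe passed: would wrongly define HAVE_SUPEROPS_USE_DENTRY"
else
        echo "probe failed: HAVE_SUPEROPS_USE_DENTRY stays undefined (correct)"
fi
rm -f conftest.c conftest.o
```

Run against any reasonably recent gcc the -Werror compile fails, which is the intended probe result on a vfsmount-signature kernel.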
Normally we only build lustre clients native on ubuntu. We don't build server code there. I'm wondering if there's something about building rhel kernel or lustre server code in a ubuntu environment that is just not working right. Or maybe you have bad options in the configure cmd of your lustre build.
|
|
My notes on building the kernel on ubuntu precise are...
Edit arch/x86/vdso/Makefile replace
#VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -Wl,-soname=linux-vdso.so.1 \
-Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096
VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \
-Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096
and
- VDSO_LDFLAGS_vdso32.lds = -m elf_x86 -Wl,-soname=linux-gate.so.1
VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-soname=linux-gate.so.1
( Because of http://stackoverflow.com/questions/10425761/kernel-compile-error-gcc-error-elf-i386-no-such-file-or-directory )
Edit Makefile and replace the following, as gcc 4.6's cpp -dumpversion has no patch field and does not parse....
CPP_MINOR := $(shell $(CPP) -dumpversion 2>&1 | cut -d'.' -f2)
CPP_PATCH := $(shell $(CPP) -dumpversion 2>&1 | cut -d'.' -f3)
with
CPP_MINOR := $(shell $(CPP) -dumpversion 2>&1 | cut -d'.' -f2)
CPP_PATCH := 0
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
cpp -dumpversion
4.6
And finally, set CONFIG_SCSI_PMCRAID=n to avoid this compilation failure:
CC [M] drivers/scsi/pmcraid.o
In file included from drivers/scsi/pmcraid.c:57:0:
drivers/scsi/pmcraid.h:611:8: error: duplicate member ‘sense_buffer’
cd <kernel-source>
cp <lustre-source>/lustre/kernel_patches/kernel_configs/kernel-2.6.32-2.6-rhel6-x86_64.conf ./.config
make oldconfig
make kpkg-clean
CONCURRENCY_LEVEL=9 make-kpkg kernel-image kernel-headers kernel-source --append-to-version="-jb23-358.18.1.el6-lustre-2.4.1" --revision=20130920:1 --rootcmd fakeroot --initrd
|
|
"Normally building against rhel/centos 2.6.32-358.18.1 kernel source this is #undef'ed. As far as I know this is controlled by the definition of struct super_operations.show_options in include/linux/fs.h of the kernel source. In standard red hat kernel source of this version the 2nd argument of that method is a struct vfsmount *, not a struct dentry *. The fact that your config doesn't match the includes in proper kernel source suggest that something is wrong in your configure or build of lustre."
The kernel we get seems to be working as a server. However, it seems to fail as a client.
Taking the stupid approach of looking at show_options in fs.h, we get:
root@isg-dev4-1g:/local/lustre_2.4.1/kernel_source/kernel-2.6.32-358.18.1.el6/linux-2.6.32-358.18.1.el6.x86_64/include/linux# grep -n show_options fs.h
1484: * generic_show_options()
1789: int (*show_options)(struct seq_file *, struct vfsmount *);
2657:extern int generic_show_options(struct seq_file *m, struct vfsmount *mnt);
md5sum fs.h
4068f8d7a42f6e796049b36ddd076a60 fs.h
|
|
I more strongly suspect the lustre build, not the kernel build. I note that in the kernel #includes for the 3.8 kernel in the Ubuntu distro the 2nd arg of super_operations.show_options is in fact a struct dentry *. If you are somehow seeing the ubuntu native #includes instead of the rhel kernel #includes during the lustre build that would be a complete explanation of what went wrong.
|
|
Which ubuntu release is 'precise'? I think the only ubuntu I have on hand to look at is a bit newer than that.
|
|
Precise is Ubuntu 12.04.2 LTS ( http://releases.ubuntu.com/precise/ ).
"I note that in the kernel #includes for the 3.8 kernel in the Ubuntu distro the 2nd arg of super_operations.show_options is in fact a struct dentry *. If you are somehow seeing the ubuntu native #includes instead of the rhel kernel #includes during the lustre build that would be a complete explanation of what went wrong."
I note that I did the lustre build while running 3.8.0-31-generic, so it's perfectly possible that I have picked up the native includes... My config stanza looks like:
It was created by Lustre configure LUSTRE_VERSION, which was
generated by GNU Autoconf 2.68. Invocation command line was
$ ./configure --enable-mpitests=no --with-o2ib=/usr/src/ofa-kernel-headers-2.6.32-jb23-358.18.1.el6-lustre-2.4.1/ --with-linux=/local/lustre_2.4.1/kernel_source/kernel-2.6.32-358.18.1.el6/linux-2.6.32-358.18.1.el6.x86_64/
Which seems reasonable... However, looking at the config.log I am not so sure; I will attach it.
|
|
config.log
|
|
It seems to be a build environment or Makefile thing. In builds native on Centos, all the autoconf compile tests have -Werror in the command lines captured in config.log. In your config.log, none of the compile test lines have -Werror in them. This means compile tests that should fail (because warnings about mismatched types are normally promoted to hard errors) succeed instead. I can't tell whether this is due to some subtle version skew in the autoconf or automake tools on ubuntu, or to some difference in the Makefiles for the kernel or lustre. All I know is that it is different in your config.log than in builds on Centos, SLES, or RHEL.
Can you describe how you obtained & installed rhel kernel source in ubuntu? My ubuntu environment doesn't have rpm tools, for example.
Would like to follow along in your footsteps and see if I can cause similar build errors.
|
|
I find that if I do a client build in ubuntu against the native kernel, all the autoconf compile lines in config.log do have -Werror in them, as they probably should. I don't know what is different about the builds you are doing.
|
|
The way we get the kernel is to get a centos vm...
rpm -ivh <kernel-package>
cd ~/rpmbuild
rpmbuild -bp --target=`uname -m` ./SPECS/kernel.spec
This is our latest method for getting the lustre source; you can see we explicitly turn off -Werror:
cd lustre-release
git checkout <tag>
- git tag -l will show the available tags
- Continue to do this on the same machine, do not move the git checkout....
- tidy up the source ready for building.
- Remove -Werror from the config
cd <lustre-source>
sed -i */autoconf/*.m4 -e 's/-Werror//g' -e 's/-Werror-implicit-function-declaration//'
sh ./autogen.sh
./configure --disable-modules
make dist
This will generate a lustre-X.tar.gz file. This is the source tarball you should use for the subsequent stages in the build. I will discuss with my colleagues in the morning the reason why we started ensuring that -Werror is not set, as I can't remember right now....
|
|
I strongly suspect taking out -Werror is the source of your build problems. Much of our autoconf test infrastructure depends on having that option. I'm pretty sure HAVE_SUPEROPS_USE_DENTRY is not the only config variable that's wrong due to stripping it out; it's probably just the tip of the iceberg.
|
|
Removing -Werror was the problem; putting it back gives us a client that appears to work. I used
#pragma GCC diagnostic
to silence the compiler warnings generated by gcc 4.6 that were causing the build to fail.
|
|
I'm curious why you are working so hard to create and run a rhel kernel on ubuntu? Why not just use a centos or vanilla rhel server instead? We don't normally build for or support ubuntu servers, only clients.
|
|
By the way, we have a number of small changes that went in after the 2.4 release and aren't in the b2_4 branch of our code that fix problems with errors seen in later gcc versions like 4.6 and 4.7. These are not normally seen during 2.4 builds, as the native gcc version in rhel/centos 6.4 is 4.4.
|
|
The theory was that it was easier to port the rhel kernel to ubuntu than to fold new centos servers into our server mgmt and provisioning infrastructure (which currently handles ubuntu & SLES, but not RHEL).
That assumption is under active re-consideration.
|
|
If it's any consolation we have started building for and supporting SLES11 servers again. Both SLES11 SP2 and SP3 server builds are in the 2.5 release. The vast majority of our installed base use Centos servers though, as far as I know.
|
|
If you insist on using rhel kernels on Ubuntu one approach might be to download and install one set of our prebuilt Centos kernel and lustre rpms on Ubuntu using "alien". Could avoid all kinds of problems in building your own. Just a suggestion.
|
|
Maybe just the kernel rpm would be safer. Might need to build lustre anyway to make sure all the userspace pieces were the right Ubuntu resident versions.
|
|
"If you insist on using rhel kernels on Ubuntu one approach might be to download and install one set of our prebuilt Centos kernel and lustre rpms on Ubuntu using "alien". Could avoid all kinds of problems in building your own. Just a suggestion."
We did take that approach about 5 or 6 years ago, if I remember correctly. However, our current method is easier than the pain we had with that. Given your notes, Guy has built new kernel modules and userspace while I have been fighting with trying to upgrade a filesystem today. This new client seems to work. I suspect we would be happy for this ticket to be closed. (I will let Guy confirm tomorrow.)
|
|
Problem was pilot error. Must not strip -Werror from autoconf scripts.
|
Generated at Sat Feb 10 01:39:45 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.