[LU-6828] conf-sanity test_32a: Setting MDT failover.node Created: 09/Jul/15 Updated: 04/Nov/15 Resolved: 29/Jul/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
client and server: lustre-master build # 3094 RHEL7 |
||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/07512048-25ec-11e5-866a-5254006e85c2.

The sub-test test_32a failed with the following error:

Setting MDT failover.node

Test log below. The MDS log is missing; TEI-3677 tracks that issue. This may be similar to LU-6807.

CMD: shadow-43vm3 mount -t lustre -o loop,mgsnode=10.1.6.3@tcp /tmp/t32/ost /tmp/t32/mnt/ost
CMD: shadow-43vm3 /usr/sbin/lctl get_param -n obdfilter.t32fs-OST0000.uuid
CMD: shadow-43vm3 /usr/sbin/lctl conf_param t32fs-OST0000.osc.max_dirty_mb=15
shadow-43vm3: warning: 'lctl conf_param' is deprecated, use 'lctl set_param -P' instead
CMD: shadow-43vm3 /usr/sbin/lctl conf_param t32fs-OST0000.failover.node=10.1.6.3@tcp
shadow-43vm3: warning: 'lctl conf_param' is deprecated, use 'lctl set_param -P' instead
CMD: shadow-43vm3 /usr/sbin/lctl conf_param t32fs-MDT0000.mdc.max_rpcs_in_flight=9
shadow-43vm3: warning: 'lctl conf_param' is deprecated, use 'lctl set_param -P' instead
CMD: shadow-43vm3 /usr/sbin/lctl conf_param t32fs-MDT0000.failover.node=10.1.6.3@tcp
pdsh@shadow-43vm6: shadow-43vm3: mcmd: xpoll (setting up stderr): Interrupted system call
conf-sanity test_32a: @@@@@@ FAIL: Setting MDT "failover.node" |
| Comments |
| Comment by Peter Jones [ 09/Jul/15 ] |
|
Fan Yong, could you please make this your top priority? This is disrupting the RHEL7 server testing. Thanks, Peter |
| Comment by nasf (Inactive) [ 11/Jul/15 ] |
|
The tests from test_32a failed because the MDS (shadow-43vm3) was lost. It seems that the MDS was corrupted, but there are no MDS logs on Maloo. The same is true for LU-6807. We need the MDS logs for further investigation, which depends on TEI-3677. |
| Comment by Sarah Liu [ 13/Jul/15 ] |
|
I found something from this link: https://testing.hpdd.intel.com/test_logs/14bab794-25ec-11e5-866a-5254006e85c2/show_text MDS console 01:48:26:[ 2959.156834] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == conf-sanity test 32a: Upgrade \(not live\) ========================================================== 01:48:14 \(1436406494\) 01:48:26:[ 2959.269087] Lustre: DEBUG MARKER: == conf-sanity test 32a: Upgrade (not live) ========================================================== 01:48:14 (1436406494) 01:48:26:[ 2959.365721] Lustre: DEBUG MARKER: which tunefs.lustre 01:48:26:[ 2959.547961] Lustre: DEBUG MARKER: find /usr/lib64/lustre/tests -maxdepth 1 -name 'disk*-ldiskfs.tar.bz2' 01:48:26:[ 2959.717579] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids 01:48:26:[ 2960.008631] Lustre: DEBUG MARKER: mkdir -p /tmp/t32/mnt/mdt /tmp/t32/mnt/ost 01:48:26:[ 2960.184272] Lustre: DEBUG MARKER: tar xjvf /usr/lib64/lustre/tests/disk1_8_up_2_5-ldiskfs.tar.bz2 -S -C /tmp/t32 01:48:26:[ 2960.465708] Lustre: DEBUG MARKER: cat /tmp/t32/commit 01:48:26:[ 2960.629076] Lustre: DEBUG MARKER: cat /tmp/t32/kernel 01:48:26:[ 2960.789007] Lustre: DEBUG MARKER: cat /tmp/t32/arch 01:48:26:[ 2960.953119] Lustre: DEBUG MARKER: cat /tmp/t32/bspace 01:48:26:[ 2961.135406] Lustre: DEBUG MARKER: cat /tmp/t32/ispace 01:48:37:[ 2961.835195] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param debug=-1 01:48:37:[ 2962.064981] Lustre: DEBUG MARKER: tunefs.lustre --dryrun /tmp/t32/mdt 01:48:37:[ 2962.293098] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust 01:48:37:[ 2962.683563] Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust 01:48:37:[ 2963.112844] Lustre: DEBUG MARKER: mount -t lustre -o loop,exclude=t32fs-OST0000 /tmp/t32/mdt /tmp/t32/mnt/mdt 01:48:37:[ 2963.220105] loop: module loaded 01:48:37:[ 2963.240628] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: errors=remount-ro,user_xattr,acl,no_mbcache 01:48:37:[ 2963.241781] Lustre: 29175:0:(osd_handler.c:5773:osd_mount()) t32fs-MDT0000-osd: device /dev/loop0 was upgraded from Lustre-1.x without enabling the dirdata feature. If you do not want to downgrade to Lustre-1.x again, you can enable it via 'tune2fs -O dirdata device' 01:48:37:[ 2963.280558] Lustre: 29221:0:(mgs_llog.c:235:mgs_fsdb_handler()) MDT using 1.8 OSC name scheme 01:48:37:[ 2963.375900] Lustre: 29222:0:(obd_config.c:1497:class_config_llog_handler()) For 1.8 interoperability, rename obd type from mds to mdt 01:48:37:[ 2963.490512] Lustre: 29222:0:(obd_mount.c:885:lustre_check_exclusion()) Excluding t32fs-OST0000 (on exclusion list) 01:48:37:[ 2963.668973] Lustre: ctl-t32fs-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt 01:48:37:[ 2963.882424] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.t32fs-MDT0000.uuid 01:48:37:[ 2964.054176] Lustre: DEBUG MARKER: tunefs.lustre --dryrun /tmp/t32/ost 01:48:37:[ 2964.259355] Lustre: DEBUG MARKER: mount -t lustre -o loop,mgsnode=10.1.6.3@tcp /tmp/t32/ost /tmp/t32/mnt/ost 01:48:37:[ 2964.393345] LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: errors=remount-ro,,no_mbcache 01:48:37:[ 2964.427745] Lustre: srv-t32fs-OST0000: No data found on store. 
Initialize space 01:48:37:[ 2964.429269] Lustre: Skipped 1 previous similar message 01:48:37:[ 2964.755144] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n obdfilter.t32fs-OST0000.uuid 01:48:37:[ 2964.934554] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param t32fs-OST0000.osc.max_dirty_mb=15 01:48:37:[ 2965.037764] Lustre: Setting parameter t32fs-OST0000-osc.osc.max_dirty_mb in log t32fs-client 01:48:37:[ 2965.038645] Lustre: Skipped 26 previous similar messages 01:48:37:[ 2965.129618] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param t32fs-OST0000.failover.node=10.1.6.3@tcp 01:48:37:[ 2965.319212] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param t32fs-MDT0000.mdc.max_rpcs_in_flight=9 01:48:37:[ 2965.512505] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param t32fs-MDT0000.failover.node=10.1.6.3@tcp 01:48:37:[ 2965.542134] ------------[ cut here ]------------ 01:48:37:[ 2965.543026] kernel BUG at mm/slub.c:3379! 01:48:37:[ 2965.543026] invalid opcode: 0000 [#1] SMP 01:48:37:[ 2965.544007] Modules linked in: ofd(OF) ost(OF) loop lustre(OF) lmv(OF) mdc(OF) lov(OF) osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) sha512_generic libcfs(OF) ldiskfs(OF) dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ppdev ib_sa serio_raw ib_mad pcspkr virtio_balloon i2c_piix4 parport_pc parport ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ata_piix ttm 8139too virtio_pci 8139cp virtio_ring virtio mii drm libata i2c_core floppy 01:48:37:[ 2965.544007] CPU: 0 PID: 12 Comm: rcuos/0 Tainted: GF O-------------- 3.10.0-229.4.2.el7_lustre.x86_64 #1 01:48:37:[ 2965.544007] Hardware name: Red 
Hat KVM, BIOS 0.5.1 01/01/2007 01:48:37:[ 2965.544007] task: ffff88007c050000 ti: ffff88007c04c000 task.ti: ffff88007c04c000 01:48:37:[ 2965.544007] RIP: 0010:[<ffffffff811ab1f3>] [<ffffffff811ab1f3>] kfree+0x133/0x140 01:48:37:[ 2965.544007] RSP: 0018:ffff88007c04fd98 EFLAGS: 00010246 01:48:37:[ 2965.544007] RAX: 001fffff00000000 RBX: ffff88007b220064 RCX: 0000000180150001 01:48:37:[ 2965.544007] RDX: 001fffff00000000 RSI: 0000000000000002 RDI: ffff88007b220064 01:48:37:[ 2965.544007] RBP: ffff88007c04fdb0 R08: ffff880079016540 R09: 0000000180150001 01:48:37:[ 2965.544007] R10: ffffea0001ec8800 R11: ffffffff810b6df0 R12: ffff8800654fc000 01:48:37:[ 2965.544007] R13: ffffffff810b6e02 R14: ffffffff81a25a00 R15: ffff8800654fc0f8 01:48:37:[ 2965.544007] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 01:48:37:[ 2965.544007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 01:48:37:[ 2965.544007] CR2: 00007f3e36b98000 CR3: 00000000366ea000 CR4: 00000000000006f0 01:48:37:[ 2965.544007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 01:48:37:[ 2965.544007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 01:48:37:[ 2965.544007] Stack: 01:48:37:[ 2965.544007] 0000000000000002 ffff8800654fc000 0000000000000002 ffff88007c04fde0 01:48:37:[ 2965.544007] ffffffff810b6e02 ffff8800654fc000 0000000000000002 0000000000000000 01:48:37:[ 2965.544007] 0000000000000013 ffff88007c04fdf8 ffffffff810a2b72 ffff8800654fc0f8 01:48:37:[ 2965.544007] Call Trace: 01:48:37:[ 2965.544007] [<ffffffff810b6e02>] free_fair_sched_group+0xa2/0xc0 01:48:37:[ 2965.544007] [<ffffffff810a2b72>] free_sched_group+0x12/0x30 01:48:37:[ 2965.544007] [<ffffffff810a2ba5>] free_sched_group_rcu+0x15/0x20 01:48:37:[ 2965.544007] [<ffffffff81113289>] rcu_nocb_kthread+0x229/0x370 01:48:37:[ 2965.544007] [<ffffffff81098350>] ? wake_up_bit+0x30/0x30 01:48:37:[ 2965.544007] [<ffffffff81113060>] ? 
rcu_start_gp+0x40/0x40 01:48:37:[ 2965.544007] [<ffffffff8109739f>] kthread+0xcf/0xe0 01:48:37:[ 2965.544007] [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140 01:48:37:[ 2965.544007] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0 01:48:37:[ 2965.544007] [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140 01:48:37:[ 2965.544007] Code: 49 8b 02 31 f6 f6 c4 40 74 04 41 8b 72 68 4c 89 d7 e8 a2 42 fb ff eb 8f 4c 8b 50 30 48 8b 10 80 e6 80 4c 0f 44 d0 e9 32 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 01:48:37:[ 2965.544007] RIP [<ffffffff811ab1f3>] kfree+0x133/0x140 01:48:37:[ 2965.544007] RSP <ffff88007c04fd98> 01:48:37:[ 2965.544007] ------------[ cut here ]------------ 01:48:37:[ 2965.544007] kernel BUG at mm/vmalloc.c:1339! 01:48:37:[ 2965.544007] invalid opcode: 0000 [#2] SMP 01:48:37:[ 2965.544007] Modules linked in: ofd(OF) ost(OF) loop lustre(OF) lmv(OF) mdc(OF) lov(OF) osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) sha512_generic libcfs(OF) ldiskfs(OF) dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ppdev ib_sa serio_raw ib_mad pcspkr virtio_balloon i2c_piix4 parport_pc parport ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ata_piix ttm 8139too virtio_pci 8139cp virtio_ring virtio mii drm libata i2c_core floppy 01:48:37:[ 2965.544007] CPU: 0 PID: 12 Comm: rcuos/0 Tainted: GF O-------------- 3.10.0-229.4.2.el7_lustre.x86_64 #1 01:48:37:[ 2965.544007] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 01:48:37:[ 2965.544007] task: ffff88007c050000 ti: ffff88007c04c000 task.ti: ffff88007c04c000 01:48:37:[ 
2965.544007] RIP: 0010:[<ffffffff811903ae>] [<ffffffff811903ae>] __get_vm_area_node+0x1ce/0x1d0 01:48:37:[ 2965.544007] RSP: 0018:ffff88007c04f390 EFLAGS: 00010006 01:48:37:[ 2965.544007] RAX: ffff88007c04ffd8 RBX: 00000000ffffffff RCX: ffffc90000000000 01:48:37:[ 2965.544007] RDX: 0000000000000022 RSI: 0000000000000001 RDI: 0000000000002000 01:48:37:[ 2965.544007] RBP: ffff88007c04f3f0 R08: ffffe8ffffffffff R09: 00000000ffffffff 01:48:37:[ 2965.544007] R10: ffff880062d55180 R11: ffff8800366a78d0 R12: ffffffffa01092c9 01:48:37:[ 2965.544007] R13: 0000000000001200 R14: 00000000000080d2 R15: ffffea0000d951c0 01:48:37:[ 2965.544007] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 01:48:37:[ 2965.544007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 01:48:37:[ 2965.544007] CR2: 00007f3e36b98000 CR3: 00000000366ea000 CR4: 00000000000006f0 01:48:37:[ 2965.544007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 01:48:37:[ 2965.544007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 01:48:37:[ 2965.544007] Stack: 01:48:37:[ 2965.544007] ffffffff81191b1d 00000000000080d2 ffffffffa01092c9 8000000000000163 01:48:37:[ 2965.544007] 000080d200000000 0000000000000000 00000000e0857f38 ffff880062d55180 01:48:37:[ 2965.544007] ffff8800366a78b0 0000000000240000 0000000000000080 ffffea0000d951c0 01:48:37:[ 2965.544007] Call Trace: 01:48:37:[ 2965.544007] [<ffffffff81191b1d>] ? __vmalloc_node_range+0x7d/0x270 01:48:37:[ 2965.544007] [<ffffffffa01092c9>] ? ttm_tt_init+0x69/0xb0 [ttm] 01:48:37:[ 2965.544007] [<ffffffff81191d51>] __vmalloc+0x41/0x50 01:48:37:[ 2965.544007] [<ffffffffa01092c9>] ? 
ttm_tt_init+0x69/0xb0 [ttm] 01:48:37:[ 2965.544007] [<ffffffffa01092c9>] ttm_tt_init+0x69/0xb0 [ttm] 01:48:37:[ 2965.544007] [<ffffffffa00d34e8>] cirrus_ttm_tt_create+0x58/0x90 [cirrus] 01:48:37:[ 2965.544007] [<ffffffffa0109a7d>] ttm_bo_add_ttm+0x8d/0xc0 [ttm] 01:48:37:[ 2965.544007] [<ffffffffa010b0f1>] ttm_bo_handle_move_mem+0x571/0x5b0 [ttm] 01:48:37:[ 2965.544007] [<ffffffff81601994>] ? __slab_free+0x10e/0x277 01:48:37:[ 2965.544007] [<ffffffffa010b74a>] ? ttm_bo_mem_space+0x10a/0x310 [ttm] 01:48:37:[ 2965.544007] [<ffffffffa010be17>] ttm_bo_validate+0x247/0x260 [ttm] 01:48:37:[ 2965.544007] [<ffffffff81059e69>] ? iounmap+0x79/0xa0 01:48:37:[ 2965.544007] [<ffffffff81050069>] ? kgdb_arch_late+0xe9/0x180 01:48:37:[ 2965.544007] [<ffffffffa00d3ac2>] cirrus_bo_push_sysram+0x82/0xe0 [cirrus] 01:48:37:[ 2965.544007] [<ffffffffa00d1c84>] cirrus_crtc_do_set_base.isra.8.constprop.10+0x84/0x430 [cirrus] 01:48:37:[ 2965.544007] [<ffffffffa00d2479>] cirrus_crtc_mode_set+0x449/0x4d0 [cirrus] 01:48:37:[ 2965.544007] [<ffffffffa012a939>] drm_crtc_helper_set_mode+0x2e9/0x520 [drm_kms_helper] 01:48:37:[ 2965.544007] [<ffffffffa012b6bf>] drm_crtc_helper_set_config+0x87f/0xaa0 [drm_kms_helper] 01:48:37:[ 2965.544007] [<ffffffff8160919b>] ? __ww_mutex_lock+0x1b/0xa0 01:48:37:[ 2965.544007] [<ffffffffa0088711>] drm_mode_set_config_internal+0x61/0xe0 [drm] 01:48:37:[ 2965.544007] [<ffffffffa0133a94>] drm_fb_helper_pan_display+0x94/0xf0 [drm_kms_helper] 01:48:37:[ 2965.544007] [<ffffffff81326259>] fb_pan_display+0xc9/0x190 01:48:37:[ 2965.544007] [<ffffffff813352e0>] bit_update_start+0x20/0x50 01:48:37:[ 2965.544007] [<ffffffff81334d0d>] fbcon_switch+0x39d/0x5a0 01:48:37:[ 2965.544007] [<ffffffff813a3549>] redraw_screen+0x1a9/0x270 01:48:37:[ 2965.544007] [<ffffffff8132645e>] ? fb_blank+0xae/0xc0 01:48:37:[ 2965.544007] [<ffffffff8133222a>] fbcon_blank+0x22a/0x2f0 01:48:37:[ 2965.544007] [<ffffffff81070384>] ? wake_up_klogd+0x34/0x50 01:48:37:[ 2965.544007] [<ffffffff810705a8>] ? 
console_unlock+0x208/0x400 01:48:37:[ 2965.544007] [<ffffffff8107ee63>] ? __internal_add_timer+0x113/0x130 01:48:37:[ 2965.544007] [<ffffffff8107f057>] ? internal_add_timer+0x17/0x40 01:48:37:[ 2965.544007] [<ffffffff81080bfd>] ? mod_timer+0x11d/0x240 01:48:37:[ 2965.544007] [<ffffffff813a3c08>] do_unblank_screen+0xb8/0x1f0 01:48:37:[ 2965.544007] [<ffffffff813a3d50>] unblank_screen+0x10/0x20 01:48:37:[ 2965.544007] [<ffffffff812e4b59>] bust_spinlocks+0x19/0x40 01:48:37:[ 2965.544007] [<ffffffff8160d6a8>] oops_end+0x38/0x150 01:48:37:[ 2965.544007] [<ffffffff810173eb>] die+0x4b/0x70 01:48:37:[ 2965.544007] [<ffffffff8160ce60>] do_trap+0x60/0x170 01:48:37:[ 2965.544007] [<ffffffff81014224>] do_invalid_op+0xb4/0x130 01:48:37:[ 2965.544007] [<ffffffff811ab1f3>] ? kfree+0x133/0x140 01:48:37:[ 2965.544007] [<ffffffff810a67a5>] ? check_preempt_curr+0x85/0xa0 01:48:37:[ 2965.544007] [<ffffffff810a67d9>] ? ttwu_do_wakeup+0x19/0xc0 01:48:37:[ 2965.544007] [<ffffffff810b6e02>] ? free_fair_sched_group+0xa2/0xc0 01:48:37:[ 2965.544007] [<ffffffff8161675e>] invalid_op+0x1e/0x30 01:48:37:[ 2965.544007] [<ffffffff810b6e02>] ? free_fair_sched_group+0xa2/0xc0 01:48:37:[ 2965.544007] [<ffffffff810b6df0>] ? free_fair_sched_group+0x90/0xc0 01:48:37:[ 2965.544007] [<ffffffff811ab1f3>] ? kfree+0x133/0x140 |
| Comment by Gerrit Updater [ 17/Jul/15 ] |
|
Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/15635 |
| Comment by Gerrit Updater [ 29/Jul/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15635/ |
| Comment by Peter Jones [ 29/Jul/15 ] |
|
Landed for 2.8 |
| Comment by Andreas Dilger [ 03/Nov/15 ] |
|
Did this bug suddenly become visible for RHEL 7 because it has 16-byte slab allocation sizes? I think the smallest slab allocation size for RHEL6 is 32 bytes, which would mean that even the smaller allocation would be safe since it is rounded up to 32 bytes, and the bug would only be visible if CONFIG_DEBUG_SLAB or similar is enabled. If the slab size is only 16 bytes, then (strlen("lustre-mdtlov") + 1) is only 14 but (strlen("lustre-MDT0000-osd") + 1) is 19. |
| Comment by Yang Sheng [ 04/Nov/15 ] |
|
Yes. The smallest size is 8 bytes in RHEL7, so it crashes more easily because of the memory leak. |
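
The slab-size arithmetic discussed in the last two comments can be checked with a small sketch. It is only an illustration of the rounding argument, not the Lustre code or the landed patch. It assumes that a buffer sized for the shorter name ends up holding the longer name (the scenario the question above implies), and it models kmalloc size classes as plain powers of two starting at a minimum size, which ignores the intermediate 96- and 192-byte caches real kernels have. The helper names `kmalloc_bucket` and `overrun` are hypothetical.

```python
def kmalloc_bucket(size, min_size):
    """Round size up to the next power-of-two bucket, starting at min_size.

    Simplified model (assumption): real kernels also have 96- and
    192-byte caches, which do not matter for the sizes used here.
    """
    bucket = min_size
    while bucket < size:
        bucket *= 2
    return bucket


def overrun(alloc_len, write_len, min_size):
    """Bytes written past the end of the slab object, or 0 if it fits."""
    return max(0, write_len - kmalloc_bucket(alloc_len, min_size))


# Names taken from the comment above; +1 accounts for the trailing NUL.
alloc_len = len("lustre-mdtlov") + 1       # size the buffer was allocated with
write_len = len("lustre-MDT0000-osd") + 1  # size actually written into it

# Minimum slab sizes mentioned in the thread: 32 (RHEL6), 16, and 8 (RHEL7).
for min_size in (32, 16, 8):
    print(min_size,
          kmalloc_bucket(alloc_len, min_size),
          overrun(alloc_len, write_len, min_size))
```

Under this model the longer name fits inside the rounded-up object when the minimum size is 32 bytes, but spills past a 16-byte object when the minimum size is 16 or 8 bytes, which is consistent with the bug only showing up on RHEL7.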