Details
-
Bug
-
Resolution: Cannot Reproduce
-
Blocker
-
Lustre 2.0.0
-
None
-
3
-
8538
Description
At CEA, multiple OSS nodes sometimes hang completely. The nodes are unresponsive and have to be crashed.
In the dump, they see that multiple tasks are spinning on "dq_list_lock" with the following stack traces:
============================================================================
#7 [ffff8805cf859180] _spin_lock at ffffffff81454fee
#8 [ffff8805cf859188] dqget at ffffffff811b0914
#9 [ffff8805cf8591d8] vfs_get_dqblk at ffffffff811b0f5a
#10 [ffff8805cf8591f8] fsfilt_ldiskfs_quotactl at ffffffffa03fcbff
#11 [ffff8805cf8592a8] compute_remquota at ffffffffa07cb7ce
#12 [ffff8805cf859328] quota_check_common at ffffffffa07d4ade
#13 [ffff8805cf859468] quota_chk_acq_common at ffffffffa07d5561
#14 [ffff8805cf8595e8] filter_commitrw_write at ffffffffa0797488
#15 [ffff8805cf8597d8] filter_commitrw at ffffffffa078a535
#16 [ffff8805cf859898] obd_commitrw at ffffffffa0655ffa
#17 [ffff8805cf859918] ost_brw_write at ffffffffa065e644
#18 [ffff8805cf859af8] ost_handle at ffffffffa066337a
#19 [ffff8805cf859ca8] ptlrpc_server_handle_request at ffffffffa06c5b11
#20 [ffff8805cf859de8] ptlrpc_main at ffffffffa06c6f0a
#21 [ffff8805cf859f48] kernel_thread at ffffffff8100d1aa
and
#6 [ffff8804e59817d0] _spin_lock at ffffffff81454fee
#7 [ffff8804e59817d8] dqget at ffffffff811b0914
#8 [ffff8804e5981828] dquot_initialize at ffffffff811b1077
#9 [ffff8804e5981898] filter_destroy at ffffffffa0779496
#10 [ffff8804e5981a78] ost_destroy at ffffffffa0656de3
#11 [ffff8804e5981af8] ost_handle at ffffffffa066252b
#12 [ffff8804e5981ca8] ptlrpc_server_handle_request at ffffffffa06c5b11
#13 [ffff8804e5981de8] ptlrpc_main at ffffffffa06c6f0a
#14 [ffff8804e5981f48] kernel_thread at ffffffff8100d1aa
============================================================================
while the task that owns the "dq_list_lock" is itself spinning forever with the following stack trace:
============================================================================
#6 [ffff88039cdeb8c0] vfs_quota_sync at ffffffff811b128b
#7 [ffff88039cdeb918] fsfilt_ldiskfs_quotactl at ffffffffa03fc6fe
#8 [ffff88039cdeb9c8] filter_quota_ctl at ffffffffa07d1bc2
#9 [ffff88039cdebaf8] ost_handle at ffffffffa06627d9
#10 [ffff88039cdebca8] ptlrpc_server_handle_request at ffffffffa06c5b11
#11 [ffff88039cdebde8] ptlrpc_main at ffffffffa06c6f0a
#12 [ffff88039cdebf48] kernel_thread at ffffffff8100d1aa
============================================================================
We can also see that a (struct super_block *)->s_dquot.info[cnt].dqi_dirty_list list contains a single "struct dquot" whose dq_dirty.next points to itself and whose dq_flags has both the DQ_ACTIVE_B and DQ_MOD_B bits unset. This appears to cause an infinite loop in vfs_quota_sync()/clear_dquot_dirty().
So maybe there is a place (in the kernel or in Lustre) where a dquot struct can be linked into or unlinked from the dqi_dirty_list without holding "dq_list_lock".
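To make the suspected loop concrete, here is a small userspace model of it. It is only a sketch under assumptions: the structure and function names ending in _model, the flag bit positions, and the iteration limit are made up for illustration, and the logic is a simplified reading of the sync loop, not the kernel source. It shows that a self-linked entry which the list head still points at, with its "modified" bit clear, is never unlinked, so a loop of the form "while the dirty list is not empty" cannot terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Bit positions are placeholders for this model, not the kernel's values. */
#define MODEL_DQ_MOD    (1UL << 0)
#define MODEL_DQ_ACTIVE (1UL << 1)

struct dquot_model {
    unsigned long dq_flags;
    struct list_head dq_dirty;
};

static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

/* Unlinks the entry only if the "modified" bit was set. */
static bool clear_dirty_model(struct dquot_model *dq)
{
    if (!(dq->dq_flags & MODEL_DQ_MOD))
        return false;               /* nothing is unlinked in this case */
    dq->dq_flags &= ~MODEL_DQ_MOD;
    list_del_init(&dq->dq_dirty);
    return true;
}

/* Drains the dirty list; returns the iteration count, or -1 once the
 * (hypothetical) limit is hit, which stands in for "spins forever". */
static long sync_model(struct list_head *dirty, long limit)
{
    long iterations = 0;

    while (!list_empty(dirty)) {
        if (iterations++ >= limit)
            return -1;
        struct dquot_model *dq = (struct dquot_model *)
            ((char *)dirty->next - offsetof(struct dquot_model, dq_dirty));
        /* The model treats write-back as simply clearing the dirty state. */
        clear_dirty_model(dq);
    }
    return iterations;
}

/* State as described in the dump: one self-linked entry, both bits clear. */
long demo_state_from_dump(void)
{
    struct list_head dirty;
    struct dquot_model dq = { .dq_flags = 0 };

    dq.dq_dirty.next = dq.dq_dirty.prev = &dq.dq_dirty;
    dirty.next = dirty.prev = &dq.dq_dirty;
    assert(!list_empty(&dirty));
    return sync_model(&dirty, 1000);
}

/* Consistent state: entry properly linked and marked modified. */
long demo_healthy_state(void)
{
    struct list_head dirty;
    struct dquot_model dq = { .dq_flags = MODEL_DQ_MOD };

    dirty.next = dirty.prev = &dq.dq_dirty;
    dq.dq_dirty.next = dq.dq_dirty.prev = &dirty;
    return sync_model(&dirty, 1000);
}
```

In this model demo_healthy_state() drains the list in one pass, while demo_state_from_dump() hits the iteration limit. In the real code there is no such limit and the lock stays held, which would match the other threads spinning in dqget().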
On the OSSes, we also see very often the following messages in the syslog:
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: -----------[ cut here ]-----------
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Tainted: GF W )
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: Hardware name: bullx super-node
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: list_add corruption. next->prev should be prev (ffff88087da265c0), but was ffff88087c9bb2b0. (next=ffff88087c9bb2b0).
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: Modules linked in: iptable_filter(U) ip_tables(U) x_tables(U) obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd2(U) lustre(U) lov(U) osc(U) mdc(U) lquota(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(F)(U) lpfc(U) scsi_transport_fc(U) scsi_tgt(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) usbhid(U) hid(U) mlx4_core(U) igb(U) ioatdma(U) i2c_i801(U) sg(U) i2c_core(U) uhci_hcd(U) dca(U) ehci_hcd(U) iTCO_wdt(U) iTCO_vendor_support(U) ext3(U) jbd(U) mbcache(U) sd_mod(U) crc_t10dif(U) ahci(U) dm_mod(U) [last unloaded: scsi_tgt]
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: Pid: 10660, comm: ll_ost_io_185 Tainted: GF W 2.6.32-30.el6.Bull.16.x86_64 #1
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: Call Trace:
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: [<ffffffff8105caa3>] warn_slowpath_common+0x83/0xc0
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: [<ffffffff8105cb41>] warn_slowpath_fmt+0x41/0x50
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: [<ffffffff8124ca5d>] __list_add+0x6d/0xa0
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: [<ffffffff811aef9d>] dquot_mark_dquot_dirty+0x5d/0x70
2011-03-31 11:38:17 Mar 31 11:38:17 node206 kernel: [<ffffffffa087f251>] ldiskfs_mark_dquot_dirty+0x31/0x60 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff811af887>] __dquot_free_space+0x197/0x2f0
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff811afa10>] dquot_free_space+0x10/0x20
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa084b3a3>] ldiskfs_free_blocks+0xf3/0x110 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa085033e>] ldiskfs_ext_truncate+0x82e/0x9c0 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff811106e2>] ? pagevec_lookup+0x22/0x30
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa085cfc8>] ldiskfs_truncate+0x4c8/0x660 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa084d43b>] ? __ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8112303b>] ? unmap_mapping_range+0x6b/0x140
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff81111ebe>] vmtruncate+0x5e/0x70
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff811729c5>] inode_setattr+0x35/0x170
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa085dba6>] ldiskfs_setattr+0x186/0x390 [ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08c500e>] fsfilt_ldiskfs_setattr+0x17e/0x200 [fsfilt_ldiskfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff810fee3f>] ? find_or_create_page+0x3f/0xb0
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08f0fd4>] filter_setattr_internal+0xcc4/0x22c0 [obdfilter]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08de14f>] ? filter_fmd_find_nolock+0x24f/0x2f0 [obdfilter]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08d6633>] ? filter_fmd_put+0x33/0x190 [obdfilter]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa00f4dc1>] ? push_ctxt+0x281/0x3e0 [lvfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08f272d>] filter_setattr+0x15d/0x610 [obdfilter]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa0600e0b>] ? lustre_pack_reply_v2+0x23b/0x310 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa05ffc65>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa062ba7b>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa0600fb1>] ? lustre_pack_reply_flags+0xd1/0x1f0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08f2cb9>] filter_truncate+0xd9/0x290 [obdfilter]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa00f973c>] ? lprocfs_counter_add+0x12c/0x170 [lvfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08b0741>] ost_punch+0x2a1/0x8c0 [ost]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa06019dc>] ? lustre_msg_get_version+0x7c/0xe0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa05ff884>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa0601b9c>] ? lustre_msg_get_conn_cnt+0x7c/0xe0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa08b86b0>] ost_handle+0x31d0/0x4f40 [ost]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8124a390>] ? __bitmap_weight+0x50/0xb0
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa05ff884>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa060eb11>] ptlrpc_server_handle_request+0x421/0xef0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8104079e>] ? activate_task+0x2e/0x40
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8104e0b6>] ? try_to_wake_up+0x286/0x380
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8104e1c2>] ? default_wake_function+0x12/0x20
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff81041059>] ? __wake_up_common+0x59/0x90
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa012c5ae>] ? cfs_timer_arm+0xe/0x10 [libcfs]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa060ff0a>] ptlrpc_main+0x92a/0x15b0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8104e1b0>] ? default_wake_function+0x0/0x20
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8100d1aa>] child_rip+0xa/0x20
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffffa060f5e0>] ? ptlrpc_main+0x0/0x15b0 [ptlrpc]
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: [<ffffffff8100d1a0>] ? child_rip+0x0/0x20
2011-03-31 11:38:18 Mar 31 11:38:17 node206 kernel: --[ end trace bb3c2f07eefda023 ]--
..........
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: -----------[ cut here ]-----------
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Tainted: GF W )
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: Hardware name: bullx super-node
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: list_add corruption. next->prev should be prev (ffff88087da265c0), but was ffff88087c9bb2b0. (next=ffff88087c9bb2b0).
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: Modules linked in: iptable_filter(U) ip_tables(U) x_tables(U) obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd2(U) lustre(U) lov(U) osc(U) mdc(U) lquota(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(F)(U) lpfc(U) scsi_transport_fc(U) scsi_tgt(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) usbhid(U) hid(U) mlx4_core(U) igb(U) ioatdma(U) i2c_i801(U) sg(U) i2c_core(U) uhci_hcd(U) dca(U) ehci_hcd(U) iTCO_wdt(U) iTCO_vendor_support(U) ext3(U) jbd(U) mbcache(U) sd_mod(U) crc_t10dif(U) ahci(U) dm_mod(U) [last unloaded: scsi_tgt]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: Pid: 20096, comm: ll_ost_io_45 Tainted: GF W 2.6.32-30.el6.Bull.16.x86_64 #1
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: Call Trace:
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8105caa3>] warn_slowpath_common+0x83/0xc0
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8105cb41>] warn_slowpath_fmt+0x41/0x50
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8124ca5d>] __list_add+0x6d/0xa0
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff811aef9d>] dquot_mark_dquot_dirty+0x5d/0x70
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa087f251>] ldiskfs_mark_dquot_dirty+0x31/0x60 [ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff811afb83>] __dquot_alloc_space+0x133/0x220
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff81250078>] ? __percpu_counter_add+0x68/0x90
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff811afc9e>] dquot_alloc_space+0xe/0x10
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa0865e96>] ldiskfs_mb_new_blocks+0xf6/0x660 [ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff81453c1e>] ? mutex_lock+0x1e/0x50
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff811b0a7b>] ? dqget+0x1cb/0x380
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08c6d1b>] ldiskfs_ext_new_extent_cb+0x59b/0x6f0 [fsfilt_ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff811851cc>] ? __getblk+0x2c/0x2e0
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa084f7e9>] ldiskfs_ext_walk_space+0x109/0x2c0 [ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08c6780>] ? ldiskfs_ext_new_extent_cb+0x0/0x6f0 [fsfilt_ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08c644d>] fsfilt_map_nblocks+0xed/0x120 [fsfilt_ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08c659b>] fsfilt_ldiskfs_map_ext_inode_pages+0x11b/0x260 [fsfilt_ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff810cea15>] ? call_rcu_sched+0x15/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff81088cad>] ? commit_creds+0x11d/0x1e0
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08c6775>] fsfilt_ldiskfs_map_inode_pages+0x95/0xa0 [fsfilt_ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa087efb8>] ? ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08ff4c5>] filter_do_bio+0xd75/0x1860 [obdfilter]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa0901bd8>] filter_commitrw_write+0x13d8/0x284c [obdfilter]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08f4535>] filter_commitrw+0x2c5/0x2f0 [obdfilter]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa05ffc65>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa062ba7b>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa00f973c>] ? lprocfs_counter_add+0x12c/0x170 [lvfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08abffa>] obd_commitrw+0x11a/0x410 [ost]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08b4644>] ost_brw_write+0xff4/0x1e90 [ost]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa05f9e44>] ? ptlrpc_send_reply+0x284/0x6f0 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08f2cb9>] ? filter_truncate+0xd9/0x290 [obdfilter]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa00f973c>] ? lprocfs_counter_add+0x12c/0x170 [lvfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8104e1b0>] ? default_wake_function+0x0/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa05ff884>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa08b937a>] ost_handle+0x3e9a/0x4f40 [ost]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8124a390>] ? __bitmap_weight+0x50/0xb0
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa05ff884>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa060eb11>] ptlrpc_server_handle_request+0x421/0xef0 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8104079e>] ? activate_task+0x2e/0x40
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8104e0b6>] ? try_to_wake_up+0x286/0x380
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8104e1c2>] ? default_wake_function+0x12/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff81041059>] ? __wake_up_common+0x59/0x90
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa012c5ae>] ? cfs_timer_arm+0xe/0x10 [libcfs]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa060ff0a>] ptlrpc_main+0x92a/0x15b0 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8104e1b0>] ? default_wake_function+0x0/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8100d1aa>] child_rip+0xa/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffffa060f5e0>] ? ptlrpc_main+0x0/0x15b0 [ptlrpc]
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: [<ffffffff8100d1a0>] ? child_rip+0x0/0x20
2011-04-02 10:59:08 Apr 2 10:59:08 node206 kernel: --[ end trace bb3c2f07eefda100 ]--
.........
=============================================================
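For reference, the warnings above come from a consistency check made before a new entry is inserted between prev and next. In both messages the reported next->prev value equals next itself, i.e. the first entry reachable from prev has its back pointer aimed at itself. The following is a rough userspace model of that check, with illustrative names (list_add_valid_model and the demo functions are not kernel functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* Returns true if inserting between prev and next would keep the list
 * consistent; otherwise prints a message shaped like the one in the logs. */
static bool list_add_valid_model(struct list_head *prev, struct list_head *next)
{
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), "
                "but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        return false;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return false;
    }
    return true;
}

/* Head still points at an entry whose own links point back at itself. */
bool demo_self_linked_entry(void)
{
    struct list_head head, entry;

    entry.next = entry.prev = &entry;
    head.next = head.prev = &entry;
    assert(head.next->prev == &entry);   /* same shape as in the warning */
    return list_add_valid_model(&head, head.next);
}

/* Properly linked single-entry list. */
bool demo_healthy_list(void)
{
    struct list_head head, entry;

    head.next = head.prev = &entry;
    entry.next = entry.prev = &head;
    return list_add_valid_model(&head, head.next);
}
```

Only the self-linked case fails the check. That shape is what one would expect if an entry had been re-initialized (for example by a delete-and-reinit) while another CPU's view of the list still pointed at it, but this is an interpretation of the log, not a confirmed root cause.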
To me, this problem looks very similar to bugzilla 22363. But it is strange that the fix for that bug was landed only in the 1.8 branch. In comment 28 Andrew says that master does not need it as it is a SLES11-only fix, but now that we support RHEL6 in master, is this still true?
I also noticed that the patch quota-support-64-bit-quota-format.patch is not applied in the 2.6-rhel6.series file.
What do you think?
TIA,
Sebastien.
As per update from Bull, this issue only occurs in pre-GA versions of RHEL6 without the previously mentioned adjustment so marking as RESOLVED. Please reopen if some further evidence is found to suggest that this is not the case.