Details
- Bug
- Resolution: Duplicate
- Major
- None
- Lustre 2.4.2
- None
- 3
- 13203
Description
Hi,
When working with files striped across a large number of OSTs, CEA can see messages like:
Lustre: 12572:0:(osd_handler.c:833:osd_trans_start()) scratch3-MDT0000: too many transaction credits (28288 > 25600)
Lustre: 12572:0:(osd_handler.c:840:osd_trans_start()) create: 160/4000, delete: 2/35, destroy: 1/25
Lustre: 12572:0:(osd_handler.c:845:osd_trans_start()) attr_set: 2/2, xattr_set: 161/2254
Lustre: 12572:0:(osd_handler.c:852:osd_trans_start()) write: 1282/17948, punch: 320/1280, quota 4/4
Lustre: 12572:0:(osd_handler.c:857:osd_trans_start()) insert: 161/2736, delete: 1/25
Lustre: 12572:0:(osd_handler.c:862:osd_trans_start()) ref_add: 1/1, ref_del: 3/3
Pid: 12572, comm: mdt01_005
Call Trace:
[<ffffffffa041b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0c3b31e>] osd_trans_start+0x65e/0x680 [osd_ldiskfs]
[<ffffffffa0d3e309>] lod_trans_start+0x1b9/0x250 [lod]
[<ffffffffa08d6357>] mdd_trans_start+0x17/0x20 [mdd]
[<ffffffffa08cb3be>] mdd_unlink+0x41e/0xe30 [mdd]
[<ffffffffa0cc2da8>] mdo_unlink+0x18/0x50 [mdt]
[<ffffffffa0cc6280>] mdt_reint_unlink+0x820/0x1010 [mdt]
[<ffffffffa0cc2aa1>] mdt_reint_rec+0x41/0xe0 [mdt]
[<ffffffffa0ca7c73>] mdt_reint_internal+0x4c3/0x780 [mdt]
[<ffffffffa0ca7f74>] mdt_reint+0x44/0xe0 [mdt]
[<ffffffffa0cacc27>] mdt_handle_common+0x647/0x16d0 [mdt]
[<ffffffffa0788b9c>] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
[<ffffffffa0ce6835>] mds_regular_handle+0x15/0x20 [mdt]
[<ffffffffa07983b8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
[<ffffffffa041c5de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[<ffffffffa042dd9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
[<ffffffffa078f719>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
[<ffffffff81058bd3>] ? __wake_up+0x53/0x70
[<ffffffffa079974e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
[<ffffffffa0798c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffffa0798c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffffa0798c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c200>] ? child_rip+0x0/0x20
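To see which operation dominates the credit request in a warning like the one above, the per-operation breakdown can be parsed and ranked. The following is only a rough sketch: the helper names `parse_credits` and `rank_by_credits` are hypothetical, and it assumes each entry reads as `<label>: <declared ops>/<credits>`, which this report does not state explicitly.

```python
import re

# Credit breakdown copied from the osd_trans_start() warning in this report.
LOG = """\
create: 160/4000, delete: 2/35, destroy: 1/25
attr_set: 2/2, xattr_set: 161/2254
write: 1282/17948, punch: 320/1280, quota 4/4
insert: 161/2736, delete: 1/25
ref_add: 1/1, ref_del: 3/3
"""

# Assumption: each entry is "<label>: <declared ops>/<credits>".
# The colon is optional because "quota 4/4" is printed without one.
ENTRY = re.compile(r"(\w+):?\s+(\d+)/(\d+)")


def parse_credits(text):
    """Return a list of (label, ops, credits) tuples in log order.

    A list is used instead of a dict because the label "delete"
    appears twice in the warning.
    """
    return [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in ENTRY.finditer(text)]


def rank_by_credits(entries):
    """Sort entries so the largest credit consumer comes first."""
    return sorted(entries, key=lambda e: e[2], reverse=True)


if __name__ == "__main__":
    for label, ops, credits in rank_by_credits(parse_credits(LOG)):
        print(f"{label:10s} ops={ops:5d} credits={credits:6d}")
```

Under that reading, the `write` entry is by far the largest contributor in this trace. The sketch only ranks the printed entries and makes no claim about how the total in the first log line is derived from them.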
This issue looks very similar to LU-4611. Once a proper fix is identified there, we would need it backported to b2_4.
Thanks,
Sebastien.
Issue Links
- duplicates: LU-4611 too many transaction credits (32279 > 25600), Resolved