[LU-2121] Test failure on test suite lustre-rsync-test, subtest test_1 Created: 09/Oct/12  Updated: 29/Sep/15  Resolved: 29/Sep/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Issue Links:
Related
is related to LU-4781 lustre-rsync-test test_2b: Replicatio... Resolved
Severity: 3
Rank (Obsolete): 5122

 Description   

This issue was created by maloo for Li Wei <liwei@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/04462af8-1206-11e2-a663-52540035b04c.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

Info required for matching: lustre-rsync-test 1



 Comments   
Comment by Li Wei (Inactive) [ 16/Oct/12 ]

https://maloo.whamcloud.com/test_sets/3be69dba-173a-11e2-afe1-52540035b04c

CMD: fat-intel-1vm3 dumpe2fs -h lustre-mdt1/mdt1 2>&1 | grep -q large_xattr
CMD: fat-intel-1vm3 dumpe2fs -h lustre-mdt1/mdt1 2>&1
Comment by Andreas Dilger [ 19/Oct/12 ]

15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)
15:07:23:cannot allocate a tage (334)

Comment by Andreas Dilger [ 19/Oct/12 ]

Looks like the client is out of memory? Could be related to LU-2139.
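
The repeated "cannot allocate a tage (334)" messages appear to come from the libcfs trace-page allocator failing to get pages, which fits the out-of-memory theory. A minimal diagnostic sketch, assuming only that /proc/meminfo is readable on the client node; it touches no Lustre interfaces and is not part of the test framework:

/* Hedged sketch: capture the client memory state next to the console log
 * when the "cannot allocate a tage" messages appear.  Only reads
 * /proc/meminfo, so nothing Lustre-specific is assumed. */
#include <stdio.h>
#include <string.h>

int main(void)
{
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];

        if (f == NULL) {
                perror("fopen /proc/meminfo");
                return 1;
        }
        while (fgets(line, sizeof(line), f) != NULL) {
                if (strncmp(line, "MemTotal:", 9) == 0 ||
                    strncmp(line, "MemFree:", 8) == 0 ||
                    strncmp(line, "Slab:", 5) == 0)
                        fputs(line, stdout);
        }
        fclose(f);
        return 0;
}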

Comment by Andreas Dilger [ 18/Mar/14 ]

In more recent reports of this bug I see the following (I suspect it is a different problem, but this ticket is so old that it is only useful for recycling):

14:05:18:IP: [<ffffffffa073c085>] lu_context_exit+0x35/0xa0 [obdclass]
14:05:18:Oops: 0000 [#1] SMP 
14:05:18:CPU 0 
14:05:18:Pid: 18668, comm: lctl Tainted: P 2.6.32-431.3.1.el6_lustre.gc762f0f.x86_64 #1 Red Hat KVM
14:05:18:RIP: 0010:[<ffffffffa073c085>]  [<ffffffffa073c085>] lu_context_exit+0x35/0xa0 [obdclass]
14:05:18:Process lctl (pid: 18668, threadinfo ffff88006f718000, task ffff88006e76e080)
14:05:18:Stack:
14:05:18:Call Trace:
14:05:18: [<ffffffffa073d1e6>] lu_env_fini+0x16/0x30 [obdclass]
14:05:18: [<ffffffffa0ca8881>] mdd_changelog_users_seq_show+0x111/0x290 [mdd]
14:05:18: [<ffffffff811ade22>] seq_read+0xf2/0x400
14:05:18: [<ffffffff811f355e>] proc_reg_read+0x7e/0xc0
14:05:18: [<ffffffff811896b5>] vfs_read+0xb5/0x1a0
14:05:18: [<ffffffff811897f1>] sys_read+0x51/0x90
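
The oops is hit while lctl reads the changelog_users proc file (proc_reg_read -> seq_read -> mdd_changelog_users_seq_show). A minimal sketch of the same read path from user space, assuming the entry lives at the 2.x location /proc/fs/lustre/mdd/<device>/changelog_users; the device name below is a placeholder:

/* Hedged sketch: mimic the read path shown in the oops with a plain
 * read(2) of the changelog_users proc file.  "lustre-MDT0000" is a
 * placeholder; the real path depends on the filesystem name. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        const char *path =
                "/proc/fs/lustre/mdd/lustre-MDT0000/changelog_users";
        char buf[4096];
        ssize_t n;
        int fd = open(path, O_RDONLY);

        if (fd < 0) {
                perror(path);
                return 1;
        }
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                (void)write(STDOUT_FILENO, buf, n);
        close(fd);
        return 0;
}

The roughly equivalent command is "lctl get_param mdd.*.changelog_users", which matches the lctl process shown in the trace.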
Comment by Nathaniel Clark [ 05/Sep/14 ]

Several of the failures (on both DNE and ZFS) show the following on the MDS:

13:43:31:Lustre: DEBUG MARKER: == lustre-rsync-test test 1: Simple Replication == 19:43:23 (1408563803)
13:43:31:LustreError: 3206:0:(layout.c:2355:req_capsule_extend()) ASSERTION( (fmt)->rf_fields[(i)].d[(j)]->rmf_size >= (old)->rf_fields[(i)].d[(j)]->rmf_size ) failed: 
13:43:31:LustreError: 3206:0:(layout.c:2355:req_capsule_extend()) LBUG
13:43:31:Pid: 3206, comm: mdt00_003
13:43:31:
13:43:31:Call Trace:
13:43:31: [<ffffffffa0483895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:43:31: [<ffffffffa0483e97>] lbug_with_loc+0x47/0xb0 [libcfs]
13:43:31: [<ffffffffa08511cc>] req_capsule_extend+0x1fc/0x200 [ptlrpc]
13:43:31: [<ffffffffa0eba77a>] mdt_intent_policy+0x38a/0xca0 [mdt]
13:43:31: [<ffffffffa07e0789>] ldlm_lock_enqueue+0x369/0x970 [ptlrpc]
13:43:31: [<ffffffffa0809e4a>] ldlm_handle_enqueue0+0x36a/0x1120 [ptlrpc]
13:43:31: [<ffffffffa088c972>] tgt_enqueue+0x62/0x1d0 [ptlrpc]
13:43:31: [<ffffffffa088d1fe>] tgt_request_handle+0x71e/0xb10 [ptlrpc]
13:43:31: [<ffffffffa083c224>] ptlrpc_main+0xe64/0x1990 [ptlrpc]
13:43:31: [<ffffffffa083b3c0>] ? ptlrpc_main+0x0/0x1990 [ptlrpc]
13:43:31: [<ffffffff8109abf6>] kthread+0x96/0xa0
13:43:31: [<ffffffff8100c20a>] child_rip+0xa/0x20
13:43:31: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
13:43:31: [<ffffffff8100c200>] ? child_rip+0x0/0x20
13:43:31:

https://testing.hpdd.intel.com/test_sets/52111d5c-28ba-11e4-901f-5254006e85c2
https://testing.hpdd.intel.com/test_sets/f5085e68-2e13-11e4-8a0b-5254006e85c2
https://testing.hpdd.intel.com/test_sets/381780f0-334e-11e4-b04e-5254006e85c2
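
The assertion at layout.c:2355 enforces that a request format may only be extended to a superset: every field size in the new format must be at least the size of the corresponding field in the old one. A minimal, user-space model of that invariant, using simplified stand-in structures (not the real struct req_format) and illustrative sizes:

/* Hedged, simplified model of the invariant behind the LBUG above.  The
 * structures are stand-ins, not the real req_format/req_msg_field, and the
 * sizes are illustrative only. */
#include <assert.h>
#include <stdio.h>

struct fmt_field {
        const char *name;
        int         size;   /* fixed size, or 0 for variable-length */
};

struct fmt {
        const char             *name;
        int                     nfields;
        const struct fmt_field *fields;
};

/* Mirrors the check in req_capsule_extend(): 'new' may only replace 'old'
 * if it is a superset, i.e. no field gets smaller. */
static void capsule_extend_check(const struct fmt *old, const struct fmt *new)
{
        int i;

        assert(new->nfields >= old->nfields);
        for (i = 0; i < old->nfields; i++)
                assert(new->fields[i].size >= old->fields[i].size);
}

int main(void)
{
        static const struct fmt_field old_f[] = {
                { "ptlrpc_body", 184 }, { "mdt_body", 216 },
        };
        static const struct fmt_field new_f[] = {
                { "ptlrpc_body", 184 }, { "mdt_body", 216 }, { "name", 0 },
        };
        static const struct fmt old = { "OLD", 2, old_f };
        static const struct fmt new = { "NEW", 3, new_f };

        capsule_extend_check(&old, &new);   /* passes: NEW is a superset */
        printf("extend OK\n");
        return 0;
}

Shrinking any field of the new format below its old size trips the assert(), which is the condition the LBUG reports here.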

Comment by Andreas Dilger [ 29/Sep/15 ]

Closing old bug.
