[LU-6423] Interop 2.5.3<->master replay-vbr test_7f: MDS OOM
Created: 02/Apr/15  Updated: 02/Sep/15  Resolved: 06/Apr/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

server: 2.5.3
client: lustre-master build#2697


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/24997406-d5c5-11e4-a439-5254006e85c2.

The sub-test test_7f failed with the following error:

test failed to respond and timed out

MDS console

15:31:04:Lustre: DEBUG MARKER: == replay-vbr test 7f: unlink, {lost}, rename == 08:30:13 (1427556613)
15:31:04:Lustre: MGS: non-config logname received: params
15:31:04:Lustre: Skipped 103 previous similar messages
15:31:04:Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdd.lustre-MDT0000.sync_permission=0
15:31:04:Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdt.lustre-MDT0000.commit_on_sharing=0
15:31:04:Lustre: DEBUG MARKER: sync; sync; sync
15:31:04:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
15:31:04:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
15:31:04:Turning device dm-0 (0xfd00000) read-only
15:31:04:Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
15:31:04:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
15:31:04:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
15:31:04:Lustre: DEBUG MARKER: umount -d /mnt/mds1
15:31:04:Removing read-only on unknown block (0xfd00000)
15:31:04:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
15:31:04:Lustre: DEBUG MARKER: hostname
15:31:04:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
15:31:04:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
15:31:04:LDISKFS-fs (dm-0): recovery complete
15:31:04:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
15:31:04:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
15:31:04:Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
15:31:04:mdt00_001 invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
15:31:04:mdt00_001 cpuset=/ mems_allowed=0
15:31:04:Pid: 17425, comm: mdt00_001 Not tainted 2.6.32-431.23.3.el6_lustre.x86_64 #1
15:31:04:Call Trace:
15:31:04: [<ffffffff810d0431>] ? cpuset_print_task_mems_allowed+0x91/0xb0
15:31:04: [<ffffffff81122810>] ? dump_header+0x90/0x1b0
15:31:04: [<ffffffff8112297e>] ? check_panic_on_oom+0x4e/0x80
15:31:04: [<ffffffff8112306b>] ? out_of_memory+0x1bb/0x3c0
15:31:04: [<ffffffff8112f9ef>] ? __alloc_pages_nodemask+0x89f/0x8d0
15:31:04: [<ffffffff8116e2d2>] ? kmem_getpages+0x62/0x170
15:31:04: [<ffffffff8116eeea>] ? fallback_alloc+0x1ba/0x270
15:31:04: [<ffffffff8116e93f>] ? cache_grow+0x2cf/0x320
15:31:04: [<ffffffff8116ec69>] ? ____cache_alloc_node+0x99/0x160
15:31:04: [<ffffffff8124c96c>] ? crypto_create_tfm+0x3c/0xe0
15:31:04: [<ffffffff8116fa39>] ? __kmalloc+0x189/0x220
15:31:04: [<ffffffff8124c96c>] ? crypto_create_tfm+0x3c/0xe0
15:31:04: [<ffffffff81253308>] ? crypto_init_shash_ops+0x68/0x100
15:31:04: [<ffffffff8124ca7a>] ? __crypto_alloc_tfm+0x6a/0x130
15:31:04: [<ffffffff8124d2ea>] ? crypto_alloc_base+0x5a/0xb0
15:31:04: [<ffffffffa04ad444>] ? cfs_percpt_unlock+0x24/0xb0 [libcfs]
15:31:04: [<ffffffffa04992ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]
15:31:04: [<ffffffffa04995da>] ? cfs_crypto_hash_digest+0x6a/0xf0 [libcfs]
15:31:04: [<ffffffff8116fabc>] ? __kmalloc+0x20c/0x220
15:31:04: [<ffffffffa0785713>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]
15:31:04: [<ffffffffa07be811>] ? null_authorize+0xa1/0x100 [ptlrpc]
15:31:04: [<ffffffffa07ad876>] ? sptlrpc_svc_wrap_reply+0x56/0x1c0 [ptlrpc]
15:31:04: [<ffffffffa077dc6c>] ? ptlrpc_send_reply+0x1fc/0x7f0 [ptlrpc]
15:31:04: [<ffffffffa0794215>] ? ptlrpc_at_check_timed+0xc05/0x1360 [ptlrpc]
15:31:04: [<ffffffffa078c3d9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
15:31:04: [<ffffffffa0796140>] ? ptlrpc_main+0xbd0/0x1740 [ptlrpc]
15:31:04: [<ffffffffa0795570>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
15:31:04: [<ffffffff8109abf6>] ? kthread+0x96/0xa0
15:31:04: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
15:31:04: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
15:31:04: [<ffffffff8100c200>] ? child_rip+0x0/0x20
15:31:04:Mem-Info:
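
For triage, the Call Trace frames above can be reduced to (symbol, module) pairs to see which Lustre module first appears on the allocating path. The following is a minimal sketch, assuming every frame follows the `[<addr>] ? symbol+0xoff/0xlen [module]` layout seen in this console log. The helper names `parse_frames` and `first_module_frame` are hypothetical and are not part of Lustre or its test framework. The sample frames are copied from the trace above.

```python
import re

# Matches frames such as:
#   " [<ffffffffa0785713>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]"
# The trailing "[module]" is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(console_text):
    """Return a list of (symbol, module) pairs, innermost frame first.

    Frames without a module suffix are labelled "kernel" (an assumption
    made here for readability, not something the log states).
    """
    frames = []
    for line in console_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


def first_module_frame(frames):
    """Return the innermost frame that belongs to a loadable module, or None."""
    for symbol, module in frames:
        if module != "kernel":
            return symbol, module
    return None


# Four frames copied verbatim from the MDS console log in this ticket.
sample = "\n".join([
    "15:31:04: [<ffffffff8112306b>] ? out_of_memory+0x1bb/0x3c0",
    "15:31:04: [<ffffffff8116fa39>] ? __kmalloc+0x189/0x220",
    "15:31:04: [<ffffffffa04992ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]",
    "15:31:04: [<ffffffffa0785713>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]",
])

frames = parse_frames(sample)
print(first_module_frame(frames))
```

Note that the `?` in front of each frame means the kernel unwinder considered the entry unreliable, so output from a helper like this is a hint about the allocation path rather than an exact backtrace.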


Comments
Comment by Doug Oucharek (Inactive) [ 06/Apr/15 ]

Duplicate of LU-5079
