[LU-12683]  sanity-hsm test_602: MDS hang in the ZFS layer Created: 22/Aug/19  Updated: 22/Aug/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
[ 8912.216320] Lustre: mdt00_003: service thread pid 24030 was inactive for 60.101 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[ 8999.961057] INFO: task mdt00_003:24030 blocked for more than 120 seconds.
[ 8999.962271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8999.963586] mdt00_003       D ffff9992d7dc6180     0 24030      2 0x00000080
[ 8999.964941] Call Trace:
[ 8999.965448]  [<ffffffffa1368ed9>] schedule+0x29/0x70
[ 8999.966536]  [<ffffffffc02d62d5>] cv_wait_common+0x125/0x150 [spl]
[ 8999.967592]  [<ffffffffa0cc2e70>] ? wake_up_atomic_t+0x30/0x30
[ 8999.968658]  [<ffffffffc02d6315>] __cv_wait+0x15/0x20 [spl]
[ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
[ 8999.971305]  [<ffffffffc03e8451>] dbuf_destroy+0xd1/0x3b0 [zfs]
[ 8999.972411]  [<ffffffffa0e1c2d6>] ? kfree+0x106/0x140
[ 8999.973392]  [<ffffffffc04061c8>] dnode_destroy+0x138/0x230 [zfs]
[ 8999.974492]  [<ffffffffc04075ce>] dnode_hold_impl+0x5fe/0xc40 [zfs]
[ 8999.975621]  [<ffffffffc03f4905>] ? dmu_object_next+0x95/0x140 [zfs]
[ 8999.976745]  [<ffffffffc03f4b9f>] dmu_object_alloc_dnsize+0x1ef/0x3e0 [zfs]
[ 8999.977992]  [<ffffffffc1150df2>] __osd_object_create+0x82/0x170 [osd_zfs]
[ 8999.979196]  [<ffffffffc115115d>] osd_mkreg+0x7d/0x210 [osd_zfs]
[ 8999.980342]  [<ffffffffc1427ae9>] ? lod_sub_declare_xattr_set+0xf9/0x300 [lod]
[ 8999.981594]  [<ffffffffc114d6a6>] osd_create+0x316/0xaf0 [osd_zfs]
[ 8999.982687]  [<ffffffffc1425675>] lod_sub_create+0x1f5/0x480 [lod]
[ 8999.983780]  [<ffffffffc1416559>] lod_create+0x69/0x350 [lod]
[ 8999.984933]  [<ffffffffc1491e08>] mdd_create_object_internal+0xb8/0x280 [mdd]
[ 8999.986143]  [<ffffffffc147b32d>] mdd_create_object+0x7d/0x8e0 [mdd]
[ 8999.987234]  [<ffffffffc14859c7>] mdd_create+0xf57/0x1660 [mdd]
[ 8999.988265]  [<ffffffffc13236a3>] mdt_reint_open+0x2373/0x3360 [mdt]
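
For triage, the trace above can be condensed into the chain of kernel modules the blocked thread passed through. The following is a minimal sketch, not part of the original report: the helper names (parse_frames, module_chain) are hypothetical, the regular expression assumes the frame layout seen in this log, and frames marked with "?" are skipped because the kernel flags them as unreliable. The sample input is a subset of the frames quoted above.

```python
import re

# Matches stack-frame lines in the layout seen in this log, for example:
# [ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module, unreliable) tuples, innermost frame first."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            unreliable = m.group(1) is not None
            # Frames without a [module] suffix belong to the core kernel.
            frames.append((m.group(2), m.group(3) or "kernel", unreliable))
    return frames


def module_chain(frames):
    """Collapse consecutive reliable frames into an ordered module list."""
    chain = []
    for _symbol, module, unreliable in frames:
        if unreliable:
            continue
        if not chain or chain[-1] != module:
            chain.append(module)
    return chain


# Subset of the frames from the hung-task trace in this ticket.
SAMPLE = """\
[ 8999.965448]  [<ffffffffa1368ed9>] schedule+0x29/0x70
[ 8999.966536]  [<ffffffffc02d62d5>] cv_wait_common+0x125/0x150 [spl]
[ 8999.967592]  [<ffffffffa0cc2e70>] ? wake_up_atomic_t+0x30/0x30
[ 8999.969889]  [<ffffffffc04b2db3>] zrl_add_impl+0x83/0xd0 [zfs]
[ 8999.977992]  [<ffffffffc1150df2>] __osd_object_create+0x82/0x170 [osd_zfs]
[ 8999.982687]  [<ffffffffc1425675>] lod_sub_create+0x1f5/0x480 [lod]
[ 8999.987234]  [<ffffffffc14859c7>] mdd_create+0xf57/0x1660 [mdd]
[ 8999.988265]  [<ffffffffc13236a3>] mdt_reint_open+0x2373/0x3360 [mdt]
"""

frames = parse_frames(SAMPLE)
chain = module_chain(frames)
print(" <- ".join(chain))
```

Read innermost first, the chain shows the thread sleeping in spl on behalf of zfs, reached from the Lustre create path (mdt, mdd, lod, osd_zfs).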

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/b7fd9ccc-c387-11e9-90ad-52540065bddc

test_602 failed with the following error:

Timeout occurred after 207 mins, last suite running was sanity-hsm, restarting cluster to continue tests


VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-hsm test_602 - Timeout occurred after 207 mins, last suite running was sanity-hsm, restarting cluster to continue tests


Generated at Sat Feb 10 02:54:44 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.