[LU-10307] obdfilter-survey test_3a: Timeout occurred after 456 mins, last suite running was obdfilter-survey, restarting cluster to continue tests Created: 30/Nov/17 Updated: 01/Feb/21 Resolved: 01/Feb/21 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0, Lustre 2.10.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Casper | Assignee: | Yang Sheng |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: | onyx, interop |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
session: https://testing.hpdd.intel.com/test_sessions/9f032a71-4161-4ba2-aee0-78e2895d8180

obdfilter-survey test 3a hangs on OST umount. The last thing we see in the client test_log for test 3a is unmounting OST1 on onyx-34vm8.

From the OST dmesg log, we see:

[24240.212638] INFO: task umount:26551 blocked for more than 120 seconds.
[24240.213441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24240.214247] umount D 000000000000f908 0 26551 26550 0x00000080
[24240.215027] ffff880052e93ab0 0000000000000086 ffff8800128fdee0 ffff880052e93fd8
[24240.215949] ffff880052e93fd8 ffff880052e93fd8 ffff8800128fdee0 ffff88007c100000
[24240.216808] ffff880052e93ae0 00000001016e57b8 ffff88007c100000 000000000000f908
[24240.217726] Call Trace:
[24240.218021] [<ffffffff816a9569>] schedule+0x29/0x70
[24240.218527] [<ffffffff816a6fb4>] schedule_timeout+0x174/0x2c0
[24240.219151] [<ffffffff81098b30>] ? internal_add_timer+0x70/0x70
[24240.220024] [<ffffffffc0775343>] ? dump_exports+0x143/0x150 [obdclass]
[24240.220725] [<ffffffffc07753fb>] obd_exports_barrier+0xab/0x1a0 [obdclass]
[24240.221464] [<ffffffffc0ff16bf>] ofd_device_fini+0x8f/0x2d0 [ofd]
[24240.222134] [<ffffffffc078d911>] class_cleanup+0x971/0xcd0 [obdclass]
[24240.222818] [<ffffffffc078fcad>] class_process_config+0x19cd/0x23b0 [obdclass]
[24240.223579] [<ffffffffc0637bc7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[24240.224275] [<ffffffffc0790856>] class_manual_cleanup+0x1c6/0x710 [obdclass]
[24240.225044] [<ffffffffc07befee>] server_put_super+0x8de/0xcd0 [obdclass]
[24240.225993] [<ffffffff81203692>] generic_shutdown_super+0x72/0x100
[24240.226718] [<ffffffff81203a62>] kill_anon_super+0x12/0x20
[24240.227416] [<ffffffffc0793152>] lustre_kill_super+0x32/0x50 [obdclass]
[24240.228111] [<ffffffff81203e19>] deactivate_locked_super+0x49/0x60
[24240.228805] [<ffffffff81204586>] deactivate_super+0x46/0x60
[24240.229413] [<ffffffff812217cf>] cleanup_mnt+0x3f/0x80
[24240.229957] [<ffffffff81221862>] __cleanup_mnt+0x12/0x20
[24240.230566] [<ffffffff810ad275>] task_work_run+0xc5/0xf0
[24240.231138] [<ffffffff8102ab62>] do_notify_resume+0x92/0xb0
[24240.231779] [<ffffffff816b533d>] int_signal+0x12/0x17 |
| Comments |
| Comment by James Nunez (Inactive) [ 01/Dec/17 ] |
|
obdfilter-survey test 3a has been hanging on OST umount since at least October 11. For this test, the OST umount hangs only during interop testing, for example with 2.10.2 RC1 and 2.9.0. |
| Comment by Peter Jones [ 01/Dec/17 ] |
|
Yang Sheng, can you please investigate this one? Thanks, Peter |