[LU-7023] 2.5.3<->2.7.57 interop: obdfilter-survey test 1a hung
Created: 20/Aug/15  Updated: 14/Dec/21  Resolved: 14/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.3, Lustre 2.5.5
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

Lustre client build: https://build.hpdd.intel.com/job/lustre-master/3134
Lustre server build: https://build.hpdd.intel.com/job/lustre-b2_5/86 (2.5.3)
FSTYPE=ldiskfs


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

obdfilter-survey test 1a hung. The console log on the OSS node showed the lctl task blocked for more than 120 seconds:

02:45:04:Lustre: DEBUG MARKER: == obdfilter-survey test 1a: Object Storage Targets survey == 02:17:58 (1439432278)
02:45:04:Lustre: DEBUG MARKER: lctl dl | grep obdfilter
02:45:04:Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d '@'
02:45:04:Lustre: Echo OBD driver; http://www.lustre.org/
02:45:04:INFO: task lctl:23682 blocked for more than 120 seconds.
02:45:04:      Not tainted 2.6.32-504.30.3.el6_lustre.g683b4b2.x86_64 #1
02:45:04:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
02:45:04:lctl          D 0000000000000001     0 23682  23667 0x00000080
02:45:04: ffff88004cef38a8 0000000000000082 0000000000000000 0000000000000010
02:45:04: ffff88007e500140 ffff88004cef3848 000036e3560c1f90 000000009a45c271
02:45:04: ffff880027f89540 0000000103944929 ffff88001bdc1068 ffff88004cef3fd8
02:45:04:Call Trace:
02:45:04: [<ffffffffa0d20c71>] osd_do_bio+0x6c1/0x840 [osd_ldiskfs]
02:45:04: [<ffffffffa0412cf8>] ? ldiskfs_ext_walk_space+0x1a8/0x310 [ldiskfs]
02:45:04: [<ffffffffa0d1fa10>] ? ldiskfs_ext_new_extent_cb+0x0/0x6d0 [osd_ldiskfs]
02:45:04: [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
02:45:04: [<ffffffffa0d22e8e>] osd_read_prep+0x34e/0x480 [osd_ldiskfs]
02:45:04: [<ffffffffa0e9e1b1>] ofd_preprw_read+0x291/0x920 [ofd]
02:45:04: [<ffffffffa0e9eb96>] ofd_preprw+0x356/0x1550 [ofd]
02:45:04: [<ffffffffa024646b>] ? cl_echo_object_find+0x4ab/0xac0 [obdecho]
02:45:04: [<ffffffffa02471d3>] echo_client_brw_ioctl+0x4c3/0x1140 [obdecho]
02:45:04: [<ffffffffa0248523>] echo_client_iocontrol+0x6d3/0x2ac0 [obdecho]
02:45:04: [<ffffffff8122f23f>] ? security_inode_permission+0x1f/0x30
02:45:04: [<ffffffffa05b0dbc>] class_handle_ioctl+0x16cc/0x21a0 [obdclass]
02:45:04: [<ffffffffa05962ab>] obd_class_ioctl+0x4b/0x190 [obdclass]
02:45:04: [<ffffffff811a3ff2>] vfs_ioctl+0x22/0xa0
02:45:04: [<ffffffff811a4194>] do_vfs_ioctl+0x84/0x580
02:45:04: [<ffffffff811a4711>] sys_ioctl+0x81/0xa0
02:45:04: [<ffffffff810e608e>] ? __audit_syscall_exit+0x25e/0x290
02:45:04: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
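
To read the trace above more easily, the frames can be reduced to the ones the kernel reports as reliable (those without a leading "?"). The following is a minimal sketch in Python, not part of the original report. The names FRAME_RE and reliable_frames are hypothetical, and labelling frames that carry no bracketed module name as "kernel" is an assumption made for illustration.

```python
import re

# Matches stack frames as they appear in the console log, for example:
#   02:45:04: [<ffffffffa0d20c71>] osd_do_bio+0x6c1/0x840 [osd_ldiskfs]
#   02:45:04: [<ffffffff8109ec20>] ? autoremove_wake_function+0x0/0x40
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"                    # "?" marks an uncertain frame
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"               # module name, if any
)


def reliable_frames(log_text):
    """Return (function, module) pairs for frames not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("unreliable"):
            # Assumption: no bracketed module means a core kernel symbol.
            frames.append((m.group("func"), m.group("module") or "kernel"))
    return frames


# A few lines copied from the trace in this report.
sample = """\
02:45:04: [<ffffffffa0d20c71>] osd_do_bio+0x6c1/0x840 [osd_ldiskfs]
02:45:04: [<ffffffffa0412cf8>] ? ldiskfs_ext_walk_space+0x1a8/0x310 [ldiskfs]
02:45:04: [<ffffffffa0d22e8e>] osd_read_prep+0x34e/0x480 [osd_ldiskfs]
02:45:04: [<ffffffff811a3ff2>] vfs_ioctl+0x22/0xa0
"""

for func, module in reliable_frames(sample):
    print(f"{func} [{module}]")
```

Applied to the full trace, this would list the chain from osd_do_bio at the top down through the echo client ioctl path to system_call_fastpath, which makes it easier to see that the lctl task is waiting inside the osd_ldiskfs read path.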

Maloo reports:
https://testing.hpdd.intel.com/test_sets/eccea1dc-4189-11e5-9e18-5254006e85c2
https://testing.hpdd.intel.com/test_sets/7a0c1bcc-4196-11e5-9e18-5254006e85c2
https://testing.hpdd.intel.com/test_sets/e813f8e0-4166-11e5-882c-5254006e85c2
https://testing.hpdd.intel.com/test_sets/a28acbda-20d9-11e5-8478-5254006e85c2

