Details
Bug
Resolution: Fixed
Major
Lustre 2.0.0, Lustre 2.1.0
None
Bull PetaFlopV1.0AE1, kernel 2.6.32-71.14.1.el6, lustre 2.0.0.1
3
5005
Description
During a "warm" restart (shine stop/start) of several OSTs, the OSS became pseudo-hung (ping OK, but no login possible).
Analysis of a forced crash dump shows that all 32 CPUs/cores are spinning, each running one of 32 ll_ost_<xx> threads, all with the following stack trace:
=======================================================
_spin_lock()
filter_sync_llogs()
filter_set_info_async()
target_handle_connect()
ost_handle()
ptlrpc_server_handle_request()
ptlrpc_main()
kernel_thread()
=======================================================
while the thread holding the (struct filter_obd *)->fo_llog_list_lock spin-lock in question is sleeping, waiting to be rescheduled, with the following stack trace:
==================================================================================
schedule()
__cond_resched()
_cond_resched()
__kmalloc()
cfs_alloc()
filter_find_create_olg()
filter_set_info_async()
ost_set_info()
ost_handle()
ptlrpc_server_handle_request()
ptlrpc_main()
kernel_thread()
==================================================================================
So this definitely points to a bug in filter_find_create_olg(): when filter_find_olg_internal() does not find a matching struct obd_llog_group and one has to be allocated, the allocation can sleep, so it can and must be done with (struct filter_obd *)->fo_llog_list_lock released to avoid this race and deadlock scenario.
This means that the following source extract from the filter_find_create_olg() routine:
==================================================================================
OBD_ALLOC_PTR(olg);
if (olg == NULL)
        GOTO(out_unlock, olg = ERR_PTR(-ENOMEM));
llog_group_init(olg, group);
==================================================================================
should be replaced with:
=========================
spin_unlock(&filter->fo_llog_list_lock);
OBD_ALLOC_PTR(olg);
if (olg == NULL)
        GOTO(out, olg = ERR_PTR(-ENOMEM));
llog_group_init(olg, group);
spin_lock(&filter->fo_llog_list_lock);
=========================
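One caveat with dropping and re-taking the lock around the allocation: another thread may create the same group in that window, so a second lookup after re-locking is probably needed before the new entry is inserted. Below is a minimal userspace sketch of that pattern, not Lustre code. The names struct group, list_lock, find_locked() and find_create() are hypothetical, and a pthread mutex stands in for the kernel spin-lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct obd_llog_group. */
struct group {
        int id;
        struct group *next;
};

/* Models fo_llog_list_lock; a mutex is used here instead of a spin-lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct group *groups;

/* Caller must hold list_lock. */
static struct group *find_locked(int id)
{
        struct group *g;

        for (g = groups; g != NULL; g = g->next)
                if (g->id == id)
                        return g;
        return NULL;
}

/* Look up a group, creating it if needed; NULL on allocation failure. */
struct group *find_create(int id)
{
        struct group *g, *fresh;

        pthread_mutex_lock(&list_lock);
        g = find_locked(id);
        pthread_mutex_unlock(&list_lock);
        if (g != NULL)
                return g;

        /* Allocate with no lock held: sleeping here stalls nobody else. */
        fresh = calloc(1, sizeof(*fresh));
        if (fresh == NULL)
                return NULL;
        fresh->id = id;

        pthread_mutex_lock(&list_lock);
        g = find_locked(id);    /* re-check: another thread may have won */
        if (g == NULL) {
                fresh->next = groups;
                groups = fresh;
                g = fresh;
                fresh = NULL;   /* ownership moved to the list */
        }
        pthread_mutex_unlock(&list_lock);

        free(fresh);            /* no-op unless we lost the race */
        return g;
}
```

With this shape, two calls for the same id return the same entry, and the losing thread in a race simply frees its unused allocation.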
Landed for 2.1