Details
Bug - Resolution: Fixed - Minor - None - 3 - 13809
Description
Several places in the Lustre code store the result of kthread_run, a struct task_struct * pointer (64 bits wide on 64-bit kernels), in a 32-bit int. Depending on the low 32 bits of the pointer, the truncated value can be interpreted as an error by the IS_ERR_VALUE macro.
This problem exists, for example, in the creation of the statahead thread, ll_sa:
static int start_statahead_thread(struct inode *dir, struct dentry *dentry)
{
        ...
        int rc;
        ...
        rc = PTR_ERR(kthread_run(ll_statahead_thread, dentry->d_parent,
                                 "ll_sa_%u", lli->lli_opendir_pid));
        thread = &sai->sai_thread;
        if (IS_ERR_VALUE(rc)) {
                CERROR("can't start ll_sa thread, rc: %d\n", rc);
                spin_lock(&lli->lli_sa_lock);
                thread_set_flags(thread, SVC_STOPPED);
                thread_set_flags(&sai->sai_agl_thread, SVC_STOPPED);
                spin_unlock(&lli->lli_sa_lock);
                ll_sai_put(sai);
                GOTO(out, rc = -EAGAIN);
        }
        ...
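When a valid pointer is misread as an error, the ll_sa thread has in fact been started, but the error path still calls ll_sai_put(sai). This can cause a general protection fault if the sai struct is accessed by the ll_sa thread after it has been freed. Here is an example in which the return value from kthread_run is incorrectly interpreted as an error: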
2014-04-18T04:37:49.706791-05:00 c0-0c1s2n3 LustreError: 11453:0:(statahead.c:1588:start_statahead_thread()) can't start ll_sa thread, rc: -3968
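The -3968 reported as the rc of kthread_run is not a valid error number. It is the result of truncating the returned task_struct pointer to a 32-bit int: the low 32 bits of the pointer were 0xfffff080, which reads as -3968 when treated as a signed int and so falls inside the range that IS_ERR_VALUE treats as an error.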
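The truncation can be reproduced outside the kernel. The following is a minimal user-space sketch, not Lustre or kernel code: is_err_value, buggy_check and correct_check are hypothetical stand-ins that mimic the kernel's error-range test (values within 4095 of the top of the address space), and the pointer value used in the usage note is an assumed example chosen only because its low 32 bits match the -3968 in the log message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's MAX_ERRNO. */
#define MAX_ERRNO 4095

/* Stand-in for IS_ERR_VALUE: true if x lies in the top 4095 values. */
static bool is_err_value(uint64_t x)
{
        return x >= (uint64_t)-MAX_ERRNO;
}

/*
 * Buggy pattern: the pointer bits are squeezed through a 32-bit int
 * before the error check. The narrowing conversion wraps on gcc and
 * clang (it is implementation-defined in ISO C).
 */
static bool buggy_check(uint64_t ptr_bits, int *rc_out)
{
        int rc = (int)(uint32_t)ptr_bits;

        *rc_out = rc;
        /* Sign extension turns a negative rc back into a "high" value. */
        return is_err_value((uint64_t)(int64_t)rc);
}

/* Correct pattern: test the full-width pointer value. */
static bool correct_check(uint64_t ptr_bits)
{
        return is_err_value(ptr_bits);
}
```

With the assumed pointer value 0xffff8800fffff080, buggy_check reports an error and stores -3968 in rc, while correct_check on the same value reports no error. A genuine error pointer such as (uint64_t)(int64_t)-12 is still recognised by correct_check.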
There are other places in Lustre which handle the return value of kthread_run incorrectly. We need to go through and make sure the return value of kthread_run is handled correctly in all cases. The correct way to handle it would be like this:
struct task_struct *task = kthread_run(...);
if (IS_ERR(task)) {
        rc = PTR_ERR(task);
        CERROR(..)
}