Details
Bug | Resolution: Fixed | Minor | None | 3 | 13809
Description
Several places in the Lustre code store the result of kthread_run, a struct task_struct * (64 bits wide on 64-bit platforms), in a 32-bit int. If the low 32 bits of the pointer happen to fall in the range -4095 to -1 when read as a signed integer, the IS_ERR_VALUE macro treats the truncated value as an error even though the thread was created successfully.
This problem exists, for example, in the creation of the statahead thread, ll_sa:
static int start_statahead_thread(struct inode *dir, struct dentry *dentry)
{
        ...
        int rc;
        ...
        rc = PTR_ERR(kthread_run(ll_statahead_thread, dentry->d_parent,
                                 "ll_sa_%u", lli->lli_opendir_pid));
        thread = &sai->sai_thread;
        if (IS_ERR_VALUE(rc)) {
                CERROR("can't start ll_sa thread, rc: %d\n", rc);
                spin_lock(&lli->lli_sa_lock);
                thread_set_flags(thread, SVC_STOPPED);
                thread_set_flags(&sai->sai_agl_thread, SVC_STOPPED);
                spin_unlock(&lli->lli_sa_lock);
                ll_sai_put(sai);
                GOTO(out, rc = -EAGAIN);
        }
        ...
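When the check misfires, the ll_sa thread is in fact running, but the caller takes the error path and drops its reference with ll_sai_put(sai). This can cause a general protection fault if the ll_sa thread accesses the sai struct after it has been freed. Here is an example in which the return value from kthread_run is incorrectly interpreted as an error: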
2014-04-18T04:37:49.706791-05:00 c0-0c1s2n3 LustreError: 11453:0:(statahead.c:1588:start_statahead_thread()) can't start ll_sa thread, rc: -3968
The -3968 reported as the rc of kthread_run is not a valid error number. It is the result of truncating the returned task_struct pointer to a 32-bit int: the low 32 bits of the pointer were 0xfffff080, which is -3968 when read as a signed 32-bit value.
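The truncation can be reproduced outside the kernel. The following is a minimal userspace sketch, not Lustre or kernel code: is_err_value() and is_err_value_int() are stand-ins written for this illustration that mimic the kernel's IS_ERR_VALUE check, and the pointer value 0xffff8800fffff080 is a hypothetical address chosen only because its low 32 bits match the rc in the log above. The actual pointer from the incident is not known.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Hypothetical 64-bit task_struct address; only its low 32 bits matter here. */
#define EXAMPLE_TASK_PTR 0xffff8800fffff080ULL

/* Stand-in for IS_ERR_VALUE() on a full-width value: true only for the
 * top MAX_ERRNO values of the 64-bit range. */
static bool is_err_value(uint64_t x)
{
        return x >= (uint64_t)-MAX_ERRNO;
}

/* Mimics `int rc = PTR_ERR(ptr)`: only the low 32 bits survive.
 * The unsigned-to-signed conversion is implementation-defined in C,
 * but wraps on the usual two's-complement compilers. */
static int truncate_to_int(uint64_t ptr_bits)
{
        return (int)(int32_t)(uint32_t)ptr_bits;
}

/* Mimics IS_ERR_VALUE(rc) with an int argument: the int is sign-extended
 * to 64 bits before the comparison. */
static bool is_err_value_int(int rc)
{
        return is_err_value((uint64_t)(int64_t)rc);
}
```

With these helpers, truncate_to_int(EXAMPLE_TASK_PTR) yields -3968 and is_err_value_int(-3968) is true, while is_err_value(EXAMPLE_TASK_PTR) on the untruncated value is false.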
There are other places in Lustre that handle the return value of kthread_run incorrectly. We need to audit them all and make sure the return value is handled correctly in every case. The correct pattern is to keep the result as a pointer, test it with IS_ERR, and convert it with PTR_ERR only on the error path:
struct task_struct *task = kthread_run(...);
if (IS_ERR(task)) {
        rc = PTR_ERR(task);
        CERROR(..)
}
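The same pattern can be exercised in userspace. This is a sketch under stated assumptions: err_ptr(), ptr_err(), and is_err() are simplified stand-ins for the kernel's ERR_PTR, PTR_ERR, and IS_ERR, while struct fake_task, fake_kthread_run(), and start_thread_checked() are hypothetical names invented for the illustration and do not exist in Lustre.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Hypothetical stand-in for struct task_struct. */
struct fake_task { int pid; };

/* Simplified userspace versions of ERR_PTR / PTR_ERR / IS_ERR. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static bool is_err(const void *p)
{
        return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical thread launcher: returns a valid pointer on success,
 * or an errno encoded as a pointer on failure. */
static struct fake_task *fake_kthread_run(bool fail)
{
        static struct fake_task task = { .pid = 1 };

        return fail ? err_ptr(-ENOMEM) : &task;
}

/* Keep the result as a pointer, test the full-width pointer, and narrow
 * to int only once it is known to hold an errno. */
static int start_thread_checked(bool fail)
{
        struct fake_task *task = fake_kthread_run(fail);

        if (is_err(task))
                return (int)ptr_err(task);
        return 0;
}
```

Because the check runs on the full-width pointer, a successful launch is never mistaken for a failure, and the narrowing to int happens only for values already known to lie in the errno range.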