[LU-5871] Do not return -EAGAIN in lod_object_init Created: 05/Nov/14 Updated: 04/Jul/19 Resolved: 11/Nov/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | Lustre 2.7.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Di Wang | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 16422 |
| Description |
|
Do not return EWOULDBLOCK (EAGAIN) from lod_object_init(), because lu_object_find_at() uses -EAGAIN to detect that the object is dying and then waits for the object to be released:

struct lu_object *lu_object_find_at(const struct lu_env *env,
				    struct lu_device *dev,
				    const struct lu_fid *f,
				    const struct lu_object_conf *conf)
{
	struct lu_site_bkt_data *bkt;
	struct lu_object *obj;
	wait_queue_t wait;

	while (1) {
		if (conf != NULL && conf->loc_flags & LOC_F_NOWAIT) {
			obj = lu_object_find_try(env, dev, f, conf, NULL);
			return obj;
		}
		obj = lu_object_find_try(env, dev, f, conf, &wait);
		/*
		 * <--- Only wait past this check if the object is dying.
		 * A failure (-EWOULDBLOCK) from lod_object_init() should
		 * not wait here, otherwise it will cause list corruption.
		 */
		if (obj != ERR_PTR(-EAGAIN))
			return obj;
|
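The ambiguity described above can be illustrated with a small user-space sketch. This is not the Lustre code or the landed patch: every mock_* name, the `queued` flag, and the use of -EIO as a replacement error are hypothetical and chosen only for illustration. It also assumes that the wait entry is queued only on the "object is dying" path, which is what would make a wait after an init failure unsafe.

```c
#include <errno.h>
#include <stdint.h>

/* User-space stand-in for the kernel's ERR_PTR() helper. */
static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

/*
 * Hypothetical init routine. `lower_rc` is the error reported by a lower
 * layer. With `remap` set, -EAGAIN is translated so that it can no longer
 * be confused with "object is dying". -EIO is an arbitrary placeholder,
 * not necessarily the code the real patch uses.
 */
int mock_object_init(int lower_rc, int remap)
{
	if (lower_rc == -EAGAIN && remap)
		return -EIO;
	return lower_rc;
}

/*
 * Hypothetical lookup. *queued is set only when the caller's wait entry
 * was really added to the wait queue (assumed: the dying path only).
 */
void *mock_find_try(int dying, int init_rc, int *queued)
{
	static int object;

	*queued = 0;
	if (dying) {
		*queued = 1;
		return ERR_PTR(-EAGAIN);
	}
	if (init_rc != 0)
		return ERR_PTR(init_rc);	/* wait entry never queued */
	return &object;
}

/* Returns 1 if the caller would wait on an entry that was never queued. */
int mock_waits_unqueued(int dying, int lower_rc, int remap)
{
	int queued;
	void *obj = mock_find_try(dying, mock_object_init(lower_rc, remap),
				  &queued);

	/* Same test as in lu_object_find_at(): only -EAGAIN means "wait". */
	if (obj != ERR_PTR(-EAGAIN))
		return 0;
	return !queued;
}
```

In this sketch, mock_waits_unqueued(0, -EAGAIN, 0) reports the unsafe wait, while the same call with remap set to 1 returns the error to the caller directly.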
| Comments |
| Comment by Andreas Dilger [ 05/Nov/14 ] |
|
Di, is this something that should be fixed for 2.7.0, or what is the impact of this problem? |
| Comment by Di Wang [ 05/Nov/14 ] |
|
Yes, it should be fixed in 2.7.0; otherwise it will cause list_entry corruption during failover (though this is probably rare). Anyway, the patch should be tiny, and I will make the patch now. |
| Comment by Di Wang [ 05/Nov/14 ] |
| Comment by Di Wang [ 11/Nov/14 ] |
|
Patch landed on master.