[LU-6624] LBUG in osc_lru_reclaim Created: 21/May/15 Updated: 05/Jun/15 Resolved: 05/Jun/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Hiroya Nozaki | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Judging from the existing code, cl_client_cache->ccc_lru must only be accessed while holding its spin lock (ccc_lru_lock), but the code below appears to violate that rule.

osc_lru_reclaim

    long osc_lru_reclaim(struct client_obd *cli)
    {
            struct cl_env_nest nest;
            struct lu_env *env;
            struct cl_client_cache *cache = cli->cl_cache;
            long rc = 0;
            int max_scans;
            ENTRY;

            LASSERT(cache != NULL);
            LASSERT(!list_empty(&cache->ccc_lru)); <--- HERE
            .....
            spin_lock(&cache->ccc_lru_lock); <---- Shouldn't the LASSERT be here?
            cache->ccc_lru_shrinkers++;
            ....

In fact, I sometimes see an LBUG in osc_lru_reclaim when running multiple WRITEs at the same time. I'm therefore convinced that this LASSERT should be moved into the locked section; otherwise the LASSERT can read ccc_lru while another thread is in the middle of a linked-list operation on it. |
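The pattern proposed above can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the Lustre code or the actual patch: `lru_cache`, `lru_node`, `lru_reclaim_begin()` and the `atomic_flag`-based spin lock are hypothetical stand-ins for `cl_client_cache`, the list head `ccc_lru`, the start of `osc_lru_reclaim()` and `ccc_lru_lock`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the Lustre definitions. */
struct lru_node {
	struct lru_node *next, *prev;
};

struct lru_cache {
	atomic_flag     lock;      /* plays the role of ccc_lru_lock */
	struct lru_node lru;       /* list head, plays the role of ccc_lru */
	long            shrinkers; /* plays the role of ccc_lru_shrinkers */
};

static void spin_acquire(atomic_flag *l)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void spin_release(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void lru_cache_init(struct lru_cache *c)
{
	atomic_flag_clear(&c->lock);
	c->lru.next = c->lru.prev = &c->lru; /* empty circular list */
	c->shrinkers = 0;
}

static bool lru_empty(const struct lru_node *head)
{
	return head->next == head;
}

static void lru_add_tail(struct lru_node *n, struct lru_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* The emptiness check runs only while the protecting lock is held. */
static long lru_reclaim_begin(struct lru_cache *c)
{
	long n;

	assert(c != NULL);
	spin_acquire(&c->lock);
	assert(!lru_empty(&c->lru)); /* moved inside the locked section */
	n = ++c->shrinkers;
	spin_release(&c->lock);
	return n;
}
```

With the check inside the critical section, it can no longer observe the list while another lock holder is halfway through modifying it.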
| Comments |
| Comment by Hiroya Nozaki [ 21/May/15 ] |
|
I'll upload a trivial patch soon. |
| Comment by Gerrit Updater [ 21/May/15 ] |
|
Hiroya Nozaki (nozaki.hiroya@jp.fujitsu.com) uploaded a new patch: http://review.whamcloud.com/14901 |
| Comment by Jinshan Xiong (Inactive) [ 21/May/15 ] |
|
cache->ccc_lru is the LRU list of all OSCs. If osc_lru_reclaim() is being called, at least one OSC exists, so this list shouldn't be empty. Can you post the backtrace to this ticket the next time you see it? |
| Comment by Hiroya Nozaki [ 22/May/15 ] |
|
OK, I'll post the backtrace when this case is reproduced. |
| Comment by Jinshan Xiong (Inactive) [ 22/May/15 ] |
|
Good point, if there is only one OSC, it could be empty temporarily. |
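One way the list could look empty for a moment is sketched below. It assumes the concurrent list operation is a move-to-tail implemented as "unlink, then append", which is how the Linux kernel's list_move_tail() helper works; whether that is the exact operation racing here is an assumption, and all names in the sketch (`node`, `move_tail_sees_empty()`, and so on) are hypothetical. The helper records what an unlocked reader would see between the two steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list; hypothetical names for illustration. */
struct node {
	struct node *next, *prev;
};

static void list_init(struct node *h)
{
	h->next = h->prev = h;
}

static bool list_is_empty(const struct node *h)
{
	return h->next == h;
}

static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void list_append(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/*
 * Move n to the tail of head in two steps and report whether the list
 * looked empty in between, i.e. what an unlocked list-empty check could
 * observe if it ran at that instant.
 */
static bool move_tail_sees_empty(struct node *n, struct node *head)
{
	bool empty_midway;

	list_unlink(n);
	empty_midway = list_is_empty(head);
	list_append(n, head);
	return empty_midway;
}
```

With a single entry on the list the intermediate state is empty, while with two or more entries it is not, which matches the observation that the problem needs only one OSC.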
| Comment by Gerrit Updater [ 05/Jun/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14901/ |
| Comment by Peter Jones [ 05/Jun/15 ] |
|
Landed for 2.8 |