[LU-7860] LustreError: 19445:0:(ldlm_lock.c:2273:ldlm_lock_cancel()) ASSERTION( !(((( lock))->l_flags & (1ULL << 53)) != 0) ) failed Created: 09/Mar/16 Updated: 14/Jun/18 Resolved: 09/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Ruth Klundt (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
servers: (zfs) v2_8_0_0_RC2--PRISTINE-3.10.0-327.0.0.1chaos.ch6.x86_64 |
||
| Issue Links: |
|
||||||||
| Severity: | 4 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
This is an OSS node error encountered while running a 56 node/260 process IOR. This is not a server-client combo that is likely to see production, but I'd guess any ASSERT triggered would be of interest. The IOR jobs were failing on one of the 56 clients with ENOTDIR on attempting to open the data file. |
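For readers decoding the assertion text in the title: the expanded expression tests whether bit 53 of the lock's 64-bit `l_flags` word is set, and the assertion fails when it is. The following is a minimal sketch of that check in C, not the Lustre source. The names `demo_lock`, `DEMO_FLAG_BIT53`, `demo_flag_is_set`, and `demo_cancel_precondition` are hypothetical; the ticket shows only the expanded bit test, not the symbolic flag name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: the ticket only shows the expanded test (1ULL << 53). */
#define DEMO_FLAG_BIT53 (1ULL << 53)

/* Stand-in for the lock structure; only the flags word matters here. */
struct demo_lock {
	uint64_t l_flags;
};

/* Returns 1 if bit 53 is set in the lock's flags, 0 otherwise. */
static int demo_flag_is_set(const struct demo_lock *lock)
{
	return (lock->l_flags & DEMO_FLAG_BIT53) != 0;
}

/* Mirrors the shape of the failed check: cancel expects bit 53 to be clear. */
static void demo_cancel_precondition(const struct demo_lock *lock)
{
	assert(!demo_flag_is_set(lock));
}
```

Calling `demo_cancel_precondition` on a lock whose flags have bit 53 set aborts the program, which is the user-space analogue of the ASSERTION failure reported on the OSS.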
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 09/Mar/16 ] |
|
Hi Ruth, |
| Comment by Peter Jones [ 09/Mar/16 ] |
|
Ruth, you have flagged this ticket as severity 1, meaning a production filesystem out of service. Looking at the details supplied, it looks like this relates to testing you are doing on the community 2.8 release, so I am wondering if your intention was to select the lowest severity (4)? Peter |
| Comment by Ruth Klundt (Inactive) [ 09/Mar/16 ] |
|
Oops, sorry, I meant lowest severity, so apparently I meant 5, 'not severe at all' |
| Comment by Peter Jones [ 09/Mar/16 ] |
|
ok - no problem |
| Comment by Oleg Drokin [ 10/Mar/16 ] |
|
Liang, apparently this is an assertion that you introduced, and apparently it's incorrect and the previous handling was correct. Ruth, can you please upload a log from the crashed server if you have it, with a backtrace and all? |
| Comment by Joseph Gmitter (Inactive) [ 10/Mar/16 ] |
|
Hi Liang, Can you please have a look at the change? Thanks. |
| Comment by Ruth Klundt (Inactive) [ 14/Mar/16 ] |
|
The console log got very little info and no backtrace, and syslog got nothing. I may have time to try to reproduce this week. <ConMan> Console [cs48] log at 2016-03-04 19:00:00 MST. |
| Comment by Gerrit Updater [ 31/May/16 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/20509 |
| Comment by Gerrit Updater [ 20/Jun/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20509/ |
| Comment by Joseph Gmitter (Inactive) [ 22/Jun/16 ] |
|
Patch has landed to master for 2.9.0 |
| Comment by Ned Bass [ 09/May/17 ] |
|
LLNL hit this on a 2.8 server today. Please consider landing the fix to b2_8_fe. Thanks |
| Comment by Peter Jones [ 09/May/17 ] |
|
Ned, as far as the community releases are concerned, this is fixed in 2.9. If you want a separate support ticket to track fixing on an FE release then we can link to this one. Peter PS: There is already a 2.8 FE port of this fix; it just needs to be integrated. |
| Comment by Ned Bass [ 09/May/17 ] |
|
Thanks Peter. I don't think we need a separate ticket. Let's just make sure that patch gets in the next 2.8 FE tag. |
| Comment by Peter Jones [ 09/May/17 ] |
|
Sure. It was already flagged for inclusion. |