[LU-3392] filter_do_bio()) ASSERTION(rw == OBD_BRW_READ) Created: 24/May/13 Updated: 03/Feb/17 Resolved: 28/Sep/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Wojciech Turek (Inactive) | Assignee: | WC Triage |
| Resolution: | Incomplete | Votes: | 1 |
| Labels: | None |
| Environment: | servers: RHEL6 2.6.32-220.17.1.el6_lustre.x86_64, Lustre 2.1.2 |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 8396 |
| Description |
|
We have experienced an LBUG today on one of our OSS servers which I have not seen before:

LustreError: 16066:0:(filter_io_26.c:344:filter_do_bio()) ASSERTION(rw == OBD_BRW_READ) failed

After rebooting that OSS, the same LBUG is triggered as soon as the OSTs finish recovery and start servicing their data. Has anyone seen this before? Our environment: |
| Comments |
| Comment by Wojciech Turek (Inactive) [ 29/May/13 ] |
|
The problem seems to be caused by one particular OST. I identified that OST by unmounting the OSTs, and I ran fsck on it, which found some errors. After fixing them I was able to mount the OST again and the LBUG did not recur. I cannot explain how the corruption of the inode size crept in in the first place, which is concerning (a sketch of the fsck/remount sequence follows below). fsck output:

fsck from util-linux-ng 2.17.2
lustre1-OST0003: ***** FILE SYSTEM WAS MODIFIED *****
6008797 inodes used (26.25%, out of 22888320)
6008751 regular files |
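As an illustration only, here is a minimal shell sketch of the unmount/fsck/remount sequence described above. The device path /dev/sdb1 and the mount point /mnt/ost0003 are hypothetical placeholders, not values taken from this ticket.

```
# Hypothetical OST device and mount point -- adjust to the real layout.
umount /mnt/ost0003                      # take the affected OST offline
e2fsck -fn /dev/sdb1                     # dry run: report ldiskfs errors without changing anything
e2fsck -fp /dev/sdb1                     # repair the reported errors (use -fy for a full fix pass)
mount -t lustre /dev/sdb1 /mnt/ost0003   # bring the OST back into the filesystem
```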
| Comment by Wojciech Turek (Inactive) [ 03/Jun/13 ] |
|
This LBUG has hit us again, and I found that fsck alone does not actually fix it, but aborting recovery does. So after being hit by this LBUG one needs to restart the OSS, then fsck the OST to fix the i_size error, and then mount the OST with the abort-recovery option (sketched below). If we did not abort recovery, the LBUG hit again and the i_size was corrupted again. We also found that after recovering the filesystem one client would not come back; this is most likely the client that created the problem in the first place. It sounds like a serious bug, as it seems a client operation can bring down the server. Our next step is to update the server side to 2.1.5 and see if we still see this problem. |
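A minimal sketch of that workaround, assuming the same hypothetical device and mount point as above; abort_recov is the Lustre mount option used to abort recovery when starting a target.

```
# Hypothetical device/mount point; abort_recov makes the target skip client recovery.
umount /mnt/ost0003                                      # stop the OST after the OSS restart
e2fsck -fp /dev/sdb1                                     # repair the corrupted i_size on ldiskfs
mount -t lustre -o abort_recov /dev/sdb1 /mnt/ost0003    # remount, aborting recovery
```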
| Comment by John Fuchs-Chesney (Inactive) [ 28/Sep/15 ] |
|
Marking this as resolved/incomplete. If this is still a live issue on a newer release, just let us know and we'll move the ticket to the correct Project. Thanks, |
| Comment by Gerrit Updater [ 24/Nov/16 ] |
|
Wang Shilong (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/23931 |
| Comment by Gerrit Updater [ 24/Nov/16 ] |
|
Wang Shilong (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/23938 |