Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.6.0
-
None
-
Master on SLES11 SP3
-
3
-
12967
Description
Filing a ticket as instructed. Log file for a client is filled with stack traces from the following error. All stack traces are the same.
LustreError: 11692:0:(rw.c:128:ll_cl_init()) husk1: [0x280000f70:0x11c59:0x0] no active IO, please file a ticket.
Pid: 11692, comm: ksh_so_hack.bin
Trace:
[<ffffffff81005eb9>] try_stack_unwind+0x169/0x1b0
[<ffffffff81004919>] dump_trace+0x89/0x450
[<ffffffffa02158d7>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
[<ffffffffa07f33ae>] ll_cl_init+0x21e/0x320 [lustre]
[<ffffffffa07f34f8>] ll_readpage+0x48/0x1b0 [lustre]
[<ffffffff81106418>] __do_page_cache_readahead+0x1e8/0x260
[<ffffffff81106538>] force_page_cache_readahead+0x78/0xa0
[<ffffffff810ff30d>] sys_fadvise64_64+0xdd/0x230
[<ffffffff810ff46e>] sys_fadvise64+0xe/0x10
[<ffffffff8145376b>] system_call_fastpath+0x16/0x1b
[<00002aaaaaac11bd>] 0x2aaaaaac11bd
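The trace above reaches ll_readpage through sys_fadvise64 and force_page_cache_readahead, which is the path the kernel takes for a POSIX_FADV_WILLNEED request. The following is a minimal sketch of a userspace call that exercises that path, assuming the file lives on the affected Lustre mount. It is not the reporter's actual workload (ksh_so_hack.bin), and the function name request_readahead and the demo file are hypothetical.

```python
import os
import tempfile


def request_readahead(path):
    """Ask the kernel to read the whole file into the page cache.

    POSIX_FADV_WILLNEED is the advice that leads to forced readahead
    in the kernel. Returns the file size in bytes.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # offset 0, length 0 means "from the start to the end of the file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        return size
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical demo file; point this at a file on the Lustre client
    # mount to exercise the ll_readpage path from the trace.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 8192)
        demo_path = tmp.name
    print(request_readahead(demo_path))
    os.unlink(demo_path)
```

On an unaffected filesystem the call simply returns. Whether it reproduces the "no active IO" message on a given client is untested here.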
Also see these messages an hour prior to those above (in case there's a relationship):
LustreError: 4943:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x4:0x0] at 4753935872275117037: rc -5
LustreError: 5003:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x7:0x0] at 4753935872275117037: rc -5
LustreError: 5984:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x17fe9:0x0] at 6497832999440693922: rc -5
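For reference, the trailing "rc -5" in these messages is a negated errno value, and 5 is EIO (I/O error) on Linux. The following is a small sketch for decoding the return code from such a line. The helper name decode_rc and the regular expression are illustrative assumptions, not part of any Lustre tooling.

```python
import errno
import os
import re

# Matches a trailing "rc <number>" as it appears in the messages above.
RC_PATTERN = re.compile(r"rc (-?\d+)\s*$")


def decode_rc(line):
    """Return (rc, errno name, description) for a log line, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group(1))
    code = abs(rc)  # kernel code reports errors as negative errno values
    return rc, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


sample = (
    "LustreError: 4943:0:(mdc_request.c:1580:mdc_read_page()) "
    "husk1-MDT0000-mdc-ffff88044bc53800: read cache page: "
    "[0x280000f14:0x4:0x0] at 4753935872275117037: rc -5"
)
print(decode_rc(sample))
```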
Attaching log file. Dump is available if you want it.
We've hit this error again, repeatedly, running sanity test 54c, this time against LU-3321 built into our 2.6 branch. Patch http://review.whamcloud.com/9658 was not yet in our 2.6 build. We had not seen this error until recently; are there changes that are bringing this to light more?
In test 54c, the failure occurs when attempting to mount the loop device the test creates, with the following in dmesg:
Buffer I/O error on device loop3, logical block 0
lost page write due to I/O error on loop3
and in the log file:
mount: wrong fs type, bad option, bad superblock on /tmp/dal/loop54c,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so