[LU-3035] Failure on racer: ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304 Created: 26/Mar/13 Updated: 09/Apr/13 Resolved: 30/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Sarah Liu | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB | ||
| Environment: |
client and server: lustre-master build# 1340; configured as one MDS with two MDTs |
||
| Severity: | 3 |
| Rank (Obsolete): | 7405 |
| Description |
|
Hit LBUG when running racer under DNE with one MDS two MDTs

client console:

Lustre: DEBUG MARKER: -----============= acceptance-small: racer ============----- Tue Mar 26 11:44:28 PDT 2013
Lustre: DEBUG MARKER: excepting tests:
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: Increasing default stripe size to min 1048576
Lustre: Layout lock feature supported.
Lustre: Mounted lustre-client
LNet: 30388:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
LNet: 30388:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message
Lustre: DEBUG MARKER: Using TIMEOUT=20
LNet: 31765:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
LNet: 31765:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message
Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-5,client-15 DURATION=900 == 11:44:37 (1364323477)
LustreError: 495:0:(file.c:930:ll_file_io_generic()) ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304
LustreError: 495:0:(file.c:930:ll_file_io_generic()) LBUG
Pid: 495, comm: cat

Call Trace:
 [<ffffffffa0366895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0366e97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0a26882>] ll_file_io_generic+0x542/0x600 [lustre]
 [<ffffffffa0a27baf>] ll_file_aio_read+0x13f/0x2c0 [lustre]
 [<ffffffffa0a27e9c>] ll_file_read+0x16c/0x2a0 [lustre]
 [<ffffffff81176cb5>] vfs_read+0xb5/0x1a0
 [<ffffffff8100bd6e>] ? reschedule_interrupt+0xe/0x20
 [<ffffffff81176df1>] sys_read+0x51/0x90
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG
Pid: 495, comm: cat Not tainted 2.6.32-279.19.1.el6.x86_64 #1
Call Trace:
 [<ffffffff814e9541>] ? panic+0xa0/0x168
 [<ffffffffa0366eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0a26882>] ? ll_file_io_generic+0x542/0x600 [lustre]
 [<ffffffffa0a27baf>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
 [<ffffffffa0a27e9c>] ? ll_file_read+0x16c/0x2a0 [lustre]
 [<ffffffff81176cb5>] ? vfs_read+0xb5/0x1a0
 [<ffffffff8100bd6e>] ? reschedule_interrupt+0xe/0x20
 [<ffffffff81176df1>] ? sys_read+0x51/0x90
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

Message from syslogd@client-5 at Mar 26 11:44:39 ...
 kernel:LustreError: 495:0:(file.c:930:ll_file_io_generic()) ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304

Message from syslogd@client-5 at Mar 26 11:44:39 ...
 kernel:LustreError: 495:0:(file.c:930:ll_file_io_generic()) LBUG

Message from sys
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu |
| Comments |
| Comment by Peter Jones [ 27/Mar/13 ] |
|
Jinshan, could you please comment on this one? Peter |
| Comment by Jinshan Xiong (Inactive) [ 27/Mar/13 ] |
|
This should be related to commit ae76dd2f1866c9350df8cb4e772c12cc0d3c4314, so I am reassigning it to Niu. |
| Comment by Niu Yawei (Inactive) [ 28/Mar/13 ] |
|
This assertion isn't valid, since crw_count can be changed in lov_io_rw_iter_init(). I'm going to change it to LASSERTF(io->ci_nob == 0, "%zd", io->ci_nob).
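
To illustrate why the original check can fire, here is a minimal userspace sketch, not the actual Lustre code. It assumes that the per-iteration init clamps the byte count so that one iteration does not cross a stripe boundary. The struct, the function names, and the starting offset 263168 are hypothetical; the offset was picked only so that the arithmetic reproduces the 785408 and 4194304 values from the assertion message with the 1048576 stripe size shown in the console log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the read/write state kept in struct cl_io. */
struct sketch_io {
    size_t pos;   /* current file offset (stand-in for crw_pos) */
    size_t count; /* bytes the current iteration may transfer (crw_count) */
    size_t nob;   /* bytes transferred so far (stand-in for ci_nob) */
};

/*
 * Assumed behaviour of the per-iteration init: shrink count so that
 * [pos, pos + count) ends no later than the end of the current stripe.
 */
static void sketch_iter_init(struct sketch_io *io, size_t stripe_size)
{
    size_t stripe_end = (io->pos / stripe_size + 1) * stripe_size;

    if (io->pos + io->count > stripe_end)
        io->count = stripe_end - io->pos;
}

/*
 * Set up an I/O, run the per-iteration init once, and return the
 * (possibly clamped) count. Nothing has been transferred at this point,
 * so a check on nob holds, while "count == requested" may not.
 */
static size_t sketch_first_chunk(size_t pos, size_t requested,
                                 size_t stripe_size)
{
    struct sketch_io io = { .pos = pos, .count = requested, .nob = 0 };

    sketch_iter_init(&io, stripe_size);

    /* Shape of the proposed check: no bytes moved yet. */
    assert(io.nob == 0);

    return io.count;
}
```

With these assumptions, sketch_first_chunk(263168, 4194304, 1048576) returns 785408, so comparing the count against the originally requested 4194304 fails even though no bytes have been transferred, while the check on the transferred byte counter still holds.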
| Comment by Niu Yawei (Inactive) [ 28/Mar/13 ] |
| Comment by Peter Jones [ 30/Mar/13 ] |
|
Landed for 2.4 |
| Comment by Minh Diep [ 08/Apr/13 ] |
|
Also hit this in fc18 client testing using tag 2.3.63:

client-1 login: [ 911.413689] LustreError: 152-6: Ignoring deprecated mount option 'acl'. |
| Comment by Niu Yawei (Inactive) [ 09/Apr/13 ] |
|
Minh, 2.3.63 doesn't have the above fix.