[LU-3035] Failure on racer: ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.4.0
    • Affects Version/s: Lustre 2.4.0
    • Environment: client and server: lustre-master build #1340; configured as one MDS with two MDTs
    • Severity: 3
    • Rank (Obsolete): 7405

    Description

      Hit an LBUG when running racer under DNE with one MDS and two MDTs.

      client console:

      Lustre: DEBUG MARKER: -----============= acceptance-small: racer ============----- Tue Mar 26 11:44:28 PDT 2013
      Lustre: DEBUG MARKER: excepting tests:
      LustreError: 152-6: Ignoring deprecated mount option 'acl'.
      Lustre: Increasing default stripe size to min 1048576
      Lustre: Layout lock feature supported.
      Lustre: Mounted lustre-client
      LNet: 30388:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
      LNet: 30388:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message
      Lustre: DEBUG MARKER: Using TIMEOUT=20
      LNet: 31765:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
      LNet: 31765:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 1 previous similar message
      Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-5,client-15 DURATION=900 == 11:44:37 (1364323477)
      LustreError: 495:0:(file.c:930:ll_file_io_generic()) ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304
      LustreError: 495:0:(file.c:930:ll_file_io_generic()) LBUG
      Pid: 495, comm: cat
      
      Call Trace:
       [<ffffffffa0366895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0366e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0a26882>] ll_file_io_generic+0x542/0x600 [lustre]
       [<ffffffffa0a27baf>] ll_file_aio_read+0x13f/0x2c0 [lustre]
       [<ffffffffa0a27e9c>] ll_file_read+0x16c/0x2a0 [lustre]
       [<ffffffff81176cb5>] vfs_read+0xb5/0x1a0
       [<ffffffff8100bd6e>] ? reschedule_interrupt+0xe/0x20
       [<ffffffff81176df1>] sys_read+0x51/0x90
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

      Kernel panic - not syncing: LBUG
      Pid: 495, comm: cat Not tainted 2.6.32-279.19.1.el6.x86_64 #1
      Call Trace:
       [<ffffffff814e9541>] ? panic+0xa0/0x168
       [<ffffffffa0366eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
       [<ffffffffa0a26882>] ? ll_file_io_generic+0x542/0x600 [lustre]
       [<ffffffffa0a27baf>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
       [<ffffffffa0a27e9c>] ? ll_file_read+0x16c/0x2a0 [lustre]
       [<ffffffff81176cb5>] ? vfs_read+0xb5/0x1a0
       [<ffffffff8100bd6e>] ? reschedule_interrupt+0xe/0x20
       [<ffffffff81176df1>] ? sys_read+0x51/0x90
       [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

      Message from syslogd@client-5 at Mar 26 11:44:39 ...
       kernel:LustreError: 495:0:(file.c:930:ll_file_io_generic()) ASSERTION( io->u.ci_rw.crw_count == count ) failed: 785408 != 4194304

      Message from syslogd@client-5 at Mar 26 11:44:39 ...
       kernel:LustreError: 495:0:(file.c:930:ll_file_io_generic()) LBUG

      Message from sys
      Initializing cgroup subsys cpuset
      Initializing cgroup subsys cpu
      

        Activity


          niu Niu Yawei (Inactive) added a comment -

          Minh, 2.3.63 doesn't have the above fix.
          mdiep Minh Diep added a comment -

          Also hit this in FC18 client testing using tag 2.3.63:

          client-1 login: [ 911.413689] LustreError: 152-6: Ignoring deprecated mount option 'acl'.
          [ 931.554379] LustreError: 152-6: Ignoring deprecated mount option 'acl'.
          [ 962.209496] LustreError: 9285:0:(file.c:2610:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000402:0x33:0x0] error: rc = -116
          [ 962.546186] LustreError: 9281:0:(file.c:2610:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000402:0x33:0x0] error: rc = -116
          [ 1803.114221] LustreError: 11503:0:(file.c:930:ll_file_io_generic()) ASSERTION( io->u.ci_rw.crw_count == count ) failed: 808960 != 4194304
          [ 1803.127256] LustreError: 11503:0:(file.c:930:ll_file_io_generic()) LBUG
          [ 1803.138194] Kernel panic - not syncing: LBUG
          [ 1803.142714] Pid: 11503, comm: cat Tainted: GF O 3.6.10-4.fc18.x86_64 #1
          [ 1803.150525] Call Trace:
          [ 1803.153126] [<ffffffff816198db>] panic+0xc1/0x1d0
          [ 1803.158204] [<ffffffffa0297e5b>] lbug_with_loc+0xab/0xc0 [libcfs]
          [ 1803.164765] [<ffffffffa07fc2e0>] ll_file_io_generic+0x600/0x670 [lustre]
          [ 1803.171958] [<ffffffffa07fca10>] ll_file_aio_read+0xf0/0x200 [lustre]
          [ 1803.178884] [<ffffffffa07fcc35>] ll_file_read+0x115/0x220 [lustre]
          [ 1803.185508] [<ffffffff81190a99>] vfs_read+0xa9/0x180
          [ 1803.190853] [<ffffffff81190bba>] sys_read+0x4a/0x90
          [ 1803.196099] [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b

          pjones Peter Jones added a comment -

          Landed for 2.4

          niu Niu Yawei (Inactive) added a comment - http://review.whamcloud.com/5864

          niu Niu Yawei (Inactive) added a comment -

          This assertion is incorrect, since crw_count can be changed in lov_io_rw_iter_init(). I'm going to change it to LASSERTF(io->ci_nob == 0, "%zd", io->ci_nob).

          jay Jinshan Xiong (Inactive) added a comment -

          This should be related to:

          commit ae76dd2f1866c9350df8cb4e772c12cc0d3c4314
          Author: Niu Yawei <niu@whamcloud.com>
          Date: Thu Mar 7 23:58:11 2013 -0500

          LU-2910 clio: restore iov when restart io

          so I am reassigning it to Niu.
          pjones Peter Jones added a comment -

          Jinshan

          Could you please comment on this one?

          Peter


          People

            niu Niu Yawei (Inactive)
            sarah Sarah Liu
            Votes: 0
            Watchers: 5
