LU-12600: Lustre tgt_brw_write() bug

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.13.0, Lustre 2.12.3
    • Lustre 2.13.0
    • None
    • 3

    Description

      In the latest version of the Lustre file system, the ptlrpc module has a buffer overflow bug caused by missing validation of specific fields in packets sent by a client. We can overwrite up to 0xffffffff bytes past the buffer, which may allow remote code execution (RCE).

      The kernel panic:

       

      [277607.350937] BUG: unable to handle kernel paging request at ffff8a0fbf200000
      [277607.389337] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
      [277607.392123] task: ffff8a0fd5fd1040 ti: ffff8a0f2f630000 task.ti: ffff8a0f2f630000
      [277607.394819] RIP: 0010:[<ffffffffb6586dc6>]  [<ffffffffb6586dc6>] memcpy+0x6/0x110
      [277607.397521] RSP: 0018:ffff8a0f2f633b48  EFLAGS: 00010213
      [277607.399983] RAX: ffff8a0fbe293000 RBX: ffff8a0fe52c0000 RCX: ffffffffff092fff
      [277607.402641] RDX: ffffffffffffffff RSI: ffff8a0fe66738a8 RDI: ffff8a0fbf200000
      [277607.405232] RBP: ffff8a0f2f633cb8 R08: 0000000000000000 R09: 00000000000001e8
      [277607.407776] R10: 0000000000000000 R11: 0000000000000008 R12: 0000000000000000
      [277607.410304] R13: 0000000000000012 R14: ffff8a0fedf4fde8 R15: ffff8a0fe52c0000
      [277607.412787] FS:  0000000000000000(0000) GS:ffff8a0fffc00000(0000) knlGS:0000000000000000
      [277607.415350] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [277607.417665] CR2: ffff8a0fbf200000 CR3: 0000000429b88000 CR4: 00000000003606f0
      [277607.420114] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [277607.422527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [277607.424915] Call Trace:
      [277607.426900]  [<ffffffffc089531e>] ? tgt_brw_write+0xe8e/0x1cf0 [ptlrpc]
      [277607.429200]  [<ffffffffc0462395>] ? cfs_trace_unlock_tcd+0x35/0x90 [libcfs]
      [277607.431507]  [<ffffffffc0468af8>] ? libcfs_debug_vmsg2+0x6d8/0xb30 [libcfs]
      [277607.433782]  [<ffffffffb6966e92>] ? mutex_lock+0x12/0x2f
      [277607.435953]  [<ffffffffc08982ca>] tgt_request_handle+0x91a/0x15c0 [ptlrpc]
      [277607.438177]  [<ffffffffc0468fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [277607.440404]  [<ffffffffc083b88e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
      [277607.442661]  [<ffffffffb62cbadb>] ? __wake_up_common+0x5b/0x90
      [277607.444751]  [<ffffffffc083f384>] ptlrpc_main+0xbb4/0x20f0 [ptlrpc]
      [277607.446846]  [<ffffffffb62d08c0>] ? finish_task_switch+0x50/0x1c0
      [277607.448897]  [<ffffffffc083e7d0>] ? ptlrpc_register_service+0xfa0/0xfa0 [ptlrpc]
      [277607.451028]  [<ffffffffb62c1c71>] kthread+0xd1/0xe0
      [277607.452918]  [<ffffffffb62c1ba0>] ? insert_kthread_work+0x40/0x40
      [277607.454884]  [<ffffffffb6975c1d>] ret_from_fork_nospec_begin+0x7/0x21
      [277607.456855]  [<ffffffffb62c1ba0>] ? insert_kthread_work+0x40/0x40
      [277607.458760] Code: ca b6 31 c0 e8 4c 2d d1 ff 0f ae e8 0f 31 48 c1 e2 20 89 c0 48 09 c2 48 31 d3 e9 7b ff ff ff 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 
      [277607.464121] RIP  [<ffffffffb6586dc6>] memcpy+0x6/0x110
      [277607.465949]  RSP <ffff8a0f2f633b48>
      [277607.467566] CR2: ffff8a0fbf200000
      

       

       

      In tgt_brw_write(), the value returned by req_capsule_get_size() is not checked before it is passed to tgt_shortio2pages(). In tgt_shortio2pages(), the memcpy() length is chosen with a '?:' expression, 'len < size ? len : size'. Because len is a signed int, a negative len passes the 'len < size' check and is used as the copy length. The third parameter of memcpy() is unsigned, so -1 is converted to a huge value (0xffffffff as a 32-bit unsigned int; RDX in the panic above holds 0xffffffffffffffff), causing a buffer overflow.
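
The conversion described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not Lustre code: pick_len() and as_memcpy_count() are hypothetical helper names that only mirror the 'len < size ? len : size' expression and the implicit conversion of its result to memcpy()'s size_t parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The sketch assumes size_t is at least as wide as unsigned int. */
static_assert(sizeof(size_t) >= sizeof(unsigned int),
              "size_t narrower than unsigned int");

/* Mirrors the length expression from the report: both operands are int,
 * so a negative len is "smaller" than any positive size and is selected. */
static int pick_len(int len, int size)
{
    return len < size ? len : size;
}

/* What memcpy() effectively receives: the int is converted to size_t,
 * which wraps a negative value around to a very large count. */
static size_t as_memcpy_count(int n)
{
    return (size_t)n;
}
```

With these helpers, pick_len(-1, 4096) yields -1, and as_memcpy_count(-1) equals SIZE_MAX, which on a 64-bit target is the 0xffffffffffffffff value seen in RDX.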

       

      if (body->oa.o_flags & OBD_FL_SHORT_IO) {
          int short_io_size;
          unsigned char *short_io_buf;

          short_io_size = req_capsule_get_size(&req->rq_pill, &RMF_SHORT_IO,
                                               RCL_CLIENT);
          short_io_buf = req_capsule_client_get(&req->rq_pill, &RMF_SHORT_IO);
          CDEBUG(D_INFO, "Client use short io for data transfer,"
                 " size = %d\n", short_io_size);

          /* Copy short io buf to pages */
          rc = tgt_shortio2pages(local_nb, npages, short_io_buf, short_io_size);
          desc = NULL;
      }
      
      for (i = 0; i < npages; i++) {
          off = local[i].lnb_page_offset & ~PAGE_MASK;
          len = local[i].lnb_len;
          if (len == 0)
              continue;
          CDEBUG(D_PAGE, "index %d offset = %d len = %d left = %d\n",
                 i, off, len, size);
          ptr = ll_kmap_atomic(local[i].lnb_page, KM_USER0);
          if (ptr == NULL)
              return -EINVAL;
          memcpy(ptr + off, buf, len < size ? len : size);
          ll_kunmap_atomic(ptr, KM_USER0);
          buf += len;
          size -= len;
      }
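
For comparison, here is one way the copy loop could validate its inputs before calling memcpy(). This is a user-space sketch under stated assumptions, not the patch that resolved this issue: struct demo_page, DEMO_PAGE_SIZE and demo_shortio_copy() are hypothetical names standing in for the page array, the page size and tgt_shortio2pages(). Unlike the loop above, it advances by the number of bytes actually copied, so the remaining size cannot become negative.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096   /* assumed page size for this sketch */

/* Hypothetical stand-in for one entry of the local page array. */
struct demo_page {
    unsigned char data[DEMO_PAGE_SIZE];
    int off;   /* offset within the page */
    int len;   /* number of bytes the entry claims to hold */
};

/*
 * Copy at most `size` bytes from `buf` into the pages.
 * Every signed quantity is validated before it reaches memcpy().
 * Returns 0 on success or -EINVAL on a malformed request.
 */
static int demo_shortio_copy(struct demo_page *pages, int npages,
                             const unsigned char *buf, int size)
{
    int i;

    if (pages == NULL || buf == NULL || npages < 0 || size < 0)
        return -EINVAL;

    for (i = 0; i < npages && size > 0; i++) {
        int off = pages[i].off;
        int len = pages[i].len;
        int n;

        if (len == 0)
            continue;
        /* Reject negative values and ranges that would leave the page. */
        if (off < 0 || len < 0 || off > DEMO_PAGE_SIZE ||
            len > DEMO_PAGE_SIZE - off)
            return -EINVAL;

        n = len < size ? len : size;
        assert(n > 0);   /* holds because len > 0 and size > 0 here */
        memcpy(pages[i].data + off, buf, (size_t)n);
        buf += n;
        size -= n;
    }
    return 0;
}
```

The key design choice is to fail the whole request on the first inconsistent field instead of clamping it, so a malformed packet never results in a partial write.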
      

      The backtrace:

       

       ptlrpc_main -> ptlrpc_server_handle_request -> tgt_request_handle -> tgt_brw_write -> tgt_shortio2pages
      

       

            People

              pfarrell Patrick Farrell (Inactive)
              yunye.ry Alibaba Cloud (Inactive)