Details

    Description

      In the latest version of the Lustre file system, the ptlrpc module has an out-of-bounds access bug caused by missing validation of specific fields in packets sent by a client.

      The kernel panic:

      [  926.531595] BUG: unable to handle kernel paging request at 000000001ebe8010
      [  926.533844] IP: [<ffffffffc0826783>] lu_context_key_get+0x13/0x30 [obdclass]
      [  926.536063] PGD 8000000424360067 PUD 42865d067 PMD 0 
      [  926.538060] Oops: 0000 [#1] SMP 
      [  926.539857] Modules linked in: ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) loop lustre(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) crc_t10dif crct10dif_generic ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod nfit libnvdimm iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ppdev glue_helper ablk_helper cryptd virtio_balloon joydev parport_pc parport i2c_piix4 pcspkr ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_net virtio_console virtio_blk cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common drm ata_piix libata crc32c_intel serio_raw virtio_pci virtio_ring virtio drm_panel_orientation_quirks floppy
      [  926.558093] CPU: 2 PID: 3308 Comm: ll_ost_io00_002 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.10.1.el7_lustre.x86_64 #1
      [  926.562313] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
      [  926.564575] task: ffff8911ac64b0c0 ti: ffff8911847ec000 task.ti: ffff8911847ec000
      [  926.566820] RIP: 0010:[<ffffffffc0826783>]  [<ffffffffc0826783>] lu_context_key_get+0x13/0x30 [obdclass]
      [  926.569301] RSP: 0018:ffff8911847ef9e8  EFLAGS: 00010246
      [  926.571339] RAX: 0000000000000016 RBX: 0000000000039594 RCX: 000000000000021d
      [  926.573536] RDX: 000000000000021d RSI: ffffffffc0f9f180 RDI: 000000001ebe8000
      [  926.575719] RBP: ffff8911847efa38 R08: ffff891184040000 R09: 0000000000000001
      [  926.577890] R10: 0000000000000001 R11: ffff89118cbdc1a0 R12: 0000000000000000
      [  926.580035] R13: ffff891189a48a00 R14: 0000000000000000 R15: ffff891184040000
      [  926.582180] FS:  0000000000000000(0000) GS:ffff8911bfd00000(0000) knlGS:0000000000000000
      [  926.584424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  926.586446] CR2: 000000001ebe8010 CR3: 00000004287fe000 CR4: 00000000003606e0
      [  926.588588] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  926.590725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  926.592836] Call Trace:
      [  926.594522]  [<ffffffffc0f71cc3>] ? osd_bufs_get+0x203/0x800 [osd_ldiskfs]
      [  926.596608]  [<ffffffffc1376af2>] ? ofd_preprw+0x422/0x1160 [ofd]
      [  926.598618]  [<ffffffffc0696394>] ? cfs_trace_unlock_tcd+0x34/0x90 [libcfs]
      [  926.600681]  [<ffffffffa2966e92>] ? mutex_lock+0x12/0x2f
      [  926.602572]  [<ffffffffc069cfa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [  926.604578]  [<ffffffffa22cbadb>] ? __wake_up_common+0x5b/0x90
      [  926.606557]  [<ffffffffc0a73384>] ? ptlrpc_main+0xbb4/0x20f0 [ptlrpc]
      [  926.608575]  [<ffffffffc0a727d0>] ? ptlrpc_register_service+0xfa0/0xfa0 [ptlrpc]
      [  926.610621]  [<ffffffffa22c1ba0>] ? insert_kthread_work+0x40/0x40
      [  926.612531] Code: 00 04 00 e8 f0 67 e7 ff 48 c7 c7 00 aa 88 c0 e8 c4 00 e7 ff 0f 1f 40 00 0f 1f 44 00 00 48 63 46 20 48 3b 34 c5 a0 30 8b c0 75 09 <48> 8b 57 10 48 8b 04 c2 c3 55 48 89 e5 e8 aa f9 02 00 90 66 2e 
      [  926.618057] RIP  [<ffffffffc0826783>] lu_context_key_get+0x13/0x30 [obdclass]
      [  926.620212]  RSP <ffff8911847ef9e8>
      [  926.621918] CR2: 000000001ebe8010
      

      In osd_bufs_get() in the osd_ldiskfs module, the value of len, which is derived from the niobuf (Nio buffer) section of the packet sent by the client, is never checked. This causes an out-of-bounds access in osd_map_remote_to_local().

      static int osd_bufs_get(const struct lu_env *env, struct dt_object *dt, loff_t pos, ssize_t len,
                             struct niobuf_local *lnb, enum dt_bufs_type rw)
      {
              :
              osd_map_remote_to_local(pos, len, &npages, lnb); 
              :
      }
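
      The missing check described above can be illustrated with a minimal user-space sketch. It is not the merged patch; it only assumes, from the patch subject "add lnb size down to osd", that the fix passes the capacity of the lnb array down to the mapping routine. All names prefixed with sketch_ or SKETCH_, the simplified struct, the maxlnb parameter, and the 4096-byte page size are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096LL /* assumed page size, for illustration only */

/* Hypothetical, simplified stand-in for struct niobuf_local. */
struct sketch_niobuf_local {
	long long lnb_file_offset; /* offset of this chunk in the object */
	int lnb_page_offset;       /* offset of the chunk within its page */
	int lnb_len;               /* number of bytes in this chunk */
};

/*
 * Split [pos, pos + len) into per-page chunks, writing at most maxlnb
 * entries into lnb. Returns the number of entries used, or a negative
 * errno when the request is malformed or would overrun the array.
 */
static int sketch_map_remote_to_local(long long pos, long long len,
				      struct sketch_niobuf_local *lnb,
				      int maxlnb)
{
	int npages = 0;

	assert(lnb != NULL && maxlnb >= 0);

	/* Reject values a hostile or buggy client could put on the wire. */
	if (pos < 0 || len <= 0)
		return -EINVAL;
	if (len > LLONG_MAX - pos)
		return -EINVAL; /* pos + len would overflow */

	while (len > 0) {
		int poff = (int)(pos % SKETCH_PAGE_SIZE);
		long long plen = SKETCH_PAGE_SIZE - poff;

		if (plen > len)
			plen = len;
		/* The bound check: never write past the caller's array. */
		if (npages >= maxlnb)
			return -EOVERFLOW;

		lnb[npages].lnb_file_offset = pos;
		lnb[npages].lnb_page_offset = poff;
		lnb[npages].lnb_len = (int)plen;
		npages++;
		pos += plen;
		len -= plen;
	}
	return npages;
}
```

      In this sketch the caller states how many entries lnb can hold, so an oversized len taken from the packet makes the function return an error instead of writing past the end of the array.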
      

          Activity

            [LU-12612] Lustre osd_bufs_get() bug

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36273/
            Subject: LU-12612 osd: add lnb size down to osd
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: a0680feff23c063cd666a3c912b8f855e64efc7e

            adilger Andreas Dilger added a comment -

            Sorry, I attributed the failure to the wrong ticket. I thought that test_103 was introduced by patch https://review.whamcloud.com/35801 "LU-12612 osd: add lnb size down to osd" because I did a "git log lustre/tests/sanity.sh" to find the last commit on that file, but I was looking at the wrong file.

            The actual problem was caused by patch https://review.whamcloud.com/33660 "LU-11670 osc: glimpse - search for active lock", so I'll reopen that ticket instead.

            bzzz Alex Zhuravlev added a comment -

            Andreas, this failure https://testing.whamcloud.com/sub_tests/5465b1be-dc86-11e9-add9-52540065bddc happened before LU-12612 landed.

            bzzz Alex Zhuravlev added a comment - edited

            Yes, looking at that. Interestingly, only ZFS is affected; 7 of 62 runs hit this.

            adilger Andreas Dilger added a comment -

            The new sanityn test_103 has been causing intermittent test failures on master since this patch landed:

            https://testing.whamcloud.com/sub_tests/6e7ab5ca-dbf2-11e9-b62b-52540065bddc
            https://testing.whamcloud.com/sub_tests/5465b1be-dc86-11e9-add9-52540065bddc
            https://testing.whamcloud.com/sub_tests/7fb92682-de13-11e9-be86-52540065bddc
            https://testing.whamcloud.com/sub_tests/988e7d56-de18-11e9-a197-52540065bddc
            https://testing.whamcloud.com/sub_tests/b8faf26a-de1b-11e9-add9-52540065bddc
            https://testing.whamcloud.com/sub_tests/ea74ad9e-de49-11e9-be86-52540065bddc
            https://testing.whamcloud.com/sub_tests/b50fb5a0-de5c-11e9-be86-52540065bddc
            https://testing.whamcloud.com/sub_tests/df3dabbc-def6-11e9-be86-52540065bddc
            https://testing.whamcloud.com/sub_tests/70b6fa24-def9-11e9-9874-52540065bddc
            https://testing.whamcloud.com/sub_tests/b26d3222-df11-11e9-b62b-52540065bddc

            gerrit Gerrit Updater added a comment -

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36273
            Subject: LU-12612 osd: add lnb size down to osd
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 8131ddef623bff6f20ea39735337c0bbce670b94

            pjones Peter Jones added a comment -

            Landed for 2.13

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35801/
            Subject: LU-12612 osd: add lnb size down to osd
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8033f80de3d0db87f7e965078ceee62033adb58d

            gerrit Gerrit Updater added a comment -

            Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35801
            Subject: LU-12612 osd: add lnb size down to osd
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0d58fc5deb25dcba43d3711f098ad961def64dc4

            adilger Andreas Dilger added a comment -

            Please add "Reported-by: Alibaba Cloud <yunye.ry@alibaba-inc.com>" to the patch commit message.

            People

              bzzz Alex Zhuravlev
              yunye.ry Alibaba Cloud (Inactive)