Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-17076

ptlrpc_nrs_req_stop_nolock() use after free with ORR NRS policy

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      sanityn.sh test 77c crashed as below:

      [13898.279218] Lustre: DEBUG MARKER: == sanityn test 77c: check ORR NRS policy ================ 11:42:47 (1693482167)
      [13898.512652] Lustre: DEBUG MARKER: lctl set_param ost.OSS.ost_io.nrs_policies=orr ost.OSS.*.nrs_orr_quantum=1 ost.OSS.*.nrs_orr_offset_type=physical ost.OSS.*.nrs_orr_supported=reads
      [13900.571898] Lustre: DEBUG MARKER: lctl set_param ost.OSS.*.nrs_orr_supported=writes ost.OSS.*.nrs_orr_quantum=64
      [13903.067814] Lustre: DEBUG MARKER: lctl set_param ost.OSS.*.nrs_orr_supported=reads_and_writes ost.OSS.*.nrs_orr_offset_type=logical
      [13905.203004] general protection fault, probably for non-canonical address 0x5a5a5a5a5a5a5a5a: 0000 [#1] SMP PTI
      [13905.205028] CPU: 1 PID: 250894 Comm: ll_ost_io00_027 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-477.15.1.el8_lustre.x86_64 #1
      [13905.207489] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [13905.208620] RIP: 0010:ptlrpc_nrs_req_stop_nolock+0x34/0x140 [ptlrpc]
      [13905.210443] Code: 00 00 a8 04 75 05 e9 0b 08 f0 c1 53 a8 01 74 65 a8 08 0f 85 8d 00 00 00 8b 87 08 01 00 00 48 8b 84 c7 f8 00 00 00 48 8b 58 08 <48> 8b 43 68 48 8b 40 20 48 8b 40 50 48 85 c0 74 0f 48 8d b7 f8 00
      [13905.213924] RSP: 0018:ffff9c0ec3513da8 EFLAGS: 00010246
      [13905.214961] RAX: ffff8ecb454357d0 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000000
      [13905.216321] RDX: 0000000000000001 RSI: ffff8ecb309ca400 RDI: ffff8ecb309ca400
      [13905.217685] RBP: ffff8ecb05c4d8e8 R08: 00000000000000b8 R09: abcc77118461cefd
      [13905.219044] R10: ffff9c0ec3513d88 R11: ffff8ecb34982646 R12: ffff8ecb05c4d800
      [13905.220422] R13: ffff8ecb43bfb200 R14: 00000000000089b6 R15: ffff8ecb309ca400
      [13905.221866] FS:  0000000000000000(0000) GS:ffff8ecbbfd00000(0000) knlGS:0000000000000000
      [13905.223781] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13905.224902] CR2: 00007ffc748ef221 CR3: 0000000076810004 CR4: 00000000001706e0
      [13905.226268] Call Trace:
      [13905.226846]  ptlrpc_server_finish_active_request+0x26/0x120 [ptlrpc]
      [13905.228184]  ptlrpc_server_handle_request+0x4a2/0xbc0 [ptlrpc]
      [13905.229415]  ptlrpc_main+0xc91/0x15a0 [ptlrpc]
      [13905.230399]  ? __schedule+0x2d9/0x870
      [13905.231251]  ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
      [13905.232347]  kthread+0x134/0x150
      [13905.233064]  ? set_kthread_struct+0x50/0x50
      [13905.233920]  ret_from_fork+0x35/0x40

      report link: https://testing.whamcloud.com/test_sets/79713f31-a8a9-4056-8478-4fb529a14077

      It can be related to recent changes in ORR/TRR code in 'LU-8130 nrs: convert NRS ORR/TRR to rhashtable' landed two weeks  ago

      Attachments

        Issue Links

          Activity

            [LU-17076] ptlrpc_nrs_req_stop_nolock() use after free with ORR NRS policy

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52515/
            Subject: LU-17076 nrs: wait for RCU completion
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a9411a9856a0a1539ea2c8dc9c7eb1bd8fa2c409

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52515/ Subject: LU-17076 nrs: wait for RCU completion Project: fs/lustre-release Branch: master Current Patch Set: Commit: a9411a9856a0a1539ea2c8dc9c7eb1bd8fa2c409

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52295/
            Subject: LU-17076 orr: take a ref after lookup_get_insert_fast()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d7e70e38c942baebfdc186fcd5ad48cdcc2ebde3

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52295/ Subject: LU-17076 orr: take a ref after lookup_get_insert_fast() Project: fs/lustre-release Branch: master Current Patch Set: Commit: d7e70e38c942baebfdc186fcd5ad48cdcc2ebde3

            "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52515
            Subject: LU-17076 nrs: wait for RCU completion
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 18adf51db65b8226435765cef4c099722ad5627f

            gerrit Gerrit Updater added a comment - "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52515 Subject: LU-17076 nrs: wait for RCU completion Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 18adf51db65b8226435765cef4c099722ad5627f

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52295
            Subject: LU-17076 orr: take a ref after lookup_get_insert_fast()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: de641ac0edf93607966bbc7a7ca795edfb1d3049

            gerrit Gerrit Updater added a comment - "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52295 Subject: LU-17076 orr: take a ref after lookup_get_insert_fast() Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: de641ac0edf93607966bbc7a7ca795edfb1d3049

            I think the culprit is the following code:

            static int nrs_orr_res_get(struct ptlrpc_nrs_policy *policy,                     
                                       struct ptlrpc_nrs_request *nrq,                       
                                       const struct ptlrpc_nrs_resource *parent,             
                                       struct ptlrpc_nrs_resource **resp, bool moving_req)   
            ....
                    orro2 = rhashtable_lookup_get_insert_fast(&orrd->od_obj_hash,               
                                                              &orro->oo_rhead,                  
                                                              nrs_orr_hash_params);             
                    /* A returned non-error orro2 means it already exist */                     
                    if (!IS_ERR_OR_NULL(orro2)) {                                               <------ no ref taken if rhashtable_lookup_get_insert_fast returns an existing object
                            OBD_SLAB_FREE_PTR(orro, orrd->od_cache);                            
                            orro = orro2;                                                       
                    }                                                                           
                                                                                                
                    /* insertion failed */                                                      
                    if (IS_ERR(orro2)) {                                                        
            

            A ref needs to be taken if rhashtable_lookup_get_insert_fast() returns an object, otherwise the object could be freed before nrs_orr_res_put()/ptlrpc_nrs_req_stop_nolock().

            eaujames Etienne Aujames added a comment - I think the culprit is the following code: static int nrs_orr_res_get( struct ptlrpc_nrs_policy *policy, struct ptlrpc_nrs_request *nrq, const struct ptlrpc_nrs_resource *parent, struct ptlrpc_nrs_resource **resp, bool moving_req) .... orro2 = rhashtable_lookup_get_insert_fast(&orrd->od_obj_hash, &orro->oo_rhead, nrs_orr_hash_params); /* A returned non-error orro2 means it already exist */ if (!IS_ERR_OR_NULL(orro2)) { <------ no ref taken if rhashtable_lookup_get_insert_fast returns an existing object OBD_SLAB_FREE_PTR(orro, orrd->od_cache); orro = orro2; } /* insertion failed */ if (IS_ERR(orro2)) { A ref needs to be taken if rhashtable_lookup_get_insert_fast() returns an object, otherwise the object could be freed before nrs_orr_res_put()/ptlrpc_nrs_req_stop_nolock().
            eaujames Etienne Aujames added a comment - - edited

            Hi,

            I tried to reproduce this bug because I suspected the "c098c09 LU-14976 nrs: change nrs policies at run time".
            But it seems this is the "LU-8130 nrs: convert NRS ORR/TRR to rhashtable" patch causing the crash.

            I reproduced this crash with debug=all on zfs (it took arround 5-10min):

            ONLY_REPEAT=1000 FSTYPE=zfs auster -rv sanityn --only "77c"
            

            From the vmcore-dmesg:

            [ 1187.404425] RIP: 0010:[<ffffffffc0c33ddb>]  [<ffffffffc0c33ddb>] ptlrpc_nrs_req_stop_nolock+0x3b/0x150 [ptlrpc]  
            [ 1187.404509] RSP: 0018:ffff904365ab7ce0  EFLAGS: 00010246                                                         
            [ 1187.404528] RAX: ffff9043bbae4b40 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000000                                    
            [ 1187.404552] RDX: 0000000000000001 RSI: ffff9043639ca880 RDI: ffff9043639ca880                                    
            [ 1187.404576] RBP: ffff904365ab7ce8 R08: 000000000000000a R09: 0000000000000038                                    
            [ 1187.404600] R10: 0000000000000000 R11: ffff904365ab7af6 R12: ffff9043797b8800                                    
            [ 1187.404624] R13: ffff9043797b88f0 R14: ffff9043baad0e00 R15: 00064f83754e9e80                                    
            [ 1187.404648] FS:  0000000000000000(0000) GS:ffff9043bfc00000(0000) knlGS:0000000000000000                         
            [ 1187.404676] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                                                    
            [ 1187.404695] CR2: 00007fc038264d00 CR3: 0000000078f9e000 CR4: 00000000000606f0                                    
            [ 1187.404731] Call Trace:                                                                                          
            [ 1187.404780]  [<ffffffffc0bfcb8b>] ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]                        
            [ 1187.404846]  [<ffffffffc0c0080e>] ptlrpc_server_handle_request+0x40e/0xc00 [ptlrpc]                              
            [ 1187.404903]  [<ffffffffc0c0232c>] ptlrpc_main+0xc8c/0x1680 [ptlrpc]                                              
            [ 1187.404929]  [<ffffffffb5e2b59e>] ? __switch_to+0xce/0x580                                                       
            [ 1187.404990]  [<ffffffffc0c016a0>] ? ptlrpc_wait_event+0x5c0/0x5c0 [ptlrpc]                                       
            [ 1187.405789]  [<ffffffffb5ec5e61>] kthread+0xd1/0xe0                                                              
            [ 1187.406600]  [<ffffffffb5ec5d90>] ? insert_kthread_work+0x40/0x40                                                
            [ 1187.407483]  [<ffffffffb65aadf7>] ret_from_fork_nospec_begin+0x21/0x21                                           
            [ 1187.408317]  [<ffffffffb5ec5d90>] ? insert_kthread_work+0x40/0x40                                                
            

            From the crash:

            crash> ptlrpc_request ffff9043639ca880
            ....
                  sr_nrq = {                                                         
                    nr_res_ptrs = {0xffff904398815680, 0xffff9043bbae4b40},   
            ....
            
            crash> kmem 0xffff9043bbae4b40                                                    
            CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  
            ffff9043bd16c900 Acpi-State                80         31       306      6     4k  
              SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE                
              ffffe2c481eeb900  ffff9043bbae4000     0     51         10    41                
              FREE / [ALLOCATED]                                                              
               ffff9043bbae4b40  (cpu 0 cache)                                                
                                                                                              
                  PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS                      
            ffffe2c481eeb900 7bae4000                0 ffff9043bbae4b40  1 1fffff00000080 slab
            
            crash> struct nrs_orr_data -x -l nrs_orr_data.od_res  0xffff9043bbae4b40 | head -n10   
            struct nrs_orr_data {                                                                  
              od_res = {                                                                           
                res_parent = 0xffff9043bbae4cd0,                                                   
                res_policy = 0x5a5a5a5a5a5a5a5a                                                    
              },                                                                                   
              od_binheap = 0x5a5a5a5a5a5a5a5a,                                                     
              od_obj_hash = {                                                                      
                tbl = 0x5a5a5a5a5a5a5a5a,                                                          
                nelems = {                                                                         
                  counter = 0x5a5a5a5a                                                             
            
            

            The orr object was freed.

            The debug log confirmed this:

            00000100:00000010:1.1:1693989787.815785:0:10036:0:(nrs_orr.c:832:nrs_orr_object_free()) slab-freed '(orro)': 80 at ffff9043bbae4b40.
            
            eaujames Etienne Aujames added a comment - - edited Hi, I tried to reproduce this bug because I suspected the "c098c09 LU-14976 nrs: change nrs policies at run time". But it seems this is the " LU-8130 nrs: convert NRS ORR/TRR to rhashtable" patch causing the crash. I reproduced this crash with debug=all on zfs (it took arround 5-10min): ONLY_REPEAT=1000 FSTYPE=zfs auster -rv sanityn --only "77c" From the vmcore-dmesg: [ 1187.404425] RIP: 0010:[<ffffffffc0c33ddb>] [<ffffffffc0c33ddb>] ptlrpc_nrs_req_stop_nolock+0x3b/0x150 [ptlrpc] [ 1187.404509] RSP: 0018:ffff904365ab7ce0 EFLAGS: 00010246 [ 1187.404528] RAX: ffff9043bbae4b40 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000000 [ 1187.404552] RDX: 0000000000000001 RSI: ffff9043639ca880 RDI: ffff9043639ca880 [ 1187.404576] RBP: ffff904365ab7ce8 R08: 000000000000000a R09: 0000000000000038 [ 1187.404600] R10: 0000000000000000 R11: ffff904365ab7af6 R12: ffff9043797b8800 [ 1187.404624] R13: ffff9043797b88f0 R14: ffff9043baad0e00 R15: 00064f83754e9e80 [ 1187.404648] FS: 0000000000000000(0000) GS:ffff9043bfc00000(0000) knlGS:0000000000000000 [ 1187.404676] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1187.404695] CR2: 00007fc038264d00 CR3: 0000000078f9e000 CR4: 00000000000606f0 [ 1187.404731] Call Trace: [ 1187.404780] [<ffffffffc0bfcb8b>] ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc] [ 1187.404846] [<ffffffffc0c0080e>] ptlrpc_server_handle_request+0x40e/0xc00 [ptlrpc] [ 1187.404903] [<ffffffffc0c0232c>] ptlrpc_main+0xc8c/0x1680 [ptlrpc] [ 1187.404929] [<ffffffffb5e2b59e>] ? __switch_to+0xce/0x580 [ 1187.404990] [<ffffffffc0c016a0>] ? ptlrpc_wait_event+0x5c0/0x5c0 [ptlrpc] [ 1187.405789] [<ffffffffb5ec5e61>] kthread+0xd1/0xe0 [ 1187.406600] [<ffffffffb5ec5d90>] ? insert_kthread_work+0x40/0x40 [ 1187.407483] [<ffffffffb65aadf7>] ret_from_fork_nospec_begin+0x21/0x21 [ 1187.408317] [<ffffffffb5ec5d90>] ? insert_kthread_work+0x40/0x40 From the crash: crash> ptlrpc_request ffff9043639ca880 .... sr_nrq = { nr_res_ptrs = {0xffff904398815680, 0xffff9043bbae4b40}, .... crash> kmem 0xffff9043bbae4b40 CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE ffff9043bd16c900 Acpi-State 80 31 306 6 4k SLAB MEMORY NODE TOTAL ALLOCATED FREE ffffe2c481eeb900 ffff9043bbae4000 0 51 10 41 FREE / [ALLOCATED] ffff9043bbae4b40 (cpu 0 cache) PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffe2c481eeb900 7bae4000 0 ffff9043bbae4b40 1 1fffff00000080 slab crash> struct nrs_orr_data -x -l nrs_orr_data.od_res 0xffff9043bbae4b40 | head -n10 struct nrs_orr_data { od_res = { res_parent = 0xffff9043bbae4cd0, res_policy = 0x5a5a5a5a5a5a5a5a }, od_binheap = 0x5a5a5a5a5a5a5a5a, od_obj_hash = { tbl = 0x5a5a5a5a5a5a5a5a, nelems = { counter = 0x5a5a5a5a The orr object was freed. The debug log confirmed this: 00000100:00000010:1.1:1693989787.815785:0:10036:0:(nrs_orr.c:832:nrs_orr_object_free()) slab-freed '(orro)': 80 at ffff9043bbae4b40.
            eaujames Etienne Aujames added a comment - +1 on master: https://testing.whamcloud.com/test_sets/49dd0ad5-bff7-4dba-a84e-7d8dcae4402a
            tappro Mikhail Pershin added a comment - - edited

            It happened prior related policy operation ..._req_stop call. I did objdump for related code and it looks like that is due to freed resource:

            static inline void nrs_request_stop(struct ptlrpc_nrs_request *nrq)
            {
                struct ptlrpc_nrs_policy *policy = nrs_request_policy(nrq);
            
                which is in turn:
            
                struct ptlrpc_nrs_policy *nrs_request_policy(struct ptlrpc_nrs_request *nrq)
                {
                    return nrs_request_resource(nrq)->res_policy; <--- here res_policy is filled from resource in %rbx
                }
            
            

            so it looks like nrs_request_resource returned by nrs_request_policy() is freed and contains that 0x5a5a5a5a5a 

            tappro Mikhail Pershin added a comment - - edited It happened prior related policy operation ..._req_stop call. I did objdump for related code and it looks like that is due to freed resource: static inline void nrs_request_stop(struct ptlrpc_nrs_request *nrq) {     struct ptlrpc_nrs_policy *policy = nrs_request_policy(nrq); which is in turn: struct ptlrpc_nrs_policy *nrs_request_policy(struct ptlrpc_nrs_request *nrq) {     return nrs_request_resource(nrq)->res_policy; <--- here res_policy is filled from resource in %rbx } so it looks like nrs_request_resource returned by nrs_request_policy() is freed and contains that 0x5a5a5a5a5a  

            https://testing.whamcloud.com/test_sets/49dd0ad5-bff7-4dba-a84e-7d8dcae4402a

            Another occurrence without relation to  LU-10465, so it seems bug is just started to appear sometimes

             

            tappro Mikhail Pershin added a comment - https://testing.whamcloud.com/test_sets/49dd0ad5-bff7-4dba-a84e-7d8dcae4402a Another occurrence without relation to   LU-10465 , so it seems bug is just started to appear sometimes  

            Strange bug since  nrs_orr_req_stop() is the under lying function being called and the debug data is not being handled by RCU so its not a rhashtable issue.  I was thinking it was due to the lquota memory corruption bug but if you can show its LU-10465 then we have a path to look at it.

            simmonsja James A Simmons added a comment - Strange bug since  nrs_orr_req_stop() is the under lying function being called and the debug data is not being handled by RCU so its not a rhashtable issue.  I was thinking it was due to the lquota memory corruption bug but if you can show its LU-10465 then we have a path to look at it.

            On further check it seems that all recent occurrences are from LU-10465 patch for 4M default stripe size, so it can be related to that patch. 

            tappro Mikhail Pershin added a comment - On further check it seems that all recent occurrences are from LU-10465 patch for 4M default stripe size, so it can be related to that patch. 

            People

              bzzz Alex Zhuravlev
              tappro Mikhail Pershin
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: