LU-16144

OST crash at umount in ptlrpc_nrs_req_stop_nolock (with TBF policy).

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • None
    • 3

    Description

      OST call trace:

      [5839915.258394] BUG: unable to handle kernel NULL pointer dereference at 0000000000000114  
      [5839915.260256] IP: [<ffffffffc0d9e965>] ptlrpc_nrs_req_stop_nolock+0x5/0x150 [ptlrpc]    
      .....
      [5839915.319008]  [<ffffffffc0d6861b>] ? ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]        
      [5839915.320846]  [<ffffffffc0d68867>] ptlrpc_service_purge_all+0x137/0x920 [ptlrpc]                    
      [5839915.322159]  [<ffffffffc0d6ac37>] ptlrpc_unregister_service+0xe7/0x6f0 [ptlrpc]                    
      [5839915.323521]  [<ffffffffc09090f2>] ost_cleanup+0x52/0x1b0 [ost]                                      
      [5839915.324585]  [<ffffffffc0a4db2d>] class_free_dev+0x21d/0x720 [obdclass]                            
      [5839915.325761]  [<ffffffffc0a4e220>] class_export_put+0x1f0/0x2c0 [obdclass]                          
      [5839915.327088]  [<ffffffffc0a4fc95>] class_unlink_export+0x135/0x170 [obdclass]                        
      [5839915.328496]  [<ffffffffc0a659e0>] class_decref+0x80/0x160 [obdclass]                                
      [5839915.329883]  [<ffffffffc0a65e43>] class_detach+0x1b3/0x2e0 [obdclass]                              
      [5839915.331131]  [<ffffffffc0a6ca48>] class_process_config+0x1a38/0x2830 [obdclass]                    
      [5839915.332602]  [<ffffffffb08d3b0a>] ? complete+0x4a/0x60                                              
      [5839915.333756]  [<ffffffffb0ba14fd>] ? list_del+0xd/0x30                                              
      [5839915.334904]  [<ffffffffb0f814fe>] ? wait_for_completion+0x4e/0x140                                  
      [5839915.336336]  [<ffffffffc0a6da20>] class_manual_cleanup+0x1e0/0x710 [obdclass]                      
      [5839915.337972]  [<ffffffffc0a99835>] server_stop_servers+0xd5/0x160 [obdclass]                        
      [5839915.339302]  [<ffffffffc0a9ef9d>] server_put_super+0x12d/0xd00 [obdclass]                          
      [5839915.340450]  [<ffffffffb0a4d53d>] generic_shutdown_super+0x6d/0x100                                
      [5839915.341528]  [<ffffffffb0a4d942>] kill_anon_super+0x12/0x20                                        
      [5839915.342542]  [<ffffffffc0a70852>] lustre_kill_super+0x32/0x50 [obdclass]                            
      [5839915.343693]  [<ffffffffb0a4dd1e>] deactivate_locked_super+0x4e/0x70                                
      [5839915.344791]  [<ffffffffb0a4e4a6>] deactivate_super+0x46/0x60                                        
      [5839915.345863]  [<ffffffffb0a6d03f>] cleanup_mnt+0x3f/0x80                                            
      [5839915.346952]  [<ffffffffb0a6d0d2>] __cleanup_mnt+0x12/0x20                                          
      [5839915.347897]  [<ffffffffb08c2e5b>] task_work_run+0xbb/0xe0                                          
      [5839915.348805]  [<ffffffffb082cc65>] do_notify_resume+0xa5/0xc0                                        
      [5839915.349916]  [<ffffffffb0f8e23b>] int_signal+0x12/0x17                                              
      

      ptlrpc_server_request_get() returns a NULL pointer in ptlrpc_service_purge_all(), and that NULL request is then passed to ptlrpc_server_finish_active_request():

       ptlrpc_service_purge_all(struct ptlrpc_service *svc)
      ....
                       while (ptlrpc_server_request_pending(svcpt, true)) {       
                               req = ptlrpc_server_request_get(svcpt, true);      
                               ptlrpc_server_finish_active_request(svcpt, req);   
                       }                                                          
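
The failure mode in this loop can be modeled in user space. The following is only a minimal sketch, not Lustre code: every `toy_*` name is hypothetical, and the structures are simplified stand-ins for the real service partition and request. It assumes the "pending" check reports queued work while the getter refuses to hand out a request, which is the combination the trace suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the Lustre structures. */
struct toy_req { int id; };

struct toy_svcpt {
	struct toy_req *queued;  /* one queued request, or NULL */
	bool throttling;         /* models a policy refusing to dequeue */
};

/* Models the "pending" check: reports whether work is queued. */
static bool toy_request_pending(const struct toy_svcpt *pt)
{
	return pt->queued != NULL;
}

/* Models a getter that ignores 'force' while throttling. */
static struct toy_req *toy_request_get(struct toy_svcpt *pt, bool force)
{
	struct toy_req *req;

	(void)force;             /* deliberately unused: this is the bug being modeled */
	if (pt->throttling)
		return NULL;
	req = pt->queued;
	pt->queued = NULL;
	return req;
}

/*
 * Purge loop with a defensive NULL check. Returns the number of requests
 * finished, or -1 if the getter returned NULL although work was pending
 * (the point where the unguarded loop would dereference NULL).
 */
static int toy_purge_all(struct toy_svcpt *pt)
{
	int finished = 0;

	assert(pt != NULL);
	while (toy_request_pending(pt)) {
		struct toy_req *req = toy_request_get(pt, true);

		if (req == NULL)
			return -1;
		finished++;
	}
	return finished;
}
```

With `throttling` set and one request queued, `toy_purge_all()` reports -1 instead of crashing. The NULL check only makes the mismatch visible; it does not drain the queue.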
      

      It seems that nrs_tbf_req_get() does not implement force mode. While the policy is throttling, it returns NULL without checking the force argument:

      static                                                                          
      struct ptlrpc_nrs_request *nrs_tbf_req_get(struct ptlrpc_nrs_policy *policy,    
                                                 bool peek, bool force)               
      {                                                                               
              struct nrs_tbf_head       *head = policy->pol_private;                  
              struct ptlrpc_nrs_request *nrq = NULL;                                  
              struct nrs_tbf_client     *cli;                                         
              struct binheap_node       *node;                                        
                                                                                      
              assert_spin_locked(&policy->pol_nrs->nrs_svcpt->scp_req_lock);          
                                                                                      
              if (!peek && policy->pol_nrs->nrs_throttling)                           <---------
                      return NULL;                                                    
      ....
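
One way a getter could honor force mode is to skip the throttling check when `force` is set. The sketch below is a simplified user-space model under that assumption, not the fix applied in Lustre: `toy_policy`, `toy_tbf_req_get` and `TOY_QUEUE_LEN` are hypothetical names, and a plain FIFO of request ids stands in for the TBF binary heap and its token deadlines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 4

/* Hypothetical simplified policy state; not struct ptlrpc_nrs_policy. */
struct toy_policy {
	bool throttling;            /* models nrs_throttling */
	int  queue[TOY_QUEUE_LEN];  /* FIFO of request ids */
	int  head;                  /* index of the next request */
	int  count;                 /* number of queued requests */
};

/*
 * Returns a pointer to the next request id, or NULL if none is available.
 * peek:  look at the next request without dequeuing it.
 * force: hand out a request even while the policy is throttling.
 */
static const int *toy_tbf_req_get(struct toy_policy *pol, bool peek, bool force)
{
	const int *id;

	assert(pol != NULL);

	/* Refuse only when the caller is dequeuing and did not ask for force. */
	if (!peek && !force && pol->throttling)
		return NULL;

	if (pol->count == 0)
		return NULL;

	id = &pol->queue[pol->head];
	if (!peek) {
		pol->head = (pol->head + 1) % TOY_QUEUE_LEN;
		pol->count--;
	}
	return id;
}
```

In this model a forced caller, such as a purge at unmount, can always drain the queue, while a normal dequeue still sees NULL during throttling.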
      

            People

              eaujames Etienne Aujames