Lustre / LU-2637

Invalid kernel paging request in osc_import_event

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.4.0
    • 3
    • 6169

    Description

      I hit this today on a few of the IONs for Sequoia. I was running a mount and umount of the filesystem in a loop; upon killing the script, I hit the crash below on some of the nodes:

      LustreError: 8462:0:(llite_lib.c:543:client_common_fill_super()) cannot start close thread: rc -513
      Unable to handle kernel paging request for data at address 0x00000010            
      Faulting instruction address: 0x80000000046825cc                                
      Oops: Kernel access of bad area, sig: 11 [#1]                                   
      SMP NR_CPUS=68 Blue Gene/Q                                                      
      Modules linked in: lmv(U) mgc(U) lustre(U) mdc(U) fid(U) fld(U) lov(U) osc(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) bgvrnic bgmudm
      NIP: 80000000046825cc LR: 8000000003bc3acc CTR: 8000000004682590                
      REGS: c0000003c4a0eaa0 TRAP: 0300   Not tainted  (2.6.32-220.23.3.bgq.18llnl.V1R1M2.bgq62_16.ppc64)
      MSR: 0000000080029000 <EE,ME,CE>  CR: 44088488  XER: 20000000                   
      DEAR: 0000000000000010, ESR: 0000000000000000                                   
      TASK = c0000003e4fdf360[8462] 'mount.lustre' THREAD: c0000003c4a0c000 CPU: 3     
      GPR00: 0000000003060580 c0000003c4a0ed20 80000000046f07a0 c0000003ce1726f0       
      GPR04: c000000360b9f800 00000000000012e0 c0000003c4a0f0b0 00000000640a0000       
      GPR08: c000000360b9f954 0000000000000000 0000000000000001 8000000004684240       
      GPR12: 8000000003bfebb0 c000000000764c00 c0000003c4a68000 00000000000005c2       
      GPR16: 0000000000000001 0000000000002d20 0000000002000400 000000000000028a       
      GPR20: 0000000000000295 0000000000020000 80000000025220e0 0000000000000001       
      GPR24: 0000000000000000 c0000003ce16c138 8000000000b2424c 0000000040080000       
      GPR28: c0000003ce1726f0 8000000000b24248 80000000046eda40 c0000003c4a0ed20       
      NIP [80000000046825cc] .osc_import_event+0xa5c/0x26d0 [osc]                     
      LR [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]            
      Call Trace:                                                                     
      [c0000003c4a0ed20] [c000000360b9f8b0] 0xc000000360b9f8b0 (unreliable)           
      [c0000003c4a0ee60] [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
      [c0000003c4a0ef30] [8000000003bc47b8] .ptlrpc_invalidate_import+0x1d8/0xef0 [ptlrpc]
      [c0000003c4a0f0d0] [8000000004690238] .osc_precleanup+0x2a8/0x720 [osc]         
      [c0000003c4a0f190] [80000000024ac3a0] .class_cleanup+0x240/0x17a0 [obdclass]    
      [c0000003c4a0f310] [80000000024b3208] .class_process_config+0x20f8/0x4b00 [obdclass]
      [c0000003c4a0f460] [80000000024b614c] .class_manual_cleanup+0x53c/0x1760 [obdclass]
      [c0000003c4a0f5c0] [800000000691d6e4] .ll_put_super+0x2c4/0x800 [lustre]        
      [c0000003c4a0f750] [800000000691e3d8] .ll_fill_super+0x7b8/0xae20 [lustre]      
      [c0000003c4a0f900] [80000000024e1144] .lustre_fill_super+0x4c4/0x8e0 [obdclass] 
      [c0000003c4a0f9d0] [c0000000000d4508] .get_sb_nodev+0x84/0xe8                   
      [c0000003c4a0fa80] [80000000024ba618] .lustre_get_sb+0x28/0x40 [obdclass]       
      [c0000003c4a0fb10] [c0000000000d2f14] .vfs_kern_mount+0x80/0x114                
      [c0000003c4a0fbc0] [c0000000000d3010] .do_kern_mount+0x58/0x130                 
      [c0000003c4a0fc80] [c0000000000f12fc] .do_mount+0x84c/0x908                     
      [c0000003c4a0fd70] [c0000000000f1470] .SyS_mount+0xb8/0x124                     
      [c0000003c4a0fe30] [c000000000000580] syscall_exit+0x0/0x2c                     
      Instruction dump:                                                               
      419e00f4 801a0000 20b14000 7fa50040 41dd1868 801d0000 780907e1 40820108         
      eb7900c0 777b4008 4182097c e9390000 <e9290010> eb6901c0 2fbb0000 419e0d10       
      Kernel panic - not syncing: Fatal exception                                      
      Call Trace:                                                                     
      [c0000003c4a0e7d0] [c000000000008d1c] .show_stack+0x7c/0x184 (unreliable)       
      [c0000003c4a0e880] [c000000000431ef4] .panic+0x80/0x1ac                         
      [c0000003c4a0e910] [c000000000019d40] .die+0x1a4/0x1bc                          
      [c0000003c4a0e9b0] [c00000000001f95c] .bad_page_fault+0xb8/0xd4                 
      [c0000003c4a0ea30] [c000000000014e4c] storage_fault_common+0x48/0x4c            
      --- Exception: 300 at .osc_import_event+0xa5c/0x26d0 [osc]                      
          LR = .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]                          
      [c0000003c4a0ed20] [c000000360b9f8b0] 0xc000000360b9f8b0 (unreliable)           
      [c0000003c4a0ee60] [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
      [c0000003c4a0ef30] [8000000003bc47b8] .ptlrpc_invalidate_import+0x1d8/0xef0 [ptlrpc]
      [c0000003c4a0f0d0] [8000000004690238] .osc_precleanup+0x2a8/0x720 [osc]         
      [c0000003c4a0f190] [80000000024ac3a0] .class_cleanup+0x240/0x17a0 [obdclass]    
      [c0000003c4a0f310] [80000000024b3208] .class_process_config+0x20f8/0x4b00 [obdclass]
      [c0000003c4a0f460] [80000000024b614c] .class_manual_cleanup+0x53c/0x1760 [obdclass]
      [c0000003c4a0f5c0] [800000000691d6e4] .ll_put_super+0x2c4/0x800 [lustre]        
      [c0000003c4a0f750] [800000000691e3d8] .ll_fill_super+0x7b8/0xae20 [lustre]      
      [c0000003c4a0f900] [80000000024e1144] .lustre_fill_super+0x4c4/0x8e0 [obdclass] 
      [c0000003c4a0f9d0] [c0000000000d4508] .get_sb_nodev+0x84/0xe8                   
      [c0000003c4a0fa80] [80000000024ba618] .lustre_get_sb+0x28/0x40 [obdclass]       
      [c0000003c4a0fb10] [c0000000000d2f14] .vfs_kern_mount+0x80/0x114                
      [c0000003c4a0fbc0] [c0000000000d3010] .do_kern_mount+0x58/0x130                 
      [c0000003c4a0fc80] [c0000000000f12fc] .do_mount+0x84c/0x908                     
      [c0000003c4a0fd70] [c0000000000f1470] .SyS_mount+0xb8/0x124                     
      [c0000003c4a0fe30] [c000000000000580] syscall_exit+0x0/0x2c 
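
A note on reading the oops above: the faulting word in the instruction dump is the one in angle brackets, `<e9290010>`, and the word just before it is `e9390000`. The sketch below decodes both as PowerPC DS-form loads and checks the computed effective address against the DEAR value in the log. The helper names (`decode_ds_form`, `effective_address`) are made up for illustration and are not part of any Lustre or kernel tooling; the register values are copied from the GPR lines of the oops.

```python
def decode_ds_form(word: int) -> dict:
    """Decode a 32-bit PowerPC DS-form load (primary opcode 58: ld/ldu/lwa)."""
    opcode = (word >> 26) & 0x3F
    if opcode != 58:
        raise ValueError(f"not a DS-form load: primary opcode {opcode}")
    rt = (word >> 21) & 0x1F          # target register
    ra = (word >> 16) & 0x1F          # base register
    disp = word & 0xFFFC              # DS field with two zero bits appended
    if disp & 0x8000:                 # sign-extend the 16-bit displacement
        disp -= 0x10000
    mnemonic = {0: "ld", 1: "ldu", 2: "lwa"}.get(word & 0x3, "unknown")
    return {"mnemonic": mnemonic, "rt": rt, "ra": ra, "disp": disp}


def effective_address(insn: dict, gprs: dict) -> int:
    """Compute (RA|0) + displacement, wrapped to 64 bits."""
    base = 0 if insn["ra"] == 0 else gprs[insn["ra"]]
    return (base + insn["disp"]) & 0xFFFFFFFFFFFFFFFF


def fmt(insn: dict) -> str:
    return f'{insn["mnemonic"]} r{insn["rt"]},{insn["disp"]}(r{insn["ra"]})'


# Register values taken from the GPR dump in the oops.
GPRS = {9: 0x0000000000000000, 25: 0xC0000003CE16C138}

previous = decode_ds_form(0xE9390000)   # word before the faulting one
faulting = decode_ds_form(0xE9290010)   # word shown as <e9290010>

if __name__ == "__main__":
    print(fmt(previous))
    print(fmt(faulting))
    print(hex(effective_address(faulting, GPRS)))
```

Under this decoding, the previous instruction loads r9 from the address held in r25, and the faulting instruction then loads from offset 16 of r9. With GPR09 equal to zero in the register dump, the computed address is 0x10, which matches both "data at address 0x00000010" and the DEAR value. That suggests a NULL pointer read from the object r25 points to, followed by a dereference of a field at offset 0x10, although which structure and field that is cannot be determined from the log alone.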
      

          People

            bobijam Zhenyu Xu
            prakash Prakash Surya (Inactive)
