Lustre / LU-5340

oops in cl_object_top() after reconstructed setattr on removed directory


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.6.0, Lustre 2.7.0
    • Severity: 3
    • Rank (Obsolete): 14897

    Description

      See the discussion at http://review.whamcloud.com/#/c/11025/2/lustre/mdt/mdt_recovery.c.

      To reproduce:

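      MOUNT_2=y llmount.sh              # test setup with a second client mount at /mnt/lustre2
      mkdir /mnt/lustre/d0
      stat /mnt/lustre/d0
      lctl set_param fail_loc=0x119     # OBD_FAIL_MDS_REINT_NET_REP
      touch /mnt/lustre/d0 &            # the setattr reply is dropped (see the log below)
      sleep 1
      lctl set_param fail_loc=0
      rmdir /mnt/lustre2/d0             # remove the directory through the second mount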
      
      [  132.938246] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
      [  132.970143] Lustre: *** cfs_fail_loc=119, val=2147483648***
      [  132.972029] LustreError: 3623:0:(ldlm_lib.c:2399:target_send_reply_msg()) @@@ dropping reply  req@ffff8801f8edd908 x1473623045964728/t4294967298(0) o36->ac7dcf52-7dca-015b-3e5b-0a840c48c232@0@lo:0/0 lens 488/456 e 0 to 0 dl 1405356492 ref 1 fl Interpret:/0/0 rc 0/0
      [  139.969185] Lustre: 3855:0:(client.c:1926:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1405356486/real 1405356486]  req@ffff8801f64beb00 x1473623045964728/t0(0) o36->lustre-MDT0000-mdc-ffff880203aa4860@0@lo:12/10 lens 488/1016 e 0 to 1 dl 1405356493 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
      [  139.978122] Lustre: lustre-MDT0000-mdc-ffff880203aa4860: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
      [  139.983045] Lustre: lustre-MDT0000: Client ac7dcf52-7dca-015b-3e5b-0a840c48c232 (at 0@lo) reconnecting
      [  139.984979] Lustre: lustre-MDT0000-mdc-ffff880203aa4860: Connection restored to lustre-MDT0000 (at 0@lo)
      [  139.987189] BUG: unable to handle kernel NULL pointer dereference at (null)
      [  139.988113] IP: [<ffffffffa046084e>] cl_object_top+0xe/0x150 [obdclass]
      [  139.988113] PGD 1f4083067 PUD 1fbfb9067 PMD 0 
      [  139.988113] Oops: 0000 [#1] SMP 
      [  139.988113] last sysfs file: /sys/devices/system/cpu/possible
      [  139.988113] CPU 0 
      [  139.988113] Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) mgs(U) nodemap(U) osd_ldiskfs(U) ldiskfs(U) exportfs lquota(U) lfsck(U) jbd obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 microcode virtio_balloon virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      [  139.988113] 
      [  139.988113] Pid: 3855, comm: touch Not tainted 2.6.32-431.5.1.el6.lustre.x86_64 #1 Bochs Bochs
      [  139.988113] RIP: 0010:[<ffffffffa046084e>]  [<ffffffffa046084e>] cl_object_top+0xe/0x150 [obdclass]
      [  139.988113] RSP: 0018:ffff8801f40b9c58  EFLAGS: 00010292
      [  139.988113] RAX: ffff8801f8ed2f60 RBX: ffff880219e9a690 RCX: 0000000000000000
      [  139.988113] RDX: ffff8801f8f8bce8 RSI: ffffffffa04bca20 RDI: 0000000000000000
      [  139.988113] RBP: ffff8801f40b9c68 R08: 0000000000000001 R09: 0000000000000001
      [  139.988113] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880217e903c8
      [  139.988113] R13: 0000000000000002 R14: 0000000000000000 R15: ffff8801f8ed2f60
      [  139.988113] FS:  00007f641a061700(0000) GS:ffff88002f800000(0000) knlGS:0000000000000000
      [  139.988113] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  139.988113] CR2: 0000000000000000 CR3: 00000001f4074000 CR4: 00000000000006f0
      [  139.988113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  139.988113] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  139.988113] Process touch (pid: 3855, threadinfo ffff8801f40b8000, task ffff8801fbf16380)
      [  139.988113] Stack:
      [  139.988113]  ffff8801f40b9c68 ffff880219e9a690 ffff8801f40b9ca8 ffffffffa046fded
      [  139.988113] <d> ffff8801f40b9cd0 ffff880217e903a0 ffff880219e9a690 ffff8801f40b9e48
      [  139.988113] <d> ffff880217e903c8 ffff8801f40b9cd4 ffff8801f40b9d08 ffffffffa0e846e9
      [  139.988113] Call Trace:
      [  139.988113]  [<ffffffffa046fded>] cl_io_init+0x3d/0xe0 [obdclass]
      [  139.988113]  [<ffffffffa0e846e9>] cl_setattr_ost+0xe9/0x2f0 [lustre]
      [  139.988113]  [<ffffffffa0e4e98c>] ll_setattr_raw+0xa2c/0x10d0 [lustre]
      [  139.988113]  [<ffffffff8107b997>] ? current_fs_time+0x27/0x30
      [  139.988113]  [<ffffffffa0e4f095>] ll_setattr+0x65/0xd0 [lustre]
      [  139.988113]  [<ffffffff811c16a8>] notify_change+0x168/0x340
      [  139.988113]  [<ffffffff811d601e>] utimes_common+0xde/0x1c0
      [  139.988113]  [<ffffffff8119f51b>] ? put_unused_fd+0x3b/0x90
      [  139.988113]  [<ffffffff811d61d0>] do_utimes+0xd0/0x170
      [  139.988113]  [<ffffffff811d6372>] sys_utimensat+0x32/0x90
      [  139.988113]  [<ffffffff81554222>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [  139.988113]  [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      [  139.988113] Code: 48 89 df e8 85 71 d2 e0 48 c7 c3 f4 ff ff ff e9 2a ff ff ff 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 <48> 8b 07 0f 1f 80 00 00 00 00 48 89 c2 48 8b 80 a0 00 00 00 48 
      [  139.988113] RIP  [<ffffffffa046084e>] cl_object_top+0xe/0x150 [obdclass]
      [  139.988113]  RSP <ffff8801f40b9c58>
      [  139.988113] CR2: 0000000000000000
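
The oops above can be read mechanically. The following is a minimal sketch, not part of Lustre: `parse_oops` is a hypothetical helper, and `OOPS` is a hand-trimmed excerpt of the log with timestamps removed and the `Code:` line shortened. It extracts the faulting symbol, the register values, and the opcode bytes that the kernel marks with `<..>`. On x86_64 the bytes `48 8b 07` decode to `mov (%rdi),%rax`, and RDI carries the first integer argument in the System V AMD64 calling convention, so RDI = 0 together with CR2 = 0 is consistent with cl_object_top() having been called with a NULL pointer and dereferencing it right after the function prologue (offset 0xe).

```python
import re

# Excerpt of the oops from the log above (timestamps stripped, Code: line shortened).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa046084e>] cl_object_top+0xe/0x150 [obdclass]
RAX: ffff8801f8ed2f60 RBX: ffff880219e9a690 RCX: 0000000000000000
RDX: ffff8801f8f8bce8 RSI: ffffffffa04bca20 RDI: 0000000000000000
Code: 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 <48> 8b 07 0f 1f 80 00 00 00 00
CR2: 0000000000000000
"""


def parse_oops(text):
    """Hypothetical helper: pull the key facts out of an x86_64 oops excerpt."""
    info = {}

    # Faulting symbol, offset into the function, and owning module.
    ip = re.search(r"IP: \[<[0-9a-f]+>\] (\w+)\+(0x[0-9a-f]+)/0x[0-9a-f]+ \[(\w+)\]", text)
    if ip:
        info["symbol"] = ip.group(1)
        info["offset"] = int(ip.group(2), 16)
        info["module"] = ip.group(3)

    # Three-character register names followed by a 16-digit hex value (RAX, RDI, CR2, ...).
    info["regs"] = {
        name: int(value, 16)
        for name, value in re.findall(r"\b([A-Z][A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    }

    # The kernel wraps the first byte of the faulting instruction in <..>.
    code = re.search(r"Code: (.*)", text)
    if code:
        tokens = code.group(1).split()
        start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
        info["fault_bytes"] = bytes(int(tok.strip("<>"), 16) for tok in tokens[start:start + 3])

    return info


if __name__ == "__main__":
    info = parse_oops(OOPS)
    print(info["symbol"], hex(info["offset"]), info["module"])
    print("RDI =", hex(info["regs"]["RDI"]), "CR2 =", hex(info["regs"]["CR2"]))
    print("faulting bytes:", info["fault_bytes"].hex())
```

The sketch only reads the log. It does not say why the caller passed a NULL object, which is the subject of the review discussion linked in the description.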
      

            People

              Assignee: Lai Siyao (laisiyao)
              Reporter: John Hammond (jhammond)
              Votes: 0
              Watchers: 5
