  Lustre / LU-10232

kernel BUG at cl_object.c:206!

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.11.0, Lustre 2.10.3
    • Labels:
      None
    • Severity:
      3
    • Rank (Obsolete):
      9223372036854775807

      Description

      This bug was seen during Oleg's tests:

       

      [ 2954.010902] ------------[ cut here ]------------
      [ 2954.011604] kernel BUG at /home/green/git/lustre-release/lustre/obdclass/cl_object.c:206!
      [ 2954.012833] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 2954.013401] Modules linked in: loop lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) mbcache lquota(OE) lfsck(OE) jbd2 obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 syscopyarea sysfillrect sysimgblt ata_generic pata_acpi ttm drm_kms_helper i2c_piix4 drm ata_piix pcspkr serio_raw i2c_core virtio_balloon virtio_console libata virtio_blk floppy nfsd ip_tables
      [ 2954.018129] CPU: 1 PID: 20034 Comm: lfs Tainted: G OE ------------ 3.10.0-debug #1
      [ 2954.019284] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 2954.019816] task: ffff88001cb74b80 ti: ffff880004108000 task.ti: ffff880004108000
      [ 2954.020797] RIP: 0010:[<ffffffffa037e281>] [<ffffffffa037e281>] cl_object_attr_get+0x141/0x150 [obdclass]
      [ 2954.021833] RSP: 0018:ffff88000410bbf8 EFLAGS: 00010246
      [ 2954.022354] RAX: 0000000000000000 RBX: ffff880008d97fa0 RCX: ffff880008d97f48
      [ 2954.022883] RDX: 0000000000000000 RSI: ffff880008d97fa0 RDI: ffff880008d97fa0
      [ 2954.023432] RBP: ffff88000410bc18 R08: 0000000000000008 R09: 00000000000000d8
      [ 2954.023985] R10: ffff88005356cf00 R11: 0000000000000000 R12: ffff88005356cf00
      [ 2954.024533] R13: ffff880007698f68 R14: ffff88000410bc60 R15: ffff88006b7e7ed0
      [ 2954.025093] FS: 00007f648fe5f740(0000) GS:ffff8800bc680000(0000) knlGS:0000000000000000
      [ 2954.026098] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 2954.026624] CR2: 00007f9901ead10c CR3: 000000008dfbe000 CR4: 00000000000006e0
      [ 2954.027184] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2954.027741] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 2954.028294] Stack:
      [ 2954.028766] 0000000000000100 ffff88005356cf00 0000000000a8f060 ffff88005356cf00
      [ 2954.029782] ffff88000410bcd8 ffffffffa08c185c 0000000000000020 ffff880007698f68
      [ 2954.030815] 0000000000000020 000000000bd10bd0 0000000000000000 0000000000000000
      [ 2954.031811] Call Trace:
      [ 2954.032293] [<ffffffffa08c185c>] lov_getstripe+0x79c/0x940 [lov]
      [ 2954.033778] [<ffffffffa08bfb4f>] lov_object_getstripe+0x6f/0x180 [lov]
      [ 2954.034366] [<ffffffffa037df4b>] cl_object_getstripe+0x6b/0x130 [obdclass]
      [ 2954.034953] [<ffffffffa0e1ad80>] ll_file_getstripe+0x70/0x170 [lustre]
      [ 2954.046852] [<ffffffffa0e2d322>] ll_lov_setstripe+0x332/0x380 [lustre]
      [ 2954.047402] [<ffffffffa0e2ecbe>] ll_file_ioctl+0x116e/0x35f0 [lustre]
      [ 2954.047953] [<ffffffff810646c5>] ? kernel_map_pages+0xb5/0x120
      [ 2954.048485] [<ffffffff81201985>] do_vfs_ioctl+0x305/0x520
      [ 2954.049018] [<ffffffff817063f7>] ? _raw_spin_unlock+0x27/0x40
      [ 2954.049546] [<ffffffff81201c41>] SyS_ioctl+0xa1/0xc0
      [ 2954.050068] [<ffffffff8170fc89>] system_call_fastpath+0x16/0x1b
      [ 2954.050599] Code: 3a a0 c7 05 32 7a 07 00 cf 00 00 00 48 c7 05 33 7a 07 00 00 00 00 00 c7 05 21 7a 07 00 01 00 00 00 e8 d4 89 e5 ff e9 0a ff ff ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
      [ 2954.052727] RIP [<ffffffffa037e281>] cl_object_attr_get+0x141/0x150 [obdclass]
      [ 2954.053750] RSP <ffff88000410bbf8>
      
      

      A quick review shows that line 206 of cl_object.c is the assert_spin_locked() check at the top of cl_object_attr_get():

      int cl_object_attr_get(const struct lu_env *env, struct cl_object *obj,
                             struct cl_attr *attr)
      {
              struct lu_object_header *top;
              int result;
      
              assert_spin_locked(cl_object_attr_guard(obj));
      
      

      The caller, lov_getstripe(), calls it without the cl_object_attr_lock()/cl_object_attr_unlock() pair, so the assertion fires:

      lov_getstripe()
      ...
      			cl_obj = cl_object_top(&obj->lo_cl);
      			cl_object_attr_get(env, cl_obj, &attr);
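
The locking contract described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the Lustre patch: every `model_*` name is hypothetical, a plain flag stands in for the attribute guard spinlock, and an error return stands in for the kernel BUG raised by assert_spin_locked(). In the kernel, the corresponding change would presumably be to wrap the cl_object_attr_get() call in lov_getstripe() with cl_object_attr_lock(cl_obj) and cl_object_attr_unlock(cl_obj).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct cl_attr and struct cl_object. */
struct model_attr {
	unsigned long long size;
};

struct model_object {
	bool guard_held;          /* models the attr guard spinlock state */
	struct model_attr attr;
};

/* Models cl_object_attr_lock(): take the guard. */
static void model_attr_lock(struct model_object *obj)
{
	assert(!obj->guard_held);
	obj->guard_held = true;
}

/* Models cl_object_attr_unlock(): release the guard. */
static void model_attr_unlock(struct model_object *obj)
{
	assert(obj->guard_held);
	obj->guard_held = false;
}

/*
 * Models cl_object_attr_get(): the caller must hold the guard.
 * Returns -1 where the kernel's assert_spin_locked() would BUG.
 */
static int model_attr_get(struct model_object *obj, struct model_attr *out)
{
	if (!obj->guard_held)
		return -1;
	*out = obj->attr;
	return 0;
}

/* Models the reported caller: no lock taken around the get. */
static int model_getstripe_unlocked(struct model_object *obj,
				    struct model_attr *out)
{
	return model_attr_get(obj, out);
}

/* Models the caller with the lock/unlock pair in place. */
static int model_getstripe_locked(struct model_object *obj,
				  struct model_attr *out)
{
	int rc;

	model_attr_lock(obj);
	rc = model_attr_get(obj, out);
	model_attr_unlock(obj);
	return rc;
}
```

In this model, `model_getstripe_unlocked()` fails the guard check in the same place the oops was reported, while `model_getstripe_locked()` reads the attributes and leaves the guard released on return.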
      
      

              People

              • Assignee: WC Triage (wc-triage)
              • Reporter: Mikhail Pershin (tappro)