Details

    • 3
    • 8136

    Description

      RIP: 0010:[<ffffffffa06a2807>]  [<ffffffffa06a2807>] cfs_crypto_hash_digest+0xf6/0x166 [libcfs]
      RSP: 0018:ffff88007c5fbda8  EFLAGS: 00010293
      RAX: ffffea0001a55e00 RBX: ffff880078640000 RCX: 0000000087654321
      RDX: 0000000000000000 RSI: ffffffffa06c6398 RDI: ffff880078640000
      RBP: ffff88007c5fbe28 R08: 0000000000000000 R09: d84156c5635688c0
      R10: d84156c5635688c0 R11: 0000000000000020 R12: 0000000000000000
      R13: ffff88007c5fbe78 R14: 0000000000020000 R15: ffff88007c5fbe74
      FS:  00007f2ef3eb6700(0000) GS:ffff880005000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f4777fa5000 CR3: 000000007c62c000 CR4: 00000000000406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process insmod (pid: 8489, threadinfo ffff88007c5fa000, task ffff88007c6f00c0)
      Stack:
       0000000000000000 0000000000000000 0000000000000000 0000000000000000
      <0> 0000000000000000 0000000000000000 ffff88007c321988 0000000000000000
      <0> 00007f2ef1eee010 ffffffffa06c6390 ffff88007c5fbe28 0000000000000000
      Call Trace:
       [<ffffffffa06a29c8>] cfs_crypto_register+0x151/0x35a [libcfs]
       [<ffffffff8151cbcb>] ? _spin_unlock+0x2b/0x40
       [<ffffffffa06b3120>] ? cfs_wi_sched_create+0x3bf/0x4d4 [libcfs]
       [<ffffffffa06a5d34>] ? init_libcfs_module+0x0/0x377 [libcfs]
       [<ffffffffa06a5d34>] ? init_libcfs_module+0x0/0x377 [libcfs]
       [<ffffffffa06a5f5f>] init_libcfs_module+0x22b/0x377 [libcfs]
       [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
       [<ffffffff810c1333>] sys_init_module+0xe3/0x260
       [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
      Code: c1 e8 0c 48 8d 0c c5 00 00 00 00 48 c1 e0 06 48 29 c8 48 b9 00 00 00 00 00 ea ff ff 48 8d 04 08 b9 21 43 65 87 48 39 4d 80 74 04 <0f> 0b eb fe f6 c2 01 74 04 0f 0b eb fe 83 e2 03 48 09 c2 48 89 
      RIP  [<ffffffffa06a2807>] cfs_crypto_hash_digest+0xf6/0x166 [libcfs]
       RSP <ffff88007c5fbda8>
      ---[ end trace 90f8b1994767d769 ]---
      Kernel panic - not syncing: Fatal exception
      Pid: 8489, comm: insmod Tainted: G      D    ----------------   2.6.32-220.7.1.el6.lustre.2015.x86_64.debug #1
      Call Trace:
       [<ffffffff81518f20>] ? panic+0x78/0x148
       [<ffffffff8151e564>] ? oops_end+0xe4/0x100
       [<ffffffff8100f3bb>] ? die+0x5b/0x90
       [<ffffffff8151dc54>] ? do_trap+0xc4/0x160
       [<ffffffff8100d005>] ? do_invalid_op+0xa5/0xb0
       [<ffffffff8100cff5>] ? do_invalid_op+0x95/0xb0
       [<ffffffffa06a2807>] ? cfs_crypto_hash_digest+0xf6/0x166 [libcfs]
       [<ffffffff8151c661>] ? trace_hardirqs_off_thunk+0x3a/0x3c
       [<ffffffff8100bf9b>] ? invalid_op+0x1b/0x20
       [<ffffffffa06a2807>] ? cfs_crypto_hash_digest+0xf6/0x166 [libcfs]
       [<ffffffffa06a27d7>] ? cfs_crypto_hash_digest+0xc6/0x166 [libcfs]
       [<ffffffffa06a29c8>] ? cfs_crypto_register+0x151/0x35a [libcfs]
       [<ffffffff8151cbcb>] ? _spin_unlock+0x2b/0x40
       [<ffffffffa06b3120>] ? cfs_wi_sched_create+0x3bf/0x4d4 [libcfs]
       [<ffffffffa06a5d34>] ? init_libcfs_module+0x0/0x377 [libcfs]
       [<ffffffffa06a5d34>] ? init_libcfs_module+0x0/0x377 [libcfs]
       [<ffffffffa06a5f5f>] ? init_libcfs_module+0x22b/0x377 [libcfs]
       [<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
       [<ffffffff810c1333>] ? sys_init_module+0xe3/0x260
       [<ffffffff8100b0b2>] ? system_call_fastpath+0x16/0x1b
      

      Attachments

        Issue Links

          Activity

            [LU-1714] crash upon loading libcfs module

            I decided to drop the "cfs_" prefix since there are other "sg_" calls in the code which do not use any "cfs_" wrappers. So to be consistent with those calls, I think it's best to leave them with the "sg_" prefix.

            prakash Prakash Surya (Inactive) added a comment - I decided to drop the "cfs_" prefix since there are other "sg_" calls in the code which do not use any "cfs_" wrappers. So to be consistent with those calls, I think it's best to leave them with the "sg_" prefix.

            I think it`s better to add something like this to libcfs headers.

            #ifndef HAVE_SCATTERLIST_INITTABLE
            #define sg_init_table(sg, nents) memset(sg, 0, sizeof(*(sg))*(nents))
            #endif
            #ifndef HAVE_SCATTERLIST_SETPAGE
            sg_set_page(sg, p, len, off)   sg_set_buf(sg, page_address(p) + (off & ~CFS_PAGE_MASK), len)
            #endif
            

            Or maybe cfs_ prefix for this functions.
            lnet/klnds/o2iblnd/o2iblnd_cb.c already use initialized sg, and set_page is faster then set_buf, because it skip addr to page conversion.

            aboyko Alexander Boyko added a comment - I think it`s better to add something like this to libcfs headers. #ifndef HAVE_SCATTERLIST_INITTABLE #define sg_init_table(sg, nents) memset(sg, 0, sizeof(*(sg))*(nents)) #endif #ifndef HAVE_SCATTERLIST_SETPAGE sg_set_page(sg, p, len, off) sg_set_buf(sg, page_address(p) + (off & ~CFS_PAGE_MASK), len) #endif Or maybe cfs_ prefix for this functions. lnet/klnds/o2iblnd/o2iblnd_cb.c already use initialized sg, and set_page is faster then set_buf, because it skip addr to page conversion.
            prakash Prakash Surya (Inactive) added a comment - Please see: http://review.whamcloud.com/3709

            Some other places which may need similar fixes:

            lnet/klnds/o2iblnd/o2iblnd_cb.c:730
            lnet/klnds/o2iblnd/o2iblnd_cb.c:773
            lustre/obdclass/capa.c:266
            lustre/obdclass/capa.c:308
            lustre/obdclass/capa.c:311
            lustre/obdclass/capa.c:261
            lustre/obdclass/capa.c:364
            

            Those locations all make a direct call to sg_set_page which might need to change to sg_init_one.

            prakash Prakash Surya (Inactive) added a comment - Some other places which may need similar fixes: lnet/klnds/o2iblnd/o2iblnd_cb.c:730 lnet/klnds/o2iblnd/o2iblnd_cb.c:773 lustre/obdclass/capa.c:266 lustre/obdclass/capa.c:308 lustre/obdclass/capa.c:311 lustre/obdclass/capa.c:261 lustre/obdclass/capa.c:364 Those locations all make a direct call to sg_set_page which might need to change to sg_init_one.

            Alexander, sure I'll merge your fix into mine. I'll update this ticket when I push the new version and add you to the review list. I should be able to do that later today.

            prakash Prakash Surya (Inactive) added a comment - Alexander, sure I'll merge your fix into mine. I'll update this ticket when I push the new version and add you to the review list. I should be able to do that later today.

            Prakash Surya, how about merging? Or you did not do it?

            aboyko Alexander Boyko added a comment - Prakash Surya, how about merging? Or you did not do it?

            Sure, I'm open to merging the two or landing one and reworking the other. It's probably a good idea to scan the rest of the code for areas which need similar fixes as well.

            prakash Prakash Surya (Inactive) added a comment - Sure, I'm open to merging the two or landing one and reworking the other. It's probably a good idea to scan the rest of the code for areas which need similar fixes as well.

            OK, looking more closely, this has the same cause as ORI-722, but affects different code.

            Alexander and Prakash, could you please get together to make a single common patch that fixes these issues, or at least have a single autoconf check. I prefer the approach in http://review.whamcloud.com/3423 that has a wrapper that can be used in all the code, instead of #ifdefs inline in the code as in http://review.whamcloud.com/3552.

            adilger Andreas Dilger added a comment - OK, looking more closely, this has the same cause as ORI-722 , but affects different code. Alexander and Prakash, could you please get together to make a single common patch that fixes these issues, or at least have a single autoconf check. I prefer the approach in http://review.whamcloud.com/3423 that has a wrapper that can be used in all the code, instead of #ifdefs inline in the code as in http://review.whamcloud.com/3552 .

            Sure, check for sg_init_table looks the same. ORI-722 and http://review.whamcloud.com/3423 relate to the lnet fix. And this issue and http://review.whamcloud.com/3552 - relate to libcfs crypto hash fix. Both has the similar change at autoconf.

            aboyko Alexander Boyko added a comment - Sure, check for sg_init_table looks the same. ORI-722 and http://review.whamcloud.com/3423 relate to the lnet fix. And this issue and http://review.whamcloud.com/3552 - relate to libcfs crypto hash fix. Both has the similar change at autoconf.

            This is the same as ORI-722, recently found on the Orion development branch. Please see http://review.whamcloud.com/3423 for the current patch.

            adilger Andreas Dilger added a comment - This is the same as ORI-722 , recently found on the Orion development branch. Please see http://review.whamcloud.com/3423 for the current patch.
            aboyko Alexander Boyko added a comment - http://review.whamcloud.com/3552

            People

              emoly.liu Emoly Liu
              aboyko Alexander Boyko
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: