Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-10804

obdfilter-survey test_3a: lprocfs_alloc_obd_stats()) ASSERTION( obd->obd_proc_entry != ((void *)0) ) failed; LBUG

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.11.0
    • Lustre 2.11.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for elena <c17455@cray.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/5d126b54-2233-11e8-9ec4-52540065bddc

      [ 3072.347530] LustreError: 30713:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) ASSERTION( obd->obd_proc_entry != ((void *)0) ) failed: 
      [ 3072.351782] LustreError: 30713:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) LBUG
      [ 3072.354022] Pid: 30713, comm: lctl
      [ 3072.355821] 
      [ 3072.355821] Call Trace:
      [ 3072.359137]  [<ffffffffc06917ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      [ 3072.361177]  [<ffffffffc069183c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      [ 3072.363336]  [<ffffffffc0cbe06f>] lprocfs_alloc_obd_stats+0x18f/0x200 [obdclass]
      [ 3072.365454]  [<ffffffffc0c81730>] ? lprocfs_obd_setup+0x130/0x320 [obdclass]
      [ 3072.367602]  [<ffffffffc12b13ad>] echo_srv_device_alloc+0x48d/0x540 [obdecho]
      [ 3072.369738]  [<ffffffffc0c91cb4>] obd_setup+0x114/0x2a0 [obdclass]
      [ 3072.371892]  [<ffffffffc0c920e8>] class_setup+0x2a8/0x840 [obdclass]
      [ 3072.373960]  [<ffffffffc0c9651f>] class_process_config+0x1b3f/0x27e0 [obdclass]
      [ 3072.376145]  [<ffffffffc0c86ff9>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
      [ 3072.378240]  [<ffffffffc0c7053a>] class_handle_ioctl+0x56a/0x1df0 [obdclass]
      [ 3072.380342]  [<ffffffff8121f7b8>] ? destroy_inode+0x38/0x60
      [ 3072.382250]  [<ffffffff812b3ea8>] ? security_capable+0x18/0x20
      [ 3072.384178]  [<ffffffffc0c567f2>] obd_class_ioctl+0xd2/0x170 [obdclass]
      [ 3072.386089]  [<ffffffff8121730d>] do_vfs_ioctl+0x33d/0x540
      [ 3072.387929]  [<ffffffff81224a04>] ? mntput+0x24/0x40
      [ 3072.389676]  [<ffffffff81205013>] ? __fput+0x183/0x260
      [ 3072.391419]  [<ffffffff812175b1>] SyS_ioctl+0xa1/0xc0
      [ 3072.393128]  [<ffffffff816b8930>] ? system_call_after_swapgs+0x15d/0x214
      [ 3072.394862]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
      [ 3072.396554]  [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
      [ 3072.398245] 
      [ 3072.399550] Kernel panic - not syncing: LBUG
      

      Attachments

        Issue Links

          Activity

            [LU-10804] obdfilter-survey test_3a: lprocfs_alloc_obd_stats()) ASSERTION( obd->obd_proc_entry != ((void *)0) ) failed; LBUG
            pjones Peter Jones added a comment -

            Landed for 2.11

            pjones Peter Jones added a comment - Landed for 2.11

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31664/
            Subject: LU-10804 echo: allow echo server to setup procfs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: af94e4b7afd3fa724d161a5127dc9a89d898e73c

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31664/ Subject: LU-10804 echo: allow echo server to setup procfs Project: fs/lustre-release Branch: master Current Patch Set: Commit: af94e4b7afd3fa724d161a5127dc9a89d898e73c

            Mike Pershin (mike.pershin@intel.com) uploaded a new patch: https://review.whamcloud.com/31664
            Subject: LU-10804 echo: allow echo server to setup procfs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8685e8db0b55651fab21170ed75bc9df454c0e58

            gerrit Gerrit Updater added a comment - Mike Pershin (mike.pershin@intel.com) uploaded a new patch: https://review.whamcloud.com/31664 Subject: LU-10804 echo: allow echo server to setup procfs Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 8685e8db0b55651fab21170ed75bc9df454c0e58

            This is LU-8066, commit 0100ab268c and this particular change:

            diff --git a/lustre/obdecho/echo_client.c b/lustre/obdecho/echo_client.c
            index 081a339..6e25a02 100644
            --- a/lustre/obdecho/echo_client.c
            +++ b/lustre/obdecho/echo_client.c
            @@ -3071,7 +3071,7 @@ static int __init obdecho_init(void)
                     if (rc != 0)
                             goto failed_0;
             
            -       rc = class_register_type(&echo_obd_ops, NULL, true, NULL,
            +       rc = class_register_type(&echo_obd_ops, NULL, false, NULL,
                                             LUSTRE_ECHO_NAME, NULL);
                    if (rc != 0)
                            goto failed_1;
            

            that change causes empty lprocfs root for echo server device and later it asserts on it in lprocfs_alloc_obd_stats() as I can see.

            tappro Mikhail Pershin added a comment - This is LU-8066 , commit 0100ab268c and this particular change: diff --git a/lustre/obdecho/echo_client.c b/lustre/obdecho/echo_client.c index 081a339..6e25a02 100644 --- a/lustre/obdecho/echo_client.c +++ b/lustre/obdecho/echo_client.c @@ -3071,7 +3071,7 @@ static int __init obdecho_init(void) if (rc != 0) goto failed_0; - rc = class_register_type(&echo_obd_ops, NULL, true , NULL, + rc = class_register_type(&echo_obd_ops, NULL, false , NULL, LUSTRE_ECHO_NAME, NULL); if (rc != 0) goto failed_1; that change causes empty lprocfs root for echo server device and later it asserts on it in lprocfs_alloc_obd_stats() as I can see.
            sarah Sarah Liu added a comment -

            Hit this in interop testing

            https://testing.hpdd.intel.com/test_sets/baa94546-2709-11e8-b3c6-52540065bddc

             

            client: 2.10.3

            server: lustre-master tag-2.10.59

            OSS console

            [75396.029755] Lustre: DEBUG MARKER: == obdfilter-survey test 3a: Network survey ========================================================== 18:48:29 (1520966909)
            [75401.745879] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.4.73@tcp) was lost; in progress operations using this service will wait for recovery to complete
            [75401.749951] Lustre: Skipped 9 previous similar messages
            [75410.250117] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts
            [75410.567342] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost1
            [75410.784095] Lustre: server umount lustre-OST0000 complete
            [75410.950964] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75411.276952] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost2' ' /proc/mounts
            [75411.590521] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost2
            [75411.965347] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75412.289128] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost3' ' /proc/mounts
            [75412.601354] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost3
            [75412.988541] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75413.310862] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost4' ' /proc/mounts
            [75413.620037] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost4
            [75413.745052] Lustre: 26996:0:(client.c:2100:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1520966920/real 1520966920] req@ffff880037402400 x1594796080200464/t0(0) o400->MGC10.9.4.73@tcp@10.9.4.73@tcp:26/25 lens 224/224 e 0 to 1 dl 1520966927 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
            [75413.751282] Lustre: 26996:0:(client.c:2100:ptlrpc_expire_one_request()) Skipped 1 previous similar message
            [75413.753466] LustreError: 166-1: MGC10.9.4.73@tcp: Connection to MGS (at 10.9.4.73@tcp) was lost; in progress operations using this service will fail
            [75413.977797] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75414.299315] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost5' ' /proc/mounts
            [75414.606635] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost5
            [75414.967344] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75415.289338] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost6' ' /proc/mounts
            [75415.598284] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost6
            [75415.793961] Lustre: server umount lustre-OST0005 complete
            [75415.795939] Lustre: Skipped 4 previous similar messages
            [75415.954904] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75416.274434] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost7' ' /proc/mounts
            [75416.583237] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost7
            [75416.965393] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
            [75419.722863] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
            [75421.172060] LustreError: 14767:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) ASSERTION( obd->obd_proc_entry != ((void *)0) ) failed: 
            [75421.176024] LustreError: 14767:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) LBUG
            [75421.178166] Pid: 14767, comm: lctl
            [75421.179941] 
            [75421.179941] Call Trace:
            [75421.207941] [<ffffffffc06ae7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
            [75421.210086] [<ffffffffc06ae83c>] lbug_with_loc+0x4c/0xb0 [libcfs]
            [75421.212244] [<ffffffffc08c867f>] lprocfs_alloc_obd_stats+0x18f/0x200 [obdclass]
            [75421.214387] [<ffffffffc088b920>] ? lprocfs_obd_setup+0x130/0x320 [obdclass]
            [75421.237295] [<ffffffffc10a8ced>] echo_setup+0x1bd/0x2c0 [obdecho]
            [75421.273090] [<ffffffffc0ab6810>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
            [75421.275072] [<ffffffffc089bfda>] obd_setup+0x18a/0x2b0 [obdclass]
            [75421.276950] [<ffffffffc06c0a66>] ? cfs_hash_create+0x366/0xa20 [libcfs]
            [75421.278899] [<ffffffffc089c3a8>] class_setup+0x2a8/0x840 [obdclass]
            [75421.280725] [<ffffffffc08a081c>] class_process_config+0x1b5c/0x2810 [obdclass]
            [75421.282654] [<ffffffffc0891219>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
            [75421.284527] [<ffffffffc087a71a>] class_handle_ioctl+0x56a/0x1de0 [obdclass]
            [75421.286438] [<ffffffff8122239b>] ? destroy_inode+0x3b/0x60
            [75421.288202] [<ffffffff812b72be>] ? security_capable+0x1e/0x20
            [75421.290020] [<ffffffffc08607f2>] obd_class_ioctl+0xd2/0x170 [obdclass]
            [75421.291824] [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560
            [75421.293554] [<ffffffff812275e4>] ? mntput+0x24/0x40
            [75421.295289] [<ffffffff812079b6>] ? __fput+0x186/0x260
            [75421.296973] [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0
            [75421.298680] [<ffffffff816c0655>] ? system_call_after_swapgs+0xa2/0x146
            [75421.300454] [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21
            [75421.302229] [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146
            [75421.304033] 
            [75421.305435] Kernel panic - not syncing: LBUG
            [75421.306428] CPU: 1 PID: 14767 Comm: lctl Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1
            [75421.306428] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            [75421.306428] Call Trace:
            [75421.306428] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
            [75421.306428] [<ffffffff816a8634>] panic+0xe8/0x21f
            [75421.306428] [<ffffffffc06ae854>] lbug_with_loc+0x64/0xb0 [libcfs]
            [75421.306428] [<ffffffffc08c867f>] lprocfs_alloc_obd_stats+0x18f/0x200 [obdclass]
            [75421.306428] [<ffffffffc088b920>] ? lprocfs_obd_setup+0x130/0x320 [obdclass]
            [75421.306428] [<ffffffffc10a8ced>] echo_setup+0x1bd/0x2c0 [obdecho]
            [75421.306428] [<ffffffffc0ab6810>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc]
            [75421.306428] [<ffffffffc089bfda>] obd_setup+0x18a/0x2b0 [obdclass]
            [75421.306428] [<ffffffffc06c0a66>] ? cfs_hash_create+0x366/0xa20 [libcfs]
            [75421.306428] [<ffffffffc089c3a8>] class_setup+0x2a8/0x840 [obdclass]
            [75421.306428] [<ffffffffc08a081c>] class_process_config+0x1b5c/0x2810 [obdclass]
            [75421.306428] [<ffffffffc0891219>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
            [75421.306428] [<ffffffffc087a71a>] class_handle_ioctl+0x56a/0x1de0 [obdclass]
            [75421.306428] [<ffffffff8122239b>] ? destroy_inode+0x3b/0x60
            [75421.306428] [<ffffffff812b72be>] ? security_capable+0x1e/0x20
            [75421.306428] [<ffffffffc08607f2>] obd_class_ioctl+0xd2/0x170 [obdclass]
            [75421.306428] [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560
            [75421.306428] [<ffffffff812275e4>] ? mntput+0x24/0x40
            [75421.306428] [<ffffffff812079b6>] ? __fput+0x186/0x260
            [75421.306428] [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0
            [75421.306428] [<ffffffff816c0655>] ? system_call_after_swapgs+0xa2/0x146
            [75421.306428] [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21
            [75421.306428] [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146
            [ 0.000000] Initializing cgroup subsys cpuset
            
            
            sarah Sarah Liu added a comment - Hit this in interop testing https://testing.hpdd.intel.com/test_sets/baa94546-2709-11e8-b3c6-52540065bddc   client: 2.10.3 server: lustre-master tag-2.10.59 OSS console [75396.029755] Lustre: DEBUG MARKER: == obdfilter-survey test 3a: Network survey ========================================================== 18:48:29 (1520966909) [75401.745879] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.9.4.73@tcp) was lost; in progress operations using this service will wait for recovery to complete [75401.749951] Lustre: Skipped 9 previous similar messages [75410.250117] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts [75410.567342] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost1 [75410.784095] Lustre: server umount lustre-OST0000 complete [75410.950964] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75411.276952] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost2' ' /proc/mounts [75411.590521] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost2 [75411.965347] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75412.289128] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost3' ' /proc/mounts [75412.601354] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost3 [75412.988541] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75413.310862] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost4' ' /proc/mounts [75413.620037] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost4 [75413.745052] Lustre: 26996:0:(client.c:2100:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1520966920/real 1520966920] req@ffff880037402400 x1594796080200464/t0(0) o400->MGC10.9.4.73@tcp@10.9.4.73@tcp:26/25 lens 224/224 e 0 to 1 dl 1520966927 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [75413.751282] Lustre: 26996:0:(client.c:2100:ptlrpc_expire_one_request()) Skipped 1 previous similar message [75413.753466] LustreError: 166-1: MGC10.9.4.73@tcp: Connection to MGS (at 10.9.4.73@tcp) was lost; in progress operations using this service will fail [75413.977797] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75414.299315] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost5' ' /proc/mounts [75414.606635] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost5 [75414.967344] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75415.289338] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost6' ' /proc/mounts [75415.598284] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost6 [75415.793961] Lustre: server umount lustre-OST0005 complete [75415.795939] Lustre: Skipped 4 previous similar messages [75415.954904] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75416.274434] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost7' ' /proc/mounts [75416.583237] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost7 [75416.965393] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' [75419.722863] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @ [75421.172060] LustreError: 14767:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) ASSERTION( obd->obd_proc_entry != ((void *)0) ) failed: [75421.176024] LustreError: 14767:0:(lprocfs_status_server.c:647:lprocfs_alloc_obd_stats()) LBUG [75421.178166] Pid: 14767, comm: lctl [75421.179941] [75421.179941] Call Trace: [75421.207941] [<ffffffffc06ae7ae>] libcfs_call_trace+0x4e/0x60 [libcfs] [75421.210086] [<ffffffffc06ae83c>] lbug_with_loc+0x4c/0xb0 [libcfs] [75421.212244] [<ffffffffc08c867f>] lprocfs_alloc_obd_stats+0x18f/0x200 [obdclass] [75421.214387] [<ffffffffc088b920>] ? lprocfs_obd_setup+0x130/0x320 [obdclass] [75421.237295] [<ffffffffc10a8ced>] echo_setup+0x1bd/0x2c0 [obdecho] [75421.273090] [<ffffffffc0ab6810>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [75421.275072] [<ffffffffc089bfda>] obd_setup+0x18a/0x2b0 [obdclass] [75421.276950] [<ffffffffc06c0a66>] ? cfs_hash_create+0x366/0xa20 [libcfs] [75421.278899] [<ffffffffc089c3a8>] class_setup+0x2a8/0x840 [obdclass] [75421.280725] [<ffffffffc08a081c>] class_process_config+0x1b5c/0x2810 [obdclass] [75421.282654] [<ffffffffc0891219>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [75421.284527] [<ffffffffc087a71a>] class_handle_ioctl+0x56a/0x1de0 [obdclass] [75421.286438] [<ffffffff8122239b>] ? destroy_inode+0x3b/0x60 [75421.288202] [<ffffffff812b72be>] ? security_capable+0x1e/0x20 [75421.290020] [<ffffffffc08607f2>] obd_class_ioctl+0xd2/0x170 [obdclass] [75421.291824] [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560 [75421.293554] [<ffffffff812275e4>] ? mntput+0x24/0x40 [75421.295289] [<ffffffff812079b6>] ? __fput+0x186/0x260 [75421.296973] [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0 [75421.298680] [<ffffffff816c0655>] ? system_call_after_swapgs+0xa2/0x146 [75421.300454] [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21 [75421.302229] [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146 [75421.304033] [75421.305435] Kernel panic - not syncing: LBUG [75421.306428] CPU: 1 PID: 14767 Comm: lctl Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1 [75421.306428] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [75421.306428] Call Trace: [75421.306428] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b [75421.306428] [<ffffffff816a8634>] panic+0xe8/0x21f [75421.306428] [<ffffffffc06ae854>] lbug_with_loc+0x64/0xb0 [libcfs] [75421.306428] [<ffffffffc08c867f>] lprocfs_alloc_obd_stats+0x18f/0x200 [obdclass] [75421.306428] [<ffffffffc088b920>] ? lprocfs_obd_setup+0x130/0x320 [obdclass] [75421.306428] [<ffffffffc10a8ced>] echo_setup+0x1bd/0x2c0 [obdecho] [75421.306428] [<ffffffffc0ab6810>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc] [75421.306428] [<ffffffffc089bfda>] obd_setup+0x18a/0x2b0 [obdclass] [75421.306428] [<ffffffffc06c0a66>] ? cfs_hash_create+0x366/0xa20 [libcfs] [75421.306428] [<ffffffffc089c3a8>] class_setup+0x2a8/0x840 [obdclass] [75421.306428] [<ffffffffc08a081c>] class_process_config+0x1b5c/0x2810 [obdclass] [75421.306428] [<ffffffffc0891219>] ? lprocfs_counter_add+0xf9/0x160 [obdclass] [75421.306428] [<ffffffffc087a71a>] class_handle_ioctl+0x56a/0x1de0 [obdclass] [75421.306428] [<ffffffff8122239b>] ? destroy_inode+0x3b/0x60 [75421.306428] [<ffffffff812b72be>] ? security_capable+0x1e/0x20 [75421.306428] [<ffffffffc08607f2>] obd_class_ioctl+0xd2/0x170 [obdclass] [75421.306428] [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560 [75421.306428] [<ffffffff812275e4>] ? mntput+0x24/0x40 [75421.306428] [<ffffffff812079b6>] ? __fput+0x186/0x260 [75421.306428] [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0 [75421.306428] [<ffffffff816c0655>] ? system_call_after_swapgs+0xa2/0x146 [75421.306428] [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21 [75421.306428] [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146 [ 0.000000] Initializing cgroup subsys cpuset

            Looking at the assertion, this appears like it is related to changes in /proc, possibly related to /sysfs changes? Possibly James already has a patch to fix this part of the code?

            adilger Andreas Dilger added a comment - Looking at the assertion, this appears like it is related to changes in /proc, possibly related to /sysfs changes? Possibly James already has a patch to fix this part of the code?
            pjones Peter Jones added a comment -

            Mike

            Can you please investigate?

            Thanks

            Peter

            pjones Peter Jones added a comment - Mike Can you please investigate? Thanks Peter

            This was not observed in the latest master test results because of test_3a was not functional, see LU-7420 :

            > this test passes at Maloo, seems it does just nothing.

            LU-7420 patch https://review.whamcloud.com/#/c/18443/ fixes the test to be executed. The reported issue hit with this patch.

            egryaznova Elena Gryaznova added a comment - This was not observed in the latest master test results because of test_3a was not functional, see LU-7420 : > this test passes at Maloo, seems it does just nothing. LU-7420 patch https://review.whamcloud.com/#/c/18443/  fixes the test to be executed. The reported issue hit with this patch.
            green Oleg Drokin added a comment -

            hm, so that's all after Mar 6th round of landings, I guess.

            green Oleg Drokin added a comment - hm, so that's all after Mar 6th round of landings, I guess.

            I've reviewed all obdfilter-survey test 3a hangs over all branches starting from January 1, 2018 to today and I've found the following test sessions that have the same assertion as in this ticket:

            2018-03-09 22:59:29 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/67d21086-240d-11e8-b74b-52540065bddc

            2018-03-09 23:53:56 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/6d7aa550-240f-11e8-b3c6-52540065bddc

            2018-03-10 02:13:04 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/8d7b15dc-242e-11e8-b74b-52540065bddc

            2018-03-09 00:54:41 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/7c8f22cc-2348-11e8-9852-52540065bddc

            2018-03-08 18:53:42 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/965a62b4-231d-11e8-9852-52540065bddc

            2018-03-08 21:41:26 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/15d7653a-233e-11e8-8d2f-52540065bddc

            2018-03-08 22:23:08 UTC full-patchless – https://testing.hpdd.intel.com/test_sets/0f8acdfa-2340-11e8-88de-52540065bddc

            All are from full test sessions for our patchless Lustre builds/testing.

            jamesanunez James Nunez (Inactive) added a comment - I've reviewed all obdfilter-survey test 3a hangs over all branches starting from January 1, 2018 to today and I've found the following test sessions that have the same assertion as in this ticket: 2018-03-09 22:59:29 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/67d21086-240d-11e8-b74b-52540065bddc 2018-03-09 23:53:56 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/6d7aa550-240f-11e8-b3c6-52540065bddc 2018-03-10 02:13:04 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/8d7b15dc-242e-11e8-b74b-52540065bddc 2018-03-09 00:54:41 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/7c8f22cc-2348-11e8-9852-52540065bddc 2018-03-08 18:53:42 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/965a62b4-231d-11e8-9852-52540065bddc 2018-03-08 21:41:26 UTC full-patchless - https://testing.hpdd.intel.com/test_sets/15d7653a-233e-11e8-8d2f-52540065bddc 2018-03-08 22:23:08 UTC full-patchless – https://testing.hpdd.intel.com/test_sets/0f8acdfa-2340-11e8-88de-52540065bddc All are from full test sessions for our patchless Lustre builds/testing.

            People

              tappro Mikhail Pershin
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: