Lustre / LU-6271

(osc_cache.c:3150:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(page)))) ) failed:

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version: Lustre 2.8.0
    • Affects Version: Lustre 2.7.0
    • Hyperon
    • 3
    • 17581

    Description

      While performing failover testing:

      LustreError: 167-0: lustre-OST0000-osc-ffff88083b339000: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
      Feb 23 09:22:48 iwc260 kernel: LustreError: 51585:0:(ldlm_resource.c:777:ldlm_resource_complain()) lustre-OST0000-osc-ffff88083b339000: namespace resource [0x2d097bb:0x0:0x0].0 (ffff88106a992e40) refcount nonzero (1) after lock cleanup; forcing cleanup.
      Feb 23 09:22:48 iwc260 kernel: LustreError: 51585:0:(ldlm_resource.c:1374:ldlm_resource_dump()) --- Resource: [0x2d097bb:0x0:0x0].0 (ffff88106a992e40) refcount = 3
      Feb 23 09:22:48 iwc260 kernel: LustreError: 51585:0:(ldlm_resource.c:1377:ldlm_resource_dump()) Granted locks (in reverse order):
      Feb 23 09:22:48 iwc260 kernel: LustreError: 51585:0:(ldlm_resource.c:1380:ldlm_resource_dump()) ### ### ns: lustre-OST0000-osc-ffff88083b339000 lock: ffff88106a60d540/0xa1f402a8812988f4 lrc: 3/0,1 mode: PW/PW res: [0x2d097bb:0x0:0x0].0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->8191) flags: 0x126400020000 nid: local remote: 0x72a9f4c80e66e07b expref: -99 pid: 51570 timeout: 0 lvb_type: 1
      Feb 23 09:22:48 iwc260 kernel: Lustre: lustre-OST0000-osc-ffff88083b339000: Connection restored to lustre-OST0000 (at 192.168.120.13@o2ib)
      Feb 23 09:22:49 iwc260 kernel: LustreError: 90252:0:(osc_cache.c:3150:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(page)))) ) failed:
      Feb 23 09:22:49 iwc260 kernel: LustreError: 90252:0:(osc_cache.c:3150:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(page)))) ) failed:
      Feb 23 09:22:49 iwc260 kernel: LustreError: 90252:0:(osc_cache.c:3150:discard_cb()) LBUG
      Feb 23 09:22:49 iwc260 kernel: LustreError: 90252:0:(osc_cache.c:3150:discard_cb()) LBUG
      Feb 23 09:22:49 iwc260 kernel: Pid: 90252, comm: ldlm_bl_36
      Feb 23 09:22:49 iwc260 kernel:
      Feb 23 09:22:49 iwc260 kernel: Call Trace:
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0435895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0435e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2ba56>] discard_cb+0x156/0x190 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2bdcc>] osc_page_gang_lookup+0x1ac/0x330 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2b900>] ? discard_cb+0x0/0x190 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2c094>] osc_lock_discard_pages+0x144/0x240 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa04461c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2b900>] ? discard_cb+0x0/0x190 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b2298b>] osc_lock_flush+0x8b/0x260 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b22e08>] osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0761a6c>] ldlm_cancel_callback+0x6c/0x170 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa077450a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0779120>] ldlm_cli_cancel+0x60/0x360 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa0b22c3b>] osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa04461c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa077cb60>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa077d0c1>] ldlm_bl_thread_main+0x291/0x3f0 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
      Feb 23 09:22:49 iwc260 kernel: [<ffffffffa077ce30>] ? ldlm_bl_thread_main+0x0/0x3f0 [ptlrpc]
      Feb 23 09:22:49 iwc260 kernel: [<ffffffff8109abf6>] kthread+0x96/0xa0
      Feb 23 09:22:49 iwc260 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
      Feb 23 09:22:49 iwc260 kernel: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      Feb 23 09:22:49 iwc260 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      Attachments

        1. iwc189.lbug.log.txt.gz
          0.2 kB
        2. iwc260.log.gz
          4.81 MB
        3. l-23.LU-6271.txt.gz
          0.2 kB
        4. lustre-log.iwc37.txt
          329 kB
        5. r2i1n4.messages.gz
          96 kB
        6. server_messages.tar.gz
          924 kB

        Issue Links

          Activity

            [LU-6271] (osc_cache.c:3150:discard_cb()) ASSERTION( (!(page->cp_type == CPT_CACHEABLE) || (!PageDirty(cl_page_vmpage(page)))) ) failed:

            Hi Dongyang, please upload your test cases.

            jay Jinshan Xiong (Inactive) added a comment

            OK, I've been running the scripts with patch set 5 of 16456 for some time, and general IO seems fine.
            But when I tried the group lock, the client crashed immediately after eviction:

            <0>LustreError: 1697:0:(osc_cache.c:2907:osc_cache_writeback_range()) ASSERTION( !ext->oe_hp ) failed: 
            <0>LustreError: 1697:0:(osc_cache.c:2907:osc_cache_writeback_range()) LBUG
            <4>Pid: 1697, comm: ldlm_bl_11
            <4>
            <4>Call Trace:
            <4> [<ffffffffa021d875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            <4> [<ffffffffa021de77>] lbug_with_loc+0x47/0xb0 [libcfs]
            <4> [<ffffffffa0922085>] osc_cache_writeback_range+0x1275/0x1280 [osc]
            <4> [<ffffffffa090b545>] osc_lock_flush+0x175/0x260 [osc]
            <4> [<ffffffffa090b8d8>] osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
            <4> [<ffffffffa050ba57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
            <4> [<ffffffff81060530>] ? __dequeue_entity+0x30/0x50
            <4> [<ffffffff8100969d>] ? __switch_to+0x7d/0x340
            <4> [<ffffffffa051e84a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            <4> [<ffffffffa05234bc>] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
            <4> [<ffffffffa090b70b>] osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
            <4> [<ffffffff810672c2>] ? default_wake_function+0x12/0x20
            <4> [<ffffffffa0527400>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            <4> [<ffffffffa0527ee4>] ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa0527a60>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
            <4> [<ffffffff810a101e>] kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
            <4>
            <0>Kernel panic - not syncing: LBUG
            <4>Pid: 1697, comm: ldlm_bl_11 Not tainted 2.6.32-573.3.1.el6.x86_64 #1
            <4>Call Trace:
            <4> [<ffffffff81537c54>] ? panic+0xa7/0x16f
            <4> [<ffffffffa021decb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
            <4> [<ffffffffa0922085>] ? osc_cache_writeback_range+0x1275/0x1280 [osc]
            <4> [<ffffffffa090b545>] ? osc_lock_flush+0x175/0x260 [osc]
            <4> [<ffffffffa090b8d8>] ? osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
            <4> [<ffffffffa050ba57>] ? ldlm_cancel_callback+0x87/0x280 [ptlrpc]
            <4> [<ffffffff81060530>] ? __dequeue_entity+0x30/0x50
            <4> [<ffffffff8100969d>] ? __switch_to+0x7d/0x340
            <4> [<ffffffffa051e84a>] ? ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            <4> [<ffffffffa05234bc>] ? ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
            <4> [<ffffffffa090b70b>] ? osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
            <4> [<ffffffff810672c2>] ? default_wake_function+0x12/0x20
            <4> [<ffffffffa0527400>] ? ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            <4> [<ffffffffa0527ee4>] ? ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa0527a60>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
            <4> [<ffffffff810a101e>] ? kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] ? child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
            
            lidongyang Li Dongyang (Inactive) added a comment

            Please check the new patch and see if it can fix the problem.

            jay Jinshan Xiong (Inactive) added a comment

            I did a very similar thing with iozone, but the processes wrote to different files; let me try your reproducer.

            jay Jinshan Xiong (Inactive) added a comment
            lidongyang Li Dongyang (Inactive) added a comment - - edited

            Hi Jinshan,
            Hardware shouldn't matter as I'm reproducing it on my VMs.
            Here are the scripts I used.
            On the client I'm using IOR; when IOR processes die because of eviction, the script just brings up new ones:

            #!/bin/sh
            # keep 20 IOR writers running, restarting them as they die on eviction
            # (pgrep -c counts matching processes without also matching the grep itself)
            while true
            do
            	num=$(pgrep -c IOR)
            	while [ "$num" -lt 20 ]
            	do
            		/root/IOR/src/C/IOR -b 8g -w -e -E -t 1m -v -k -o /mnt/testfile &
            		num=$(pgrep -c IOR)
            	done
            	sleep 1
            done
            

            Here I got 20 of them running at the same time.

            On the OSS:

            #!/bin/sh
            # evict the client from every OST, every 10 seconds
            # (the uuid must be quoted: an unquoted empty $(cat ...) makes [ -n ... ] always true)
            while true
            do
            	for ost in /proc/fs/lustre/obdfilter/*
            	do
            		uuid=$(cat $ost/exports/<your client nid>/uuid 2>/dev/null)
            		if [ -n "$uuid" ]
            		then
            			echo "$uuid" > $ost/evict_client
            		fi
            	done
            	sleep 10
            done
            

            As you can see, I'm evicting the client over and over, using 10 seconds as the interval; again, you can change this.

            Thanks
            Dongyang


            Hi Dongyang,

            It certainly lasted longer on my VM node.

            Please upload your reproducer and I will resume working on this issue once I get access to real hardware.

            Thanks,

            jay Jinshan Xiong (Inactive) added a comment

            Hi Jinshan,
            I'm using master which has 14989 landed already, plus patch set 3 of 16456:
            The client crashed 5 minutes after I started the reproducer:
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:887:ldlm_resource_complain()) testfs-OST0000-osc-ffff8800375c4000: namespace resource [0x303:0x0:0x0].0 (ffff88003d2abcc0) refcount nonzero (1) after lock cleanup; forcing cleanup.
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:1502:ldlm_resource_dump()) --- Resource: [0x303:0x0:0x0].0 (ffff88003d2abcc0) refcount = 3
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:1505:ldlm_resource_dump()) Granted locks (in reverse order):
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:1508:ldlm_resource_dump()) ### ### ns: testfs-OST0000-osc-ffff8800375c4000 lock: ffff88000b69d380/0x2104da78ff029aa9 lrc: 8/0,1 mode: PW/PW res: [0x303:0x0:0x0].0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x126400000000 nid: local remote: 0x59a587f3f1d1d697 expref: -99 pid: 16588 timeout: 0 lvb_type: 1
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:1502:ldlm_resource_dump()) --- Resource: [0x303:0x0:0x0].0 (ffff88003d2abcc0) refcount = 2
            Sep 18 13:29:00 client kernel: LustreError: 16824:0:(ldlm_resource.c:1505:ldlm_resource_dump()) Granted locks (in reverse order):
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) page@ffff88003d5a7600[3 ffff88003a513b38 1 0 1 ffff88003a5d6b78]
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) vvp-page@ffff88003d5a7650(0:0) vm@ffffea00004439e0 2000000000087d 3:0 ffff88003d5a7600 25088 lru
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) lov-page@ffff88003d5a7690, raid0
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) osc-page@ffff88003d5a76f8 25088: 1< 0x845fed 258 0 + - > 2< 102760448 0 4096 0x0 0x520 | (null) ffff88003c479500 ffff88003a517e20 > 3< 1 0 0 > 4< 0 9 8 18446744073642434560 + | + - + - > 5< + - + - | 0 - | 5635 - ->
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) end page@ffff88003d5a7600
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:3134:discard_cb()) discard dirty page?
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:2447:osc_teardown_async_page()) extent ffff880039a18720@{[25088 -> 25341/25343], [2|0|+|locking|wi|ffff88003a517e20], [1040384|254|+|-|ffff88000b69d580|256|ffff8800372b3520]} trunc at 25088.
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_cache.c:2447:osc_teardown_async_page()) ### extent: ffff880039a18720
            Sep 18 13:29:01 client kernel: ns: testfs-OST0000-osc-ffff8800375c4000 lock: ffff88000b69d580/0x2104da78ff029c0e lrc: 41/0,3 mode: PW/PW res: [0x303:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 102760448->103809023) flags: 0x20000000000 nid: local remote: 0x59a587f3f1d1d6ac expref: -99 pid: 16683 timeout: 0 lvb_type: 1
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) page@ffff88003d5a7600[3 ffff88003a513b38 4 0 1 (null)]
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) vvp-page@ffff88003d5a7650(0:0) vm@ffffea00004439e0 2000000000087d 3:0 ffff88003d5a7600 25088 lru
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) lov-page@ffff88003d5a7690, raid0
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) osc-page@ffff88003d5a76f8 25088: 1< 0x845fed 258 0 + - > 2< 102760448 0 4096 0x0 0x520 | (null) ffff88003c479500 ffff88003a517e20 > 3< 0 0 0 > 4< 0 9 8 18446744073642434560 + | + - + - > 5< + - + - | 0 - | 5635 - ->
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) end page@ffff88003d5a7600
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:307:osc_page_delete()) Trying to teardown failed: -16
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:308:osc_page_delete()) ASSERTION( 0 ) failed:

            Message from syslogd@client at Sep 18 13:29:01 ...
            kernel:LustreError: 13392:0:(osc_page.c:308:osc_page_delete()) ASSERTION( 0 ) failed:
            Sep 18 13:29:01 client kernel: LustreError: 13392:0:(osc_page.c:308:osc_page_delete()) LBUG
            Sep 18 13:29:01 client kernel: Pid: 13392, comm: ldlm_bl_00
            Sep 18 13:29:01 client kernel:
            Sep 18 13:29:01 client kernel: Call Trace:
            Sep 18 13:29:01 client kernel: [<ffffffffa10ad875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            Sep 18 13:29:01 client kernel: [<ffffffffa10ade77>] lbug_with_loc+0x47/0xb0 [libcfs]
            Sep 18 13:29:01 client kernel: [<ffffffffa178bf9e>] osc_page_delete+0x46e/0x4e0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa12019cd>] cl_page_delete0+0x7d/0x210 [obdclass]
            Sep 18 13:29:01 client kernel: [<ffffffffa1201b9d>] cl_page_delete+0x3d/0x110 [obdclass]
            Sep 18 13:29:01 client kernel: [<ffffffffa16b6d2d>] ll_invalidatepage+0x8d/0x160 [lustre]
            Sep 18 13:29:01 client kernel: [<ffffffffa16c5d85>] vvp_page_discard+0xc5/0x160 [lustre]
            Sep 18 13:29:01 client kernel: [<ffffffffa11fffc8>] cl_page_invoid+0x68/0x160 [obdclass]
            Sep 18 13:29:01 client kernel: [<ffffffffa12000d3>] cl_page_discard+0x13/0x20 [obdclass]
            Sep 18 13:29:01 client kernel: [<ffffffffa1797158>] discard_cb+0x88/0x1e0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa1796f4e>] osc_page_gang_lookup+0x1ae/0x330 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa17970d0>] ? discard_cb+0x0/0x1e0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa17973f4>] osc_lock_discard_pages+0x144/0x240 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa17970d0>] ? discard_cb+0x0/0x1e0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa178e45b>] osc_lock_flush+0x8b/0x260 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa178e8d8>] osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa138ea57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffff81060530>] ? __dequeue_entity+0x30/0x50
            Sep 18 13:29:01 client kernel: [<ffffffff8100969d>] ? __switch_to+0x7d/0x340
            Sep 18 13:29:01 client kernel: [<ffffffffa13a184a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffffa13a6480>] ldlm_cli_cancel+0x60/0x360 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffffa178e70b>] osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
            Sep 18 13:29:01 client kernel: [<ffffffffa13aa380>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffffa13aae64>] ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            Sep 18 13:29:01 client kernel: [<ffffffffa13aa9e0>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
            Sep 18 13:29:01 client kernel: [<ffffffff810a101e>] kthread+0x9e/0xc0
            Sep 18 13:29:01 client kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20

            lidongyang Li Dongyang (Inactive) added a comment

            Hi Dongyang,

            Please try patch 16456 and see if it can fix the problem.

            Jinshan

            jay Jinshan Xiong (Inactive) added a comment

            Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/16456
            Subject: LU-6271 osc: further OSC cleanup after eviction
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9dd22234a098bb2bea26c4694d91edc928e027ac

            gerrit Gerrit Updater added a comment

            I reckon it is too soon to land http://review.whamcloud.com/14989/
            With the patch the client still crashes when I run the reproducer, for example:
            <3>LustreError: 15348:0:(ldlm_resource.c:835:ldlm_resource_complain()) testfs-OST0000-osc-ffff880037fea000: namespace resource [0x302:0x0:0x0].0 (ffff88003deae500) refcount nonzero (1) after lock cleanup; forcing cleanup.
            <3>LustreError: 15348:0:(ldlm_resource.c:1450:ldlm_resource_dump()) --- Resource: [0x302:0x0:0x0].0 (ffff88003deae500) refcount = 2
            <3>LustreError: 15348:0:(ldlm_resource.c:1453:ldlm_resource_dump()) Granted locks (in reverse order):
            <3>LustreError: 15348:0:(ldlm_resource.c:1456:ldlm_resource_dump()) ### ### ns: testfs-OST0000-osc-ffff880037fea000 lock: ffff880037e29940/0xaa5a0b4e389da92d lrc: 4/0,1 mode: PW/PW res: [0x302:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x126400000000 nid: local remote: 0x59a587f3f1d1130e expref: -99 pid: 15181 timeout: 0 lvb_type: 1
            <3>LustreError: 15348:0:(ldlm_resource.c:1450:ldlm_resource_dump()) — Resource: [0x302:0x0:0x0].0 (ffff88003deae500) refcount = 2
            <3>LustreError: 15348:0:(ldlm_resource.c:1453:ldlm_resource_dump()) Granted locks (in reverse order):
            <3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) page@ffff880038c3fa00[3 ffff88003ac29b38 1 0 1 ffff88003a78ebb8 (null)]
            <3>
            <3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) vvp-page@ffff880038c3fa68(0:0) vm@ffffea000071dfb8 2000000000087f 2:0 ffff880038c3fa00 21504 lru
            <3>
            <3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) lov-page@ffff880038c3faa8, raid0
            <3>
<3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) osc-page@ffff880038c3fb10 21504: 1< 0x845fed 258 0 + - > 2< 88080384 0 4096 0x0 0x520 | (null) ffff88003ac58500 ffff88003cfdbe60 > 3< 1 0 0 > 4< 0 10 8 18446744073619427328 + | + - + - > 5< + - + - | 0 - | 6663 - ->
            <3>
            <3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) end page@ffff880038c3fa00
            <3>
            <3>LustreError: 12670:0:(osc_cache.c:3141:discard_cb()) discard dirty page?
<3>LustreError: 12670:0:(osc_cache.c:2454:osc_teardown_async_page()) extent ffff88003da0eb10@{[21504 -> 21759/21759], [2|0|-|cache|wi|ffff88003cfdbe60], [1048576|256|+|-|ffff88003c75d680|256|(null)]} trunc at 21504.
            <3>LustreError: 12670:0:(osc_cache.c:2454:osc_teardown_async_page()) ### extent: ffff88003da0eb10
<3> ns: testfs-OST0000-osc-ffff880037fea000 lock: ffff88003c75d680/0xaa5a0b4e389daa3e lrc: 68/0,6 mode: PW/PW res: [0x302:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 70254592->71303167) flags: 0x20000000000 nid: local remote: 0x59a587f3f1d11323 expref: -99 pid: 15246 timeout: 0 lvb_type: 1
            <3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) page@ffff880038c3fa00[3 ffff88003ac29b38 4 0 1 (null) (null)]
            <3>
            <3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) vvp-page@ffff880038c3fa68(0:0) vm@ffffea000071dfb8 2000000000087f 2:0 ffff880038c3fa00 21504 lru
            <3>
            <3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) lov-page@ffff880038c3faa8, raid0
            <3>
<3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) osc-page@ffff880038c3fb10 21504: 1< 0x845fed 258 0 + - > 2< 88080384 0 4096 0x0 0x520 | (null) ffff88003ac58500 ffff88003cfdbe60 > 3< 0 0 0 > 4< 0 9 8 18446744073619361792 + | + - + - > 5< + - + - | 0 - | 6679 - ->
            <3>
            <3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) end page@ffff880038c3fa00
            <3>
            <3>LustreError: 12670:0:(osc_page.c:307:osc_page_delete()) Trying to teardown failed: -16
            <0>LustreError: 12670:0:(osc_page.c:308:osc_page_delete()) ASSERTION( 0 ) failed:
            <0>LustreError: 12670:0:(osc_page.c:308:osc_page_delete()) LBUG
            <4>Pid: 12670, comm: ldlm_bl_00
            <4>
            <4>Call Trace:
            <4> [<ffffffffa021d875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            <4> [<ffffffffa021de77>] lbug_with_loc+0x47/0xb0 [libcfs]
            <4> [<ffffffffa090dbfe>] osc_page_delete+0x46e/0x4e0 [osc]
            <4> [<ffffffffa037e9dd>] cl_page_delete0+0x7d/0x210 [obdclass]
            <4> [<ffffffffa037ebad>] cl_page_delete+0x3d/0x110 [obdclass]
            <4> [<ffffffffa0837d0d>] ll_invalidatepage+0x8d/0x160 [lustre]
            <4> [<ffffffffa0846da5>] vvp_page_discard+0xc5/0x160 [lustre]
            <4> [<ffffffffa037cfd8>] cl_page_invoid+0x68/0x160 [obdclass]
            <4> [<ffffffffa037d0e3>] cl_page_discard+0x13/0x20 [obdclass]
            <4> [<ffffffffa0919678>] discard_cb+0x88/0x1e0 [osc]
            <4> [<ffffffffa091946e>] osc_page_gang_lookup+0x1ae/0x330 [osc]
            <4> [<ffffffffa09195f0>] ? discard_cb+0x0/0x1e0 [osc]
            <4> [<ffffffffa0919914>] osc_lock_discard_pages+0x144/0x240 [osc]
            <4> [<ffffffffa09195f0>] ? discard_cb+0x0/0x1e0 [osc]
            <4> [<ffffffffa090ff7b>] osc_lock_flush+0x8b/0x260 [osc]
            <4> [<ffffffffa09103f8>] osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
            <4> [<ffffffffa050e9dc>] ldlm_cancel_callback+0x6c/0x170 [ptlrpc]
            <4> [<ffffffffa052190a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            <4> [<ffffffffa0526540>] ldlm_cli_cancel+0x60/0x360 [ptlrpc]
            <4> [<ffffffffa091022b>] osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
            <4> [<ffffffffa052a440>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            <4> [<ffffffffa052af24>] ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa052aaa0>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
            <4> [<ffffffff810a101e>] kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
            <4>
            <0>Kernel panic - not syncing: LBUG
            <4>Pid: 12670, comm: ldlm_bl_00 Not tainted 2.6.32-573.3.1.el6.x86_64 #1
            <4>Call Trace:
            <4> [<ffffffff81537c54>] ? panic+0xa7/0x16f
            <4> [<ffffffffa021decb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
            <4> [<ffffffffa090dbfe>] ? osc_page_delete+0x46e/0x4e0 [osc]
            <4> [<ffffffffa037e9dd>] ? cl_page_delete0+0x7d/0x210 [obdclass]
            <4> [<ffffffffa037ebad>] ? cl_page_delete+0x3d/0x110 [obdclass]
            <4> [<ffffffffa0837d0d>] ? ll_invalidatepage+0x8d/0x160 [lustre]
            <4> [<ffffffffa0846da5>] ? vvp_page_discard+0xc5/0x160 [lustre]
            <4> [<ffffffffa037cfd8>] ? cl_page_invoid+0x68/0x160 [obdclass]
            <4> [<ffffffffa037d0e3>] ? cl_page_discard+0x13/0x20 [obdclass]
            <4> [<ffffffffa0919678>] ? discard_cb+0x88/0x1e0 [osc]
            <4> [<ffffffffa091946e>] ? osc_page_gang_lookup+0x1ae/0x330 [osc]
            <4> [<ffffffffa09195f0>] ? discard_cb+0x0/0x1e0 [osc]
            <4> [<ffffffffa0919914>] ? osc_lock_discard_pages+0x144/0x240 [osc]
            <4> [<ffffffffa09195f0>] ? discard_cb+0x0/0x1e0 [osc]
            <4> [<ffffffffa090ff7b>] ? osc_lock_flush+0x8b/0x260 [osc]
            <4> [<ffffffffa09103f8>] ? osc_ldlm_blocking_ast+0x2a8/0x3c0 [osc]
            <4> [<ffffffffa050e9dc>] ? ldlm_cancel_callback+0x6c/0x170 [ptlrpc]
            <4> [<ffffffffa052190a>] ? ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            <4> [<ffffffffa0526540>] ? ldlm_cli_cancel+0x60/0x360 [ptlrpc]
            <4> [<ffffffffa091022b>] ? osc_ldlm_blocking_ast+0xdb/0x3c0 [osc]
            <4> [<ffffffffa052a440>] ? ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            <4> [<ffffffffa052af24>] ? ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa052aaa0>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
            <4> [<ffffffff810a101e>] ? kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] ? child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20

And another one, though I don't know if it's related:
<3>LustreError: 4278:0:(ldlm_resource.c:835:ldlm_resource_complain()) testfs-OST0001-osc-ffff88003c715c00: namespace resource [0x302:0x0:0x0].0 (ffff880037015800) refcount nonzero (1) after lock cleanup; forcing cleanup.
<3>LustreError: 4278:0:(ldlm_resource.c:1450:ldlm_resource_dump()) --- Resource: [0x302:0x0:0x0].0 (ffff880037015800) refcount = 2
<3>LustreError: 4278:0:(ldlm_resource.c:1453:ldlm_resource_dump()) Granted locks (in reverse order):
<3>LustreError: 4278:0:(ldlm_resource.c:1456:ldlm_resource_dump()) ### ### ns: testfs-OST0001-osc-ffff88003c715c00 lock: ffff880037269380/0x532f2161a177e283 lrc: 19/0,1 mode: PW/PW res: [0x302:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x126400000000 nid: local remote: 0x59a587f3f1d100b5 expref: -99 pid: 4040 timeout: 0 lvb_type: 1
            <3>LustreError: 790:0:(osc_io.c:1010:osc_req_attr_set()) page@ffff880022f02600[2 ffff88003be7db38 2 0 1 (null) ffff88003df849c0]
            <3>
            <3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) page@ffff88003dfc8000[2 ffff88003be7db38 2 0 1 (null) ffff88003b3982c0]
            <3>
            <3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) vvp-page@ffff88003dfc8068(0:0) vm@ffffea0000785428 2000000000282c 2:0 ffff88003dfc8000 19200 lru
            <3>
            <3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) lov-page@ffff88003dfc80a8, raid0
            <3>
<3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) osc-page@ffff88003dfc8110 19200: 1< 0x845fed 258 0 + + > 2< 78643200 0 4096 0x5 0x520 | (null) ffff88003d970540 ffff88003bc81e20 > 3< 1 12 0 > 4< 0 7 8 18446744073678077952 - | - - - - > 5< - - - - | 0 - | 0 - ->
            <3>
            <3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) end page@ffff88003dfc8000
            <3>
            <3>LustreError: 789:0:(osc_io.c:1010:osc_req_attr_set()) uncovered page!
<3>LustreError: 789:0:(ldlm_resource.c:1450:ldlm_resource_dump()) --- Resource: [0x302:0x0:0x0].0 (ffff880037015800) refcount = 3
            <4>Pid: 789, comm: ptlrpcd_00_00
            <4>
            <4>Call Trace:
            <4> [<ffffffffa021d875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            <4> [<ffffffffa09133fa>] osc_req_attr_set+0x55a/0x720 [osc]
            <4> [<ffffffffa0383829>] cl_req_attr_set+0xc9/0x220 [obdclass]
            <4> [<ffffffffa0904082>] osc_build_rpc+0x882/0x12d0 [osc]
            <4> [<ffffffffa091f623>] osc_io_unplug0+0x1133/0x1af0 [osc]
            <4> [<ffffffffa0918428>] ? osc_ap_completion+0x1a8/0x550 [osc]
            <4> [<ffffffffa0917a3e>] ? osc_extent_put+0xbe/0x260 [osc]
            <4> [<ffffffffa0374f75>] ? lu_object_put+0x135/0x3b0 [obdclass]
            <4> [<ffffffffa09224b0>] osc_io_unplug+0x10/0x20 [osc]
            <4> [<ffffffffa0905593>] brw_interpret+0xac3/0x2320 [osc]
            <4> [<ffffffffa0546ee2>] ? ptlrpc_unregister_bulk+0xa2/0xac0 [ptlrpc]
            <4> [<ffffffffa053f4bc>] ? ptlrpc_unregister_reply+0x6c/0x810 [ptlrpc]
            <4> [<ffffffffa053e2a4>] ? ptlrpc_send_new_req+0x154/0x980 [ptlrpc]
            <4> [<ffffffffa0540551>] ptlrpc_check_set+0x331/0x1be0 [ptlrpc]
            <4> [<ffffffffa056e443>] ptlrpcd_check+0x3d3/0x610 [ptlrpc]
            <4> [<ffffffffa056e8fa>] ptlrpcd+0x27a/0x500 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa056e680>] ? ptlrpcd+0x0/0x500 [ptlrpc]
            <4> [<ffffffff810a101e>] kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
            <4>
            <0>LustreError: 789:0:(osc_io.c:1020:osc_req_attr_set()) LBUG
            <4>Pid: 789, comm: ptlrpcd_00_00
            <4>
            <4>Call Trace:
            <4> [<ffffffffa021d875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            <4> [<ffffffffa021de77>] lbug_with_loc+0x47/0xb0 [libcfs]
            <4> [<ffffffffa0913410>] osc_req_attr_set+0x570/0x720 [osc]
            <4> [<ffffffffa0383829>] cl_req_attr_set+0xc9/0x220 [obdclass]
            <4> [<ffffffffa0904082>] osc_build_rpc+0x882/0x12d0 [osc]
            <4> [<ffffffffa091f623>] osc_io_unplug0+0x1133/0x1af0 [osc]
            <4> [<ffffffffa0918428>] ? osc_ap_completion+0x1a8/0x550 [osc]
            <4> [<ffffffffa0917a3e>] ? osc_extent_put+0xbe/0x260 [osc]
            <4> [<ffffffffa0374f75>] ? lu_object_put+0x135/0x3b0 [obdclass]
            <4> [<ffffffffa09224b0>] osc_io_unplug+0x10/0x20 [osc]
            <4> [<ffffffffa0905593>] brw_interpret+0xac3/0x2320 [osc]
            <4> [<ffffffffa0546ee2>] ? ptlrpc_unregister_bulk+0xa2/0xac0 [ptlrpc]
            <4> [<ffffffffa053f4bc>] ? ptlrpc_unregister_reply+0x6c/0x810 [ptlrpc]
            <4> [<ffffffffa053e2a4>] ? ptlrpc_send_new_req+0x154/0x980 [ptlrpc]
            <4> [<ffffffffa0540551>] ptlrpc_check_set+0x331/0x1be0 [ptlrpc]
            <4> [<ffffffffa056e443>] ptlrpcd_check+0x3d3/0x610 [ptlrpc]
            <4> [<ffffffffa056e8fa>] ptlrpcd+0x27a/0x500 [ptlrpc]
            <4> [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
            <4> [<ffffffffa056e680>] ? ptlrpcd+0x0/0x500 [ptlrpc]
            <4> [<ffffffff810a101e>] kthread+0x9e/0xc0
            <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
            <4> [<ffffffff810a0f80>] ? kthread+0x0/0xc0
            <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20

lidongyang Li Dongyang (Inactive) added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14989/
            Subject: LU-6271 osc: handle osc eviction correctly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8f01f8b51d114b0d2d54a5ab7db3161782e52447

gerrit Gerrit Updater added a comment

People

  jay Jinshan Xiong (Inactive)
  cliffw Cliff White (Inactive)
  Votes: 0
  Watchers: 25
