Details
- Type: Bug
- Resolution: Unresolved
- Priority: Minor
- Affects Version: Lustre 2.9.0
- Environment: 2.8.55-85-g9d84696, solo setup
- Severity: 3
Description
Attachment: stdout.log

# sh llmount.sh
Stopping clients: localhost.localdomain /mnt/lustre (opts:)
Stopping clients: localhost.localdomain /mnt/lustre2 (opts:)
Loading modules from /root/lustre-release/lustre/tests/..
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck
subsystem_debug=all -lnet -lnd -pinger
../lnet/lnet/lnet options: 'networks=tcp(eth3) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Checking servers environments
Checking clients localhost.localdomain environments
Loading modules from /root/lustre-release/lustre/tests/..
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck
subsystem_debug=all -lnet -lnd -pinger
gss/krb5 is not supported
Setup mgs, mdt, osts
Starting mds1: -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Started lustre-MDT0000
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Started lustre-OST0000
Starting ost2: -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Started lustre-OST0001
Starting client: localhost.localdomain: -o user_xattr,flock localhost.localdomain@tcp:/lustre /mnt/lustre
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 8s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
[root@localhost tests]#
[root@localhost tests]# df -h
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda3                           44G   31G   13G  72% /
tmpfs                              1.9G     0  1.9G   0% /dev/shm
/dev/sda1                          2.0G  350M  1.6G  19% /boot
/dev/loop0                         139M   18M  112M  14% /mnt/lustre-mds1
/dev/loop1                         359M   30M  310M   9% /mnt/lustre-ost1
/dev/loop2                         359M   30M  310M   9% /mnt/lustre-ost2
localhost.localdomain@tcp:/lustre  717M   59M  620M   9% /mnt/lustre
[root@localhost tests]#
lctl blockdev session:

[root@localhost lustre-release]# insmod lustre/llite/llite_lloop.ko
[root@localhost lustre-release]# MD=/mnt/; dd if=/dev/zero of=$MD/testfile bs=4096 count=5000; lctl blockdev_attach $MD/testfile /dev/loop3; lctl blockdev_info /dev/loop3
5000+0 records in
5000+0 records out
20480000 bytes (20 MB) copied, 0.244223 s, 83.9 MB/s
attach error(Inappropriate ioctl for device)
error: Invalid argument
[root@localhost lustre-release]#
[root@localhost lustre-release]# MD=/mnt/lustre; dd if=/dev/zero of=$MD/testfile bs=4096 count=5000; lctl blockdev_attach $MD/testfile loop3; lctl blockdev_info loop3
5000+0 records in
5000+0 records out
20480000 bytes (20 MB) copied, 1.13594 s, 18.0 MB/s

Message from syslogd@localhost at Jul 12 16:28:50 ...
 kernel:LustreError: 59548:0:(osc_page.c:308:osc_page_delete()) ASSERTION( 0 ) failed:

Message from syslogd@localhost at Jul 12 16:28:50 ...
 kernel:LustreError: 59548:0:(osc_page.c:308:osc_page_delete()) LBUG
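The session above can be collected into a single reproducer. This is only a sketch assembled from the commands in the log: the module path (lustre/llite/llite_lloop.ko), mount point (/mnt/lustre), and device name (loop3) are taken from my setup and will need adjusting; the guard clauses are my addition so the script skips cleanly on machines without a Lustre mount.

```shell
#!/bin/sh
# Sketch of a consolidated reproducer for the osc_page_delete() LBUG,
# assembled from the session above. Paths and names are assumptions
# from my solo VM setup.

repro() {
    mnt=$1                                 # Lustre client mount point
    dev=$2                                 # lloop device name passed to lctl

    # Guard (my addition, not in the original session): skip when the
    # mount point or lctl is missing, so the sketch is safe to run anywhere.
    if [ ! -d "$mnt" ] || ! command -v lctl >/dev/null 2>&1; then
        echo "SKIP: $mnt or lctl not available"
        return 0
    fi

    # llite_lloop provides the Lustre loop driver behind blockdev_attach.
    insmod lustre/llite/llite_lloop.ko 2>/dev/null || true

    # Write a 20 MB file on the Lustre mount, then attach it as a block
    # device; in the log above the attach immediately hits the LBUG.
    dd if=/dev/zero of="$mnt/testfile" bs=4096 count=5000
    lctl blockdev_attach "$mnt/testfile" "$dev"
    lctl blockdev_info "$dev"
}

repro /mnt/lustre loop3
```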
dmesg:

<4>Lustre: DEBUG MARKER: Using TIMEOUT=20
<4>Lustre: 59548:0:(lloop.c:720:lloop_ioctl()) Enter llop_ioctl
<3>LustreError: 59548:0:(osc_cache.c:2488:osc_teardown_async_page()) extent ffff8800871bdd08@{[4864 -> 4999/5119], [2|0|-|cache|wi|ffff880087763ea8], [581632|136|+|-|ffff8800877ffd00|256|(null)]} trunc at 4864.
<3>LustreError: 59548:0:(osc_cache.c:2488:osc_teardown_async_page()) ### extent: ffff8800871bdd08 ns: lustre-OST0001-osc-ffff880139dae800 lock: ffff8800877ffd00/0x8c9f98a4b6bd6cd1 lrc: 2/0,0 mode: PW/PW res: [0x2:0x0:0x0].0x0 rrc: 1 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20000000000 nid: local remote: 0x8c9f98a4b6bd6cd8 expref: -99 pid: 59546 timeout: 0 lvb_type: 1
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) page@ffff88008328de00[2 ffff88008706eb38 4 1 (null)]
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) vvp-page@ffff88008328de50(0:0) vm@ffffea0001c901a0 2000000000083d 3:0 ffff88008328de00 4864 lru
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) lov-page@ffff88008328de90, raid0
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) osc-page@ffff88008328def8 4864: 1< 0x845fed 258 0 + - > 2< 19922944 0 4096 0x0 0x520 | (null) ffff88008829b4f0 ffff880087763ea8 > 3< 0 0 0 > 4< 0 0 8 21954560 - | - - + - > 5< - - + - | 0 - | 136 - ->
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) end page@ffff88008328de00
<3>LustreError: 59548:0:(osc_page.c:307:osc_page_delete()) Trying to teardown failed: -16
<0>LustreError: 59548:0:(osc_page.c:308:osc_page_delete()) ASSERTION( 0 ) failed:
<0>LustreError: 59548:0:(osc_page.c:308:osc_page_delete()) LBUG
<4>Pid: 59548, comm: lctl
<4>
<4>Call Trace:
<4> [<ffffffffa0cb1875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0cb1e77>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa15cb4ae>] osc_page_delete+0x46e/0x4e0 [osc]
<4> [<ffffffffa113662d>] cl_page_delete0+0x7d/0x210 [obdclass]
<4> [<ffffffffa11367fd>] cl_page_delete+0x3d/0x110 [obdclass]
<4> [<ffffffffa0afc34d>] ll_invalidatepage+0x8d/0x160 [lustre]
<4> [<ffffffff811372e5>] do_invalidatepage+0x25/0x30
<4> [<ffffffff81137602>] truncate_inode_page+0xa2/0xc0
<4> [<ffffffff811379af>] truncate_inode_pages_range+0x16f/0x500
<4> [<ffffffff8128596a>] ? kobject_get+0x1a/0x30
<4> [<ffffffff81137dd5>] truncate_inode_pages+0x15/0x20
<4> [<ffffffffa0167900>] lloop_ioctl+0x5a0/0x780 [llite_lloop]
<4> [<ffffffffa0acb267>] ll_file_ioctl+0x667/0x3eb0 [lustre]
<4> [<ffffffff811865b1>] ? nameidata_to_filp+0x31/0x70
<4> [<ffffffff8119c3c8>] ? do_filp_open+0x798/0xd20
<4> [<ffffffff8119e972>] vfs_ioctl+0x22/0xa0
<4> [<ffffffff8119ee3a>] do_vfs_ioctl+0x3aa/0x580
<4> [<ffffffff81196dd6>] ? final_putname+0x26/0x50
<4> [<ffffffff8119f091>] sys_ioctl+0x81/0xa0
<4> [<ffffffff810e202e>] ? __audit_syscall_exit+0x25e/0x290
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
@andreas
I see this on my solo VM setup. I would like to debug it and provide a fix, if that is all right.
Thanks.