Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.1.6
-
3
-
11946
Description
Hi,
At the IFERC customer site, 7 compute nodes crashed with the following message on the console:
2013-11-21 00:57:45 LustreError: 92325:0:(llite_lib.c:1683:ll_update_inode()) ASSERTION( lu_fid_eq(&lli->lli_fid, &body->fid1) ) failed: Trying to change FID [0x217294ce4:0x107f0:0x0] to the [0x217294ce4:0x107f1:0x0], inode 150634522759727089/35072332(ffff8807dcbf85f8)
2013-11-21 00:57:45 LustreError: 92325:0:(llite_lib.c:1683:ll_update_inode()) LBUG
2013-11-21 00:57:45 Pid: 92325, comm: writer_v131
2013-11-21 00:57:45
2013-11-21 00:57:45 Call Trace:
2013-11-21 00:57:45 [<ffffffffa046f7f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2013-11-21 00:57:45 [<ffffffffa046fe07>] lbug_with_loc+0x47/0xb0 [libcfs]
2013-11-21 00:57:45 [<ffffffffa0a91ca0>] ll_update_inode+0x4a0/0xf60 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a928ea>] ll_prep_inode+0x18a/0xae0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a7c8c3>] ll_intent_file_open+0x563/0xb80 [lustre]
2013-11-21 00:57:45 [<ffffffffa0aa6a90>] ? ll_md_blocking_ast+0x0/0x700 [lustre]
2013-11-21 00:57:45 [<ffffffff8108163e>] ? down+0x2e/0x50
2013-11-21 00:57:45 [<ffffffffa0a7cf67>] ll_lov_setstripe_ea_info+0x87/0x2b0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a831a5>] ll_lov_setstripe+0x85/0x5a0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0aa3e8b>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a84ac6>] ll_file_ioctl+0x826/0xe00 [lustre]
2013-11-21 00:57:45 [<ffffffff81179ff2>] vfs_ioctl+0x22/0xa0
2013-11-21 00:57:45 [<ffffffff8117a4ba>] do_vfs_ioctl+0x3aa/0x580
2013-11-21 00:57:45 [<ffffffff8117a711>] sys_ioctl+0x81/0xa0
2013-11-21 00:57:45 [<ffffffff8149970e>] ? do_device_not_available+0xe/0x10
2013-11-21 00:57:45 [<ffffffff810030f2>] system_call_fastpath+0x16/0x1b
2013-11-21 00:57:45
2013-11-21 00:57:45 Kernel panic - not syncing: LBUG
2013-11-21 00:57:45 Pid: 92325, comm: writer_v131 Tainted: G W --------------- 2.6.32-279.5.2.bl6.Bull.36.x86_64 #1
2013-11-21 00:57:45 Call Trace:
2013-11-21 00:57:45 [<ffffffff81495fe3>] ? panic+0xa0/0x168
2013-11-21 00:57:45 [<ffffffffa046fe5b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
2013-11-21 00:57:45 [<ffffffffa0a91ca0>] ? ll_update_inode+0x4a0/0xf60 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a928ea>] ? ll_prep_inode+0x18a/0xae0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a7c8c3>] ? ll_intent_file_open+0x563/0xb80 [lustre]
2013-11-21 00:57:45 [<ffffffffa0aa6a90>] ? ll_md_blocking_ast+0x0/0x700 [lustre]
2013-11-21 00:57:45 [<ffffffff8108163e>] ? down+0x2e/0x50
2013-11-21 00:57:45 [<ffffffffa0a7cf67>] ? ll_lov_setstripe_ea_info+0x87/0x2b0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a831a5>] ? ll_lov_setstripe+0x85/0x5a0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0aa3e8b>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
2013-11-21 00:57:45 [<ffffffffa0a84ac6>] ? ll_file_ioctl+0x826/0xe00 [lustre]
2013-11-21 00:57:45 [<ffffffff81179ff2>] ? vfs_ioctl+0x22/0xa0
2013-11-21 00:57:45 [<ffffffff8117a4ba>] ? do_vfs_ioctl+0x3aa/0x580
2013-11-21 00:57:45 [<ffffffff8117a711>] ? sys_ioctl+0x81/0xa0
2013-11-21 00:57:45 [<ffffffff8149970e>] ? do_device_not_available+0xe/0x10
2013-11-21 00:57:45 [<ffffffff810030f2>] ? system_call_fastpath+0x16/0x1b
This issue looks like LU-2523 and LU-3311, but the patch for b2_1 has not made any progress since July.
I have tested with the following reproducer, given in LU-2523:
llmount.sh
cd /mnt/lustre
touch file1

In a single process do:

struct lov_user_md_v3 *lum;
/* Initialize lum */
fd2 = open("file2", O_RDWR|O_CREAT|O_LOV_DELAY_CREATE, 0666);
rename("file1", "file2");
ioctl(fd2, LL_IOC_LOV_SETSTRIPE, lum);
With stock 2.1.6 I can easily reproduce the issue. Unfortunately, with the patch at http://review.whamcloud.com/6775 applied, I am still able to hit the bug.
Thanks,
Sebastien.