[LU-2210] ll_update_inode()) ASSERTION( (((inode->i_mode) & 00170000) == 0100000) ) Created: 18/Oct/12  Updated: 07/Jan/16  Resolved: 07/Jan/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: Hongchao Zhang
Resolution: Won't Fix Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 5257

 Description   

Just hit this running racer in a loop.
I have vmcore available

[ 1313.222408] LustreError: 9851:0:(vvp_io.c:1038:vvp_io_commit_write()) Write page 10688 of inode ffff880010eedb20 failed -28
[ 1313.243518] LustreError: 9851:0:(vvp_io.c:1038:vvp_io_commit_write()) Skipped 17 previous similar messages
[ 1598.972120] LustreError: 14768:0:(llite_lib.c:1680:ll_update_inode()) ASSERTION( (((inode->i_mode) & 00170000) == 0100000) ) failed: 
[ 1598.973611] LustreError: 14768:0:(llite_lib.c:1680:ll_update_inode()) LBUG
[ 1598.974348] Pid: 14768, comm: mv
[ 1598.975044] 
[ 1598.975045] Call Trace:
[ 1598.983544]  [<ffffffffa0449915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 1598.984478]  [<ffffffffa0449f27>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 1598.985582]  [<ffffffffa0e5e257>] ll_update_inode+0xc37/0xe60 [lustre]
[ 1598.986600]  [<ffffffffa0999f9b>] ? lmv_get_lustre_md+0x19b/0x490 [lmv]
[ 1598.990581]  [<ffffffff81044f4e>] ? kernel_map_pages+0xfe/0x110
[ 1598.991493]  [<ffffffffa0e5e602>] ll_prep_inode+0x182/0xc00 [lustre]
[ 1599.009606]  [<ffffffffa0e74e7d>] ? ll_i2gids+0x3d/0xe0 [lustre]
[ 1599.010451]  [<ffffffffa0e315a0>] ll_revalidate_it_finish+0x60/0x190 [lustre]
[ 1599.011233]  [<ffffffffa0e4b837>] __ll_inode_revalidate_it+0x287/0xd60 [lustre]
[ 1599.012534]  [<ffffffffa0e74f20>] ? ll_md_blocking_ast+0x0/0x780 [lustre]
[ 1599.013339]  [<ffffffff8116172e>] ? cache_free_debugcheck+0x2ae/0x360
[ 1599.014101]  [<ffffffffa0e4c574>] ll_inode_revalidate_it+0x44/0x1c0 [lustre]
[ 1599.014950]  [<ffffffffa0e4c739>] ll_getattr_it+0x49/0x170 [lustre]
[ 1599.015728]  [<ffffffffa0e4c897>] ll_getattr+0x37/0x40 [lustre]
[ 1599.041593]  [<ffffffff81214fa3>] ? security_inode_getattr+0x23/0x30
[ 1599.042332]  [<ffffffff81180891>] vfs_getattr+0x51/0x80
[ 1599.043015]  [<ffffffff814faeee>] ? _spin_unlock_irq+0xe/0x20
[ 1599.043699]  [<ffffffff81180920>] vfs_fstatat+0x60/0x80
[ 1599.060479]  [<ffffffff811809ae>] vfs_lstat+0x1e/0x20
[ 1599.061151]  [<ffffffff811809d4>] sys_newlstat+0x24/0x50
[ 1599.072666]  [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
[ 1599.073371] 
[ 1600.231001] Kernel panic - not syncing: LBUG
[ 1600.239372] Pid: 14768, comm: mv Not tainted 2.6.32-debug #6
[ 1600.240075] Call Trace:
[ 1600.240622]  [<ffffffff814f75e4>] ? panic+0xa0/0x168
[ 1600.241300]  [<ffffffffa0449f7b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[ 1600.247118]  [<ffffffffa0e5e257>] ? ll_update_inode+0xc37/0xe60 [lustre]
[ 1600.247930]  [<ffffffffa0999f9b>] ? lmv_get_lustre_md+0x19b/0x490 [lmv]
[ 1600.248711]  [<ffffffff81044f4e>] ? kernel_map_pages+0xfe/0x110
[ 1600.253484]  [<ffffffffa0e5e602>] ? ll_prep_inode+0x182/0xc00 [lustre]
[ 1600.254325]  [<ffffffffa0e74e7d>] ? ll_i2gids+0x3d/0xe0 [lustre]
[ 1600.255130]  [<ffffffffa0e315a0>] ? ll_revalidate_it_finish+0x60/0x190 [lustre]
[ 1600.256520]  [<ffffffffa0e4b837>] ? __ll_inode_revalidate_it+0x287/0xd60 [lustre]
[ 1600.262045]  [<ffffffffa0e74f20>] ? ll_md_blocking_ast+0x0/0x780 [lustre]
[ 1600.262867]  [<ffffffff8116172e>] ? cache_free_debugcheck+0x2ae/0x360
[ 1600.263831]  [<ffffffffa0e4c574>] ? ll_inode_revalidate_it+0x44/0x1c0 [lustre]
[ 1600.275495]  [<ffffffffa0e4c739>] ? ll_getattr_it+0x49/0x170 [lustre]
[ 1600.276425]  [<ffffffffa0e4c897>] ? ll_getattr+0x37/0x40 [lustre]
[ 1600.299448]  [<ffffffff81214fa3>] ? security_inode_getattr+0x23/0x30
[ 1600.300349]  [<ffffffff81180891>] ? vfs_getattr+0x51/0x80
[ 1600.302211]  [<ffffffff814faeee>] ? _spin_unlock_irq+0xe/0x20
[ 1600.302988]  [<ffffffff81180920>] ? vfs_fstatat+0x60/0x80
[ 1600.303776]  [<ffffffff811809ae>] ? vfs_lstat+0x1e/0x20
[ 1600.322655]  [<ffffffff811809d4>] ? sys_newlstat+0x24/0x50
[ 1600.323338]  [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

/exports/crashdumps/192.168.10.213-2012-10-18-14\:01\:13



 Comments   
Comment by Hongchao Zhang [ 12/Nov/12 ]

status update:

this ticket is still under investigation

Comment by Hongchao Zhang [ 20/Nov/12 ]

Hi Oleg, Can this issue be reproduced in single node? I have run racer in loop for more than 5 hours without this bug.

Comment by Oleg Drokin [ 18/Dec/12 ]

Yes, it can be reproduced on single node, but it takes quite a while. I just hit it again.
I have a crashdump too.
This time it also got some more interesting messages

[487814.936363] LustreError: 18837:0:(pack_generic.c:771:lustre_msg_string()) can't unpack short string in msg ffffc90007e98000 buffer[6] len 3: strlen 1
[487814.937509] LustreError: 18837:0:(layout.c:1741:__req_capsule_get()) @@@ Wrong buffer for field `name' (6 of 8) in format `LDLM_INTENT_OPEN': 3 vs. 0 (client)
[487814.937512]   req@ffff8800a4c91840 x1421627245160989/t0(0) o101->81360474-d078-5a6a-8505-ee5bd44c63e3@0@lo:0/0 lens 576/0 e 0 to 0 dl 1355771191 ref 1 fl Interpret:/0/ffffffff rc 0/-1
[487814.940316] LustreError: 18837:0:(mdt_handler.c:1716:mdt_reint_internal()) Can't unpack reint, rc -14
[487814.953036] LustreError: 11-0: an error occurred while communicating with 0@lo. The ldlm_enqueue operation failed with -14
[487814.960292] LustreError: Skipped 1 previous similar message
[487814.960764] LustreError: 15682:0:(mdc_locks.c:784:mdc_enqueue()) ldlm_cli_enqueue: -14
[487900.363058] LustreError: 31382:0:(llite_lib.c:1685:ll_update_inode()) ASSERTION( (((inode->i_mode) & 00170000) == 0100000) ) failed: 
...

Looks like some sort of a bug in mmap->revalidate.
I can extract debug log from this crashdump too.

Comment by Hongchao Zhang [ 21/Jan/13 ]

the related code lines has been dropped by LU-1876 and this ticket could be marked as invalid.

Comment by John Fuchs-Chesney (Inactive) [ 07/Jan/16 ]

Issue no longer valid.
~ jfc.

Generated at Sat Feb 10 01:23:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.