[LU-11627] LustreError: 86028:0:(file.c:454:ll_dom_finish_open()) ASSERTION( lnb.lnb_file_offset % (1UL << 16) == 0 ) failed:
Created: 06/Nov/18  Updated: 25/Feb/19  Resolved: 06/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: James A Simmons
Assignee: Jian Yu
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

ARM/Power8 clients running the latest pre-2.12 Lustre


Issue Links:
Duplicate
duplicates LU-11595 sanity-dom sanityn test 11: LBUG: (fi... Resolved
duplicates LU-12014 check correct size in ll_dom_finish_o... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

While testing on ARM/Power8 clients, I saw the following crash in sanity test 271f:

[Mon Nov  5 23:24:34 2018][13642.788756] LustreError: 86028:0:(file.c:454:ll_dom_finish_open()) ASSERTION( lnb.lnb_file_offset % (1UL << 16) == 0 ) failed:

[Mon Nov  5 23:24:34 2018][13642.788923] LustreError: 86028:0:(file.c:454:ll_dom_finish_open()) LBUG

[Mon Nov  5 23:24:34 2018][13642.788983] Pid: 86028, comm: cat 4.14.0-49.6.1.el7a.ppc64le #1 SMP Wed May 16 21:05:05 UTC 2018

[Mon Nov  5 23:24:34 2018][13642.789065] Call Trace:

[Mon Nov  5 23:24:34 2018][13642.789101]  libcfs_call_trace+0x98/0xf0 [libcfs]

[Mon Nov  5 23:24:34 2018][13642.789154]  lbug_with_loc+0x5c/0xc0 [libcfs]

[Mon Nov  5 23:24:34 2018][13642.789215]  ll_dom_finish_open+0xa48/0xb40 [lustre]

[Mon Nov  5 23:24:34 2018][13642.789273]  ll_lookup_it_finish+0x6d0/0xe40 [lustre]

[Mon Nov  5 23:24:34 2018][13642.789331]  ll_lookup_it+0x4ac/0xf10 [lustre]

[Mon Nov  5 23:24:34 2018][13642.789389]  ll_atomic_open+0x264/0xdb0 [lustre]

[Mon Nov  5 23:24:34 2018][13642.789439]  lookup_open+0x200/0x780

[Mon Nov  5 23:24:34 2018][13642.789475]  path_openat+0x824/0x1070

[Mon Nov  5 23:24:34 2018][13642.789512]  do_filp_open+0x88/0x140

[Mon Nov  5 23:24:34 2018][13642.789549]  SyS_open+0x1bc/0x300

[Mon Nov  5 23:24:34 2018][13642.789586]  system_call+0x58/0x6c

[Mon Nov  5 23:24:34 2018][13642.789622] Kernel panic - not syncing: LBUG

[Mon Nov  5 23:24:34 2018][13642.789671] CPU: 97 PID: 86028 Comm: cat Tainted: G           OE  ------------   4.14.0-49.6.1.el7a.ppc64le #1

[Mon Nov  5 23:24:34 2018][13642.789765] Call Trace:

[Mon Nov  5 23:24:34 2018][13642.789792] [c0000017e6773530] [c000000000c3fdcc] dump_stack+0xb0/0xf4 (unreliable)

[Mon Nov  5 23:24:34 2018][13642.789864] [c0000017e6773570] [c000000000136d64] panic+0x150/0x32c

[Mon Nov  5 23:24:34 2018][13642.789929] [c0000017e6773600] [d000000013090d38] lbug_with_loc+0xb8/0xc0 [libcfs]

[Mon Nov  5 23:24:34 2018][13642.790010] [c0000017e6773670] [d0000000187d0558] ll_dom_finish_open+0xa48/0xb40 [lustre]

[Mon Nov  5 23:24:34 2018][13642.790091] [c0000017e6773780] [d0000000188145f0] ll_lookup_it_finish+0x6d0/0xe40 [lustre]



 Comments   
Comment by Peter Jones [ 06/Nov/18 ]

Jian

Could you please advise?

Thanks

Peter

Comment by Jian Yu [ 06/Nov/18 ]

Sure, Peter.

The failure is related to the large page size on the ARM/Power8 clients:

# getconf PAGE_SIZE
65536
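
For reference, the same value can be read from a program. This is a minimal user-space sketch, not Lustre code; runtime_page_size() is a hypothetical helper name, and it relies only on the POSIX sysconf() call, which reports the value that `getconf PAGE_SIZE` prints.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Lustre): return the runtime page size
 * in bytes, i.e. the value `getconf PAGE_SIZE` prints on the same host.
 */
static long runtime_page_size(void)
{
        long size = sysconf(_SC_PAGESIZE);

        /* sysconf() returns -1 on error; a valid page size is positive. */
        assert(size > 0);
        return size;
}
```

On the clients in this ticket the helper would be expected to return 65536, matching the `1UL << 16` in the assertion message.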

And the code that triggered the assertion is:

ll_dom_finish_open()
        lnb.lnb_file_offset = rnb->rnb_offset;
        start = lnb.lnb_file_offset / PAGE_SIZE;
        index = 0;
        LASSERT(lnb.lnb_file_offset % PAGE_SIZE == 0);
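
The check above can be reproduced outside the kernel. The following is a user-space sketch only: the helper names are hypothetical, and the offset of 4096 used in the note below is an illustrative assumption, since the ticket does not record the actual rnb_offset value that tripped the assertion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not Lustre code): nonzero when offset is a multiple
 * of page_size. Mirrors the condition inside the LASSERT, with the page
 * size passed in so that 4 KiB and 64 KiB pages can be compared.
 */
static int offset_is_page_aligned(uint64_t offset, uint64_t page_size)
{
        /* Page sizes are powers of two, so a mask test equals "% == 0". */
        assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
        return (offset & (page_size - 1)) == 0;
}

/* Page index as computed in the snippet: start = offset / PAGE_SIZE. */
static uint64_t offset_to_page_index(uint64_t offset, uint64_t page_size)
{
        assert(page_size != 0);
        return offset / page_size;
}
```

With the assumed offset, offset_is_page_aligned(4096, 4096) is nonzero while offset_is_page_aligned(4096, 65536) is 0. In other words, an offset that is valid for 4 KiB pages can fail the same check once PAGE_SIZE is `1UL << 16`.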

Let me close this ticket as a duplicate of LU-11595 so that Mike can advise how to fix the above DOM code.
