Details
- Bug
- Resolution: Unresolved
- Minor
Description
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>
This issue relates to the following test suite run:
https://testing.whamcloud.com/test_sets/7a49f99b-6593-423d-97ce-6bb8790f459d
test_113 failed with the following error:
[13531.690472] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[13531.692557] Oops: 0000 [#1] SMP PTI
[13531.693244] CPU: 1 PID: 1175920 Comm: llog_process_th 4.18.0-425.10.1.el8_lustre.x86_64 #1
[13531.695593] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[13531.696668] RIP: 0010:ls_device_get+0x1e3/0x3b0 [obdclass]
[13531.712774] Call Trace:
[13531.713318] local_oid_storage_init+0xb8/0x16c0 [obdclass]
[13531.714401] llog_osd_setup+0x9d/0x400 [obdclass]
[13531.715359] llog_setup.part.6+0x146/0x840 [obdclass]
[13531.716373] osp_sync_llog_init+0x1e0/0xb10 [osp]
[13531.718232] osp_sync_init+0x262/0x770 [osp]
[13531.719065] osp_init0.isra.19+0x1689/0x19f0 [osp]
[13531.720005] osp_device_alloc+0xcb/0x180 [osp]
[13531.720878] obd_setup+0x119/0x300 [obdclass]
[13531.721771] class_setup+0x587/0x7a0 [obdclass]
[13531.722687] class_process_config+0x1248/0x2160 [obdclass]
[13531.723771] class_config_llog_handler+0x93b/0x12e0 [obdclass]
[13531.724916] llog_process_thread+0xedf/0x1b60 [obdclass]
[13531.728786] llog_process_thread_daemonize+0x9b/0xe0 [obdclass]
[13531.729936] kthread+0x10b/0x130
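For comparing this trace against the other failures, it may help to reduce the console log to a list of frames. The following is only a rough sketch: the `parse_frames` helper and the `SAMPLE` excerpt are illustrative names, not part of Maloo or Lustre, and the regular expression assumes the `symbol+0xoffset/0xsize [module]` layout seen in the log above.

```python
import re
from collections import Counter

# Assumed frame layout: "[timestamp] symbol+0xoff/0xsize [module]".
# The "RIP: 0010:" prefix and the trailing "[module]" are both optional.
FRAME_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(?:RIP: \w+:)?"
    r"([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
    r"(?:\s+\[(\w+)\])?"
)


def parse_frames(log):
    """Return one dict per stack frame found in a console log string."""
    frames = []
    for ts, symbol, offset, size, module in FRAME_RE.findall(log):
        frames.append({
            "time": float(ts),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
            "module": module or None,  # None for core-kernel symbols
        })
    return frames


# Excerpt copied from the trace in this ticket.
SAMPLE = (
    "[13531.696668] RIP: 0010:ls_device_get+0x1e3/0x3b0 [obdclass] "
    "[13531.712774] Call Trace: "
    "[13531.713318] local_oid_storage_init+0xb8/0x16c0 [obdclass] "
    "[13531.716373] osp_sync_llog_init+0x1e0/0xb10 [osp] "
    "[13531.729936] kthread+0x10b/0x130"
)

frames = parse_frames(SAMPLE)
per_module = Counter(f["module"] for f in frames)
for f in frames:
    print(f["symbol"], hex(f["offset"]), f["module"])
```

The first frame returned is the faulting `RIP` location, so sorting or grouping crash reports by `frames[0]["symbol"]` is one way to check whether all 13 failures hit the same function.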
Test session details:
clients: https://build.whamcloud.com/job/lustre-master/4447 - 5.15.0-52-generic
servers: https://build.whamcloud.com/job/lustre-master/4447 - 4.18.0-425.10.1.el8_lustre.x86_64
There have been 13 failures since 2023-07-08, but none in the two months before then.
$ git log --oneline --after 2023-07-06 --before 2023-07-09
51d62f2122fe LU-16637 llite: call truncate_inode_pages() in inode lock
0cb7ebf22304 LU-16927 tests: improve sanity-quota
629d6bca95f9 LU-8191 tests: convert functions to static
97df1cba957b LU-16925 osd-ldiskfs: Remove unused bio_integrity_enabled
1defc11dfa59 LU-16922 kernel: update RHEL 9.2 [5.14.0-284.18.1.el9_2]
a1d332f613ac LU-8191 mdt: convert functions to static
094ae18ed8a9 LU-16548 lnet: Fixing missing gnilnd define CURRENT_LND_VERSION
0d77e94b4793 LU-16723 parser: fix help hanging
46a9abf4330e LU-16890 obd: OBD_FREE_PRE() to ignore NULL pointers
9190af53287b LU-16899 gnilnd: Use libcfs_nidstr and fix typo
35017d0973bb LU-16898 osd-ldiskfs: do not return dr_error from past RPC
4fc3c208422e LU-16518 obd: fix style and clang error
acdc2c8bb7aa LU-16796 libcfs: Remove reference to LASSERT_ATOMIC_GT
c1915c5f0dd8 LU-16846 nrs: Fix console messages
4ce452292fbe LU-16842 fsx: tolerate delete last non-stale mirror error
7ea4e0c7c534 LU-12019 build: Recognize Debian Kernel and set KMP dir
c2f548dacc5f LU-16805 llite: improve readpage debug
f5a75ea44db3 LU-16697 llite: Set BDI_CAP_* flags for lustre
b16c9333a008 LU-16691 ldiskfs: limit length of per-inode prealloc list
bba59b1287c9 LU-16651 llite: hold invalidate_lock when invalidate cache pages
3ef773db80fc LU-16594 build: get_random_u32_below, get_acl with dentry
e7cf1fc1f274 LU-13340 lustre: Support large nids in LCFG_ADD_UUID
7f1aa5b66b24 LU-16518 build: llvm/clang support
e3e91ea95fd9 LU-13343 gss: no sec flavor on loopback connection
530a302e10fc LU-12511 build: include firewalld files for native Linux client
aac625055e50 LU-12511 llite: use mapping_set_error instead of opencoded set_bit
84bb366642e8 LU-16847 ldiskfs: refactor t10 code.
9e5040a304a9 LU-16847 ldiskfs: do not copy ldiskfs_chunk_trans_blocks
None of these patches looks particularly related to the crash. However, each crash appears to be preceded by an earlier failure due to LU-16954, which comes from the BDI_CAP patch (f5a75ea44db3 LU-16697 llite: Set BDI_CAP_* flags for lustre).
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_113 - onyx-65vm4 crashed during conf-sanity test_113
Issue Links
- is related to LU-16954 mount failed: File exists(cannot create duplicate filename '/devices/virtual/bdi/lustre-ffffxxx') (Resolved)