Details
- Improvement
- Resolution: Unresolved
- Minor
Description
There are essentially two problems here:
- The primary issue is that the LDLM namespace is not being cleaned up properly (the cause is not yet known), so sysfs reports an error when the code tries to re-register a parameter file under a name that still exists.
- The secondary issue is that ldlm_namespace_sysfs_register() returns -EEXIST (-17) to ldlm_namespace_new(), but ldlm_namespace_new() returns NULL on any failure. The caller interprets this NULL as -ENOMEM (-12), which produces a misleading "Cannot allocate memory" error higher up the stack that is then returned to userspace.
sysfs: cannot create duplicate filename '/fs/lustre/ldlm/namespaces/lustre-OST0002-osc-ffff89f33be70000'
Call Trace:
 dump_stack+0x19/0x1b
 __warn+0xd8/0x100
 sysfs_warn_dup+0x64/0x80
 sysfs_create_dir_ns+0x8e/0xa0
 kobject_add_internal+0xaa/0x330
 kobject_init_and_add+0x70/0xb0
 ldlm_namespace_sysfs_register+0x68/0xc0 [ptlrpc]
 ldlm_namespace_new+0x335/0xac0 [ptlrpc]
 client_obd_setup+0xd77/0x1430 [ptlrpc]
 osc_setup_common+0x63/0x320 [osc]
 osc_setup+0x33/0x240 [osc]
 osc_device_alloc+0xa5/0x240 [osc]
 obd_setup+0x129/0x2f0 [obdclass]
 class_setup+0x2a8/0x840 [obdclass]
 class_process_config+0x1569/0x27c0 [obdclass]
 class_config_llog_handler+0x7f9/0x1370 [obdclass]
 llog_process_thread+0x85f/0x1a20 [obdclass]
 llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
 kthread+0xd1/0xe0
mount.lustre: mount trevis-12vm4@tcp:/lustre at /mnt/lustre failed: Cannot allocate memory
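The secondary issue (the real error code being lost behind a NULL return) can be illustrated with a small userspace sketch. This is not the Lustre code and not a proposed patch: struct ns, sysfs_register_stub(), ns_new_null(), ns_new_errptr(), setup_null() and setup_errptr() are hypothetical names, and err_ptr()/ptr_err()/is_err() are simplified userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. It only shows why a NULL-on-failure constructor forces the caller to guess -ENOMEM, and how one possible alternative, an error-pointer return, would let -EEXIST reach the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
static inline void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct ldlm_namespace. */
struct ns {
	int id;
};

/* Hypothetical stub: pretends sysfs registration fails on a duplicate name. */
static int sysfs_register_stub(int duplicate)
{
	return duplicate ? -EEXIST : 0;
}

/* Pattern described in the ticket: every failure collapses to NULL. */
static struct ns *ns_new_null(int duplicate)
{
	struct ns *ns = calloc(1, sizeof(*ns));

	if (ns == NULL)
		return NULL;
	if (sysfs_register_stub(duplicate) != 0) {
		free(ns);
		return NULL;	/* -EEXIST is discarded here */
	}
	return ns;
}

/* The caller can only guess at the cause, so it reports -ENOMEM. */
static int setup_null(int duplicate)
{
	struct ns *ns = ns_new_null(duplicate);

	if (ns == NULL)
		return -ENOMEM;
	free(ns);
	return 0;
}

/* Assumed alternative: encode the real error in the returned pointer. */
static struct ns *ns_new_errptr(int duplicate)
{
	struct ns *ns = calloc(1, sizeof(*ns));
	int rc;

	if (ns == NULL)
		return err_ptr(-ENOMEM);
	rc = sysfs_register_stub(duplicate);
	if (rc != 0) {
		free(ns);
		return err_ptr(rc);	/* -EEXIST is preserved */
	}
	return ns;
}

/* The caller now propagates whatever error the constructor reported. */
static int setup_errptr(int duplicate)
{
	struct ns *ns = ns_new_errptr(duplicate);

	if (is_err(ns))
		return (int)ptr_err(ns);
	free(ns);
	return 0;
}
```

With a simulated duplicate registration, setup_null(1) yields -ENOMEM while setup_errptr(1) yields -EEXIST, so userspace would see "File exists" rather than "Cannot allocate memory". Whether an error pointer, an output parameter, or some other mechanism is the right fit for ldlm_namespace_new() and its callers is left open here.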
A few such errors were reported on 2020-11-26 and 2020-11-27:
https://testing.whamcloud.com/test_sets/96c7542c-7d1f-4f4f-824b-cd2b5102f2b4
https://testing.whamcloud.com/test_sets/e7d9ec4f-2403-404f-b1a3-293a067ba0fa
https://testing.whamcloud.com/test_sets/7fbb7803-bc75-41a9-a3e9-af6ea524ff38
However, a large number of such messages are reported after conf-sanity.sh test_4 fails, and the error then recurs for every subsequent mount attempt in that session, starting with test_5a. This happened on 2020-09-02 (the first incidence reported in Kibana, for a "full" test run, so not associated with a specific patch), 2020-09-11, 2020-10-29, and 2020-11-30:
https://testing.whamcloud.com/test_sets/9614c939-ed8a-42e2-bbc1-a7122778a554
https://testing.whamcloud.com/test_sets/3c636685-44c2-499a-93ed-4667b74c9257
https://testing.whamcloud.com/test_sets/4d65b612-3a33-4515-9f8c-e22aaebdee4c
https://testing.whamcloud.com/test_sets/5773671a-303c-4f7d-b6b0-e37be7d34e7a