Details
- Bug
- Resolution: Duplicate
- Minor
- None
- Lustre 2.4.0
- None
- virtual machine, Lustre build: 2.4.0-RC2-gd3f91c4-PRISTINE-2.6.32-358.6.2.el6_lustre.g230b174.x86_64
- 3
- 10820
Description
During testing on a virtual machine, one OSS rebooted while its OSTs were being unmounted in parallel:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000e
IP: [<ffffffffa00add09>] fsfilt_put_ops+0x9/0x20 [lvfs]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/host3/target3:0:2/3:0:2:0/state
CPU 0
Modules linked in: osp(U) ofd(U) ost(U) mgc(U) fsfilt_ldiskfs(U) lustre(U) osd_ldiskfs(U) lov(U) ldiskfs(U) osc(U) lquota(U) mdd(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ko2iblnd(U) ksocklnd(U) lnet(U) libcfs(U) rdma_cm iw_cm ib_addr sha512_generic sha256_generic autofs4 ib_srp scsi_transport_srp scsi_tgt sunrpc ib_cm ipv6 ib_uverbs ib_umad iw_nes libcrc32c iw_cxgb3 cxgb3 mdio ib_qib mlx4_ib ib_sa mlx4_en mlx4_core ib_mthca ib_mad ib_core dm_round_robin ipmi_devintf ppdev parport_pc parport microcode virtio_net i2c_piix4 i2c_core sg ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sym53c8xx scsi_transport_spi virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libcfs]
Pid: 24601, comm: umount Not tainted 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa00add09>] [<ffffffffa00add09>] fsfilt_put_ops+0x9/0x20 [lvfs]
RSP: 0018:ffff880029bd1ac8 EFLAGS: 00010282
RAX: 0000000000000044 RBX: ffff88003d1f6000 RCX: ffff88003746d1c0
RDX: 0000000000000043 RSI: ffff88003d1f6000 RDI: fffffffffffffffe
RBP: ffff880029bd1ac8 R08: 0000000000000000 R09: 0000000000000002
R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: ffff880029bd1b18
R13: ffffffffa0ece3a0 R14: ffff880029bd1b18 R15: 0000000000000001
FS: 00007f26472de740(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000000e CR3: 000000002e8c6000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 24601, threadinfo ffff880029bd0000, task ffff88002e8c3540)
Stack:
ffff880029bd1ae8 ffffffffa0e925b9 ffff88003d1f6000 ffff880029bd1b18
<d> ffff880029bd1b08 ffffffffa0e97247 ffff88002f282038 ffff88003d1f6000
<d> ffff880029bd1b88 ffffffffa08a7ba7 0000000210000080 0000000000000000
Call Trace:
[<ffffffffa0e925b9>] osd_umount+0x39/0x150 [osd_ldiskfs]
[<ffffffffa0e97247>] osd_device_fini+0x147/0x190 [osd_ldiskfs]
[<ffffffffa08a7ba7>] class_cleanup+0x577/0xda0 [obdclass]
[<ffffffffa087cb36>] ? class_name2dev+0x56/0xe0 [obdclass]
[<ffffffffa08a948c>] class_process_config+0x10bc/0x1c80 [obdclass]
[<ffffffffa08a2cb3>] ? lustre_cfg_new+0x353/0x7e0 [obdclass]
[<ffffffffa08aa1c9>] class_manual_cleanup+0x179/0x6f0 [obdclass]
[<ffffffffa0758717>] ? cfs_waitq_broadcast+0x17/0x20 [libcfs]
[<ffffffffa087aee6>] ? class_export_put+0xf6/0x2b0 [obdclass]
[<ffffffffa0e9b7a5>] osd_obd_disconnect+0x1c5/0x1d0 [osd_ldiskfs]
[<ffffffffa08ac1fe>] lustre_put_lsi+0x17e/0x1100 [obdclass]
[<ffffffffa08b4f58>] lustre_common_put_super+0x5f8/0xc40 [obdclass]
[<ffffffffa08deada>] server_put_super+0x1ca/0xf00 [obdclass]
[<ffffffff8118334b>] generic_shutdown_super+0x5b/0xe0
[<ffffffff81183436>] kill_anon_super+0x16/0x60
[<ffffffffa08ac026>] lustre_kill_super+0x36/0x60 [obdclass]
[<ffffffff81183bd7>] deactivate_super+0x57/0x80
[<ffffffff811a1c4f>] mntput_no_expire+0xbf/0x110
[<ffffffff811a26bb>] sys_umount+0x7b/0x3a0
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Here is the backtrace of the other umount process:
PID: 24600 TASK: ffff88003c7c5500 CPU: 0 COMMAND: "umount"
#0 [ffff88003d11f798] schedule at ffffffff8150df42
#1 [ffff88003d11f860] jbd2_log_wait_commit at ffffffffa00bdce5 [jbd2]
#2 [ffff88003d11f8f0] ldiskfs_sync_fs at ffffffffa0da657f [ldiskfs]
#3 [ffff88003d11f930] vfs_quota_disable at ffffffff811e1756
#4 [ffff88003d11fa20] ldiskfs_quota_off at ffffffffa0da98f0 [ldiskfs]
#5 [ffff88003d11fa80] deactivate_super at ffffffff81183bc6
#6 [ffff88003d11faa0] mntput_no_expire at ffffffff811a1c4f
#7 [ffff88003d11fad0] osd_umount at ffffffffa0e925f9 [osd_ldiskfs]
#8 [ffff88003d11faf0] osd_device_fini at ffffffffa0e97247 [osd_ldiskfs]
#9 [ffff88003d11fb10] class_cleanup at ffffffffa08a7ba7 [obdclass]
#10 [ffff88003d11fb90] class_process_config at ffffffffa08a948c [obdclass]
#11 [ffff88003d11fc20] class_manual_cleanup at ffffffffa08aa1c9 [obdclass]
#12 [ffff88003d11fce0] osd_obd_disconnect at ffffffffa0e9b7a5 [osd_ldiskfs]
#13 [ffff88003d11fd20] lustre_put_lsi at ffffffffa08ac1fe [obdclass]
#14 [ffff88003d11fd50] lustre_common_put_super at ffffffffa08b4f58 [obdclass]
#15 [ffff88003d11fdc0] server_put_super at ffffffffa08deada [obdclass]
#16 [ffff88003d11fe30] generic_shutdown_super at ffffffff8118334b
#17 [ffff88003d11fe50] kill_anon_super at ffffffff81183436
#18 [ffff88003d11fe70] lustre_kill_super at ffffffffa08ac026 [obdclass]
#19 [ffff88003d11fe90] deactivate_super at ffffffff81183bd7
#20 [ffff88003d11feb0] mntput_no_expire at ffffffff811a1c4f
#21 [ffff88003d11fee0] sys_umount at ffffffff811a26bb
#22 [ffff88003d11ff80] system_call_fastpath at ffffffff8100b072
The vmcore is only 97 MB if you would like to see it, or I can run some commands on it if you prefer.
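A note on reading the register dump in the oops above: CR2 (the faulting address) is 000000000000000e and RDI is fffffffffffffffe. On x86-64, RDI carries the first function argument, and the fault is only 9 bytes into fsfilt_put_ops, so RDI plausibly still holds the pointer that was passed in. Read as a signed 64-bit value it is -2, which looks like an errno-encoded error pointer rather than a true NULL. The short Python sketch below only redoes that arithmetic; the helper names are made up for illustration, and the idea that the function reads a field at the computed offset is an inference from the numbers, not something confirmed from the Lustre source.

```python
import errno

MASK64 = (1 << 64) - 1
MAX_ERRNO = 4095  # range the Linux kernel treats as encoded error pointers


def as_signed64(value):
    """Interpret a 64-bit register value as a signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


def decode_err_ptr(value):
    """Return the errno name if value looks like an error pointer, else None.

    Hypothetical helper for illustration; it mirrors the kernel's IS_ERR range check.
    """
    signed = as_signed64(value)
    if -MAX_ERRNO <= signed < 0:
        return errno.errorcode.get(-signed)
    return None


rdi = 0xFFFFFFFFFFFFFFFE  # RDI from the oops
cr2 = 0x000000000000000E  # CR2 (faulting address) from the oops

# Offset that, added to RDI with 64-bit wraparound, gives the faulting address.
offset = (cr2 - rdi) & MASK64

print(as_signed64(rdi))     # -2
print(decode_err_ptr(rdi))  # ENOENT
print(hex(offset))          # 0x10
```

Under these assumptions, the fault is consistent with fsfilt_put_ops being handed an error value (-ENOENT) instead of a valid pointer and then reading a field at offset 0x10 from it. Which caller stored that value, and why, would need to be confirmed from the vmcore.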
Issue Links
- duplicates: LU-3411 Encountered at NULL pointer exception for function osd_read_prep (Resolved)