Topic: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update

consul

  • Zen Apprentice
  • *
  • Posts: 4
  • Karma: +0/-0
Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« on: February 04, 2016, 03:37:06 pm »
Hello!
After updating my Zentyal 4.2 to the latest version, this error began to appear on the console:

BUG: soft lockup - CPU #1 stuck for 23s!

The error repeats until the server is completely blocked, and only a hardware reset restores it  :(

Has anyone had any experience with this?

Thank you!

petteri.jekunen

  • Zen Apprentice
  • *
  • Posts: 1
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #1 on: February 05, 2016, 05:04:07 am »
Yes, the same in our environment. We are running Zentyal in a Proxmox VM.
Regards,
-Petteri

consul

  • Zen Apprentice
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #2 on: February 09, 2016, 03:54:40 pm »
After installing the latest updates for Samba, the problem seems to be solved...

zerolife

  • Zen Apprentice
  • *
  • Posts: 2
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #3 on: February 10, 2016, 09:26:33 pm »
Thank you.
Have been experiencing the same issue.

matrizze

  • Zen Apprentice
  • *
  • Posts: 16
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #4 on: February 11, 2016, 09:40:02 pm »
@consul
can you briefly describe how to install the latest updates for Samba?

Thx

Edit: Maybe as usual: apt-get install zentyal-samba?
« Last Edit: February 11, 2016, 10:07:53 pm by matrizze »

consul

  • Zen Apprentice
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #5 on: February 12, 2016, 08:46:11 am »
I installed them directly via the Zentyal web console.

consul

  • Zen Apprentice
  • *
  • Posts: 4
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #6 on: February 12, 2016, 11:14:48 am »
I must take that back...  :-[
The same error occurred again today...  >:(

zerolife

  • Zen Apprentice
  • *
  • Posts: 2
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #7 on: February 16, 2016, 04:02:06 pm »
Same here... Issue is not resolved with the latest updates.  :(

Has anyone made any headway regarding this?

BerT666

  • Zen Warrior
  • ***
  • Posts: 228
  • Karma: +17/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #8 on: February 17, 2016, 01:50:46 pm »
Howdy,

maybe dmesg could give you a hint, or the syslog (both under /var/log)...
Does this happen directly after booting, or only after some hours?
Can you see anything strange with top / htop?

BTW, is it physical hardware (what kind?) or a VM (which hypervisor?)?

My Zentyal is running as a xenserver VM and I did not get anything like this...
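
To make the syslog check a bit more concrete, here is a minimal Python sketch that counts soft lockup reports per process and CPU. It assumes kernel lines of the form "BUG: soft lockup - CPU#0 stuck for 22s! [smbd:23984]"; the function name summarize_lockups and the regular expression are only my illustration (not part of Zentyal or any existing tool), and other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Assumed message format (kernel 3.19 era), for example:
#   "... BUG: soft lockup - CPU#0 stuck for 22s! [smbd:23984]"
# An optional space before '#' is tolerated.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU ?#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(lines):
    """Count soft lockup reports per (process name, CPU number).

    `lines` is any iterable of log lines (a list or an open file).
    Lines that do not match the assumed format are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[(match.group("comm"), int(match.group("cpu")))] += 1
    return counts
```

You could feed it an open log file, for example summarize_lockups(open("/var/log/syslog", errors="replace")), assuming that is where your system writes kernel messages. If one process name dominates the result, that is a hint where to look next.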

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #9 on: February 17, 2016, 08:22:40 pm »
Hi,

I have the same issue!

I've set up a 4.2 branch server on an Athlon X2 HP machine for a non-profit refugee support project. We want to provide about 10 workplaces where people can write their CVs, etc., all things they cannot do without proper word-processing software.

I'm setting up (Mint 17.2) clients to connect via samba4 ads, mount home drives, etc. and after a few sleepless nights, it seems to work well somehow :-)

Now, what I'm seeing is that the server gets soft lockups every now and then; at some point they pile up and finally the machine gets stuck. Before, I used a Lenovo/IBM Core2 machine with a separate installation, but I had some annoying hiccups on that unit, so I thought I should change and made a new install on this HP box, but I still seem to have the same problems.
So I started to look at syslog and found a massive number of soft lockup entries.


This is from syslog at the first incident during that day:
Code:
Feb 14 14:00:01 zentyal CRON[23986]: (clamav) CMD (/usr/bin/freshclam --quiet)
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [smbd:23984]
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] Modules linked in: xt_mark xt_connmark iptable_mangle 8021q garp mrp stp llc quota_v2 quota_tree ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_tcpudp xt_conntrack iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_h323 nf_conntrack_h323 nf_conntrack_tftp nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack amdkfd amd_iommu_v2 snd_hda_codec_analog snd_hda_codec_generic radeon hp_wmi snd_hda_intel sparse_keymap ppdev snd_hda_controller snd_hda_codec ttm snd_hwdep drm_kms_helper snd_pcm snd_timer drm kvm edac_core snd i2c_algo_bit soundcore shpchp serio_raw k8temp edac_mce_amd 8250_fintek wmi i2c_piix4 tpm_infineon parport_pc mac_hid lp parport uas usb_storage hid_generic usbhid hid psmouse 3c59x mii floppy tg3 ahci libahci ptp pps_core
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] CPU: 0 PID: 23984 Comm: smbd Not tainted 3.19.0-49-generic #55~14.04.1-Ubuntu
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] Hardware name: Hewlett-Packard HP Compaq dc5850 Microtower/3029h, BIOS 786F6 v01.09 04/09/2008
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] task: ffff8800693a93a0 ti: ffff88006bd30000 task.ti: ffff88006bd30000
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] RIP: 0010:[<ffffffff817b77f5>]  [<ffffffff817b77f5>] _raw_spin_lock+0x35/0x60
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] RSP: 0018:ffff88006bd33e20  EFLAGS: 00000206
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] RAX: 0000000000003db1 RBX: ffff88000897d0c0 RCX: 00000000000001d2
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] RDX: 00000000000001d4 RSI: 00000000000001d2 RDI: ffff8800691b6120
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] RBP: ffff88006bd33e48 R08: 00000000000001d4 R09: ffff88006bd33c14
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] R10: ffff88006bd33ee2 R11: 0000000000000005 R12: ffff88002031a870
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] R13: 00000000000000a2 R14: 0000000400000001 R15: ffff88002031a870
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] FS:  00007fef5ef10780(0000) GS:ffff88006fc00000(0000) knlGS:0000000000000000
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] CR2: 00007f51e9e72000 CR3: 0000000014ac3000 CR4: 00000000000007f0
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] Stack:
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  ffffffff817484a0 ffff8800691b4000 ffff8800691b5e00 ffff8800200ff480
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  ffff8800691b4000 ffff88006bd33ea8 ffffffff8174b643 ffff88006bd33e88
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  ffffffff81cda080 ffff88006bd33e78 00000028200ff480 000000000000006e
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] Call Trace:
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  [<ffffffff817484a0>] ? unix_state_double_lock+0x60/0x70
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  [<ffffffff8174b643>] unix_dgram_connect+0x93/0x250
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  [<ffffffff8168f367>] SYSC_connect+0xe7/0x120
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  [<ffffffff8169054e>] SyS_connect+0xe/0x10
Feb 14 14:00:08 zentyal kernel: [ 7928.088009]  [<ffffffff817b7c0d>] system_call_fastpath+0x16/0x1b
Feb 14 14:00:08 zentyal kernel: [ 7928.088009] Code: f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 01 c3 89 d1 0f b7 f2 b8 00 80 00 00 eb 0a 0f 1f 00 f3 90 83 e8 01 74 20 0f b7 17 41 89 d0 <41> 31 c8 41 81 e0 fe ff 00 00 75 e7 55 0f b7 f2 48 89 e5 e8 6b
Feb 14 14:00:30 zentyal dhcpd: DHCPDISCOVER from 00:1e:0b:80:82:21 (mint) via eth1

It then continues until the machine finally stops completely:
Code:
Feb 15 01:59:58 zentyal kernel: [ 8836.084003] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [smbd:6522]
Feb 15 01:59:58 zentyal kernel: [ 8836.084005] Modules linked in: quota_v2 quota_tree ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_tcpudp xt_conntrack iptable_nat nf_nat_ipv4 iptable_filter nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_h323 nf_conntrack_h323 nf_conntrack_tftp nf_nat_ftp nf_nat nf_conntrack_ftp xt_mark nf_conntrack_ipv4 nf_defrag_ipv4 xt_connmark nf_conntrack iptable_mangle ip_tables x_tables snd_hda_codec_analog snd_hda_codec_generic amdkfd snd_hda_intel hp_wmi sparse_keymap snd_hda_controller amd_iommu_v2 ppdev radeon snd_hda_codec snd_hwdep snd_pcm snd_timer ttm kvm drm_kms_helper drm serio_raw snd edac_core k8temp soundcore edac_mce_amd i2c_algo_bit parport_pc i2c_piix4 wmi shpchp 8250_fintek tpm_infineon mac_hid lp parport uas usb_storage hid_generic usbhid hid psmouse tg3 3c59x ptp mii pps_core floppy ahci libahci
Feb 15 01:59:58 zentyal kernel: [ 8836.084005] CPU: 0 PID: 6522 Comm: smbd Tainted: G             L 3.19.0-49-generic #55~14.04.1-Ubuntu
Feb 15 01:59:58 zentyal kernel: [ 8836.084005] Hardware name: Hewlett-Packard HP Compaq dc5850 Microtower/3029h, BIOS 786F6 v01.09 04/09/2008
Feb 15 01:59:58 zentyal kernel: [ 8836.084005] task: ffff880069554e80 ti: ffff88006b930000 task.t
Feb 17 19:24:18 zentyal rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="450" x-info="http://www.rsyslog.com"] start

I'm afraid this makes the system pretty much useless and anything but reliable.
Do you have any suggestions on how to solve this?
Zentyal 6.1

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #10 on: February 17, 2016, 10:09:25 pm »
Just some more from the syslog:

Code:
Feb 17 21:59:41 zentyal kernel: [ 9345.516006] INFO: rcu_sched self-detected stall on CPU { 1}  (t=15000 jiffies g=68817 c=68816 q=0)
Feb 17 21:59:41 zentyal kernel: [ 9345.516006] Task dump for CPU 1:
Feb 17 21:59:41 zentyal kernel: [ 9345.516006] smbd            R  running task        0 18719   5097 0x00000008
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  ffffffff81c56000 ffff88006fc83d78 ffffffff810a0276 0000000000000001
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  ffffffff81c56000 ffff88006fc83d98 ffffffff810a386d 0000000000000087
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  0000000000000002 ffff88006fc83dc8 ffffffff810d4100 ffff88006fc94bc0
Feb 17 21:59:41 zentyal kernel: [ 9345.516006] Call Trace:
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  <IRQ>  [<ffffffff810a0276>] sched_show_task+0xb6/0x130
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810a386d>] dump_cpu_task+0x3d/0x50
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810d4100>] rcu_dump_cpu_stacks+0x90/0xd0
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810d7fbc>] rcu_check_callbacks+0x42c/0x670
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810a48a1>] ? account_process_tick+0x61/0x180
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810dcef9>] update_process_times+0x39/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810ec405>] tick_sched_handle.isra.16+0x25/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810ec484>] tick_sched_timer+0x44/0x80
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810ddbb7>] __run_hrtimer+0x77/0x1d0
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810ec440>] ? tick_sched_handle.isra.16+0x60/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff810ddf97>] hrtimer_interrupt+0xe7/0x220
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff8104abc9>] local_apic_timer_interrupt+0x39/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff817bac85>] smp_apic_timer_interrupt+0x45/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff817b8cbd>] apic_timer_interrupt+0x6d/0x80
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  <EOI>  [<ffffffff817b77ea>] ? _raw_spin_lock+0x2a/0x60
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff817484a0>] ? unix_state_double_lock+0x60/0x70
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff8174b643>] unix_dgram_connect+0x93/0x250
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff8168f367>] SYSC_connect+0xe7/0x120
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff8169054e>] SyS_connect+0xe/0x10
Feb 17 21:59:41 zentyal kernel: [ 9345.516006]  [<ffffffff817b7c0d>] system_call_fastpath+0x16/0x1b

It also appears that you can't kill the hanging process, not even with a forced (-9) kill. It's an smbd task, stalling it all. :-(
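
To see at a glance what the stuck task was doing, a small Python sketch like the following could pull the function names out of a trace. It assumes frames are printed as "[<address>] function+0xoff/0xlen", with a leading "?" on frames the kernel found on the stack but could not verify; the helper names extract_frames and reliable_functions are hypothetical and only for illustration.

```python
import re

# Assumed frame format, for example:
#   "[<ffffffff8174b643>] unix_dgram_connect+0x93/0x250"
#   "[<ffffffff817484a0>] ? unix_state_double_lock+0x60/0x70"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def extract_frames(lines):
    """Return (function, reliable) pairs for every stack frame found.

    `reliable` is False for frames the kernel printed with a '?'.
    Any line carrying a frame in this format is picked up, so pass
    only the lines you want to analyse.
    """
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("func"), match.group("guess") is None))
    return frames


def reliable_functions(lines):
    """Keep only the function names of frames not marked with '?'."""
    return [func for func, reliable in extract_frames(lines) if reliable]
```

Run over the two traces I posted, the unmarked frames below the interrupt part are the same each time: unix_dgram_connect, SYSC_connect, SyS_connect and system_call_fastpath, with smbd waiting in _raw_spin_lock.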
« Last Edit: February 17, 2016, 10:13:13 pm by segelfreak »

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #11 on: February 17, 2016, 10:17:35 pm »
Just found this: http://ubuntuforums.org/showthread.php?t=2205211&p=12996968#post12996968

It refers to a malfunctioning power supply. I will try another unit, as this really rings a bell: I used the same power supply in both machines I have tried so far. It's indeed the only link between the two installations.
Will keep you updated.

EDIT: OK, it's not the power supply. Changed it and the error still comes up.
« Last Edit: February 17, 2016, 11:17:48 pm by segelfreak »

matrizze

  • Zen Apprentice
  • *
  • Posts: 16
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #12 on: February 18, 2016, 02:38:03 pm »
I have no proof of it, but after updating with apt-get install zentyal-samba and deleting another Debian VM on my ESXi 5.5 hypervisor, this failure has not come back since my last post.

The other Debian machine was provisioned with 2 CPUs with 1 core each. After that I only added machines with 1 CPU and 2 cores, and I have not had this failure again??!

But like I said, I have no proof of it.

M.

BerT666

  • Zen Warrior
  • ***
  • Posts: 228
  • Karma: +17/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #13 on: February 18, 2016, 02:57:38 pm »
Howdy,

just to be clear about this: do you have the VMware Tools installed?

Regards

Thomas

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #14 on: February 18, 2016, 08:44:45 pm »
Quote from: BerT666 (Reply #13)
Howdy,

just to be clear about this: do you have the VMware Tools installed?

Regards

Thomas

Not sure who you mean, but just in case: I'm not using any VM, so I don't expect this issue to have its root there. Do I have the VMware Tools installed? Not on purpose; however, if they were installed during the standard setup, it's possible.
I can't check anymore, since I'm making a fresh install and then will not (!) install any upgrades. Let's see if this makes the lockups stop.