Author Topic: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update  (Read 15572 times)

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #15 on: February 18, 2016, 11:13:22 pm »
Ok folks..

So, after doing a fresh installation, I noticed that Zentyal had already updated its packages automatically.
At first I was a bit disappointed, but then I decided to move forward step by step.
So I installed the various updates, except for the new kernel. In my case, that would be generic kernel 3.19.0.49.34.

Up to now, no lockups... fingers crossed!
Zentyal 6.1

peptoniET

  • Zen Apprentice
  • *
  • Posts: 28
  • Karma: +4/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #16 on: February 19, 2016, 12:57:24 pm »
Hi everyone.

I experienced the same problem on two different servers.  I had 2 lockups on the first one.  After some research, I decided to downgrade the kernel from 3.19.0-49-generic to 3.19.0-47-generic.  So far, no more lockups.

Today I experienced the same behaviour on another server.  I checked the kernel version, and it was 3.19.0-49-generic. I just downgraded this one to 3.19.0-47-generic too.

I'll keep you informed about the results.  The first server has not locked up since the downgrade.

Both are Zentyal 4.2.2, up to date.

How to downgrade:

Code: [Select]
sudo apt-get purge linux-image-3.19.0-49-generic
sudo update-grub

then reboot.

matrizze

  • Zen Apprentice
  • *
  • Posts: 16
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #17 on: February 19, 2016, 01:54:46 pm »
@BerT666

I haven't installed the VMWare Tools on all VMs.

segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #18 on: February 19, 2016, 03:30:11 pm »
I may just add that one should ensure that the previous kernel is still "available". Auto-remove function of apt might have deleted it, no?
And finally, you need to put upgrade offers for the new kernel on hold...
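A minimal sketch of the "hold" part, assuming that linux-image-generic-lts-vivid is the meta-package pulling in the new 3.19 kernels on your system (check this first). The helper name is_held is made up for illustration; it only reads text and changes nothing:

```shell
# Hypothetical helper: succeed if the package named in $1 appears in
# `apt-mark showhold` style output (one package name per line) on stdin.
is_held() {
    grep -qxF -- "$1"
}

# Intended use (not executed here):
#   sudo apt-mark hold linux-image-generic-lts-vivid
#   apt-mark showhold | is_held linux-image-generic-lts-vivid && echo "held"
```

Once a fixed kernel is out, sudo apt-mark unhold with the same package name lifts the hold again.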
« Last Edit: February 19, 2016, 03:32:03 pm by segelfreak »
Zentyal 6.1

peptoniET

  • Zen Apprentice
  • *
  • Posts: 28
  • Karma: +4/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #19 on: February 19, 2016, 06:45:38 pm »
I may just add that one should ensure that the previous kernel is still "available". Auto-remove function of apt might have deleted it, no?
And finally, you need to put upgrade offers for the new kernel on hold...

Default config saves 2 to 3 kernels:

/etc/kernel/postinst.d/apt-auto-removal says:

# Mark as not-for-autoremoval those kernel packages that are:
#  - the currently booted version
#  - the kernel version we've been called for
#  - the latest kernel version (determined using rules copied from the grub
#    package for deciding which kernel to boot)
#  - the second-latest kernel version, if the booted kernel version is
#    already the latest and this script is called for that same version,
#    to ensure a fallback remains available in the event the newly-installed
#    kernel at this ABI fails to boot
# In the common case, this results in exactly two kernels saved, but it can
# result in three kernels being saved.  It's better to err on the side of
# saving too many kernels than saving too few.

So, in a default configuration, it should be safe. But you are right, one should check before deleting the kernel.  Anyway, I'm curious about what would happen if we tried to remove the last kernel...
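A minimal sketch of such a check, assuming the usual dpkg -l column layout (status in the first column, package name in the second). The helper name has_installed_kernel is made up for illustration; it only reads text and removes nothing:

```shell
# Hypothetical helper: succeed if linux-image-<version> is listed as
# installed ("ii") in `dpkg -l` style output read from stdin.
has_installed_kernel() {
    awk -v pkg="linux-image-$1" '
        $1 == "ii" && $2 == pkg { found = 1 }
        END { exit !found }
    '
}

# Intended use (not executed here): purge -49 only if -47 is still installed.
#   dpkg -l 'linux-image-*' | has_installed_kernel 3.19.0-47-generic \
#       && sudo apt-get purge linux-image-3.19.0-49-generic
```

A package that was removed but not purged shows up as "rc" rather than "ii", so it does not count as a usable fallback here.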


segelfreak

  • Zen Monk
  • **
  • Posts: 80
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #20 on: February 21, 2016, 11:53:32 am »
  Anyway, i'm curious about what would happen if we try to remove the last kernel...

 ;D Wanna try?
Zentyal 6.1

peptoniET

  • Zen Apprentice
  • *
  • Posts: 28
  • Karma: +4/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #21 on: February 21, 2016, 07:00:22 pm »
Nope...  ;D

BerT666

  • Zen Warrior
  • ***
  • Posts: 218
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #22 on: February 22, 2016, 11:55:28 am »
In the past, there was a problem with the kernel related to a firewall script...

Back then it was no problem to remove the newest kernel...

The only drawback is that you have to keep it in mind and do a bit more testing of package updates ;-)

@matrizze: I have heard of several POSSIBLE issues when the VMware Tools are not installed. That was the background of my question ;-)

Regards

Thomas


« Last Edit: February 22, 2016, 12:01:21 pm by BerT666 »

hotsummer55

  • Zen Apprentice
  • *
  • Posts: 27
  • Karma: +2/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #23 on: February 24, 2016, 10:17:30 am »
I have the same issue.
I have a KVM host with Zentyal running as a guest with 8 CPUs.
It ran on 4.1 without problems for about 10 months, until Friday 19 Feb 2016.
After the upgrade to 4.2 it is running on the linux-image-3.19.0-49-generic kernel.
I am getting "tainted G" messages for smbd, and one of the CPUs seems to be stuck at 100% in htop.
It seems to eventually run out of memory and then hang. This has happened to me once.
Today, 23/02/2016, I reduced the VM's CPUs to 4 to see if that helps, and I noticed there is a new kernel update available.
I can't go back, as I don't have an earlier kernel (I just upgraded).
If the problem reappears today, I will run another apt-get dist-upgrade and get the latest kernel to see if that fixes the problem.
I will report back.

peptoniET

  • Zen Apprentice
  • *
  • Posts: 28
  • Karma: +4/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #24 on: February 24, 2016, 10:26:31 am »
Second downgraded kernel server has been stable.  No more lockups after kernel downgrade.  Same with first server.  It seems that we have pinpointed the problem.

So what is happening...?  a Kernel bug...? Samba bug...?

LaM

  • Zen Apprentice
  • *
  • Posts: 41
  • Karma: +0/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #25 on: February 25, 2016, 10:52:47 am »
Hi!

Same issues here!
After trying to update samba to the latest version (which didn't exactly go as expected), we're experiencing processor freezes that leave the machine completely unresponsive.
The only way out is a reboot.

Let's break down the steps to replicate the issue:

  • 1. Update to the latest samba packages
Code: [Select]
samba-common-bin:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
samba-common:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
samba-dsdb-modules:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
samba-libs:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
samba-vfs-modules:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
samba:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1),
smbclient:amd64 (4.3.1-zentyal2, 4.3.4-zentyal1)

  • 2. The update reports that packages aren't correctly installed
  • 3. Reboot:
    • Virtual machine: KERNEL PANIC at every reboot
    • Physical machines: able to reboot, but I needed to work around the issue with a package reconfigure (I can't remember all the exact actions right now). One of the issues I encountered was a broken SQL table, and the samba service wasn't working at all.
I needed to rebuild the samba_access table using the suggestions from this url:  http://stackoverflow.com/questions/8843776/mysql-table-is-marked-as-crashed-and-last-automatic-repair-failed

Code: [Select]
cd /var/lib/mysql/zentyal/
# If I remember correctly, this one won't help, since my issues were in the index
myisamchk -r -v -f samba_access.MYD
myisamchk -r -v -f samba_access.MYI
sudo dpkg --configure -a
sudo reboot

After that, at least the packages were installed correctly and the machine was able to work.

The other two machines were less tricky: one went fine on the first try, and the second could be reconfigured via

Code: [Select]
dpkg --configure -a
  • 4. Normal operation:
The machines froze 4 times in 5 days (this had never happened before). On the sixth day I was actively monitoring, and top showed this:

Code: [Select]
top - 11:34:30 up 17:58,  2 users,  load average: 0.82, 0.34, 0.15
Tasks: 444 total,   2 running, 442 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.0 us,  1.3 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  16185268 total, 15717188 used,   468080 free,   374772 buffers
KiB Swap: 16544764 total,        0 used, 16544764 free. 13313988 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                 
 19904 ebox      20   0  326012  14052  11768 R  99.9  0.1   1:13.49 net                         
 

One processor was stuck at 100%, and killing the net process prevented a machine freeze.
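If you want to catch a runaway process like this without staring at top, here is a rough sketch. It assumes the default top column layout shown above (%CPU in column 9, command in column 12); the helper name busy_procs and the threshold of 90 are just illustrative choices:

```shell
# Hypothetical helper: read `top -b -n 1` output on stdin and print
# "PID COMMAND" for every process at or above the %CPU threshold in $1.
busy_procs() {
    awk -v limit="$1" '
        in_table && ($9 + 0) >= (limit + 0) { print $1, $12 }
        $1 == "PID" { in_table = 1 }   # process rows start after the header
    '
}

# Intended use (not executed here):
#   LC_ALL=C top -b -n 1 | busy_procs 90
```

It only reports; whether to kill the process it finds is still your call.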

We've now been running fine since Monday (3 days, no issues so far)... I don't know whether I'm going to downgrade (like peptoniET) or upgrade, since these updates are available:

Code: [Select]
linux-generic Complete Generic Linux kernel and headers 3.13.0.79.85
linux-headers-generic Generic Linux kernel headers 3.13.0.79.85
linux-image-generic Generic Linux kernel image 3.13.0.79.85
linux-image-generic-lts-vivid Generic Linux kernel image 3.19.0.51.36
linux-source Linux kernel source with Ubuntu patches 3.13.0.79.85
linux-source-3.13.0 Linux kernel source for version 3.13.0 with Ubuntu patches 3.13.0-79.123

Suggestions?

Thx all, hope everything is clear.

I'm here for questions.

L

peptoniET

  • Zen Apprentice
  • *
  • Posts: 28
  • Karma: +4/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #26 on: February 25, 2016, 11:07:32 am »
I must add:  I have a third Zentyal 4.2.2 machine, with the 3.19.0-49-generic kernel, which has been working without any problem for 12 days.  BUT THIS MACHINE HAS NO SAMBA SHARES.  The module is enabled, but no shares have been created (it only works as Domain Controller + VirtualBox VM server).

BerT666

  • Zen Warrior
  • ***
  • Posts: 218
  • Karma: +9/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #27 on: February 25, 2016, 01:45:28 pm »
Howdy,

so it seems it is another occurrence of this bug / failure:
https://forum.zentyal.org/index.php/topic,26954.0.html

There was also a problem with kernel / net scripts...

So there are two solutions for us "normal admins":

- update to latest kernel and hope this is gone
- revert to an older kernel where all is OK...

I had this only one or two times on my VM (I think it was related to a big data transfer (about 1,5TB)), so I am not sure if this is solved with the current kernel...

Regards

Thomas

hotsummer55

  • Zen Apprentice
  • *
  • Posts: 27
  • Karma: +2/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #28 on: February 25, 2016, 05:20:31 pm »
Follow-up to my previous post.
Problem:
The server eventually runs out of memory, you can't do anything with it, and you have to switch it off.
My server handles a lot of smbd (samba) traffic. We back up by mounting samba shares and rsyncing to storage.
On a copy of the server that has no samba traffic, the errors don't seem to appear.
Many errors are generated in syslog, always with the smbd process:
 ++++++++++++++++++++++++++++++++
 kernel: [40908.028001] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [smbd:3049]
Feb 24 09:34:44 server2 kernel: [40908.028001] Modules linked in: xt_mac xt_mark nf_conntrack_ipv4 nf_defrag_ipv4 xt_connmark nf_conntrack iptable_mangle ip_tables x_tables quota_v2 quota_tree nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache iosf_mbi kvm_intel kvm crct10dif_pclmul crc32_pclmul dm_crypt aesni_intel snd_hda_codec_generic aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel snd_hda_controller serio_raw snd_hda_codec snd_hwdep virtio_rng snd_pcm snd_timer i2c_piix4 snd soundcore pvpanic 8250_fintek mac_hid parport_pc ppdev lp parport psmouse pata_acpi floppy
Feb 24 09:34:44 server2 kernel: [40908.028001] CPU: 1 PID: 3049 Comm: smbd Not tainted 3.19.0-49-generic #55~14.04.1-Ubuntu
Feb 24 09:34:44 server2 kernel: [40908.028001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Feb 24 09:34:44 server2 kernel: [40908.028001] task: ffff8800b9f9ce80 ti: ffff88019951c000 task.ti: ffff88019951c000
Feb 24 09:34:44 server2 kernel: [40908.028001] RIP: 0010:[<ffffffff817b77ea>]  [<ffffffff817b77ea>] _raw_spin_lock+0x2a/0x60
Feb 24 09:34:44 server2 kernel: [40908.028001] RSP: 0018:ffff88019951fe20  EFLAGS: 00000206
++++++++++++++++++++++++++++++
The tainted value should be zero and is not:
cat /proc/sys/kernel/tainted          displays the tainted value
htop shows a CPU stuck at 100%
and free -m shows that all memory is used
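To see quickly which process the lockups point at, something like this sketch could help. It assumes the syslog line format shown above, where each report ends in [process:pid]; the helper name soft_lockup_summary is made up:

```shell
# Hypothetical helper: read syslog text on stdin and print how many
# "soft lockup" reports each process name accounts for, most frequent first.
soft_lockup_summary() {
    grep 'BUG: soft lockup' \
        | sed -n 's/.*\[\([^]:]*\):[0-9]*\][[:space:]]*$/\1/p' \
        | sort | uniq -c | sort -rn
}

# Intended use (not executed here):
#   soft_lockup_summary < /var/log/syslog
```

If smbd is the only name in the output, that fits the observation that servers without samba traffic stay healthy.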
+++++++++++++++++++++++++++++
This is on a server that was upgraded from Zentyal 4.1 and after the upgrade was running on kernel
linux-image-3.19.0-49-generic.
Prior to this it was running on 4.1 for about 10 months with no problems.
+++++++++++++++
I decided to go back to the linux-image-3.19.0-47-generic kernel, but as I had just upgraded from 4.1, it was not installed (I had no other 3.19 kernel).
So run
sudo apt-cache madison linux-image-3.19.0-4
or
sudo apt-cache madison linux-image-3.19.0-5
to display the available kernels. Then I ran
sudo apt-get install linux-image-3.19.0-47-generic
to install the required kernel.

Then I wanted to force grub to load only the linux-image-3.19.0-47-generic kernel,
so I edited /etc/default/grub
and replaced GRUB_DEFAULT=0
with the following (the menu entry text must match exactly):
GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 3.19.0-47-generic"
Then run
sudo update-grub
and reboot.
Today, 25/02/2016, the server is running with no problems, and it definitely would have had problems by now if this had not solved the issue.
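Since the GRUB_DEFAULT text has to match the menu exactly, it may help to list the titles grub actually generated. A small sketch, assuming the generated config lives at /boot/grub/grub.cfg and that titles are the first single-quoted string on each menuentry/submenu line; the helper name grub_menu_titles is made up:

```shell
# Hypothetical helper: print the title of every menuentry/submenu line
# found in grub.cfg text read from stdin.
grub_menu_titles() {
    awk -F"'" '/^[ \t]*(menuentry|submenu) / { print $2 }'
}

# Intended use (not executed here; reading the file may need root):
#   grub_menu_titles < /boot/grub/grub.cfg
```

The value for GRUB_DEFAULT is then the submenu title, a ">" and the entry title, copied character for character.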



 

jwilliams1976

  • Zen Apprentice
  • *
  • Posts: 22
  • Karma: +1/-0
Re: Zentyal 4.2 - BUG: soft lockup - CPU #1, after latest update
« Reply #29 on: February 26, 2016, 04:02:08 pm »
I was having this same issue and thanks to your help here have rolled back to the 3.19.0-47 kernel and everything seems to be normal again. Has anyone tested the 3.19.0.51.36 vivid kernel yet? I'm on a production server and can't really test it out.