Messages - LaM

Pages: [1] 2 3
1
Hi,

No, I think I will try to back up and update all 3 this weekend.
Uptimes right now are 14 days, 21 days and 6 days (  :'( still on .56, and it had been up for 38 days, sigh)

I'll keep You updated asap

2
OK... even better. I cannot create a new share (or assign permissions) via the interface because the Samba module's state is unknown.


3
Sigh... after 38 days I had to restart C due to proc 9 being at 100%... she is on kernel 3.19.0-56... I'll update her too and we'll see.

The other 2 are on kernel .58 and have been up for 7 and 14 days.

I'll keep You updated

L

4
Hey Andreas,

thx for the info.
Right now I have C on .56, which has been running for 35 days, and the other two on kernel .58, running for 11 and 4 days. The latter has the Samba module updated.

I'll keep You up-to-date...let's hope for the best... =)

I'd still like to hear something from the devs... at least a sign of life...

L

5
Ok,

I had an issue this afternoon on the server with the older kernel:

Server  |   Kernel  |   Uptime  |   load average  |    Samba load
A   |   3.19.0-51-generic   |   ??:??:?? up 16 days, 15:06   |   0.??, 0.??, 0.??   |   High

Had to reboot 3 times to get it working again.

I can understand that the Zentyal developers want us to buy their stuff... but this way it's really a mess. If You don't want to keep developing, at least tell people about it.

6
Guys!

1 component update: Domain Controller and File Sharing, from 4.2.2 to 4.2.3...

SHOULD WE TRUST IT?

Would it fix our issues?

Opinions?

L

7
That sucks. I just upgraded to that -58 kernel.  :'(

Yeah, me too...

9
Update:
I compared the change logs for both Ubuntu kernels
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_3.13.0-85.129/changelog
That's for kernel v3.13 (Ubuntu 14.04 LTS), which definitely contains the fix assigned to Launchpad ID 1543980 => https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1543980
The changelog contains:
Quote
  * af_unix: Guard against other == sk in unix_dgram_sendmsg
    - LP: #1543980, #1557191

And the one for kernel v3.19 (the kernel used in Zentyal 4.2), build 3.19.0-58.64~14.04.1
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux-lts-vivid/linux-lts-vivid_3.19.0-58.64~14.04.1/changelog
contains as well:
Quote
  * af_unix: Guard against other == sk in unix_dgram_sendmsg
    - LP: #1556297
So obviously the fix has been backported to kernel v3.19 by the Ubuntu maintainers.

So kernel version 3.19.0-58 should fix the 'samba deadlock' aka 'soft lockup - CPU #1' bug and should be safe to use!!! :)  + ;D + 8)
(3.19.0-56 is not... because the fix was integrated in Ubuntu's internal build of kernel 3.19.0-57, probably a test build)

Cheers,
Andreas

OK, I'll take this as gospel; I'll move B to .58 tonight (if nothing happens in the meantime).

Thanks Andreas for everything so far!

I'll keep You up to date

L
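The changelog reasoning quoted above comes down to a version comparison, which could be sketched roughly like this. Treat it as an illustration only: `kernel_base` and `kernel_has_fix` are made-up helper names, the 3.19.0-57 threshold is just what this thread reports, `sort -V` assumes GNU coreutils, and the check is only meaningful for the 3.19 (lts-vivid) kernel series.

```shell
#!/bin/sh
# Sketch only: the -57 threshold comes from the changelog reading in this
# thread; kernel_base and kernel_has_fix are hypothetical helper names.

# Reduce a release string such as "3.19.0-56-generic" to "3.19.0-56".
kernel_base() {
    printf '%s\n' "$1" | sed 's/^\([0-9.]*-[0-9]*\).*/\1/'
}

# Succeed when release $1 is at or above the minimum $2 (default 3.19.0-57).
# Uses GNU sort's version ordering: the minimum must sort first (or be equal).
kernel_has_fix() {
    cur=$(kernel_base "$1")
    min=${2:-3.19.0-57}
    lowest=$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}
```

On a server it could be called as `kernel_has_fix "$(uname -r)" && echo "fix should be included"`. It only compares version numbers; it does not prove the bug is actually gone.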

10
Hi everybody!

Sorry, I was traveling for work and have been away for almost a week.

Before quoting and reporting "news" let me show some new stats (I will call the 3 servers A, B and C):

Server | Kernel | Uptime | load average | Samba load
A | 3.19.0-51-generic | 06:14:14 up 6 days, 19:52 | 0.00, 0.01, 0.05 | High
B | 3.19.0-51-generic | 06:14:16 up 5 days, 17:28 | 0.04, 0.07, 0.12 | mid/low
C | 3.19.0-56-generic | 06:14:19 up 21 days,  8:10 | 0.09, 0.11, 0.10 | mid/high

The 3 servers have exactly the same hardware and differ only in the installed kernel.

Ok. Now:

@Andreas
@LaM and @BerT666:
I don't think that can or should be the test to find out whether a kernel version is affected by this bug or not.  :-\
- data-transfer of 500 GB up to 1 TB...
- or instruct all users to put as much operation on it as possible at the same time...
From an IT standpoint that's destructive for one's own reputation: "Oh yeah, please help me to crash the server"

I wouldn't ask anybody to help stress test my systems; of course it isn't a clever idea even to think about something like that, and it never crossed my mind (nor did I come close to writing anything like that here).
As a system administrator with full control over the server and the clients in the network, I can use some scripts/jobs to stress test the system at night (i.e. outside production time).
Speaking of reputation, I think the real hit to our reputation would be sitting here waiting for the system to collapse (which luckily isn't what we're doing).
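A night-time stress job of the kind mentioned above might look roughly like the sketch below. It is only an illustration: `parallel_writes` is a made-up helper, the target directory would be wherever the share is mounted on a test client, and nothing in this thread confirms that this kind of load actually triggers the lockup.

```shell
#!/bin/sh
# Sketch only: parallel_writes is a hypothetical helper, and this load
# pattern is an assumption, not a confirmed way to reproduce the bug.

# Start $2 background writers, each writing $3 KiB of zeros into
# directory $1, then wait until all of them have finished.
parallel_writes() {
    dir=$1
    jobs=$2
    kib=$3
    i=1
    while [ "$i" -le "$jobs" ]; do
        dd if=/dev/zero of="$dir/stress_$i.bin" bs=1024 count="$kib" 2>/dev/null &
        i=$((i + 1))
    done
    wait
}
```

For example, `parallel_writes /mnt/share-test 20 102400` would start 20 writers of 100 MiB each, where `/mnt/share-test` is a hypothetical mount point. The test files have to be removed afterwards.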

There's a slight difference between waiting for the issue to show up and acting before it does, and as You later wrote in Your post:

I would love to have verified:
  • that the bug is gone
  • and finding a quick test to verify, if a system is affected by this bug or not

and of course I'm with You saying

Where are we with Zentyal at the moment?
but even better
Where are the Zentyal maintainers? Do they read this forum?!

Anyway

(Disclaimer: this is just a hint) I've found out that the Samba shares' RecycleBin folders (and some 'regular' folders too) are filling up with .tmp files.
I think this started after the first buggy kernel update. My bad for ignoring the situation earlier, thinking it was related to something else (and tied to only one directory), but after a quick search I've discovered that my shares' RecycleBin folders are quite full, and it has also been reported to me that these files now show up in regular folders too and are sometimes deleted manually (sigh). Could it be related?
It looks like creating a tmp file is a normal procedure for Samba (in order to preserve files during write operations), but tmp files not always being correctly deleted after normal operations (they shouldn't end up in RecycleBin) is quite strange.
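To get an idea of how many leftover .tmp files are sitting in a folder, something like the following sketch could be used. `list_tmp_files` and `count_tmp_files` are made-up helper names, and the directory to scan (a share's RecycleBin folder, for instance) has to be supplied by whoever runs it.

```shell
#!/bin/sh
# Sketch only: list_tmp_files and count_tmp_files are hypothetical helpers;
# pass in the real path of the folder to scan.

# Print every regular file below $1 whose name ends in .tmp (any case).
list_tmp_files() {
    find "$1" -type f -iname '*.tmp'
}

# Print how many such files exist below $1.
count_tmp_files() {
    list_tmp_files "$1" | wc -l
}
```

Comparing the count before and after a normal working day would show whether the files keep accumulating.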

@Andreas
I'll dig into the temporary patch by Philipp and in this direction:
The bug can be reproduced and confirmed, but obviously only as developer on an affected system (To be honest I don't know how to do it):
Quote
It's easily reproducible by running the following commands in the Samba master branch:
./configure.developer
TDB_NO_FSYNC=1 make -j test FAIL_IMMEDIATELY=1 SOCKET_WRAPPER_KEEP_PCAP=1 TESTS="samba3.raw.composite"

Good call!

L

11
In our case we only have transfers which are less than 1 GB...   =(

I rather think it occurs when there are many concurrent calls... but I'm still not able to produce a test.

@BerT666, did You reproduce the issue?

L

12
That's my point. I would like to find a way to reproduce the issue in order to be sure that it is gone from the installed kernel.
Waiting is not the right option imo. It doesn't give You the assurance that the kernel is bug-free.
E.g. mine ran with .51 and .56 and were fine for days... more than a week (and then one started to crash...)

Honestly, I'm still trying to figure out how to reproduce it. It looks tied to some concurrency in Samba's calls... but I'm not sure.

I'll update You all as soon as I have more info...

L

My servers are in production, so I can't just go and change the kernel; I'll leave them on these versions and hope for the best, lol.

I will report any errors here in the forum.

Mine are production servers too, of course; that's why I don't want to wait for the bug to happen during the day (production time), but would rather stress the system at night (not-so-much production time) to try to find the solution. =)

The 'ip rule ls' command I mentioned earlier has worked for me to test whether the bug exists in a given kernel. See this post for more info:
https://forum.zentyal.org/index.php/topic,26954.msg99367.html#msg99367

Quote
It stems from a bug in the kernel that makes the ip command output the first rule infinitely.  You can use this command to see if you're affected:
ip rule ls

Broken Output:
0:   from all lookup local
0:   from all lookup local
0:   from all lookup local
0:   from all lookup local
0:   from all lookup local
0:   from all lookup local
0:   from all lookup local
<repeats indefinitely - ctrl+c to quit>

In Zentyal, this causes one of the network scripts to hang because it's waiting for that command to end.  This prevents loading of other services and resulted in my network being severely broken.

Besides the previously mentioned fix of rolling back the kernel, you can modify the script in question:
/usr/share/zentyal-network/flush-fwmarks
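Since an affected kernel never lets the listing finish, the check can be wrapped in a time limit so it doesn't need a manual ctrl+c. The sketch below is only an illustration: `cmd_hangs` is a made-up helper, and it relies on coreutils `timeout`, which exits with status 124 when it has to kill the command.

```shell
#!/bin/sh
# Sketch only: cmd_hangs is a hypothetical helper built on coreutils timeout.

# Run a command with a time limit of $1 seconds, discarding its output.
# Succeeds (returns 0) only if the command had to be killed by the timeout.
cmd_hangs() {
    secs=$1
    shift
    rc=0
    timeout "$secs" "$@" >/dev/null 2>&1 || rc=$?
    [ "$rc" -eq 124 ]
}
```

The `0: from all lookup local` lines in the broken output are what `ip rule ls` prints, so a check could look like `if cmd_hangs 5 ip rule ls; then echo "rule listing never finishes, kernel looks affected"; fi`. A listing that ends normally within 5 seconds counts as healthy.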

I'll test tomorrow night and check the script, THX FOR THE HINT!! =)

L

13
That's my point. I would like to find a way to reproduce the issue in order to be sure that it is gone from the installed kernel.
Waiting is not the right option imo. It doesn't give You the assurance that the kernel is bug-free.
E.g. mine ran with .51 and .56 and were fine for days... more than a week (and then one started to crash...)

Honestly, I'm still trying to figure out how to reproduce it. It looks tied to some concurrency in Samba's calls... but I'm not sure.

I'll update You all as soon as I have more info...

L

14
Nice... so .58 looks stable...

But rather than waiting for the issue to show up... isn't there a way to force it?

L

15
Hey guys,

has anyone found a way to force the issue?

I'm running fine on the only updated machine, which runs the .56 kernel... quite strange (now that I've said that, all hell will break loose on that machine  ::) )
uname -a
Linux dccharlie 3.19.0-56-generic #62~14.04.1-Ubuntu SMP Fri Mar 11 11:03:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
uptime
 08:49:24 up 14 days, 10:45,  1 user,  load average: 0.25, 0.17, 0.15
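The `uname` and `uptime` output above can be turned into one pipe-separated row like the stats tables in this thread. This is a rough sketch only: `uptime_part`, `load_part` and `stats_row` are made-up helper names, and the "Samba load" column is left out because it is a manual judgement.

```shell
#!/bin/sh
# Sketch only: uptime_part, load_part and stats_row are hypothetical helpers.

# Keep the "HH:MM:SS up ..." part of an uptime line (drop users and load).
uptime_part() {
    printf '%s\n' "$1" | sed -e 's/, *[0-9]* users\{0,1\},.*//' -e 's/^ *//'
}

# Keep only the three load averages of an uptime line.
load_part() {
    printf '%s\n' "$1" | sed -e 's/.*load average: //'
}

# $1 = server label, $2 = kernel release, $3 = raw uptime line
stats_row() {
    printf '%s | %s | %s | %s\n' "$1" "$2" "$(uptime_part "$3")" "$(load_part "$3")"
}
```

On the server itself a row could be produced with `stats_row C "$(uname -r)" "$(uptime)"`.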

Thx

L


