Author Topic: Zentyal 6: LDAP Bind slow compared to Windows AD (and Zentyal 5?)  (Read 1945 times)

mytfg

  • Zen Apprentice
  • *
  • Posts: 2
  • Karma: +0/-0
    • View Profile
Zentyal 6: LDAP Bind slow compared to Windows AD (and Zentyal 5?)
« on: September 16, 2019, 08:17:18 am »
Hello everyone,

using LDAP for PHP to create and query users from Zentyal, I certainly noticed that the `ldap_bind` call is quite slow compared to the completely same calls towards a Windows AD.

To switch from the Windows AD towards a Zentyal-based one, I set up a Zentyal 5 Server with a completely new AD which we are running in parallel for a year now so we have plenty of time for re-configuring dependent services.
Thus, my (self-implemented) user management platform creates and edits users in both directories simultaneously.

A few months ago, I updated to Zentyal 6 and I think - even though I'm not entirely sure the problem did not exists with 5.1 - the performance drop started there.

The LDAP binding takes around 300ms when binding to zentyal. The time for arbitrary queries afterwards is neglectible.

When binding to the windows AD, it usually takes less than 10ms.

While 300ms is still acceptable, I would really like to know why there is such a difference between the systems and if it is possible to speed up the bind process.

Using wireshark (or tshark), I tried to capture the calls from the frontend server towards the Zentyal Server - the ldap_bind_request takes 300ms to get a reply. However, it receives an ACK nearly immediately.

I tried both, binding to the hostname as well as to the IP address. However, there seems to be no difference since the bind response is slow, the bind request is issued without delay.

This is the PHP time measurement:
Code: [Select]
"New LDAP Connection for Domain 1 (Windows)",
"Connection Time for Domain: 0.0077838897705078s",
"New LDAP Connection for Domain 2 (Zentyal)",
"Connection Time for Domain: 0.2680070400238s"

Edit:
Just another info:
When opening and closing many connections (at max 20 simultaneous connections, but ~3000 consecutives in total), both bind requests get slower. However, then I get about 17 ms for the Windows AD (i.e., about twice the time for a bind) compared to 2.7 seconds = 2700ms (i.e., 10 times slower).

Edit 2:
With the high amount of queries, the delay increases to 10 seconds or even higher. CPU load on the server is at 100%. The top 10 processes when it comes to CPU utilization are as follows:
Code: [Select]
79.5 29707 MYTFG\c+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
35.5 29703 root     /usr/sbin/samba --foreground --no-process-group
24.3 29679 MYTFG\i+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
24.3 29664 MYTFG\i+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
18.7 29626 MYTFG\m+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
14.4 29662 MYTFG\b+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
12.8 29650 MYTFG\e+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
10.2 29631 MYTFG\n+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
 9.7 29532 root     /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
 8.6 29596 MYTFG\s+ /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground

So what internal process can cause this high CPU load?

Edit 3:
Some CPU load was caused by the high logging level I had set - with logging disabled I get lower bind times, but still much slower than when binding to the Windows DC.

I would appreciate any ideas and suggestions.

Best regards  :)
« Last Edit: September 17, 2019, 09:49:39 am by mytfg »

doncamilo

  • Zen Samurai
  • ****
  • Posts: 478
  • Karma: +165/-1
    • View Profile
Re: Zentyal 6: LDAP Bind slow compared to Windows AD (and Zentyal 5?)
« Reply #1 on: September 16, 2019, 05:22:15 pm »
 :)

I don't know if these values could be useful in order to compare with your own experience. I have runned this command:
Code: [Select]
xxxxxxxx@os0:$ time smbclient -U andrea%andrea //10.5.20.61/it_xxx -c 'ls'
Domain=[xxxxx] OS=[Windows 6.1] Server=[Samba 4.7.6-Ubuntu]
  .                                   D        0  Mon Sep 16 17:07:56 2019
  ..                                  D        0  Mon Sep 16 16:34:05 2019

19519312 blocks of size 1024. 14370188 blocks available

real 0m0.375s
user 0m0.053s
sys 0m0.005s

And the SMB packages wireshar capture  of this network traffic is :

Code: [Select]
"No.","Time","Protocol","Info"
"30","4.933379975","SMB","Negotiate Protocol Request"
"36","5.166984619","SMB","Negotiate Protocol Response"
"38","5.170061266","SMB","Session Setup AndX Request, NTLMSSP_NEGOTIATE"
"40","5.175547521","SMB","Session Setup AndX Response, NTLMSSP_CHALLENGE, Error: STATUS_MORE_PROCESSING_REQUIRED"
"41","5.176016696","SMB","Session Setup AndX Request, NTLMSSP_AUTH, User: MOVADE\\andrea"
"42","5.212031517","SMB","Session Setup AndX Response"
"43","5.212430667","SMB","Tree Connect AndX Request, Path: \\\\10.5.20.61\\IPC$"
"44","5.224817835","SMB","Tree Connect AndX Response"
"45","5.225180440","SMB","Trans2 Request, GET_DFS_REFERRAL, File: \\10.5.20.61\\it_zaragoza"
"46","5.226120945","SMB","Trans2 Response, GET_DFS_REFERRAL, Error: STATUS_NOT_FOUND"
"47","5.226308638","SMB","Tree Disconnect Request"
"48","5.227140071","SMB","Tree Disconnect Response"
"49","5.227360123","SMB","Tree Connect AndX Request, Path: \\\\10.5.20.61\\IT_ZARAGOZA"
"50","5.241130641","SMB","Tree Connect AndX Response"
"51","5.241469849","SMB","Trans2 Request, FIND_FIRST2, Pattern: \\*"
"52","5.250918572","SMB","Trans2 Response, FIND_FIRST2, Files: . .."
"53","5.251326676","SMB","Trans2 Request, QUERY_FS_INFO, Query Full FS Size Info"
"54","5.253575413","SMB","Trans2 Response, QUERY_FS_INFO"
"55","5.253777330","SMB","Tree Disconnect Request"
"56","5.255546793","SMB","Tree Disconnect Response"

(The server is a VBOX device with 2GB RAM.)

Cheers!
« Last Edit: September 17, 2019, 02:24:06 pm by doncamilo »
- Do my pigeons bother you passing over your land?
- They block the sun!

G. Guareschi., Don Camillo.,

mytfg

  • Zen Apprentice
  • *
  • Posts: 2
  • Karma: +0/-0
    • View Profile
Re: Zentyal 6: LDAP Bind slow compared to Windows AD (and Zentyal 5?)
« Reply #2 on: September 17, 2019, 10:13:12 am »
Hi,

thanks for your reply  :)

So the part
Code: [Select]
"30","4.933379975","SMB","Negotiate Protocol Request"
"36","5.166984619","SMB","Negotiate Protocol Response"
takes quite a "long" time as well.

I just captured the traffic with Wireshark, too, and again got a similar delay between the Client Hello and the Server Hello.
The actual file / directory access in your caputure is faster than the initial connection as well.

Code: [Select]
"37";"1.085010466";"TLSv1.2";"Client Hello"  (Client to Zentyal)
"38";"1.085036481";"TCP";"ACK"  (Zentyal to Client)
...
"69";"1.512500216";"TLSv1.2";"Server Hello"  (Zentyal to Client)

Between packet number 38 and 69 there are several unrelated packets (broadcasts, VM control traffic, etc.).
This also applies to the case where I disable SSL, so I can see the bindRequest and bindResponse instead of Client Hello and Server Hello. The time difference, however, is the same.
« Last Edit: September 17, 2019, 10:17:05 am by mytfg »

doncamilo

  • Zen Samurai
  • ****
  • Posts: 478
  • Karma: +165/-1
    • View Profile
Re: Zentyal 6: LDAP Bind slow compared to Windows AD (and Zentyal 5?)
« Reply #3 on: September 17, 2019, 01:05:02 pm »
 :)
I like these development questions. They obliges us to think deepliest in some systems administration aspects which we accept without criticism.
Please let me know how your project goes on.
Cheers!

PS: I have done some perl trials and I have found similar response times. It seems to me that it isn't a PHP ldap_bind library problem.
- Do my pigeons bother you passing over your land?
- They block the sun!

G. Guareschi., Don Camillo.,