Messages - AlecM

1
Other modules / DHCP Leases file garbage?
« on: June 03, 2024, 12:11:58 pm »
Zentyal version 8.0.3.  We have been using Zentyal for many years.

We have recently started having issues with our DHCP causing loss of client device connectivity.  Client devices seem to lose their IP addresses for a period of time before obtaining new ones.

We are a moderately small office and have just one range of DHCP IPs available (currently 10.0.0.59 - 10.0.0.254), the rest being reserved for servers, some dev PCs and other network devices such as printers, switches etc.

Looking at the content of our leases file (/var/lib/dhcp/dhcpd.leases), we see a mix of some very old expired leases (from November 2023), current leases (3rd June 2024) and some leases (current) with some sort of scripting for "on expiry" and "on release".

I don't know if the two script blocks are legitimate entries, since not all entries have this format.

Our leases file is also getting very long, with over ten thousand lines of lease entries (up from eight thousand when I started writing this post), almost all of them dated today.  The line count is inflated by the script blocks shown in the example below.

Example of the scripting:

Code: [Select]
on expiry {
    set ClientIP =
       binary-to-ascii (10, 8, ".", leased-address) ;
    log (debug,
        concat ("Expired: IP: ", ClientIP));
    execute ("/usr/share/zentyal-dhcp/dhcp-dyndns.sh", "delete", ClientIP, "", "0");
  }
  on release {
    set ClientIP =
       binary-to-ascii (10, 8, ".", leased-address) ;
    set ClientDHCID =
       concat (concat (concat (concat (concat (concat (concat (concat (concat (
                                                                               concat
                                                                              (
                                                                             suffix
                                                                              (
                                                                             concat
                                                                              (
                                                                             "0",
                                                                             
                                                                             
                                                                             binary-to-ascii
                                                                              (16
                                                                             ,
                                                                             8,
                                                                             ""
                                                                             ,
                                                                             
                                                                             substring
                                                                              (
                                                                             hardware,
                                                                             1,
                                                                             1)
                                                                             ))
                                                                             ,
                                                                             2)
                                                                             ,
                                                                               ":")
                                                                               ,
                                                                               
                                                                               suffix
                                                                              (
                                                                             concat
                                                                              (
                                                                             "0",
                                                                             
                                                                             
                                                                             binary-to-ascii
                                                                              (16
                                                                             ,
                                                                             8,
                                                                             ""
                                                                             ,
                                                                             
                                                                             substring
                                                                              (
                                                                             hardware,
                                                                             2,
                                                                             1)
                                                                             ))
                                                                             ,
                                                                             2)
                                                                       ), ":"),
                                                               
                                                               suffix (concat (
                                                                               "0",
                                                                               
                                                                               
                                                                               binary-to-ascii
                                                                              (16
                                                                             ,
                                                                             8,
                                                                             ""
                                                                             ,
                                                                             
                                                                             substring
                                                                              (
                                                                             hardware,
                                                                             3,
                                                                             1)
                                                                               ))
                                                                       , 2)),
                                                       ":"),
                                               suffix (concat ("0",
                                                               binary-to-ascii
                                                               (16, 8, "",
                                                                substring (
                                                                           hardware,
                                                                4, 1))), 2)),
                                       ":"),
                               suffix (concat ("0",
                                               binary-to-ascii (16, 8, "",
                                                                substring (
                                                                           hardware,
                                                                5, 1))), 2)),
                       ":"),
               suffix (concat ("0",
                               binary-to-ascii (16, 8, "",
                                                substring (hardware, 6, 1))), 2
               )) ;
    log (debug,
        concat ("Release: IP: ", ClientIP));
    execute ("/usr/share/zentyal-dhcp/dhcp-dyndns.sh", "delete", ClientIP, ClientDHCID);
  }


Can anyone enlighten me as to whether we have a buggy DHCP service (and if so, what I should do to remedy it), and whether I should try deleting the old or oddly-formed lease entries from the file in an effort to resolve it?

(I have made a backup copy of the file already.)
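
To put a number on how much of the file is repetition, here is a minimal sketch that counts lease blocks against distinct addresses. It assumes the usual `lease <ip> {` opening line; `lease_summary` and the example path are hypothetical names for illustration. As far as I understand, dhcpd appends a fresh entry each time a lease changes, so many blocks for the same address would not be unusual in itself.

```shell
# Summarise a dhcpd.leases-style file: lease blocks versus distinct addresses.
# lease_summary is a hypothetical helper name, not part of Zentyal or dhcpd.
lease_summary() {
    awk '$1 == "lease" && $3 == "{" { total++; seen[$2] = 1 }
         END {
             distinct = 0
             for (ip in seen) distinct++
             printf "%d lease blocks, %d distinct addresses\n", total, distinct
         }' "$1"
}

# Usage (on the backup copy rather than the live file; the path is an example):
# lease_summary /path/to/dhcpd.leases.backup
```

If the number of blocks is far larger than the number of addresses in the range, the growth is coming from repeated renewals rather than from new clients.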

Thanks in advance,
Alec

=============== UPDATE (4th June 24) ================
Applying some basic troubleshooting/elimination steps to our network devices, I turned off our new WiFi AP (a Ubiquiti U7 Pro) and the address loss/reclaiming seems to have stabilised.  It is perhaps a bit early to tell after only a couple of hours, as we had seen things stabilise after the morning anyway, so tomorrow morning should provide the real test of whether that device had been doing something rogue on the LAN.  It had applied an update back on 9th May (to v. 7.0.47), and we think that is around when we started seeing the connectivity issues, but we are not sure why they became increasingly worse during the last couple of weeks.

Have ordered a pair of NetGear APs to test/replace the Ubiquiti stuff...
==============================================

2
Thanks Daniel!
You were quite correct, our mysql was not set to start with the system.
Code: [Select]
~$ sudo systemctl is-enabled mysql
disabled
So enabled as you indicated:
Code: [Select]
~$ sudo systemctl enable mysql
Synchronizing state of mysql.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable mysql
Created symlink /etc/systemd/system/multi-user.target.wants/mysql.service → /lib/systemd/system/mysql.service.

Checked the permissions on the log folder - I think it's correct? (mysql:adm)
Code: [Select]
~$ sudo ls -ld /var/log/mysql/
drwxrwx--- 2 mysql adm 4096 Apr 30 00:00 /var/log/mysql/

Thanks again Daniel.  I have just checked the service status after 24 hours, to allow for log rotation, and happy to report that it is still "enabled" at this point:

Code: [Select]
~$ sudo systemctl is-enabled mysql
enabled
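
For checking both things in one go, a minimal sketch: it reports whether a unit is set to start at boot and whether it is running right now, since the two can differ. `systemctl is-enabled` and `systemctl is-active` are standard subcommands; `service_state` is a hypothetical helper name.

```shell
# Report boot-time and current state for one systemd unit.
# service_state is a hypothetical helper name.
service_state() {
    # Both subcommands exit non-zero for "disabled"/"inactive", hence "|| true".
    enabled=$(systemctl is-enabled "$1" 2>/dev/null || true)
    active=$(systemctl is-active "$1" 2>/dev/null || true)
    printf '%s: enabled=%s active=%s\n' "$1" "${enabled:-unknown}" "${active:-unknown}"
}

# Usage: service_state mysql
```

A result of enabled=disabled with active=active would mean the service is up now but will not come back by itself after a reboot.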

4
I had some issues after upgrading too, with both the Antivirus and MySQL, resolved with the help of steps provided by Daniel - thank you!

Hi,

We are debugging the antivirus issue, we believe three things should be analyzed:

1. The server was not rebooted after the upgrade, so the directory /var/run/clamav does not belong to the owner clamav.


 Ensure the directory /var/run/clamav belongs to clamav as chapderprinz suggested.
   
    sudo chown -R clamav /var/run/clamav/
   

4. Restart the module through the CLI:
   
    sudo zs antivirus restart
   


... <snip> ..

Best regards, Daniel Joven.


That fixed the same antivirus issue for us.

And the MySQL error on trying to login:

Code: [Select]
2024/04/26 11:04:02 ERROR> MyDBEngine.pm:200 EBox::MyDBEngine::_connect - Connection DB Error: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 at Connection DB Error: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 at /usr/share/perl5/EBox/MyDBEngine.pm line 200

The following fixed that issue (copied from another post by Daniel and edited down a bit):
Quote
check mysql running status:

   
Code: [Select]
sudo systemctl status mysql   
Status, inactive:

Code: [Select]
○ mysql.service - MySQL Community Server
     Loaded: loaded (/lib/systemd/system/mysql.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
   
If it is stopped (inactive), run the following commands:
   
Code: [Select]
sudo mkdir -p /var/log/mysql/
sudo chown -R mysql:adm /var/log/mysql
sudo chmod -R 0770 /var/log/mysql
sudo systemctl restart mysql
   

Status Update on 29/04/2024 - issues returned
Issues continue:  We have been getting intermittent network connectivity issues since the upgrade, affecting users on DHCP-leased IPs.  Systems with static IPs (a couple of users and various printers and servers) seem unaffected, so I suspect DHCP, but I am not totally sure it is the cause.  Our address lease time was very short (2 hours), so I have updated this to a 24-hour lease duration.

Looking at the /var/log/syslog file we also see several entries for the following error that seems to relate to DNS:
Code: [Select]
Apr 29 09:28:43 titan sh[3185911]: ERROR(runtime): uncaught exception - (5, 'WERR_ACCESS_DENIED')
Apr 29 09:28:43 titan sh[3185911]:   File "/usr/lib/python3/dist-packages/samba/netcmd/__init__.py", line 186, in _run
Apr 29 09:28:43 titan sh[3185911]:     return self.run(*args, **kwargs)
Apr 29 09:28:43 titan sh[3185911]:   File "/usr/lib/python3/dist-packages/samba/netcmd/dns.py", line 1235, in run
Apr 29 09:28:43 titan sh[3185911]:     raise e
Apr 29 09:28:43 titan sh[3185911]:   File "/usr/lib/python3/dist-packages/samba/netcmd/dns.py", line 1223, in run
Apr 29 09:28:43 titan sh[3185911]:     dns_conn.DnssrvUpdateRecord2(dnsserver.DNS_CLIENT_VERSION_LONGHORN,

But we are unsure if this had anything to do with the connectivity issues, as the entry occurred again more recently in the file (12:17), but no-one reported a problem at that time.
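
To compare the error times against when users actually report drop-outs, a minimal sketch that counts matching syslog lines per hour. It assumes the classic syslog timestamp layout seen above (month, day, HH:MM:SS); `errors_per_hour` is a hypothetical helper name.

```shell
# Count log lines containing a fixed string, grouped by month, day and hour.
# errors_per_hour is a hypothetical helper; $1 is the log file, $2 the text to find.
errors_per_hour() {
    awk -v pat="$2" '
        index($0, pat) {
            split($3, t, ":")                  # $3 is HH:MM:SS in this log format
            count[$1 " " $2 " " t[1] ":00"]++
        }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort                              # plain text sort, not calendar order
}

# Usage: errors_per_hour /var/log/syslog WERR_ACCESS_DENIED
```

Hours with errors but no reported problems (like the 12:17 entry) would suggest the DNS error is not what is dropping clients.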

Additionally, the AntiVirus and MySql issues returned on our PDC (which also runs the DHCP service), with the antivirus stopping and not restarting.
Tried to reset the permission on the clamav folder, but result was the directory not found:
Code: [Select]
sudo chown -R clamav /var/run/clamav/
chown: cannot access '/var/run/clamav/': No such file or directory

So as per the upgrade guidance instructions on https://doc.zentyal.org/en/upgrade.html#antivirus-module re-ran the dpkg command for clamav:

Code: [Select]
sudo dpkg-reconfigure clamav-daemon
Service still failed to start.  Result in the error log showed:

Code: [Select]
Mon Apr 29 11:48:30 2024 -> ERROR: LOCAL: Could not create socket directory: /var/run/clamav: Permission denied
Mon Apr 29 11:48:30 2024 -> ERROR: LOCAL: Socket file /var/run/clamav/clamd.ctl could not be bound: No such file or directory

I re-ran the clamav config

Code: [Select]
sudo dpkg-reconfigure clamav-daemon
This time, I selected YES to handling the configuration file automatically - the opposite of the upgrade guidance.  I went through that accepting the defaults - it recreated the missing /var/run/clamav/ folder for us, and added the missing clamd.ctl file.
This seems to have sorted it, the service has remained running for the last 15 minutes...

MySQL
The mysql error occurred again when trying to check the DHCP module from the web admin (though the dashboard was displaying OK).

Checked the mysql status in CLI:
Code: [Select]
~$ sudo systemctl status mysql
[sudo] password for copeadmin:
○ mysql.service - MySQL Community Server
     Loaded: loaded (/lib/systemd/system/mysql.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

Checked the current permissions in the mysql log directory:
Code: [Select]
~$ sudo ls -lh /var/log/mysql
total 12K
-rw-r----- 1 mysql adm     0 Apr 29 00:00 error.log
-rw-r----- 1 mysql adm    20 Apr 28 00:00 error.log.1.gz
-rw-r----- 1 mysql adm    20 Apr 27 00:00 error.log.2.gz
-rw-r----- 1 mysql mysql 942 Apr 26 19:05 error.log.3.gz
Note the difference in owner for the file dated 26/04, when it ought to match the other 3 files (mysql:adm).  This had previously been corrected on the morning of 26/04.  I had rebooted the server after office hours, but I'm afraid I can't recall the exact time - I don't think it was as late as 19:00 though.
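
A quick way to spot such mismatches without reading the listing by eye, as a minimal sketch: it prints anything under a directory whose owner:group is not the expected pair. `list_wrong_owner` is a hypothetical helper name, and `stat -c` is the GNU coreutils form.

```shell
# Print "owner:group path" for everything under a directory whose
# ownership differs from the expected owner:group pair.
# list_wrong_owner is a hypothetical helper name.
list_wrong_owner() {
    find "$1" -exec stat -c '%U:%G %n' {} + | awk -v want="$2" '$1 != want'
}

# Usage (from a root shell, since the directory is not world-readable):
# list_wrong_owner /var/log/mysql mysql:adm
```

No output means everything matches; running it after each log rotation would show whether the rotation is what resets the ownership.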

So I reapplied the chown and chmod commands:
Code: [Select]
sudo chown -R mysql:adm /var/log/mysql
sudo chmod -R 0770 /var/log/mysql

Result:
Code: [Select]
~$ sudo ls -lh /var/log/mysql
total 12K
-rwxrwx--- 1 mysql adm   0 Apr 29 00:00 error.log
-rwxrwx--- 1 mysql adm  20 Apr 28 00:00 error.log.1.gz
-rwxrwx--- 1 mysql adm  20 Apr 27 00:00 error.log.2.gz
-rwxrwx--- 1 mysql adm 942 Apr 26 19:05 error.log.3.gz
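
One side effect visible in that listing is that `chmod -R 0770` also sets the execute bits on the log files themselves (they were -rw-r----- before). A minimal sketch of a more selective version follows, assuming 0770 for the directory and 0640 for the files is what is wanted; the 0640 is simply taken from the earlier listing, and `reset_log_modes` is a hypothetical helper name.

```shell
# Set directories to 0770 and regular files to 0640 under the given path,
# instead of applying one mode to everything with chmod -R.
# reset_log_modes is a hypothetical helper name.
reset_log_modes() {
    find "$1" -type d -exec chmod 0770 {} +
    find "$1" -type f -exec chmod 0640 {} +
}

# Usage (from a root shell): reset_log_modes /var/log/mysql
```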

And then restarted mysql again:

Code: [Select]
sudo systemctl restart mysql

And check status again - OK:

Code: [Select]
~$ sudo systemctl status mysql
● mysql.service - MySQL Community Server
     Loaded: loaded (/lib/systemd/system/mysql.service; disabled; vendor preset: enabled)
     Active: active (running) since Mon 2024-04-29 10:24:57 BST; 4s ago
    Process: 3236229 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
   Main PID: 3236248 (mysqld)
     Status: "Server is operational"
      Tasks: 39 (limit: 7071)
     Memory: 437.9M
        CPU: 1.527s
     CGroup: /system.slice/mysql.service
             └─3236248 /usr/sbin/mysqld

So it's OK again for now - but it is worrying that I had to re-apply the permission changes on the log directory.  I don't want to be having to re-apply this every day!


6
Installation and Upgrades / Re: Zentyal and MS SQL
« on: July 08, 2020, 11:53:09 am »
I haven't seen a note for installing MS-SQL via a Snap or Flatpak, but it does officially support Docker - so that might be an option to resolve that dependency conflict under your config rather than using another VM?
https://docs.microsoft.com/en-us/sql/linux/quickstart-install-connect-docker?view=sql-server-ver15&pivots=cs1-bash

7
I was facing a similar conundrum, as the documentation on the Zentyal Wiki lacks detailed examples in this area, but I believe I have sorted it for our case (using Zentyal 6.0.1).

I realise this thread is now classed as quite old, but thought I'd post my solution in case anyone else came across the topic in a search result like I did.

Scenario:
Our HR team have an existing top-level share, and had added a sub-sub-folder in that, which they wanted to share directly to our reception team without giving access to all the higher level content.

In the "Share path" text box of the Zentyal File Share editor, I pasted the sub-folder path as the HR users see it, then changed the back-slashes (\) that Windows uses in the path to forward slashes (/).

Zentyal allowed that.  Once that stage was saved, I then added the access permissions by using the Domain security group method.

The permissions can take a little while to apply - possibly varying by how much content it needs to process, so some patience is required while the system does that.
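
For anyone doing this for several folders, the slash conversion can be sketched as a one-line helper. `to_share_path` and the folder names in the usage line are hypothetical, purely for illustration.

```shell
# Convert a Windows-style path fragment to the forward-slash form.
# to_share_path is a hypothetical helper name.
to_share_path() {
    printf '%s\n' "$1" | tr '\\' '/'
}

# Usage (made-up folder names): to_share_path 'HR\Policies\Reception'
```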

Hope that helps anyone else looking to achieve the same type of configuration.
Alec

8
Ok - so answering my own question...
On our BDC (which I felt was the lower risk if things went wrong), I ran:

Code: [Select]
apt-get upgrade
As per suggestion in the "how to make a good post" sticky... ::)

And once that completed (image 06) then checked the admin web dashboard, which confirmed the status (image 07).

9
Supplemental: (Since I cannot add more attachments when editing an existing message)
Using
Code: [Select]
apt-cache show zentyal-core
the pending update is for 5.0.10 (image 05 attached).

So I'm puzzled why this isn't showing on the web console as a proper system update?
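
A more direct comparison than `apt-cache show` is `apt-cache policy`, which prints the installed and candidate versions side by side. A minimal sketch that boils its output down to one line follows; `pending_upgrade` is a hypothetical helper name, and it only reads the "Installed:" and "Candidate:" lines.

```shell
# Read `apt-cache policy <package>` output on stdin and report whether the
# candidate version differs from the installed one.
# pending_upgrade is a hypothetical helper name.
pending_upgrade() {
    awk '$1 == "Installed:" { inst = $2 }
         $1 == "Candidate:" { cand = $2 }
         END {
             # Appending "" forces a string comparison of the version numbers.
             if ((inst "") != (cand "")) print inst " -> " cand
             else print "up to date"
         }'
}

# Usage: apt-cache policy zentyal-core | pending_upgrade
```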

10
We have two Zentyal systems (PDC and BDC), both running 5.0.9.  On the web admin dashboard, under the "Software" status, we are seeing "1 component updates" (sic), but when we click on that status to view the Zentyal Components page, there is nothing listed under the "Update" list.  Screen-shots 01 and 02 attached below.

Accessing the system via PuTTY, the post-login summary also indicates a package to update (image 03).

Running "apt-get upgrade" (after autoclean, autoremove and update), apt indicates 2 packages, one of which is zentyal-core (image 04).

Cancelling that upgrade, logging out and back in again, the summary still lists only 1 package to upgrade, even though apt is actually listing 2.

Is anyone else seeing this (or been through allowing the upgrade without problems)?

11
Thought I'd come back and update our status on this.  Dolanj00's suggested script fix was spot on to correct the ad-migrate script (giving it the target language/engine to use).

I ended up disabling the AD service on the Windows 2008 server and adding a second Zentyal system ("pluto") as a BDC.  This has stabilised most of the authentication issues we had experienced, but it is by no means a fully resolved situation, as some Windows server-based authentication processes will still fail. 

The two problem areas I've discovered so far have been:
  • Trying to launch a Hyper-V machine console (from the Hyper-V admin window on my local PC) will fail to authenticate the user login for the connection, even with the domain administrator credentials.  Launching the Hyper-V virtual console directly on the Hyper-V host (via an RDP session) does still work.
  • Launching Visual Studio with a project that is under Team Foundation Services (TFS) source code control. The TFS server (a Windows 2008 server) fails to authenticate my domain credentials - yet it manages to "see" enough to be able to lock my account if I keep trying  :-[.

Weirdly, using Remote Desktop (RDP) from my PC to the Hyper-V VM's still works fine (thankfully).

Looking at the Windows server error logs, the Hyper-V issue seems to relate to the Group Policy folders not being present in the Zentyal version of the Sysvol path - that is, \\[domain]\pdc\[domain]\sysvol\Policies (actual error Event ID 1096).  Some folks suggested setting a local machine policy to override the domain policy to allow the supplied credentials to work - but when trying to apply the change using "gpupdate /force", the system errors out reporting the lack of the sysvol policy folder (which seems a rather circular problem).

I managed to get a copy of the Policies folder and its content (GUID-named sub-folders) from the Windows server ("neptune"), but I cannot copy these to the Zentyal sysvol area - the system tells me I don't have permission (I've got domain admins membership).  This permission setting may be an intentional part of how sysvol operates, but I don't know.

I have recently found several other Zentyal forum posts that touch on the same "sysvol" issue, dating back a few years - so this is (sadly) not a new problem:

https://forum.zentyal.org/index.php?topic=22923.0  (The noted summary in this matches almost exactly with the start of the issues I observed on our systems)
https://forum.zentyal.org/index.php?topic=30507.0
https://forum.zentyal.org/index.php/topic,23116.msg89031.html#msg89031

Following some of the suggested tests/resolution steps posted by Zentyal staff in the last linked page, samba-tool is returning Python script errors.
On the PDC ("titan"), command:

Code: [Select]
sudo samba-tool ntacl sysvolcheck
Gives the following (note the "ERROR" after opening the idmap data file):

Code: [Select]
lp_load_ex: refreshing parameters
Initialising global parameters
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[global]"
Processing section "[homes]"
Processing section "[IT]"
Processing section "[ScanArchive]"
Processing section "[COPEDocs]"
Processing section "[MedicallyConfidential]"
Processing section "[Public]"
Processing section "[WPT]"
Processing section "[HP_Scans]"
Processing section "[netlogon]"
Processing section "[sysvol]"
ldb_wrap open of idmap.ldb
ERROR(<type 'exceptions.TypeError'>): uncaught exception - (61, 'No data available')
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 176, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/ntacl.py", line 270, in run
    lp)
  File "/usr/lib/python2.7/dist-packages/samba/provision/__init__.py", line 1714, in checksysvolacl
    fsacl = getntacl(lp, dir_path, direct_db_access=direct_db_access, service=SYSVOL_SERVICE)
  File "/usr/lib/python2.7/dist-packages/samba/ntacls.py", line 81, in getntacl
    xattr.XATTR_NTACL_NAME)

Checking replication status - and I expect the Neptune server synch to fail as I've turned off the AD service on it:

Code: [Select]
sudo samba-tool drs showrepl
Gives:

Code: [Select]
ldb_wrap open of secrets.ldb
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'spnego' registered
GENSEC backend 'schannel' registered
GENSEC backend 'naclrpc_as_system' registered
GENSEC backend 'sasl-EXTERNAL' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'ntlmssp_resume_ccache' registered
GENSEC backend 'http_basic' registered
GENSEC backend 'http_ntlm' registered
GENSEC backend 'krb5' registered
GENSEC backend 'fake_gssapi_krb5' registered
Using binding ncacn_ip_tcp:titan.copeohs.com[,seal]
resolve_lmhosts: Attempting lmhosts lookup for name titan.copeohs.com<0x20>
resolve_lmhosts: Attempting lmhosts lookup for name titan.copeohs.com<0x20>
resolve_lmhosts: Attempting lmhosts lookup for name titan.copeohs.com<0x20>
HeadOffice\TITAN
DSA Options: 0x00000001
DSA object GUID: 5f9f0c17-282e-47b2-ac00-40edb9d29b74
DSA invocationId: 87a25fb1-187b-48dd-b885-a2fd93d8e6ee

==== INBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:29 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5514 consecutive failure(s).
                Last success @ Fri Sep  1 08:11:45 2017 BST

DC=DomainDnsZones,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:29 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:29 2017 BST

CN=Configuration,DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:29 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5515 consecutive failure(s).
                Last success @ Fri Sep  1 08:11:45 2017 BST

CN=Configuration,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:29 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:29 2017 BST

CN=Schema,CN=Configuration,DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:29 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5514 consecutive failure(s).
                Last success @ Fri Sep  1 08:11:45 2017 BST

CN=Schema,CN=Configuration,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:29 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:29 2017 BST

DC=ForestDnsZones,DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:29 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5514 consecutive failure(s).
                Last success @ Fri Sep  1 08:11:45 2017 BST

DC=ForestDnsZones,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:29 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:29 2017 BST

DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:30 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5513 consecutive failure(s).
                Last success @ Fri Sep  1 08:12:56 2017 BST

DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:30 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:30 2017 BST

==== OUTBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ NTTIME(0) was successful
                0 consecutive failure(s).
                Last success @ NTTIME(0)

CN=Configuration,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ NTTIME(0) was successful
                0 consecutive failure(s).
                Last success @ NTTIME(0)

CN=Schema,CN=Configuration,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ NTTIME(0) was successful
                0 consecutive failure(s).
                Last success @ NTTIME(0)

DC=ForestDnsZones,DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ NTTIME(0) was successful
                0 consecutive failure(s).
                Last success @ NTTIME(0)

DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ NTTIME(0) was successful
                0 consecutive failure(s).
                Last success @ NTTIME(0)

==== KCC CONNECTION OBJECTS ====

Connection --
        Connection name: 6f8dd7a5-2ad2-4354-a5a9-099fd08bf301
        Enabled        : TRUE
        Server DNS name : pluto.copeohs.com
        Server DN name  : CN=NTDS Settings,CN=PLUTO,CN=Servers,CN=HeadOffice,CN=Sites,CN=Configuration,DC=copeohs,DC=com
                TransportType: RPC
                options: 0x00000001
Warning: No NC replicated for Connection!
Connection --
        Connection name: f3671125-06e5-464f-8cd3-57313e483a20
        Enabled        : TRUE
        Server DNS name : Neptune.copeohs.com
        Server DN name  : CN=NTDS Settings,CN=NEPTUNE,CN=Servers,CN=HeadOffice,CN=Sites,CN=Configuration,DC=copeohs,DC=com
                TransportType: RPC
                options: 0x00000001
Warning: No NC replicated for Connection!
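
With a listing this long it is easy to miss which neighbours are actually failing. A minimal sketch, assuming the text above is saved from a `samba-tool drs showrepl`-style report and that each neighbour block keeps the same line order shown here; the function name `failing_neighbors` and the `SAMPLE` text are illustrative only:

```python
import re

# Two neighbour blocks copied from the report above (illustrative input).
SAMPLE = r"""
DC=copeohs,DC=com
        HeadOffice\NEPTUNE via RPC
                DSA object GUID: 24e8a117-37b2-4146-9474-da7c45278313
                Last attempt @ Wed Sep 20 11:40:30 2017 BST failed, result 1234 (WERR_PORT_UNREACHABLE)
                5513 consecutive failure(s).
                Last success @ Fri Sep  1 08:12:56 2017 BST

DC=copeohs,DC=com
        HeadOffice\PLUTO via RPC
                DSA object GUID: ac19e956-ee12-4a4c-943d-3b0883f33c74
                Last attempt @ Wed Sep 20 11:40:30 2017 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Sep 20 11:40:30 2017 BST
"""


def failing_neighbors(report):
    """Return (naming_context, server, failure_count, result_name) per failing neighbour."""
    failures = []
    context = server = result = None
    for raw in report.splitlines():
        line = raw.strip()
        if re.match(r"(DC|CN)=", line):
            context = line                      # naming context header
        elif line.endswith(" via RPC"):
            server = line[: -len(" via RPC")]   # e.g. HeadOffice\NEPTUNE
        elif line.startswith("Last attempt @"):
            match = re.search(r"failed, result (\d+) \((\w+)\)", line)
            result = match.group(2) if match else None
        elif line.endswith("consecutive failure(s)."):
            count = int(line.split()[0])
            if count > 0:
                failures.append((context, server, count, result))
    return failures


for entry in failing_neighbors(SAMPLE):
    print(entry)
```

Run against the full report, this should list only the NEPTUNE entries, since every PLUTO entry shows 0 consecutive failures.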

I have not yet been brave enough to try the samba-tool ntacl sysvolreset command.

Additionally, I have tried to run DCPROMO on Neptune to demote it, but this fails because of the state it thinks it is in (FSMO master but not verified, which I couldn't correct even by forcibly seizing the roles).  Trying to delete it from the Windows domain admin tools fails with a "script not present" error.

Would greatly appreciate feedback on what I should try next....

EDIT
Found another posted thread https://forum.zentyal.org/index.php?topic=23892.0 where the Group Policy problem was mentioned again...
This also mentioned running the sysvolreset samba command, so I have now tried this - but again there are Python script errors reported (scroll to bottom of output for this):

Code: [Select]
lp_load_ex: refreshing parameters
Initialising global parameters
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[global]"
Processing section "[homes]"
Processing section "[IT]"
Processing section "[ScanArchive]"
Processing section "[COPEDocs]"
Processing section "[MedicallyConfidential]"
Processing section "[Public]"
Processing section "[WPT]"
Processing section "[HP_Scans]"
Processing section "[netlogon]"
Processing section "[sysvol]"
ldb_wrap open of idmap.ldb
lp_load_ex: refreshing parameters
Processing section "[global]"
Processing section "[homes]"
Processing section "[IT]"
Processing section "[ScanArchive]"
Processing section "[COPEDocs]"
Processing section "[MedicallyConfidential]"
Processing section "[Public]"
Processing section "[WPT]"
Processing section "[HP_Scans]"
Processing section "[netlogon]"
Processing section "[sysvol]"
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Module 'acl_xattr' loaded
Initialising custom vfs hooks from [dfs_samba4]
Module 'dfs_samba4' loaded
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service Unknown Service (snum == -1)
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Initialising custom vfs hooks from [dfs_samba4]
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service Unknown Service (snum == -1)
lp_load_ex: refreshing parameters
Processing section "[global]"
Processing section "[homes]"
Processing section "[IT]"
Processing section "[ScanArchive]"
Processing section "[COPEDocs]"
Processing section "[MedicallyConfidential]"
Processing section "[Public]"
Processing section "[WPT]"
Processing section "[HP_Scans]"
Processing section "[netlogon]"
Processing section "[sysvol]"
ldb_wrap open of idmap.ldb
ldb_wrap open of idmap.ldb
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Initialising custom vfs hooks from [dfs_samba4]
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service sysvol
unpack_nt_owners: owner sid mapped to uid 0
unpack_nt_owners: group sid mapped to gid 4
set_nt_acl: chown /var/lib/samba/sysvol. uid = 0, gid = 4.
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Initialising custom vfs hooks from [dfs_samba4]
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service sysvol
unpack_nt_owners: owner sid mapped to uid 0
unpack_nt_owners: group sid mapped to gid 4
set_nt_acl: chown /var/lib/samba/sysvol/copeohs.com/scripts. uid = 0, gid = 4.
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Initialising custom vfs hooks from [dfs_samba4]
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service sysvol
unpack_nt_owners: owner sid mapped to uid 0
unpack_nt_owners: group sid mapped to gid 4
set_nt_acl: chown /var/lib/samba/sysvol/copeohs.com. uid = 0, gid = 4.
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
idmap range not specified for domain '*'
Initialising default vfs hooks
Initialising custom vfs hooks from [/[Default VFS]/]
Initialising custom vfs hooks from [acl_xattr]
Initialising custom vfs hooks from [dfs_samba4]
connect_acl_xattr: setting 'inherit acls = true' 'dos filemode = true' and 'force unknown acl user = true' for service sysvol
open: error=2 (No such file or directory)
ERROR(runtime): uncaught exception - (-1073741823, 'Undetermined error')
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 176, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/ntacl.py", line 239, in run
    lp, use_ntvfs=use_ntvfs)
  File "/usr/lib/python2.7/dist-packages/samba/provision/__init__.py", line 1609, in setsysvolacl
    set_gpos_acl(sysvol, dnsdomain, domainsid, domaindn, samdb, lp, use_ntvfs, passdb=s4_passdb)
  File "/usr/lib/python2.7/dist-packages/samba/provision/__init__.py", line 1502, in set_gpos_acl
    use_ntvfs=use_ntvfs, skip_invalid_chown=True, passdb=passdb, service=SYSVOL_SERVICE)
  File "/usr/lib/python2.7/dist-packages/samba/ntacls.py", line 162, in setntacl
    smbd.set_nt_acl(file, security.SECINFO_OWNER | security.SECINFO_GROUP | security.SECINFO_DACL | security.SECINFO_SACL, sd, service=service)
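
In the output above, the chown succeeds on sysvol, sysvol/copeohs.com and sysvol/copeohs.com/scripts, and the `open: error=2 (No such file or directory)` appears just before the traceback from set_gpos_acl. One possible reading (an assumption, not something the log confirms) is that a directory the command expects under sysvol does not exist. A minimal sketch to check this, assuming the usual AD layout where a `Policies` folder sits next to `scripts`; the function name `missing_sysvol_dirs` is hypothetical:

```python
import os


def missing_sysvol_dirs(sysvol_root, dns_domain, subdirs=("Policies", "scripts")):
    """List expected sysvol directories that are not present on disk.

    The "Policies" entry is an assumption based on the standard AD sysvol
    layout; the other paths appear in the sysvolreset output.
    """
    domain_dir = os.path.join(sysvol_root, dns_domain)
    expected = [sysvol_root, domain_dir]
    expected += [os.path.join(domain_dir, name) for name in subdirs]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # Paths taken from the log above; run this on the Zentyal server itself.
    for path in missing_sysvol_dirs("/var/lib/samba/sysvol", "copeohs.com"):
        print("missing:", path)
```

If nothing is reported missing, the error is more likely to come from an individual policy folder, which this sketch does not inspect.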

12
Thanks to dolanj00 for the debugging contribution - I have not tried that, as we have tried to follow an alternative solution.

After my original post I ended up adding a new Windows 2008 R2 server, installing the AD services and promoting that to be the PDC for our domain.  The Zentyal system is still configured as an Additional Domain Controller.  The old W2K3 server has now been demoted and removed from AD duties and is a simple member server (temporarily retained until all features are migrated elsewhere).

The domain's various FSMO "masters" were transferred to the new W2K8R2 server (named "Neptune").  The Forest Functional Level was raised to "Windows Server 2008 R2".

At this point Neptune was logging lots of DNS access issues (the DNS service couldn't connect to the AD PDC, which is also Neptune).  This seemed to be due to a Kerberos problem, which I sorted by stopping the Kerberos Key Distribution Center service (on the Windows PDC, Neptune), setting the service startup to "manual", restarting the PDC, then restarting the Kerberos service and setting its startup back to automatic.

But I'm still seeing issues where the Zentyal server (named "Titan") is failing to sync back to Neptune in just one aspect.  All other directory features seem to be synchronising correctly.  The error log indicates Event ID 8418.

Following the MS troubleshooting guide https://support.microsoft.com/en-gb/help/2734946/troubleshooting-ad-replication-error-8418-the-replication-operation-fa and obtaining a CSV report using the command
Code: [Select]
repadmin /showrepl * /csv > replresults.csv
we see the one-way failure:

showrepl_COLUMNS | Destination DSA Site | Destination DSA | Naming Context | Source DSA Site | Source DSA | Transport Type | Number of Failures | Last Failure Time | Last Success Time | Last Failure Status
showrepl_INFO | HeadOffice | TITAN | DC=DomainDnsZones,DC=copeohs,DC=com | HeadOffice | NEPTUNE | RPC | 0 | 0 | 16/08/2017 11:53 | 0
showrepl_INFO | HeadOffice | TITAN | CN=Configuration,DC=copeohs,DC=com | HeadOffice | NEPTUNE | RPC | 0 | 0 | 16/08/2017 11:56 | 0
showrepl_INFO | HeadOffice | TITAN | CN=Schema,CN=Configuration,DC=copeohs,DC=com | HeadOffice | NEPTUNE | RPC | 0 | 0 | 16/08/2017 11:53 | 0
showrepl_INFO | HeadOffice | TITAN | DC=ForestDnsZones,DC=copeohs,DC=com | HeadOffice | NEPTUNE | RPC | 0 | 0 | 16/08/2017 11:53 | 0
showrepl_INFO | HeadOffice | TITAN | DC=copeohs,DC=com | HeadOffice | NEPTUNE | RPC | 0 | 0 | 16/08/2017 11:53 | 0
showrepl_INFO | HeadOffice | NEPTUNE | DC=copeohs,DC=com | HeadOffice | TITAN | RPC | 20 | 16/08/2017 11:56 | 12/07/2017 07:08 | 8418
showrepl_INFO | HeadOffice | NEPTUNE | CN=Configuration,DC=copeohs,DC=com | HeadOffice | TITAN | RPC | 0 | 0 | 16/08/2017 11:56 | 0
showrepl_INFO | HeadOffice | NEPTUNE | CN=Schema,CN=Configuration,DC=copeohs,DC=com | HeadOffice | TITAN | RPC | 0 | 0 | 16/08/2017 11:56 | 0
showrepl_INFO | HeadOffice | NEPTUNE | DC=ForestDnsZones,DC=copeohs,DC=com | HeadOffice | TITAN | RPC | 0 | 0 | 16/08/2017 11:56 | 0
showrepl_INFO | HeadOffice | NEPTUNE | DC=DomainDnsZones,DC=copeohs,DC=com | HeadOffice | TITAN | RPC | 0 | 0 | 16/08/2017 11:56 | 0
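
To pick the failing link out of a larger report without reading every row, the CSV can be filtered on the "Number of Failures" column. A minimal sketch, assuming the file uses the column names shown in the report header above and quotes the naming context (since it contains commas); `failed_links` and the two-row `SAMPLE` are illustrative, and the failure count of 20 is my reading of the pasted row:

```python
import csv
import io

# Header plus two rows in the shape of the report above (illustrative input).
SAMPLE = (
    "showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,"
    "Source DSA Site,Source DSA,Transport Type,Number of Failures,"
    "Last Failure Time,Last Success Time,Last Failure Status\n"
    'showrepl_INFO,HeadOffice,TITAN,"DC=copeohs,DC=com",HeadOffice,NEPTUNE,'
    "RPC,0,0,16/08/2017 11:53,0\n"
    'showrepl_INFO,HeadOffice,NEPTUNE,"DC=copeohs,DC=com",HeadOffice,TITAN,'
    "RPC,20,16/08/2017 11:56,12/07/2017 07:08,8418\n"
)


def failed_links(csv_text):
    """Return (source, destination, naming_context, status) for rows with failures."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Source DSA"], row["Destination DSA"],
         row["Naming Context"], row["Last Failure Status"])
        for row in rows
        if row["Number of Failures"].strip() not in ("", "0")
    ]


for link in failed_links(SAMPLE):
    print(link)
```

For a real replresults.csv, read the file into a string first; the encoding of a file produced by shell redirection on Windows may not be UTF-8, so that part is left out here.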

Has anyone else come across this situation?

At this point I'm stumped for what to do next - I've considered reinstalling the Zentyal system, or even getting rid of the Zentyal mechanisms altogether and using a "normal" Ubuntu server with Webmin on top instead.

Is it possible/feasible to remove the Active Directory feature from the Zentyal 5 server without losing the file sharing (Samba) capability?  The Zentyal Wiki docs for version 4 seemed to indicate this might be possible (https://wiki.zentyal.org/wiki/En/4.0/Users,_Computers_and_File_Sharing) via "external Active Directory" config, but this info is not present in the v.5 wiki page on the same topic (indicating either that the wiki docs are inconsistent, or that the capability is deprecated).

13
We have a Zentyal (Dev/community) v.5 system running as a BDC for an old Windows AD system.  The PDC is a rather ancient Windows 2003 SP2 system that had the 2008 R2 domain upgrade applied to it (I know, it really ought not to be still in use).

The Zentyal system was originally a 3.5 install, later upgraded to 4.2. The more recent upgrade to v5 totally borked it, so I had to fresh-build it (retaining server name and same modules installed).

Recently the old PDC Windows system has been failing to synch the AD data from the newer Zentyal BDC - this was flagged up when users changed their passwords successfully (apparently serviced by the Z-box), but then when accessing a share or service from the old PDC, their accounts would immediately get locked.  Using the MS resource tool to change the user password on the old server to match their new password resolved the connections for them.

I tried manually synchronising from the BDC Z-box to the PDC using the MS tool "Active Directory Sites and Services", as per the guide doc from Technet.  The synch errored out reporting that "The replication operation failed because of a schema mismatch between the servers involved."

I don't know if this is something caused by a Zentyal update, or really is simply the old Windows system just being old kak.

Either way, I had planned to use the Zentyal-provided Operations Master migration script "ad-migrate" to transfer the PDC role to the Zentyal system, so that I could start to decommission the old Windows server.  But this script fails, reporting back with the following "not found" messages:

Code: [Select]
./ad-migrate: 18: ./ad-migrate: use: not found
./ad-migrate: 19: ./ad-migrate: use: not found
./ad-migrate: 21: ./ad-migrate: use: not found
./ad-migrate: 22: ./ad-migrate: use: not found
./ad-migrate: 25: ./ad-migrate: use: not found
./ad-migrate: 27: ./ad-migrate: use: not found
./ad-migrate: 28: ./ad-migrate: use: not found
./ad-migrate: 29: ./ad-migrate: use: not found
./ad-migrate: 34: ./ad-migrate: Syntax error: Bad function name

Has anyone else encountered this issue?

I had wondered if (in lieu of the ad-migrate not working) I could change the Zentyal Domain setting from "Additional domain controller" to "Domain controller" (ie. standalone), then stop the old Windows AD service, but I really don't want to end up with no working DC at all!
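
On the "use: not found" output above: that message format looks like what sh (dash on Ubuntu) prints when it is handed a script written in another language, and `use` statements suggest Perl, though that is an assumption about ad-migrate rather than something confirmed here. A minimal sketch for checking which interpreter the script declares on its first line; `script_interpreter` is a hypothetical helper name:

```python
def script_interpreter(path):
    """Return the interpreter named on the script's shebang line, or None.

    None means the first line is not a "#!" line, in which case a shell may
    end up trying to run the file itself.
    """
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


if __name__ == "__main__":
    import sys
    # Usage: python3 check_shebang.py ./ad-migrate
    if len(sys.argv) > 1:
        print(script_interpreter(sys.argv[1]))
```

If this reports a Perl interpreter, the script was probably started through sh rather than directly; if it reports None, the shebang line is missing or not on line 1. Either way, starting it explicitly with the intended interpreter would be the next thing to try.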

14
Other modules / OpenVPN connections not supporting all protocols?
« on: May 15, 2017, 03:50:45 pm »
I've recently migrated my small set of users from an old Windows-hosted OpenVPN connection point (which was using bridged mode, via TAP) to a new Zentyal OpenVPN connection, but the new Zentyal-hosted version is giving us some issues.

We have the Zentyal server (version 5) installed as a Secondary Domain Controller within an existing Windows AD network.  It is behind a pfSense-based firewall - so the pfSense firewall is performing the public-WAN to private-LAN port-forwarding/NAT for specific ports.

Our new Zentyal-based VPN problem manifests as some network issues for the client machines (all Windows laptops).  The issues were first noticed in VoIP calls (using an internally-hosted VoIP server, 3CX), where the remote laptop is now getting one-way audio for certain types of call (consistent, see tested scenarios below).  A second observed issue has been network shares not always showing their content properly (content would suddenly disappear then reappear later).

The client computers are all using OpenVPN client 2.4.1 64-bit, running as a Service on system startup to enable the network Route table to be modified (as this requires admin privileges).

Test scenarios for the VoIP call problem:
  • Make a VoIP call from VPN client machine using software phone (3CXPhone) to another internal extension that is also using 3CXPhone - result when the dialled user picks up call is success.
  • Repeat step 1 - result when the dialled user cannot pick up is the call gets directed to Voicemail, but the person who is making the call cannot hear the automated Voicemail menu. (one-way audio).
  • Make a VoIP call from VPN client machine using software phone to an external number - result is the recipient/target will hear the caller, but the caller cannot hear the person they have dialled. (one-way audio).
  • Make calls from a softphone client on the LAN that is not using the VPN - all calls work fully as expected.

I ran some Wireshark captures on the 3CX server to try and compare the successful calls with the one-way calls, but I don't know enough about what to look for to diagnose them properly.

The Zentyal system is running as a QEMU Virtual Machine configured with 6 virtual-CPU and 6GiB RAM on an Ubuntu 16.04 server host.  The Zentyal server is also providing SAMBA shares.

I had originally configured the Zentyal VPN using TUN-mode, but due to the issue I have since added a second config (on a different port) to test using the TAP-mode option.  Unfortunately, the TAP connection version exhibits the same problem.

Note that Remote Desktop Protocol through either tun or tap works very well - hence my topic subject of suspecting the issue relates to specific network protocols.
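
Since RDP works over TCP while VoIP media is normally carried over UDP, one way to narrow this down is to check whether a UDP datagram sent through the VPN gets a reply back. A minimal sketch, assuming the echo side can be run on a LAN host and the probe on a VPN client; the function names and the port number in the usage note are hypothetical, and this does not reproduce what 3CX itself does:

```python
import socket
import threading


def start_udp_echo(host="0.0.0.0", port=0):
    """Start a background UDP echo responder; return (socket, bound_port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def serve():
        try:
            while True:
                data, addr = sock.recvfrom(2048)
                sock.sendto(data, addr)  # send the datagram straight back
        except OSError:
            pass  # socket was closed

    threading.Thread(target=serve, daemon=True).start()
    return sock, sock.getsockname()[1]


def udp_round_trip(host, port, payload=b"probe", timeout=2.0):
    """Send one datagram and report whether the same payload came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        try:
            client.sendto(payload, (host, port))
            data, _ = client.recvfrom(2048)
        except OSError:  # includes timeouts and "port unreachable"
            return False
        return data == payload
```

For example, calling `start_udp_echo(port=40000)` on a LAN machine and `udp_round_trip("<LAN machine IP>", 40000)` from a VPN-connected laptop would show whether replies make it back in both TUN and TAP modes. A False result would point at routing, NAT or firewall handling of UDP return traffic rather than at DNS.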

I cannot seem to find sufficient documentation on the Zentyal WIKI to describe all the option settings for the various modules managed in Zentyal, which is deeply frustrating.  There are guides, but I find that the description of the various settings (why/when to use them and conversely why/when not to) is not discussed thoroughly enough.

Some YouTube videos indicate adding network objects for the internal LAN, but don't configure the object - and this seems to be an unnecessary step anyway, as the OpenVPN module automatically adds the internal LAN (10.0.0.0/24) as an advertised network to the VPN server config.

I've wondered if the issue could be the DNS settings, as the Zentyal server is using forwarding to our pre-existing internal Windows DNS host.  It is not configured for transparent cache mode (again, documentation in the WIKI on this feels skimpy to me and doesn't fully describe the pros and cons of enabling transparent cache mode).

Has anyone else experienced this type of "partial" connectivity with the Zentyal implementation of OpenVPN?

15
I've encountered exactly the same problem with v.5.  Our system is a QEMU VM (4GiB RAM, 4 CPU, 290GiB disk space using LVM), hosted on an Ubuntu 16.04 LTS physical DELL PowerEdge server.

We've been running Zentyal 4.1/4.2 quite happily for a few years as a Backup Domain Controller (within a Windows network), an OpenVPN connection point and a file server for some shares.  Then along came the 5.0 upgrade, which failed miserably, causing loss of the webadmin service and OpenVPN.  Using the command noted elsewhere in the forum to restart the service [sudo zs webadmin restart] just sat on a blank command line.  The Samba shares were still working.

I first tried to re-install a fresh v.5 but preserving the existing data by re-using the existing partitions (without formatting) of the VM... but this hits the same "Starting Hold until boot processes finish" you've reported.  After a few hours of waiting (just in case), I used Ctrl+F2 to get an alternative console - but the screen kept switching back to the first (default) console before I could finish typing the user name...  Then it started flashing between the two console instances, though staying longer on the default and "blinking" the second one periodically.

Having earlier made an external backup of the original user-data prior to the problem, I decided to re-install totally and use the "guided - reuse partition" install option this morning... but this hit the same hang point/message (screen-shot example attached).

So I tried a third time, but this time by erasing and re-creating the LVM volume... this last approach has worked for me this time and I have the webadmin console again, though I now have to recreate the module configurations and shares and restore data.

During all this hassle, I've seriously considered ditching Zentyal and maybe trying to use a "vanilla" Ubuntu system with the Webmin toolset to install/configure Samba and other modules instead.  I've used Webmin on another server, but only to configure a MariaDB database (all the main install occurring at the CLI) for a WordPress site, so although I've read articles indicating we can configure Samba using Webmin, I've not done it personally (yet).
