Author Topic: Removing dead domain controllers  (Read 12893 times)

robgwood

  • Zen Apprentice
  • *
  • Posts: 5
  • Karma: +1/-0
Removing dead domain controllers
« on: August 20, 2013, 01:38:21 pm »
I am experimenting with a Zentyal-only network (3.0.2, core version 3.0.25, community) with multiple domain controllers, and I want to be able to completely remove a failed and unrecoverable additional domain controller from the PDC.
I am testing this as part of my company's disaster recovery planning process.
I think I have tried all of the recommended ways to do this, including ntdsutil, Active Directory Sites and Services, and Active Directory Users and Computers, but none of them work as billed. All of them produce errors and fail to remove all of the references to the "dead" controller from AD.
Has anyone actually managed to completely remove a dead ADC from the domain and then add a newly built machine with the same name back? If so, can you point me to any documentation that shows how to do this?


faustotex

  • Zen Apprentice
  • *
  • Posts: 21
  • Karma: +2/-0
Re: Removing dead domain controllers
« Reply #1 on: August 20, 2013, 03:03:41 pm »
When you run "samba-tool domain join" at a command line, its messages indicate that it cleans up a lot of dead pointers before creating new ones.  I have found that once you identify the uuid of the dead DC by examining its Alias (CNAME) record in the DNS Forward Zone "_msdcs.<YOUR-DOMAIN>", it suffices to remove the following two records from this zone to successfully join another DC of the same name:

1. the uuid Alias itself;

2. the "_ldap._tcp.<uuid>.domains" record;

Note that if you are using the RSAT DNS Manager, the second record will be found under "domains" in the tree browser of the "_msdcs.<YOUR-DOMAIN>" Forward Zone.  The tell-tale uuid will stick out like a sore thumb.

You may have subsidiary Zentyal issues but, generally, a DC of the same name will join successfully once these two DNS records have been removed.
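
As a rough aid for spotting the two records, here is a minimal Python sketch that only builds their fully qualified names. The function name and the sample GUIDs are made up for illustration. It assumes a single-domain forest (so the "_msdcs" zone sits under the domain name) and the usual AD DNS layout, in which the Alias is named after the dead DC's own GUID while the GUID folder under "domains" is normally the domain's GUID, so the two are taken as separate parameters. It does not touch DNS; deleting the records is still done with RSAT or a similar tool.

```python
import uuid


def stale_dc_dns_records(dc_guid, domain_guid, forest_domain):
    """Build the names of the two _msdcs records discussed above.

    Hypothetical helper for illustration only: it formats names and
    performs no DNS queries or changes.
    """
    # uuid.UUID raises ValueError on malformed input, which catches typos
    # and normalises the GUID to lower case.
    dc = str(uuid.UUID(dc_guid))
    dom = str(uuid.UUID(domain_guid))
    zone = "_msdcs." + forest_domain.strip(".").lower()
    return {
        "zone": zone,
        # Alias (CNAME) named after the dead DC's GUID.
        "cname": f"{dc}.{zone}",
        # SRV record found under "domains" in the RSAT tree browser.
        "srv": f"_ldap._tcp.{dom}.domains.{zone}",
    }


if __name__ == "__main__":
    # Both GUIDs below are placeholders, not values from a real domain.
    records = stale_dc_dns_records(
        "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
        "11111111-2222-3333-4444-555555555555",
        "example.lan",
    )
    for kind, name in records.items():
        print(f"{kind}: {name}")
```

The SRV name usually holds one record per DC, so when cleaning up you would remove only the entry that points at the dead host, not the whole name.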

robgwood

Re: Removing dead domain controllers
« Reply #2 on: August 20, 2013, 06:12:49 pm »

Quote from: faustotex
I have found that by identifying the uuid of the dead DC through examination of an Alias(CNAME) record in the DNS Forward Zone "_msdcs.<YOUR-DOMAIN>", then it suffices to remove the following two records from this zone, to successfully join another DC of the same name:

1. the uuid Alias itself;

2. the "_ldap._tcp.<uuid>.domains" record;

To save me some time looking for these objects, can you let me know if they can be accessed and manipulated directly within the file system of the PDC or do I have to use the RSAT tools? I have done a quick search and cannot find any reference to _msdcs in the files on the PDC. Thanks

faustotex

Re: Removing dead domain controllers
« Reply #3 on: August 20, 2013, 06:57:01 pm »
You need to look at the DNS records for your domain.  If you have an XP machine with SP1 or later, download and install "Windows Server 2003 Adminpak".  If you have Windows 7, download and install "RSAT".  It is possible to remove the mentioned records with other free tools such as "Ldapadmin", but it is not as straightforward to confidently find the required uuid of the stale DC as it is with RSAT and its predecessor, 2003 Adminpak.

robgwood

Re: Removing dead domain controllers
« Reply #4 on: August 21, 2013, 08:09:27 am »
Thanks for that, I have now had partial success.

Just to clarify, I want to use the Zentyal GUIs to reattach the replacement machine to the domain. I don't want to bypass their tools by using samba-tool directly to join the domain and end up with Zentyal and the machine in different states of awareness about the configuration of the system. My ambition is to come up with a tool that will strip out a dead machine, allowing my DR plan just to say... "Build a standard Zentyal installation and use the Zentyal GUI to join it to the domain..."

Anyway, what happened? I have 2 machines: pdc, the domain controller, and adc, you guessed it, the additional domain controller. I did as you suggested on pdc and then used Zentyal to add adc back onto the domain. It looked like all had worked, and the users and groups were populated on adc, but my initial test, running "samba-tool drs showrepl" on the adc machine, produced this...

root@adc:~# samba-tool drs showrepl 2>&1
ldb_wrap open of secrets.ldb
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'schannel' registered
GENSEC backend 'spnego' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'krb5' registered
GENSEC backend 'fake_gssapi_krb5' registered
Using binding ncacn_ip_tcp:adc.impact0.lan[,seal]
Wrong username or password: kinit for ADC$@IMPACT0.LAN failed (Client not found in Kerberos database)

SPNEGO(gssapi_krb5) creating NEG_TOKEN_INIT failed: NT_STATUS_LOGON_FAILURE
Got challenge flags:
Got NTLMSSP neg_flags=0x60898235
NTLMSSP: Set final flags:
Got NTLMSSP neg_flags=0x60088235
NTLMSSP Sign/Seal - Initialising with flags:
Got NTLMSSP neg_flags=0x60088235
Wrong username or password: kinit for ADC$@IMPACT0.LAN failed (Client not found in Kerberos database)

SPNEGO(gssapi_krb5) creating NEG_TOKEN_INIT failed: NT_STATUS_LOGON_FAILURE
Got challenge flags:
Got NTLMSSP neg_flags=0x60898205
NTLMSSP: Set final flags:
Got NTLMSSP neg_flags=0x60088205
Default-First-Site-Name\ADC
DSA Options: 0x00000001
DSA object GUID: 2c8e8a01-71df-4714-9671-1fff353b6b32
DSA invocationId: 3dee2ddc-5a70-41a6-9547-86f097a09467

==== INBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\PDC via RPC
                DSA object GUID: bcfdaa72-373e-40f3-ad0e-59c6d3af4d0e
                Last attempt @ Wed Aug 21 06:47:37 2013 BST failed, result 2 (WERR_BADFILE)
                158 consecutive failure(s).
                Last success @ Tue Aug 20 17:47:47 2013 BST

<<text deleted>>
==== OUTBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\PDC via RPC
                DSA object GUID: bcfdaa72-373e-40f3-ad0e-59c6d3af4d0e
                Last attempt @ Wed Aug 21 06:50:46 2013 BST failed, result 2 (WERR_BADFILE)
                18 consecutive failure(s).
                Last success @ NTTIME(0)
<<text deleted>>
==== KCC CONNECTION OBJECTS ====

Connection --
        Connection name: 62f7abbc-40c3-4053-a6cb-a4016603209b
        Enabled        : TRUE
        Server DNS name : pdc.impact0.lan
        Server DN name  : CN=NTDS Settings,CN=PDC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=impact0,DC=lan
                TransportType: RPC
                options: 0x00000001
Warning: No NC replicated for Connection!
root@adc:~#


and on the PDC, this...

root@pdc:~# samba-tool drs showrepl 2>&1
ldb_wrap open of secrets.ldb
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'schannel' registered
GENSEC backend 'spnego' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'krb5' registered
GENSEC backend 'fake_gssapi_krb5' registered
Using binding ncacn_ip_tcp:pdc.impact0.lan[,seal]
Default-First-Site-Name\PDC
DSA Options: 0x00000001
DSA object GUID: bcfdaa72-373e-40f3-ad0e-59c6d3af4d0e
DSA invocationId: eb3a8a1b-c7cd-4960-a6c4-2abb24a83e8c

==== INBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\ADC via RPC
                DSA object GUID: 2c8e8a01-71df-4714-9671-1fff353b6b32
                Last attempt @ Wed Aug 21 06:54:16 2013 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Aug 21 06:54:16 2013 BST
<<text deleted>>

==== OUTBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\ADC via RPC
                DSA object GUID: 2c8e8a01-71df-4714-9671-1fff353b6b32
                Last attempt @ Wed Aug 21 00:47:35 2013 BST was successful
                0 consecutive failure(s).
                Last success @ Wed Aug 21 00:47:35 2013 BST

<<text deleted>>
==== KCC CONNECTION OBJECTS ====

Connection --
        Connection name: 170fa377-ace6-41b3-a303-a5976db20daf
        Enabled        : TRUE
        Server DNS name : ADC.impact0.lan
        Server DN name  : CN=NTDS Settings,CN=ADC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=impact0,DC=lan
                TransportType: RPC
                options: 0x00000001
Warning: No NC replicated for Connection!

Any ideas/suggestions on what I may have missed? Thanks.
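
For comparing long showrepl dumps like the two above, a small script can pull out only the links whose last attempt failed. This is a minimal sketch that assumes the text layout shown in these pastes (section banners, naming contexts starting in column 0, indented "Site\SERVER via RPC" lines); other Samba versions may format the output differently, and the function name is made up for illustration.

```python
import re

SECTION_RE = re.compile(r"^==== (?P<section>.+?) ====\s*$")
NEIGHBOR_RE = re.compile(r"^\s+(?P<site>[^\\\s]+)\\(?P<server>\S+) via RPC\s*$")
FAILED_RE = re.compile(
    r"Last attempt @ (?P<when>.+?) failed, result (?P<code>\d+) \((?P<error>\w+)\)"
)


def failing_links(showrepl_text):
    """Return one dict per replication link whose last attempt failed."""
    failures = []
    direction = context = server = None
    for line in showrepl_text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            # e.g. "INBOUND NEIGHBORS" or "OUTBOUND NEIGHBORS"
            direction, context, server = match.group("section"), None, None
            continue
        if line.startswith(("DC=", "CN=")):
            # Naming context lines start in column 0.
            context, server = line.strip(), None
            continue
        match = NEIGHBOR_RE.match(line)
        if match:
            server = match.group("server")
            continue
        match = FAILED_RE.search(line)
        if match and server:
            failures.append({
                "direction": direction,
                "context": context,
                "server": server,
                "when": match.group("when"),
                "code": int(match.group("code")),
                "error": match.group("error"),
            })
    return failures


# Excerpt copied from the adc output above.
SAMPLE = r"""
==== INBOUND NEIGHBORS ====

DC=DomainDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\PDC via RPC
                DSA object GUID: bcfdaa72-373e-40f3-ad0e-59c6d3af4d0e
                Last attempt @ Wed Aug 21 06:47:37 2013 BST failed, result 2 (WERR_BADFILE)
                158 consecutive failure(s).
"""

if __name__ == "__main__":
    for link in failing_links(SAMPLE):
        print(link["direction"], link["context"], link["server"], link["error"])
```

In practice you would feed it the saved output of "samba-tool drs showrepl" from each DC and compare the two lists.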

robgwood

Re: Removing dead domain controllers
« Reply #5 on: August 22, 2013, 11:50:34 am »
Update. A working solution

I revisited faustotex's suggestions and completely rebuilt the test machines pdc and adc. I killed off adc, removed the two UUID references, and replaced it with a machine built from the Zentyal installation media. This time I managed to reconnect adc to the domain using just the standard Zentyal GUI file sharing dialog.
So I now have a way to replace a dead DC. However, during my experiments I had accidentally installed a machine called sdc which I did not want to keep, and both pdc and adc were trying and failing to keep it updated. I used "samba-tool drs showrepl" to see this. The error was:

DC=ForestDnsZones,DC=impact0,DC=lan
        Default-First-Site-Name\sdc via RPC
                DSA object GUID: b55021e9-ce07-4909-87cf-78429121704c
                Last attempt @ Wed Aug 21 22:27:40 2013 BST failed, result 2 (WERR_BADFILE)

I knew exactly what was causing the error. I had used the various bits of the Microsoft ADAT GUIs that I could get to work (has anyone actually got ntdsutil to "list domains" without getting a syntax error message?) to remove everything I could from DNS, Computers and Users, and Sites and Services, but in Sites and Services I was left with the NTDS Settings object in Sites>Default-First-Site-Name>Servers>SDC, which I could not remove.  I got the error:

Windows cannot delete object LDAP://pdc.impact0.lan/CN=NTDS Settings, CN=pdc,CN=Servers,CN= Default-First-Site-Name,CN=Configuration,DC=impact0,DC=lan because: An invalid directory pathname was passed.

At this point I was stuck. I had a look at Luma and the command-line ldapdelete, but again I was getting errors when trying to remove the sdc machine. I was looking for a way to include this process in a DR plan that an average IT manager could follow, not to become an LDAP/AD guru.

I suspect that the order in which objects are removed using the ADAT is critical, but I have not been able to confirm that.

I then downloaded the evaluation copy of the LDAPSoft AD Admin Tool 6.6, navigated to the same location in the ForestDnsZones DN as I had with the Microsoft tools, and was able to remove the NTDS Settings and the SDC entries with no problem. samba-tool is no longer reporting errors.
The LDAPSoft tool is time limited to 15 days but appears to be fully functional, and it did the actual delete I wanted rather than just telling me it could if I bought the full version.
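
To make the leftover objects easier to find in any LDAP browser, here is a minimal Python sketch that builds the two distinguished names involved, following the pattern of the "Server DN name" lines in the showrepl output earlier in the thread. The function names are made up for illustration, and it assumes a single-domain forest and names without characters that need LDAP escaping. It lists the child entry first because a plain LDAP delete is refused for an entry that still has children, which may be related to the ordering suspicion above. It performs no deletion itself.

```python
def domain_to_base_dn(domain):
    """Turn 'impact0.lan' into 'DC=impact0,DC=lan'."""
    return ",".join(f"DC={label}" for label in domain.strip(".").split("."))


def dead_dc_config_dns(server, domain, site="Default-First-Site-Name"):
    """Return the DNs of a DC's leftover Sites objects, child first.

    Hypothetical helper for illustration only: it builds strings and
    does not contact the directory.
    """
    server_dn = (
        f"CN={server},CN=Servers,CN={site},CN=Sites,"
        f"CN=Configuration,{domain_to_base_dn(domain)}"
    )
    ntds_dn = f"CN=NTDS Settings,{server_dn}"
    # Child before parent: the NTDS Settings entry has to go before
    # the server entry that contains it.
    return [ntds_dn, server_dn]


if __name__ == "__main__":
    for dn in dead_dc_config_dns("SDC", "impact0.lan"):
        print(dn)
```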

So, I am now able to replace a dead DC by building a new machine and adding it to the domain as an additional domain controller with the same name as the dead one. I can also completely remove any reference to a dead domain controller that I do not want to replace but that is causing errors in the domain.

I am documenting the process now as part of my DR plan, but even at this late stage I feel that I am re-inventing the wheel. The document I am about to write will actually remove dead DCs, rather than just collect snippets of information I have managed to find on the web that purport to do the job but are usually just clones of Microsoft how-to web pages. If anyone does have a fully working solution, can they let me know? Ideally it would use an open source AD admin tool.

Lonniebiz

  • Zen Samurai
  • ****
  • Posts: 320
  • Karma: +24/-2
Re: Removing dead domain controllers
« Reply #6 on: May 26, 2014, 09:32:30 pm »
Zentyal really needs to have a place in their web GUI that allows you to remove all remnants of a previously added Domain Controller.

In order to upgrade from 3.3.10 to 3.4.3, I had to completely rebuild my servers and restore their configurations. The "upgrade to 3.4" button fails.

When I tried to restore the configuration of my additional domain controller I got an error that said:
"Restore is only possible if the server is the unique domain controller of the forest"

So far, all my attempts to get this additional domain controller joined again have failed. I suspect I'm going to have to give this server a new name and IP address on my local LAN before I will actually succeed in getting it to become a functioning additional domain controller again.