Author Topic: Let's Recover!  (Read 3554 times)

Sam Graf

  • Guest
Re: Let's Recover!
« Reply #15 on: August 17, 2011, 02:36:36 pm »
I can think of four disaster recovery cases, put in hardware terms:
  • Same computer, same drives
  • Same computer, different drives
  • Different computer, same drives
  • Different computer, different drives
I think only in case three are we potentially not starting out with blank drives--say the motherboard failed but the drives and the data on them are intact. In my experience, a case 3 disaster recovery, from existing and intact drives, is a relatively easy disaster to recover from for non-gurus; even I was able to recover from that kind of disaster in less than an hour. But in terms of the documentation, I don't think that third case is the kind of disaster we're recovering from.

So in practical terms, I think there are just two basic disaster recovery procedures:
  • starting from blank drives
  • starting from working drives (case 3 above)
While both types of recovery procedure could be documented, it is the first that is easily the more complicated and vulnerable procedure. I take it that the existing documentation should be describing that procedure.

But even with adequate documentation, I'm not entirely sure a non-guru is going to be able to pull it off (just based on my own experience). It seems like it needs to be a guided procedure, a dedicated disaster recovery process from installation to migration. The LDAP part was the part that eventually stumped me and seems to me the weak link in the recovery chain, especially since application integration has seemed to mean trying to recover using two or more distinct, non-integrated data backups--Zentyal's, and the application's.

The cloud-based solution seems like an ideal answer in terms of process--the data is secure and the procedure is guided and/or automated. If we need an alternative to the cloud (and I'm not suggesting we don't), I'm not sure documentation alone is going to be it. But I could be wrong.

christian

  • Guest
Re: Let's Recover!
« Reply #16 on: August 17, 2011, 04:37:36 pm »
Sam,

I like the way you present it  ;)

For sure there is no "turn key DRP" because, as you explain, cases can be different. Even worse, the Zentyal configuration will most likely differ and this may have an impact.
Then, the error or incident that triggers the DRP does matter. Let's take an example that doesn't fit 100% with Zentyal (although I hope it will one day or another  ::))
- in an environment where you have multiple LDAP servers replicating, generating an LDIF file from time to time (depending on your RPO) is enough, because if one server fails (major hardware failure) you will still be able to rebuild 100% of the LDAP content from the replica (e.g. promoting the slave to master and reversing the replication flow). This kind of failure is easy to handle as far as LDAP is concerned in such an environment. DNS, when you have full zone transfer, is very similar. Now take the case of a wrong process being applied (say, e.g., someone playing with the Webmin interface  ;D and deleting the whole ou=people branch  :-\): the entry removal will be replicated and your slave LDAP server is useless. This is where the LDIF file may help.
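
To make the difference between the two failure types concrete, here is a minimal sketch in Python. It is only a toy model: the directory is a plain dict, `replicate` and `export_snapshot` are hypothetical stand-ins for real LDAP replication and a periodic LDIF export, and the DNs are invented for illustration.

```python
import copy

def replicate(master):
    """Toy replication: the slave mirrors the master's current state,
    deletions included."""
    return copy.deepcopy(master)

def export_snapshot(directory):
    """Stand-in for a periodic LDIF export: a point-in-time copy."""
    return copy.deepcopy(directory)

# Hypothetical directory content (DNs invented for this example).
master = {
    "uid=alice,ou=people,dc=example,dc=com": {"cn": "Alice"},
    "uid=bob,ou=people,dc=example,dc=com": {"cn": "Bob"},
    "cn=admins,ou=groups,dc=example,dc=com": {"cn": "admins"},
}
snapshot = export_snapshot(master)   # taken before the incident
slave = replicate(master)

# Case 1: the master dies (hardware failure). The slave still holds
# everything, so it can be promoted.
recovered_from_slave = dict(slave)

# Case 2: operator error removes the whole ou=people branch, and
# replication faithfully propagates the deletion.
for dn in [dn for dn in master if ",ou=people," in dn]:
    del master[dn]
slave = replicate(master)

# The slave is now as damaged as the master; only the snapshot helps.
restored_from_snapshot = dict(snapshot)

print(len(recovered_from_slave), len(slave), len(restored_from_snapshot))
```

The point of the sketch is only that replication protects against losing a server, while a point-in-time export protects against losing data on all servers at once.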

I'm explaining this to highlight that the DRP also depends on how many Zentyal servers you run, even if I don't think Zentyal is ready for such an approach today. When it is (I hope), it will provide a very short RPO for some services.

Then regarding cases you describe, don't you think that "DRP for idiot" could benefit from an approach that is to:
- start from similar hardware
- erase everything you may have on your disks including system
- start one "simple" recovery procedure

Sure, it may be less satisfactory than a well-tuned process where you try to recover some data from disk before reinstalling, but at least it will be "simple". Should you need something more efficient in terms of data recovery, then buy service or consultancy from an external company or guru  8)

hubertus

  • Zen Apprentice
Re: Let's Recover!
« Reply #17 on: August 17, 2011, 05:23:45 pm »
Hello everybody, hello Sean

please don't get me wrong - I don't want to be rude at all, but when reading through this thread my first impression was - hey, please "learn" Linux.
"More or less" it makes no difference which machine you plug your hard drive into. Linux will simply start up and do its job. And, yes, doing a disaster recovery with 60 people standing behind you isn't fun. In such a situation it's VERY useful to know that you DID a dry run of a recovery before. But this is something that can't be done with a point-and-click interface. Therefore you need to get your hands dirty on the console.

If your car suffers an engine breakdown you can grab your toolbox and repair whatever fell apart.
Or you can put your car into a repair shop and let them do the job.

Now I'm back at the beginning. If you don't know how to repair something, you can either learn how to do it or pay someone to do it for you.

Have a nice day


Sam Graf

  • Guest
Re: Let's Recover!
« Reply #18 on: August 17, 2011, 05:41:52 pm »
"DRP for Dummies" is exactly what I need! ;D

Any other sort of do-it-yourself process is going to risk missing the point of using Zentyal in the first place, I think. Outside of the economics of using free software, many people are going to use Zentyal because it helps them do what they need to do without a lot of overwhelming grief. And Zentyal itself is geared toward those people. Excellent! Changing the game on them during their worst nightmare come true--at disaster recovery time--is really not altogether fair. Go all geeky on people when they are vulnerable, and they are going to feel burned, even if they really weren't. And Microsoft and the cloud are only a few mouse clicks away, ready and willing to love on people burned during a bad Linux experience ... even for those who just took their best shot at doing a practice Zentyal recovery and found out what they're up against.

So yes, I think we need the subscription services and we need a do-it-yourself "DRP for Dummies." Non-guru, point-and-click admins then have a realistic and healthy choice, one that they can live with (keeping in mind that admins often have little or no say in budget matters, or even if they do, may need time to make their business case to non-geeks--maybe even lots of time).

Of course, there is the "Just Buck Up and Learn Linux" approach ... but that's also a widely used argument against adopting products like Zentyal in the first place. Anyway, when it comes to FOSS, it seems like the community approach, the FSF philosophy if you will, is meant to be helpful to dummies. I don't immediately see why disaster recovery should be an exception. :)

hubertus

  • Zen Apprentice
Re: Let's Recover!
« Reply #19 on: August 17, 2011, 06:10:09 pm »
As I said in my posting - please don't get me wrong. I'm glad that you are thinking about recovery before rolling out.

Assume you use a RAID1 (mirror) setup for your hard drives:
1.) The motherboard goes nuts: Grab the drives, plug them into another machine and approx. half an hour later you're back in business. Maybe (I don't know if Zentyal handles that differently from Ubuntu) you have no network interfaces after booting because they are bound to the old MAC addresses.
2.) One drive fails: Hopefully you had a RAID monitor running and became aware of that before the second drive crashes. Shut down the machine, replace the failed drive and resync the array. This is something that can go painfully wrong if you mix up sda and sdb, for example.
3.) Everything is fucked up / burned / stolen or whatever. Now you REALLY need the disaster recovery.

1.) can be solved by almost anybody
2.) requires some Linux knowledge
3.) hopefully we have a DSR for dummies  ;)
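
For case 2.), one way to reduce the risk of pulling the wrong drive is to ask the md driver which member it marked as faulty rather than guessing between sda and sdb. A minimal sketch, assuming Linux software RAID (md) and the usual /proc/mdstat layout in which a failed member carries an "(F)" suffix; the sample text and the function name `failed_members` are made up for illustration:

```python
import re

def failed_members(mdstat_text):
    """Return {array_name: [failed member devices]} parsed from the
    text of /proc/mdstat. Arrays without failed members are omitted."""
    result = {}
    for line in mdstat_text.splitlines():
        match = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if not match:
            continue
        array, members = match.groups()
        # A faulty member looks like "sda1[0](F)".
        failed = re.findall(r"(\S+?)\[\d+\]\(F\)", members)
        if failed:
            result[array] = failed
    return result

# Illustrative sample, not taken from a real machine.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0](F)
      976630336 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""

if __name__ == "__main__":
    print(failed_members(SAMPLE))
```

On a real system you would read the text from /proc/mdstat. Since sda/sdb names can change between boots or after swapping hardware, it is probably safer to match the failed device to a physical drive by its serial number (for example via the names under /dev/disk/by-id) before removing anything.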

I absolutely understand your position and I think that I know your situation very well. But as long as you can't offer support for a solution you can't offer that solution. If you feel comfortable with Zentyal then start using it. But try to have someone who has an understanding of the basics, and start learning them yourself.
« Last Edit: August 17, 2011, 06:11:58 pm by hubertus »

christian

  • Guest
Re: Let's Recover!
« Reply #20 on: August 17, 2011, 06:37:44 pm »
Quote from: hubertus
> I absolutely understand your position and I think that I know your situation very well. But as long as you can't offer support for a solution you can't offer that solution. If you feel comfortable with Zentyal then start using it. But try to have someone who has an understanding of the basics, and start learning them yourself.

Although I agree with your first statement, where you wrote "if you want to do something not provided out of the box, start learning Linux", I'm not in line with this one  ;) Zentyal aims at providing services and applications (and more and more of them, BTW) to SMBs that don't have the skills to build the same with internal resources, or are not willing to allocate resources to this kind of activity.
DRP, when not offered either as a (turn key) service or together with a simple "DRP for dummies" process, is very far from the "basics" you ask them to understand.
DRP as presented in the Zentyal documentation is, from what I understand (I looked at it quickly), ready for deployment. Perhaps a few additional sentences are required to explain what it does and covers and what it doesn't.

We are debating here thanks to Sean trying to investigate cases where this process doesn't work or the documentation misses some detail. Although we are currently debating the "DRP concept", I'm pretty sure we will soon end up with some basic sentences like:
- enable incremental backup following this process
- use same hardware
- wipe everything on disks
- execute this procedure to restore
- anything else doesn't work or is not supported  :P

This will be a limited but workable DRP, bringing frustration to anyone willing to do something different, more efficient or whatever, but at least offering something so that people do not deploy a solution that can't be supported.

Well, it looks easy, doesn't it  ;D ? But even with this approach, we will debate again and again because the RPO concept (and RTO) is not yet understood by most Zentyal users, and the answer is not in terms of process only. Depending on the user's needs, and as you explain, it might require RAID or external storage (NAS or SAN) with specific fail-over and redundancy and specific backup procedures. Beyond Zentyal's current concept, if I understand well...  8)
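
Since RPO keeps coming up: the recovery point objective is the maximum amount of data, measured in time, that you can afford to lose, while RTO is how long the recovery itself may take. A small worked example in Python, with invented times and a hypothetical helper `check_rpo`, shows why a nightly backup alone cannot meet a 4-hour RPO:

```python
from datetime import datetime, timedelta

def check_rpo(last_backup, failure_time, rpo):
    """Return (met, loss_window): whether the data lost between the
    last good backup and the failure stays within the RPO."""
    loss_window = failure_time - last_backup
    return loss_window <= rpo, loss_window

# Invented figures: nightly backup at 02:00, disaster in the afternoon.
last_backup = datetime(2011, 8, 17, 2, 0)
failure_time = datetime(2011, 8, 17, 16, 37)

met, lost = check_rpo(last_backup, failure_time, rpo=timedelta(hours=4))
print(met, lost)

# With a backup every hour, the same failure loses at most the time
# since the last hourly run.
met_hourly, lost_hourly = check_rpo(
    datetime(2011, 8, 17, 16, 0), failure_time, rpo=timedelta(hours=4)
)
print(met_hourly, lost_hourly)
```

In other words, the backup interval sets a floor on the RPO you can promise, whatever the restore procedure looks like. This is why the answer is not only a matter of process.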

For all the above reasons, I feel it's easier (if I can say so) to publish a DRP procedure with limited scope and spend time and energy explaining what it covers and what it doesn't.
For the rest... I agree: start learning Linux and DRP basics.  :)

Sam Graf

  • Guest
Re: Let's Recover!
« Reply #21 on: August 17, 2011, 07:29:25 pm »
In my world, people use Microsoft products precisely because of the "if you can't support a solution you can't offer it as a solution" Linux argument. I find it sort of a hostile environment in which to work, frankly, on both sides (people on both sides looking down at the little guy who deploys Linux-based solutions--why would any clear-thinking little guy do such a thing?). The argument even strikes me as a little simplistic. We aren't talking about simple file management skills, after all--creation, permissions, etc. We're talking about Zentyal disaster recovery, which extends well beyond basic Linux skills. It includes, potentially, critical knowledge of how Zentyal works as a managed, integrated offering. It involves knowledge of the software selections made by the developers, and how to recover from disaster in those cases.

Quote from: christian
> I'm pretty sure we will soon end up with some basic sentences like:
> - enable incremental backup following this process
> - use same hardware
> - wipe everything on disks
> - execute this procedure to restore
> - anything else doesn't work or is not supported  :P
>
> This will be a limited but workable DRP, bringing frustration to anyone willing to do something different, more efficient or whatever, but at least offering something so that people do not deploy a solution that can't be supported.

Absolutely agreed. But consider the mess even this simple outline possibly entails. I'll start with just the first item, and to avoid pointless hypothetical stuff, I'll confine myself to my real world experience trying to back up (and recover) an eBox deployment running eGroupWare:

Exactly what does the included backup system back up? If an integrated, managed application relies on a database back end, is that application's data included in the backup? If not, then what? Are the application's backup tools (and/or general administration tools) exposed by Zentyal? If not, why not? Are there reasons for that? For example, will attempting to recover using the application's backup have negative consequences for the LDAP back end? If so, then what?

I won't go on, because I think I've made my point and I don't intend to derail productive construction of a Zentyal DRP for Dummies. I will say that if every Zentyal user is rightly to be expected to be able to master unassisted disaster recovery in this kind of environment before deploying Zentyal in good conscience, then we are asking a tremendous amount of insurance expertise from people who as likely as not evaluated Zentyal in the first place because they lack that kind of expertise. What an odd-sounding proposition.
« Last Edit: August 17, 2011, 07:31:39 pm by Sam Graf »

hubertus

  • Zen Apprentice
Re: Let's Recover!
« Reply #22 on: August 17, 2011, 08:07:41 pm »
Well, for some sort of automated DSR I think that Debian/Ubuntu already offer great tools.
So why not create a solution that does something like dpkg --get-selections in combination with some other magic that creates a bootable image that EXACTLY restores the system including its configuration? After that you would only have to restore the data from a backup location. Automated, of course.
If such a way is easy and reliable then this could be a solution that would kick M$ in the a... And this would not even require using the same hardware. In my opinion something like that could be handled in less than an hour (except for the unpredictable time required for copying the data).
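
A rough sketch of the first half of that idea: `dpkg --get-selections` prints one package per line followed by its selection state, so the list of packages to reinstall can be pulled out of a saved copy of that output. The sample text and the function name `packages_to_install` below are only illustrative, and this covers the package list only, not the contents of /etc or any Zentyal-specific state.

```python
def packages_to_install(selections_text):
    """Extract package names marked "install" from the output of
    `dpkg --get-selections` (name, whitespace, state per line)."""
    packages = []
    for line in selections_text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines and states such as
        # "deinstall", "hold" or "purge".
        if len(parts) == 2 and parts[1] == "install":
            packages.append(parts[0])
    return packages

# Illustrative sample of saved `dpkg --get-selections` output.
SAMPLE = (
    "apache2\t\t\t\t\tinstall\n"
    "libc6:amd64\t\t\t\tinstall\n"
    "oldpackage\t\t\t\tdeinstall\n"
    "slapd\t\t\t\t\tinstall\n"
)

if __name__ == "__main__":
    print(packages_to_install(SAMPLE))
```

On the machine being rebuilt, the saved list is normally fed back with `dpkg --set-selections` followed by `apt-get dselect-upgrade`. The "other magic" (configuration, LDAP content, application data) is the part that would still have to be designed.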

Zentyal: with foolproof disaster recovery included! 8) Ever heard something like that from M$?

christian

  • Guest
Re: Let's Recover!
« Reply #23 on: August 17, 2011, 08:53:12 pm »
Hubertus,

what you describe is more or less the idea behind the current process based on GRML. Did you look at it?
The issue, to me, is not technical but more a matter of wording: having a procedure that is really accurate in terms of how-to but also, and this is the most important point here if one wants to avoid questions like "yes, but what if I have 3 more LAN interfaces", very accurate and exhaustive in terms of scope and target.
It may also require providing less flexibility in the backup tool to ensure that "everything" required during restore is included in the backup.

jimmyland

  • Zen Apprentice
Re: Let's Recover!
« Reply #24 on: September 29, 2011, 02:33:50 am »
Did OP ever find a solution for this? I, too, am trying to test out various recovery scenarios and finding it difficult.