Linux Disaster Recovery Planning: A Practical Guide for Mumbai Businesses

24 May 2026

By Arun Valecha, AV Services · Linux Infrastructure Expert since 1999

Most Mumbai businesses do not have a disaster recovery plan for their Linux servers. They have a backup. Those are not the same thing.

A backup is a copy of data. A disaster recovery plan is the documented, tested process for restoring operations after a serious failure — hardware death, ransomware, fire, flood, or catastrophic human error. The difference becomes clear at 2am when the server is down and nobody knows what to do first.

What a disaster looks like in practice

In more than 25 years of Linux infrastructure work in Mumbai, the disasters that have caused the most damage share a common pattern: the backup existed, but nobody had ever tested restoring from it. The hardware failed, the restore was attempted, and something went wrong: a wrong version, a missing configuration file, a corrupt archive, or expired credentials for the backup destination.

The disaster was not the hardware failure. The disaster was the untested recovery process meeting a real emergency for the first time.

RTO and RPO — the two numbers your plan needs

Recovery Time Objective (RTO) is how long your business can tolerate the server being down before the impact becomes unacceptable. For a jewellery manufacturer mid-production run: maybe 2 hours. For a stockbroker during trading hours: maybe 15 minutes. For an e-commerce site during a sale: maybe 30 minutes.

Recovery Point Objective (RPO) is how much data loss is acceptable, measured as time. If your last backup was 24 hours ago and the server dies now, you lose 24 hours of transactions. For most businesses, losing a day of data is catastrophic. For some, it is acceptable. Your backup interval should be no longer than your RPO: an RPO of 4 hours means backing up at least every 4 hours.

Most businesses have never defined either number. Without them, a disaster recovery plan has no target to aim for.
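
To make the two numbers concrete, here is a minimal Python sketch that checks a backup schedule and a measured restore time against them. The function names and the example figures (a 4-hour RPO, a 15-minute RTO) are assumptions chosen for illustration, not values from any real client.

```python
from datetime import datetime, timedelta, timezone


def worst_case_data_loss(last_backup: datetime, now: datetime) -> timedelta:
    """Data lost if the server died at `now`: everything since the last backup."""
    return now - last_backup


def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if a failure at `now` would lose no more data than the RPO allows."""
    return worst_case_data_loss(last_backup, now) <= rpo


def meets_rto(measured_restore: timedelta, rto: timedelta) -> bool:
    """True if a restore time measured in a drill fits inside the RTO."""
    return measured_restore <= rto


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    now = datetime(2026, 5, 24, 14, 0, tzinfo=timezone.utc)
    last_backup = datetime(2026, 5, 23, 14, 0, tzinfo=timezone.utc)  # 24 hours earlier

    print(worst_case_data_loss(last_backup, now))                # 1 day, 0:00:00
    print(meets_rpo(last_backup, now, rpo=timedelta(hours=4)))   # False
    print(meets_rto(timedelta(minutes=14), rto=timedelta(minutes=15)))  # True
```

A daily backup fails a 4-hour RPO no matter how reliable the backup job is, which is why the numbers have to be defined before the schedule is chosen.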

The 6 components of a working DR plan

1. Documented server configuration. Every managed server at AV Services has a plain-text documentation file covering OS version, installed packages, service configurations, cron jobs, open ports, firewall rules, backup destinations, and credentials, stored securely. If the server dies, this document is what allows restoration without relying on anyone's memory.

2. Verified offsite backups with known restore time. Not just backups, but timed restore tests. How long does a full restore actually take? For one AV Services client running borgbackup to a remote server, it takes 14 minutes. That is a known number, so the RTO can be planned around it.

3. Spare hardware or cloud failover plan. If the physical server dies, what replaces it? A spare server, a cloud VM, a rented dedicated server? The answer needs to exist before the emergency — not during it. Cloud VMs can be provisioned in minutes if the DR plan specifies which provider, which region, which specs.

4. Defined escalation path with contact numbers. Who gets called first? Who has access to the backup credentials? Who can authorize emergency spending? This sounds obvious until 6am on a Monday when the ERP is down and three people are calling each other asking who has the root password.

5. Tested recovery runbook. A step-by-step document of exactly what to do during a recovery — in order, with commands. Not written from memory during the emergency. Written in advance, tested in a drill, reviewed annually. The goal is that anyone with Linux access can execute it.

6. Annual DR drill. Once a year, simulate a failure. Restore from backup to a test environment. Time it. Identify what is missing from the runbook. Update the plan. The businesses that recover fastest from real disasters are the ones that have practiced recovery from fake ones.
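
The timed restore test and the annual drill described above can be sketched as a small harness that runs each runbook step in order, records how long it takes, and compares the total against the RTO. This is an illustrative sketch, not the AV Services tooling: `run_drill`, `DrillResult`, and the step names are hypothetical, and the placeholder steps only sleep where a real drill would run the actual restore commands.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DrillResult:
    step_seconds: List[Tuple[str, float]]  # (step name, duration) in runbook order
    total_seconds: float
    rto_seconds: float

    @property
    def within_rto(self) -> bool:
        return self.total_seconds <= self.rto_seconds


def run_drill(
    steps: List[Tuple[str, Callable[[], None]]],
    rto_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> DrillResult:
    """Run runbook steps in order, timing each one and the drill as a whole."""
    timings: List[Tuple[str, float]] = []
    start = clock()
    for name, action in steps:
        step_start = clock()
        action()  # an exception here stops the drill: the runbook has a gap to fix
        timings.append((name, clock() - step_start))
    return DrillResult(timings, clock() - start, rto_seconds)


if __name__ == "__main__":
    # Placeholder steps (assumptions). In a real drill each one would run the
    # command written in the runbook, for example through subprocess.run(...).
    steps = [
        ("provision test machine", lambda: time.sleep(0.01)),
        ("restore from offsite backup", lambda: time.sleep(0.02)),
        ("start services and verify", lambda: time.sleep(0.01)),
    ]
    result = run_drill(steps, rto_seconds=15 * 60)
    for name, seconds in result.step_seconds:
        print(f"{name}: {seconds:.2f}s")
    print(f"total: {result.total_seconds:.2f}s, within RTO: {result.within_rto}")
```

Keeping the per-step timings from each drill shows which step dominates the recovery time, and that step is usually where the runbook needs the most attention.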

Mumbai-specific considerations

Mumbai’s infrastructure profile creates specific DR requirements. Monsoon flooding has taken out ground-floor server rooms in Bhandup, Kurla, and Andheri. Power grid instability causes unclean shutdowns that corrupt databases. The 2017 telecom outages showed how dependent businesses are on single-provider connectivity.

A Mumbai DR plan should include: offsite backup to a geographically separate location (Pune, Bangalore, or cloud), UPS with graceful shutdown configured, and at minimum a secondary internet connection for remote management during recovery.