Proxmox DRP: Multi-Site Architecture & RPO/RTO

DRP: Definition and Challenges

The Disaster Recovery Plan (DRP) is the set of procedures and technical resources enabling IT operations to resume after a major disaster: fire, flood, cyberattack, catastrophic hardware failure, or critical human error.

Two fundamental metrics structure every DRP:

RPO (Recovery Point Objective): the maximum amount of data you can afford to lose, expressed as a period of time. An RPO of 24h means you can lose up to one day of data.
RTO (Recovery Time Objective): the maximum time allowed to bring services back online. An RTO of 4h means production must be restored in under 4 hours.
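To make the two metrics concrete, here is a minimal Python sketch that checks a single incident against the 24h RPO and 4h RTO used as examples above. The timestamps and the helper names (data_loss_window, meets_objective) are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, disaster: datetime) -> timedelta:
    """Everything written after the last usable recovery point is lost."""
    if last_recovery_point > disaster:
        raise ValueError("recovery point must precede the disaster")
    return disaster - last_recovery_point


def meets_objective(measured: timedelta, objective: timedelta) -> bool:
    """An objective is met when the measured value does not exceed it."""
    return measured <= objective


# Hypothetical incident: nightly backup at 01:00, disaster at 15:30,
# services restored at 18:45 the same day.
last_backup = datetime(2025, 3, 10, 1, 0)
disaster = datetime(2025, 3, 10, 15, 30)
restored = datetime(2025, 3, 10, 18, 45)

loss = data_loss_window(last_backup, disaster)  # data written since the backup
downtime = restored - disaster                  # time without service

rpo_ok = meets_objective(loss, timedelta(hours=24))
rto_ok = meets_objective(downtime, timedelta(hours=4))
print(f"data loss {loss}, RPO met: {rpo_ok}")
print(f"downtime {downtime}, RTO met: {rto_ok}")
```

In this scenario 14h30 of data is lost and the outage lasts 3h15, so both objectives are met, but only just for the RTO: a restoration that took one hour longer would miss it.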

DRP vs BCP: Two Complementary Approaches

It is essential to distinguish the DRP from the BCP (Business Continuity Plan). The BCP aims to maintain operations without interruption through high availability (HA clustering, synchronous Ceph replication, automatic failover). The DRP steps in when the BCP has failed: it organizes reconstruction from backups or a standby site.

Criteria	BCP	DRP
Objective	Zero interruption	Recovery after disaster
Proxmox Technologies	HA, synchronous Ceph, live migration	PBS, async replication, DR site
Typical RPO	~0 (zero loss)	Minutes to hours
Typical RTO	Seconds	Minutes to hours
Cost	High (doubled infrastructure)	Moderate
Scope	Individual failures	Major disasters

Regulatory Obligations: DORA, NIS2, ISO 22301

The European regulatory framework now imposes strict requirements for IT resilience:

DORA (Digital Operational Resilience Act): applicable to the financial sector since January 2025, requires regular resilience testing, formalized IT risk management, and documented and tested recovery plans.
NIS2 (Network and Information Security Directive): extends cybersecurity obligations to many sectors (energy, transport, healthcare, digital). Requires risk management measures including business continuity and crisis management.
ISO 22301: international standard for business continuity management systems. A reference framework for structuring a BCP/DRP compliant with best practices.

Regulatory note

Since 2025, not having a documented and tested DRP exposes companies covered by DORA or NIS2 to significant financial penalties. Beyond compliance, a well-designed DRP protects the long-term viability of your business.

Multi-Site DRP Architecture with Proxmox

Proxmox VE provides all the building blocks needed to construct a robust multi-site DRP. The architecture relies on three key components: the production cluster, the disaster recovery (DR) site, and Proxmox Backup Server (PBS) as the cornerstone of the backup strategy.

DRP Architecture Diagram

  PRIMARY SITE (Production)              DR SITE (Standby)
  ================================       ================================
  |  Proxmox VE Cluster (3 nodes)  |       |  Proxmox VE Cluster (2 nodes)  |
  |  - Production VMs              |       |  - Standby VMs                 |
  |  - Ceph storage (replicated x3)|       |  - Ceph storage (replicated x2)|
  |  - HA manager active           |       |  - HA manager ready            |
  ================================       ================================
        |               |                        ^               ^
        |  Application  |                        |               |
        |  replication  |________________________|               |
        |  (MariaDB...)                                          |
        |                                                        |
        v                                                        |
  ================================                               |
  |  Local PBS                     |    PBS Sync (pull mode)     |
  |  - Daily backup                |  --------------------------->|
  |  - Deduplication               |                             |
  |  - Verify jobs                 |       ========================
  |  - 30-day retention            |       |  Remote PBS           |
  ================================       |  - Off-site copy       |
        |                                 |  - AES-256 encryption  |
        v                                 |  - 90-day retention    |
  ================================       ========================
  |  Air-Gapped PBS (Nimbus)       |
  |  - Disconnected disks          |
  |  - Weekly rotation             |
  |  - Ransomware protection       |
  ================================

Complete DRP architecture: application replication between sites + multi-tier PBS backups

Application Replication vs Ceph Stretch Cluster

For inter-site replication, two approaches stand out depending on the criticality level:

Application Replication (recommended)

MariaDB/MySQL replication, PostgreSQL streaming, rsync...
Simple to implement and maintain
Works over standard WAN links
Configurable RPO (seconds to minutes)
Moderate cost, no latency constraints
Manual or semi-automatic failover

Ceph Stretch Cluster

Synchronous replication between 2 sites
RPO = 0 (zero data loss)
Automatic failover
High complexity (inter-site Ceph)
Requires a 3rd site (monitor tiebreaker)
High cost (bandwidth, latency < 10ms)

For the majority of SMBs and mid-market companies, application replication (MariaDB, PostgreSQL...) combined with PBS offers the best trade-off between simplicity, cost, and DRP effectiveness. The Ceph stretch cluster is reserved for the rare critical environments (finance, healthcare) where RPO = 0 is an absolute requirement — its operational complexity is not justified for most businesses.

PBS: Central Component of the DRP

Proxmox Backup Server plays a central role in any DRP architecture. Unlike Ceph replication, which protects against hardware failures, PBS protects against logical corruption: human errors, ransomware, application bugs. It is the defense layer that allows you to roll back to a previous healthy state.

To learn more about PBS backup strategy, see our dedicated article: Proxmox 3-2-1 Backup Strategy

RPO/RTO: What Can You Achieve with Proxmox?

Achievable recovery objectives depend directly on the deployed architecture. Here is a realistic comparison of the three main approaches:

Architecture	RPO	RTO	Relative Cost	Use Case
PBS only (daily backup)	24h	2 - 4h	$	SMBs, non-critical applications
PBS + 2x/day backup	12h	2 - 4h	$	SMBs with evolving data
Application replication + PBS	~few min	30 min - 1h	$	Best trade-off
Ceph stretch cluster (synchronous)	~0	< 5 min	$$$	Finance, healthcare, critical
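The comparison above can be read as a simple selection rule: keep the architectures whose RPO and RTO fit the targets, then take the cheapest. The sketch below encodes the table's upper bounds in Python. Two values are assumptions made for the example: "~few min" is treated as 5 minutes and "~0" as exactly zero. The cost tier is simply the number of "$" signs shown in the table, and cheapest_fit is a hypothetical helper name.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Architecture:
    name: str
    rpo: timedelta   # worst-case data loss
    rto: timedelta   # worst-case recovery time
    cost_tier: int   # number of "$" signs in the table


# Upper bounds from the table. ASSUMPTIONS: "~few min" = 5 minutes, "~0" = 0.
OPTIONS = [
    Architecture("PBS only (daily backup)", timedelta(hours=24), timedelta(hours=4), 1),
    Architecture("PBS + 2x/day backup", timedelta(hours=12), timedelta(hours=4), 1),
    Architecture("Application replication + PBS", timedelta(minutes=5), timedelta(hours=1), 1),
    Architecture("Ceph stretch cluster (synchronous)", timedelta(0), timedelta(minutes=5), 3),
]


def cheapest_fit(target_rpo: timedelta, target_rto: timedelta) -> Optional[Architecture]:
    """Return the lowest-cost option meeting both targets, or None."""
    candidates = [a for a in OPTIONS if a.rpo <= target_rpo and a.rto <= target_rto]
    # min() keeps the first of several equally cheap options (table order).
    return min(candidates, key=lambda a: a.cost_tier, default=None)


choice = cheapest_fit(timedelta(minutes=15), timedelta(hours=2))
print(choice.name if choice else "no architecture in the table fits")
```

With a 15-minute RPO and a 2-hour RTO, the only options that qualify are application replication + PBS and the stretch cluster, and the former wins on cost.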

Our recommendation: Application replication + PBS

For the majority of businesses, we recommend the application replication + PBS combination. MariaDB, PostgreSQL, or rsync replication between two sites is more than enough to ensure critical data continuity (RPO of a few minutes). PBS provides the additional protection layer against logical corruption with long retention and verified backups. This approach is simple to implement, cost-effective, and reliable — it covers both hardware failures and logical disasters (ransomware, human error) without the complexity of an inter-site Ceph infrastructure.

To outsource your DRP backups, discover NimbusBackup for your Proxmox DRP: we offer hosted PBS solutions with multi-site replication and end-to-end encryption.

For a detailed cost analysis of Proxmox infrastructure compared to VMware, see our TCO VMware vs Proxmox 2026 comparison.

PBS: Your Best DRP Ally

Proxmox Backup Server is much more than a simple backup tool. It is an enterprise-grade solution that natively integrates essential features for a reliable DRP:

Deduplication and Efficiency

Chunk-level deduplication: 60 to 90% storage space reduction
Incremental backups: only modified blocks are transferred
Native compression: optimized storage and bandwidth usage
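The idea behind chunk-level deduplication and incremental backups can be illustrated with a toy content-addressed store in Python. This is a sketch of the general technique, not PBS's actual implementation: the 4-byte chunk size is deliberately tiny so the example is readable, and the function names are hypothetical.

```python
import hashlib


def backup(data: bytes, store: dict, chunk_size: int = 4) -> list:
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns the index: the ordered list of chunk digests needed to rebuild data.
    """
    index = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a chunk already in the store is not stored again
        index.append(digest)
    return index


def restore(index: list, store: dict) -> bytes:
    """Rebuild the original data by concatenating chunks in index order."""
    return b"".join(store[digest] for digest in index)


store = {}
monday = backup(b"AAAABBBBCCCCDDDD", store)
chunks_after_monday = len(store)
tuesday = backup(b"AAAABBBBXXXXDDDD", store)  # one block changed since Monday
new_chunks = len(store) - chunks_after_monday

print(f"chunks stored on Monday: {chunks_after_monday}")
print(f"new chunks stored on Tuesday: {new_chunks}")
```

Both backups remain fully restorable from their own index, yet the second one adds a single chunk to the store. This is why repeated backups of a mostly unchanged VM consume far less space than their nominal size.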

Security and Verification

Verify jobs: automatic integrity verification of every backup
Client-side AES-256-GCM encryption: data is encrypted before transfer
Sync jobs: PBS-to-PBS replication for off-site copies

Off-Site PBS with Nimbus

For the off-site layer of your DRP, RDEM Systems offers Nimbus, our range of external backup solutions:

Nimbus Double Drive PBS: two mirrored disks in a remote datacenter for complete redundancy of your PBS backups
Nimbus Air Gapped PBS: physically disconnected disks in rotation, maximum protection against ransomware and account compromises

Discover all our backup solutions at nimbus.rdem-systems.com.

Testing Your DRP: The Key to Reliability

An untested DRP is a DRP that is likely to fail when it matters. Regular testing validates that procedures work, that RTOs are achievable, and that teams know how to respond in a crisis situation.

DRP Testing Methodology

1. Documentation review (monthly): review of procedures, verification of emergency contacts, update of VM inventories and restoration priorities.
2. Partial technical test (quarterly): restoring individual VMs from PBS on an isolated network. Verifying boot and application functionality. Measuring actual restoration time.
3. Full failover test (semi-annual): activating the DR site, restoring all critical services, business validation by functional teams. Measuring actual RTO and RPO.
4. Post-mortem (after each test): documenting gaps between objectives and actual results, corrective action plan, procedure updates.

DRP Validation Checklist

PBS backups are intact (verify jobs OK)
Restored VMs boot correctly on the DR site
Business applications are functional after restoration
Measured RTO is less than or equal to the target
Measured RPO meets expectations
Network access (DNS, VPN, firewall) is operational on the DR site
Teams know the procedures and emergency contacts
DRP documentation is up to date and accessible outside the production site
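One way to turn this checklist into post-mortem input is to record each test as structured data and list the gaps automatically. The following Python sketch assumes a hypothetical DrpTestResult record. The target and measured values are invented for the example, and only three checklist items are shown.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class DrpTestResult:
    target_rto: timedelta
    measured_rto: timedelta
    target_rpo: timedelta
    measured_rpo: timedelta
    checks: dict = field(default_factory=dict)  # checklist item -> passed?

    def gaps(self) -> list:
        """Return every item that needs a corrective action."""
        found = [item for item, passed in self.checks.items() if not passed]
        if self.measured_rto > self.target_rto:
            found.append(f"RTO exceeded by {self.measured_rto - self.target_rto}")
        if self.measured_rpo > self.target_rpo:
            found.append(f"RPO exceeded by {self.measured_rpo - self.target_rpo}")
        return found


# Hypothetical full failover test.
result = DrpTestResult(
    target_rto=timedelta(hours=1),
    measured_rto=timedelta(hours=1, minutes=20),
    target_rpo=timedelta(minutes=5),
    measured_rpo=timedelta(minutes=3),
    checks={
        "PBS backups are intact (verify jobs OK)": True,
        "Restored VMs boot correctly on the DR site": True,
        "Network access (DNS, VPN, firewall) is operational on the DR site": False,
    },
)
for gap in result.gaps():
    print("TO FIX:", gap)
```

Here the test surfaces two corrective actions: network access on the DR site, and an RTO missed by 20 minutes. Keeping results in this form also makes it easy to compare one test campaign with the next.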

DRP and Ransomware: Advanced Protection

Ransomware is today the number one threat to IT infrastructure. An effective DRP must include specific anti-ransomware measures, as a sophisticated attacker will seek to compromise backups before triggering encryption.

The 4 Pillars of Protection

1. Air Gap

Backup copies are kept on media that is physically disconnected from the network. Even an attacker with administrative access cannot reach a disk that is not plugged in. This is the ultimate protection.

Nimbus Air Gapped PBS

2. Immutability

PBS backups can be protected against deletion and modification through strict retention policies and separate credentials. When the hypervisors authenticate with a backup-only account that has no prune or delete rights, they can add new backups to the PBS datastore but cannot alter or remove existing ones.

3. Encryption

PBS supports client-side AES-256-GCM encryption. Data is encrypted before leaving the hypervisor. Even if the PBS server is compromised, the data remains unreadable without the encryption key.

4. Access Separation

Backup access credentials must be strictly separated from production credentials. A Proxmox admin account should not be able to delete PBS backups. This is the principle of least privilege, applied rigorously.

Ransomware alert

Modern attackers spend an average of 21 days in the system before triggering encryption. During this period, they identify and compromise backups. This is why long retention (90 days minimum) and air-gapped copies are essential: they allow restoring a healthy state prior to the compromise.
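The link between dwell time and retention is simple date arithmetic. The Python sketch below uses the 21-day and 90-day figures from the paragraph above and assumes one snapshot per day, a 14-day retention as the "too short" case, and an arbitrary reference date. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def snapshots_kept(today: date, retention_days: int) -> list:
    """Daily snapshots still available, oldest first (ASSUMES one per day)."""
    return [today - timedelta(days=n) for n in range(retention_days, 0, -1)]


def last_clean_snapshot(snapshots: list, intrusion: date) -> Optional[date]:
    """Most recent snapshot taken strictly before the intrusion date."""
    clean = [s for s in snapshots if s < intrusion]
    return max(clean, default=None)


today = date(2025, 6, 30)               # day the encryption is triggered
intrusion = today - timedelta(days=21)  # attacker entered 21 days earlier

short = last_clean_snapshot(snapshots_kept(today, 14), intrusion)
long = last_clean_snapshot(snapshots_kept(today, 90), intrusion)

print(f"14-day retention: {short or 'no clean restore point'}")
print(f"90-day retention: {long}")
```

With 14 days of retention, every remaining snapshot was taken after the intrusion, so there is nothing trustworthy to restore. With 90 days, the snapshot from the day before the intrusion is still available.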

PBS Tape: Long-Term Archival

For businesses with regulatory archival obligations (10 years in the financial sector), PBS supports export to magnetic tapes. Tapes offer very low-cost storage, a durability of 30+ years, and native air-gap protection (tapes are physically removable and can be stored in a safe).

Our DRP Approach at RDEM Systems

At RDEM Systems, we support businesses in the design, implementation, and ongoing maintenance of their Proxmox DRP. Our approach stands out through its pragmatism and adaptation to each client's actual budget and constraints.

Resilience audit: analysis of your existing infrastructure, identification of critical VMs, definition of target RPO/RTO per business service
Custom DRP architecture: designing a multi-site architecture tailored to your constraints (budget, regulatory, geographic). Choice between Ceph replication, off-site PBS, or a hybrid approach
Implementation and documentation: deploying the DRP architecture, configuring PBS backups and replication, writing recovery procedures
Quarterly DRP tests: executing restoration tests, measuring actual RPO/RTO, compliance reporting and improvement plan
Monitoring and alerting: continuous backup monitoring, alerts on failures, proactive backup integrity verification

Our DRP integrates into our comprehensive Proxmox managed services offering and benefits from our sovereign infrastructure operated from France. For off-site backup needs, our Nimbus range covers all protection levels, from standard backup to air-gapped.

For comprehensive support, discover our 24/7 managed services and on-call support for your DRP: monitoring, failover testing, and support in case of disaster.

Check our pricing or contact us for a free resilience audit.

If you are considering a migration from VMware, our VMware to Proxmox migration guide integrates DRP considerations from the start.

Let's Design Your Proxmox DRP Together

RDEM Systems supports you from the resilience audit to the implementation of your multi-site DRP. Get a free audit of your infrastructure and a DRP architecture proposal tailored to your needs.
