Infrastructure · February 16, 2026 · 16 min read

    DRP with Proxmox: Multi-Site Architecture, RPO/RTO & PBS

    The Disaster Recovery Plan (DRP) is no longer a luxury reserved for large enterprises. With DORA and NIS2 regulatory obligations, and the constant threat of ransomware, every IT department must design a credible recovery strategy. Proxmox VE, combined with PBS and Ceph, provides a complete platform to build a high-performance and cost-effective multi-site DRP.

    DRP: Definition and Challenges

    The Disaster Recovery Plan (DRP) is the set of procedures and technical resources enabling IT operations to resume after a major disaster: fire, flood, cyberattack, catastrophic hardware failure, or critical human error.

    Two fundamental metrics structure every DRP:

    • RPO (Recovery Point Objective): the maximum amount of data you can accept losing, expressed as a time window. An RPO of 24h means you can lose up to one day of data.
    • RTO (Recovery Time Objective): the maximum time allowed to bring services back online. An RTO of 4h means production must be restored in under 4 hours.
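
    To make the two metrics concrete, here is a minimal Python sketch that measures the data-loss window and the downtime of one incident and compares them with the targets. The `Incident` class and the timestamps are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one disaster, for illustration only."""
    last_good_backup: datetime   # most recent restorable recovery point
    disaster: datetime           # moment production was lost
    service_restored: datetime   # moment production was back online

    def data_loss(self) -> timedelta:
        # Everything written after the last recovery point is lost.
        return self.disaster - self.last_good_backup

    def downtime(self) -> timedelta:
        # Time during which the service was unavailable.
        return self.service_restored - self.disaster


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    """True when both the achieved data loss and downtime are within target."""
    return incident.data_loss() <= rpo and incident.downtime() <= rto


incident = Incident(
    last_good_backup=datetime(2026, 2, 15, 23, 0),
    disaster=datetime(2026, 2, 16, 14, 30),
    service_restored=datetime(2026, 2, 16, 17, 45),
)
print(incident.data_loss())   # 15:30:00
print(incident.downtime())    # 3:15:00
print(meets_objectives(incident, rpo=timedelta(hours=24), rto=timedelta(hours=4)))  # True
```

    With a nightly backup, this incident stays within a 24h RPO and a 4h RTO. The same incident would miss a 12h RPO, because 15.5 hours of data were written after the last recovery point.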

    DRP vs BCP: Two Complementary Approaches

    It is essential to distinguish the DRP from the BCP (Business Continuity Plan). The BCP aims to maintain operations without interruption through high availability (HA clustering, synchronous Ceph replication, automatic failover). The DRP steps in when the BCP has failed: it organizes reconstruction from backups or a standby site.

    Criteria             | BCP                                  | DRP
    Objective            | Zero interruption                    | Recovery after disaster
    Proxmox technologies | HA, synchronous Ceph, live migration | PBS, async replication, DR site
    Typical RPO          | ~0 (zero loss)                       | Minutes to hours
    Typical RTO          | Seconds                              | Minutes to hours
    Cost                 | High (doubled infrastructure)        | Moderate
    Scope                | Individual failures                  | Major disasters

    Regulatory Obligations: DORA, NIS2, ISO 22301

    The European regulatory framework now imposes strict requirements for IT resilience:

    • DORA (Digital Operational Resilience Act): applicable to the financial sector since January 2025. It requires regular resilience testing, formalized IT risk management, and documented, tested recovery plans.
    • NIS2 (Network and Information Security Directive): extends cybersecurity obligations to many sectors (energy, transport, healthcare, digital). It requires risk management measures, including business continuity and crisis management.
    • ISO 22301: international standard for business continuity management systems. A reference framework for structuring a BCP/DRP compliant with best practices.

    Regulatory note

    Since 2025, not having a documented and tested DRP exposes companies covered by DORA or NIS2 to significant financial penalties. Beyond compliance, a well-designed DRP protects the long-term viability of your business.

    Multi-Site DRP Architecture with Proxmox

    Proxmox VE provides all the building blocks needed to construct a robust multi-site DRP. The architecture relies on three key components: the production cluster, the disaster recovery (DR) site, and Proxmox Backup Server (PBS) as the cornerstone of the backup strategy.

    DRP Architecture Diagram

      PRIMARY SITE (Production)              DR SITE (Standby)
      ================================       ================================
      |  Proxmox VE Cluster (3 nodes)  |       |  Proxmox VE Cluster (2 nodes)  |
      |  - Production VMs              |       |  - Standby VMs                 |
      |  - Ceph storage (replicated x3)|       |  - Ceph storage (replicated x2)|
      |  - HA manager active           |       |  - HA manager ready            |
      ================================       ================================
            |               |                        ^               ^
            |  Application  |                        |               |
            |  replication  |________________________|               |
            |  (MariaDB...)                                          |
            |                                                        |
            v                                                        |
      ================================                               |
      |  Local PBS                     |    PBS Sync (pull mode)     |
      |  - Daily backup                |  --------------------------->|
      |  - Deduplication               |                             |
      |  - Verify jobs                 |       ========================
      |  - 30-day retention            |       |  Remote PBS           |
      ================================       |  - Off-site copy       |
            |                                 |  - AES-256 encryption  |
            v                                 |  - 90-day retention    |
      ================================       ========================
      |  Air-Gapped PBS (Nimbus)       |
      |  - Disconnected disks          |
      |  - Weekly rotation             |
      |  - Ransomware protection       |
      ================================
    

    Complete DRP architecture: application replication between sites + multi-tier PBS backups

    Application Replication vs Ceph Stretch Cluster

    For inter-site replication, two approaches stand out depending on the criticality level:

    Application Replication (recommended)

    • MariaDB/MySQL replication, PostgreSQL streaming, rsync...
    • Simple to implement and maintain
    • Works over standard WAN links
    • Configurable RPO (seconds to minutes)
    • Moderate cost, no latency constraints
    • Manual or semi-automatic failover

    Ceph Stretch Cluster

    • Synchronous replication between 2 sites
    • RPO = 0 (zero data loss)
    • Automatic failover
    • High complexity (inter-site Ceph)
    • Requires a 3rd site (monitor tiebreaker)
    • High cost (bandwidth, latency < 10ms)

    For the majority of SMBs and mid-market companies, application replication (MariaDB, PostgreSQL...) combined with PBS offers the best trade-off between simplicity, cost, and DRP effectiveness. The Ceph stretch cluster is reserved for the rare critical environments (finance, healthcare) where RPO = 0 is an absolute requirement — its operational complexity is not justified for most businesses.
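
    The selection logic above can be sketched as a small Python function. It is only an illustration of the criteria listed in this section (RPO = 0 requirement, latency below 10 ms, third site for the tiebreaker monitor). The function name and return values are hypothetical, and a real decision also weighs budget and operational skills.

```python
APP_REPLICATION = "application replication + PBS"
CEPH_STRETCH = "Ceph stretch cluster"


def recommend_architecture(rpo_zero_required: bool,
                           inter_site_latency_ms: float,
                           third_site_available: bool) -> str:
    """Apply the criteria from the two lists above (illustrative only)."""
    if not rpo_zero_required:
        # An RPO of seconds to minutes is acceptable: keep it simple.
        return APP_REPLICATION
    # RPO = 0 means synchronous replication, which has hard prerequisites.
    if inter_site_latency_ms >= 10:
        raise ValueError("RPO = 0 needs inter-site latency below 10 ms")
    if not third_site_available:
        raise ValueError("RPO = 0 needs a third site for the tiebreaker monitor")
    return CEPH_STRETCH


print(recommend_architecture(False, 35.0, False))  # application replication + PBS
print(recommend_architecture(True, 2.5, True))     # Ceph stretch cluster
```

    Raising an error when RPO = 0 is requested without the prerequisites is a deliberate choice in this sketch: it forces the requirement or the infrastructure to be revisited rather than silently falling back to a weaker guarantee.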

    PBS: Central Component of the DRP

    Proxmox Backup Server plays a central role in any DRP architecture. Unlike Ceph replication, which protects against hardware failures, PBS protects against logical corruption: human errors, ransomware, application bugs. It is the defense layer that allows you to roll back to a previous healthy state.

    To learn more about PBS backup strategy, see our dedicated article: Proxmox 3-2-1 Backup Strategy

    RPO/RTO: What Can You Achieve with Proxmox?

    Achievable recovery objectives depend directly on the deployed architecture. Here is a realistic comparison of the three main approaches:

    Architecture                       | RPO           | RTO         | Relative cost | Use case
    PBS only (daily backup)            | 24h           | 2 - 4h      | $             | SMBs, non-critical applications
    PBS + 2x/day backup                | 12h           | 2 - 4h      | $             | SMBs with evolving data
    Application replication + PBS      | A few minutes | 30 min - 1h | $             | Best trade-off
    Ceph stretch cluster (synchronous) | ~0            | < 5 min     | $$$           | Finance, healthcare, critical
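
    The RPO values of the first two rows follow directly from the backup frequency. A minimal Python sketch of that arithmetic, assuming evenly spaced backups and ignoring the duration of the backup itself (both simplifications); the function name is hypothetical:

```python
def worst_case_rpo_hours(backups_per_day: int) -> float:
    """Worst-case data loss, in hours, for evenly spaced backups.

    The worst case is a disaster striking just before the next backup,
    so the loss equals the interval between two backups.
    """
    if backups_per_day < 1:
        raise ValueError("at least one backup per day is required")
    return 24 / backups_per_day


# 1 and 2 backups per day match the table; 4 and 24 are extrapolations.
for frequency in (1, 2, 4, 24):
    print(f"{frequency:>2} backup(s)/day -> RPO up to {worst_case_rpo_hours(frequency):g}h")
```

    Even hourly backups leave an RPO of up to one hour, which is why an RPO of a few minutes calls for application-level replication rather than more frequent backups.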

    Our recommendation: Application replication + PBS

    For the majority of businesses, we recommend the application replication + PBS combination. MariaDB, PostgreSQL, or rsync replication between two sites is more than enough to ensure critical data continuity (RPO of a few minutes). PBS provides the additional protection layer against logical corruption with long retention and verified backups. This approach is simple to implement, cost-effective, and reliable — it covers both hardware failures and logical disasters (ransomware, human error) without the complexity of an inter-site Ceph infrastructure.

    To outsource your DRP backups, discover NimbusBackup for your Proxmox DRP: we offer hosted PBS solutions with multi-site replication and end-to-end encryption.

    For a detailed cost analysis of Proxmox infrastructure compared to VMware, see our TCO VMware vs Proxmox 2026 comparison.

    PBS: Your Best DRP Ally

    Proxmox Backup Server is much more than a simple backup tool. It is an enterprise-grade solution that natively integrates essential features for a reliable DRP:

    Deduplication and Efficiency

    • Chunk-level deduplication: 60 to 90% storage space reduction
    • Incremental backups: only modified blocks are transferred
    • Native compression: optimized storage and bandwidth usage
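
    To illustrate why chunk-level deduplication and incremental backups save so much space, here is a toy Python model of a content-addressed chunk store. It is a sketch of the general technique, not PBS's actual implementation: the 4-byte chunk size is deliberately tiny, and the `backup` and `restore` helpers are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny on purpose, real chunks are far larger


def backup(data: bytes, store: dict[bytes, bytes]) -> list[bytes]:
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns the index: the ordered list of digests needed to rebuild data.
    """
    index = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        store.setdefault(digest, chunk)  # a chunk already present is not stored again
        index.append(digest)
    return index


def restore(index: list[bytes], store: dict[bytes, bytes]) -> bytes:
    """Rebuild the original data from its index."""
    return b"".join(store[digest] for digest in index)


store: dict[bytes, bytes] = {}
monday = backup(b"AAAABBBBCCCCDDDD", store)
chunks_after_monday = len(store)
tuesday = backup(b"AAAABBBBXXXXDDDD", store)  # only the third block changed

print(chunks_after_monday, len(store))               # 4 5
print(restore(monday, store) == b"AAAABBBBCCCCDDDD")  # True
```

    The second backup adds a single chunk to the store, yet both backups remain fully restorable. This is the mechanism behind both the storage reduction and the small size of incremental transfers.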

    Security and Verification

    • Verify jobs: automatic integrity verification of every backup
    • Client-side AES-256-GCM encryption: data is encrypted before transfer
    • Sync jobs: PBS-to-PBS replication for off-site copies
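
    The idea behind an integrity verification can be sketched in a few lines of Python: if every chunk is stored under the hash of its content, re-hashing the stored bytes reveals any silent corruption. This is a simplified illustration of the principle, with hypothetical `put` and `verify` helpers, not the PBS verify job itself.

```python
import hashlib


def put(store: dict[str, bytes], chunk: bytes) -> str:
    """Store a chunk under the SHA-256 hex digest of its content."""
    name = hashlib.sha256(chunk).hexdigest()
    store[name] = chunk
    return name


def verify(store: dict[str, bytes]) -> list[str]:
    """Return the names of chunks whose content no longer matches their digest."""
    return [name for name, chunk in store.items()
            if hashlib.sha256(chunk).hexdigest() != name]


store: dict[str, bytes] = {}
first = put(store, b"database page 1")
second = put(store, b"database page 2")
print(verify(store))  # []

# Simulate silent corruption of one stored chunk (bit rot, faulty disk).
store[second] = b"database page 2 (corrupted)"
print(verify(store) == [second])  # True
```

    Running such a check on a schedule means corruption is detected while older, healthy copies still exist, not on the day of the restore.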

    Off-Site PBS with Nimbus

    For the off-site layer of your DRP, RDEM Systems offers Nimbus, our range of external backup solutions:

    • Nimbus Double Drive PBS: two mirrored disks in a remote datacenter for complete redundancy of your PBS backups
    • Nimbus Air Gapped PBS: physically disconnected disks in rotation, maximum protection against ransomware and account compromises

    Discover all our backup solutions at nimbus.rdem-systems.com.

    Testing Your DRP: The Key to Reliability

    An untested DRP should be expected to fail when it is needed. Regular testing validates that procedures work, that RTOs are achievable, and that teams know how to respond in a crisis situation.

    DRP Testing Methodology

    1. Documentation review (monthly): review of procedures, verification of emergency contacts, update of VM inventories and restoration priorities.
    2. Partial technical test (quarterly): restoring individual VMs from PBS on an isolated network. Verifying boot and application functionality. Measuring actual restoration time.
    3. Full failover test (semi-annual): activating the DR site, restoring all critical services, business validation by functional teams. Measuring actual RTO and RPO.
    4. Post-mortem (after each test): documenting gaps between objectives and actual results, corrective action plan, procedure updates.

    DRP Validation Checklist

    • PBS backups are intact (verify jobs OK)
    • Restored VMs boot correctly on the DR site
    • Business applications are functional after restoration
    • Measured RTO is less than or equal to the target
    • Measured RPO meets expectations
    • Network access (DNS, VPN, firewall) is operational on the DR site
    • Teams know the procedures and emergency contacts
    • DRP documentation is up to date and accessible outside the production site
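
    The checklist above lends itself to a simple automated report after each test. The following Python sketch is one possible way to record the results; the function name, the item labels, and the measured values are hypothetical.

```python
from datetime import timedelta


def evaluate_drp_test(checks: dict[str, bool],
                      measured_rto: timedelta, target_rto: timedelta,
                      measured_rpo: timedelta, target_rpo: timedelta) -> list[str]:
    """Return the failed checklist items (an empty list means the test passed)."""
    failures = [item for item, passed in checks.items() if not passed]
    if measured_rto > target_rto:
        failures.append(f"RTO exceeded: {measured_rto} > {target_rto}")
    if measured_rpo > target_rpo:
        failures.append(f"RPO exceeded: {measured_rpo} > {target_rpo}")
    return failures


# Hypothetical results from one failover test.
results = {
    "PBS verify jobs OK": True,
    "Restored VMs boot on the DR site": True,
    "Business applications functional": True,
    "DNS, VPN and firewall operational on the DR site": False,
    "Teams know procedures and emergency contacts": True,
    "Documentation reachable outside the production site": True,
}
failures = evaluate_drp_test(
    results,
    measured_rto=timedelta(hours=5), target_rto=timedelta(hours=4),
    measured_rpo=timedelta(minutes=10), target_rpo=timedelta(minutes=15),
)
for failure in failures:
    print("FAIL:", failure)
```

    The list of failures feeds directly into the post-mortem: each entry becomes a corrective action with an owner and a deadline.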

    DRP and Ransomware: Advanced Protection

    Ransomware is today the number one threat to IT infrastructure. An effective DRP must include specific anti-ransomware measures, as a sophisticated attacker will seek to compromise backups before triggering encryption.

    The 4 Pillars of Protection

    1. Air Gap

    Backup copies are kept on media that is physically disconnected from the network. Even an attacker with administrative access cannot reach a disk that is not plugged in. This is the ultimate protection.

    See: Nimbus Air Gapped PBS

    2. Immutability

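    PBS backups can be protected against deletion and modification through strict retention policies and separate credentials. In practice, the hypervisors authenticate to PBS with an account or API token that is only allowed to create new backups (for example, the DatastoreBackup role), so a compromised hypervisor can add snapshots but cannot prune or delete existing ones.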

    3. Encryption

    PBS supports client-side AES-256-GCM encryption. Data is encrypted before leaving the hypervisor. Even if the PBS server is compromised, the data remains unreadable without the encryption key.

    4. Access Separation

    Backup access credentials must be strictly separated from production credentials. A Proxmox admin account should not be able to delete PBS backups. This is the principle of least privilege, applied rigorously.

    Ransomware alert

    Modern attackers spend an average of 21 days in the system before triggering encryption. During this period, they identify and compromise backups. This is why long retention (90 days minimum) and air-gapped copies are essential: they allow restoring a healthy state prior to the compromise.
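
    The retention argument is simple arithmetic, sketched here in Python. It assumes that every backup taken after the initial intrusion is suspect, and it uses the 21-day average quoted above. An individual attack may dwell longer, so the margin matters. The function name is hypothetical.

```python
def clean_restore_days(retention_days: int, dwell_days: int) -> int:
    """Days of retained backups that predate the intrusion (0 if none remain)."""
    return max(0, retention_days - dwell_days)


for retention in (14, 30, 90):
    margin = clean_restore_days(retention, dwell_days=21)
    print(f"{retention}-day retention -> {margin} day(s) of clean backups")
```

    With 14 days of retention, no clean restore point remains. With 30 days, only 9 days of backups predate the intrusion. With 90 days, the margin is 69 days, which also covers attacks that stay hidden well beyond the average.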

    PBS Tape: Long-Term Archival

    For businesses with regulatory archival obligations (10 years in the financial sector), PBS supports export to magnetic tape. Tape offers very low-cost storage, a durability of 30+ years, and native air-gap protection (tapes are physically removable and can be stored in a safe).

    Our DRP Approach at RDEM Systems

    At RDEM Systems, we support businesses in the design, implementation, and ongoing maintenance of their Proxmox DRP. Our approach stands out through its pragmatism and adaptation to each client's actual budget and constraints.

    • Resilience audit: analysis of your existing infrastructure, identification of critical VMs, definition of target RPO/RTO per business service
    • Custom DRP architecture: designing a multi-site architecture tailored to your constraints (budget, regulatory, geographic). Choice between Ceph replication, off-site PBS, or a hybrid approach
    • Implementation and documentation: deploying the DRP architecture, configuring PBS backups and replication, writing recovery procedures
    • Quarterly DRP tests: executing restoration tests, measuring actual RPO/RTO, compliance reporting and improvement plan
    • Monitoring and alerting: continuous backup monitoring, alerts on failures, proactive backup integrity verification

    Our DRP integrates into our comprehensive Proxmox managed services offering and benefits from our sovereign infrastructure operated from France. For off-site backup needs, our Nimbus range covers all protection levels, from standard backup to air-gapped.

    For comprehensive support, discover our 24/7 managed services and on-call support for your DRP: monitoring, failover testing, and support in case of disaster.

    Check our pricing or contact us for a free resilience audit.

    If you are considering a migration from VMware, our VMware to Proxmox migration guide integrates DRP considerations from the start.

    Let's Design Your Proxmox DRP Together

    RDEM Systems supports you from the resilience audit to the implementation of your multi-site DRP. Get a free audit of your infrastructure and a DRP architecture proposal tailored to your needs.