MSPs have a duty to their clients to minimize downtime and keep them online and fully operational. Part of keeping downtime low is preparing for the unexpected, and setting clear goals for resolving issues and getting back online is critical to that preparation.
Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are concepts that play a key role in the disaster recovery planning process. Managed service providers should examine each of these metrics, define their role in the recovery process, and work to build these objectives into their clients’ resilience plans.
What this article will cover:
- What are RTO and RPO?
- The differences between RTO and RPO
- Why RTO and RPO are important to managed service providers
- How to calculate RTO and RPO
- How IT tools help you meet recovery objectives
What is an RTO?
Recovery Time Objective (RTO) defines the parameters for how quickly a business must recover its systems from downtime after an incident. This is calculated for each individual client and is unique to their operation.
Defining an RTO allows you to make more informed decisions about backup and disaster recovery (BDR) solutions and implementation. Hard numbers make it easier to keep things realistic and objective-based, rather than relying on an ambiguous idea like “get them back up and running as quickly as possible.” Such generalizations are hard to define in Service Level Agreements.
It’s easy to visualize an RTO in action. It’s simply a goal set by analyzing the costs and risks associated with downtime (it’s good to define what “downtime” means to the specific client) and determining how long they can wait to recover before losses become significant.
Some factors that might influence a client’s RTO include:
- How much revenue their business will lose for every hour of downtime
- How much financial loss they can/will absorb during an emergency
- The availability of resources needed to restore operations
- Their own customers’ tolerance for downtime
If a client needs their systems working within three hours, this is their RTO. If their calculated average time to actual recovery is five hours, they have exceeded their RTO by two hours. Because this is a preparatory calculation, it indicates that more investments need to be made into BDR to reduce the actual time to recovery.
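The three-hour example above can be written as a simple check. This is a minimal sketch, not a prescribed method: the function name `rto_gap_hours` and the sample recovery times are hypothetical, chosen so the average works out to five hours.

```python
from statistics import mean

def rto_gap_hours(rto_hours, recovery_times_hours):
    """Return how far the average recovery time exceeds the RTO.

    A positive result means the RTO is being missed; a negative
    result means there is headroom.
    """
    return mean(recovery_times_hours) - rto_hours

# Hypothetical recovery times (in hours) from past incidents or drills
observed = [4.5, 5.0, 5.5]

gap = rto_gap_hours(3, observed)
print(f"Average recovery exceeds the RTO by {gap:.1f} hours")
```

A positive gap like this one is the signal that the current BDR setup cannot meet the agreed objective.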
What is an RPO?
Recovery Point Objective (RPO) is a similar risk/loss threshold. Whereas the RTO defines the amount of time that can be lost, the RPO defines the amount of data your client can lose without significant or catastrophic results.
This largely centers on data backup cadence: how often backups run and, therefore, how old the most recent backup point can be. If your client were to lose everything right now and had nothing but their last backup, how much mission-critical data would they still have?
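To make that question concrete, here is a rough sketch that measures the data-loss window for a single incident and checks a backup schedule against an RPO. The timestamps, function names, and the simplification that the worst-case loss equals the backup interval (ignoring how long a backup takes to complete) are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def data_loss_window(last_backup, incident):
    """Time between the last good backup and the incident."""
    return incident - last_backup

def meets_rpo(backup_interval, rpo):
    """Worst case: the incident hits just before the next backup runs,
    so up to one full backup interval of data is lost."""
    return backup_interval <= rpo

# Hypothetical timestamps: nightly backup at 02:00, incident at 14:30
last_backup = datetime(2024, 1, 15, 2, 0)
incident = datetime(2024, 1, 15, 14, 30)

window = data_loss_window(last_backup, incident)
print(f"Data entered in the last {window} would be lost")

# A nightly backup cannot satisfy a four-hour RPO; an hourly one can
print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))
print(meets_rpo(timedelta(hours=1), timedelta(hours=4)))
```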
Many use healthcare operations as an example of RPO. While some companies can afford to lose whatever data they enter over the course of a week (they may be able to just reenter it off of paper documents), hospitals generally don’t have that margin for error. With dozens of medical professionals dispensing thousands of medications each day, there’s very little chance that staff will remember everything they’ve done or need to do concerning treatment.
And since we’re talking about pharmaceuticals, losing even a day’s worth of data could mean messing up doses or mixing medications. These are potentially life-threatening problems, so such an operation needs to back up its data frequently. That need for up-to-date data informs their RPO.
RPOs are important for the MSP because they help guide recommendations for data backup solutions, especially when it comes to storage space and modality. More frequent backups mean more data usage, so when you explain the value of that additional cost, make sure clients understand why their RPO matters.
Some factors that might influence a client’s RPO include:
- Complexity and number of critical applications and systems
- Data volume and access requirements
- How frequently data changes (i.e. how often important information is added or amended in a file)
- Data backup frequency and method
What is the difference between RTO and RPO?
Both metrics are important when formulating data backup and data recovery plans. RPO and RTO will help you decide on key backup and recovery features and inform your recommendations about client BDR solutions. Ultimately, your goal is to ensure that critical data and systems are available when needed, and these calculations can help you meet that goal.
While both are functional within recovery planning, they are different in practice. Active RTOs are typically designated after an event occurs (except those used theoretically during planning). RPOs are always determined before recovery is needed.
In some cases, recovery planning centers around systems and not data. In these situations, the only concern is RTO. As soon as data becomes part of the equation, the MSP will want to calculate and factor in the RPO. It is worth noting that when the two are combined, a short RTO usually requires an equally short RPO.
Calculating RPO and RTO
You will commonly determine relevant RTO and RPO targets during a Business Impact Analysis (BIA) or general risk analysis.
When using a BIA, your goal is to identify mission-critical business processes and identify the technologies and data needed to support those operations. These reports will often consider the financial implications of downtime or interruption and illustrate potential downtime risks.
The MSP will generally seek input from client leadership or relevant senior management to identify objectives and assign numeric values to recovery scenarios. You may begin by exploring best-case and worst-case scenarios and working your way backward to find attainable, reasonable numbers.
There’s no standard formula for calculating RTO/RPO values, as they are numeric time values unique to every organization. A critical server might have an RTO of one hour, while a less critical system’s RTO might be 24 hours. The entire purpose of the BIA is to find reasonable objectives based on how necessary various systems are to the client.
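One way to keep per-system objectives organized after a BIA is a small record for each system. The sketch below is illustrative only: the system names and RPO values are hypothetical, and only the one-hour and 24-hour RTOs come from the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecoveryObjective:
    system: str
    rto_hours: float  # how quickly the system must be restored
    rpo_hours: float  # how much recent data the client can afford to lose

# Hypothetical BIA output; the RPO values are assumed for illustration
objectives = [
    RecoveryObjective("file-archive", rto_hours=24, rpo_hours=24),
    RecoveryObjective("critical-server", rto_hours=1, rpo_hours=0.25),
]

# Systems with the shortest RTO are restored first
restore_order = sorted(objectives, key=lambda o: o.rto_hours)
for o in restore_order:
    print(f"{o.system}: RTO {o.rto_hours}h, RPO {o.rpo_hours}h")
```

Sorting by RTO gives a first-pass restoration order that can then be adjusted for dependencies between systems.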
As RTO and RPO decrease, the costs involved in reaching those objectives are likely to increase. In that regard, RTO/RPO calculations give you the information you need to research solutions and price points — essential forethought when it comes to keeping your agreements profitable.
The BIA and RTO/RPO figures can be useful during the sales process, as well. Conflicts often arise around costs, so it’s important to be able to show the value of BDR services. This is easier when pointing out that less expensive solutions wouldn’t meet their RTO/RPO needs and would result in greater costs should a disaster occur.
How NinjaOne helps MSPs meet RTOs and RPOs
Based on the results of your risk analysis and BIA, you should have a good idea of what could put your client at risk. Part of the overall analysis is determining the frequency of occurrences, the likelihood of the danger, and the possible effects they could have on the client.
Once you’ve quantified these risk-based metrics, you can translate these factors into recommended assets and measures. A centralized Remote Monitoring and Management hub like NinjaOne makes both of these processes far simpler. By aggregating data about important client assets and utilization, you can gain deeper insight while determining RPO and RTO.
And because BDR is central to meeting RTO/RPO targets, NinjaOne’s integrated backup solution helps you address the challenge, not just evaluate it.
Adding Value with Business Impact Analysis (BIA)
Understanding RTO and RPO is crucial for establishing a sound disaster recovery plan, but these metrics become even more powerful when used in conjunction with a business impact analysis (BIA). A BIA predicts the consequences of disruption to business functions and processes, enabling businesses to develop comprehensive recovery strategies.
The BIA identifies both the operational and financial impacts that might result from a disruption. These impacts can range from lost sales and income to increased expenses like overtime labor and regulatory fines, and even customer dissatisfaction or defection. It also takes into account the timing and duration of the disruption, as the point in time when a business function or process is disrupted can have a significant bearing on the loss sustained.
The Real Cost of Recovery
When discussing recovery, it’s crucial to consider the cost of recovery. These costs should be compared with the potential impacts of a disruption, as determined by the BIA. The BIA report should prioritize the restoration of business processes that have the greatest operational and financial impacts. It’s worth noting that as RTO and RPO decrease, the costs involved in reaching those objectives are likely to increase.
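The comparison between recovery cost and disruption impact can be sketched with simple arithmetic. Everything here is an assumption for illustration: the two solution tiers, their prices and recovery times, the hourly loss figure, and the one-incident-per-year rate are hypothetical inputs that would normally come from the BIA.

```python
# Hypothetical BIA inputs
LOSS_PER_HOUR = 1_000       # revenue lost per hour of downtime
INCIDENTS_PER_YEAR = 1      # expected number of disruptive incidents

# Hypothetical BDR options: annual price and expected time to recover
options = {
    "basic": {"annual_cost": 2_000, "recovery_hours": 24},
    "premium": {"annual_cost": 10_000, "recovery_hours": 2},
}

def total_annual_cost(annual_cost, recovery_hours):
    """Solution price plus the expected downtime loss it leaves behind."""
    expected_downtime_loss = INCIDENTS_PER_YEAR * recovery_hours * LOSS_PER_HOUR
    return annual_cost + expected_downtime_loss

totals = {name: total_annual_cost(**opt) for name, opt in options.items()}
cheapest = min(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total} per year including expected downtime loss")
print(f"Lowest total cost: {cheapest}")
```

With these assumed numbers, the more expensive tier has the lower total cost once expected downtime losses are included.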
Types of Disasters and Contingency Plans
Disasters that can disrupt business operations come in various forms, such as physical damage to buildings, breakdown of machinery, restricted access to a site, interruption of the supply chain, utility outages, and even absenteeism of essential employees. As such, it’s important to have contingency plans tailored for different types of disasters. These plans should be informed by the results of the BIA and the defined RTOs and RPOs.
Testing and Rehearsing Disaster Recovery Plans
Once you’ve identified RTOs and RPOs, conducted a BIA, and created contingency plans, it’s crucial to test and rehearse these disaster recovery strategies. Regular testing ensures that your plans are effective and gives you a chance to refine your strategies based on the results. It’s also a good way to make sure all stakeholders are familiar with their roles during an actual recovery scenario.
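Results from a recovery drill can be compared against the agreed objectives in a few lines. This is a minimal sketch under stated assumptions: the `DrillResult` record, the system names, and all measured values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DrillResult:
    system: str
    rto_hours: float
    rpo_hours: float
    measured_recovery_hours: float   # time taken to restore during the drill
    measured_data_loss_hours: float  # age of the backup that was restored

    def passed(self) -> bool:
        """A drill passes only if both objectives were met."""
        return (self.measured_recovery_hours <= self.rto_hours
                and self.measured_data_loss_hours <= self.rpo_hours)

# Hypothetical drill outcomes
drills = [
    DrillResult("critical-server", 1, 0.25, 1.5, 0.2),
    DrillResult("file-archive", 24, 24, 6, 12),
]

failed = [d.system for d in drills if not d.passed()]
print("Systems that missed an objective:", failed)
```

Any system that appears in the failed list is a candidate for plan refinement before the next rehearsal.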
Importance of Vendor Relationships
Vendors play an important role in disaster recovery plans. It’s crucial to establish strong relationships with vendors who can provide the necessary services and resources during a recovery process. You might need to rely on these vendors to restore services, provide alternative solutions, or even provide resources for a temporary location.
RTO and RPO are key metrics that guide the development of your disaster recovery plan. However, they should not be used in isolation. By conducting a thorough BIA, considering the cost of recovery, preparing for various types of disasters, testing and rehearsing your plans, and building strong vendor relationships, you can create a robust and reliable disaster recovery strategy.
Conclusion
There are two metrics that help MSPs achieve the best results when it comes to data backup and recovery: Recovery Time Objective and Recovery Point Objective. Both metrics are essential and interrelated when working with data backup and recovery solutions, business continuity strategies, and disaster recovery plans.
For an MSP that provides data backup and recovery, these metrics are essential for planning the solution and demonstrating its value. RTO and RPO each help determine the optimum data backup and technology configuration to achieve the client’s goals. These figures can also be important for compliance and auditing, as auditors might look for evidence of these values as documented data backup and recovery controls.
NinjaOne can help you obtain, calculate, and utilize RPO/RTO for your clients, assuring that you provide the best service and meet expectations when it comes to uptime and operational continuity.