Author: DKALYA

Quick Reference for DR and BC Metrics – RPO, RTO & WRT Concepts

 

New Feature: Listen to this Article

How can I not have an article on Disaster Recovery and Business Continuity Planning? A must have understanding for anyone in Security.

If you are a security professional with years of experience, then you are very familiar with these important fundamental metrics that is used in developing a Business Impact Analysis (BIA) Report which will identity your business processes , identify resources required for recovering of these processes in the event of a disaster and a become part of your Business Continuity Plan (BCP).

The metrics I am referring to are RPORTO and WRT. Also, Maximum Tolerable Downtime. I hope someone who is just getting into security and trying to grasp this concept will find this explanation very useful.

Example:

Let us assume a business which is operating normally represented by the following chart. Note, the X axis represents Time. The concepts that we are going to learn are a function of time. Time scale = 1 hr

Normal Operation.


Figure 1

Disaster Strikes.


Figure 2

Recovery Efforts Begin

Figure 3

Normal Operation Resumes

Figure 4

A disaster hits a business which is under normal operation at 3 am, recovery starts at 6 am, normal operation resumes at 8 am. Then we can define the terms as follows:

  • Recovery point objective (RPO) is defined as Measures maximum acceptable data point to be recovered.
  • Recovery Time Objective (RTO) is defined as Maximum time needed for data recovery.
  • Work Recovery Time (WRT) is defined as Maximum amount of time needed to verify data integrity to resume operation.

Maximum Tolerable Downtime (MTD) is defined as The amount of time business process can be disrupted without causing significant harm to the organization’s mission.

For this particular example, from Figure 4 shows a RTO of 3 hrs and WRT of 2 hrs. The MTD is calculated as follows:
MTD RTO WRT
MTD = 3 hrs. + 2 hrs.
MTD = 5 hrs.

This is a very simple example for understanding the concept of calculating the Maximum Tolerable Downtime. For a deeper understanding I recommend indulging into books and materials written on DR and BC. Note that there is a very thin line and it can get blurred between resuming total business normal operation which may mean that you have switched back to the primary site for operation. For practical purposes , getting back to normal operation is more critical and important than returning to the primary site.

If you would like to get more understanding of these topics please see the following references:

A technical article on RTO Vs RPO by msp360.com

A blog post from Default Reasoning by Marek Zdrojewski

What is Operational Technology (OT)?

I have been asked several times, at several occasions about this mysterious term called OT. While it stands for Operational Technology, what is Operational Technology?

Few years ago, I stood in front of a large audience from diverse backgrounds such as Process control, Maintenance, IT and management and delivering a motivating speech on cybersecurity for manufacturing, while I had never used the word OT in the context of Industrial Control System, the word existed but not necessarily used as commonly as today.

While different individuals use it differently to describe their trade, especially in Industrial Cybersecurity practice, the word itself has become somewhat of a open secret, we think we know what it means, but do we really? So to tackle this problem, I asked a bunch of people in my closest professional circle and tried to define it myself. While the definition of OT might change, but as far as I am concerned, what I am about to tell you will still be relevant because instead of defining all the different systems that the Operational Technology represents, I will simply define what it represents.

So here you,

Operational technology (OT) is a set of hardware, software, and communication systems that are used to monitor, control, and automate industrial processes. OT systems are typically used in critical infrastructure industries such as manufacturing, energy, and transportation.

These can include IACS and Control System Components, Information Technology components that are part of the control systems etc., but are defined by the organization to appropriately apply the necessary security measures and controls.

Lets look at a diagram. Yes, that is MS Paint and I did create it 5 years ago. As you can see, OT, Operational Technology is an umbrella term defined by the organization to include systems such as Industrial Automation and Control systems (IACS), Fire Systems, Access Control Systems, Lighting Controls etc.

What is the relevance of defining Operational Technology?

The importance of defining OT for your organization is simply to be able to develop / design and implement appropriate security controls and measures to protect your business operations. Properly identifying what falls inside the OT environment and what falls outside, what is included and what is excluded, will provide you with the right information to develop your OT Cybersecurity Strategy.

Do you have any comments or suggestions to improve this definition? Send me a message.