I’ve found that it’s common to assume that people actually know how to perform certain manual tasks when testing proves they don’t.
At one stage, one of our clients had a plan to use walkie-talkies if the mobile network went down.
The experience of the floods in Queensland and northern NSW, and of the bushfires and heat waves in Victoria, stressed that we can’t ask for the luxury solutions straight away.
The APRA standard for the finance industry introduced in 2005 forced banks and insurance companies to make plans – and test these plans.
For a large-scale BCP test, my recommendation is that it’s better to aim for partial testing of one or more ‘end-to-end’ processes, say, every 6-12 months, rather than ‘resource-based testing’ such as ‘all the IT’. Testing an end-to-end process (for example, an entire customer service ‘flow’ from a customer phoning up, to the right person handling the call from an alternate workplace, to a transaction being logged and the customer’s issue resolved) is usually far more insightful than just testing all IT recovery aspects but not the staff facilities, phone systems and other items that are part of a time-critical process.
I’ve found that in organisations that have a rhythm or natural regularity of testing, we see different aspects of the plan being tested more frequently. Do the different types of testing in a way that people can manage them as part of their everyday operations.
Of course, the walkie-talkies were hard to locate in a dusty box in some corner of the office. Testing found that new and younger staff had not been trained in how to use them and, guess what, there was no procedure written down.
There’s a step in risk assessment where you look at what things could happen, what the core consequence scenarios are that we should plan for, and what the likelihoods and possible impacts of those scenarios are. Security is one of the sources of risk, but it is also one of the controls that you put in place. For example, a source of risk might be people breaking into your warehouse.
Plus, some scenarios go on for longer than you planned, and manual workarounds let you do something rather than nothing. Given the magnitude of an event – say, it’s your building and/or your people and your systems and/or your suppliers that are affected – you may need to extend the planned manual workaround, so documentation, training and testing are very important.
For example, if something dramatic happens and I’m a call centre that’s normally open from 8am-8pm, perhaps in the first two weeks following an incident I might look at opening from 9am-5pm or 8am-12pm in order to sustain my business at an acceptable level for a certain period of time.
I think the current process reflects the experience that we’ve had with actual disasters, compliance and the testing of these plans over time.
Manual workarounds let us get back to a level which isn’t 100% but is acceptable to our clients, stakeholders, shareholders and the general community. The documentation of manual workarounds is very important, as it isn’t simply about how things were done in the past, say using a fax, the post or a hard-copy form, when the Internet is down.