Technology needs robust governance to mitigate the risks of disaster.

It goes without saying that technology is now critical for all areas of life and society.

It is almost impossible to do anything without relying on technology in some way or another.

It is also safe to say that this reliance will only continue to grow, both (a) as society depends on technology for more of its activities and (b) as technology as a discipline expands with new capabilities such as artificial intelligence, big data, machine learning and the internet of things.

Despite this reliance on technology, the list of major issues and problems is vast. A simple Internet search found the following within 30 seconds:

Airbus A380 suffers from incompatible software issues (2006)
AT&T network collapse (1990)
Blackberry Outage (2011)
Chernobyl Accident (1986)
EDS and the Child Support Agency (2004)
Faulty Soviet early warning system nearly causes WWIII (1983)
iPhone Bendgate (2014)
LA Airport flights grounded (2007)
Mars Climate Orbiter metric problem (1998)
Siemens and the passport system (1999)
Southwest outage (2016)
The explosion of the Ariane 5 (1996)
The two-digit year-2000 problem (1999/2000)
Various data breaches (ongoing)

This growing reliance and dependence does cause several issues:

Technology infrastructures are complex. They contain many different and interacting parts which need to work together to provide the service required. These individual parts cover wider areas such as the ‘pure’ technology elements (namely hardware and software), the processes to support the technology (such as change and service management) and, finally, the people who operate and use the technology.
Technology platforms support critical services, for example power supplies, airports and shipping. Any issues with the technology could, at best, cause disruption or, at worst, a loss of life.
Regulators are now very aware of this complexity and the critical nature of technology. They are therefore implementing legislation to ensure that infrastructures are (a) sufficiently managed to reduce the likelihood of material problems and (b) used properly for legal and moral purposes, such as not trying to sell products to vulnerable individuals.

To ensure that these laws (such as GDPR and the Senior Managers and Certification Regime (SM&CR) for financial services organisations) are taken seriously, named individuals are often made responsible for compliance. This means that in the event of breaches, both the organisation and the named individuals are subject to prosecution. Thus, organisations need to develop capabilities to ensure they have suitable and robust governance in place around their technology services.

What exactly is governance?

Unfortunately, the term ‘governance’ frequently gets a bad press.

It is often seen as red tape which is run by ‘busy bodies’ who insist that pointless forms are completed, which do not provide any benefit, waste people’s time and get in the way of doing ‘proper work’.

Whilst I think that everyone can relate to completing pointless forms at some point, it is important to note that well-designed governance procedures are essential to help control the activities, issues and risks of an organisation – both holistically as well as for technology.

So, what is good governance?

If one were to search for the terms ‘technology’ and ‘governance’ on the internet, several definitions would result. While there are differences between them, they do cover several similar themes. These can be summarised as ensuring the following:

The overall objective for technology is clearly defined and understood. This objective should also be morally correct and comply with all relevant laws.
There are controls around technology to ensure that this objective is met, namely processes to track performance, manage problems that have arisen, predict future problems so they can be mitigated, and safely make changes to the infrastructure.
There are clear roles, responsibilities and ownership across all elements of the technology. For example, who is responsible for tracking performance, managing issues, making changes and so on. This is particularly important if local laws demand that individuals are named.

What does good governance look like for technology?

There is no ‘off the shelf’ governance model that fits all areas, but a simple model can be defined containing seven parts:

Objective and importance

It is important to understand both what the purpose of the technology is and what the implications are if there are issues or problems. This is key because it will shape what performance monitoring is required, how critical issues are managed and the approach for trying to predict issues. Critical functions will require much tighter controls than less critical functions.

Clear roles, responsibilities and ownership

As discussed, technology infrastructures are complex and contain many elements. A clear owner (with executive power) needs to be allocated to each element to ensure it is managed. Again, this is particularly important if there is legislation in place where named owners are required.

Typically, most organisations will split this ownership model into two levels. At a senior level, there will be a forum which owns the elements at an executive or Board level. Under this, there will be several working groups and owners who own the elements at an operational level.

Performance measurement

Once the objective above is defined, it will determine what the implications are if there are issues with the technology. This in turn allows a number of key performance indicators (KPIs) to be defined so that performance can be monitored.

KPIs themselves contain two parts: namely the measure itself and the frequency.

The measure will focus on what needs to be tracked to ensure the objective is being met. Each measure will have alert thresholds which, if breached, will trigger activity. E.g., there could be a measure for response times, and if responses take longer than a certain value, an alert is raised.
The frequency determines how often the KPI needs to be measured and reported. A simple rule-of-thumb is that the more critical the KPI, then the more frequently it needs to be measured. E.g., monitoring spare disk space may be measured each second, while processing volumes could be measured daily.
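
As a rough illustration of the two parts of a KPI, the following Python sketch models a measure with an alert threshold together with a checking frequency. The class name, field names and all numeric values are assumptions made for this example; they are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """A KPI as described above: a measure with an alert threshold, plus a frequency."""
    name: str
    alert_threshold: float       # hypothetical limit; breaching it raises an alert
    check_every_seconds: int     # how often the measure is sampled and reported
    alert_when_above: bool = True  # False for measures where low values are the problem

    def is_breached(self, value: float) -> bool:
        if self.alert_when_above:
            return value > self.alert_threshold
        return value < self.alert_threshold


# Illustrative values only: the more critical the KPI, the shorter the interval.
response_time = Kpi("response_time_ms", alert_threshold=500.0, check_every_seconds=60)
free_disk = Kpi("free_disk_gb", alert_threshold=20.0, check_every_seconds=1,
                alert_when_above=False)
daily_volume = Kpi("processing_volume", alert_threshold=1_000_000.0,
                   check_every_seconds=86_400)


def check(kpi: Kpi, value: float) -> str:
    """Return a one-line status for a single reading of the KPI."""
    status = "ALERT" if kpi.is_breached(value) else "ok"
    return f"{kpi.name}: {value} -> {status}"


print(check(response_time, 750.0))  # slow responses breach the threshold
print(check(free_disk, 35.0))       # plenty of disk space left, no alert
```

In practice the thresholds and frequencies would be derived from the objective and criticality of the service rather than hard-coded.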

Managing current problems

Regardless of how thorough people are, problems will happen. Therefore, processes need to be put in place to identify, manage and eventually close problems.

Once a problem has been raised, its impact needs to be understood. This process is called triaging. If the problem is critical, then processes need to be in place to manage it urgently. This could cover areas such as implementing workarounds, installing emergency fixes or other contingencies. Likewise, for non-critical issues, a similar set of procedures is required, although these tend to be less urgent.
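
The triage step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the single triage question (does the problem affect a critical service?) and the wording of each step are hypothetical, and a real process would normally weigh more factors.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Severity(Enum):
    CRITICAL = "critical"
    NON_CRITICAL = "non-critical"


@dataclass
class Problem:
    summary: str
    affects_critical_service: bool  # assumed triage question for this sketch
    workaround_available: bool


def triage(problem: Problem) -> Severity:
    """Classify a raised problem by its impact."""
    if problem.affects_critical_service:
        return Severity.CRITICAL
    return Severity.NON_CRITICAL


def next_steps(problem: Problem) -> List[str]:
    """Return the handling steps, from triage through to closure."""
    if triage(problem) is Severity.CRITICAL:
        steps = ["escalate to the service owner immediately"]
        if problem.workaround_available:
            steps.append("implement the workaround")
        else:
            steps.append("prepare an emergency fix")
    else:
        steps = ["log in the problem backlog", "schedule a fix in a planned change"]
    steps.append("close once the fix is verified")
    return steps


outage = Problem("Payments service unavailable", affects_critical_service=True,
                 workaround_available=False)
typo = Problem("Typo on internal report", affects_critical_service=False,
               workaround_available=True)

print(triage(outage).value, next_steps(outage))
print(triage(typo).value, next_steps(typo))
```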

Managing future problems

This area (often called Risk Management) is both important and challenging, because it is hard to predict future problems. However, a common approach is to look at the risks through both internal and external lenses.

The internal assessment involves reviewing all parts of the infrastructure to determine whether any problems are likely, including old versions of software, reliance on key individuals, a supplier which is failing and so on. Once risks have been identified, their impact can be assessed so that mitigations can be implemented.
Assessing external risks is more challenging and most organisations will either miss them or discover them late. Who would have foreseen COVID-19 during early 2019, for instance? However, one method is to assess political, technological, environmental and social impacts in the jurisdictions in which the organisation operates so that risks can be identified and managed.

All risks identified need to be monitored on a regular basis, because their criticality could change as time progresses.
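
One way to keep such a list of risks under regular review is a simple risk register. The sketch below assumes a likelihood-times-impact score on 1 to 5 scales, which is a common convention but not something this article prescribes; the risks and their ratings are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Risk:
    description: str
    source: str        # "internal" or "external", matching the two lenses above
    likelihood: int    # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int        # assumed scale: 1 (minor) to 5 (severe)
    mitigation: str = ""

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


def review(risks: List[Risk]) -> List[Risk]:
    """Return the risks ordered from most to least critical."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


register = [
    Risk("Old software version still in production", "internal", 4, 4, "plan upgrade"),
    Risk("Reliance on one key individual", "internal", 3, 3, "cross-train a deputy"),
    Risk("New regulation in an operating jurisdiction", "external", 2, 5,
         "track consultations"),
]

for risk in review(register):
    print(f"{risk.score:>2}  [{risk.source}] {risk.description}")

# Criticality can change over time, so ratings are updated at each review.
register[2].likelihood = 4
top_risk = review(register)[0]
print("Top risk after re-assessment:", top_risk.description)
```

Re-running the review after each re-assessment keeps the most critical risks at the top of the list.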

‘With society’s almost 100% reliance on technology then appropriate governance is essential to ensuring technology runs smoothly.’

Change management

Processes need to be in place to allow changes to be made to the technology infrastructure with as little risk and impact as possible. Actual changes can range from immediate changes (to fix critical problems) to large complex changes requiring weekends to implement (such as a major hardware upgrade). These change processes also need to include a robust roll-back process in the event of problems during the change implementation.
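
The idea of pairing every change with a roll-back can be sketched in Python as below. The function `apply_change` and its three callbacks are hypothetical names introduced for this example, and the "infrastructure" is just a dictionary; it is a sketch of the pattern rather than a real change-management tool.

```python
from typing import Callable, Dict


def apply_change(apply: Callable[[], None],
                 verify: Callable[[], bool],
                 roll_back: Callable[[], None]) -> str:
    """Apply a change, verify it, and roll back if anything goes wrong."""
    try:
        apply()
        if verify():
            return "implemented"
        reason = "verification failed"
    except Exception as exc:  # any failure during the change triggers a roll-back
        reason = f"error during change: {exc}"
    roll_back()
    return f"rolled back ({reason})"


# Toy example: a configuration dictionary stands in for the infrastructure.
config: Dict[str, str] = {"version": "1.0"}
previous = dict(config)  # snapshot taken before the change


def upgrade() -> None:
    config["version"] = "2.0"


def healthy() -> bool:
    return False  # simulate a failed post-change health check


def restore() -> None:
    config.clear()
    config.update(previous)


result = apply_change(upgrade, healthy, restore)
print(result, config)
```

Taking the snapshot before the change starts is what makes the roll-back possible, so that step belongs in the process itself rather than being left to the person making the change.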

Effective communication

This area is often forgotten but ineffective communication will cause issues. Typically, there are regular dashboards that will report on KPIs, problems and risks. These will be issued to the relevant people on a regular basis; however, if any communication is issued, it is important that the receiver understands the information so they can react as required.

If too much information is provided (which is very common) and / or produced too frequently, the readers may not be able to determine the key issues or escalations, which could mean that they do not address what needs to be done. Similarly, if the information provided is too sparse (again very common) then it may not contain all the detail required, which could mean the reader misunderstands it and again doesn’t react as required.

Therefore, all communication needs to be at the appropriate level of detail and frequency to allow the receiver to understand it and react accordingly.

Summary

With the increased reliance on technology across society, it is now essential that organisations have appropriate governance in place to ensure technology platforms are suitably maintained.
