The Matrix That Isn't

Being invited to offer a tutorial on the MATLAB and GNU Octave matrix languages for cyber security specialists prompted me to revisit a question that has bothered me for some time.

In cyber security, as in many other fields, a ‘Risk Matrix’ is a table of likelihood versus severity, into whose boxes one places various risk events. Likelihood times severity gives a useful metric for the ‘likely severity’ – called ‘impact’ – so we can focus our attention on the most likely and severe events.

Severity \ Likelihood | Low | Medium | High
High severity | Low likelihood, High severity | Medium likelihood, High severity | High likelihood, High severity
Medium severity | Low likelihood, Medium severity | Medium likelihood, Medium severity | High likelihood, Medium severity
Low severity | Low likelihood, Low severity | Medium likelihood, Low severity | High likelihood, Low severity

Figure 1: Conventional Risk Matrix

The idea is to place events in their relevant likelihood-severity box. In the example table, events that end up in the top right corner are highly likely and highly severe, so best avoided.

The problem with the Risk Matrix is that it isn’t really a matrix.

Well, it IS a matrix in the most general sense – a table with entries in its boxes – but it isn’t a very useful one in the mathematical sense.

Let’s be clear: a matrix can contain anything at all in its boxes: symbols, numbers, text, images, anything. But we don’t have readily accessible – and, more importantly, standardised – tools available to work with arbitrary matrices, so we tend to stick to ones that have numbers, which give us the opportunity to leverage Matrix Arithmetic. So let’s assign numeric likelihoods and severities. I’m going with a statistician’s model – a range from 0 (no likelihood or severity) to 1 (total meltdown) – with, say, five levels: ‘very low’ being 0.1 and ‘very high’ being 0.9, with even spacing in between (this is a linear scale):

very low | low | medium | high | very high
0.1 | 0.3 | 0.5 | 0.7 | 0.9

Figure 2: Likelihood and severity numeric ranges

The conventional Risk Matrix multiplies likelihood by severity to calculate impact:

Severity \ Likelihood | 0.1 | 0.3 | 0.5 | 0.7 | 0.9
0.9 | 0.09 | 0.27 | 0.45 | 0.63 | 0.81
0.7 | 0.07 | 0.21 | 0.35 | 0.49 | 0.63
0.5 | 0.05 | 0.15 | 0.25 | 0.35 | 0.45
0.3 | 0.03 | 0.09 | 0.15 | 0.21 | 0.27
0.1 | 0.01 | 0.03 | 0.05 | 0.07 | 0.09

Figure 3: Conventional Risk Matrix with impact
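
The impact table is simply the outer product of the severity and likelihood scales. As a minimal sketch in plain Python (not the Octave used for the figures; the variable names are illustrative), it can be reproduced like this:

```python
# Numeric levels from Figure 2: very low .. very high.
levels = [0.1, 0.3, 0.5, 0.7, 0.9]

likelihood = levels          # columns, left to right
severity = levels[::-1]      # rows, top to bottom (highest severity first, as in Figure 3)

# Outer product: impact[r][c] = severity[r] * likelihood[c].
# Rounding to 2 places only hides floating-point noise such as 0.27000000000000002.
impact = [[round(s * l, 2) for l in likelihood] for s in severity]

for s, row in zip(severity, impact):
    print(s, row)
```

In Octave, with both scales held as row vectors, the same table is the one-line outer product `severity' * likelihood`.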

This actually adds no new information, but because I am using a matrix programming language, I can at least use a nice ‘surface’ plot to visualize this Risk Matrix:


Figure 4: Conventional Risk Matrix as a surface plot

(A surface plot shows a value as height over two dimensions: in this case the height is impact, and the two dimensions are likelihood and severity.)

The conventional Risk Matrix is problematic because its ‘impact’ is redundant – we already know it because we know both likelihood and severity – so it doesn’t really add any insight beyond ‘look at the high likelihood high severity corner’.

The conventional Risk Matrix is doubly problematic, in fact, because a mathematical matrix has its rows numbered downwards, starting at 1 at the top, whereas the Risk Matrix typically has its vertical axis (severity or likelihood, depending on how you arrange it) going up as you go … well, ‘up’. So a Risk Matrix is upside down compared to a mathematical matrix. I will use mathematical matrices from now on.

To make a more meaningful Risk Matrix we might, for example, count the number of events that fall into a given impact cell of the Risk Matrix. Note here that an ‘event’ can be anything that we decide it to be: breach of confidentiality on a customer database, physical destruction of a disk drive, stealing of invoice details to use in a finance scam – anything that matters to us can be an ‘event’ – and we will come back to this later. But for now, let’s start with something readily identifiable, well documented, and easily seen as relevant to cyber security: cyber-attacks. An attack is something an adversary does to exploit a weakness in cyber security.

A useful resource, for instance, is CAPEC™ (Common Attack Pattern Enumeration and Classification), which is a table of cyber ‘Attack Patterns’: a sort of dictionary of common ways in which cyber resources can be attacked. Here it is imported into Octave as a matrix:



Figure 5: CAPEC Common Attack Pattern Enumeration and Classification

There are currently 546 lines of attack – each is held in one row of the matrix. Many of the entries are text, and only some of them use standard codes that we can look up and interpret automatically, but at least some are susceptible to automated analysis.

CAPEC lists against each attack its likelihood and severity:

'ID | Name | Likelihood Of Attack | Typical Severity
1 | Accessing Functionality Not Properly Constrained by ACLs | High | High
10 | Buffer Overflow via Environment Variables | High | High
100 | Overflow Buffers | High | Very High
101 | Server Side Include (SSI) Injection | High | High

Figure 6: CAPEC database with likelihood and severity

Unfortunately, CAPEC – as with many cyber security resources – uses words rather than numbers for things that ought to be numeric. So, our first step is to convert text strings to numeric values – trivial but surprisingly annoying. (Guys, please for the sake of my stress levels, provide me with numbers for things that can be numeric…).
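
The text-to-number conversion amounts to a lookup table. Here is a minimal sketch in plain Python (not Octave), using the scale from Figure 2 and the four rows shown in Figure 6. The function name, and the choice to return `None` for blank or unrecognised labels, are assumptions made for illustration:

```python
# Scale from Figure 2, keyed by lower-case label.
LEVELS = {"very low": 0.1, "low": 0.3, "medium": 0.5, "high": 0.7, "very high": 0.9}

def to_number(label):
    """Map a text rating to its numeric level; None if blank or unknown (assumed policy)."""
    return LEVELS.get(label.strip().lower())

# (ID, Likelihood Of Attack, Typical Severity) as listed in Figure 6.
rows = [
    (1, "High", "High"),
    (10, "High", "High"),
    (100, "High", "Very High"),
    (101, "High", "High"),
]

numeric = [(attack_id, to_number(lik), to_number(sev)) for attack_id, lik, sev in rows]
for row in numeric:
    print(row)
```

Normalising case and whitespace before the lookup keeps small inconsistencies in the text from silently producing missing values.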

Then, we can make a matrix – call it an Impact Matrix – with four columns: one for the attack ID, then one each for likelihood, severity and impact for each attack:


Figure 7: Impact Matrix
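
As a rough sketch of how such an Impact Matrix could be assembled (plain Python rather than Octave; the numeric values are the Figure 6 rows converted with the Figure 2 scale, and the variable names are illustrative):

```python
# (ID, likelihood, severity) for the four attacks shown in Figure 6,
# with High = 0.7 and Very High = 0.9 as in Figure 2.
attacks = [
    (1, 0.7, 0.7),
    (10, 0.7, 0.7),
    (100, 0.7, 0.9),
    (101, 0.7, 0.7),
]

# One row per attack: [ID, likelihood, severity, impact], impact = likelihood * severity.
impact_matrix = [
    [attack_id, lik, sev, round(lik * sev, 2)]
    for attack_id, lik, sev in attacks
]

for row in impact_matrix:
    print(row)
```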

It’s now quite easy to do things like count the number of attacks that fall into each cell of the conventional Risk Matrix, or calculate the ‘average’ impact over all attacks:


Figure 8: Count and average impact Risk Matrices
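
A hedged sketch of the counting step, in plain Python rather than Octave, using only the four attacks from Figure 6. What ‘average’ means per cell is not spelled out above, so the second matrix here is one possible reading: each cell's contribution to the mean impact over all attacks. Rows are indexed by severity and columns by likelihood, both from very low (index 0) to very high (index 4):

```python
LEVELS = [0.1, 0.3, 0.5, 0.7, 0.9]
index = {value: i for i, value in enumerate(LEVELS)}

# [ID, likelihood, severity, impact] for the attacks listed in Figure 6.
impact_matrix = [
    [1, 0.7, 0.7, 0.49],
    [10, 0.7, 0.7, 0.49],
    [100, 0.7, 0.9, 0.63],
    [101, 0.7, 0.7, 0.49],
]

n = len(LEVELS)
count = [[0] * n for _ in range(n)]      # attacks per (severity, likelihood) cell
total = [[0.0] * n for _ in range(n)]    # summed impact per cell

for _attack_id, lik, sev, imp in impact_matrix:
    r, c = index[sev], index[lik]
    count[r][c] += 1
    total[r][c] += imp

# Assumed interpretation of "average": cell total divided by the number of attacks,
# so the cells add up to the overall mean impact.
n_attacks = len(impact_matrix)
mean_share = [[t / n_attacks for t in row] for row in total]
overall_mean = sum(row[3] for row in impact_matrix) / n_attacks

print(count)
print(overall_mean)
```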

And, since I am using a matrix language that offers many ways to visualize matrices, I can show each of these in different ways:


Figure 9: Risk matrices visualized in different ways

Here I have displayed both metrics in three ways: as a 3D scatter plot (these are most helpful if you rotate them, which helps you see where the scatter points fall in 3D space); as a surface; and as an image (which is close to how the conventional Risk Matrix is usually displayed).

You might gain some insights from these: for example, it seems that most CAPEC Attack Patterns fall into mid Likelihood and Severity, and so do their averages. One might expect that, though, as the greatest number of patterns are quite unlikely and not very severe, and may overly weight the average anyway. Another insight is that grading things as ‘low’, ‘medium’ and ‘high’ isn’t all that helpful in doing actual arithmetic, because the visualizations are very granular. But let’s see what else might be useful from the CAPEC Matrix.

Attacks, though interesting, don’t tell much of the cyber security story: they are ways in which an adversary might attempt to do something that might cause harm, but they don’t say what that harm might be. But CAPEC, in common with other cyber security databases, does list the Consequences of an attack. Consequences are in some sense things that matter – categories of harm – so if we are more interested in the risks of certain kinds of harm than in details of specific attacks, they offer a helpful categorisation. CAPEC Consequences fall into six categories:

Consequences

::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::

::SCOPE:AvailabilityTECHNICAL IMPACT:Unreliable Execution::SCOPE:Confidentiality:SCOPE:Integrity:SCOPE:AvailabilityTECHNICAL IMPACT:Execute Unauthorized Commands:NOTE:Confidentiality Integrity Availability Execute Unauthorized Commands Run Arbitrary Code::SCOPE:ConfidentialityTECHNICAL IMPACT:Read Data::SCOPE:IntegrityTECHNICAL IMPACT:Modify Data::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::

::SCOPE:AvailabilityTECHNICAL IMPACT:Unreliable Execution::SCOPE:Confidentiality:SCOPE:Integrity:SCOPE:AvailabilityTECHNICAL IMPACT:Execute Unauthorized Commands:NOTE:Confidentiality Integrity Availability Execute Unauthorized Commands Run Arbitrary Code::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::

::SCOPE:ConfidentialityTECHNICAL IMPACT:Read Data::SCOPE:Confidentiality:SCOPE:Integrity:SCOPE:AvailabilityTECHNICAL IMPACT:Execute Unauthorized Commands:NOTE:Confidentiality Integrity Availability Execute Unauthorized Commands Run Arbitrary Code::

::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::SCOPE:IntegrityTECHNICAL IMPACT:Modify Data::SCOPE:ConfidentialityTECHNICAL IMPACT:Read Data::SCOPE:AvailabilityTECHNICAL IMPACT:Unreliable Execution::

::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::SCOPE:IntegrityTECHNICAL IMPACT:Modify Data::SCOPE:ConfidentialityTECHNICAL IMPACT:Read Data::SCOPE:AvailabilityTECHNICAL IMPACT:Unreliable Execution::

::SCOPE:IntegrityTECHNICAL IMPACT:Modify Data::SCOPE:ConfidentialityTECHNICAL IMPACT:Read Data::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:AuthorizationTECHNICAL IMPACT:Gain Privileges::SCOPE:Confidentiality:SCOPE:Integrity:SCOPE:AvailabilityTECHNICAL IMPACT:Execute Unauthorized Commands:NOTE:Confidentiality Integrity Availability Execute Unauthorized Commands Run Arbitrary Code::

Figure 10: CAPEC Consequences

Sadly (and this is becoming a depressingly familiar pattern) this is all encoded as text strings – actually XML file format – but we can place them into six categories (the first three of which match the traditional ‘CIA triad’):

1. Confidentiality
2. Integrity
3. Availability
4. Access Control
5. Authorization
6. Authentication

and from them we can form a Consequences Matrix with six columns (one for each of the Consequences) and 546 rows (one for each CAPEC attack):


Figure 11: CAPEC Consequences Matrix

Here I have weighted each Consequence with the attack’s impact.
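
One row of such a Consequences Matrix could be built along these lines. This is a sketch in plain Python (not Octave) that simply looks for each `SCOPE:` name in the Consequences text as it appears in Figure 10. The function name is illustrative, and pairing this particular string with an impact of 0.49 is a hypothetical example, not a value taken from CAPEC:

```python
# The six categories, in column order.
CATEGORIES = [
    "Confidentiality", "Integrity", "Availability",
    "Access Control", "Authorization", "Authentication",
]

def consequence_row(text, impact):
    """Return one six-column row: the attack's impact where the scope is named, else 0."""
    return [impact if "SCOPE:" + name in text else 0.0 for name in CATEGORIES]

# First Consequences string from Figure 10, copied as displayed there.
text = ("::SCOPE:Confidentiality:SCOPE:Access Control:SCOPE:Authorization"
        "TECHNICAL IMPACT:Gain Privileges::")

row = consequence_row(text, 0.49)   # 0.49 is a hypothetical impact for illustration
print(row)
```

A substring search avoids depending on the exact delimiters in the text, at the cost of ignoring the TECHNICAL IMPACT and NOTE fields entirely.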

This is a useful matrix: it shows, for each attack, the impact on each of the six CAPEC Consequences. If we added up all the impacts in each column we would have an indication, for each Consequence, of the overall impact of all CAPEC cyber attacks, weighted for likelihood and severity:


Figure 12: Impact of all CAPEC attacks for each CAPEC Consequence
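
Adding up each column is a one-line reduction. A minimal sketch in plain Python (not Octave), with three made-up rows standing in for the 546-row Consequences Matrix (the values are hypothetical, chosen only to show the arithmetic):

```python
# Hypothetical impact-weighted rows; columns are Confidentiality, Integrity,
# Availability, Access Control, Authorization, Authentication.
consequences = [
    [0.49, 0.0, 0.0, 0.49, 0.49, 0.0],
    [0.63, 0.63, 0.63, 0.0, 0.0, 0.0],
    [0.49, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# zip(*matrix) walks the matrix column by column; rounding hides float noise.
totals = [round(sum(column), 2) for column in zip(*consequences)]
print(totals)
```

In Octave, `sum(C)` applied to a matrix `C` gives the same column totals.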

This isn’t all that useful, though: it isn’t selective of attacks, it just assumes all can happen. Not an unreasonable assumption, actually, but the CAPEC database is flawed – as are most cyber security listings – because it lists EVERYTHING that might happen, even though in most circumstances most attacks won’t. If the likelihood and severity were reliable (I don’t know whether they are or not) and not so crude, then this might be a useful step, but I don’t think it is. And this plot just shows the impact of everything without saying on what: Confidentiality (column 1) is the greatest risk, but confidentiality of what – the customer database, the bank account details, private messages?

Let’s think about impacts, not just overall but selectively. Ideally, I would choose some value or asset that mattered, and say which attacks might harm that. I could do that by saying how likely it is that each attack might harm that particular value or asset: a ‘filter’ that selects attacks and so groups their impacts. But I’m working with available data, not making it up myself, so I will go with something CAPEC already offers: Mechanisms Of Attack.

CAPEC defines ‘Mechanisms Of Attack’ – for example “Engage in Deceptive Interactions”, “Subvert Access Control” and “Inject Unexpected Items” are Mechanisms – by specifying groups of attacks that fall into that category. Each Mechanism Of Attack is a filter, selecting attacks. There are nine Mechanisms Of Attack.

Note that I could generate a Risk Matrix for each Mechanism Of Attack, and that would be a useful thing to do, but I want to do something a bit more clever.

We can define a new matrix with nine columns – one for each Mechanism Of Attack – and 546 rows, one for each attack. Like the Consequences Matrix, this MOA Matrix lists, for each attack, how much risk it contributes to each MOA. For a Mechanism Of Attack the filter is either off or on – the attack is either in the MOA or not – but for other views we might allocate some weighting number.
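
A small sketch of such an on/off MOA Matrix, in plain Python rather than Octave. The three mechanism names are the ones quoted above, but which attack belongs to which mechanism here is entirely hypothetical, invented only to show the shape of the matrix:

```python
# Attack IDs from Figure 6; one matrix row per attack.
attack_ids = [1, 10, 100, 101]

# HYPOTHETICAL membership: mechanism name -> set of attack IDs it groups.
# The real groupings come from CAPEC and are not reproduced here.
mechanisms = {
    "Engage in Deceptive Interactions": set(),
    "Subvert Access Control": {1},
    "Inject Unexpected Items": {10, 100, 101},
}

# 1 if the attack is in the mechanism (filter on), 0 otherwise (filter off).
moa_matrix = [
    [1 if attack_id in members else 0 for members in mechanisms.values()]
    for attack_id in attack_ids
]

for attack_id, row in zip(attack_ids, moa_matrix):
    print(attack_id, row)
```

Replacing the 1s with fractional weights would give the weighted kind of filter mentioned for other views.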

Now we have two matrices: the Consequences Matrix, of 546 rows and 6 columns; and the MOA Matrix, of 546 rows and 9 columns. To work out the total impact for each Consequence for each MOA I need to run down the respective columns, multiplying the pairs of elements from each matrix together, then add them all up to form an overall impact. Handily, this is exactly what mathematical Matrix Multiplication does (strictly, it is the transpose of the MOA Matrix multiplied by the Consequences Matrix, which statisticians call a cross-product matrix, not the vector cross product), creating a new matrix of 9 rows (Mechanisms Of Attack) and 6 columns (Consequences):


Figure 13: Consequences of Mechanisms Of Attack
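
The multiplication can be sketched on a toy scale in plain Python (not Octave). Here three attacks and two mechanisms stand in for the 546 attacks and nine mechanisms, all the numbers are hypothetical, and the helper names are illustrative:

```python
def transpose(m):
    """Swap rows and columns."""
    return [list(column) for column in zip(*m)]

def matmul(a, b):
    """Plain matrix product: result[i][j] = sum over k of a[i][k] * b[k][j]."""
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*b)] for row in a]

# HYPOTHETICAL MOA Matrix: 3 attacks (rows) x 2 mechanisms (columns), on/off filter.
moa = [
    [1, 0],
    [1, 1],
    [0, 1],
]

# HYPOTHETICAL Consequences Matrix: 3 attacks (rows) x 6 Consequences (columns).
consequences = [
    [0.49, 0.0, 0.0, 0.49, 0.49, 0.0],
    [0.63, 0.63, 0.63, 0.0, 0.0, 0.0],
    [0.49, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# (2 x 3) times (3 x 6) gives 2 x 6: one row per mechanism, one column per Consequence.
risk = [[round(v, 2) for v in row] for row in matmul(transpose(moa), consequences)]

for row in risk:
    print(row)
```

In Octave the whole calculation is `MOA' * C`, which for the full data gives the 9-by-6 result.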

This shows, for each intersection, the impact (risk) to that Consequence due to that Mechanism Of Attack. Which is, in fact, a Risk Matrix. And we can visualize it nicely too:



Figure 14: Risk Matrix - Consequences for each Mechanism Of Attack

and I think that is rather neat.
