froglogic / Blog / Tip of the Week / What exactly is cyclomatic complexity?

# What exactly is cyclomatic complexity?

The cyclomatic complexity is a measurement of the code complexity proposed by Thomas J. McCabe which is often considered as a magic number which allows us to measure the complexity of a program. It is common to say that a function with a cyclomatic complexity higher than 10 is difficult to maintain due to an over-reliance on branches, switch/cases or loops. These functions are often a good candidate for a redesign.

How good is this metric? Would the number of lines of code give us the same result? If that were true, it would make more sense to the simpler metric since everybody intuitively understands it.

To answer this question, I will expose how to compute it and then analyze some properties.

## How to compute the cyclomatic complexity?

The formula of the cyclomatic complexity of a function is based on a graph representation of its code. Once this is produced, it is simply:

M = E – N + 2

E is the number of edges of the graph, N is the number of nodes and M is McCabe’s complexity.

To compute a graph representation of code, we can simply disassemble its assembly code and create a graph following the rules:

1. Create one node per instruction.
2. Connect a node to each node which is potentially the next instruction.
3. One exception: for a recursive call or function call, it is necessary to consider it as a normal sequential instruction. If we don’t do that, we would analyze the complexity of a function and all its called subroutines.

Let us have a try with a small sample:

``````#include <stdio.h>

int main( int argc, char *argv[] )
{
if ( argc > 1 )
{
fprintf( stderr,
"This program does not accept arguments!\n"
);
return 1;
}
else
{
fprintf( stdout, "Hello world!\n" );
return 0;
}
}``````

The assembly output is:

``````%struct._IO_FILE = type { i32, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, i8*, %struct._IO_marker*, %struct._IO_FILE*, i32, i32, i64, i16, i8, [1 x i8], i8*, i64, i8*, i8*, i8*, i8*, i64, i32, [20 x i8] }
%struct._IO_marker = type { %struct._IO_marker*, %struct._IO_FILE*, i32 }

@stderr = external global %struct._IO_FILE*, align 8
@.str = private unnamed_addr constant [41 x i8] c"This program does not accept arguments!\0A\00", align 1
@stdout = external global %struct._IO_FILE*, align 8
@.str.1 = private unnamed_addr constant [14 x i8] c"Hello world!\0A\00", align 1

; Function Attrs: nounwind uwtable
define i32 @main(i32 %argc, i8** %argv) #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i8**, align 8
store i32 0, i32* %1
store i32 %argc, i32* %2, align 4
store i8** %argv, i8*** %3, align 8
%4 = load i32, i32* %2, align 4
%5 = icmp sgt i32 %4, 1
br i1 %5, label %6, label %9

; <label>:6                                       ; preds = %0
%7 = load %struct._IO_FILE*, %struct._IO_FILE** @stderr, align 8
%8 = call i32 (%struct._IO_FILE*, i8*, ...) @fprintf(%struct._IO_FILE* %7, i8* getelementptr inbounds ([41 x i8], [41 x i8]* @.str, i32 0, i32 0))
store i32 1, i32* %1
br label %12

; <label>:9                                       ; preds = %0
%10 = load %struct._IO_FILE*, %struct._IO_FILE** @stdout, align 8
%11 = call i32 (%struct._IO_FILE*, i8*, ...) @fprintf(%struct._IO_FILE* %10, i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.1, i32 0, i32 0))
store i32 0, i32* %1
br label %12

; <label>:12                                      ; preds = %9, %6
%13 = load i32, i32* %1
ret i32 %13
}
``````

We can then create a graphical representation of the assembly instruction flow:

We have 19 nodes and 19 edges, McCabe’s complexity is then:

M = 19 – 19 + 2 = 2

## Properties

The McCabe’s formula ‘E – N + 2‘  in the first approach strange because adding or subtracting two measures (here number of nodes and number of edges) expects that they have the same units. That’s why a distance is never added to a duration in physic but in most of the cases multiplied or divided.

But, in the case of McCabe, this formula simply the number of independent paths that exist between the input of the function and its output. For our example, the cyclomatic complexity is 2: one for the path for the ‘if’ branch plus one for the path of the ‘else’ case.

This has also a number of implications:

1. McCabe is independent to the number of source lines of a function: it does not matter how many statements are present in the code but how many branches are into it.
2. The metric is completely independent of the code style. Reformatting the code does not affect it. This is a big difference to the common  line of code metric.
3. The metric has a linear grow with the function complexity. For example, adding a ‘if ( … ) … else …‘ statement somewhere in a function increments the metric by one. It does not matter if the selection statement is nested or at the beginning of the function and it does not grow exponentially.
4. McCabe is easy to interpret for a single function. But, its computation for a source file, a class or an application is difficult to interpret. The reasons for this is that the function call is not represented in the flow diagram. The McCabe metric is computed independently for each function and it is as if there are interactions between the functions present in the source code.
5. McCabe is not adapted to measure the complexity of the software architecture. Or, to say it differently, it does not work on a complex C++/C# design pattern which splits the complexity over several classes.
6. And of course, like for all metrics, McCabe’s measure is only of code complexity and not the complexity of the data structures.

## Analysis of some results

Computing the cyclomatic complexity on some samples allows us to better identify how source code affects the metric.
Here are some generic rules:

1. Adding an ‘else’ block to a ‘if’ statement does not affect its metric (see sample I1 and I2).
2. The complexity of a boolean expression has no impact on the code metric (see sample I2 and I3). McCabe can be compared to the decision coverage for which it is important to track if a complete boolean expression was true and false.
3. If a function is composed of two blocks, the cyclomatic complexity of the function is the sum  of the complexity of each of it. If we take I2 and add and ‘if’ statement (see sample I4), the complexity is incremented because a single ‘if’ statement has a complexity of 1.
4. It does not matter if some code is composed of two sequential or two nested parts. I4 and I5 illustrate this: cascading ‘if’ statements have the same effect on the metric as sequential if statements.
5. All loops are equivalent to an ‘if’ statement (see L1, L2, L3).
6. Adding a ‘case’ to a ‘switch’ statement increments the cyclomatic complexity. (see S1 and S2)

## Conclusion

McCabe’s cyclomatic complexity is a simple and intuitive metric that allows us achieve reliable results at the function level. It does not cover all kinds of software complexity but it allows us to clearly identify functions with a lot of loops, selection statements and switch/cases. That’s why we have also included it in Squish Coco v4.2 and above.

It seems in the example l3 error. CC value must be 3

No it is 2: a ‘for’ loop is exactly like a ‘while’ loop and has a complexity of 2:
for ( i= 0; i<10; i++ )
foo();

This is the same as:
i=0;
while ( i<10 )
{
foo();
i++;
}

Both have a complexity of 2 because there are only 2 branches:
1) one which exit the loop
2) one which continue the loop

I mean this example
void foo( int a, int b ) {
if ( a && b )
return ;
}

The If statement adds up with the binary operation (&&). And in total with the main branch give 3

McCabe does per definition not take care about the conditions in a an expression.
So the complexity of ‘if ( a )’ and ‘if ( a && b )’ is the same.

It is in fact the same as:
c= a && b;
if ( c )

Ok, thanks for you reply. You are right, if you are talking about the original work for Fortran, which did not have the pervasive “default assign” pattern.

Hi Sebastien,

Thank you for your article. Very informative.
I just wanted to know the “If-else ” has a cyclomatic complexity of 2 as in the exemple l1. This is given to the fact there are 2 possible paths.

Would not l2 has a cyclomatic complexity of 1? as the else block is omitted. Which implies that the code will only have one path instead? Please advise

A ‘if’ statement without an else should be considered as an ‘if-else’ with an empty statement for the ‘else’ block.
If you draw a flow graph you will see that the missing else create a specific path which is a simple shortcut to the end of the ‘if’ block.

Also, my next question is that what happens in the case below:

if ( a)

else if (b)

else

Will this implies a cyclomatic complexity of 3?

This will effectively have a complexity of 3.
It is the same as:
if (a)
A
else
{
if (b)
B
else
C
}

The complexity is the sum of the complexity of each block A, B and C. If they are composed of empty or sequential statements, then the complexity of each block is 1 and the overall complexity is 3.

Hi Sebastien,

A part from the points discussed above, what would be a typical code with cyclomatic complexity of 1 in general.

Every list of sequential statements have a complexity of 1.

For instance this: int x = 0; ?

This is a single statement, which is to be consider as a sequence of statements. So the complexity is 1

Thanks Sebastien,

The example below:

for (int i = 0; i < 10; i++)
{
if (i == 2 || i == 4 || i == 6)
{
i = i+2;
}
}
int x = 0;

Am I right to say the cyclomatic complexity will be 6?

Why should this be 6?