The University of New Haven Center for Cybercrime and Forensic Computer Investigation and The University of Southern California Department of Mathematics
By "All the evidence is in the computers, but I have hired the best computer science talent in the country and it is impossible for you to get it," said the organized crime boss. He was making about US $1.2 million a month net profit before taxes--and no evidence was found that he was paying any taxes. He had indeed secured highly qualified encryption professionals, and expended large amounts on information security. After his arrest, the task of evidence recovery fell to Los Angeles (California, USA) police department reserve officer Ross Mayfield, an information systems adjunct professor and member of an elite group of detectives in the organized crime vice division.
Despite the proclamations that this information system was impossible to penetrate, the evidence was recovered and the crime boss was caught. Mayfield had already started to suspect that making an information system completely impenetrable would require an infinite amount of money. If an entity such as the US National Security Agency wants to keep everyone out of its information systems, it builds an underground structure on a military base, surrounds it with garrison troops, watches its workers closely and understands that it still has some level of risk. Keeping everyone out of an information system renders the cost of that system extremely high.
Prior to working on the crime boss's case, Mayfield had been involved in a project that mirrored this increasing cost conundrum. It was a pilot Internet project to get all the students and faculty of a university onto the university's system. It was easy and inexpensive to get the first, small portion of this population on the Internet. This group had technical expertise and for the most part was already using the Internet elsewhere. The next group of users required more equipment purchases and training. As more and more groups of users were targeted, it became increasingly difficult and more expensive to add users to the information system. This was particularly true for physically and mentally handicapped users. Ultimately, some users, despite the significant investment of funds and expertise, were never able to get onto the information system by themselves.
From the university project and the crime boss's case, the evidence was building for a paradox: keeping everyone out of an information system requires an infinite amount of money and getting everyone onto an information system requires an infinite amount of money, but the costs between these extremes are relatively low.
Mayfield, who first observed this paradox in 1991, began to lecture to audiences about it in 1994 after joining the Los Angeles police department. He believed that only some subset of humanity could be denied access to a system or successfully gain access to a system. This theory was depicted as a U curve, where the cost of a system is on the vertical axis and the percent of humanity that can access the system is on the horizontal axis.
Acceptance of this paradox by the information security community was immediate because it was consistent with common sense and the professional experiences of this group. Mayfield's Paradox points out that, at some point of the curve, additional security becomes unrealistically expensive. Conversely, these professionals knew that the Internet start-up companies that were claiming they would get all of humanity onto their systems would need infinite money to do so. These professionals began using this model to rationalize budgetary decisions both for security expenditures and additional user capacity. Though Mayfield's Paradox was a simple concept model predicting real-world observations, it lacked a mathematical proof that would give it acceptance beyond this small community of information security professionals.
The authors of this article first began discussing Mayfield's Paradox at a soccer game where their sons were playing on the same team. Jaksa Cvitanic, a University of Southern California (USA) mathematics professor, immediately observed what the paradox lacked: a proof. In response, he envisioned two proofs:
First Argument: Let n be the number of individuals having access to at least one computer system. Denote by f(n) the associated cost, namely the cost of allowing/enabling exactly n individuals to have access to a computer system. Let N be the total possible number of individuals (think of N as very large, for example, the size of the world's population) to have access to the system. As described above, Mayfield's Paradox says that the function f(n) is a U-shaped curve. Both the cost of allowing a very small number of people to have access to a computer system and the cost of enabling a very large number of people to have access, are very high. In fact, they tend to infinity as n tends to zero or to N.
In order to argue why this is the case, it is necessary to model the empirical fact that it becomes increasingly difficult to block access to a computer system to a very large proportion of the population. For example, consider a situation in which exactly n individuals have access, where n is small. Assume that it is desired to block access to additional n/2 individuals, that is, the goal is to cut in half the number of people having access. This should be more difficult than blocking access to additional n/2 individuals when the total number having access is equal to 2n. For instance, assume that the following incremental cost equation is satisfied:
f(2n) - f(n) = f(n) - f(n/2)
This means that the cost of reducing the number of individuals with access from 2n to n is equal to the cost of reducing it from n to n/2. It is easily checked that the function
f(n) = - log(n) + k , for n small
satisfies the incremental equation, where k is some positive number. Indeed,
f(2n) - f(n) = - log(2n) + log(n) = - log(2) - log(n) + log(n) = - log(2)
f(n) - f(n/2) = - log(n) + log(n/2) = - log(n) + log(n) + log(1/2) = - log(2)
So, in this case, it is reasonable to model the cost function f(n) for small n as a negative logarithmic function, which starts at infinity at zero and, close to zero, is shaped as the left half of a smile.
Similarly, it is possible to use the same argument for n close to N to conclude that in that region one can model the cost as the function
f(n) = - log(N-n) + k, for n large
shaped as the right half of a smile.
The same function can be used for modeling the more general case in which the incremental cost equation is
f(bn) - f(n) = f(n) - f(n/b)
This is only one example, and it would be possible to get the same shape, but of different size and scale, with any function of the form
f(n) = - a log(n) + k
(for n small) for some positive numbers a and k.
Second Argument: Now consider the cost function as a function of the percentage p of individuals having access to at least one computer system
f(p) = f(n/N)
Again, start with n small, hence p small. Let a, b, h be positive numbers, with h being very small. Suppose that the marginal cost to decrease the percentage of individuals with access to a computer system from p+h to p is a times larger than to decrease it from bp+h to bp. In other words:
f(p+h) - f(p) = a ( f(bp+h) - f(bp) )
If this equation is divided by h, the previous equation is approximately equal to the following marginal cost equation:
f'(p) = a f'(bp)
where f' is the first derivative of f. If, for example, one considers the solutions of the type
f(p) = py
the marginal cost equation implies
y py-1 = a y (bp)y-1
from which one gets
0 = log(a) + (y-1) log(b)
y = 1 - log(a) / log(b)
For example, if a = b2, then y = -1, so that
f(p) = 1/p
for p small. This function is also infinite at zero, and looks like the left half of a smile. A similar argument would point out that for p large (close to 1, namely 100 percent), it is possible to model the cost as
f(p) = 1 / (1-p)
More generally, this argument suggests that one can use the functions of the form
f(p) = a py+ b
for small values of p and
f(p) = a (1-p)y + b
for large values of p, where y is a negative number.
Mayfield's Paradox is an empirically observed fact that the cost of providing/restricting access to a computer system is a U-shaped function (curve) of the percentage of the population having that access. In this article, two mathematical models and proofs for this phenomenon have been provided. The two models address the left and the right ends of the curve, namely the very small and the very large parts of the population.
If the aim is to model very precisely the cost in the middle range of the population, then the models could be improved. This would require data fitting to a particular case at hand. For example, a systems administrator installing a system for the first time has no incremental cost in his choice of configuration parameters that dramatically affect the percent of humanity that access the system--the middle portion of the curve. Most of humanity can be excluded by a "deny all" choice in configuration or most of humanity can be included by an "allow all" choice, making the middle section of the U curve flat as there is no change in cost between these choices.
Mayfield's Paradox observes that keeping everyone out of an information system requires an infinite amount of money and getting everyone onto an information system also requires an infinite amount of money, while costs between these extremes are relatively low. Two mathematical proofs have been presented to demonstrate why this fundamental concept in information security is so. The implications are that systems administrators and information professionals will never be able to keep everyone out of their systems and, paradoxically, they will never be able to get everyone onto their systems either. Understanding this paradox is useful to establish realistic expectations for information systems security and user capacity budgets.
Copyright 2000, Mayfield & Cvitanic