# The strange non-transitivity of random variables

In probability theory and stochastics there is a somewhat paradoxical phenomenon called the non-transitivity of random variables (stochastic variables). The following can also be formulated entirely without measure-theoretic notions. Here, however, I want to indicate the perspective of measure theory, as I think it adds some flexibility and helps in working with measure-theoretic stochastics.

Let’s first introduce the notions involved.

Def.: (Binary relation)

Let $$M$$ be any set.

Then a binary relation $$R$$ on $$M$$ is a subset $$R \subseteq M \times M$$.

For an element $$(x,y) \in R$$ of a binary relation $$R$$, we usually write $$xRy$$.

Examples of binary relations are the order relations $$<, \leq, >, \geq$$ on $$\mathbb{R}$$, or unary operations (as functions $$M \to M$$, these are subsets of $$M \times M$$). Binary operations like $$+, -, \cdot, /$$ on $$\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}, \mathbb{R}^n$$, etc. (where defined), are examples of ternary relations, which are defined accordingly as subsets $$R \subseteq M \times M \times M$$.

Def.: (Transitive and non-transitive binary relation)

A binary relation $$R$$ is called transitive

$$:\Longleftrightarrow \forall x,y,z \in M: xRy \wedge yRz \Rightarrow xRz.$$

Otherwise, we call the relation non-transitive.
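Since a binary relation on a finite set is just a set of pairs, transitivity can be checked mechanically. The following is a minimal Python sketch (the function name `is_transitive` and the toy relations are my own choices, not from the text): it tests the defining condition for all triples.

```python
from itertools import product

def is_transitive(R, M):
    """Return True iff the binary relation R (a set of pairs) on M is transitive."""
    return all(
        (x, z) in R
        for x, y, z in product(M, repeat=3)
        if (x, y) in R and (y, z) in R
    )

M = {1, 2, 3}
less_than = {(x, y) for x, y in product(M, repeat=2) if x < y}
print(is_transitive(less_than, M))   # True: "<" is transitive

cycle = {(1, 2), (2, 3), (3, 1)}     # a directed 3-cycle
print(is_transitive(cycle, M))       # False: 1R2 and 2R3, but not 1R3
```

The 3-cycle is the shape of non-transitivity we will meet again below with the three dice.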

Def.: (Probability space)

A probability space is a triple $$(\Omega, \mathcal{F}, \mathbb{P})$$, where

$$\Omega$$ is any set,

$$\mathcal{F}$$ is a $$\sigma$$-algebra over $$\Omega$$,

$$\mathbb{P}$$ is a probability measure on $$\mathcal{F}$$.

This is the usual measure-theoretic definition of a probability space, which means that it is a measure space with $$\mathbb{P}(\Omega) = 1$$. The set $$\Omega$$ is the collection of all possible outcomes (samples), the result space of the random experiment. The $$\sigma$$-algebra $$\mathcal{F}$$ denotes the event space of the random experiment. In the case that $$\Omega$$ is a finite set, the event space is usually taken to be the power set $$\mathcal{P}(\Omega)$$ of the sample space, i.e., the set of all subsets. The probability measure $$\mathbb{P}$$ assigns to each event $$F \in \mathcal{F}$$ its probability $$\mathbb{P}(F)$$.

A simple example is the context of throwing a die, which we will also use below to explain the paradox. The set of outcomes (the sample space) in such a random experiment is then given by $$\Omega := \{1,2,3,4,5,6\}$$. The set of events contains all subsets (since $$\Omega$$ is a finite set) and represents events like “the number thrown is odd” ($$F = \{1,3,5\} \in \mathcal{F}$$), “the number $$6$$ has been thrown” ($$F = \{6\} \in \mathcal{F}$$) or “a number greater than $$4$$ has been thrown” ($$F = \{5,6\} \in \mathcal{F}$$).
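This finite probability space is small enough to write down directly. A minimal sketch in Python (the name `P` for the uniform measure is my own choice) computes the probabilities of the three events just mentioned exactly:

```python
from fractions import Fraction

# A fair six-sided die as a finite probability space: Omega is the sample
# space, every subset is an event, and P is the uniform probability measure.
Omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Uniform probability measure: |F| / |Omega| for an event F subset of Omega."""
    return Fraction(len(event), len(Omega))

print(P({1, 3, 5}))   # 1/2   "the number thrown is odd"
print(P({6}))         # 1/6   "a 6 has been thrown"
print(P({5, 6}))      # 1/3   "a number greater than 4 has been thrown"
```

Using `Fraction` instead of floats keeps the probabilities exact, which will also be convenient for the dice example below.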

Def.: (Random variable (stochastic variable))

A random variable or stochastic variable is a measurable function

$$X : (\Omega, \mathcal{F}, \mathbb{P}) \to \mathbb{R}$$.

More generally, a random variable is an $$\mathcal{F}_1$$-$$\mathcal{F}_2$$-measurable function from a probability space $$(\Omega_1,\mathcal{F}_1, \mathbb{P})$$ into a measurable space $$(\Omega_2, \mathcal{F}_2)$$. But we do not need to go into the details of random variables and measurable functions. An example of a random variable is given at the end of the following typical example, which exhibits the paradox:

Example: (This is a reduced variant of the so-called Efron’s dice.)

We consider the “stochastic experiment” of throwing three fair dice with somewhat unusual numbers marked on them. The first die $$D_1$$ carries the numbers $$\{2,2,4,4,9,9\}$$, the second die $$D_2$$ the numbers $$\{1,1,6,6,8,8\}$$, and the third die $$D_3$$ carries $$\{3,3,5,5,7,7\}$$.

Hence, the space of our elementary events is

$$\Omega := \{(\omega_1, \omega_2, \omega_3) \mid \omega_i \in \Omega_i, i \in \mathbb{N}_3\}$$,

where $$\Omega_1 := \{2,4,9\}, \Omega_2 := \{1,6,8\}, \Omega_3 := \{3,5,7\}$$, $$\mathbb{N}_3 := \{1,2,3\}$$, and $$\forall i \in \mathbb{N}_3, \forall \omega \in \Omega_i:$$

$$\mathbb{P}(\{\omega\}) = \frac{1}{3}$$.

We want to study the probability of the event that the number shown by die $$D_i$$ is greater than the number shown by $$D_j$$, for $$i<j$$ or $$i=3 \wedge j = 1$$. It turns out that we have the following probabilities (easy to check!):

$$\mathbb{P}(\omega_1 > \omega_2) = \frac{5}{9}$$,

$$\mathbb{P}(\omega_2 > \omega_3) = \frac{5}{9}$$,

$$\mathbb{P}(\omega_3 > \omega_1) = \frac{5}{9}$$.

That is, the probabilities not only for the first two relations, but also for the third relation, are equal and greater than $$\frac{1}{2}$$. So, “in the long run”, in more than half of the cases and even with the same probability,

• the dots of the first die will be higher than those of the second one,
• the dots of the second die will be higher than those of the third one,
• and, surprisingly, the dots of the third die will also be higher than those of the first one.
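The three probabilities above are indeed easy to check: each face of a die occurs with probability $$\frac{1}{3}$$, so it suffices to count winning pairs among the $$3 \times 3$$ combinations of faces. A minimal Python sketch (the helper name `p_beats` is my own choice):

```python
from fractions import Fraction
from itertools import product

# The three distinct face values of each die, each with probability 1/3.
D1, D2, D3 = (2, 4, 9), (1, 6, 8), (3, 5, 7)

def p_beats(A, B):
    """Probability that a roll of die A shows a higher number than a roll of die B."""
    wins = sum(1 for a, b in product(A, B) if a > b)
    return Fraction(wins, len(A) * len(B))

print(p_beats(D1, D2))   # 5/9
print(p_beats(D2, D3))   # 5/9
print(p_beats(D3, D1))   # 5/9
```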

This is exactly the non-transitivity of random variables and is actually quite counterintuitive, hence “paradoxical”. And to make it appear even more paradoxical, note also that the expected values of all three dice are identical: $$E[D_1] = E[D_2] = E[D_3] = 5$$.
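The identical expected values can also be confirmed in one line per die, since each of the three distinct face values occurs with probability $$\frac{1}{3}$$ (the helper name `expected_value` is my own choice):

```python
from fractions import Fraction

def expected_value(faces):
    """Expected value of a fair die, given its distinct face values."""
    return sum(Fraction(v, len(faces)) for v in faces)

for faces in [(2, 4, 9), (1, 6, 8), (3, 5, 7)]:
    print(expected_value(faces))   # 5 for every die
```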

The random variable $$X : (\Omega, \mathcal{P}(\Omega),\mathbb{P}) \to \mathbb{R}$$ involved in this random experiment could, for instance, be the function (well defined, since the nine face values are pairwise distinct, so every outcome yields a strict ordering)

$$X(\omega_1, \omega_2, \omega_3) := \left\{\begin{array}{rl} 123, & \quad \omega_1 > \omega_2 > \omega_3 \\ 132, & \quad \omega_1 > \omega_3 > \omega_2 \\ 213, & \quad \omega_2 > \omega_1 > \omega_3 \\ 231, & \quad \omega_2 > \omega_3 > \omega_1 \\ 312, & \quad \omega_3 > \omega_1 > \omega_2 \\ 321, & \quad \omega_3 > \omega_2 > \omega_1. \\ \end{array} \right.$$
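One way to tabulate this $$X$$ is to sort the three dice by the value shown, largest first, and read off their indices; enumerating all $$27$$ equally likely outcomes then gives the distribution of $$X$$ (the implementation below is a sketch of my own, not from the text):

```python
from collections import Counter
from itertools import product

def X(w1, w2, w3):
    """Return the ordering code, e.g. 312 when w3 > w1 > w2."""
    # Sort (value, die index) pairs by value, largest value first; the face
    # values of the three dice are pairwise distinct, so there are no ties.
    ranking = sorted([(w1, 1), (w2, 2), (w3, 3)], reverse=True)
    return int("".join(str(i) for _, i in ranking))

# All 27 equally likely outcomes (omega_1, omega_2, omega_3).
outcomes = product((2, 4, 9), (1, 6, 8), (3, 5, 7))
dist = Counter(X(*w) for w in outcomes)
print(sorted(dist.items()))
```

Dividing each count by $$27$$ gives the distribution of $$X$$; all six orderings occur with positive probability.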

$$\bigcirc$$

This fact, that the relation “wins with probability greater than $$\frac{1}{2}$$” between random variables need not be transitive, must be taken into account in all areas of application of stochastics.