By Wolfgang Keller

Originally written 2020-09-05

Last modified 2024-01-14

What we want to find is an algorithm for \begin{align*} \mathit{exp}: \left(M \times \prod_{n \in \mathbb{Z}_{\geq 0}} \{0,1\}^{\{0, \ldots, n-1\}}\right) & \rightarrow M: \\ a\ n\ f & \mapsto a^{\sum_{i=0}^{n-1} f_i \cdot 2^i}. \end{align*} where \(M\) is a (multiplicatively written) monoid.

For \(k \in \{n, \ldots, 0\}\), set \[ \texttt{R0}_k := a^{\sum_{i=k}^{n-1} f_i \cdot 2^{i-k}}. \] We have for \(k \in \{n-1, \ldots, 0\}\): \[ \texttt{R0}_k = \begin{cases} \texttt{R0}_{k+1} \cdot \texttt{R0}_{k+1} & \text{if } f_k = 0, \\ \texttt{R0}_{k+1} \cdot \texttt{R0}_{k+1} \cdot a & \text{if } f_k = 1, \end{cases} \]

This yields the following algorithm:

In particular for cryptographic applications, often desires that independently of the execution path that the code takes, the exact number of executed monoid multiplications is the same. The idea behind this is that for many cryptographical applications, the exponent \(e\) must be kept secret (with, for example. only \(\deg e\) known to the public).

One method for this, which we present in this section, is called a **Montgomery ladder**.

Using a Montgomery ladder does, of course, **not** suffice to ensure that the code is secure
against timing attacks (even if the monoid operations are implemented in a way
that they are executed in constant time).
Instead, if the operands of the monoid are stored in memory (because they do not fit into
registers), cache timing information can leak which operands were used
and thus which code branch was taken.

Because of this, in most cases a better implementation option for code that is secure against
timing attacks is to execute *both* possible branches (current bit is 0 vs
1) and use a (constant-time) conditional move afterwards to
select the “correct” branch that the code “actually took”.

In the sense of the previous warning, we describe the Montgomery ladder for the mathematical ideas behind it.

The first step is to observe that in the loop of Algorithm ,
`R0`

is set to

- either
`R0 * R0`

- or
`(R0 * R0) * R1 = R0 * (R0 * R1)`

, which is, since`R1 = a`

for the whole code, equal to`(R0 * R0) * a = R0 * (R0 * a)`

.

So, we change the code such that the value of `R1`

is rather `R0 * a`

before and after each iteration of the loop, i.e. for \(k \in \{n, \ldots, 0\}\), set
\(\texttt{R1}_k := a^{\sum_{i=k}^{n-1} f_i \cdot 2^{i-k} + 1}\),
i.e. altogether for \(k \in \{n, \ldots, 0\}\), set
\begin{align*}
\texttt{R0}_k & := a^{\sum_{i=k}^{n-1} f_i \cdot 2^{i-k}}, \\
\texttt{R1}_k & := a^{\sum_{i=k}^{n-1} f_i \cdot 2^{i-k} + 1}.
\end{align*}

Consider that for \(k \in \{n-1, \ldots, 0\}\), we have:

- If \(\texttt{f[k]}=0\), we have \begin{align*} \texttt{R1}_k & = \texttt{R0}_{k+1} \cdot \texttt{R1}_{k+1}, \\ \texttt{R0}_k & = \texttt{R0}_{k+1} \cdot \texttt{R0}_{k+1}. \end{align*}
- If \(\texttt{f[k]}=1\), we have \begin{align*} \texttt{R0}_k & = \texttt{R0}_{k+1} \cdot \texttt{R1}_{k+1}, \\ \texttt{R1}_k & = \texttt{R1}_{k+1} \cdot \texttt{R1}_{k+1}. \end{align*}

This yields the following algorithm, the Montgomery ladder:

For \(k \in \{-1, \ldots, n-1\}\), set \begin{align*} \texttt{R0}_k & := a^{2^{k+1}}, \\ \texttt{R1}_k & := a^{\sum_{i=0}^k f_i \cdot 2^i}. \end{align*} We have for \(k \in \{0, \ldots, n-1\}\): \begin{align*} \texttt{R0}_k & = \texttt{R0}_{k-1} \cdot \texttt{R0}_{k-1}, \\ \texttt{R1}_k & = \begin{cases} \texttt{R1}_{k-1} & \text{if } f_k = 0, \\ \texttt{R0}_{k-1} \cdot \texttt{R1}_{k-1} & \text{if } f_k = 1. \end{cases} \end{align*}

This yields the following algorithm:

Does there also exist a Montgomory ladder variant of Algorithm ? There does.

Instead of having \(\texttt{R0}_k = a^{2^{k+1}}\) before the loop iteration \(k-1\), we set \(\texttt{R0}_k := a^{2^{k+1}-\sum_{i=0}^k f_i \cdot 2^i}\), i.e. altogether for \(k \in \{-1, \ldots, n-1\}\), set \begin{align*} \texttt{R0}_k & := a^{2^{k+1}-\sum_{i=0}^k f_i \cdot 2^i}, \\ \texttt{R1}_k & := a^{\sum_{i=0}^k f_i \cdot 2^i}. \end{align*}

Consider that for \(k \in \{0, \ldots. n-1\}\), we have:

- If \(\texttt{f[k]}=0\), we have \begin{align*} \texttt{R1}_k & = \texttt{R1}_{k-1}, \\ \texttt{R0}_k & = \texttt{R0}_{k-1} \cdot \texttt{R0}_{k-1} \cdot \texttt{R1}_{k-1}. \end{align*}
- If \(\texttt{f[k]}=1\), we have \begin{align*} \texttt{R0}_k & = \texttt{R0}_{k-1}, \\ \texttt{R1}_k & = \texttt{R0}_{k-1} \cdot \texttt{R1}_{k-1} \cdot \texttt{R1}_{k-1}. \end{align*}

This yields the following algorithm, a dual to the Montgomery ladder:

*We make it clear that to our knowledge this dual variant of the Montgomory ladder has never been published in the literature before, it is thus a novel result.*

Let \(R\) be a commutative ring, let \(M\) be an \(R\)-module, let \(k \in \mathbb{Z}_{\geq 1}\), and let \(f_{-(k-1)}, \ldots, f_0 \in M\) (start values) and \(a_{-k}, \ldots, a_{-1} \in R\) (coefficients) be given. Consider the sequence \begin{align*} f: \mathbb{Z}_{\geq -(k-1)} & \rightarrow M: \\ n & \mapsto \begin{cases} f_n & \text{if } n \in \{-(k-1), \ldots, 0\}, \\ \sum\limits_{i=-k}^{-1} a_i f_{n+i} & \text{otherwise.} \end{cases} \end{align*} Of course, we can use an iterative algorithm to compute the \(n\)th element of the sequence. But we want to use one of the exponentiation by squaring algorithms from section to obtain a faster algorithm.

Let \begin{align*} A & := \left( \begin{array}{ccccc} a_{-1} & a_{-2} & \cdots & a_{-(k-1)} & a_{-k} \\ 1 & & & & \\ & 1 & & & \\ & & \ddots & & \\ & & & 1 & \end{array} \right), \\ f_{start} & := \left( \begin{array}{c} f_0 \\ f_{-1} \\ \vdots \\ f_{-(k-1)} \end{array} \right). \end{align*} The trick for the algorithm is to consider that for \(n \in \mathbb{Z}_{\geq 0}\), we have \[ f_n = \left( A^n \cdot f_{start} \right)_0. \]

Matrices over a commutative ring form a monoid with respect to multiplication. Thus, we can use one of the exponentiation by squaring algorithms from section to compute \(A^n\).