
Matrix Exponentiation for n’th Term of Linear Recurrence

Chancellor Ceti

June 2023

Introduction

A problem that often arises in competitive programming is the following: given an integer \(n\), initial values \(f_0,f_1,\ldots,f_{k-1}\), and coefficients \(c_1,\ldots,c_k\) that satisfy \[\sum_{i=1}^{k}c_if_{n-i}=f_n,\] compute \(f_n\) efficiently. A naive approach would start at \(f_k\) and use the recurrence relation to successively compute every \(f_i\) up to \(f_n\). This clearly runs in linear time, \(O(n)\), treating \(k\) as a constant. While that is an acceptable starting point, the calculation takes more and more time as \(n\) grows; we can do better. Rest assured, the solution I outline here runs in \(O(k^3\cdot \log_2n)\) time, logarithmic in \(n\), by making use of binary exponentiation. I’ll start by explaining why we need matrices here in Section 2, and then in Section 3, I’ll show how binary exponentiation makes this run in logarithmic time. Finally, I’ll show my personal Rust implementation and offer some closing thoughts on the whole process. With that out of the way, let’s start by talking about the matrix part of matrix exponentiation.
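For concreteness, here is a minimal sketch of that naive linear-time approach; the function name and the Fibonacci example are my own illustration, not part of the problem statement:

```rust
/// Naive O(n) evaluation: roll the recurrence forward one term at a time.
/// `coeffs` holds c_1..c_k and `init` holds f_0..f_{k-1}.
fn nth_term_naive(coeffs: &[i64], init: &[i64], n: usize) -> i64 {
    let k = coeffs.len();
    if n < k {
        return init[n];
    }
    let mut f = init.to_vec();
    for m in k..=n {
        // f_m = c_1*f_{m-1} + c_2*f_{m-2} + ... + c_k*f_{m-k}
        let mut next = 0;
        for i in 1..=k {
            next += coeffs[i - 1] * f[m - i];
        }
        f.push(next);
    }
    f[n]
}

fn main() {
    // Fibonacci: f_n = f_{n-1} + f_{n-2}, with f_0 = 0, f_1 = 1.
    println!("{}", nth_term_naive(&[1, 1], &[0, 1], 10)); // f_10 = 55
}
```

Each loop iteration computes one new term, so the work grows linearly with \(n\); this is exactly the cost we want to beat.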

Matrix Logic

Note: you need to know what a matrix is and how to multiply two matrices together in order to understand this section. If that sounds unfamiliar to you, you should probably brush up on it first. If you were awake in your linear algebra class, this should be a walk in the park.

We have \[\sum_{i=1}^{k}c_if_{n-i}=f_n\] and are given \(f_0,f_1,\ldots,f_{k-1}\). Let’s put the values of \(f_i\) that we have into a column vector \(F\) such that \[F=\begin{pmatrix}f_0\\f_1\\\vdots\\f_{k-1}\end{pmatrix}\] Now the question is: what can we do to \(F\) that would produce \[F'=\begin{pmatrix}f_1\\f_2\\\vdots\\f_k\end{pmatrix}\,?\] Here we introduce the matrix \[T=\begin{pmatrix}0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\\ c_k & c_{k-1} & c_{k-2} & \cdots & c_1 \end{pmatrix}\] An elementary calculation will confirm that \[T*F=F'=\begin{pmatrix}f_1\\f_2\\\vdots\\f_k\end{pmatrix}\] Clearly \(T\) bears some relevance to this problem. Upon some further experimentation, we see that, in general, \[T^n*F=F^{(n)}\] where \[F^{(n)}=\begin{pmatrix}f_n\\f_{n+1}\\\vdots\\f_{n+k-1}\end{pmatrix}\] and \(T^n\) denotes the multiplication of \(T\) with itself \(n\) times. The induction proof is below for those who enjoy rigor. If you don’t like strange matrix manipulation, you can skip ahead to Section 3.
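As a concrete sanity check (my own example, not part of the original argument): for Fibonacci we have \(k=2\) and \(c_1=c_2=1\), so \(T=\begin{pmatrix}0&1\\1&1\end{pmatrix}\), and one multiplication by \(T\) should shift the window from \((f_n,f_{n+1})\) to \((f_{n+1},f_{n+2})\):

```rust
/// Multiply the 2x2 Fibonacci companion matrix T = [[0,1],[1,1]]
/// by a column vector F = (f_n, f_{n+1}).
fn apply_t(f: [i64; 2]) -> [i64; 2] {
    // T*F = (f_{n+1}, f_n + f_{n+1}): the window shifts by one.
    [f[1], f[0] + f[1]]
}

fn main() {
    let mut f = [0, 1]; // (f_0, f_1) for Fibonacci
    for _ in 0..5 {
        f = apply_t(f); // each application is one multiplication by T
    }
    println!("{:?}", f); // (f_5, f_6) = [5, 8]
}
```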

Let’s start with the base case, \(n=1\). \[T*F=\begin{pmatrix}1*f_1\\1*f_2\\\vdots\\c_kf_0+c_{k-1}f_1+\cdots+c_1f_{k-1}\end{pmatrix}=\begin{pmatrix}f_1\\f_2\\\vdots\\f_k \end{pmatrix}\] due to the definition of matrix multiplication and the recurrence defining \(f_k\). So the base case holds: \(T^1*F=F'\). Now assume that \(T^n*F=F^{(n)}\) for some \(n\). Then \[T^{n+1}*F=T*(T^n*F)=\begin{pmatrix}0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\\ c_k & c_{k-1} & c_{k-2} & \cdots & c_1 \end{pmatrix}*\begin{pmatrix}f_n\\f_{n+1}\\\vdots\\f_{n+k-1}\end{pmatrix}=\begin{pmatrix}f_{n+1}\\f_{n+2}\\\vdots\\f_{n+k}\end{pmatrix}\] by the same definitions as before. This completes the induction, and \(T^n*F=F^{(n)}\) for all \(n\).

Great: we now know that \(f_n\) is just the first entry in the vector \(F^{(n)}=T^n*F\). Now we come to the question of how to compute this in logarithmic time; enter binary exponentiation!

How to raise matrices to powers FAST

Binary exponentiation, as the name suggests, utilizes the binary representation of the exponent you are raising a base to. Naive exponentiation, running in linear time, would utilize the fact that \[T^n=\prod_{i=1}^{n}T\] and multiply \(T\) by itself \(n\) times for \(O(n)\) multiplications (note: the empty product is the identity matrix \(I\)). This, however, is not optimal. The following characterization of \(T^n\) can be verified by simple arithmetic: \[T^n=\begin{cases} I &\text{if } n=0\\ (T^{n/2})^2 &\text{if } n>0 \text{ and } n \text{ even} \\ (T^{(n-1)/2})^2*T &\text{if } n>0 \text{ and } n \text{ odd} \end{cases}\] This suggests an algorithm of the form

function pow(t:matrix, n:int)->matrix
    if n==0
        return I
    res = pow(t,floor(n/2))
    if n%2==0
        return res*res
    else
        return res*res*t

where I is the identity matrix. We can do away with the messy recursion by writing the equivalent

function pow(t:matrix, n:int)->matrix
    res = I
    while n>0
        if n%2==1
            res = res*t
        t = t*t
        n/=2
    return res

since \(I*T=T\). This is \(O(\log_2n)\) time complexity, since the loop performs one iteration for each halving it takes to reduce \(n\) to \(0\), which is about \(\log_2 n\) iterations. So, armed with binary exponentiation, let’s see what a proper implementation of this looks like!
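The same iterative scheme works for ordinary integers, which makes it easy to test in isolation before moving on to matrices. Here is a small sketch of scalar binary exponentiation modulo \(m\); the function name and the choice of modulus are my own for illustration:

```rust
/// Compute base^exp mod m with O(log exp) multiplications.
/// Assumes m > 0 and m small enough that (m-1)^2 fits in u64.
fn mod_pow(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut res = 1 % m; // the multiplicative identity, reduced in case m == 1
    base %= m;
    while exp > 0 {
        if exp % 2 == 1 {
            res = res * base % m; // fold in the current power of the base
        }
        base = base * base % m; // square: base^(2^i) -> base^(2^(i+1))
        exp /= 2;
    }
    res
}

fn main() {
    println!("{}", mod_pow(3, 13, 1_000_000_007)); // 3^13 = 1594323
}
```

Swapping the scalar multiplications for matrix multiplications (and `1` for the identity matrix \(I\)) gives exactly the matrix version above.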

Code and Closing Thoughts

A decent implementation of the algorithm described above is given below, written in Rust. The modulo parameter in my functions is a trivial extension of the original concept, designed to meet common problem constraints. All in all, this runs in \(O(k^3*\log_2n)\) time, because matrix multiplication is not a constant-time operation: there are three nested loops, each with time dependent on \(k\), in the matrix_mult function. This can definitely be optimized further, but for almost all purposes, this will be perfectly fine performance. If anyone finds a mistake in any of this or a cool optimization, please contact me on Discord at floofydoggo or email me at chancellorceti@gmail.com. And now, finally, the long-awaited code!

fn matrix_mult(
    a: Vec<Vec<i128>>,
    b: Vec<Vec<i128>>,
    modulo: i128,
) -> Vec<Vec<i128>> {
    // res is a.len() x b[0].len(); the shared dimension is b.len() (= a[0].len()).
    let mut res: Vec<Vec<i128>> = vec![vec![0; b[0].len()]; a.len()];
    for i in 0..a.len() {
        for j in 0..b[0].len() {
            for k in 0..b.len() {
                // Reduce as we go so the accumulator never overflows.
                res[i][j] = (res[i][j] + a[i][k] * b[k][j]) % modulo;
            }
        }
    }
    res
}
fn matrix_pow(
    m: Vec<Vec<i128>>,
    n: i128,
    modulo: i128,
) -> Vec<Vec<i128>> {
    let mut base = m;
    // Start res as the identity matrix I.
    let mut res: Vec<Vec<i128>> = vec![vec![0; base.len()]; base.len()];
    for i in 0..res.len() {
        res[i][i] = 1;
    }
    let mut nc = n;
    while nc > 0 {
        if nc % 2 == 1 {
            // Fold the current power of the base into the result.
            res = matrix_mult(res, base.clone(), modulo);
        }
        // Square the base: T^(2^i) -> T^(2^(i+1)).
        base = matrix_mult(base.clone(), base, modulo);
        nc /= 2;
    }
    res
}
pub fn nth_term(
    coeffs: Vec<i128>,
    init: Vec<i128>,
    n: i128,
    modulo: i128,
) -> i128 {
    let k = coeffs.len();
    // Build the companion matrix T: a shifted identity on top,
    // with (c_k, c_{k-1}, ..., c_1) as the bottom row.
    let mut t: Vec<Vec<i128>> = vec![vec![]; k];
    for i in 0..k - 1 {
        let mut row: Vec<i128> = vec![0; k];
        row[i + 1] = 1;
        t[i] = row;
    }
    // coeffs is given as (c_1, ..., c_k), so reverse it for the bottom row.
    t[k - 1] = coeffs.into_iter().rev().collect();
    let t_pow = matrix_pow(t, n, modulo);
    // Put the initial values f_0..f_{k-1} into a k x 1 column vector F.
    let init_matrix: Vec<Vec<i128>> =
        init.into_iter().map(|x| vec![x]).collect();
    // T^n * F = F^(n), whose first entry is f_n.
    let res_matrix = matrix_mult(t_pow, init_matrix, modulo);
    res_matrix[0][0]
}
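To close the loop, here is a quick usage sketch for nth_term on Fibonacci. The three functions are reproduced in condensed form (with reference parameters instead of owned arguments) purely so the snippet compiles on its own; the test values are my own illustration:

```rust
// Condensed copies of matrix_mult / matrix_pow / nth_term from above,
// so this usage example is self-contained.
type Mat = Vec<Vec<i128>>;

fn matrix_mult(a: &Mat, b: &Mat, modulo: i128) -> Mat {
    let mut res = vec![vec![0; b[0].len()]; a.len()];
    for i in 0..a.len() {
        for j in 0..b[0].len() {
            for k in 0..b.len() {
                res[i][j] = (res[i][j] + a[i][k] * b[k][j]) % modulo;
            }
        }
    }
    res
}

fn matrix_pow(m: Mat, mut n: i128, modulo: i128) -> Mat {
    let mut base = m;
    let mut res: Mat = vec![vec![0; base.len()]; base.len()];
    for i in 0..res.len() {
        res[i][i] = 1; // identity matrix
    }
    while n > 0 {
        if n % 2 == 1 {
            res = matrix_mult(&res, &base, modulo);
        }
        base = matrix_mult(&base, &base, modulo);
        n /= 2;
    }
    res
}

fn nth_term(coeffs: Vec<i128>, init: Vec<i128>, n: i128, modulo: i128) -> i128 {
    let k = coeffs.len();
    let mut t: Mat = vec![vec![0; k]; k];
    for i in 0..k - 1 {
        t[i][i + 1] = 1; // shifted identity rows
    }
    t[k - 1] = coeffs.into_iter().rev().collect(); // (c_k, ..., c_1)
    let t_pow = matrix_pow(t, n, modulo);
    let f: Mat = init.into_iter().map(|x| vec![x]).collect();
    matrix_mult(&t_pow, &f, modulo)[0][0]
}

fn main() {
    let m = 1_000_000_007;
    // Fibonacci: f_n = f_{n-1} + f_{n-2}, f_0 = 0, f_1 = 1.
    println!("{}", nth_term(vec![1, 1], vec![0, 1], 10, m)); // f_10 = 55
    println!("{}", nth_term(vec![1, 1], vec![0, 1], 20, m)); // f_20 = 6765
}
```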