Modular Multiplication: Dr. Arunachalam V Associate Professor, SENSE

Modular multiplication involves computing A×B mod N, where A and B are residues modulo N. The document discusses several algorithms for modular multiplication with optimizations like precomputations and avoiding full multiplication. These include Barrett's algorithm, Montgomery's algorithm, and McLaughlin's algorithm. Montgomery's algorithm is efficient when performing many multiplications with the same modulus N. It works by transforming residues into Montgomery form and using Montgomery multiplication.

Modular Multiplication

Dr. Arunachalam V
Associate Professor, SENSE
Introduction

• Modular multiplication means computing A×B mod N, where A and B are
residues modulo N.
• Of course, once the product C = A×B has been computed, it suffices to perform
a modular reduction C mod N, which itself reduces to an integer division.
• The algorithms presented here benefit from some precomputations involving
N, and are thus specific to the case where several reductions are performed
with the same modulus.
• Also, some algorithms avoid performing the full product C = A×B ; one such
example is McLaughlin’s algorithm.
Precomputations and different algorithms
• Algorithms with precomputations include Barrett’s algorithm, which
computes an approximation to the inverse of the modulus, thus trading division
for multiplication; Montgomery’s algorithm, which corresponds to Hensel’s
division with remainder only, and its sub-quadratic variant, which is the LSB-
variant of Barrett’s algorithm; and finally McLaughlin’s algorithm.
• The cost of the precomputations is not taken into account: it is assumed to be
negligible if many modular reductions are performed.
• However, we assume that the amount of precomputed data uses only linear,
that is O(logN), space.
• As usual, we assume that the modulus N has n words in base β, that A and B
have at most n words, and in some cases that they are fully reduced, i.e.,
0 ≤ A, B < N.
Barrett’s algorithm
• Barrett’s algorithm is attractive when many divisions have to be made with the
same divisor; this is the case when one performs computations modulo a fixed
integer.
• The idea is to precompute an approximation to the inverse of the divisor.
• Thus, an approximation to the quotient is obtained with just one multiplication,
and the corresponding remainder after a second multiplication.
• A small number of corrections suffice to convert the approximations into exact
values.
• For the sake of simplicity, we describe Barrett’s algorithm in base β, where β
might be replaced by any integer, in particular 2^n or β^n.
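Example: A = 1980; B = 36; β = 64; β² = 4096; I = ⌊β²/B⌋ = 113

A = 1980 = 30 × 64 + 60, so A1 = 30 and A0 = 60

Q = ⌊A1 × I / β⌋ = 52; R = A − Q × B = 108 = 3 × B

After three corrections: Q = 55; R = 0
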

Theorem 2.4.1 Algorithm BarrettDivRem is correct and step 5 is performed
at most 3 times.
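
The algorithm listing itself is not reproduced in these slides, so the following is only a minimal Python sketch of the idea described above (precompute an approximate inverse, get the quotient with one multiplication, the remainder with a second, then repair). The function name `barrett_div_rem` and the returned correction count are illustrative assumptions; the step numbers in the comments follow a reading of the cited reference and may not match the original slide exactly.

```python
def barrett_div_rem(a, b, beta):
    """Sketch: quotient and remainder of a by b.

    Assumes 0 <= a < beta**2 and beta/2 < b < beta (normalized divisor).
    Also returns how many repair steps were needed (illustrative only).
    """
    assert 0 <= a < beta * beta and beta < 2 * b and b < beta
    inv = beta * beta // b      # step 1: I = floor(beta^2 / B), precomputed once per divisor
    a1 = a // beta              # high word of A = A1*beta + A0
    q = a1 * inv // beta        # step 2: approximate quotient, one multiplication
    r = a - q * b               # step 3: matching remainder, second multiplication
    corrections = 0
    while r >= b:               # steps 4-5: repair the approximation
        q, r = q + 1, r - b
        corrections += 1
    return q, r, corrections


# The example from the slide: A = 1980, B = 36, beta = 64
print(barrett_div_rem(1980, 36, 64))   # (55, 0, 3)
```

In this example the approximate quotient is 52 and all three allowed corrections are used, which is the worst case permitted by the theorem.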
Complexity of the algorithm

• The multiplications at steps 2 and 3 may be replaced by short products, more
precisely the multiplication at step 2 by a high short product, and that at step 3
by a low short product.
• Barrett’s algorithm can also be used for an unbalanced division, when dividing
(k + 1)n words by n words for k ≥ 2, which amounts to k divisions of 2n
words by the same n-word divisor.
• In this case, we say that the divisor is implicitly invariant.
• In the FFT range, the cost of Barrett’s algorithm (2M(n) when steps 2 and 3
use full products) might be lowered to 1.5M(n) using the “wrap-around trick”;
moreover, if the forward transforms of I and B are stored, the cost decreases
to M(n), assuming M(n) is the cost of three FFTs.
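
The unbalanced case above can be sketched as a long division in which every step reuses the same precomputed inverse. This is only an illustration under stated assumptions: the name `divrem_invariant` is hypothetical, the dividend is passed as a list of digits (most significant first), and `beta` stands for the block base (β^n in the slide's notation). Plain full products are used rather than short products.

```python
def divrem_invariant(a_words, b, beta):
    """Sketch: divide the number with base-beta digits a_words by b.

    Assumes beta/2 < b < beta and 0 <= w < beta for every digit w.
    """
    assert beta < 2 * b and b < beta
    inv = beta * beta // b          # precomputed once, reused for every block
    q, r = 0, 0
    for w in a_words:
        cur = r * beta + w          # two-block dividend; cur < b*beta < beta**2
        qi = r * inv // beta        # high block of cur is r, because 0 <= w < beta
        ri = cur - qi * b
        while ri >= b:              # repair step, as in the balanced case
            qi, ri = qi + 1, ri - b
        q, r = q * beta + qi, ri
    return q, r


# 1980 = 30*64 + 60, extended by two more digits
words = [30, 60, 17, 5]
value = ((30 * 64 + 60) * 64 + 17) * 64 + 5
print(divrem_invariant(words, 36, 64) == divmod(value, 36))   # True
```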
Montgomery’s algorithm

• Montgomery’s algorithm is very efficient for modular arithmetic modulo a
fixed modulus N.
• The main idea is to replace a residue A mod N by A′ = λA mod N, where
A′ is the “Montgomery form” corresponding to the residue A, with λ an integer
constant such that gcd(N, λ) = 1.
• Addition and subtraction are unchanged, since λA + λB = λ(A + B) mod N.
• The multiplication of two residues in Montgomery form does not give exactly
what we want: (λA)(λB) ≠ λ(AB) mod N.
• The trick is to replace the classical modular multiplication by “Montgomery’s
multiplication”: MontgomeryMul(A′, B′) = A′B′/λ mod N.
• For some values of λ, MontgomeryMul(A′, B′) can easily be computed, in
particular for λ = β^n, where N uses n words in base β.
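
The properties listed above can be checked with a small Python sketch. It is a reference definition only, not the efficient method: `mont_mul` divides by λ using an explicit modular inverse, which is exactly the cost REDC is designed to avoid. The helper names `to_mont`, `from_mont` and `mont_mul` are illustrative assumptions; N, β and n are taken from the REDC example in these slides.

```python
N = 862664913               # modulus from the REDC example
beta, n = 1000, 3
lam = beta ** n             # lambda = beta^n, coprime to N
lam_inv = pow(lam, -1, N)   # modular inverse (Python 3.8+)


def to_mont(a):
    """Residue A -> Montgomery form A' = lambda*A mod N."""
    return a * lam % N


def from_mont(a_m):
    """Montgomery form A' -> residue A = A'/lambda mod N."""
    return a_m * lam_inv % N


def mont_mul(a_m, b_m):
    """Reference definition of Montgomery multiplication: A'B'/lambda mod N."""
    return a_m * b_m * lam_inv % N


a, b = 123456789, 98765432
# Addition is unchanged in Montgomery form
print((to_mont(a) + to_mont(b)) % N == to_mont((a + b) % N))      # True
# Montgomery multiplication stays in Montgomery form
print(from_mont(mont_mul(to_mont(a), to_mont(b))) == a * b % N)   # True
```

The conversions cost one multiplication each, so the transformation pays off only when many multiplications are performed with the same N.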
REDC & Fast REDC

• Algorithm 2.6 is a quadratic algorithm (REDC) to compute
MontgomeryMul(A′, B′) in this case, and a sub-quadratic reduction
(FastREDC) is given in Algorithm 2.7.
• Another view of Montgomery’s algorithm for λ = β^n is to consider that it
computes the remainder of Hensel’s division.
• For example, with inputs C = 766 970 544 842 443 844, N = 862 664 913,
and β = 1000 (so n = 3):
• Algorithm REDC precomputes μ = 23; then we have q0 = 412, which
yields C ← C + 412N = 766 970 900 260 388 000;
• then q1 = 924, which yields
C ← C + 924Nβ = 767 768 002 640 000 000;
• then q2 = 720, which yields
C ← C + 720Nβ² = 1 388 886 740 000 000 000.
• At step 4, R = Cβ⁻³ = 1 388 886 740, and since R ≥ β³, REDC returns
R − N = 526 221 827.
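
The worked example can be replayed with a short Python sketch of word-by-word Montgomery reduction. The listing of Algorithm 2.6 is not included in these slides, so this is a reconstruction under assumptions: the function name `redc` and the returned list of quotient words are illustrative, and the final comparison against β^n follows a reading of the cited reference (the result then lies in [0, β^n) and is congruent to Cβ⁻ⁿ modulo N).

```python
def redc(c, n_mod, beta, n, mu):
    """Sketch of quadratic Montgomery reduction.

    Assumes 0 <= c < beta**(2*n), n_mod < beta**n, gcd(beta, n_mod) = 1
    and mu = -1/n_mod mod beta. Returns (R, quotient words).
    """
    quotients = []
    for i in range(n):
        c_i = (c // beta ** i) % beta     # i-th word of the running value C
        q_i = mu * c_i % beta             # quotient selection: one multiplication mod beta
        c += q_i * n_mod * beta ** i      # makes word i of C equal to zero
        quotients.append(q_i)
    r = c // beta ** n                    # exact division: the low n words are now zero
    if r >= beta ** n:                    # single repair step, after the loop
        r -= n_mod
    return r, quotients


C, N = 766970544842443844, 862664913
print(redc(C, N, 1000, 3, 23))   # (526221827, [412, 924, 720])
```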
Precomputation of µ
• For example, N = 862 664 913, and β = 1000,
• μ = −N⁻¹ mod β ⇒ (−μ)N mod β = 1
• Apply Euclid’s algorithm to β and the least significant word of N until the remainder is 1
• 1000 = 913(1) + 87 ⇒ 1000 + 913(−1) = 87
• 913 = 87(10) + 43 ⇒ 913 + 87(−10) = 43
• 87 = 43(2) + 1 ⇒ 87 + 43(−2) = 1
• Back-substitute to express 1 in terms of β and the least significant word of N
• 87 + [913 + 87(−10)](−2) = 1 ⇒ 913(−2) + 87(21) = 1
• 913(−2) + [1000 + 913(−1)](21) = 1 ⇒ 1000(21) + 913(−23) = 1
• Therefore 913 × (−23) ≡ 1 mod 1000, and the precomputed value is μ = 23.
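
The same computation can be sketched in Python with an iterative extended Euclid. The names `ext_gcd` and `montgomery_mu` are illustrative assumptions, not taken from the reference; only the least significant word of N matters, since μ is defined modulo β.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


def montgomery_mu(n_mod, beta):
    """Sketch: mu = -1/n_mod mod beta, assuming gcd(beta, n_mod) = 1."""
    g, _, y = ext_gcd(beta, n_mod % beta)
    if g != 1:
        raise ValueError("beta and N must be coprime")
    # beta*x + n0*y = 1, so n0*y = 1 (mod beta) and mu = -y mod beta
    return -y % beta


print(ext_gcd(1000, 913))               # (1, 21, -23)
print(montgomery_mu(862664913, 1000))   # 23
```

In Python 3.8 and later, `pow(-862664913, -1, 1000)` gives the same value directly.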

Refer to this video for finding µ: https://www.youtube.com/watch?v=shaQZg8bqUM


Comparison with classical method
• Compared to classical division (Algorithm BasecaseDivRem), Montgomery’s
algorithm has two significant advantages:
• the quotient selection is performed by a multiplication modulo the
word base β, which is more efficient than a division by the most
significant word of the divisor as in BasecaseDivRem;
• and there is no repair step inside the for-loop; the repair step is
at the very end.
Reference
1. Section 2.4 of Richard P. Brent and Paul Zimmermann, “Modern
Computer Arithmetic”, Cambridge University Press, 2010.
Next Class

MORE EXAMPLES
