Creating a repo for everything on website
This commit is contained in:
524
content/posts/rsa-optimization.md
Normal file
524
content/posts/rsa-optimization.md
Normal file
@@ -0,0 +1,524 @@
|
||||
---
|
||||
title: "RSA Optimization"
|
||||
date: 2022-12-06
|
||||
draft: false
|
||||
---
|
||||
|
||||
# INTRODUCTION
|
||||
|
||||
RSA is a public key cryptosystem, which was named after the creators of
|
||||
the algorithm: Rivest, Shamir, and Adleman [@STALLINGS]. It is widely
|
||||
used for both confidentiality and authentication. The main advantage for
|
||||
using RSA is the keys are created in such a way that the public key is
|
||||
publish and can be used to encrypt all messages to the owner of the
|
||||
public key. Unlike symmetric key schemes, RSA does not require sender
|
||||
and receiver to agree on a common key to encrypt and decrypt messages.
|
||||
To send an encrypted message, the encryptor only has to look up the
|
||||
public key of the intended recipient.
|
||||
|
||||
The RSA algorithm works as follows. First, a public key and private key
|
||||
pair are generated. The public key is published and can be used to
|
||||
encrypt messages and verify signatures of the owner of the public key.
|
||||
The corresponding private key is known only to the owner and can be used
|
||||
to decrypt messages encrypted with the owner's public key. The private
|
||||
key can also be used to sign messages. The public and private keys are
|
||||
generated by choosing two distinct large primes, `p` and `q`. Then `n`
|
||||
is computed by multiplying `p` and `q` together. Since `n` is the
|
||||
product of two primes, we can compute Euler's totient function by
|
||||
`φ(n)=(p-1)(q-1)`. Then, an integer `e` is chosen such that
|
||||
`1<e<φ(n)` and `gcd(e,φ(n))=1`, where gcd stands for
|
||||
greatest common denominator. Finally an integer `d` is computed such
|
||||
that `d=e^{-1}(mod φ(n))`. Thus, the public key is the modulus
|
||||
`n` and the encryption exponent `e`. The private key is the decryption
|
||||
exponent `d`.
|
||||
|
||||
RSA is criticized for the amount of time it takes to encrypt and decrypt
|
||||
messages. To encrypt a message, the sender uses the public key, `(n,e)`.
|
||||
The message is converted to an integer `m`, such that `0<m<n`. Then, the
|
||||
ciphertext message `c` is computed by `c=m^{e}(mod n)`. The message
|
||||
can be recovered by using the private key, `d`, and the formula
|
||||
`m=c^{d}(mod n)`. Currently to be considered secure, the size of `n`
|
||||
needs to be 1024 bits or more. This makes RSA computationally slow
|
||||
because arithmetic modulo large numbers is notoriously slow. For
|
||||
example, on small handheld devices, RSA decryption with 1024 bit modulus
|
||||
can take up to 40 seconds [@BONEH]. Also, RSA decryption can
|
||||
significantly reduce the number of requests a web server can handle. In
|
||||
addition, as computers become more powerful, 1024-bit RSA will not be
|
||||
considered secure. This means that in the future, the size of the
|
||||
modulus will increase. Therefore, we need to create methods we can use
|
||||
RSA in a practical fashion.
|
||||
|
||||
To improve the speed of RSA encryption and decryption, we propose to
|
||||
create an instruction set architecture that will be made solely for RSA.
|
||||
In this paper, we will talk strictly about using RSA for encrypting and
|
||||
decrypting messages. However the same instruction set architecture we
|
||||
propose in this paper can be used for signing and verifying messages
|
||||
with RSA.
|
||||
|
||||
# CHARACTERISTICS OF RSA
|
||||
|
||||
There are three areas that the RSA can be optimized: finding the
|
||||
encryption and decryption exponent, prime number generation, and
|
||||
encrypting and decrypting messages. In this section we will describe how
|
||||
finding the encryption and decrypting exponent, prime number generation,
|
||||
and encrypting and decrypting the message is usually done, without
|
||||
specialized instructions.
|
||||
|
||||
## ENCRYPTION AND DECRYPTION EXPONENT
|
||||
|
||||
The current approach to verifying that the encryption exponent is
|
||||
coprime to the `φ(n)` is by using the Euclidean Algorithm. To find
|
||||
the decryption exponent, the Extended Euclidean Algorithm is used.
|
||||
Therefore, we will include these algorithms in our architecture and
|
||||
create instructions to accommodate the Euclidean Algorithm and Extended
|
||||
Euclidean Algorithm.
|
||||
|
||||
The Euclidean Algorithm can be executed using the following pseudocode
|
||||
[@STALLINGS]:
|
||||
|
||||
;; load phi of n (computed using Euler's totient function) and encryption exponent
|
||||
load r1, location_of_phi(n)
|
||||
load r2, location_of_encryption_exponent
|
||||
|
||||
loop while r2 not equal to zero
|
||||
|
||||
div t0, r1, r2 ;; t0 = r1 / r2
|
||||
mult t1, r2, t0 ;; t1 = r2 * r
|
||||
sub t2, r1, t1 ;; t2 = r1 - t1
|
||||
|
||||
store r2, r1 ;; r1 = r2
|
||||
store t2, r2 ;; r2 = t2
|
||||
|
||||
;; end loop
|
||||
|
||||
return r2 ;; this is equal to the gcd of phi of n and the encryption exponent
|
||||
|
||||
|
||||
If the `gcd(φ(n), e) = 1`, then `e` is a valid encryption exponent
|
||||
for RSA, and we can now find the decryption exponent, `d`. If the
|
||||
`gcd(φ(n), e) \neq 1`, then a new encryption exponent `e` needs to
|
||||
be chosen such that `1<e<φ(n)`. Then, the Euclidean Algorithm
|
||||
needs to be ran again to verify that `gcd(φ(n), e) = 1`.
|
||||
Typically, to execute the Euclidean Algorithm, a divide, multiply, and
|
||||
subtract instruction would be needed in each iteration of the loop. In
|
||||
the design section, we will describe an instruction that will combine
|
||||
the divide, multiply, and subtract instruction into one instruction.
|
||||
|
||||
Once the encryption exponent, `e` is chosen, we need to find the
|
||||
decryption exponent `d`. The decryption exponent, `d` needs to satisfy
|
||||
the following formula: `d*e = 1 (mod φ(n))`. Another way of
|
||||
describing `d` is it is the multiplicative inverse of `e` modular
|
||||
`φ(n)`. The Extended Euclidean Algorithm can be executed using the
|
||||
following pseudocode [@CORMEN]:
|
||||
|
||||
;; load phi of n (computed using Euler's totient function) and encryption exponent
|
||||
load r1, location_of_phi(n)
|
||||
load r2, location_of_encryption_exponent
|
||||
|
||||
;; values that will be used and updated in the loop of the algorithm
|
||||
store 0, r3 ;; r3 = 0
|
||||
store 0, r4 ;; r4 = 0
|
||||
store 1, r5 ;; r5 = 1
|
||||
store 1, r6 ;; r6 = 1
|
||||
|
||||
loop while r2 not equal to zero
|
||||
|
||||
div t0, r1, r2 ;; t0 = r1/r2
|
||||
store r2, t1 ;; t1 = r2
|
||||
mult t2, r1, t0 ;; t2 = r1 * t0
|
||||
sub r2, r2, t2 ;; r2 = r2 - t2
|
||||
store t1, r1 ;; r1 = t1
|
||||
store r3, t3 ;; t3 = r3
|
||||
mult t4, t0, r3 ;; t4 = t0 * r3
|
||||
sub r3, r5, t4 ;; r3 = r5 - t4
|
||||
store r3, r5 ;; r5 = r3
|
||||
store r4, t5 ;; t5 = r4
|
||||
mult t6, t0, r4 ;; t6 = t0 * r4
|
||||
sub r4, r6, t6 ;; r4 = r6 - t6
|
||||
store t5, r6 ;; r6 = t5
|
||||
|
||||
;; end loop
|
||||
|
||||
return r5, r6 ;; r5 and r6 are e and d
|
||||
|
||||
Since we chose an `e` such that `gcd(φ(n), e) = 1`, there will
|
||||
always exist a `d` such that `ed = 1 (mod φ(n))`. The Extended
|
||||
Euclidean Algorithm returns the integer `d`. Typically to execute the
|
||||
Extended Euclidean Algorithm, one divide, six stores, three multiplies,
|
||||
and three subtracts are needed within the loop. In the design section,
|
||||
we will use two instructions that will combine some of the instructions
|
||||
used in the Extended Euclidean Algorithm to reduce the number of stalls
|
||||
within the loop.
|
||||
|
||||
## PRIME NUMBER GENERATION
|
||||
|
||||
The common approach to generating large primes in making encryption and
|
||||
decryption keys is to randomly select integers and test them for
|
||||
primality. Since for large numbers, being sure of primality can be an
|
||||
expensive operation, algorithms are used that find "probable primes."
|
||||
These algorithms test numbers for primality with a certainty over a
|
||||
desired threshold. The particular algorithm we will investigate is known
|
||||
as Rabin's test, whose high-level use in prime generators as
|
||||
follows[@BEAUCHEMIN]:
|
||||
|
||||
;;Generates a prime number of binary length l and certainty factor k
|
||||
;;The number will have a exp(4,-k) chance of being composite (non-prime)
|
||||
function genPrime (l, k):
|
||||
repeat
|
||||
n <- Randomly selected l-digit odd number
|
||||
until repeatRabin(n, k) = "prime"
|
||||
return n
|
||||
|
||||
function repeatRabin(n, k)
|
||||
from i <- 0 to k or until done
|
||||
done <- rabin(n) = "composite"
|
||||
end
|
||||
if done
|
||||
return "composite"
|
||||
else
|
||||
return "prime"
|
||||
|
||||
function rabin(n)
|
||||
s <- the number of trailing zeros in binary form of n-1
|
||||
t <- n-1 with trailing zeros removed
|
||||
a <- integer randomly selected between 2 and n-2
|
||||
x <- exp(a,d) mod n
|
||||
if x = 1 or x = n - 1
|
||||
return "prime"
|
||||
for r <- 1 ... s - 1
|
||||
x <- exp(x,2) mod n
|
||||
if x = 1 return "composite"
|
||||
if x = n - 1 return "prime"
|
||||
end for
|
||||
return "composite"
|
||||
|
||||
This technique will take a long amount of time for large values of `t`
|
||||
and `s`. For large values of `t`, exponentiation done in a typical
|
||||
manner would take an increasing number of iterations to calculate (as
|
||||
shown below). The instructions to calculate `x^{2} (mod n)` would be
|
||||
executed `s` times. These two factors indicate a heavy reliance on the
|
||||
ability of a system to calculate exponentiation.
|
||||
|
||||
## ENCRYPTION AND DECRYPTION
|
||||
|
||||
One aspect of RSA to improve upon is performing large exponentiation.
|
||||
Currently the implementation of exponentiation is performed by the
|
||||
compiler, which creates inconsistencies in speed between compilers and
|
||||
is generally not optimal. The general practice for compilers is to loop
|
||||
for the value in the exponent while accumulating the total.
|
||||
|
||||
Current pseudocode for how compilers handle exponentiation [@COHEN]:
|
||||
|
||||
function power (base, exponent)
|
||||
set result = base
|
||||
loop from 1 to exponent do
|
||||
result = result * base
|
||||
return result
|
||||
|
||||
This technique is very slow, especially for large exponents. It will
|
||||
cause a lot of stalls as the processor waits for the results of the
|
||||
previous computation. It will also stall waiting for another multiplier
|
||||
to be available. We can lessen the number of stalls using a technique
|
||||
known as exponentiation by squaring. This technique is explained further
|
||||
in the design section.
|
||||
|
||||
# DESIGN
|
||||
|
||||
In this section, we will describe specialized instructions that will be
|
||||
used for prime number generation, computing the encryption and
|
||||
decryption exponent, and encrypting a decrypting a message.
|
||||
|
||||
## ENCRYPTION AND DECRYPTION EXPONENT
|
||||
|
||||
The issue with implementing the Euclidean Algorithm the traditional way
|
||||
is a divide, multiply, and subtract instructions are needed for each
|
||||
iteration of the loop. We noticed that the Euclidean Algorithm is
|
||||
finding a modulus in each iteration of the loop, thus we propose to use
|
||||
a modulus instruction for the Euclidean Algorithm, which is defined
|
||||
below.
|
||||
|
||||
mod x, y, z : x = z - ( (z/y) * y)
|
||||
|
||||
By using the modulus instruction, we have reduced the number of stalls
|
||||
in each loop. Unlike the traditional implementation, which requires both
|
||||
the multiply and subtraction instruction to stall until the instruction
|
||||
before it writes it's value to memory, the modulus instruction has all
|
||||
the values immediately.
|
||||
|
||||
Therefore, we can implement the Euclidean Algorithm with the following
|
||||
pseudocode:
|
||||
|
||||
;; load phi of n (computed using Euler's totient function) and encryption exponent
|
||||
load r1, location_of_phi(n)
|
||||
load r2, location_of_encryption_exponent
|
||||
|
||||
loop while r2 not equal to zero
|
||||
|
||||
mod t0, r1, r2 ;; t0 = r2 - ( (r2/r1) * r1)
|
||||
|
||||
store r2, r1 ;;r1 = r2
|
||||
store t0, r2 ;;r2 = t2
|
||||
|
||||
;; end loop
|
||||
|
||||
return r2 ;; this is equal to the gcd of phi of n and the encryption exponent
|
||||
|
||||
|
||||
Using the mod instruction, we can eliminate the use of the divide,
|
||||
multiply, and subtract instructions. Also, we use only one temporary
|
||||
register, as opposed to three temporary registers. We will provide an
|
||||
analysis to the speedup that the modulus instruction gives the
|
||||
implementation of the Euclidean Algorithm in the justification and
|
||||
analysis section.
|
||||
|
||||
For the implementation of the Extended Euclidean Algorithm, we propose
|
||||
to use two specialized instructions: a modular instruction and a
|
||||
multiply-subtract instruction. The modular instruction combines the
|
||||
divide, multiply, and subtract instruction, and is defined the same as
|
||||
it is above, in the Euclidean Algorithm. The multiply-subtract
|
||||
instruction is defined as follows:
|
||||
|
||||
multsub w, x, y, z : w = x - (y * z)
|
||||
|
||||
Like in the Euclidean Algorithm, by using the modular instruction in the
|
||||
Extended Euclidean Algorithm, we are reducing the number of stalls in
|
||||
each loop iteration. We are further reducing the number of stalls by
|
||||
also using the multiply-subtract instruction. Thus, we can implement the
|
||||
Extended Euclidean Algorithm with the following pseudocode:
|
||||
|
||||
;; load phi of n (computed using Euler's totient function) and encryption exponent
|
||||
load r1, location_of_phi(n)
|
||||
load r2, location_of_encryption_exponent
|
||||
|
||||
store 0, r3 ;; r3 = 0
|
||||
store 0, r4 ;; r4 = 0
|
||||
store 1, r5 ;; r5 = 1
|
||||
store 1, r6 ;; r6 = 1
|
||||
|
||||
loop while r2 not equal to zero
|
||||
|
||||
div t0, r1, r2 ;; t0 = r1/r2
|
||||
store r2, t1 ;; t1 = r2
|
||||
mod r2, r1, r2 ;; r2 = r2 - ( (r2/r1) * r1)
|
||||
store t1, r1 ;; r1 = t1
|
||||
store r3, t2 ;; t2 = r3
|
||||
multsub r3, r5, t0, r3 ;; r3 = r5 - (t0 * r3)
|
||||
store t2, r5 ;; r5 = t2
|
||||
store r4, t3 ;; t3 = r4
|
||||
multsub r4, r6, t0, r4 ;; r4 = r6 - (t0 * r3)
|
||||
store t3, r6 ;; r6 = t3
|
||||
|
||||
;; end loop
|
||||
|
||||
return r5, r6 ;; r5 and r6 are e and d
|
||||
|
||||
Using the modular and multiply-subtract instructions, we have reduced
|
||||
the amount of instructions in the loop of the implementation of the
|
||||
Extended Euclidean Algorithm from 13 instructions to 10 instructions. We
|
||||
also reduce the temporary registers from five to three. We will provide
|
||||
an analysis to the speedup given to the Extended Euclidean Algorithm by
|
||||
using the modular instruction and the multiply-subtract instruction in
|
||||
the justification and analysis section.
|
||||
|
||||
## PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION
|
||||
|
||||
One issue already discussed in the previous section is that of stalls
|
||||
during large exponents. The way exponentiation is handled causes many
|
||||
stalls slowing down the program. One way to lessen the number of stalls
|
||||
is by using a technique called exponentiation by squaring. This
|
||||
technique uses the square of the result to accumulate the final solution
|
||||
instead of just the base. In this way it'll divide the number of
|
||||
multiplies in half, thus divide the number of stalls in half.
|
||||
|
||||
The pseudocode for how exponentiation by squaring will work [@COHEN]:
|
||||
|
||||
function power (base, exponent)
|
||||
set binaryExp = (n1, n2, n3, ... nm)
|
||||
;; where binaryExp is the binary representation of exponent and n1 is 1
|
||||
set result = base
|
||||
set singleResult = 1
|
||||
loop for x from 0 to (length of binaryExp - 1) do
|
||||
result = result * result
|
||||
if binaryExp[x] == 1
|
||||
singleResult = singleResult * base
|
||||
|
||||
result = result * singleResult
|
||||
return result
|
||||
|
||||
An example for calculating 3 to the 10th is as follows:
|
||||
|
||||
result := 3
|
||||
binaryExp := "1010"
|
||||
|
||||
Iteration for digit 2:
|
||||
result := result^{2} = 3^{2} = 9
|
||||
1010bin - Digit equals "0"
|
||||
|
||||
Iteration for digit 3:
|
||||
result := result^{2} = (32)2 = 34 = 81
|
||||
1010bin - Digit equals "1" --> result := result*3 = (32)2*3 = 35 = 243
|
||||
|
||||
Iteration for digit 4:
|
||||
result := result2 = ((32)2*3)2 = 310 = 59049
|
||||
1010bin - Digit equals "0"
|
||||
|
||||
return result
|
||||
|
||||
The idea is to implement this exponentiation by squares algorithm in the
|
||||
processor using a pow command. This pow command takes the destination
|
||||
register, the register containing the base and the register containing
|
||||
the exponent. This command then runs the loop using one multiplier to
|
||||
accumulate the square and one multiplier to accumulate the value when a
|
||||
1 is hit. The loop will grab one binary digit at a time from the
|
||||
exponent register. We will always run the result accumulator, then
|
||||
depending on the digit, may use the second accumulator. It will then run
|
||||
through one more multiplier to multiply the squares by the 1's
|
||||
multiplier.
|
||||
|
||||
# JUSTIFICATION AND ANALYSIS
|
||||
|
||||
In this section, we will describe how our specialized instructions will
|
||||
improve the performance of the RSA encryption.
|
||||
|
||||
## ENCRYPTION AND DECRYPTION EXPONENT
|
||||
|
||||
Using the modular instruction in the Euclidean Algorithm, we can reduce
|
||||
the number of stalls needed. Instead of needing to stall for the result
|
||||
of the divide and the result of the multiply within the loop, the
|
||||
modular instruction does not have to stall for any previous results. In
|
||||
the loop of the Euclidean Algorithm without the modular instruction, two
|
||||
of the five instructions need to stall to wait for a previous result.
|
||||
|
||||
CPI = Ideal CPI + The number of stalls per instruction
|
||||
|
||||
CPIold = 1 + 2/5
|
||||
CPIold = 1.4
|
||||
|
||||
CPInew = 1 (since no instructions in the loop need to stall for a result)
|
||||
|
||||
Thus the speedup for the Euclidean Algorithm in the loop is:
|
||||
Speedup = CPIold / CPI new
|
||||
Speedup = 1.4 / 1
|
||||
Speedup = 1.4
|
||||
|
||||
Thus, with respect to stalls, we have a speedup of 1.4 by using the
|
||||
modular instruction in the Euclidean Algorithm. Of course, the latency
|
||||
of the modular instruction is likely to be longer than the latency of
|
||||
the divide, multiply, and subtract instructions, since it is performing
|
||||
three calculations in one instruction. However, since we are building a
|
||||
system only for RSA, we can use specialized hardware to reduce the
|
||||
latency of the modular instruction to receive the speedup of 1.4. As
|
||||
mentioned in the design section, another advantage to using the modular
|
||||
instruction is we reduce the number of temporary registers from three to
|
||||
one.
|
||||
|
||||
We are assume that the store instruction finishes in one clock cycle.
|
||||
Thus, in the Extended Euclidean Algorithm, three of the thirteen
|
||||
instructions in the loop need to stall for previous results. However, by
|
||||
using the modular and multipy-subtract instructions, no instruction
|
||||
needs to stall to wait for a previous result to be available.
|
||||
|
||||
CPI = Ideal CPI + The number of stalls per instruction
|
||||
|
||||
CPIold = 1 + 3/13
|
||||
CPIold = 1.23
|
||||
|
||||
CPInew = 1 (since no instructions in the loop need to stall for a result)
|
||||
|
||||
Thus the speedup for the Euclidean Algorithm in the loop is:
|
||||
Speedup = CPIold / CPI new
|
||||
Speedup = 1.23 / 1
|
||||
Speedup = 1.23
|
||||
|
||||
Thus with respect to stalls, we have a speedup of 1.23 by using the
|
||||
modular and multiply-subtract instructions in the Extended Euclidean
|
||||
Algorithm. As mentioned above, it is likely that the latency of the
|
||||
modular and multiply-subtract instructions are higher than the
|
||||
traditional instructions. However, we are only building a system for RSA
|
||||
encryption and decryption, so we can build hardware that will reduce the
|
||||
latency of the modular and multiply-subtract instructions to receive the
|
||||
speedup of 1.23. Also, an advantage to using the modular and
|
||||
multiply-subtract instructions is we reduce the number of temporary
|
||||
registers needed from five to three.
|
||||
|
||||
## PRIME NUMBER GENERATION, ENCRYPTION, AND DECRYPTION
|
||||
|
||||
Using the pow command we can reduce stalls of large exponents in half.
|
||||
Since the algorithm breaks the exponent into a binary representation of
|
||||
its self and loops through the digits, the worse case scenario is
|
||||
looping half the number of times. One example of this is if the exponent
|
||||
is eight which is 1000 in base two. Since the number of loops is divided
|
||||
in half, the number of multiplies are also divided in half. Even with a
|
||||
large number of ones in the base two representation of the exponent,
|
||||
this can use a seperate multiplyer, since it doesn't rely on the result
|
||||
of the squares.
|
||||
|
||||
Using the above pseudocode we can assume that approximately 1/4 of the
|
||||
instructions are multiplies, thus 1/4 will stall. Using that knowledge
|
||||
and the knowledge that the number of stalls is cut in half we can use
|
||||
the following equations to determin the overall speedup:
|
||||
|
||||
CPI = Ideal CPI + The number of stalls per instruction
|
||||
|
||||
CPI old = 1 + 1/4
|
||||
CPI old = 1.25
|
||||
|
||||
CPI new = 1 + 1/4/2
|
||||
CPI new = 1 + 1/8
|
||||
CPI new = 1.125
|
||||
|
||||
Thus the speedup of this section is:
|
||||
Speedup = CPI old / CPI new
|
||||
Speedup = 1.25/1.125
|
||||
Speedup = 1.11
|
||||
|
||||
# CONCLUSIONS
|
||||
|
||||
In analyzing the typical algorithms used as a part of RSA, we have
|
||||
identified two primary bottlenecks in both encryption and decryption:
|
||||
modulus and exponentiation operations. We propose that because these two
|
||||
operations are so prevalent, they warrant specialized instructions. We
|
||||
also noticed that it is beneficial to use a multiply-subtract
|
||||
instruction to increase speedup while finding a decryption exponent.
|
||||
These specialized instructions improve performance by reducing the
|
||||
number of stalls, and therefore improving ideal CPI. The process of
|
||||
finding encryption and decryption exponents benefits from the increased
|
||||
efficiency of modulus operations. Likewise, the processes of prime
|
||||
generation and encryption and decryption in general, both benefit from a
|
||||
faster exponentiation operation.
|
||||
|
||||
We find that by implementing these instructions, the whole procedure
|
||||
finds an overall speedup, from specific speedups of 1.11 for all uses of
|
||||
exponentiation, 1.4 for finding an encryption exponent, and 1.23 for
|
||||
finding a decryption exponent. Thus, we met our goals for improving the
|
||||
performance of the RSA algorithm by creating a specialized instruction
|
||||
set architecture.
|
||||
|
||||
A further investigation we could make is to look at specific hardware
|
||||
that supports our instruction set architecture. To gain the maximum
|
||||
speedup, we need to use hardware that reduces the latency for all the
|
||||
specialized instructions to a negligible amount. Since we are creating a
|
||||
system with the sole task of encrypting and decrypting messages using
|
||||
RSA, we can create hardware that allows the specialized instructions to
|
||||
have a latency comparable to the traditional instructions.
|
||||
|
||||
# Bibliography
|
||||
|
||||
3 Beauchemin, Pierre, Brassard, Crepeau, Claude, Goutier, Claude, and
|
||||
Pomerance, Carl The Generation of Random Numbers That Are Probably Prime
|
||||
|
||||
Bellare, M., Garay, J., and Rabin T. Fast Batch Verification for Modular
|
||||
Exponentiation and Digital Signatures. *Advances in Cryptology-
|
||||
Eurocrypt 98 Proceedings, Lecture Notes in Computer Sciencey*, 1998.
|
||||
|
||||
Boneh, Dan and Shacham, Horavl. Fast Variants of RSA. *Appears in
|
||||
Cryptobytes*, 2002. Cohen, H., Frey, G. (editors): Handbook of elliptic
|
||||
and hyperelliptic curve cryptography. Discrete Math. Appl., (Chapman and
|
||||
Hall/CRC 2006).
|
||||
|
||||
Cormen, T., Leiserson, C., Rivest, R., and Stein, C. Introduction to
|
||||
Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001.
|
||||
|
||||
Stallings, W., Cryptography and Network Security 2nd ed. (Prentice-Hall,
|
||||
1998)
|
||||
|
||||
Reference in New Issue
Block a user